The partial success of image interpretation systems
results from the severe simplifications of the models used
on the one hand and from the excellent engineering
capabilities of system designers on the other.
The different approaches for fusing these information
sources may be classified depending on
- whether the data is represented in raster or vector
  form, or equivalently, whether the data structure
  already reflects the structure of the object or not;
- whether the semantics of the fused information is used
  explicitly or only implicitly.
We thus may distinguish signal based, property based,
feature based and object based information fusion (fig. 7).
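[Figure 7 diagram: signal based and property based fusion (raster representation of image data) and feature based and object based fusion (vector representation of image data), arranged by the information content of the object model (geometry, physics, biology, semantics).]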
Fig. 7. The different types of fusing information.
Signal based fusion is characterized by the raster
structure of the data used. Any type of image processing
technique may be used here to obtain raster oriented
attributes. No explicit reference to the semantics is
assumed to be necessary.
Property based fusion uses the original or derived raster
data together with their meaning, derived by statistical
classification procedures. Thus local properties of
the objects and possibly the relations between these
properties are used.
Feature based fusion is characterized by the structural
description of the image, e.g. lists, graphs or relational
descriptions, including the attributes linked to the
features or relations. No direct reference to the meaning of
the features is assumed at this level, though the feature
extraction in general will be guided by the scope of the
interpretation.
Object based fusion relies on the semantics of real world
objects or their models, either in the object model or in the
interpretation model. Thus it is related to the relationships
between the objects or between non-local properties of
the objects to be extracted.
We do not discuss the techniques or systems that operate
on these types of information as such. Instead, we
concentrate on the ability of the differently structured
approaches to fuse information, depending on the types
of information involved.
Signal based information fusion
Fusing information on the signal level is motivated by
the possibility of formally inverting the above-mentioned
observation process using some kind of least squares
technique when geometric or physical properties are to
be derived from the sensor data. This is the reason for the
breakthrough in digital photogrammetry with the object
centred surface reconstruction schemes of [37 and 18];
see also [36 and 93]. The shape-from-techniques, being
a special case of these approaches, therefore belong to
this class as long as they lead to an iconic description of
the object, e.g. a raster DTM.
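
As a minimal sketch of the least squares idea, assume the simplest possible observation model: several sensors observe the same raster cell directly, each with its own noise level. The function name `fuse_least_squares`, the standard deviations and the sample heights are illustrative assumptions and are not taken from the cited surface reconstruction schemes, which invert far richer geometric and radiometric models.

```python
def fuse_least_squares(observations, sigmas):
    """Weighted least squares estimate of one raster cell value.

    Assumed model: y_i = x + e_i, with independent errors e_i of
    standard deviation sigmas[i]. Returns the estimate of x and the
    standard deviation of that estimate.
    """
    if not observations or len(observations) != len(sigmas):
        raise ValueError("need one sigma per observation")
    weights = [1.0 / (s * s) for s in sigmas]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, observations)) / total
    return estimate, (1.0 / total) ** 0.5


# Two hypothetical height observations (metres) of the same DTM cell.
height, sigma = fuse_least_squares([102.0, 104.0], [1.0, 1.0])
```

With equal weights the estimate is the plain mean, and the standard deviation of the fused value is smaller than that of either observation, which is the formal benefit of fusing on the signal level.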
As the information about the object used in these
approaches is purely geometrical or physical, the resultant
surface form and reflectance parameters may be the
basis for a deeper, property based information
extraction, provided the geometric and physical information
can be related to the objects to be recovered.
Property based information fusion
Fusing information for image interpretation based on
properties of the objects visible in the images is the most
common and intuitive technique [19]. It is motivated by:
- the ease of maximum likelihood classification applied to
  the channels of multi-spectral images, which due to the
  design of the sensors refer (approximately) to the
  same object position;
- the dominance of the spectral features for identifying
  object classes in the case of low resolution images (30 m
  pixel size), where geometric features play a secondary
  role.
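
Maximum likelihood classification of a multi-spectral pixel can be sketched as follows. This is an illustration under stated assumptions: each class is modelled by a Gaussian with independent channels, and the class names, means and variances are invented for the example and do not come from the text.

```python
import math


def log_likelihood(pixel, mean, var):
    """Log density of a Gaussian with independent channels."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * v) - (x - m) ** 2 / (2.0 * v)
        for x, m, v in zip(pixel, mean, var)
    )


def classify_ml(pixel, classes):
    """Return the class whose model gives the pixel the highest likelihood."""
    return max(classes, key=lambda name: log_likelihood(pixel, *classes[name]))


# Hypothetical class models: (channel means, channel variances).
CLASSES = {
    "water": ([20.0, 10.0], [16.0, 9.0]),
    "forest": ([40.0, 80.0], [25.0, 100.0]),
}

label = classify_ml([38.0, 70.0], CLASSES)
```

Because all channels of a pixel refer to approximately the same object position, the channel values can simply be stacked into one feature vector, which is what makes this form of fusion so easy to apply.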
The reason for the continuing success of pixel based
classification schemes is the increasingly advanced
techniques for reducing the signal, i.e. the data values, to
parameters describing the object related, i.e. invariant,
reflectance properties. Reductions include sensor calibration,
atmospheric corrections, and corrections for the influence
of terrain aspect and illumination direction. Rastered maps
may be used as an additional channel [49]. An increase in
classification accuracy (probability of correct classification)
is expected from taking the local context into account. On the
one hand, this may be achieved by using parameters derived from
a certain neighbourhood, reflecting texture parameters
such as variance, average orientation, gradient magnitude
or local power spectra. In contrast to the use of the
(reduced) intensity values themselves, no strong physical
models are available which motivate the selection of
proper texture parameters.
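
Two of the neighbourhood parameters named above, local variance and gradient magnitude, can be sketched as follows. The 3 by 3 window, the central differences and the sample image are assumptions chosen for brevity; the text does not prescribe a particular window size or operator.

```python
def local_variance(image, row, col):
    """Variance of the grey values in the 3 by 3 window around (row, col).

    The window is clipped at the image border.
    """
    rows, cols = len(image), len(image[0])
    window = [
        image[r][c]
        for r in range(max(0, row - 1), min(rows, row + 2))
        for c in range(max(0, col - 1), min(cols, col + 2))
    ]
    mean = sum(window) / len(window)
    return sum((v - mean) ** 2 for v in window) / len(window)


def gradient_magnitude(image, row, col):
    """Central-difference gradient magnitude at an interior pixel."""
    gx = (image[row][col + 1] - image[row][col - 1]) / 2.0
    gy = (image[row + 1][col] - image[row - 1][col]) / 2.0
    return (gx * gx + gy * gy) ** 0.5


# Hypothetical image: homogeneous background with a brighter block.
IMAGE = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 50, 50],
    [10, 10, 50, 50],
]

flat = local_variance(IMAGE, 0, 0)
edge = gradient_magnitude(IMAGE, 2, 1)
```

Such values can be appended to the spectral channels of each pixel as additional features, although, as noted, their choice is heuristic rather than physically motivated.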
The result of the pixelwise classification usually shows
unfavourable irregularities, especially at the borders of
otherwise homogeneous areas [40, 58 and 82]. In order
to achieve cleaner results, post-processing is often
applied, e.g. by replacing the class of a pixel by the
majority class in a 3 by 3 neighbourhood, which of course is an
ad hoc procedure [42 and 59].
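
The majority replacement in a 3 by 3 neighbourhood can be sketched as follows. The border handling (clipping the window) and the tie-breaking rule are assumptions, since the text leaves both open.

```python
from collections import Counter


def majority_filter(labels):
    """Replace each label by the most frequent label in its 3 by 3 window.

    The window is clipped at the border. Ties go to the label met first
    when scanning the window row by row (an arbitrary choice).
    """
    rows, cols = len(labels), len(labels[0])
    result = [row[:] for row in labels]
    for r in range(rows):
        for c in range(cols):
            window = [
                labels[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            result[r][c] = Counter(window).most_common(1)[0][0]
    return result


# Hypothetical classification result with one isolated pixel.
NOISY = [
    ["A", "A", "A"],
    ["A", "B", "A"],
    ["A", "A", "A"],
]

cleaned = majority_filter(NOISY)
```

The filter reads only the original labels and writes to a copy, so the result does not depend on the scanning order.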
A more rigorous, model based technique is provided by hidden
Markov random fields (HMRF) [10, 27 and 69]. The
relation between neighbouring class labels is modelled using their
conditional probabilities. Line processes may also be
included to obtain smooth region boundaries. The
techniques have been extended to allow the integration of
multiple sensor data, of data from different times or even
of map data.
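
The effect of modelling neighbouring class labels can be illustrated with a strongly simplified sketch. It is not the method of the cited papers: it assumes a Potts prior (a fixed penalty `beta` for each pair of differing 4-neighbours), omits line processes, and uses iterated conditional modes (ICM) as a simple local optimiser. The per-pixel costs are invented negative log-likelihoods.

```python
def icm_smooth(unary, beta=1.0, sweeps=5):
    """Label smoothing with a Potts prior, optimised by ICM.

    unary[r][c][k] is the assumed cost (negative log-likelihood) of
    class k at pixel (r, c). Each pixel is repeatedly set to the class
    minimising its cost plus beta times the number of 4-neighbours
    carrying a different label.
    """
    rows, cols = len(unary), len(unary[0])
    n_classes = len(unary[0][0])
    # Start from the pixelwise maximum likelihood labels.
    labels = [
        [min(range(n_classes), key=lambda k: unary[r][c][k]) for c in range(cols)]
        for r in range(rows)
    ]
    for _ in range(sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                neighbours = [
                    labels[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]

                def energy(k):
                    return unary[r][c][k] + beta * sum(1 for n in neighbours if n != k)

                best = min(range(n_classes), key=energy)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break
    return labels


# Hypothetical costs: class 0 is favoured everywhere except, weakly,
# at the centre pixel.
UNARY = [[[0.2, 1.0] for _ in range(3)] for _ in range(3)]
UNARY[1][1] = [1.0, 0.6]

smoothed = icm_smooth(UNARY, beta=1.0)
```

With `beta=0.0` the result is the pixelwise classification; a positive `beta` lets the neighbourhood override weak local evidence, which plays the role of the ad hoc majority filter within an explicit probabilistic model.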
An example is the approach given in [60]. There,
multi-temporal images are integrated on the pixel level,
actually using surface elements derived from
rectification. The model explicitly applies the temporal relations
using probabilities of crop rotation, i.e. about the temporal
relation of the local object properties between the fields
the surface elements belong to. Therefore explicit refer-
NGT GEODESIA 93 8