ence to the semantics of the classified elements is made
via the appearance model of the fields. The increase in
classification accuracy was significant.
In all cases a registration with respect to a common
geometric frame is necessary. This requires some type of
resampling both for image and for map data. Common
practice is to use nearest neighbour resampling in order
to leave the measured data unchanged.
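
The common practice just described can be sketched as follows. This is only a minimal illustration, not the procedure of any particular system: the function name `resample_nearest` and the callback `to_source`, which stands for whatever registration transform is in use, are assumptions made for the example.

```python
import math


def resample_nearest(grid, rows_out, cols_out, to_source):
    """Resample a 2-D grid into a new geometric frame (nearest neighbour).

    `to_source` (hypothetical) maps an output cell (row, col) to real-valued
    coordinates in the source grid.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    out = []
    for r in range(rows_out):
        row = []
        for c in range(cols_out):
            src_r, src_c = to_source(r, c)
            # Round to the closest source cell and clamp to the grid.
            i = min(max(int(math.floor(src_r + 0.5)), 0), n_rows - 1)
            j = min(max(int(math.floor(src_c + 0.5)), 0), n_cols - 1)
            # The value is copied, never blended with its neighbours.
            row.append(grid[i][j])
        out.append(row)
    return out


# Example: double the sampling density of a 2 x 2 grid (pixel-centre mapping).
source = [[10, 20], [30, 40]]
finer = resample_nearest(source, 4, 4,
                         lambda r, c: ((r + 0.5) / 2 - 0.5, (c + 0.5) / 2 - 0.5))
```

Every output value is one of the original measurements (or, for map data, one of the original class labels), which is the property the common practice aims at.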
A closer analysis reveals this approach to be inappropriate;
moreover, it uncovers the deficiencies of the pixel-based
classification techniques:
- the underlying model is image based, not object based,
  and thus not natural. Modelling line processes in the
  HMRF approach relates to the crack edges, which are only
  crude approximations of the edges of the objects;
- it is by no means clear how to model the resampled
  intensities derived from sensors having different
  resolutions, as the original intensities are approximately
  integrations of the object-centred reflectance function
  with the point spread function of the sensor. Finally,
  there is no way to solve the "mixed pixels" problem
  without explicit reference to some true or estimated
  object boundaries;
- pixel-based classification schemes do not allow a larger
  context to be introduced. Though the HMRF approach is
  theoretically able to model dependencies of classes over
  a larger range, the modelling is done implicitly. E.g. the
  straightness of boundaries cannot be expressed. Even
  larger contexts such as hierarchical (containment) or
  topological (road) structures cannot be handled at all.
  With increasing resolution (10 m pixel size) the amount
  of information contained in geometric structures
  increases, which cannot be captured by a purely
  pixel-based approach;
- the fusion of images taken at different times has to
  track the spectral responses of each pixel over time.
  This introduces an additional instability in the
  classification, as the causal processes (e.g. growth) are
  usually correlated between neighbouring pixels. Though
  this may also be modelled using Markov random fields,
  the computational effort is high if a certain rigour in
  the modelling is aimed at.
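
The second point, on intensities as integrations of the reflectance function with the point spread function, can be made concrete with a small one-dimensional sketch. It rests on strong simplifying assumptions that are not taken from the text: a box-shaped point spread function exactly one pixel wide, and a scene consisting of two homogeneous objects separated by a single boundary. The function name and parameters are hypothetical.

```python
def pixel_intensities(boundary, refl_left, refl_right, pixel_size, n_pixels):
    """Integrate a two-class step reflectance over box-shaped pixels (1-D)."""
    values = []
    for k in range(n_pixels):
        lo, hi = k * pixel_size, (k + 1) * pixel_size
        # Fraction of the pixel footprint lying left of the object boundary.
        frac_left = (min(max(boundary, lo), hi) - lo) / pixel_size
        values.append(frac_left * refl_left + (1.0 - frac_left) * refl_right)
    return values


# Same scene (boundary at x = 25) seen by two sensors of different resolution.
fine = pixel_intensities(25.0, 0.2, 0.8, pixel_size=10.0, n_pixels=4)
coarse = pixel_intensities(25.0, 0.2, 0.8, pixel_size=30.0, n_pixels=2)
```

In both cases the pixel straddling the boundary receives a value that belongs to neither object, and the value differs between the two sensors. Recovering the mixing fraction requires knowing where the boundary lies, which is the point made above.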
These criticisms of course only hold for automatic
interpretation schemes. The methods may very well be used for
obtaining approximate classification results, or for
supporting manual interpretation, e.g. [2, 35 and 90], or in
case the pixel resolution compared to the size of the objects
is sufficient and the objects are highly homogeneous.
The discussion, however, was meant to show that pixel-based
methods have severe disadvantages in case information
from different sensors or maps has to be merged.
One way to avoid these deficiencies is to base the
interpretation on larger, aggregated units.
Feature based information fusion
Symbolic descriptions of the image may have any level of
abstraction and then may always be made equivalent to
the aggregation level of the object model, provided a high
enough resolution of the images is available. We may
distinguish at least three levels of abstraction:
- the lowest representation level is characterized by lists
  of basic elements, namely attributed points, edges
  and/or regions;
- the medium representation level in addition contains
  attributed relations between the basic elements;
- the highest representation level consists of further
  aggregated basic elements which may result from a
  grouping process.
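
The three levels can be pictured as a nested data structure. The following is only a sketch; all class and field names are assumptions chosen for illustration and do not come from a particular system.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """Lowest level: an attributed point, edge or region."""
    kind: str                      # "point", "edge" or "region"
    geometry: list                 # list of (x, y) coordinates
    attributes: dict = field(default_factory=dict)


@dataclass
class Relation:
    """Medium level: an attributed relation between two basic elements."""
    name: str                      # e.g. "near_to"
    source: int                    # index into the feature list
    target: int                    # index into the feature list
    attributes: dict = field(default_factory=dict)


@dataclass
class Group:
    """Highest level: basic elements aggregated by a grouping process."""
    members: list                  # indices into the feature list
    attributes: dict = field(default_factory=dict)


@dataclass
class SymbolicDescription:
    features: list = field(default_factory=list)
    relations: list = field(default_factory=list)
    groups: list = field(default_factory=list)

    def level(self):
        """Name the highest representation level actually populated."""
        if self.groups:
            return "highest"
        if self.relations:
            return "medium"
        return "lowest"


desc = SymbolicDescription()
desc.features.append(Feature("edge", [(0, 0), (10, 0)], {"contrast": 0.4}))
desc.features.append(Feature("point", [(10, 0)], {"type": "L-junction"}))
level_before = desc.level()
desc.relations.append(Relation("near_to", 0, 1, {"distance": 0.0}))
level_after = desc.level()
```

Note that none of the classes carries a semantic label: the description stays uninterpreted at every level.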
In all cases we do not assume the semantic aspect to play
the central role, i.e. no interpretation has taken place.
However, the selection of the criteria for extracting
features, their relations and possibly their grouping may very
well be guided or at least motivated by the image model,
which itself contains information about both the structure
and the meaning of the different components of an
image. Thus feature extraction may be performed bottom-up
on the complete data set or top-down, depending on a
request of the analysis system or the image analyst. The
main task of fusion on the feature level is again the
registration and rectification of the image data.
List of points, lines and regions
The features easiest to extract in remote sensing images
are points, lines or regions and their attributes such as
type of point (T-, Y-, or L-junction, blob) or type of line
(edge, dark/light line, strength, contrast) or type of region
(round, rectangular, polygon shaped) etc. Even these
low-level features have the advantage of being invariant to a
great variety of transformations, i.e. observational
situations. In particular, their geometric properties and a
great number of their attributes remain invariant to lighting
and sensor conditions: e.g. edges (between fields), lines
(representing roads) or homogeneous regions (caused by
lakes) appear very similar over time and may even be
linked with map data [31]. This enables at least a
geometric link between different images and possibly map
data.
In case the data are already geocoded, a link with map
data appears promising, as the aggregation, especially
of lines and regions, may be guided by map information.
In case no or only poor map data are available,
geocoded features of different image sources may be used
for grouping, thus supporting each other to arrive at a
higher-level structured description, which is richer than
the ones derived from the individual images. The
drawback of these features is obvious: stable features only
cover a small percentage of the scenes to be analyzed.
Most features to be extracted are unstable in existence,
geometry, and attribute. Except for very clear lines
(representing line-type objects or boundaries of regions),
a fusion of different information sources is extremely
difficult in case no relations between these features are
used to increase mutual consistency with the image
model.
An example of the fusion of image data using these
low-level features is the registration based on straight line
segments for linking image and map data [80] for
orientation.
Relational descriptions
Relations between the low level features increase the
strength of the representation. The same relations as in
GIS-modelling could be used [65] in case a complete
representation of the images can be achieved. As this
may not be feasible without knowledge about the object
classes shown in the image, also weaker relations (near
to, possibly connected etc.) may be used.
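
How such weaker relations might be derived from extracted line segments is sketched below. The distance measure (smallest endpoint-to-endpoint gap) and both thresholds are assumptions made for the example; a real system would likely also use orientation, overlap and attribute similarity.

```python
import math
from itertools import combinations


def endpoint_gap(seg_a, seg_b):
    """Smallest distance between any endpoint of seg_a and any of seg_b."""
    return min(math.dist(p, q) for p in seg_a for q in seg_b)


def weak_relations(segments, near_dist, connect_gap):
    """Return (relation, i, j) triples for all pairs of segments.

    A pair is 'possibly_connected' if two endpoints nearly touch, and
    'near_to' if they are merely close. Both thresholds are hypothetical.
    """
    relations = []
    for (i, a), (j, b) in combinations(enumerate(segments), 2):
        gap = endpoint_gap(a, b)
        if gap <= connect_gap:
            relations.append(("possibly_connected", i, j))
        elif gap <= near_dist:
            relations.append(("near_to", i, j))
    return relations


segments = [
    ((0.0, 0.0), (10.0, 0.0)),
    ((10.5, 0.2), (20.0, 0.0)),       # almost continues segment 0
    ((10.0, 5.0), (10.0, 15.0)),      # close to both, but not touching
    ((100.0, 100.0), (110.0, 100.0)), # far away from everything
]
relations = weak_relations(segments, near_dist=6.0, connect_gap=1.0)
```

Relations of this kind need no knowledge of the object classes; they only make the description more constrained and therefore easier to match against another image or a map.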
The fusion of different images may rely on the similarity
of the relational descriptions in order to arrive at a more
complete relational description. There seem to be only
few examples of using relational descriptions without
NGT GEODESIA 93 - 8