International Journal of Robotics and Automation (IJRA)
Vol. 9, No. 2, June 2020, pp. 63–72
ISSN: 2089-4856, DOI: 10.11591/ijra.v9i2.pp63-72
General concepts of multi-sensor data-fusion based SLAM

Jan Klečka, Karel Horák, Ondřej Boštík

Department of Control and Instrumentation, Brno University of Technology, Czech Republic
Article Info

Article history:
Received Sep 30, 2019
Revised Oct 06, 2019
Accepted Feb 18, 2020

Keywords:
Data fusion
Localization
Mapping
Partially collective mapping
Simultaneous localization and mapping (SLAM)
ABSTRACT

This paper approaches the problem of Simultaneous Localization and Mapping (SLAM) algorithms focused specifically on concurrent processing of data from a heterogeneous set of sensors. Sensors are considered to differ in the physical quantity they measure, and so the problem of effective data-fusion is discussed. A special extension of the standard probabilistic approach to SLAM algorithms is presented. This extension is composed of two parts. First, a general perspective on multiple-sensor based SLAM is presented, and then three archetypical special cases are discussed. One archetype, provisionally designated as "partially collective mapping", has also been analyzed from a practical perspective because it implies promising options for implicit map-level data-fusion.

This is an open access article under the CC BY-SA license.
Corresponding Author:
Jan Klečka,
Department of Control and Instrumentation,
Brno University of Technology,
Technická 12, Brno, Czech Republic
Email: klecka@feec.vutbr.cz
1. INTRODUCTION

After more than three decades of research, Simultaneous Localization and Mapping (SLAM) algorithms still provide a variety of open topics for further development, as can be seen e.g. in the survey by C. Cadena et al. [1] or in the critique by Huang et al. [2]. These algorithms are designed to continuously process given observations of the surroundings in order to provide the observer's current position (or sometimes the whole trajectory) and a map of the observed environment. Such information is unsubstitutable feedback for practically any navigation task, e.g. trajectory planning or complex movement execution. Many application fields can be found for SLAM algorithms. We chose to underline only three which, as we feel, are nowadays widely discussed: navigation of autonomous cars as discussed by Bresson et al. [3], various Industry 4.0 tasks, e.g. the warehouse inventory check presented by Beul [4], or augmented reality tasks as shown by Klein and Murray [5].
For several years we have been dealing with SLAM based on various kinds of sensor data-fusion, and this paper aims to report some general findings we have made. Our methodology has originally been a mainly inductive process: we began with the concept of building a map using simple geometrical entities to approximate, in a piece-wise manner, the surfaces of the solids that create the mapped environment, and during the development we iteratively generalized this specific concept until it fit the standard probabilistic SLAM algorithm theory. However, the following descriptions are conducted in a more comprehensible deductive manner, where we start with the general and work our way to the specific. We have been trying to use common notation customs, although for maximal clarity of the following descriptions we quickly state the used rules.
Matrix and vector symbols are bold, e.g. A, x, where uppercase is used for matrices and lowercase for vectors. Bold uppercase symbols are also used for sets, which have a lower index showing the range of their cardinality, e.g. Z_{0:N} = {z_0, z_1, ..., z_N}. Scalar symbols are italic, e.g. N. Subscripts are used to express a specific element of a larger collection, e.g. z_n is a realization of z at time t = n. Superscripts in square brackets symbolize a specific modality, e.g. z^[k] is z associated with a k-type sensor. For functions a normal font is used, e.g. h(·) is a function named h.

Journal homepage: http://ijra.iaescore.com/index.php/IJRA
2. RELATED WORKS

As we already indicated in the introduction, besides the concept of data-fusion based SLAM we are also dealing with the concept of SLAM using a map representation in the form of a collection of geometric entities, so we split this section into respective subsections.
2.1. Data-fusion in context of SLAM

A substantial amount of the papers that mention the keyword fusion in the context of SLAM algorithms deal with processing observations from a single RGB-D camera (or often even specifically the Microsoft Kinect). Examples of such works are: the KinectFusion algorithm presented by Newcombe et al. [6], the algorithm Fusion++ by McCormac et al. [7], or ElasticFusion by Whelan et al. [8, 9].

Several teams have also reported on SLAM based on observations from multiple sensors. For example, Burian et al. [10] dealt with processing data from a custom-made sensory head equipped with two CCD cameras, two thermo-cameras and a range finder - data from the rangefinder are used as a depth reference for the camera images, and can therefore be enhanced by using mathematical models of the individual cameras. Fang et al. presented a SLAM-capable system with a CCD camera and sonar [11] which improves reliability by utilizing feature-level data-fusion.

Let's notice that in the algorithms listed so far the data-fusion is always conducted prior to the SLAM iteration, so the SLAM algorithms then process already fused data. Notice moreover that the various modalities are typically in a conceptually nonequivalent status: the depth perception modality is typically in an unsubstitutable position, and other modalities (like color) are used to increase the robustness of the whole solution or just for map presentation purposes.
2.2. Map as a set of non-point geometrical entities

Some papers can be found that present solutions to SLAM problems using a representation of the map in the form of a collection of geometrical entities. For example, lidar-based 2D SLAM that represents the environment by a set of lines is shown by Garulli et al. in [12] and also by Choi et al. [13]. An example of lidar-based 3D SLAM which uses plane features is presented by Ulas and Temeltas [14]. These concepts aren't specific only to lidar: Zhou et al. [15] and Uehara et al. [16] report vision-based SLAM algorithms that utilize line features. Yang et al. [17] show that utilizing planes can improve the robustness of monocular SLAM against standard, strictly point-based approaches.

There can also be found reports that approach only partial problems like segmentation - for example, an algorithm for the approximation of a 2D point cloud by a collection of lines by Jelinek et al. [18], or the detection of planes in a 3D point-cloud by Hulik et al. [19] and also by Pathak et al. [20].
3. PROBABILISTIC APPROACH

In this section, the mathematical background of fusion-based algorithms is presented. We present the problem from a probabilistic perspective to urge the maximal generality of the given formulas, even though some concretization has been made: we assume a strictly static environment, and from the perspective of the estimated trajectory we provide solutions to two variants - the "online" SLAM that aims only to estimate the most recent pose, and the "full" SLAM which provides a way to estimate the whole trajectory.
3.1. Standard theory

The presented description is equivalent to those given in standard SLAM-oriented publications, e.g. the survey by Durrant-Whyte et al. [21] or the book Probabilistic Robotics by Thrun et al. [22].
Let's have some observer which moves in an environment given by parameterization m, and during its movement the observer repeatedly conducts observations z. The observer's relation to this environment, e.g. its position and orientation, is given by the state x.
Observations describe the observer's surroundings and are degraded by noise. Therefore they can be defined by a conditional probability distribution that is usually called the observation model:

p(z_n | x_n, m)    (1)
Because of the nature of the observer entity, the state vector will most probably be subjected to some dynamic that bounds its change between observations. This link may be dependent on some observable quantity u and is also stochastic, so it can be defined by a conditional probability distribution called the motion model:

p(x_n | x_{n-1}, u_n)    (2)
Because of the stochastic nature of both the observation and the motion model, the SLAM problem lies, from the general point of view, in defining a probability distribution of a pose and a map conditioned by the conducted observations:

p(x_N, m | Z_{0:N}, U_{1:N})    (3)
This distribution also has to represent our prior belief about the state and map distribution. An analytic solution of this problem can be found using the Bayes formula as:

p(x_N, m | Z_{0:N}, U_{1:N}) = η p(z_N | x_N, m) p(x_N, m | Z_{0:N-1}, U_{1:N})    (4)
where η is an arbitrary normalization constant and the second term can be defined by propagating the previous belief into the current time using the motion model:

p(x_N, m | Z_{0:N-1}, U_{1:N}) = ∫ p(x_{N-1}, m | Z_{0:N-1}, U_{1:N-1}) p(x_N | x_{N-1}, u_N) dx_{N-1}    (5)
Usually, the realization of equation (4) is called the update step and the realization of equation (5) is called the prediction step. This recurrent form of the solution is standardly referred to as "online" SLAM and can be fairly straightforwardly seen as applicable to a real-time process.
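The update/prediction recursion can be sketched numerically. The following is a minimal illustration only, not the paper's method: a hypothetical 1-D grid world where the belief over a discretized state is propagated by a motion kernel (the prediction step, eq. (5)) and reweighted by an observation likelihood (the update step, eq. (4)). All kernels and values are invented for the example.

```python
import numpy as np

def predict(belief, motion_kernel):
    # Prediction step, eq. (5): propagate the previous belief through a
    # discretized motion model p(x_n | x_{n-1}, u_n) (1-D convolution).
    return np.convolve(belief, motion_kernel, mode="same")

def update(belief, likelihood):
    # Update step, eq. (4): weight by the observation model
    # p(z_n | x_n, m) and renormalize (the constant eta).
    posterior = belief * likelihood
    return posterior / posterior.sum()

belief = np.full(5, 0.2)                           # uniform prior p(x_0)
motion = np.array([0.1, 0.8, 0.1])                 # mostly stay in place
likelihood = np.array([0.1, 0.1, 0.6, 0.1, 0.1])   # sensor favors cell 2

belief = update(predict(belief, motion), likelihood)
assert belief.argmax() == 2
```

One iteration of the recursion concentrates the belief on the cell favored by the observation while keeping it a proper probability distribution.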
The second frequently utilized form of the SLAM solution is the so-called "full" SLAM, which is non-recurrent and aims at the description of the whole trajectory distribution:

p(X_{0:N}, m | Z_{0:N}, U_{1:N}) = η [ ∏_{n=0}^{N} p(z_n | x_n, m) ] [ ∏_{n=1}^{N} p(x_n | x_{n-1}, u_n) ] p(x_0)    (6)
3.2. General multi-sensor based SLAM

Now, let's consider that the set of observations is composed of subsets, and that each subset contains only observations from one particular sensor modality:

Z_{0:N} = { Z^[1]_{0_1:N_1}, Z^[2]_{0_2:N_2}, ..., Z^[K]_{0_K:N_K} }    (7)
where any time-index range 0_k:N_k ⊆ 0:N. Then each modality has its own unique particular observation model:

p(z^[k]_n | x_n, m)    (8)
The motion model stays conceptually unchanged; we can assume the same form as in the general case. These eventualities do not change the above-mentioned equations dramatically. The only change lies in the substitution of the general observation models for the particular ones. Specifically, the update step of the online SLAM will look like this:

p(x_N, m | Z_{0:N}, U_{1:N}) = η p(z^[k]_N | x_N, m) p(x_N, m | Z_{0:N-1}, U_{1:N})    (9)
and the probability distribution of the full variant will be in the following form:

p(X_{0:N}, m | Z_{0:N}, U_{1:N}) = η [ ∏_{n=0}^{N} p(z^[k]_n | x_n, m) ] [ ∏_{n=1}^{N} p(x_n | x_{n-1}, u_n) ] p(x_0)    (10)
It may look like no progress at all; however, that is because we did not take into account that with additional modalities more things will be changing than just the observation model.
Figure 1. Conditionally independent algorithms (graphical model: states x_0, ..., x_3 linked by the motion model, with each observation z_n^[k] connected only to its own modality map m^[1] or m^[2])
3.3. Special cases of multi-sensor based SLAM

In this section, we specify the above-mentioned formulas by assuming specific structures derived from the mutual relations of different modality observations. Specifically, we analyze three cases that we consider to be archetypes from which real situations can be composed.
3.3.1. Conditionally independent algorithms

Let's consider that the given modalities (or at least the used style of their abstraction) do not allow forming any cross-modality quantity that could represent common map elements, and that, in addition, their observations are asynchronous in the time of their capture - so each one belongs to a different state of the observer (see Figure 1). That leads to a separation of the map parameterization m into a set of sensor-specific representations:

m = M^[1:K] = { m^[1], m^[2], ..., m^[K] }    (11)
where each particular map m^[k] is independent of any observation z^[l]:

p(z^[k]_n | x_n, m^[l]) = p(z^[k]_n)    ∀ k ≠ l    (12)
If we apply these rules to the recurrent SLAM equations, we can in this case alter them into a form where the update step is separable in terms of modality. So let's notice that the only cross-modality link is in this case established by the motion model. The weaker the motion model, the closer the uni-modal parts are to mutual independency, and in an extreme case, assuming that the motion model does not exist at all, this archetype leads to completely independent parallel SLAM algorithms. Generally, we can state that the particular maps can be considered conditionally independent given the state. Data-fusion is in this case scheduled to postprocessing, with no benefit to runtime.
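As a toy illustration of this archetype (not taken from the paper), the sketch below keeps one scalar map element per modality and updates only the map addressed by each incoming observation. The shared state is omitted entirely, which corresponds to the extreme case of a missing motion model, where the per-modality estimates are fully independent. All names and numbers are hypothetical.

```python
def update_map(mean, var, z, z_var):
    # Scalar Bayesian fusion of one map element with one observation of it.
    k = var / (var + z_var)
    return mean + k * (z - mean), (1.0 - k) * var

# modality -> (mean, variance) of its private map m[k]; the prior is vague.
maps = {1: (0.0, 10.0), 2: (0.0, 10.0)}

# Each observation touches only its own modality's map -- the update step
# is separable in terms of modality, as in the conditionally independent case.
for modality, z in [(1, 2.0), (2, 5.0), (1, 2.2)]:
    maps[modality] = update_map(*maps[modality], z, z_var=1.0)

# Modality 1 received two observations, so its map is more certain.
assert maps[1][1] < maps[2][1]
```

Fusing the two private maps afterwards would be the postprocessing step mentioned above.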
3.3.2. Super-observation

The second archetype is based on the assumption that the acquisition of the observations is conducted in a synchronized manner. So even though the observer uses multiple sensors, their capturing times are synchronized, and so all particular modality observations always belong to one single state realization x (see Figure 2). Under these assumptions, we can define the observation set as a collection of subsets that contain isochronous observations:

Z_{0:N} = { Z^[1:K]_0, Z^[1:K]_1, ..., Z^[1:K]_N }    (13)
Because, from an analytical perspective, it is irrelevant whether the observation is a vector or a set, we can define the composed observation model and then apply the single-observation theory:

p(Z^[1:K]_n | x_n, m) = ∏_{k=1}^{K} p(z^[k]_n | x_n, m)    (14)
Let's notice that data-fusion, in this case, takes place in a preprocessing step.
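Equation (14) can be illustrated directly: for synchronized capture, the joint likelihood of one time step is the product of the per-modality likelihoods. The sketch below uses hypothetical Gaussian observation models; the modality names, σ, and values are invented for the example.

```python
import math

def gauss(z, mean, sigma):
    # 1-D Gaussian density used as a per-modality observation model.
    return math.exp(-0.5 * ((z - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def super_observation_likelihood(obs, predicted, sigma=0.5):
    # Eq. (14): the joint likelihood of a synchronized observation set
    # factors into a product over the K modalities.
    p = 1.0
    for k, z in obs.items():
        p *= gauss(z, predicted[k], sigma)
    return p

obs = {"lidar": 2.1, "camera": 1.9}
good = super_observation_likelihood(obs, {"lidar": 2.0, "camera": 2.0})
bad = super_observation_likelihood(obs, {"lidar": 4.0, "camera": 0.0})
assert good > bad  # the hypothesis consistent with both sensors wins
```

Because the product is formed before the SLAM update, the fusion indeed happens in preprocessing, as stated above.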
Figure 2. Super-observation (graphical model: synchronized observations z_0^[1], z_0^[2] and z_1^[1], z_1^[2], each pair attached to a single state x_0 or x_1, sharing one map m)
Figure 3. Partially collective mapping (graphical model: states x_0, ..., x_3 with observations z_n^[k] linked through modality remainders r^[1], r^[2] to a common map m^[com])
3.3.3. Partially collective mapping

The third and final archetype we present in this section is unique in its map composition: at least part of the map representation is common to all available modalities, and so all sensors participate in its estimation (see Figure 3).
Let's assume that the map representation can be defined as the following collection:

m = { m^[com], r^[1], r^[2], ..., r^[K] }    (15)
where m^[com] is a common part of the map (or just a common map) and all r^[k] are modality-specific remainder vectors. The combination of the common map m^[com] and a particular remainder vector r^[k] can be interpreted as a particular map m^[k]. So the common map m^[com] is dependent on every observation, and the remainder vectors r^[k] are mutually conditionally independent:
p(m^[com] | x_{0:N}, Z^[k]_{0_k:N_k}) ≠ p(m^[com] | x_{0:N})    ∀ k ∈ 1:K    (16)
Data-fusion is in this case implicitly embedded into the SLAM algorithm.
4. PRACTICAL ASPECT OF COMMON MAP

By analysis of the above-mentioned archetypes, we concluded that the concept of the common map represents a promising way for the development of effective multi-sensor data-based SLAM algorithms, because it implicitly enforces a high level of data fusion. However, the probabilistic approach to this concept is highly abstract, and that is why we devoted this section to more specific and practical aspects of this concept.

Two subsections follow. In the first, we deal with a specific way to practically implement the concept of the common map, namely composing it as the parameters of a piece-wise function that represents the surface of the observed environment. In the second subsection, we follow up the previous findings into a set of requirements on the observation functions that lead to a categorization of real sensors according to their utilizability in the context of a geometrical-entities based collective map.
4.1. Geometrical-entities based collective map

A continuous function that approximates the surface of obstacles is, in our opinion, an advantageous thing to utilize for the common map definition, because standard SLAM-capable sensors always observe this quantity in some way. For example, there is a very low probability that data from a lidar, a visible spectrum (vis) camera and a thermal (IR) camera would share a substantial amount of feature points in terms of belonging to the same spatial points. However, what is highly probable is that these observations would describe the same planes and curves that form the environment surfaces.
Let's have an analytical formula for an observation model, where the observation is a vector that, in a spatially distinguished point-wise manner, describes some quantity exhibited by points of the surrounding environment:

z^[k]_n = h^[k](x_n, m^[k], v^[k]_n)    (17)
where h^[k] is the observation function and v^[k]_n is a noise vector that models the stochasticity of the process. If we knew that some subsets of the observation elements belong to a specific geometrical entity, we could generally express this knowledge by some equality constraints:

G_i(m) = 0    (18)
where G_i is a function that defines the constraints specific to the i-th entity. For example, the following constraint bounds specific points to lie on the same line/plane:

G_i(m) = [M_i  1] θ_i = 0    (19)
where θ_i is a vector of coefficients that defines the line/plane and M_i is a matrix whose rows are the spatial points that belong to the i-th entity. The parameters that define the specific form of the constraint equation (in our example θ_i) are the elements that form the common map m^[com].
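As a small numerical check of the constraint in equation (19) (under our reading that the rows of M_i are points extended by a homogeneous 1), the sketch below builds the stacked matrix from three collinear 2-D points, recovers the line coefficients θ_i as the null-space direction via SVD, and verifies that the residual G_i(m) vanishes. The points are invented for the example.

```python
import numpy as np

# Three collinear 2-D map points on y = 2x + 1 (invented for the example).
points = np.array([[0.0, 1.0], [1.0, 3.0], [2.0, 5.0]])
M_hom = np.hstack([points, np.ones((3, 1))])   # [M_i  1]

# theta_i = (a, b, c) of the line a*x + b*y + c = 0 is the null-space
# direction of [M_i  1]: the right singular vector belonging to the
# smallest singular value.
theta = np.linalg.svd(M_hom)[2][-1]

residual = M_hom @ theta                       # G_i(m) of eq. (19)
assert np.allclose(residual, 0.0, atol=1e-9)
```

The same null-space construction extends to planes in 3D by appending a homogeneous 1 to 3-D points.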
For practical applications, we also define a projection function g that is used in the optimization process for error evaluation:

m^[k] = g^[k]_i(m^[com], r^[k])    (20)
This function has to be, from a general perspective, modality-specific; however, usually it would be very similar across all modalities. The consequence of parametrizing the map in this way is that the dimensionality of the map is greatly reduced compared to the non-constrained case, and this would very likely have positive effects on the optimization process, as shown in [23, 24].
The last practical aspect we discuss in this subsection is the obvious problem that, in real-world scenarios, the affiliation of point elements to specific geometrical entities is a priori unknown. Dividing single observations into parts where each describes a common entity is generally a segmentation problem, and the probabilistic way to approach it is by statistical hypothesis testing:

p( G_i(m) = 0 | Z_{0:N} ) > α    (21)
where α is the significance level. This can be practically conducted by defining a statistic that evaluates whether the reprojection error can be caused by observation noise:

t_i = d( h(X_{0:N}, m, v = 0), Z_{0:N} )    (22)
and comparing it against a given critical value: t_i < t_crit. Anyway, it is obvious that the number of testable hypotheses is going to be significantly higher than computational resources allow us to test, so a necessary part of the segmentation algorithm also has to be a method which generates the hypotheses to test. An experiment showing a practical example of such an algorithm can be found in [25].
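A minimal sketch of this test, assuming independent Gaussian observation noise with known σ: the statistic t_i of equation (22) is taken as the normalized sum of squared residuals, which is chi-square distributed under the entity hypothesis, and compared against a critical value. The residual values, σ, and the critical value (roughly the 5% chi-square quantile for 20 degrees of freedom) are illustrative assumptions.

```python
import numpy as np

def plane_hypothesis_test(residuals, sigma, t_crit):
    # t_i, eq. (22): normalized squared reprojection residuals; under the
    # hypothesis G_i(m) = 0 this is chi-square with len(residuals) dof.
    t_i = np.sum((np.asarray(residuals) / sigma) ** 2)
    return t_i < t_crit  # accept the geometrical-entity hypothesis

# Residuals consistent with sensor noise (sigma = 0.01) versus residuals
# with a systematic 0.1 offset; 31.4 ~ 5% chi-square quantile for 20 dof.
noise_only = np.full(20, 0.005)
off_surface = np.full(20, 0.1)
assert plane_hypothesis_test(noise_only, 0.01, 31.4)
assert not plane_hypothesis_test(off_surface, 0.01, 31.4)
```

Points whose residuals pass the test are kept in the entity; the rest are left for other hypotheses generated by the segmentation method.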
4.2. Sensors

In the perspective of the above-mentioned theory, let's analyze what properties the observation function has to meet to be compliant with it, i.e. usable with it. Just for the formalism, we start with the obvious. Firstly, the mathematical model of the sensor has to be consistent with reality. Secondly, any sensor used as the primary source of data for the SLAM algorithm has to measure some spatially dependent quantity that is suitable to be mapped. This leads to a model's ambiguity when the state or the map is unknown; however, combined knowledge about both state and map forms an information gain:

p(z | x) = p(z | m) = p(z)    (23)

p(z | x, m) ≠ p(z)    (24)
From the perspective of multiple-sensor based SLAM, while assuming limited resources, it is also reasonable to consider whether all sensors will have a perceptible contribution to the overall result. The form of this contribution is, however, in this context highly unclear. Generally, it can be viewed as any criterion that evaluates the result; however, we usually think about it as a noticeable improvement of the common map variance:
Var[ p(m^[com] | Z^[1:K]) ] < Var[ p(m^[com] | Z^[1:K \ k]) ]    (25)
(25)
where
used
probability
distrib
utions
are
mar
ginalized
distrib
utions
p
(
m
[
com
]
j
Z
)
=
Z
p
(
X
0:
N
;
m
[
com
]
;
r
[
k
]
j
Z
)
d
X
0:
N
;
r
[
k
]
(26)
where Ω represents the domain of the marginalized quantities. Such a criterion is, however, practically impossible to compute a priori, and the only real possibility is to evaluate it experimentally. We used this condition to classify the usage of various sensor types; the overview is in Table 1, and detailed descriptions follow.
Table 1. Sensor type categorization

Category             Example         Usage
Low DOF              Thermometer     Mapping
Inertial             Accelerometer   Motion model
Modality profile     Camera          SLAM
Local structure      Lidar           SLAM
Link to ref. frame   GPS             Position reference
4.2.1. Low degrees-of-freedom

To this category belong sensors which quite clearly cannot satisfy the perceptible-contribution condition, because the number of degrees-of-freedom (DOF) of their observation range does not allow unambiguous enough localization in the observer's state-space. Typical members of this group are scalar sensors of local environmental quantities, i.e. a thermometer, a light-intensity sensor, etc., but also a linear lidar can be listed here, assuming that the observer is moving in 3D space with 6 DOF. Sensors from this category can be used for unique modality map creation (assuming that pose data is provided from another source); however, their direct contribution to SLAM algorithms can be considered to be none (with the exception of some multi-modal localization scenarios where the correct mode can be chosen only by a unique environmental quantity).
4.2.2. Inertial

This is a category of sensors that provide data bringing links between subsequent observer states, i.e. data that forms the motion model. It is clear that these sensors do not fulfill the environmental-quantity observation condition - they have no link to the environment structure. This group consists of various encoders, accelerometers, gyroscopes, etc. These are the typical support sensors that have no direct way to contribute to the common map estimation. For historical reasons, observations from these sensors are marked with the symbol u rather than z.
4.2.3. Modality profile

Sensors from this category are generally sensors that observe the properties of some ambient signal generated by the environment. From a practical perspective, these are strictly various types of cameras that measure the directional characteristics of the intensity of electromagnetic radiation on a specific spectral interval (light). By assuming that the individual parts of the obstacle surface emit, e.g. reflect, the light in such a way that it is possible to identify the same spatial points in multiple images, we can use photogrammetry to reconstruct the viewed structure. A characteristic property is that standard photogrammetry techniques applied to single-camera data can provide a reconstruction invariant only up to an unknown similarity transformation. So the scale is unknown and, if needed, has to be fixed by implementing additional data into the process. Sensors of this category can, under the right conditions, be used for the realization of SLAM, as shown for example by [26] or by [27], and can also be an addition to a multi-sensor SLAM system.
4.2.4. Local structure

This category contains the most typical sensors used in the context of SLAM algorithms. Observations provided by these sensors represent the profile of the surrounding environment from their perspective. Typical members of this group are lidars, rangefinders, and RGB-D cameras, and they have the potential to be a contribution in the sense of common map estimation.
4.2.5. Link to reference frame

As the designation probably suggests, sensors of the last group provide direct information about position in some reference frame. These are sensors like a global navigation satellite system (GNSS), local positioning systems (LPS) surveyed for example by [28], any similar beacon-based system, or even a compass. From a formal perspective, these sensors do not observe any environmental property, so primarily they cannot contribute to the estimation of the common map, although they have a large potential to contribute indirectly, as a link to a reference frame can eliminate any drift in pose estimation. The main problem is that these sensors may work poorly in urban areas or indoors (GNSS), or they require some special infrastructure (LPS), and so these data are rarely available. Let's notice that a substantial part of the motivation for SLAM algorithms lies in the fact that pose data are directly unavailable, or at least unavailable in sufficient quality.
5. CONCLUSION

We presented our theoretical analysis of fundamental aspects of the multiple-sensor data-fusion based SLAM problem from a probabilistic-approach perspective. We concluded that the most promising way to approach it generally is by utilizing the concept of a common map, as shown by the presented archetype of partially collective mapping. As we see it, the typical nowadays-published SLAM algorithm based on data-fusion is similar to the super-observation archetype, but these concepts are, in our opinion, suboptimal in terms of robustness. Every sensor has some limitations that determine the situations where it can be used. The super-observation concept will safely work in situations given by the intersection of all sensors' application fields.
On the contrary, the partially collective mapping archetype can work in situations given by the union of all sensors' application fields.

From a practical perspective, we discussed options for common map implementation. As the mapped quantity, we proposed to utilize the surface of obstacles, describing it as a piece-wise function composed of simple geometrical entities. After that, we identified three major problems that have to be solved before implementation. Firstly, the mathematical model of the geometrical entities must be defined; that includes defining the constraint equations, the specific form of the common map vector and the sensor-specific remainder vectors, and the projection function. Secondly, some statistic posing as a segmentation criterion must be defined. And lastly, a strategy for selecting regions to test for the geometrical-entity hypothesis must be defined. We have confidence in the proposed method, and our future work will be aimed at the creation of a real implementation and at conducting experiments comparing its quality on publicly available datasets.
ACKNOWLEDGEMENT

The completion of this paper was made possible by the grant No. FEKT-S-17-4234 - "Industry 4.0 in automation and cybernetics" financially supported by the Internal Science Fund of Brno University of Technology, and by the grant No. TN01000024 by the National Competence Center - Cybernetics and Artificial Intelligence.
REFERENCES

[1] C. Cadena, et al., "Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age," IEEE Transactions on Robotics, vol. 32, no. 6, pp. 1309–1332, Dec 2016. [Online]. Available: http://ieeexplore.ieee.org/document/7747236/
[2] S. Huang and G. Dissanayake, "A critique of current developments in simultaneous localization and mapping," International Journal of Advanced Robotic Systems, vol. 13, no. 5, pp. 1–13, 2016.
[3] G. Bresson, et al., "Simultaneous Localization and Mapping: A Survey of Current Trends in Autonomous Driving," IEEE Transactions on Intelligent Vehicles, vol. 2, no. 3, pp. 194–220, Sep 2017. [Online]. Available: http://ieeexplore.ieee.org/document/8025618/
[4] M. Beul, et al., "Fast Autonomous Flight in Warehouses for Inventory Applications," IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 3121–3128, Oct 2018. [Online]. Available: https://ieeexplore.ieee.org/document/8392775/
[5] G. Klein and D. Murray, "Parallel Tracking and Mapping for Small AR Workspaces," in 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, IEEE, Nov 2007, pp. 225–234. [Online]. Available: http://ieeexplore.ieee.org/document/4538852/
[6] R. A. Newcombe, et al., "KinectFusion: Real-time dense surface mapping and tracking," in 2011 10th IEEE International Symposium on Mixed and Augmented Reality, IEEE, Oct 2011, pp. 127–136. [Online]. Available: http://ieeexplore.ieee.org/document/6162880/
[7] J. McCormac, et al., "Fusion++: Volumetric Object-Level SLAM," in 2018 International Conference on 3D Vision (3DV), Verona, 2018, pp. 32–41. [Online]. Available: https://ieeexplore.ieee.org/document/8490953/
[8] T. Whelan, et al., "ElasticFusion: Dense SLAM Without A Pose Graph," in Robotics: Science and Systems XI, Robotics: Science and Systems Foundation, Jul 2015. [Online]. Available: http://www.roboticsproceedings.org/rss11/p01.pdf
[9] T. Whelan, et al., "Real-time large-scale dense RGB-D SLAM with volumetric fusion," International Journal of Robotics Research, vol. 34, no. 4-5, pp. 598–626, 2015.
[10] F. Burian, P. Kocmanova, and L. Zalud, "Robot mapping with range camera, CCD cameras and thermal imagers," in 2014 19th International Conference on Methods and Models in Automation and Robotics, MMAR 2014, Institute of Electrical and Electronics Engineers Inc., 2014, pp. 200–205.
[11] F. Fang, X. Ma, and X. Dai, "A multi-sensor fusion SLAM approach for mobile robots," in IEEE International Conference Mechatronics and Automation, 2005, vol. 4, IEEE, 2006, pp. 1837–1841. [Online]. Available: http://ieeexplore.ieee.org/document/1626840/
[12] A. Garulli, et al., "Mobile robot SLAM for line-based environment representation," in Proceedings of the 44th IEEE Conference on Decision and Control, IEEE, 2005, pp. 2041–2046. [Online]. Available: http://ieeexplore.ieee.org/document/1582461/
[13] Y.-H. Choi, T.-K. Lee, and S.-Y. Oh, "A line feature based SLAM with low grade range sensors using geometric constraints and active exploration for mobile robot," Autonomous Robots, vol. 24, no. 1, pp. 13–27, Jan 2008. [Online]. Available: http://link.springer.com/10.1007/s10514-007-9050-y
[14] C. Ulas and H. Temeltas, "Plane-feature based 3D outdoor SLAM with Gaussian filters," in 2012 IEEE International Conference on Vehicular Electronics and Safety, ICVES 2012, 2012, pp. 13–18.
[15] H. Zhou, et al., "StructSLAM: Visual SLAM With Building Structure Lines," IEEE Transactions on Vehicular Technology, vol. 64, no. 4, pp. 1364–1375, Apr 2015. [Online]. Available: http://ieeexplore.ieee.org/document/7001715/
[16] K. Uehara, H. Saito, and K. Hara, "Line-based SLAM Considering Directional Distribution of Line Features in an Urban Environment," in Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, SCITEPRESS - Science and Technology Publications, 2017, pp. 255–264.
[17] S. Yang, et al., "Pop-up SLAM: Semantic monocular plane SLAM for low-texture environments," IEEE International Conference on Intelligent Robots and Systems, pp. 1222–1229, 2016.
[18] A. Jelinek, L. Zalud, and T. Jilek, "Fast total least squares vectorization," Journal of Real-Time Image Processing, pp. 1–17, Jan 2016. [Online]. Available: http://link.springer.com/10.1007/s11554-016-0562-6
[19] R. Hulik, et al., "Continuous plane detection in point-cloud data based on 3D Hough Transform," Journal of Visual Communication and Image Representation, vol. 25, no. 1, pp. 86–97, 2014. [Online]. Available: http://dx.doi.org/10.1016/j.jvcir.2013.04.001
[20] K. Pathak, N. Vaskevicius, and A. Birk, "Uncertainty analysis for optimum plane extraction from noisy 3D range-sensor point-clouds," Intelligent Service Robotics, vol. 3, no. 1, pp. 37–48, 2009.
[21] H. Durrant-Whyte and T. Bailey, "Simultaneous localization and mapping: part I," IEEE Robotics & Automation Magazine, vol. 13, no. 2, pp. 99–110, 2006. [Online]. Available: http://ieeexplore.ieee.org/document/1638022/
[22] S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics. The MIT Press, 1999.
[23] J. Klecka, et al., "Non-odometry SLAM and Effect of Feature Space Parametrization on its Covariance Convergence," in IFAC-PapersOnLine, no. 25, Brno, 2016, pp. 139–144.
[24] J. Klecka and O. Bostik, "Effects of Environment Model Parametrization on Photogrammetry Reconstruction," in Mendel, Brno, 2018, pp. 151–158.
[25] J. Klecka, et al., "Plane segmentation and reconstruction from stereo disparity map," in Mendel, Brno, 2016, pp. 199–204.
[26] J. Engel, T. Schöps, and D. Cremers, "LSD-SLAM: Large-Scale Direct Monocular SLAM," in Lecture Notes in Computer Science, Fleet D., Pajdla T., Schiele B., Tuytelaars T., Eds. Cham: Springer, 2014, pp. 834–849. [Online]. Available: http://link.springer.com/10.1007/978-3-319-10605-2_54
[27] R. Mur-Artal and J. D. Tardos, "ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras," IEEE Transactions on Robotics, vol. 33, no. 5, pp. 1255–1262, Oct 2016. [Online]. Available: http://arxiv.org/abs/1610.06475
[28] K. Curran, et al., "An evaluation of indoor location determination technologies," Journal of Location Based Services, vol. 5, no. 2, pp. 61–78, Jun 2011. [Online]. Available: http://www.tandfonline.com/doi/abs/10.1080/17489725.2011.562927