Indonesian Journal of Electrical Engineering and Computer Science
Vol. 2, No. 1, April 2016, pp. 151 ~ 160
DOI: 10.11591/ijeecs.v2.i1.pp151-160
Received January 21, 2016; Revised March 21, 2016; Accepted March 31, 2016
Background Modeling to Detect Foreground Objects
Based on ANN and Spatio-Temporal Analysis
N Satish Kumar*1, Shobha G2
CSE Department, R V College of Engineering, Bangalore, Karnataka, India
*Corresponding author, e-mail: satish.rmgm@gmail.com1, shobhag@rvce.edu.in2
Abstract
This paper presents an approach to building a background model for moving object detection using an unsupervised artificial neural network (ANN), without any prior knowledge about foreground objects. First, a statistical background model is built with the ANN using the local binary pattern (LBP), a texture feature; then the behavior of each incoming frame is compared with the model to decide, for each pixel, whether it deviates from the model. If the method detects foreground objects, the background model is updated to keep it adaptive. Spatio-temporal information is also exploited in this method to suppress sudden illumination variation and false foreground pixels. Qualitative and quantitative metrics demonstrate that the newly presented approach is adaptive and generic and can address the issues and challenges of background subtraction. To evaluate its performance, the paper compares it with recent approaches using standard metrics and shows that the presented method outperforms many existing recent approaches.
Keywords: Local Binary Pattern, Illumination variation, spatial-temporal, ANN, HCI, Background Subtraction
Copyright © 2016 Institute of Advanced Engineering and Science. All rights reserved.
1. Introduction
The ability to extract moving foreground objects from a complex video sequence is the first and foremost step of many computer vision problems [1, 2]: traffic monitoring [3], human detection and tracking, human-machine interfaces (HCI) [4, 5, 6], and video summarization, among other applications. Background subtraction is the task of discriminating moving objects from the static scene in a given video sequence.
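As a preview of the pipeline in Figure 1, the following minimal Python sketch illustrates this frame-versus-model loop (the paper's experiments are implemented in MATLAB; the intensity-difference model, the threshold, and the learning rate here are illustrative assumptions, not the paper's method):

```python
import numpy as np

def subtract_background(frames, tau=30, alpha=0.05):
    """Generic background subtraction loop: compare each incoming frame
    with a reference background model, mark strongly deviating pixels as
    foreground, and adapt the model where the scene was judged background."""
    model = frames[0].astype(np.float64)   # bootstrap from the first frame
    masks = []
    for frame in frames[1:]:
        frame = frame.astype(np.float64)
        fg = (np.abs(frame - model) > tau).astype(np.uint8)
        # Update only background pixels so foreground is not absorbed.
        model = np.where(fg == 0, (1 - alpha) * model + alpha * frame, model)
        masks.append(fg)
    return masks
```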
Many different algorithms have been used for years in computer vision applications such as object detection and tracking, target recognition, and human tracking [7, 8]. Even though the results of existing background subtraction algorithms are fairly good, many of them are vulnerable to both global and local illumination changes such as shadows and highlights. These problems cause many computer vision applications to fail.
For instance, optical-flow-based background subtraction [9, 10] is computationally expensive and not suitable for real-time scenarios. There is therefore a need for an algorithm that is generic and computationally affordable for real-time use.
Apart from these requirements, an important problem in background subtraction is sensitivity to dynamic background changes; as a countermeasure, the model has to adapt via background maintenance or updating. The following are some familiar issues and challenges in background maintenance:
Light changes: the model should adapt to gradual and sudden illumination changes.
Moving background: the model should not detect changing background that is not of interest for visual surveillance, such as waving trees or rippling water.
Cast shadows: the model should detect and suppress moving cast shadows.
Bootstrapping: the model should be accurately initialized even in the absence of a static (free of moving objects) training set at the time the model is built.
Camouflage: moving objects should be properly detected even if they are chromatically similar to the background model.
It is highly desirable to obtain accurate and efficient detection of non-stationary foreground objects in a video sequence, free of shadows and illumination effects.
These problems are the underlying motivation of the work described in this paper. A lot of research has been done in the background subtraction field, but there is still a need for a robust, efficient, and generic background subtraction algorithm that can also detect and suppress shadows in all kinds of complex videos. Shadow detection is itself useful in many applications, such as shape-from-shadow problems [11]. The method described in this paper must therefore also address sensitivity, reliability, robustness, shadow detection, and the speed of foreground object detection. In this paper, we present a generic adaptive background subtraction algorithm with shadow suppression for detecting moving objects in all types of complex background videos.
The paper is organized as follows. Section 2 gives an overview of existing background subtraction approaches. Section 3 explains the background model methodology and shadow suppression. Section 4 reports the results achieved with the proposed approach in terms of accuracy and efficiency, comparing them with several other popular existing methods. Section 5 concludes and sheds light on further research directions in background subtraction and shadow detection.
Figure 1. Background subtraction flow diagram
2. Literature Review
There are many traditional approaches to moving object detection, including optical flow [10], temporal differencing [12], and background subtraction [13]. Temporal differencing works by taking differences between consecutive video frames, which makes it easy to separate static objects from moving foreground objects (a minimal sketch follows this paragraph). The approach is naturally adaptive to dynamic environments and copes with sudden illumination variation, but it suffers from the foreground aperture problem. Optical flow techniques aim at computing an approximation of the 2D motion field from the spatio-temporal information of image pixel values. Even though they can detect moving objects in the presence of camera motion, most optical flow computation methods are computationally expensive and cannot be applied to real-time video. Background subtraction is the common and efficient method of detecting moving foreground objects from a stationary camera (e.g. [13]). It works by differencing the current frame against a reference background model, requires no prior knowledge about the number or velocities of moving objects, and does not suffer from the foreground aperture problem. Background subtraction is, however, very sensitive to illumination variations for various reasons. Even when objects are detected, they can leave behind holes where newly entered objects differ from the background model, producing a raised false alarm rate for a short period of time.
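To make the temporal-differencing step concrete, a minimal sketch (assuming grayscale frames and an arbitrary threshold) is:

```python
import numpy as np

def temporal_difference(prev_frame, curr_frame, tau=25):
    """Mark pixels whose intensity changes by more than tau between
    consecutive frames. The uniform interior of a slowly moving object
    barely changes, which is the foreground aperture problem noted above."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return (diff > tau).astype(np.uint8)
```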
Apart from the state-of-the-art methods above, the following paragraphs give insight into several recent algorithms for detecting and extracting moving foreground objects.
There are many pixel-based algorithms, such as ViBe [14] and PBAS [15], which differ from traditional approaches in that they rely on random pixel sampling and label diffusion. Even though pixel-based methods are simple, lightweight, and effective, they do not consider the spatial relationship of pixels and are not object-based. Spatial-based methods, on the other hand, attempt to harness this information using features or block descriptors [16, 17] and local color histograms [18] in order to achieve better, more efficient results.
There are typical cases where this concept is useful, for example, foreground occlusions with pixel intensities equal to those of the background, and global illumination variations. To overcome these issues, many researchers have proposed temporal and spatio-temporal methods, which take into account temporally recurring behavior
between the background model and the previous, current, and upcoming frames of the video. Such information can play a very significant role in analyzing and detecting moving foreground objects accurately and without noise. It is also useful in estimating short-term intensity changes (dynamic backgrounds, e.g. rippling water, swaying trees, etc.), sudden illumination changes, and moving cast shadows. One such solution uses bidirectional temporal analysis [19].
Many approaches have been proposed to merge the concepts of different models, relying on multiple algorithmic techniques simultaneously, including post-processing and advanced morphological operations. While these combinations are very successful at improving the performance of their respective algorithms, they often suffer from increased computational expense and time, and some require a prior training phase that is practically infeasible for real-time applications. Heikkila and Pietikainen [20] proposed a background subtraction methodology exploiting the local binary pattern (LBP) as a feature, which proved to be a reliable and efficient texture-based method. Since then, many researchers have proposed alternatives: Yoshinaga [21] presented a methodology integrating spatial-based and pixel-based approaches using Mixtures of Gaussians (MoGs). Zhang et al., on the other hand, proposed an approach combining spatial texture and temporal motion analysis using weighted LBP histograms. Object tracking using Camshift and MoG [22, 23] was proposed to track moving objects in real time. Although many background subtraction algorithms have proved very efficient and reliable, there is no single generic algorithm that can solve all the issues and challenges mentioned in the previous section.
This paper proposes an approach to building a background model for moving object detection and shadow suppression based on a simple unsupervised Artificial Neural Network (ANN), without any prior knowledge about foreground or shadow. The idea consists of adopting a biologically inspired ANN to model the background, comparing the behavior of each incoming frame with the model, and deciding per pixel whether it deviates from the model. The proposed method also makes use of spatial information from the background model to detect foreground objects. It is demonstrated, by qualitative and quantitative metrics, that the newly proposed method is adaptive and generic and can address all the above issues and challenges for background subtraction.
3. Proposed Approach
This paper presents a new approach to building a background model based on the LBP (texture) feature and an ANN, inspired by Kohonen [24]. The method employs a simple 2-D flat grid of nodes to build the background model. Each node j (output neuron) has a weight vector Wj. A neuronal map consisting of nine weight vectors is built for each pixel. Features at each pixel are clustered into the set of weight vectors based on Euclidean distance. The LBP feature vectors are presented to all the neurons as inputs; then, for each input vector, the neuron c with minimum Euclidean distance is selected. Foreground moving object detection is carried out by checking the difference between the current frame and the background model using the Euclidean distance. If an incoming pixel exhibits the same behavior as the model, it is termed background; otherwise it is a foreign foreground pixel. The background model is updated whenever a pixel is classified as background. Building the background model, detecting foreground using the Euclidean distance, and updating the model are described in the following sections.
3.1 Background Model
The background model is built using the first frame of the video sequence; that is, each of the nine weight vectors is assigned the LBP operator of the corresponding pixel of the first frame. In this approach, the LBP feature of a pixel was chosen to represent the weight vector because it is very robust and invariant to illumination and color. The set of weight vectors of an image I with N rows and M columns is represented as a 2-D flat grid of neurons A with dimensions 3N x 3M, and the weight vectors for pixel (x, y) are at positions (i, j), i = 3x, ..., 3x+2 and j = 3y, ..., 3y+2. An example 2-D flat grid of the neuronal map is demonstrated for an image I with 2 rows and 3 columns in Figure 2.
Figure 2. (a) A simple image (b) the neuronal map structure

In Figure 2, (a) is an image with 2 rows and 3 columns; (b) represents the weight vectors (a1, ..., a9) stored in the 3x3 block of the neuronal map A corresponding to pixel a, and similarly (f1, ..., f9) are the weight vectors with respect to pixel f in image (a), and so on.
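The layout just described can be sketched in code (an illustrative Python rendering; the paper's implementation is MATLAB, and features are treated here as d-dimensional LBP vectors):

```python
import numpy as np

def init_neuronal_map(features):
    """Build the 3N x 3M neuronal map A from an N x M (x d) feature image:
    the nine weight vectors of pixel (x, y) occupy the 3x3 block
    A[3x:3x+3, 3y:3y+3], each initialized with the pixel's own feature."""
    N, M = features.shape[0], features.shape[1]
    A = np.zeros((3 * N, 3 * M) + features.shape[2:], dtype=np.float64)
    for x in range(N):
        for y in range(M):
            A[3 * x:3 * x + 3, 3 * y:3 * y + 3] = features[x, y]
    return A

def weights_of(A, x, y):
    """Return the nine weight vectors stored for pixel (x, y)."""
    return A[3 * x:3 * x + 3, 3 * y:3 * y + 3].reshape(9, -1)
```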
3.2 Feature Extraction
This paper employs LBP texture features to build the background model. The LBP operator is very efficient: it considers the neighborhood of each pixel and converts it into a binary number. Due to its simplicity, the LBP operator has become a very popular feature for many computer vision applications. The most important properties that led to selecting the LBP feature for building the background model are its computational simplicity and its robustness to illumination variation. This section describes how to calculate LBP features from an image, as demonstrated in Figure 3. To extract the LBP, a circular neighborhood denoted by (S, R) is considered, where S is the number of sampling points and R is the radius of the neighborhood.
Figure 3. An example of LBP computation
The points around the pixel (x, y), located at coordinates (x_p, y_p), are given by:

\[
(x_p, y_p) = \big(x + R\cos(2\pi p/S),\; y - R\sin(2\pi p/S)\big), \quad p = 0, \dots, S-1 \tag{1}
\]
If a sampling point in equation (1) does not fall on integer coordinates, the value is interpolated. The LBP label for the center pixel (x, y) of image f(x, y) is obtained through equation (2):
\[
\mathrm{LBP}_{S,R}(x, y) = \sum_{p=0}^{S-1} s\big(f(x_p, y_p) - f(x, y)\big)\, 2^p \tag{2}
\]
where s(z) is the threshold function, given by:
\[
s(z) = \begin{cases} 1, & z \ge 0 \\ 0, & z < 0 \end{cases} \tag{3}
\]
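A direct reading of equations (1)-(3) can be sketched as below (nearest-neighbor rounding stands in for the interpolation the text calls for, and the common 8-point, radius-1 setting is assumed as a default):

```python
import numpy as np

def lbp_code(image, x, y, S=8, R=1.0):
    """Compute the LBP label of center pixel (x, y) per equations (1)-(3):
    threshold the S circular neighbors against the center intensity and
    pack the resulting bits s(z) * 2^p into one integer code.
    Assumes (x, y) lies at least R pixels inside the image border."""
    center = float(image[x, y])
    code = 0
    for p in range(S):
        # Equation (1): location of the p-th sampling point on the circle.
        xp = x + R * np.cos(2 * np.pi * p / S)
        yp = y - R * np.sin(2 * np.pi * p / S)
        neighbor = float(image[int(round(xp)), int(round(yp))])
        # Equations (2)-(3): s(z) = 1 if z >= 0 else 0, weighted by 2^p.
        if neighbor - center >= 0:
            code |= 1 << p
    return code
```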
3.3 Finding the Best Match
Given the current pixel p at time t, the value I_t(p) is compared to the current pixel model, given by M_{t-1}(p), to determine the weight vector BM(p) that best matches it:
\[
BM(p) = \arg\min_{i \in \{1,\dots,9\}} d\big(W_i(p),\, I_t(p)\big) \tag{4}
\]
where d(., .) is the Euclidean distance between two vectors, here between a background model weight vector and the incoming frame pixel feature.
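Equation (4) is a nearest-neighbor search over the nine weight vectors stored for the pixel; a sketch reusing the illustrative `weights_of` helper from Section 3.1:

```python
import numpy as np

def best_match(A, x, y, feature):
    """Return the index and distance of the weight vector closest to the
    incoming feature among the nine stored for pixel (x, y), per eq. (4)."""
    w = weights_of(A, x, y)                       # shape (9, d)
    dists = np.linalg.norm(w - feature, axis=1)   # Euclidean d(., .)
    i = int(np.argmin(dists))
    return i, float(dists[i])
```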
3.4 Foreground Detection and Updating the Model
By subtracting the current image from the background model, each pixel p_t of the t-th frame I_t is compared to the current pixel weight vectors to determine whether a weight vector exists that best matches it. The best matching weight vector is used as the pixel's encoding approximation; therefore p_t is detected as foreground if no acceptable matching weight vector exists, and is otherwise classified as background.
\[
D_t(p) = \begin{cases} 1, & \text{if } d\big(I_t(q),\, BM(q)\big) > \varepsilon \;\; \forall\, q \in N_k(p) \\ 0, & \text{otherwise} \end{cases} \tag{5}
\]

where N_k(p) represents the 2-D neighborhood of p with width 2k+1, k ∈ N.
Spatial analysis is introduced into the background model using the K-Nearest Neighbor (KNN) search algorithm, and the background subtraction mask D_t(p) is computed as:
\[
D_t(p) = \begin{cases} 1, & \text{if } |KNN(p)| < n/2 \\ 0, & \text{otherwise} \end{cases} \tag{6}
\]
where KNN(p) is the result of the KNN search for the best match in the background model, and n is the number of best matches found in the background model.
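One plausible rendering of the detection-and-update step, combining the classification rule above with a model update (the exact update equation is not spelled out in the recovered text, so the SOM-style pull of the winning weight toward the input below is an assumption, as are the parameter names):

```python
def classify_and_update(A, x, y, feature, eps=0.008, lr=0.5):
    """Label pixel (x, y) as foreground (1) if no stored weight vector lies
    within eps of the incoming feature; otherwise label it background (0)
    and adapt the best-matching weight toward the input so the model stays
    current. The update rule is an assumed SOM-style step, not quoted."""
    i, d = best_match(A, x, y, feature)
    if d > eps:
        return 1                                  # foreground; model frozen
    # Background: pull the winning weight toward the observed feature.
    bx, by = 3 * x + i // 3, 3 * y + i % 3
    A[bx, by] = A[bx, by] + lr * (feature - A[bx, by])
    return 0
```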
4. Experimentation and Result Analysis
Experiments on the background model and foreground detection were performed for all types of challenging video sequences. Five different types of videos (moving background, illumination variation, water surface, camera jitter) with a frame rate of 15 fps and 320x240 resolution were considered. The parameters selected for experimentation are as follows: the number of models chosen is 3x3, i.e. each pixel is repeated 9 times; the distance threshold is 1.0 for the training phase and 0.008 for the testing phase; and the learning rate is fixed to 1 for training and 0.5 for testing. These settings are collected in the sketch below.
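A plain restatement of the values quoted above (the key names are illustrative, not the paper's notation):

```python
# Experimental settings reported in Section 4.
EXPERIMENT_CONFIG = {
    "models_per_pixel": 9,                      # the 3x3 neuronal map
    "distance_threshold": {"training": 1.0, "testing": 0.008},
    "learning_rate": {"training": 1.0, "testing": 0.5},
    "frame_rate_fps": 15,
    "resolution": (320, 240),
    "num_video_types": 5,                       # Li dataset sequences
}
```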
4.1 Performance Measure Method
To evaluate the performance of the proposed method against state-of-the-art methods, three measures were used: Recall, Precision, and F-measure. These metrics are defined as follows.
Table 1. A contingency table

                          Correct Foreground     Correct Background
Classified as Foreground  True Positives (TP)    False Positives (FP)
Classified as Background  False Negatives (FN)   True Negatives (TN)
\[
\mathrm{Recall} = \frac{TP}{TP + FN} \tag{9}
\]

\[
\mathrm{Precision} = \frac{TP}{TP + FP} \tag{10}
\]
To obtain high Recall, Precision usually has to be sacrificed, and vice versa, so there is a trade-off between Recall and Precision. To avoid being misled by either metric alone, this paper uses the F-measure [25] as another very important performance metric, which considers both Recall and Precision simultaneously. The F-measure is given below as:
\[
F(r, p) = \frac{2\,r\,p}{r + p} \tag{11}
\]
where r is Recall and p is Precision.
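These definitions translate directly into code; the sketch below also includes PWC, which is defined later in equation (12):

```python
def evaluate(tp, fp, fn, tn):
    """Recall, Precision, F-measure (eqs. 9-11) and PWC (eq. 12)
    from the contingency-table counts of Table 1."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f_measure = (2 * recall * precision / (recall + precision)
                 if (recall + precision) else 0.0)
    pwc = 100.0 * (fn + fp) / (tp + fn + fp + tn)
    return {"recall": recall, "precision": precision,
            "f_measure": f_measure, "pwc": pwc}
```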
4.2 Experimental Results
To evaluate the performance of the proposed methodology, five video sequences from the Li dataset were used. The results from the proposed method were compared with those from MoGv2 [25], GMG [26], and Texture BGS [20]. The background subtraction results obtained from the proposed method and the other state-of-the-art methods are demonstrated in the following sections.
Figure 5. Shopping Mall Video: (a) current frame; (b) Ground Truth foreground mask; (c) MoGv2; (d) GMG; (e) Texture BGS; (f) Proposed method
Here the moving Escalator (SS) video sequence is considered. This sequence is an example of a complex background and is demonstrated in Figure 6.
Figure 6. Escalator Video: (a) current frame; (b) Ground Truth foreground mask; (c) MoGv2; (d) GMG; (e) Texture BGS; (f) Proposed method
Figure 7. Indoor Light Switch Video: (a) current frame; (b) Ground Truth foreground mask; (c) MoGv2; (d) GMG; (e) Texture BGS; (f) Proposed method
The Lobby (LB) video sequence, an indoor environment containing 1,545 frames, is also considered. It demonstrates sudden illumination variation in an indoor environment, as shown in Figure 7.
To compare the proposed method with recent popular background subtraction algorithms, the paper used the parameters given in the reference papers or obtained by repeating the experiments. All the background subtraction methods, including the proposed one,
MoGv2, GMG, and Texture BGS, are implemented in MATLAB. The ground truth foreground mask frames provided by the Li dataset are employed. The recall values acquired by applying all the methods, including the proposed method, are demonstrated in Figure 8; the proposed method outperforms all the existing methods on recall.
Figure 8. Recall results for the proposed and other methods for all video sequences of the Li dataset
The precision values obtained by the proposed method and the comparisons with the other methods are demonstrated in Figure 9. The precision values of the proposed method are better than those of all the existing methods.
Figure 9. Precision results for the proposed and other methods for all video sequences of the Li dataset
The F1-measure values for all video sequences of the Li dataset are shown in Figure 10. The F1 result is a very good performance metric that takes both recall and precision into account simultaneously. The F1 values are higher for the proposed method than for the other methods.
Figure 10. F-Measure results for the proposed and other methods for all video sequences of the Li dataset
[Chart data for Figures 8-10: Recall, Precision, and F-Measure of PROP, MOGV2, GMG, and TEXBGS on the SHOP, ESC, SWITH, FOUNT, and TREES sequences.]
Figure 11 shows the average precision, recall, and F-measure over all video sequences of the Li dataset. The proposed method shows good average values for all performance metrics across all videos.
Figure 11. Average Precision, Recall and F-Measure for all video sequences of the Li dataset
There is another performance metric called the percentage of wrong classification (PWC) [27]. It gives the percentage of misclassified pixels, i.e. pixels classified as foreground that are not actual foreground pixels, and vice versa. It is also called the percentage of false alarms.
PWC (% of wrong classification):
\[
\mathrm{PWC} = 100 \cdot \frac{FN + FP}{TP + FN + FP + TN} \tag{12}
\]
Figure 12 shows the PWC of the proposed and all other methods; it demonstrates that the PWC values of the proposed method are lower for all video sequences than those of the other methods. This means the proposed method produces fewer false alarms than the other methods.
5. Conclusion
Very little research has been done on proposing a generic background subtraction algorithm that can solve almost all the issues and challenges of the foreground detection problem. The proposed work contributes a new background subtraction method using an ANN and spatio-temporal information. The algorithm also makes use of gradient information whenever necessary in order to compensate for sudden illumination variation in indoor environments. The performance of the proposed method against all the existing methods is reported. The analysis shows that the proposed method is very robust and outperforms many existing algorithms for all types of video sequences. The proposed method can be applied to any type of video sequence; it is generic and achieves good results for many challenging video sequences.
Figure 12. Percentage of Wrong Classification for all video sequences of the Li dataset
References
Ismail H
a
rita
og
lu, D
a
vid
H
a
r
w
ood,
an
d
Larr
y
S D
a
vis. “
Re
al
-ti
m
e
System fo
r De
te
cti
n
g an
d
Tra
c
king
Peop
le
”. Proc.
the third IEEE
Inter
national Conference
on
Automatic Fac
e
and Gestur
e Recognition
(Nara, Japan), IEEE Computer So
ciet
y
Pr
ess
,
Los Alamitos,
Calif
.
199
8: 22
2-22
7.
[2]
D Butl
er, V B
o
ve, a
n
d
S S
h
rid
hara
n
. “R
e
a
l tim
e
a
d
a
p
ti
ve b
a
ckgro
un
d
/foregrou
nd
se
gmentati
on”.
EURASIP Jour
nal o
n
App
lie
d Sign
al Process
i
ng
. 20
05; 14:
229
2–
230
4.
[3]
N F
r
iedma
n
a
nd S R
u
ssel
l
. “
Imag
e Se
g
m
e
n
tation
in Vi
de
o Seq
u
e
n
ces:
A Proba
bil
i
stic
Appro
a
ch
”,
Procee
din
g
s of
the 13th Co
nferenc
e
on U
n
c
e
rtaint
y in Artifi
cial int
e
ll
ige
n
ce
, Morgan Kauf
mann. 19
97.
[4]
CR W
r
en, A Azarba
ye
ja
ni, T
Darrell, a
nd
A Pent
lan
d
. “Pfinder: Re
al-ti
m
e T
r
acking of the Huma
n
Body
”.
IEEE Trans. on
Pattern
Analys
is an
d
Machi
ne Inte
lli
genc
e.
IEEE Computer S
o
ciety
Press,
Los
Alamitos, Ca
lif. 1997; 1
9
(7): 7
80-7
85.
[5]
J Ohy
a
, et al. “Vir
tual Metamorphosis”.
IEEE Multimedia. D
OI: 10.1109/93.771371
. 199
9; 6(2): 29
–
39.
[6]
J Davis, and A
Bobick. “
T
he repres
entati
on
and R
e
co
gniti
o
n
of Action usi
ng T
e
mpor
al T
e
mpl
a
tes
”.
Procee
din
g
s of
Confere
n
ce o
n
Comp
uter Vi
sion a
nd Patter
n
Reco
gniti
on.
199
7.
[7]
A Utsumi, H Mori, J
Ohy
a
, and M Yachida. “
Mu
l
t
i
p
le
-hum
an
tra
cki
ng
using multiple c
a
meras
”. In Proc.
the thrid IEEE Internation
a
l
C
onf. Automati
c Face and Gesture Rec
o
g
n
i
tion (Nar
a, Ja
pan). IEEE
Comp
uter Soci
et
y
Press, Los
Alamitos, Ca
lif. 1998.
[8]
M Yama
da, K
Ebih
ara,
and
J
Oh
ya
.
“
A
new
r
obust r
eal-ti
m
e
metho
d
for
ext
r
acting
hu
man
silho
uettes
from co
lor i
m
ages
”. In Pro
c
. the third IEEE Internatio
nal C
onf. Aut
o
matic Face
and Gestu
r
e
Recognition (Nara,
Japan), IEEE Computer
Societ
y
Press,
Los Alam
itos, Calif. 1998: 528–533.
[9]
T
Horprasert, I Haritao
g
l
u
, C W
r
en,
D Har
w
o
o
d
, LS Davi
s, and A Pentl
and. “
Re
al-ti
m
e 3d
moti
o
n
capture
”. In Proc. 1998 W
o
rk
shop o
n
Perce
p
tual Us
er Interface (PUI’98),
San F
r
ancisc
o
. 1998
[10]
Dani
el
D
Do
yl
e
,
Alan
L J
e
n
n
in
gs, Jon
a
tha
n
T
Bl
ack
“Optical
flo
w
back
g
ro
u
nd
estimatio
n
f
o
r re
al-tim
e
pan/tilt cam
e
ra
obj
ect trackin
g
”.
Journ
a
l of t
he Intern
atio
na
l Meas
ure
m
e
n
t Confe
derati
o
n
. 2014; 4
8
:
195
–2
07.
[11]
Xu
e Yu
an,
Xi
a
o
li H
ao, H
o
u
jin
Che
n
, Xue
y
e
W
e
i. Backgro
u
nd Mo
del
in
g M
e
thod
bas
ed
o
n
3D S
h
a
p
e
Reco
nstruction
T
e
chnol
og
y.
T
E
LKOMNIKA e-ISSN: 2087-2
78X
.
201
3; 11(
4): 2079~
2
0
8
3
[12]
Álvaro B
a
yon
a
,
Juan C Sa
n
Migue
l, José
M. Martínez. “
Stationary F
o
regr
oun
d Det
e
ction Us
in
g
Backgro
un
d S
ubtractio
n
and
T
e
mp
ora
l
D
i
fference
In
V
i
de
o
Surve
ill
ance
”
.
Procee
din
g
s
of 20
10
IEEE
17th Intern
atio
nal C
onfere
n
ce
on Ima
ge Proc
essin
g
.
Hon
g
Kong. 20
10.
[13]
C Stauffer a
n
d
E Grimso
n. Ada
p
tive
bac
kg
rou
nd m
i
xtu
r
e mod
e
ls for
real-tim
e trac
king.
IE
EE
Confer
ence
on
Computer Vis
i
on an
d Pattern
Recog
n
itio
n, CVPR
. 199
9: 2
46–
25
2.
[14]
O Barnic
h
an
d M V
a
n
Dro
oge
nbro
e
ck. “
V
iBe:
a
p
o
w
e
rful ra
ndom
t
e
chn
i
qu
e to
e
s
timate t
h
e
backgr
oun
d in
video s
equ
e
n
ces.”
Internat
ion
a
l Co
nfere
n
ce on Ac
ous
tics, Speech,
and Si
gn
a
l
Processing, ICASSP 2009
. 2009: 94
5–
94
8.
[15]
M Hofmann, PT
iefenbach
e
r, G
Rigoll. “Bac
kgrou
nd Segm
entatio
n
w
i
th F
eed
back: T
he
Pixel-B
a
s
e
d
Adaptiv
e Segm
enter”, in proc
of IEEE
Works
hop o
n
Ch
ang
e Detectio
n. 20
12.
[16]
Harita
ogl
u I, Har
w
o
od D, D
a
v
i
d LS. W
4
: “Re
a
l-time S
u
rvei
ll
ance
of Peo
p
le
and th
eir Activ
i
ties.”
IE
EE
T
r
ans. on PAMI
. 2000; 22(8):
809
–8
30.
[17]
Pierre-Marc J
o
doi
n, Ma
x Mig
notte, an
d Jan
u
sz Konr
ad
.
“
S
tatistical Bac
k
grou
nd S
ubtr
a
ction
Usin
g
Spatia
l Cues”.
IEEE Transactions on C
i
rcuits
and Syste
m
s for Vide
o Techn
o
lo
gy
. 200
7; 17(12).
[18]
Shen
gpi
ng Z
h
ang, H
o
n
g
x
un
Yao,
Sha
o
h
u
i
Liu. “D
yn
amic
Backgro
un
d S
ubtractio
n Bas
ed o
n
L
o
c
a
l
Dep
end
enc
y Histogr
am”
.
T
he E
i
ghth
Int
e
rnati
ona
l W
o
rkshop
on
Vis
ual
Surve
ill
an
ce - VS
200
8
,
Marseil
l
e, F
r
an
ce. 2008.
[19]
Atsushi Sh
im
ada, H
a
jim
e
Nag
ahar
a, Ri
n-ichir
o
T
anig
u
chi, “Back
g
r
oun
d Mod
e
l
i
n
g
bas
ed
on
Bidirecti
o
n
a
l A
nal
ysis”.
CVP
R
. 2013.
[20]
Heikki
la M,
Pietikai
nen M.
“A texture-b
a
se
d method for mode
lin
g the backgr
oun
d a
nd detecti
n
g
movin
g
obj
ects”.
IEEE
Transa
c
tionso
n
Patter
n
Analys
is an
d Machi
ne Intell
i
genc
e
. 200
4.
[21]
S Yoshin
ag
a, A Shimad
a, H
Naga
har
a an
d
R T
aniguch
i
. “Backgrou
n
d
model
base
d
on inte
nsit
y
chan
ge simi
lar
i
t
y
amo
ng p
i
xels”.
In F
r
ontiers of Comp
ute
r
Visi
on, (F
CV), 2013 19th
Korea-J
apa
n
Joint Workshop
, 2013: 2
76–
2
80.
[22]
Li Z
h
u, T
ao H
u
. Res
earch
of
CamS
hift Al
g
o
rithm to
T
r
ack Motio
n
Ob
je
cts.
TELKOMNIKA, e-ISSN:
208
7-27
8X
. 20
13; 11(8): 4
372
~
4378.
[23]
Hon
g
-
x
un,
Z
h
ang
a
n
d
De
Xu.
“F
usi
ng c
o
lor
an
d
gra
d
ie
nt featur
es
for b
a
ckgr
o
und
mo
del”.
In 8th Internati
ona
l Conf
erenc
e on Sig
n
a
l
Processi
ng
. 20
06
.
[24]
David
e
B
a
ll
ab
i
o
, Vivi
an
a C
o
nson
ni, R
o
b
e
rto T
odesch
ini.
“
T
he Koho
nen
an
d
CP-ANN
tool
bo
x:
A
collecti
on
of MAT
L
AB modul
es for Self Organ
izin
g
Maps
and C
o
u
n
terp
ropa
gatio
n Arti
ficial N
eura
l
Net
w
orks.
Ch
e
m
o
m
etrics an
d
Intellig
ent La
b
o
ratory Syste
m
s
.
2009; 98(
2): 115
–1
22.
[25] Andrew B Godbehere, Akihiro Matsukawa, Ken Goldberg. "Visual Tracking of Human Visitors under Variable-Lighting Conditions for a Responsive Audio Art Installation". IEEE American Control Conference, Fairmont Queen Elizabeth, Montréal, Canada. 2012.
[26] F El Baf, T Bouwmans, and B Vachon. "Type-2 fuzzy mixture of Gaussians model: Application to background modeling". International Symposium on Visual Computing, ISVC. 2008: 772-781.
[27] Agung Nugroho Jati, Ledya Novamizanti, Mirsa Bayu Prasetyo, Andy Ruhendy Putra. "Evaluation of Moving Object Detection Methods based on General Purpose Single Board Computer". TELKOMNIKA Indonesian Journal of Electrical Engineering. 2015; 14(1): 123-129.