TELKOMNIKA, Vol.14, No.2, June 2016, pp. 622~629
ISSN: 1693-6930, accredited A by DIKTI, Decree No: 58/DIKTI/Kep/2013
DOI: 10.12928/TELKOMNIKA.v14i1.2754

Received December 24, 2015; Revised March 19, 2016; Accepted April 8, 2016
Object Recognition Based on Maximally Stable Extremal Region and Scale-Invariant Feature Transform
Hongjun Guo 1*, Lili Chen 1,2
1 Laboratory of Intelligent Information Processing, Suzhou University, Suzhou 234000, China
2 The Key Laboratory of Intelligent Computing & Signal Processing of MOE, Anhui University, Hefei 230039, China
*Corresponding author, e-mail: ghj521888@163.com
Abstract
To address the weak affine and blur invariance of the scale-invariant feature transform (SIFT) under large viewpoint variation, this paper proposes a new object recognition method that detects maximally stable extremal regions (MSERs) and describes the local features of these regions with SIFT. First, a new stability criterion is adopted to improve detection of irregularly shaped regions and detection under blur; then, the local feature descriptors of the MSERs are extracted by SIFT; finally, the proposed method is compared with SIFT in terms of recognition accuracy on standard test images. Experimental results show that the proposed method still achieves a recognition accuracy of more than 74% at different viewpoints, which is better than SIFT.
Keywords: Maximally Stable Extremal Region, SIFT, Object Recognition, Local Feature
Copyright © 2016 Universitas Ahmad Dahlan. All rights reserved.
1. Introduction
In object recognition with complicated backgrounds or occlusion, local features are better than global features in stability, repeatability and distinctiveness, and they have been widely applied in image matching, machine vision and other fields in recent years. This paper studies in depth the detection and description of local region features.
The scale-invariant feature transform (SIFT) [1] algorithm extracts feature points in linear scale space and uses the main direction of the local gradient distribution, which gives it excellent scale invariance and rotation invariance, but it has no affine invariance. Compared with blob features, the region detection methods proposed in recent years are applicable to feature regions of various shapes and preserve excellent invariance even when the view angle changes greatly. Literature [2] compares SIFT, Harris-Affine, Hessian-Affine and the maximally stable extremal region (MSER) [3] detector proposed by Matas. The results show that MSER has the best detection performance for regions of consistent gray level with strong boundaries, under view-angle changes and under illumination variation; that when the image scale changes, MSER ranks second only to Hessian-Affine; and that when the image is blurred, MSER performs worst. The results of Literature [4] show that SIFT describes planar objects better, whereas MSER gives excellent description in most natural scenes.
Plenty of local feature descriptors have been proposed in recent years. Their performance differs significantly between applications, and there is no universal description algorithm. Literature [5] and [6] analyze the performance of the local feature descriptors proposed in past years from different perspectives. Their analysis demonstrates that the SIFT descriptor, based on a first-order gradient histogram, has the best scale invariance, and that MROGH [7] has the best performance under illumination variation. Huang and others have proposed a local feature descriptor based on the distribution of histograms of second-order gradients (HSOG); it excels at describing geometrical features related to curvature, but its recognition efficiency is low because of the second-order histogram. Therefore, this paper integrates MSER detection and SIFT description: the improved MSER detects the local feature regions of the object, and SIFT constructs the feature descriptors used for object recognition.
2. Detection of Maximally Stable Extremal Regions
Unlike corner detection, MSER uses a concept similar to the watershed to obtain locally stable regions. The watershed algorithm in image processing is mainly used for image segmentation; it focuses on the "water level" (image gray level) at which regions merge, where the region areas are not stable, whereas MSER focuses on the "water level" at which the regions are stable. As the "water level" changes, the region formed when the change rate of its area is at a minimum is the maximally stable one. This is the typical criterion for region stability, and the procedure is illustrated in Figure 1.
Figure 1. The relationship between MSERs
Here, $Q_{i-\Delta}$, $Q_i$ and $Q_{i+\Delta}$ are a series of nested extremal regions. $Q_i$ is the region obtained by threshold segmentation at gray level $i$; $\Delta$ is a small gray-level change and $\partial Q_i$ is the boundary of the extremal region $Q_i$. Let $x$ and $y$ be arbitrary pixels in $Q_i$ and $\partial Q_i$ respectively, with gray levels $I(x)$ and $I(y)$. If they satisfy formula (1), then $Q_i$ is an extremal region: when formula (1)(a) holds it is called a maximal extremal region, and when formula (1)(b) holds it is a minimal extremal region.

$$
\begin{cases}
I(x) > I(y) & (a) \\
I(x) < I(y) & (b)
\end{cases}
\qquad \forall x \in Q_i,\ \forall y \in \partial Q_i \tag{1}
$$
The stability criterion of MSER is defined as the ratio of the area $S(Q_i)$ of the extremal region to its area change rate, as shown in formula (2). As the gray level $i$ changes, the region whose area change rate is at a minimum is the stable region. When $\Psi_{MSER}(Q_i)$ reaches a local maximum at gray level $i$, the corresponding region is a locally maximally stable extremal region.

$$
\Psi_{MSER}(Q_i) = \frac{S(Q_i)}{\dfrac{d}{di} S(Q_i)} \tag{2}
$$
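The criterion in formula (2) can be tried numerically. The following is a minimal sketch, assuming that the area S(Q_i) has already been measured for every gray level i and approximating the derivative by a central difference. The function names and the sample area sequence are hypothetical and are not taken from the paper.

```python
def stability(areas, delta=1):
    """Psi(Q_i) = S(Q_i) / (dS/di) for each gray level i.

    The derivative is approximated by a central difference over
    +/- delta gray levels (an assumption of this sketch).
    """
    psi = {}
    for i in range(delta, len(areas) - delta):
        growth = (areas[i + delta] - areas[i - delta]) / (2 * delta)
        # A region that does not grow at all is treated as infinitely stable.
        psi[i] = float("inf") if growth == 0 else areas[i] / growth
    return psi


def most_stable_levels(psi):
    """Gray levels at which psi has a local maximum (end points excluded)."""
    levels = sorted(psi)
    return [
        level
        for k, level in enumerate(levels[1:-1], start=1)
        if psi[level] > psi[levels[k - 1]] and psi[level] >= psi[levels[k + 1]]
    ]


# Hypothetical areas of one nested region sequence at gray levels 0..9.
areas = [10, 30, 60, 100, 104, 106, 108, 150, 220, 300]
psi = stability(areas)
stable = most_stable_levels(psi)
```

In this made-up sequence the area barely changes around gray level 5, so that level is the one reported as maximally stable.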
The change rate of the region area is defined as the limit of the difference of two region areas. For two nested regions, this limit equals an integral along the boundary curve, so the denominator of formula (2) can also be written as follows:

$$
\frac{d}{di} S(Q_i) = \lim_{\Delta \to 0} \frac{1}{\Delta} \left( S(Q_{i+\Delta}) - S(Q_i) \right) = \oint_{\partial Q_i} \frac{ds}{|\nabla I|} \tag{3}
$$
Assume that the gray-level change along the region boundary is a constant $C$, so that the region change rate is a function of the boundary perimeter $L(Q_i)$, namely $\frac{d}{di} S(Q_i) = \frac{1}{C} L(Q_i)$. The MSER detection rule of formula (2) then becomes the ratio of the region area to its perimeter, namely:

$$
\Psi_1(Q_i) = C \, \frac{S(Q_i)}{L(Q_i)} \tag{4}
$$
The form of the formula above is similar to the shape factor $4\pi S / L^2$, whose value lies in (0, 1]. A value of 1 corresponds to a circle, and values close to 1 indicate regular, compact shapes. For two regions that have the same area and the same gray-level change along their boundaries, the smaller the boundary perimeter, the larger the value of $\Psi_1$. Therefore, the traditional MSER detection rule tends to detect regions with regular shapes. In object matching, however, the useful feature regions usually have irregular shapes. For example, the camouflage coatings of military objects and the airline logos on civil aircraft are all irregular in shape, yet these regions are highly distinctive and are well suited to object matching and recognition.
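The bias toward compact shapes can be checked with a few numbers. The sketch below assumes that the shape factor is the usual isoperimetric quotient 4πS/L² and evaluates it for three idealized shapes; the function name and the shapes are illustrative only.

```python
import math


def shape_factor(area, perimeter):
    """Isoperimetric quotient 4*pi*S / L**2 (1 for a circle, smaller otherwise)."""
    return 4.0 * math.pi * area / perimeter ** 2


# Circle of radius 5, square of side 10, and a thin 100 x 1 strip.
circle = shape_factor(math.pi * 5 ** 2, 2 * math.pi * 5)
square = shape_factor(10 * 10, 4 * 10)
strip = shape_factor(100 * 1, 2 * (100 + 1))
```

The square and the strip have the same area, but the strip has a much longer perimeter and therefore a much smaller score, which is why a rule proportional to S/L favors regular regions.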
On the other hand, when the image is blurred, the affine invariance of MSER falls. Therefore, Kimmel and others [8] proposed a new MSER detection operator that improves the criterion for the stability of extremal regions; it overcomes the defects of the traditional MSER detection algorithm and has excellent transform invariance. This paper uses the MSER detection algorithm proposed by Kimmel to detect the local feature regions.
Literature [9] points out that the SIFT algorithm detects feature points in linear scale space, which gives scale invariance but not affine invariance; therefore, its authors detect MSERs in the curvature scale space. Since the regions of interest with stronger distinguishing ability usually have irregular shapes, while $\Psi_1$ tends to detect MSERs with regular shapes, a rule based on the ratio of the arc length to the area of the region is used to detect regions with less regular shapes, as shown in formula (5). Here, $N(Q_i)$ is the normalized region.

$$
\Psi_2(Q_i) = \frac{L\left(N(Q_i)\right)^2}{S\left(N(Q_i)\right)} \tag{5}
$$
In actual imaging, the point spread function of the camera applies a certain smoothing blur to the scene, so the detection rule needs blurring invariance, which the experiments in Section 4.1 show $\Psi_1$ lacks. Although the $\Psi_2$ rule still detects well when the image is blurred, the image must first be normalized. The criterion in formula (6) not only keeps the advantages of $\Psi_2$, with better detection under affine transformation and image blurring, but also needs no normalization.

$$
\Psi_3(Q_i) = \frac{S(Q_i)}{\displaystyle \oint_{\partial Q_i} \left( I_{xx} I_y^2 - 2 I_x I_y I_{xy} + I_{yy} I_x^2 \right)^{1/3} \frac{ds}{|\nabla I|}} \tag{6}
$$
3. Scale Invariant Feature Transform (SIFT)
The SIFT algorithm is made up of four steps: scale-space extremum detection, key point localization, principal gradient direction determination and key point description [10]. This paper replaces the scale-space extremum detection of SIFT with the MSER detection method and thereby obtains extremal regions with better affine invariance and blurring invariance. The image ellipse determined by the second-order moments of an extremal region has the same statistical features as the original extremal region; therefore, the ellipse center is taken as the key point. The image ellipse, with orientation $\theta$ and semi-axes $a$ and $b$, is defined as follows:

$$
\theta = \frac{1}{2} \tan^{-1} \left( \frac{2 \mu_{11}}{\mu_{20} - \mu_{02}} \right) \tag{7}
$$

$$
a = 2 \sqrt{\frac{I_1}{\mu_{00}}}, \qquad b = 2 \sqrt{\frac{I_2}{\mu_{00}}} \tag{8}
$$
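Here, $\mu_{pq}$ denotes the central moments of the image region (second-order for $p + q = 2$), and $I_1$ and $I_2$ are the principal moments of inertia, which are defined as follows:

$$
I_1 = \frac{(\mu_{20} + \mu_{02}) + \left[ (\mu_{20} - \mu_{02})^2 + 4 \mu_{11}^2 \right]^{1/2}}{2} \tag{9}
$$

$$
I_2 = \frac{(\mu_{20} + \mu_{02}) - \left[ (\mu_{20} - \mu_{02})^2 + 4 \mu_{11}^2 \right]^{1/2}}{2} \tag{10}
$$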
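The ellipse fit can be sketched directly from the moment definitions. The code below is a minimal illustration, assuming that a region is given as a list of pixel coordinates, that the semi-axes are 2·sqrt(I/μ00) as for a uniformly filled ellipse, and that `atan2` is used as a quadrant-safe variant of formula (7). The function name `image_ellipse` is hypothetical.

```python
import math


def image_ellipse(pixels):
    """Fit the second-moment ellipse to a region given as (x, y) pixels.

    Returns (center, a, b, theta): the centroid, the two semi-axes and
    the orientation of the major axis in radians.
    """
    pts = list(pixels)
    m00 = len(pts)  # zeroth moment = region area in pixels
    cx = sum(x for x, _ in pts) / m00
    cy = sum(y for _, y in pts) / m00
    mu20 = sum((x - cx) ** 2 for x, _ in pts)
    mu02 = sum((y - cy) ** 2 for _, y in pts)
    mu11 = sum((x - cx) * (y - cy) for x, y in pts)
    root = math.sqrt((mu20 - mu02) ** 2 + 4 * mu11 ** 2)
    i1 = 0.5 * ((mu20 + mu02) + root)  # larger principal moment
    i2 = 0.5 * ((mu20 + mu02) - root)  # smaller principal moment
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    a = 2 * math.sqrt(i1 / m00)
    b = 2 * math.sqrt(i2 / m00)
    return (cx, cy), a, b, theta


# A filled disk of radius 20 and an axis-aligned 40 x 10 rectangle.
disk = [(x, y) for x in range(-20, 21) for y in range(-20, 21)
        if x * x + y * y <= 400]
rect = [(x, y) for x in range(40) for y in range(10)]
disk_center, disk_a, disk_b, _ = image_ellipse(disk)
_, rect_a, rect_b, rect_theta = image_ellipse(rect)
```

For the disk both semi-axes come out close to the radius, and for the rectangle the major axis lies along the x direction, which is the behavior expected of an ellipse with the same second moments as the region.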
At each key point, calculate the gradient direction distribution and build the gradient histogram. Take the direction corresponding to the peak of the gradient histogram as the main direction of this extremal region [11]. The gradient magnitude $M(x, y)$ and direction $\theta(x, y)$ of region $Q(x, y)$ are determined by the following formulas:

$$
M(x, y) = \sqrt{\left( Q(x+1, y) - Q(x-1, y) \right)^2 + \left( Q(x, y+1) - Q(x, y-1) \right)^2} \tag{11}
$$

$$
\theta(x, y) = \arctan \frac{Q(x, y+1) - Q(x, y-1)}{Q(x+1, y) - Q(x-1, y)} \tag{12}
$$
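Formulas (11) and (12) are plain central differences. A minimal sketch follows, assuming that the region is stored as a list of rows indexed as `Q[y][x]` and using `atan2` so that the direction covers the full circle; both choices are assumptions of this example rather than details given in the paper.

```python
import math


def gradient(Q, x, y):
    """Central-difference gradient magnitude and direction at pixel (x, y)."""
    dx = Q[y][x + 1] - Q[y][x - 1]
    dy = Q[y + 1][x] - Q[y - 1][x]
    magnitude = math.hypot(dx, dy)
    direction = math.atan2(dy, dx)  # radians in (-pi, pi]
    return magnitude, direction


# Synthetic ramp image Q(x, y) = 3x + 4y, so dx = 6 and dy = 8 everywhere.
ramp = [[3 * x + 4 * y for x in range(5)] for y in range(5)]
magnitude, direction = gradient(ramp, 2, 2)
```

On the ramp image the magnitude is 10 at every interior pixel, which matches the hand calculation sqrt(6² + 8²).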
Rotate the extremal region to the principal direction to obtain rotation invariance of the descriptor. With the key point as the center, calculate the gradient direction and magnitude within its 16x16 neighborhood and, using Gaussian weighting, calculate the gradient histogram in 8 directions in each 4x4 segment. Accumulating the values in every gradient direction forms a seed point, and every seed point includes 8 directions. The 4x4 seed points together give a 128-dimensional vector; the process is illustrated in Figure 2.
Figure 2. SIFT descriptor
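The descriptor layout can be sketched as follows. This is a simplified illustration, assuming a 16x16 patch of precomputed gradient magnitudes and directions split into a 4x4 grid of cells with 8 direction bins each. It omits the trilinear interpolation and value clipping of Lowe's full method, and the function name and the Gaussian width are hypothetical choices.

```python
import math


def sift_like_descriptor(mag, ori, cells=4, bins=8):
    """Build a cells*cells*bins histogram descriptor from a square patch.

    mag and ori are square lists of rows holding the gradient magnitude
    and direction (radians) of every pixel in the patch.
    """
    n = len(mag)                 # patch side, e.g. 16
    cell = n // cells            # pixels per cell side, e.g. 4
    sigma = n / 2.0              # Gaussian width (assumed value)
    c = (n - 1) / 2.0            # patch center
    hist = [0.0] * (cells * cells * bins)
    for y in range(n):
        for x in range(n):
            # Pixels far from the key point contribute less.
            weight = math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * sigma ** 2))
            angle = ori[y][x] % (2 * math.pi)
            b = int(angle / (2 * math.pi) * bins) % bins
            idx = ((y // cell) * cells + (x // cell)) * bins + b
            hist[idx] += weight * mag[y][x]
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0
    return [v / norm for v in hist]


# Patch with unit gradients that all point along direction 0.
flat_mag = [[1.0] * 16 for _ in range(16)]
flat_ori = [[0.0] * 16 for _ in range(16)]
descriptor = sift_like_descriptor(flat_mag, flat_ori)
```

With 16 cells and 8 bins the vector has 128 entries; for the uniform test patch only the first bin of each cell is filled.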
4. Experiment Result and Analysis
4.1. MSER Detection
This section tests the stability of the MSER detection algorithm under different stability criteria with respect to affine transformation, illumination change and image blurring, using the repeatability of the feature regions as the evaluation standard. Assume that points $x_a$ and $x_b$ in the feature regions of images $a$ and $b$ satisfy $x_a = H x_b$. Here, $H$ is the transformation matrix between the two images; it is the identity matrix for illumination change and blurring. The overlap error of the feature regions is defined as follows:

$$
\varepsilon_o = 1 - \frac{R_{\mu_a} \cap R_{(H^T \mu_b H)}}{R_{\mu_a} \cup R_{(H^T \mu_b H)}} \tag{13}
$$

Here, $R_\mu$ is the ellipse fitted to a detected feature region; $\mu$ is the covariance matrix that defines the ellipse; $\cup$ and $\cap$ denote the union and intersection of the ellipse regions; and $\varepsilon_o$ is the overlap error. The set $S$ of matched regions between the two images is obtained from the overlap error, and the ratio of its size to the smaller number of feature regions in the two images is defined as the repeatability, namely:
$$
\text{repeatability} = \frac{|S|}{\min\left( |x_a|, |x_b| \right)} \tag{14}
$$

Here, $|\cdot|$ is the number of elements in the set.
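The two measures can be illustrated with pixel sets standing in for the fitted ellipses. The sketch below assumes that both regions are already expressed in the same image frame and that each entry of `overlap_errors` is the best overlap error found for one region; it does not enforce one-to-one matching. All names and sample regions are hypothetical.

```python
def overlap_error(region_a, region_b):
    """1 - |A intersect B| / |A union B| for two regions given as pixel sets."""
    a, b = set(region_a), set(region_b)
    return 1.0 - len(a & b) / len(a | b)


def repeatability(overlap_errors, n_a, n_b, max_error=0.5):
    """Share of matched regions relative to the smaller region count."""
    matched = sum(1 for e in overlap_errors if e < max_error)
    return matched / min(n_a, n_b)


def square(x0, y0, side=10):
    """Axis-aligned square of pixels, used here as a toy region."""
    return {(x, y) for x in range(x0, x0 + side) for y in range(y0, y0 + side)}


near = overlap_error(square(0, 0), square(2, 0))  # shifted by 2 pixels
far = overlap_error(square(0, 0), square(8, 0))   # shifted by 8 pixels
score = repeatability([near, far], n_a=2, n_b=3)
```

The slightly shifted square overlaps enough to count as a match and the strongly shifted one does not, so one of the two regions in the smaller set is repeated.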
Since the transformation matrix between images taken from different view angles in reality is unknown, the matrix estimated by image registration methods contains a certain error, which affects the estimation of the overlap error. Therefore, this paper follows the method in Literature [12]: a preset transformation matrix is used to apply an affine transform to the image, the feature regions are detected on the transformed image, and the overlap error is calculated, which avoids the error introduced by manual registration. The affine transform is decomposed into a shear transformation, an anisotropic scaling transformation and a rotation transformation, and is described by four parameters. For the scaling transformation, this paper assumes that the scale factors in the two directions are the same, which reduces the four parameters to three. By fixing two of the three parameters and varying the third, the repeatability under different affine transforms is measured.
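A preset three-parameter transform of this kind could be built as in the sketch below. The order of composition (scaling first, then shear, then rotation) and the function names are assumptions made for illustration; the paper does not specify them.

```python
import math


def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def affine_matrix(theta, shear, scale):
    """Compose rotation * shear * uniform scaling into one 2x2 matrix."""
    rotation = [[math.cos(theta), -math.sin(theta)],
                [math.sin(theta), math.cos(theta)]]
    shearing = [[1.0, shear], [0.0, 1.0]]
    scaling = [[scale, 0.0], [0.0, scale]]
    return matmul2(rotation, matmul2(shearing, scaling))


def determinant(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


# Vary one parameter while the other two stay fixed, as in the experiment.
scale_only = affine_matrix(0.0, 0.0, 2.0)
mixed = affine_matrix(0.7, 0.3, 1.5)
```

Rotation and shear preserve area, so the determinant of the composed matrix depends only on the scale factor, which is a quick sanity check for the construction.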
The test images used in the experiment are shown in Figure 3. When the overlap error $\varepsilon_o$ is smaller than 50%, two regions are deemed matched, and only the regions detected in both images are considered. Figures 4 and 5 show the repeatability curves under different affine transforms, illumination changes and blurring. The blurred images are obtained by convolving the image with Gaussian functions of different variances.
Figure 3. Test images
Figure 4. Repeatability in affine transform (panels: scale, rotation, shear; vertical axis: repeatability (%); curves: Rule 1, Rule 2, Rule 3)
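Figure 5. Repeatability in different light and blurring (panels: illumination, blur; vertical axis: repeatability (%); curves: Rule 1, Rule 2, Rule 3)

Figure 6. MSER detection result by $\Psi_1$ and $\Psi_3$ (panels: detection result of $\Psi_1$, detection result of $\Psi_3$)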
In Figures 4 and 5, Rule 1, Rule 2 and Rule 3 use $\Psi_1$, $\Psi_2$ and $\Psi_3$ respectively as the criterion for judging stability. It can be seen from Figures 4 and 5 that all three methods have good affine invariance and illumination invariance. The methods using $\Psi_2$ and $\Psi_3$ still obtain over 40% repeatability after the images are magnified 4 times. When the image is blurred, the repeatability of the method using $\Psi_1$ falls quickly, while the methods using $\Psi_2$ and $\Psi_3$ as the stability criterion keep higher repeatability than the method using $\Psi_1$. Because $\Psi_2$ requires the image to be normalized before stability is judged, the method using $\Psi_2$ performs better than the method using $\Psi_3$; however, detection with $\Psi_3$ needs no normalization. Figure 6 compares the MSER detection and ellipse fitting results of $\Psi_1$ and $\Psi_3$ under affine transform and image blurring.
In Figure 6(a) and (b), the 1st column is the detection result on the original image; the 2nd column is the result on the standard test image under view-angle change; and the 3rd column is the detection result after the original image is blurred by a Gaussian function with a variance of 10. The detection results show that the method using $\Psi_3$ removes the unstable extremal regions in the image, obtains more stable detection than the method using $\Psi_1$, and still detects well under blurring.
4.2. The Image Recognition Integrating MSER and SIFT
This section uses the method of this paper and the SIFT local feature extraction method to match the standard test images, with the Euclidean distance as the similarity measure between the feature vectors of the two images. To eliminate false matching pairs caused by occlusion or background information, it uses the method proposed by Lowe, which compares the nearest-neighbor distance with the next-nearest-neighbor distance. Assuming that the feature to be matched is $C_A$, its nearest feature is $C_B$ and its next-nearest feature is $C_D$, the condition for accepting a match is:

$$
\frac{\left\| C_A - C_B \right\|_2}{\left\| C_A - C_D \right\|_2} < t \tag{15}
$$
Here, $t$ is the matching threshold, set to 0.6 in this paper. Figure 7 shows the matching result, displaying only the 30 matching pairs with the highest match scores, and Table 1 gives the statistics of the results.
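The matching condition of formula (15) can be sketched in a few lines. The example below assumes that descriptors are plain tuples of numbers and uses a brute-force search; the function name and the toy descriptors are hypothetical, and only the threshold 0.6 comes from the text.

```python
import math


def ratio_test_matches(desc_a, desc_b, t=0.6):
    """Match descriptors of image a to image b with the distance-ratio test.

    A pair is kept only if the nearest neighbor is closer than t times
    the next-nearest neighbor (Euclidean distance).
    """
    matches = []
    for i, d in enumerate(desc_a):
        dists = sorted((math.dist(d, e), j) for j, e in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio needs two candidates
        (nearest, j_nearest), (second, _) = dists[0], dists[1]
        if nearest < t * second:
            matches.append((i, j_nearest))
    return matches


# Toy 2-D "descriptors": the first query has one clear neighbor, the
# second has two equally close candidates and is therefore rejected.
queries = [(0.0, 0.0), (5.0, 5.0)]
candidates = [(0.0, 1.0), (10.0, 10.0), (5.5, 5.0), (4.5, 5.0)]
matches = ratio_test_matches(queries, candidates)
```

Writing the test as `nearest < t * second` instead of dividing avoids a division by zero when two candidates coincide with the query.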
Figure 7. Result of image recognition (panels: SIFT, MSER+SIFT)
Table 1. Statistic of Matching Result
Local Feature    Matching/Pair    Accurate Matching/Pair    Accuracy (%)
SIFT             901              731                       81.13
MSER+SIFT        1144             953                       83.30
It can be seen from the statistics in Table 1 that, for the standard test images under view-angle change, the local feature extraction method based on MSER and SIFT proposed in this paper outperforms SIFT in both the number of matching pairs and the number of accurate matching pairs of feature points.
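The accuracy column of Table 1 is the share of accurate pairs among all matched pairs. The short check below recomputes it from the counts in the table; the variable names are illustrative.

```python
# (matched pairs, accurate pairs) as listed in Table 1.
results = {"SIFT": (901, 731), "MSER+SIFT": (1144, 953)}

# Accuracy in percent, rounded to two decimals as in the table.
accuracy = {
    name: round(100.0 * accurate / matched, 2)
    for name, (matched, accurate) in results.items()
}

# Additional pairs found by the combined method.
extra_matches = results["MSER+SIFT"][0] - results["SIFT"][0]
extra_accurate = results["MSER+SIFT"][1] - results["SIFT"][1]
```

The recomputed percentages agree with the table, and the combined method yields 243 more matched pairs, 222 of which are accurate.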
5. Conclusion
This paper uses an improved stability criterion for maximally stable regions to detect local MSER regions. Compared with the traditional methods, the proposed method detects more stably and still detects irregular regions well. It takes the MSER regions as the local region features of the object. The SIFT-based method describes the detected MSER regions and constructs the local feature descriptors. Gaussian weighting accounts for the influence that different pixels have on the central pixel and improves stability, so the method is applicable to object matching under view-angle change.
Acknowledgements
This work was supported by the Open Project of Intelligent Information Processing Lab at Suzhou University of China (No.2013YKF17), the Horizontal Project at Suzhou University of China (No.2015hx025) and the Quality Project of Anhui Province: Software engineering teaching team (2015jxtd041).
References
[1] Lowe DG. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision. 2004; 60(2): 91-110.
[2] Mikolajczyk K, Tuytelaars T, Schmid C, et al. A Comparison Of Affine Region Detectors. International Journal of Computer Vision. 2005; 65(1): 43-72.
[3] Matas J, Chum O. Robust Wide-baseline Stereo from Maximally Stable Extremal Regions. British Machine Vision Computing. 2002; 22(10): 761-767.
[4] Per-Erik Forssen, David G Lowe. Shape Descriptors for Maximally Stable Extremal Regions. IEEE International Conference on Computer Vision. 2007: 1-8.
[5] Mikolajczyk K, Schmid C. A Performance Evaluation of Local Descriptors. IEEE Transactions on Pattern Analysis & Machine Intelligence. 2005; 27(10): 1615-1630.
[6] Hu J, Peng X, Fu C. A Comparison of Feature Description Algorithms. Optik-International Journal for Light and Electron Optics. 2015; 126: 274-278.
[7] Fan B. Rotationally Invariant Descriptors using Intensity Order Pooling. Pattern Analysis and Machine Intelligence, IEEE Transactions. 2012; 34(10): 2031-2045.
[8] C Zhu, D Huang, CE Bichot, Y Wang, L Chen. HSOG: A Novel Local Descriptor Based on Histograms of Second Order Gradients for Object Categorization. Proc. of ACM International Conference on Multimedia Retrieval (ICMR). 2013: 199-206.
[9] Zhang C, Bronstein AM, Kimmel R, et al. Are MSER Features Really Interesting?. IEEE Transactions on Software Engineering. 2011; 33(11): 2316-2320.
[10] Yueqiu Jiang, Yiguang Cheng, Hongwei Gao. Improved Characters Feature Extraction and Matching Algorithm Based on SIFT. TELKOMNIKA Indonesian Journal of Electrical Engineering. 2014; 12(1): 334-343.
[11] Quan Sun, Jianxun Zhang. Parallel Research and Implementation of SAR Image Registration Based on Optimized SIFT. TELKOMNIKA Indonesian Journal of Electrical Engineering. 2014; 12(2): 1125-1131.
[12] Hongping Cai. Techniques for Local Feature Based Image Categorization. Dissertation. China: National University of Defense Technology; 2010.