TELKOMNIKA, Vol. 12, No. 4, December 2014, pp. 921~932
ISSN: 1693-6930, accredited A by DIKTI, Decree No: 58/DIKTI/Kep/2013
DOI: 10.12928/TELKOMNIKA.v12i4.810

Received August 17, 2014; Revised October 14, 2014; Accepted November 4, 2014
Face Recognition Using Invariance with a Single
Training Sample
Qian Tian
National ASIC Engineering Research Center, Southeast University, Nanjing, China
e-mail: tianqian@seu.edu.cn
Abstract
Because of the limited memory and computing performance of current intelligent terminals, it is necessary to develop strategies that balance accuracy and running time for face recognition. The purpose of the work in this paper is to find the invariant features of facial images and to represent each subject with only one training sample for face recognition. We propose a two-layer hierarchical model, called the invariance model, and its corresponding algorithms to keep the balance of accuracy, storage and running time. In particular, we take advantage of wavelet transformations and invariant moments to obtain the key features and to reduce the dimensions of the feature data, based on the cognitive rules of human brains. Furthermore, we improve the usual pooling methods, e.g. max pooling and average pooling, and propose a weighted pooling method that reduces dimensions with no effect on accuracy, which greatly decreases the storage requirement and recognition time. The simulation results show that the proposed method does better than some typical and recently proposed algorithms in balancing accuracy and running time.
Keywords: invariance model, single training sample, face recognition
1. Introduction
In recent years, face recognition techniques have been applied on intelligent mobile terminals for a variety of applications. However, the terminals have resource limits such as low power, small memories and low computing performance. In this case, a small training base and simple algorithms are preferred, so it is necessary to study strategies for face recognition with one training sample that keep a balance between accuracy and running time.
There are already some results [1]-[5] in academia. The improved discriminative multi-manifold analysis (DMMA) method was proposed in [1] by partitioning each face image into several non-overlapping patches to form an image set for each person. Although the recognition accuracy of DMMA [1] was higher than 80 percent on the AR database, its running time on a PC is about 100 times that of typical methods such as PCA, due to its complexity. This raises the question of whether an invariant factor can be acquired for face recognition using simple methods, achieving less running time and an acceptable accuracy. This paper tries to solve this problem.
The idea in this paper is inspired by the invariance in visual recognition [6]-[8], which imitates the tuning properties of view-tuned cells in infero-temporal (IT) cortex [9],[10]. One of the corresponding famous models is the Hierarchical Model and X (HMAX) proposed by Poggio and his research group [11],[12]. HMAX is a four-layer hierarchical model in which two types of computations, linear summation and the non-linear max operation, alternate between layers. The HMAX model was first proposed for object recognition, not specially for face recognition. The research group of Poggio later proposed a method for face recognition based on HMAX, which described cognitive characteristics accurately as a face descriptor [6]-[8]. This method was simulated on a PC without considering the operating platform, especially terminals without enough resources such as storage and computing abilities. The HMAX-based methods mainly make use of Gabor filters for feature extraction, while Gabor filters cannot capture enough local details to represent the distinct features of different faces. Thus, we have to find a kind of wavelet filters instead of Gabor filters to get enough details of face features at different scales. To improve the accuracy using one training sample, the method of patch segments is also used for feature extraction. Furthermore, to reduce the quantities of data and still keep the invariance of the facial images, we take advantage of invariant moments [13],[14] and pooling techniques
to keep the geometric invariance of the images, such as rotation, shift and zooming, as well as reducing the data storage. Based on these considerations, we propose an invariance model and its corresponding algorithms for face recognition using one training sample under the conditions of a low-storage and low-computing-performance platform.
This paper is organized into five sections. Section II introduces some related work including the HMAX model and moment invariants. Section III illustrates the proposed invariance model and the corresponding algorithm in detail. The simulation results based on public face databases are discussed in Section IV. Section V makes a conclusion and discusses further work.
2. Related Work
2.1. HMAX Model
There are two simple units and two complex units in the HMAX model. The schematic of HMAX is shown in Figure 1.

Figure 1. Schematic of HMAX model
In Figure 1, S1 and S2 are simple units, and C1 and C2 are complex units. At S1, arrays of two-dimensional filters, generally Gabor filters, at four different orientations are applied to the input raw images. Gabor filters are tightly tuned for both orientation and frequency but cover a wide range of spatial frequencies. Then, the groups of cells at the same orientation but at slightly different scales and positions are fed from S1 into C1 using the MAX operation [11]. At S2, within each filter band, a square of four adjacent and non-overlapping C1 cells in a 2×2 arrangement is grouped as cells, which are again fed from S2 into C2, finally achieving size invariance over all filter sizes in the four filter bands and position invariance over the whole input images.
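As a concrete illustration of the C1 stage, the sketch below (our own minimal code with assumed array shapes, not part of the original paper) max-pools two same-orientation S1 response maps over adjacent scales and over local positions:

```python
import numpy as np

def c1_max_pool(s1_a, s1_b, pool=2):
    """MAX operation of C1: pool two same-orientation S1 maps
    at adjacent scales over scale and over local position."""
    h = min(s1_a.shape[0], s1_b.shape[0])
    w = min(s1_a.shape[1], s1_b.shape[1])
    merged = np.maximum(s1_a[:h, :w], s1_b[:h, :w])   # max over scale
    h2, w2 = h // pool * pool, w // pool * pool
    blocks = merged[:h2, :w2].reshape(h2 // pool, pool, w2 // pool, pool)
    return blocks.max(axis=(1, 3))                    # max over position
```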
2.2. Moment Invariants
The feature moment, as a global invariant, is often used in feature selection to reduce the input of classifiers. The representations of seven moment invariants based on the second-order and third-order normalized center moments are given by [7]. For a two-dimensional M×N image function f(x,y), the definitions of the (p+q)-order geometry moment $m_{pq}$ and the center moment $\mu_{pq}$ are given by (1) and (2).
$$m_{pq} = \sum_{i=1}^{M}\sum_{j=1}^{N} i^{p}\, j^{q}\, f(i,j) \qquad (1)$$

$$\mu_{pq} = \sum_{i=1}^{M}\sum_{j=1}^{N} (i-\bar{x})^{p}\, (j-\bar{y})^{q}\, f(i,j) \qquad (2)$$
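As a direct numpy rendering of (1) and (2), the following minimal sketch may help (the function names are ours, purely illustrative):

```python
import numpy as np

def geometry_moment(f, p, q):
    """(p+q)-order geometry moment m_pq of image f, per equation (1)."""
    M, N = f.shape
    i = np.arange(1, M + 1).reshape(-1, 1)    # row coordinates 1..M
    j = np.arange(1, N + 1).reshape(1, -1)    # column coordinates 1..N
    return np.sum((i ** p) * (j ** q) * f)

def center_moment(f, p, q):
    """(p+q)-order center moment mu_pq of image f, per equation (2)."""
    M, N = f.shape
    m00 = geometry_moment(f, 0, 0)
    x_bar = geometry_moment(f, 1, 0) / m00    # gravity center along x
    y_bar = geometry_moment(f, 0, 1) / m00    # gravity center along y
    i = np.arange(1, M + 1).reshape(-1, 1)
    j = np.arange(1, N + 1).reshape(1, -1)
    return np.sum(((i - x_bar) ** p) * ((j - y_bar) ** q) * f)
```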
Here, $\bar{x} = m_{10}/m_{00}$ is the gravity center in the x direction of the image, and $\bar{y} = m_{01}/m_{00}$ is the gravity center in the y direction of the image. The zero-order center moment $\mu_{00}$ is used to normalize the moments, so the normalized center moment is given by (3).

$$\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\,r}}, \qquad r = \frac{p+q}{2} + 1 \qquad (3)$$
Then, the seven moment invariants are given by (4), defined in terms of the second- and third-order normalized moments. The seven invariants are useful not only for pattern identification independent of position, size and orientation, but also independent of parallel projection.

$$\begin{aligned}
\phi_1 &= \eta_{20} + \eta_{02} \\
\phi_2 &= (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2 \\
\phi_3 &= (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2 \\
\phi_4 &= (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2 \\
\phi_5 &= (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] \\
       &\quad + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] \\
\phi_6 &= (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03}) \\
\phi_7 &= (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] \\
       &\quad - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right]
\end{aligned} \qquad (4)$$
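Reusing the center_moment helper sketched above, the normalization of (3) and the invariants of (4) can be written as follows (again our illustrative code, not the paper's implementation):

```python
import numpy as np

def hu_invariants(f):
    """Seven moment invariants of (4), from normalized moments (3)."""
    mu00 = center_moment(f, 0, 0)
    def eta(p, q):                                   # equation (3)
        return center_moment(f, p, q) / mu00 ** ((p + q) / 2 + 1)
    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    e30, e03, e21, e12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    phi1 = e20 + e02
    phi2 = (e20 - e02) ** 2 + 4 * e11 ** 2
    phi3 = (e30 - 3 * e12) ** 2 + (3 * e21 - e03) ** 2
    phi4 = (e30 + e12) ** 2 + (e21 + e03) ** 2
    phi5 = ((e30 - 3 * e12) * (e30 + e12)
            * ((e30 + e12) ** 2 - 3 * (e21 + e03) ** 2)
            + (3 * e21 - e03) * (e21 + e03)
            * (3 * (e30 + e12) ** 2 - (e21 + e03) ** 2))
    phi6 = ((e20 - e02) * ((e30 + e12) ** 2 - (e21 + e03) ** 2)
            + 4 * e11 * (e30 + e12) * (e21 + e03))
    phi7 = ((3 * e21 - e03) * (e30 + e12)
            * ((e30 + e12) ** 2 - 3 * (e21 + e03) ** 2)
            - (e30 - 3 * e12) * (e21 + e03)
            * (3 * (e30 + e12) ** 2 - (e21 + e03) ** 2))
    return np.array([phi1, phi2, phi3, phi4, phi5, phi6, phi7])
```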
3. Invariance Model and Proposed Algorithms
As illustrated in Section II, Gabor filters are generally used for feature extraction at the first layer of the HMAX model. The Gabor transformation can simulate the human visual system by decomposing the retinal image into a set of filter images. Each filter image can represent changes in the intensities of frequency and direction in the local scope of the raw image, so the texture features can be obtained by a group of multi-channel Gabor filters. Equation (5) shows the Gabor filter; the key parameters are the frequency function and the window size of the Gauss function. Actually, the Gabor transformation uses a Gauss function as a window function to make a local Fourier transformation by choosing the frequency and Gauss parameters. Although Gabor filters have tempo-spatial characteristics, they are not orthogonal, so different feature components have redundancies, which leads to low efficiency in texture feature extraction.
$$G(x, y, \theta, f) = \exp\!\left(-\frac{1}{2}\left[\left(\frac{x'}{s_x}\right)^{2} + \left(\frac{y'}{s_y}\right)^{2}\right]\right)\sin(2\pi f x'), \qquad
x' = x\cos\theta + y\sin\theta, \quad
y' = y\cos\theta - x\sin\theta \qquad (5)$$
Here, $s_x$ and $s_y$ are the variances along the x and y axes respectively, f is the frequency of the sinusoidal function, and $\theta$ is the orientation of the Gabor filter.
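Equation (5) translates directly into code; the sketch below is our illustration (the window size (2, 1) and the orientation 2π/12 come from the Figure 3 caption, while the frequency value is an assumption):

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(sx, sy, theta, f, half=15):
    """Gabor kernel of equation (5), sampled on a (2*half+1)^2 grid."""
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_r = x * np.cos(theta) + y * np.sin(theta)      # x' in (5)
    y_r = y * np.cos(theta) - x * np.sin(theta)      # y' in (5)
    return (np.exp(-0.5 * ((x_r / sx) ** 2 + (y_r / sy) ** 2))
            * np.sin(2 * np.pi * f * x_r))

def gabor_filter(image, sx=2.0, sy=1.0, theta=2 * np.pi / 12, f=0.1):
    """One filter image of Figure 3: convolve the raw image with a kernel."""
    return convolve2d(image, gabor_kernel(sx, sy, theta, f), mode='same')
```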
We randomly select a facial image from the ORL database, shown in Figure 2. Figure 3 shows the filter images of Figure 2 using (5). The images in Figure 3 are normalized for easy observation.
The upper row images in Figure 3 are the four filter images at one scale and four orientations, and the lower row images are the four filter images at another scale and four orientations.
Figure 3 shows that there are only very slight variances among the eight images, and the local details are not enough. On the other hand, for the same raw image, we make wavelet transformations at four scales using the wavelet 'db5'. Figure 4 shows the result, in which the upper left image, the upper right image, the lower left image and the lower right image are the detail images at scale four, scale three, scale two and scale one respectively. Comparing Figure 3 and Figure 4, the wavelet transformation can obtain more details and local features than the Gabor transformations. Actually, we let the leftmost image of the upper row in Figure 3 subtract the other seven images of Figure 3 respectively to get seven difference matrices, and then calculate the corresponding standard deviations of the seven matrices. Similarly, we conduct the same operation on Figure 4. The computation results are shown in Figure 5. Obviously, the standard deviation values of the filtered images using wavelets are much bigger than those using Gabor. The standard deviation averages using wavelets and Gabor are 5.2420 and 0.7182 respectively.
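This comparison can be reproduced in a few lines (a sketch assuming filt_imgs holds the eight filtered images of Figure 3, or the reconstructed images of Figure 4):

```python
import numpy as np

def detail_spread(filt_imgs):
    """Mean standard deviation of the differences between the first
    image and each of the others (the measure plotted in Figure 5)."""
    ref = filt_imgs[0]
    stds = [np.std(ref - img) for img in filt_imgs[1:]]   # difference matrices
    return float(np.mean(stds))
```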
Figure 2. The raw image in the ORL database

Figure 3. The upper row images are the results using Gabor filters with the window size of (2, 1) and the orientations of $2\pi/12$, $5\pi/12$, $8\pi/12$ and $11\pi/12$ respectively. The lower row images are the results using Gabor filters with the window size of (4, 2) and the orientations of $2\pi/12$, $5\pi/12$, $8\pi/12$ and $11\pi/12$ respectively
Therefore, we chose the wavelet transformation instead of Gabor filters to get more local features. The typical orthogonal wavelet 'db5' is used as the kernel to make the wavelet transformations. For each image, the wavelet transformation is conducted at four scales to get the corresponding approximation coefficients, whose matrix is named A0, at the fourth scale, and the detail coefficients, whose matrices are named D1, D2, D3 and D4, at the four different scales respectively. Then, A0, D1, D2, D3 and D4 are each used singly for reconstruction so as to generate five reconstructed images, e.g. Figure 4.
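With a wavelet library such as PyWavelets this decomposition-and-reconstruction step can be sketched as follows (our illustration; the paper does not name a particular implementation):

```python
import numpy as np
import pywt

def reconstructed_images(image):
    """Decompose with 'db5' at four scales and rebuild five images,
    each from one coefficient set only (A0, D1, D2, D3 or D4)."""
    coeffs = pywt.wavedec2(image, 'db5', level=4)   # [A0, details s4..s1]
    images = []
    for keep in range(len(coeffs)):
        # Zero out every coefficient set except the one we keep.
        masked = [c if i == keep else
                  (np.zeros_like(c) if i == 0
                   else tuple(np.zeros_like(d) for d in c))
                  for i, c in enumerate(coeffs)]
        images.append(pywt.waverec2(masked, 'db5'))
    return images   # five reconstructed images, e.g. Figure 4
```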
Figure 4. The reconstruction images using the detail coefficients of the wavelet transformation at four scales respectively

Figure 5. The standard deviation comparison using wavelet transformations and the Gabor transformation
After the wavelet transformations and reconstructions, one image is decomposed to generate five images at different scales, so the quantity of data increases by five times. To solve this problem, the technique of dimension reduction has to be considered. Although PCA is a typical method of dimension reduction, it operates on the global features without considering the local details. In the case of only one training sample for each face, the local details play a very important role.
To keep local features as well as reduce data dimensions, we first divide the reconstructed images into patches. Combining the cognitive law and the facial image characteristics, we divide the facial image into 9 patches, shown in Figure 6.
Each patch just represents one physiological region of the face. Patch 1 is the left forehead, patch 2 is the middle forehead, patch 3 is the right forehead, patch 4 is the left cheek as well as the left eye, patch 5 is the nose, patch 6 is the right cheek as well as the right eye, patch 7 is the left chin, patch 8 is the lips, and patch 9 is the right chin.
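Assuming the nine patches form a regular 3×3 grid (the paper does not give the exact patch boundaries), the segmentation can be sketched as:

```python
def nine_patches(image):
    """Split an image into a 3x3 grid of patches, numbered 1..9
    row by row as in Figure 6 (the regular grid is our assumption)."""
    h, w = image.shape
    hs = [0, h // 3, 2 * h // 3, h]
    ws = [0, w // 3, 2 * w // 3, w]
    return [image[hs[r]:hs[r + 1], ws[c]:ws[c + 1]]
            for r in range(3) for c in range(3)]
```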
Figure 6. Patch segments of the facial image
Among so much feature data, the key features have to be preserved to represent the characteristics of each facial image. Because even images of the same face still vary slightly under different conditions such as illumination, the invariance of the features becomes more and more important. Thus, the technique of moment invariants is considered to preserve the invariance of the feature data, due to its invariance to scale, zoom and rotation. The details of moment invariants are introduced in Section II. For each reconstructed image, we first divide it into 9 patches, then compute the seven moment invariants of each patch, and finally obtain one column of 63 data. The corresponding algorithm is summarized in Table 1.
Table 1. Algorithm 1 (A.1): Feature Extraction

A.1 Feature extraction
Step 1. Wavelet decomposition. For each training facial image, the wavelet transformation is made at four scales to get approximation coefficients named A0 and detail coefficients named D1, D2, D3, D4 at the four scales.
Step 2. Image reconstruction. We make wavelet reconstructions respectively using A0, D1, D2, D3 and D4 to generate five reconstructed images (e.g. Figure 4).
Step 3. Patch segments. For each reconstructed image, we divide it into 9 patches (e.g. Figure 6).
Step 4. Moment invariants. We calculate seven moments for each patch, so that totally 9 groups of seven moments are obtained for each reconstructed image. Let the 9 groups of data be one column vector with the size of 63 data. Thus, one vector of 63 data represents the features of one reconstructed image (e.g. Figure 7 and Figure 8).
Step 5. Feature extraction. According to the results of step 4, there are 63 feature data for one reconstructed image. Because one facial image corresponds to five reconstructed images, there are totally 315 feature data for one facial image.
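Chaining the hypothetical helpers sketched earlier, the whole of A.1 reduces to a few lines:

```python
import numpy as np

def extract_features_a1(image):
    """A.1: 5 reconstructed images x 9 patches x 7 invariants = 315 data."""
    features = []
    for rec in reconstructed_images(image):        # Steps 1-2
        for patch in nine_patches(rec):            # Step 3
            features.append(hu_invariants(patch))  # Step 4
    return np.concatenate(features)                # Step 5: 315 values
```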
We randomly chose some images, shown in Figure 7 and Figure 8, to illustrate the feature extraction. Figure 7 shows that the moment invariants of the two faces of the same person have only slight differences; especially, among the seven moment invariants referring to equation (4), $\phi_1$ and $\phi_2$ are almost the same. Figure 8 shows the moment invariants of two different facial images. Obviously, there are differences between the circle lines and the cross lines, especially in the reconstructed images at scales 1, 2 and 3 (see Figure 8(c), (d), (e)).
Figure 7. Comparison of wavelet reconstructed images between two frontal facial images of the same person: (a) the two facial images; (b) moment invariants of approximations; (c), (d), (e), (f) moment invariants of details at scales 1, 2, 3 and 4 respectively. From (b) to (f), the circle lines represent moment invariants of the reconstructed images of the left face, and the cross lines represent moment invariants of the reconstructed images of the right face

Figure 8. Comparison of wavelet reconstructed images between two frontal facial images of different persons: (a) the two facial images; (b) moment invariants of approximations; (c), (d), (e), (f) moment invariants of details at scales 1, 2, 3 and 4 respectively. From (b) to (f), the circle lines represent moment invariants of the reconstructed images of the left face, and the cross lines represent moment invariants of the reconstructed images of the right face
To reduce the dimensions of the feature data further, we also propose a weighted average pooling method. Observing the patches in Figure 6 and combining the cognitive rules of human brains, we suppose that patches 4, 5, 6 and 8 contribute more than the other patches to representing one face, so we distribute comparatively bigger weights, named 'w4', 'w5', 'w6', 'w8', to patches 4, 5, 6 and 8 than those named 'w1', 'w2', 'w3', 'w7', 'w9' to patches 1, 2, 3, 7, 9. All of the nine weights are empirical values from many experiments, and the sum of all the weights is equal to 1. The details are shown in Table 2.
Using the weighted average pooling method, each reconstructed image has seven feature data, so that each training sample has 35 feature data. The dimensions of the training data are greatly reduced. That is, the storage requirement decreases, which is a very important advantage for embedded devices, e.g. sensor nodes of small size and low cost.
Table 2. Algorithm 2 (A.2): Dimension reduction

A.2 Dimension reduction
Step 1: Weights distribution. Distribute 'w1', 'w2', 'w3', 'w4', 'w5', 'w6', 'w7', 'w8', 'w9' to patches 1, 2, 3, 4, 5, 6, 7, 8, 9 respectively; the sum of the nine weights is 1.
Step 2: Weighted average pooling. Supposing the moment invariant vectors of the nine patches are $v_1, v_2, \ldots, v_9$ in sequence, the feature vector of each reconstructed image is

$$v = \sum_{i=1}^{9} w_i \, v_i$$
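A sketch of A.2 follows; the concrete weight values are not given in this excerpt, so the numbers below are placeholders that merely satisfy the stated constraints (bigger weights for patches 4, 5, 6 and 8; sum equal to 1):

```python
import numpy as np

# Hypothetical weights w1..w9; patches 4, 5, 6, 8 get the bigger shares.
WEIGHTS = np.array([0.05, 0.05, 0.05, 0.15, 0.20, 0.15, 0.05, 0.20, 0.10])

def weighted_average_pooling(patch_vectors, weights=WEIGHTS):
    """A.2 Step 2: pool nine 7-element invariant vectors into one."""
    v = np.stack(patch_vectors)   # shape (9, 7)
    return weights @ v            # seven pooled feature data
```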
In summary, an invariance model of two layers is constructed. Figure 9 shows the invariance model structure, composed of the invariance attributes (IA) layer and the invariance cluster (IC) layer. In IA, wavelet transformations are conducted at four scales, resulting in four-scale detail coefficients and approximation coefficients, which are respectively used to generate five new reconstructed images. To keep the invariance and reduce the quantities of data, we take advantage of the techniques of patch segments and invariant moments to keep the global invariance of each reconstructed image. The corresponding algorithm of IA is summarized in A.1. On the basis of IA, feature clustering has to be conducted to reduce dimensions as well as keep feature invariance. So in IC, the improved weighted pooling technique is used to select the key features. The corresponding algorithm of IC is summarized in A.2. Finally, a vector of 35 feature data represents the raw image.
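The two layers compose end to end as below (a compact sketch chaining our earlier hypothetical helpers):

```python
import numpy as np

def invariance_model(image):
    """IA + IC: one facial image -> a 35-dimensional feature vector."""
    pooled = []
    for rec in reconstructed_images(image):               # IA: five images
        vecs = [hu_invariants(p) for p in nine_patches(rec)]
        pooled.append(weighted_average_pooling(vecs))     # IC: seven data
    return np.concatenate(pooled)                         # 5 * 7 = 35
```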
Figure 9. The invariance model structure
4. Simulations
Three public face databases, the ORL database, the AR database and the FERET database, are used for the simulations [15]. The details of the selected facial images are shown in Table 3. The simulations run on a PC with a 3.3 GHz CPU and 4 GB RAM.
Table 3. Characteristics of selected images in ORL, AR and FERET

Name  | No. of people | No. of pictures per person | Conditions                                                                      | Size per image
ORL   | 40            | 10                         | frontal and slight tilt; light expressions; variable light                      | 112×92
AR    | 120           | 26                         | frontal view with different expressions; variable illuminations and occlusions  | 80×100
FERET | 200           | 7                          | multiple poses; faces of each person taken at different times                   | 80×80
We first chose the proposed method, called IM for short (for invariance model), including A.1 and A.2, to make the feature extraction and dimension reduction, and then chose KNN and LDA, two simple classifiers, to make the classification. To evaluate the proposed method, after step 2 in A.1 we used PCA, the typical dimension reduction technique, to make the dimension reduction instead of moment invariants, and then chose the same classifiers to make the classification.
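With one training sample per subject, the KNN classifier amounts to nearest-neighbor matching on the 35-dimensional IM vectors; a minimal sketch (ours, not the paper's code):

```python
import numpy as np

def knn_classify(test_vec, train_vecs, labels):
    """1-NN over IM features; train_vecs holds one row per subject."""
    dists = np.linalg.norm(train_vecs - test_vec, axis=1)  # Euclidean
    return labels[np.argmin(dists)]
```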
Figure 10, Figure 11, Figure 12 and Figure 13 show the simulation results of the different methods using one training sample. Because the proposed method does not consider occlusion, the facial images with occlusion in the AR database are not used for the simulation. Table 4 shows the recognition time for one test image, supposing the training data have been prepared.
Figure 10. Recognition rates using the proposed method
Figure 11. Recognition rates using both methods

Figure 12. Recognition rates using both methods
Figure 13. Recognition rates using PCA (PCA+KNN and PCA+LDA on the AR, ORL and FERET databases; recognition rate in %)