Computer Science and Information Technologies
Vol. 2, No. 1, March 2021, pp. 26~32
ISSN: 2722-3221, DOI: 10.11591/csit.v2i1.p26-32
Journal homepage: http://iaesprime.com/index.php/csit
Feature extraction and classification methods of facial expression: a survey

Moe Moe Htay
Faculty of Computer Science, University of Computer Studies, Mandalay (UCSM), Patheingyi, Myanmar
Article Info

Article history:
Received May 10, 2020
Revised Jun 1, 2020
Accepted Jul 24, 2020

ABSTRACT

Facial expression plays a significant role in affective computing and is one of the non-verbal communication channels for human-computer interaction. Automatic recognition of human affect has become a more challenging and interesting problem in recent years. Facial expressions are significant features for recognizing human emotion in daily life. A facial expression recognition system (FERS) can be developed for applications such as human affect analysis, health care assessment, distance learning, driver fatigue detection and human-computer interaction. Basically, there are three main components to recognize a human facial expression: detection of the face or the face's components, feature extraction from the face image, and classification of the expression. This study surveys the methods of feature extraction and classification for FER.

Keywords:
Expression classification
Facial datasets
Facial features
Feature extraction

This is an open access article under the CC BY-SA license.
Corresponding Author:
Moe Moe Htay,
Faculty of Computer Science, University of Computer Studies, Mandalay (UCSM), Patheingyi, Myanmar.
Email: moemoehtay@ucsm.edu.mm
1. INTRODUCTION

In the era of Artificial Intelligence, facial expression recognition (FER) is an interesting and challenging task, with the problems of limited datasets, different environments, pose, occlusion, person variation, etc. FER has been applied in many systems such as human-computer interaction (HCI), games, data-driven animation, surveillance, and clinical monitoring [1]. Ekman and Friesen, psychologists from America, defined six universal facial expressions (fear, happiness, anger, disgust, surprise, and sadness) and also explored the Action Unit-based facial action coding system (FACS) to describe the facial features of expressions [2]. Facial expressions convey nonverbal communication cues that play a significant role in interpersonal relations. Some works in the literature add other emotions such as neutral and contempt, and many compound facial emotions. Some researchers employed handcrafted features extracted using algorithms, while others employed complicated features extracted using deep learning methods. In this paper, we explore the feature extraction methods, feature descriptors, classification methods, methods of feature dimension reduction, frameworks of facial expression recognition systems, and the comparison of their results.

The remainder of the paper is organized as follows. Section 2 reviews the literature on current FER systems. The typical FER system is shown in Section 3. After that, the two types of features of facial images are discussed in Section 4, and Section 5 describes the facial databases for FER systems. Section 6 describes the problem statement of FER systems. In the last section, the conclusion and future work are presented.
2. LITERATURE OF CURRENT FER SYSTEM

The study in [3] used geometric feature extraction, regional local binary pattern (LBP) feature extraction, fusion of both features using autoencoders, and a self-organizing map (SOM)-based classifier. The average accuracy was 97.55% on the MMI database and 98.95% on the CK+ database. The accuracy of the SOM-based classifier is a significant improvement over SVM, with increases of 3.94% for CK+ and 4.36% for MMI, respectively.

The work in [4] explored multiple feature fusion applying histograms of oriented gradients from three orthogonal planes (HOG-TOP), with experiments on three datasets: CK+, GEMEP-FERA 2011, and acted facial expressions in the wild (AFEW) 4.0.

The study in [5] presented a FER model using Haar cascades for face-component detection and a neural network (NN) trained on eye features with mouth features added, on the Japanese JAFFE database. Compared with Sobel edge detection methods, the proposed method achieved better accuracy. The problems of illumination and pose of the image still exist, as does the need to fully meet theoretical and practical requirements by integrating other biometric authentication methods and HCI perception methods.

The authors of [6] examined an emotion recognition system using hybrid feature descriptors combining spatial bag of features and spatial scale-invariant feature transform (SBoF-SSIFT) with K-nearest neighbor classifiers. Codebook construction is applied after feature extraction to represent large feature sets by grouping similar features into a specified number of clusters. The experiments showed accuracies of 98.33% and 98.5% on the JAFFE and extended Cohn-Kanade (CK+) datasets, respectively. However, the recognition performance depends on the number of clusters for codebook generation, the number of detected features, the levels of image segmentation, and the size of the training dataset.

The work in [7] implemented cognition and mapped binary pattern-based FER using the basic emotion model and the circumplex model on CK+ with 100 images for training and 50 images for testing. In the preprocessing step, unwanted information such as hair, ears, and background is removed from the facial image. LBP and a pseudo 3D model are used to extract the facial contours and to segment the face area into sub-regions. To reduce the dimension of the features, mapped local binary patterns are employed, and then two classifiers, SVM and softmax, are used. The results found that local features and expressions are correlated. Moreover, the two classifiers have little difference in performance. Occlusion, complex conditions, and micro-expression recognition will be addressed in future FER systems.

The study in [8] proposed the angled local directional pattern (ALDP) method for texture analysis of facial expressions with six classifiers (k-NN, SVM, DT, RF, Gaussian NB and Perceptron) on the CK+ dataset. Firstly, the facial image was detected using Haar-like features as in [5], and the detected image was then cropped and normalized. The accuracy improved to 99% with the ALDP method with no preprocessing.

The authors of [9] proposed Grey Wolf optimization (GWO) for feature selection and a GWO-neural network (GWO-NN) for feature classification. The parts of the face (eyes, nose, mouth and ears) are detected using the Viola-Jones algorithm, and then SIFT feature extraction is used for the feature points. The accuracy of 89.79% on CK+ is less than that of [8], and 91.22% was achieved.

The framework in [10] combines high-dimensional appearance and geometric features. The system used deep sparse autoencoders (DSAE) to learn robust discriminative features and an active appearance model (AAM) to locate 51 facial landmark points. Three feature descriptors, HoG, gray value, and LBP, are utilized to describe the local features. The linear dimension reduction method PCA is used to compress the features, which are then given as the input of the DSAE. The accuracy of the proposed framework reached 95.79% on the CK+ dataset using the leave-one-subject-out cross-validation method.

The study in [11] presented three models: a differential geometric fusion network (DGFN) with extraction of handcrafted features, a deep facial sequential network (DFSN) based on CNN with auto-extracted features, and DFSN-1, which combines the advantages of DGFN and DFSN by mapping and concatenating handcrafted and auto-extracted features. DFSN-1 achieved the best performance among the three models on all of the CK+, Oulu-CASIA and MMI datasets.

The work in [12] used a deep convolutional neural network (DCNN) with the Caffe framework and a Tesla K20Xm GPU. The frontal face is detected and cropped with OpenCV in the preprocessing of facial images from CK+ and JAFFE. The experiments achieved an accuracy of 97% with leave-one-subject-out cross validation on CK+ and 98.12% with 10-fold cross validation on JAFFE.
One study reviewed 22 local binary pattern variants on the JAFFE and CK databases using the simple parameter-free nearest neighbor classifier (1-NN). For the JAFFE database, the highest recognition accuracy of 97.14% was achieved using the dLBPα, ELGS and LTP descriptors, while for the CK database the highest recognition rate of 100% was achieved using the AELTP, BGC3, CSALTP, dLBPα, nLBPd, STS, and WLD descriptors. The basic LBP descriptor achieved an acceptable performance of 95.71% on JAFFE and 99.28% on the CK database. The study can be extended to include other problems and other datasets.
The work in [13] used a DCNN, adding data augmentation, cross entropy and an L2 multi-class SVM. In [14], weighted center regression adaptive feature mapping (W-CR-AFM) is used for feature distribution and a CNN for feature training on CK+, the Radboud Faces database (RaFD), the Amsterdam dynamic facial expression set (ADFES) and a proprietary database. Differently from other papers, spatial normalization and feature enhancement preprocessing methods are used. The recognition obtained 89.84%, 96.27%, and 92.70% for CK+, RaFD and ADFES, respectively.

The study in [15] addresses the illumination problem of real-world facial images using the fast Fourier transform and contrast limited adaptive histogram equalization (FFT+CLAHE) for poor illumination, and then applies the merged binary pattern code (MBPC). PCA is used as a method of feature dimension reduction and k-NN as a classifier on the SFEW dataset.

The authors of [16] released a new database, iCV-MEFED, at the FG workshop. A multi-modality CNN is compared with a CNN for micro emotion recognition in the paper. The proposed network first extracted visual and geometrical information of features, then concatenated these into a long vector. The feature vector is fed to the hinge loss layer. The framework performs better than the CNN, with a misclassification of 80.212137 using Caffe.

Another three works of the workshop were also proposed. The first winner method used a CNN with a geometric representation of landmark displacement, leading to better results compared with texture-only information. The recognition accuracy achieves 51.84% for seven expressions and 13.7% for compound emotions, with an average time of 1.57 ms using a GPU or 30 ms using a CPU [17].

The study in [18] employed a deep emotional attention model using a cross-channel CNN by adding an attention modulator, on the bimodal face and body (FABO) benchmark database. The system applied a CNN to learn the location of face expressions in a cluttered scene. The study experimented with one-expression and two-expression attention mechanisms, and showed that the accuracy of the framework with attention is better than that without attention.

The authors of [19] proposed a robust facial landmark extraction method by combining a data-driven fully convolutional network (FCN) and a model-driven pre-trained point distribution model (PDM) in three steps: estimation-correction-tuning (ECT). The computation of response maps for global landmark estimation is trained by the FCN, and then the maximum points of the maps are fitted with the PDM to generate the initial facial shape. Finally, a weighted version of regularized landmark mean-shift (RLMS) is applied to fine-tune the facial shape iteratively.

The work in [20] designed an NN architecture learned with three loss functions: fully supervised, weakly supervised and hybrid regularization. The experiments with the proposed model achieved promising results on CK+ and JAFFE under lab environments and on SFEW in the wild.

The study in [21] proposed a transductive deep transfer learning (TDTL) architecture to address the problem of cross-database non-frontal facial expression recognition, applying the VGG face 16-Net on the BU-3DEF and Multi-PIE datasets. The study found that feature representation with the VGG network is better than traditional handcrafted features such as SIFT and LBP at representing complicated features.

The authors of [22] also used the two datasets for experiments to address the problem of cross-domain and cross-view facial expressions, using a transductive transfer regularized least-square regression (TTRLSR) model, color SIFT (CSIFT) features with 49 landmarks, and SVM classifiers. The two databases have only four identical categories: neutral, surprise, happy and disgust. The experiments of the study covered two settings: cross-domain with the same view, and cross-view with the same domain. The PCA algorithm was also applied to reduce the feature dimension.

The studies in references [3, 5-7] classified the six universal emotions: happiness, anger, sadness, surprise, fear, and disgust. The studies in [9, 13, 15, 23-24] classified one more class, neutral, and [8, 17, 23] added a contempt class. All eight classes have been classified by the studies in [10, 11, 16]. However, [21] and [22] have worked on the neutral, happiness, surprise and disgust expressions. Chen et al. [4] worked with 5 classes of the GEMEP-FERA 2011 database and 7 classes of CK+ and AFEW. Li et al. [25] explained seven basic emotions and 11 compound emotions: sadly angry, sadly surprised, sadly fearful, happily surprised, happily disgusted, sadly disgusted, fearfully surprised, fearfully angry, angrily surprised, angrily disgusted and disgustedly surprised. Ferreira et al. [20] worked on classification of the 6 universal classes of JAFFE, SFEW with 6 basic classes plus neutral, and CK+ with 8 classes including contempt.
3. TYPICAL FER SYSTEM

A typical FER system is shown in the system flow of Figure 1. Face detection consists of three tasks: locating the face, cropping the face, and scaling the face. Feature extraction methods, dimension reduction methods and classification methods can then be selected.
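The three detection tasks and the selectable later stages can be sketched as a simple pipeline. This is a minimal illustration with NumPy only, not code from the surveyed papers: `detect_face` is a hypothetical stand-in for a real detector (e.g. a Haar cascade), and nearest-neighbour indexing stands in for proper image interpolation.

```python
import numpy as np

def detect_face(image):
    # Hypothetical detector: a real system would use e.g. a Haar
    # cascade here. For illustration we return a fixed bounding box
    # (x, y, width, height) covering the centre of the image.
    h, w = image.shape
    return (w // 4, h // 4, w // 2, h // 2)

def crop_face(image, box):
    # Crop the detected face region out of the full image.
    x, y, bw, bh = box
    return image[y:y + bh, x:x + bw]

def scale_face(face, size=48):
    # Nearest-neighbour resize to a fixed input size.
    h, w = face.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return face[rows][:, cols]

def fer_pipeline(image, extract, classify):
    # Detection -> feature extraction -> classification, as in Figure 1.
    box = detect_face(image)
    face = scale_face(crop_face(image, box))
    return classify(extract(face))

# Example run with trivial stand-ins for the later stages.
img = np.random.randint(0, 256, (120, 160), dtype=np.uint8)
label = fer_pipeline(img,
                     extract=lambda f: f.mean(),
                     classify=lambda v: "happy" if v > 127 else "neutral")
print(label)
```

In a real system, `extract` would be one of the feature extraction methods of Section 4 and `classify` one of the surveyed classifiers (SVM, k-NN, softmax, etc.).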
Figure 1. Typical FER system
4. FEATURES OF FACIAL IMAGES

Most FER systems use geometrical features, visual features, or both to extract features from face images.
4.1. Geometrical features

Geometrical methods estimate the locations of facial landmarks or of components of facial images such as the eyebrows, the mouth, and the nose; these features can be measured by distances, curvatures, deformations, and other geometric properties to represent the geometric facial features, although they are sensitive to noise [3-4, 9, 16-17]. The paper [9] described a facial point extraction method to extract the points of the eyes, nose, mouth, and ears based on the Viola-Jones object detection algorithm. Four key regions of the face are used to extract geometric features in four steps: detect the face, detect the eyes, locate the eye centers and get the eye region height, and estimate the nose and lip regions. In the paper [17], a facial landmark displacement method is applied to extract geometrical information. Affective geometric features are extracted using the warp transformation of facial landmarks to capture the landmark configuration in [4]. A facial landmark set with 68 points is used as the geometrical representation of the face in [16].
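As an illustration of distance-based geometric features, the sketch below turns a set of 68 landmark points into a vector of pairwise distances, normalised by the inter-ocular distance so the features are invariant to face scale. This is a generic sketch, not any surveyed paper's exact feature set; the eye-centre indices follow the common 68-point landmark convention, and the input array here is synthetic.

```python
import numpy as np

def geometric_features(landmarks):
    """Pairwise-distance features from a (68, 2) landmark array,
    normalised by the inter-ocular distance for scale invariance."""
    pts = np.asarray(landmarks, dtype=float)
    # Eye centres in the common 68-point scheme: points 36-41
    # (right eye) and 42-47 (left eye).
    right_eye = pts[36:42].mean(axis=0)
    left_eye = pts[42:48].mean(axis=0)
    iod = np.linalg.norm(left_eye - right_eye)
    # All pairwise Euclidean distances (upper triangle only).
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    iu = np.triu_indices(len(pts), k=1)
    return dist[iu] / iod

# Synthetic landmarks purely for demonstration.
rng = np.random.default_rng(0)
feats = geometric_features(rng.uniform(0, 100, size=(68, 2)))
print(feats.shape)  # 68*67/2 = 2278 pairwise distances
```

Curvature- or deformation-based features would be built analogously from the same landmark array.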
4.2. Appearance features

Appearance methods such as the scale invariant feature transform (SIFT), Gabor appearance, and local phase quantization can detect the multi-scale, multi-direction local texture changes on either specific regions or the whole face to encode the texture [3-4, 8-9, 16]. In [7], a mapped local binary pattern with four neighborhoods is used to describe the change of local texture features, and the face is then divided into six regions (forehead, eyes, nose, mouth, left cheek and right cheek) using a pseudo 3D model. The paper [8] described the texture feature using the angled local directional pattern, which considers the center pixel. In reference [9], the scale invariance feature transform method is applied to extract unique and precise informative face features. The paper [3] used the local binary pattern to extract local texture features of four basic regions of the face: the two eyes, nose and mouth. To extract dynamic texture features from video, [4] used histograms of oriented gradients from three orthogonal planes (HOG-TOP). The visual features are extracted from the color image using a convolutional neural network (CNN) as a feature descriptor in [16]. These approaches are time-consuming and the feature dimension is huge, so dimensionality reduction methods are used, which affect the accuracy of facial expression recognition.
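To make the texture-coding idea concrete, here is a minimal basic LBP (8-neighbour, radius 1) in NumPy: each interior pixel is replaced by an 8-bit code recording which neighbours are at least as bright as the centre, and the texture descriptor is the histogram of those codes, often computed per face region and concatenated. This is a plain sketch of the basic operator, not the mapped or angled variants discussed above.

```python
import numpy as np

def lbp_codes(img):
    """Basic 8-neighbour LBP codes for the interior pixels of a
    2-D grayscale image."""
    img = np.asarray(img, dtype=int)
    c = img[1:-1, 1:-1]  # centre pixels
    # Neighbour offsets in a fixed clockwise order; each neighbour
    # contributes one bit of the 8-bit code.
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offs):
        nb = img[1 + dy:img.shape[0] - 1 + dy,
                 1 + dx:img.shape[1] - 1 + dx]
        codes |= (nb >= c).astype(int) << bit
    return codes

def lbp_histogram(img, bins=256):
    # Normalised histogram of LBP codes: the texture descriptor.
    h = np.bincount(lbp_codes(img).ravel(), minlength=bins)
    return h / h.sum()

demo = np.array([[10, 20, 30],
                 [40, 50, 60],
                 [70, 80, 90]])
print(lbp_codes(demo))  # -> [[120]]: bits 3-6 set (neighbours >= 50)
```

A region-based descriptor as in [3] or [7] would apply `lbp_histogram` to each face sub-region and concatenate the results.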
5. FACIAL DATASETS

Facial expression datasets contain two types of images: posed expression images and spontaneous expression images. Researchers acquired facial images in three ways: peak expression images only, image sequences portraying an emotion from neutral to its peak, and video clips with emotional annotations. The two most widely used datasets are CK+ and JAFFE [26-29]. The real-world facial databases are FER-2013, FERG-DB, SFEW2.0 (static facial expressions in the wild), RAF-DB (real-world affective face database) and the AffectNet database. Sample images of the basic facial expressions are shown in Table 1 for each dataset.
Table 1. Sample images of the facial image datasets: one sample image per expression (Happy, Sad, Surprise, Fear, Anger, Disgust) for each of CK+, JAFFE, FER-2013, FERG-DB, SFEW, RAF-DB, and AffectNet
5.1. Extended Cohn-Kanade dataset (CK+)

The CK+ dataset has been widely used for many years in facial expression systems. The dataset comprises 593 image sequences varying in duration from 10 to 60 frames, collected from 123 subjects. The age range of the subjects is 18-50 years, where 31% are men and 69% are women. The images express seven categories of expressions covering the basic emotions: happy, sad, surprise, anger, fear, disgust, and neutral. Each image has a resolution of 640 × 640 or 640 × 490 pixels [27].
5.2. Japanese female facial expression dataset (JAFFE)

The JAFFE dataset is also widely used in expression recognition of human emotion. The dataset consists of 213 images of 10 Japanese females covering seven expressions: six basic (happy, surprise, sad, anger, fear and disgust) and neutral. Each image has a resolution of 256 × 256 pixels [28].
5.3. FER-2013 dataset

The FER-2013 dataset contains 28,000 labeled images. The dataset was created in 2013 for learning research focused on three challenges: black box learning, facial expression recognition, and multimodal learning. The images are 48 × 48 pixel grayscale faces in seven expressions: the six basic expressions and neutral [30].
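FER-2013 is commonly distributed as a CSV file in which each row holds an emotion label and the 48 × 48 = 2304 grayscale pixel values as one space-separated string. A row can be decoded into an image array as sketched below; the row content here is synthetic, and the mapping of label 3 to "happy" follows the commonly used FER-2013 label order, which is an assumption here.

```python
import numpy as np

def decode_fer2013_row(label_str, pixels_str):
    """Decode one FER-2013 CSV row into (label, 48x48 image array)."""
    label = int(label_str)
    vals = np.array([int(v) for v in pixels_str.split()], dtype=np.uint8)
    assert vals.size == 48 * 48, "expected 2304 pixel values"
    return label, vals.reshape(48, 48)

# Synthetic row purely for demonstration.
row_pixels = " ".join(["128"] * (48 * 48))
label, img = decode_fer2013_row("3", row_pixels)
print(label, img.shape)
```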
5.4. FERG-DB dataset

FERG-DB stands for the facial expression research group database, which consists of face images of six stylized characters grouped into seven types of expressions: the six basic expressions and neutral. The dataset includes 55,767 images [31].
5.5. Static facial expressions in the wild dataset (SFEW)

The images in SFEW are extracted from a temporal facial expression database, Acted Facial Expressions in the Wild (AFEW), which was in turn extracted from movies. The database contains 700 images that have been labeled into the six basic expressions [16].
5.6. Real-world affective face database (RAF-DB)

RAF-DB is a large-scale facial expression database that includes facial images downloaded from the internet. The dataset is annotated with seven-dimensional expression distribution vectors for each image [10].
5.7. AffectNet dataset

AffectNet is the largest database of facial expressions in the real world and contains more than 1,000,000 facial images downloaded from internet searches in six different languages with 1250 emotion-related keywords. The database defines eleven categories of expression: the six basic expressions, neutral, contempt, none, uncertain, and non-face [16].
6. PROBLEM STATEMENT

FER systems need to be developed under the problems of illumination, lighting, pose, aging, and occlusion for real-world expression classification. The major challenges of the study include:
- Most research classifies the basic emotions, while work on fine-grained emotion is relatively small.
- Research on micro-expression and compound emotion recognition systems is limited.
- Mathematical models need to be developed for extracting more discriminant features from facial images in the wild.
- Real-time facial expression recognition systems should be developed to meet practical applications.
- Deep learning models also need to be created to improve facial feature extraction and classification.
7. CONCLUSION AND FUTURE WORK

Facial expression recognition is an active research area that remains interesting for researchers due to the problems of occlusion, brightness, viewing angle, pose, and background in real-life images, image sequences and videos. This review paper has presented methods of preprocessing, feature extraction and classification schemes. FER research goes on to meet real-life applications such as driver drowsiness recognition, assistance for distance learning, clinical patient monitoring, teaching robots, and health care systems for autistic children. In the future, FER systems will be developed for fine-grained facial expression recognition and compound emotion recognition using facial images.
REFERENCES

[1] Kalsum, Tehmina, Anwar, Syed, Majid, Muhammad, Ali, Sahibzada. "Emotion Recognition from Facial Expressions using Hybrid Feature Descriptors." IET Image Processing, vol. 12, no. 6, January 2018.
[2] P. Ekman, W. V. Friesen. "Facial action coding system: a technique for the measurement of facial movement." Palo Alto: Consulting Psychologists Press, pp. 271-302, 1978.
[3] A. Majumder, L. Behera and V. K. Subramanian. "Automatic Facial Expression Recognition System Using Deep Network-Based Data Fusion." in IEEE Transactions on Cybernetics, vol. 48, no. 1, pp. 103-114, Jan. 2018.
[4] J. Chen, Z. Chen, Z. Chi and H. Fu. "Facial Expression Recognition in Video with Multiple Feature Fusion." in IEEE Transactions on Affective Computing, vol. 9, no. 1, pp. 38-50, 1 Jan.-March 2018.
[5] Yang, Dongri, Abeer Alsadoon, P. W. Chandana Prasad, Ashutosh Kumar Singh and Amr Elchouemi. "An Emotion Recognition Model Based on Facial Recognition in Virtual Learning Environment." Procedia Computer Science, vol. 125, pp. 2-10, 2018.
[6] T. Kalsum, S. M. Anwar, M. Majid, B. Khan and S. M. Ali. "Emotion recognition from facial expressions using hybrid feature descriptors." in IET Image Processing, vol. 12, no. 6, pp. 1004-1012, 2018.
[7] C. Qi et al. "Facial Expressions Recognition Based on Cognition and Mapped Binary Patterns." in IEEE Access, vol. 6, pp. 18795-18803, 2018.
[8] A. M. M. Shabat and J. Tapamo. "Angled local directional pattern for texture analysis with an application to facial expression recognition." in IET Computer Vision, vol. 12, no. 5, pp. 603-608, 2018.
[9] N. P. Nirmala Sreedharan, B. Ganesan, R. Raveendran, P. Sarala, B. Dennis and R. Boothalingam R. "Grey Wolf optimisation-based feature selection and classification for facial emotion recognition." in IET Biometrics, vol. 7, no. 5, pp. 490-499, 2018.
[10] Zeng, N., Zhang, H., Song, B., Liu, W., Li, Y., Dobaie, A. M. "Facial expression recognition via learning deep sparse autoencoders." Neurocomputing, vol. 273, pp. 643-649, 2018.
[11] Y. Tang, X. M. Zhang and H. Wang. "Geometric-Convolutional Feature Fusion Based on Learning Propagation for Facial Expression Recognition." in IEEE Access, vol. 6, pp. 42532-42540, 2018.
[12] Mayya, V., Pai, R. M., & Pai, M. M. "Automatic facial expression recognition using DCNN." Procedia Computer Science, vol. 93, pp. 453-461, 2016.
[13] D. V. Sang, N. Van Dat and D. P. Thuan. "Facial expression recognition using deep convolutional neural networks." 2017 9th International Conference on Knowledge and Systems Engineering (KSE), pp. 130-135, 2017.
[14] B. Wu and C. Lin. "Adaptive Feature Mapping for Customizing Deep Learning Based Facial Expression Recognition Model." in IEEE Access, vol. 6, pp. 12451-12461, 2018.
[15] Munir, A., Hussain, A., Khan, S. A., Nadeem, M., Arshid, S. "Illumination invariant facial expression recognition using selected merged binary patterns for real world images." Optik, vol. 158, pp. 1016-1025, 2018.
[16] Guo, J., Zhou, S., Wu, J., Wan, J., Zhu, X., Lei, Z., & Li, S. Z. "Multi-modality network with visual and geometrical information for micro emotion recognition." In Automatic Face and Gesture Recognition (FG 2017), 12th IEEE International Conference, pp. 814-819, 2017.
[17] J. Guo et al. "Dominant and Complementary Emotion Recognition From Still Images of Faces." in IEEE Access, vol. 6, pp. 26391-26403, 2018.
[18] Barros, P., Parisi, G. I., Weber, C., Wermter, S. "Emotion-modulated attention improves expression recognition: A deep learning model." Neurocomputing, vol. 253, pp. 104-114, 2017.
[19] H. Zhang, Q. Li, Z. Sun and Y. Liu. "Combining Data-Driven and Model-Driven Methods for Robust Facial Landmark Detection." in IEEE Transactions on Information Forensics and Security, vol. 13, no. 10, pp. 2409-2422, Oct. 2018.
[20] P. M. Ferreira, F. Marques, J. S. Cardoso and A. Rebelo. "Physiological Inspired Deep Neural Networks for Emotion Recognition." in IEEE Access, vol. 6, pp. 53930-53943, 2018.
[21] Yan, K., Zheng, W., Zhang, T., Zong, Y., Cui, Z. "Cross-database non-frontal facial expression recognition based on transductive deep transfer learning." arXiv preprint arXiv:1811.12774, 2018.
[22] W. Zheng, Y. Zong, X. Zhou and M. Xin. "Cross-Domain Color Facial Expression Recognition Using Transductive Transfer Subspace Learning." in IEEE Transactions on Affective Computing, vol. 9, no. 1, pp. 21-37, 2018.
[23] Tautkute, I., Trzcinski, T., and Bielski, A. "I Know How You Feel: Emotion with Facial Landmarks." arXiv preprint arXiv:1805.00326, 2018.
[24] B. Wu and C. Lin. "Adaptive Feature Mapping for Customizing Deep Learning Based Facial Expression Recognition Model." in IEEE Access, vol. 6, pp. 12451-12461, 2018.
[25] S. Li and W. Deng. "Reliable Crowdsourcing and Deep Locality-Preserving Learning for Unconstrained Facial Expression Recognition." in IEEE Transactions on Image Processing, vol. 28, no. 1, pp. 356-370, Jan. 2019.
[26] C. Loob et al. "Dominant and Complementary Multi-Emotional Facial Expression Recognition Using C-Support Vector Classification." 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), Washington, DC, pp. 833-838, 2017.
[27] P. Lucey, J. F. Cohn, T. Kanade, J. Saragih, Z. Ambadar and I. Matthews. "The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression." 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops, San Francisco, CA, pp. 94-101, 2010.
[28] Dhall, A., Goecke, R., Lucey, S., & Gedeon, T. "Static facial expressions in the wild: data and experiment protocol." CVHCI Google Scholar. [Online] https://fipa.cs.kit.edu/download/SFEW.pdf.
[29] Lyons, M. J., Akamatsu, S., Kamachi, M., Gyoba, J., & Budynek, J. "The Japanese female facial expression (JAFFE) database." In Proceedings of third international conference on automatic face and gesture recognition, pp. 14-16, 1998.
[30] Goodfellow I., Erhan D., Carrier PL., Courville A., Mirza M., Hamner B., Cukierski W., Tang Y., Lee DG., Zhou Y., Ramaiah C., Feng F., Li R., Wang X., Athanasakis D., Shawe-Taylor J., Milakov M., Park J., Ionescu R., Popescu M., Grozea C., Bergstra J., Xie J., Romaszko L., Xu B., Chaung Z., and Bengio Y. "Challenges in Representation Learning: A report on three machine learning contests." International Conference on Neural Information Processing, Springer Berlin Heidelberg, 2013.
[31] Aneja, D., Colburn, A., Faigin, G., Shapiro, L., Mones, B. "Modeling stylized character expressions via deep learning." In Asian Conference on Computer Vision, Springer, Cham, pp. 136-153, 2016.