TELKOMNIKA, Vol. 13, No. 1, March 2015, pp. 173~180
ISSN: 1693-6930, accredited A by DIKTI, Decree No: 58/DIKTI/Kep/2013
DOI: 10.12928/TELKOMNIKA.v13i1.1302
Received October 11, 2014; Revised December 8, 2014; Accepted December 28, 2014
Lip Motion Pattern Recognition for Indonesian Syllable
Pronunciation Utilizing Hidden Markov Model Method
Balza Achmad, Faridah, Laras Fadillah
Department of Engineering Physics, Faculty of Engineering, Universitas Gadjah Mada
Jalan Grafika 2, Yogyakarta, Indonesia
e-mail: balzach@ugm.ac.id, faridah@ugm.ac.id
Abstract
A speech therapeutic tool has been developed to help Indonesian deaf children learn how to pronounce words correctly. The applied technique utilized lip movement frames captured by a camera and inputted them into a pattern recognition module that can differentiate between the pronunciations of different vowel phonemes in the Indonesian language. In this paper, we used the one-dimensional Hidden Markov Model (HMM) method for the pattern recognition module. The features used for the training and test data were composed of six key points of 20 sequential frames representing certain phonemes. Seventeen Indonesian phonemes were chosen from the words usually used by teachers at special schools for deaf children during speech therapy. The results showed that the recognition rates varied with the articulation of the phonemes, i.e., 78% for bilabial/dental phonemes and 63% for palatal-only phonemes. The condition of the lips also affected the result: a female with red lips gave a correlation coefficient of 0.77, compared to 0.68 for pale lips and 0.38 for a male with a mustache.
Keywords: hidden Markov model, lip motion, pattern recognition, syllable pronunciation
1. Introduction
Visual communication plays an important role in noisy environments as well as for hearing-impaired persons, for whom audio communication is not possible. Many researchers have been developing methods to overcome these problems; one of them is lip-reading. Lip movements during the pronunciation of syllables or words form specific patterns. From these lip patterns, we can find out what was said by other people without hearing their voice. The image processing and pattern recognition fields have been growing very fast, allowing us to establish a system for automatic lip reading.
Petajan [1] suggested that a visual system will help speech recognition processes become more effective. Yau et al [2] introduced a speech recognition technique combined with a visual speech model based on facial movement video. Ma et al [3] developed a Bayesian model for lip-reading patterns under moderate noise exposure. Meanwhile, along with the development of communication tools, Kim et al [4] developed a method of lip-reading in a real-time fashion for smartphones. A lip-reading method applied to several languages was developed by Saitoh et al [5], while Shin et al [6] developed this system for the Korean language.
One focus of research in lip reading is the selection of the pattern recognition method. Bagai et al [7] performed lip-reading on the syllables /ba/, /da/, /fa/, /la/, /ma/ using an Artificial Neural Network (ANN) on video, audio, and the combination of both. Another widely used method is the Hidden Markov Model (HMM). Puviarasan et al [9] used the HMM method to recognize 33 words in English spoken by people with hearing impairments. Two features were used, namely the discrete cosine transform (DCT) and the discrete wavelet transform (DWT), with recognition rates of 91% and 97%, respectively. Werda et al [8] conducted research on lip-reading of three French syllables using HMM and a four-point method for the feature extraction, namely the points above, below, right, and left. The system read the syllables /ba/, /be/ and /bou/ correctly 63.64%, 72.73% and 81.82% of the time, respectively.
A lip-reading system for Indonesian phonemes itself has been developed by Faridah et al [10] and applied as a speech therapeutic tool for deaf children in Indonesia. The system used a Neural Network for lip pattern recognition in pronouncing the vowel phonemes /a/, /i/, /u/, /e/, and /o/ in the Indonesian language. However, the system was not able to provide satisfactory results.
In this paper, we use the HMM method for lip pattern recognition in the pronunciation of phonemes in the Indonesian language. The phonemes to be recognized represent three different consonant sound formations, namely bilabial (lip consonants), dental (dental consonants), and palatal (hard palate consonants).
2. Research Method
2.1. The Data
The data used in this research are in the form of facial videos of 25 speakers, comprising a variety of speakers, namely females with red lips, females with pale lips, and males with a mustache and pale lips, as shown in Figure 1. Each speaker pronounced 17 phonemes composed of bilabial, palatal, and dental as well as mixed consonants, as presented in Table 1. The videos were taken under 240-270 lux lighting. Of the 25 video data, 15 were used as HMM training data and 10 were used for testing.
Figure 1. Examples of facial images of (a) a female with red lips, (b) a female with pale lips, and (c) a male with a mustache and pale lips
Table 1. Modeled phonemes

  Bilabial                          Palatal    Dental    Mixed
  (1) Ba-  (2) Bi-  (3) Be-  (4) Bo  (10) Sa    (13) La   (15) Cak
  (5) Ma-  (6) Me                    (11) Ja    (14) Ta   (16) Dak
  (7) Pa-  (8) Pi-  (9) Pu           (12) Ca    (17) Tol

Note: The numbers in parentheses are the indices of the phonemes for recognition.
2.2. Video Image Processing
The data from the video are processed to extract features using the steps illustrated in Figure 2.
[Block diagram: video image → frame extraction → lip localization → lip segmentation → feature extraction → lip features (six key points)]

Figure 2. Video image processing block diagram
For the first step, 20 frames are extracted from the video during the pronunciation of the phonemes (Figure 3). Each frame then undergoes a series of image processing operations to obtain the lip features for pattern recognition. The first image processing operation is performed to detect the location of the lips. The basic method used in the detection of the lip location is the Cascade Classifier method. Once the location of the lips is obtained, the image area around the lips is cropped, hence we can focus on a smaller area of the image.
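The frame extraction step can be sketched as evenly sampling 20 frame indices across the pronunciation interval. Only the frame count comes from the paper; the interval bounds and the sampling scheme below are our assumptions for illustration.

```python
import numpy as np

def sample_frame_indices(start, end, n_frames=20):
    """Pick n_frames evenly spaced frame indices in [start, end)."""
    return np.linspace(start, end - 1, n_frames).round().astype(int)

# e.g. a pronunciation spanning frames 30..89 of the video
indices = sample_frame_indices(30, 90)
```

In practice the start and end frames would come from marking the onset and offset of the utterance in the recorded video.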
Figure 3. Image frames during pronunciation of the phonemes /ba/ and /be/
The cropped lip image is then segmented, hence the lips can be separated from the surrounding facial skin. The lip segmentation algorithm is based on the difference in color composition between the lips and the skin [11]. Skin color is determined more by color composition than by brightness. The color composition of the skin is remarkably constant even when exposed to different illumination. Hulbert and Poggio [12] defined the value of the pseudo hue to illustrate this difference,
to illustrate this difference,
)
,
(
)
,
(
)
,
(
)
,
(
y
x
G
y
x
R
y
x
R
y
x
h
(1)
where R(x,y) and G(x,y) are the red and green components for each pixel in the image.
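Equation (1) maps directly onto per-pixel array operations. A minimal numpy sketch (the epsilon guard against division by zero is an addition of ours, not part of the original definition):

```python
import numpy as np

def pseudo_hue(rgb, eps=1e-8):
    """Pseudo hue h = R / (R + G), computed per pixel on an RGB image."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    return r / (r + g + eps)

# Lips (reddish) score higher than skin: a strongly red pixel gives h near 1,
# a balanced red/green pixel gives h near 0.5.
img = np.array([[[200, 50, 40], [120, 120, 100]]], dtype=np.uint8)
h = pseudo_hue(img)
```

Because the map depends only on the red/green ratio, it is largely insensitive to brightness changes, which matches the observation above about illumination.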
The Snakes method, which was first developed by Kass et al [13], is then applied to the pseudo hue image to obtain the outer contour of the lips. In this paper, lip movement patterns are constructed from the lip shapes of each frame. The lip shape itself is represented by six key points on the lip contour, as shown in Figure 4.
Figure 4. Six key points representing the lip movement pattern
In order to find these six key points, the contour obtained by the Snake method is evaluated, from which six points are selected. Slope normalization is sometimes necessary in the case of lips that are not upright, which may occur during the movement of the lips from frame to frame. An example of the complete image processing for each frame is given in Figure 5.
[Processing stages: frame image → lip localization and cropping → Snakes control points forming the lip contour → six key points extraction]

Figure 5. Complete image processing for each frame
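The paper does not spell out which six contour points are selected. One plausible selection, sketched below under our own assumptions, takes the two mouth corners, the top-most and bottom-most contour points, and the highest contour points near the quarter and three-quarter positions of the mouth width (roughly the upper-lip bows):

```python
import numpy as np

def six_key_points(contour):
    """Select six key points from an (N, 2) array of (x, y) contour points.

    Assumed selection (the paper does not specify): left/right mouth
    corners, top-most and bottom-most points, and the highest contour
    points at 1/4 and 3/4 of the mouth width.
    """
    x, y = contour[:, 0], contour[:, 1]
    left = contour[np.argmin(x)]
    right = contour[np.argmax(x)]
    top = contour[np.argmin(y)]       # image y grows downwards
    bottom = contour[np.argmax(y)]
    width = right[0] - left[0]
    pts = []
    for frac in (0.25, 0.75):
        xq = left[0] + frac * width
        near = contour[np.abs(x - xq) < 0.1 * (width + 1)]
        pts.append(near[np.argmin(near[:, 1])])  # highest point near xq
    return np.array([left, right, top, bottom] + pts)

# Synthetic mouth contour: an ellipse centred at (50, 30)
theta = np.linspace(0, 2 * np.pi, 200)
mouth = np.stack([50 + 40 * np.cos(theta), 30 + 15 * np.sin(theta)], axis=1)
points = six_key_points(mouth)
```

Whatever concrete selection is used, the key requirement is that the same six points are located consistently in every frame, so the sequence of points forms a comparable trajectory.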
2.3. Hidden Markov Model
The Hidden Markov Model (HMM) is a statistical model of a system in which hidden parameters are determined from observable parameters. The observable parameters are used as inputs of the HMM in the form of a database. The architecture of the input database is shown in Figure 6.
[Database dimensions: 17 phonemes × 15 speakers × 20 frames × 6 key points (x, y)]

Figure 6. Database architecture of the HMM
The HMM architecture for this study is given in Figure 7. In this study, we use a one-dimensional Hidden Markov Model [14], in which the probabilities of transition from all states to an observable parameter are the same.
Figure 7. The Hidden Markov Model architecture in this study
The HMM modeling stage, commonly referred to as database construction, is shown in Figure 8. The parameters in the database will be used as a benchmark in the recognition process during the testing phase. There are three main stages in the construction of the model database.
1. Labelling, which is the process of making a label for each test data file consisting of lip features, which are composed of six key points of 20 frames representing certain phonemes.
2. Codebook formation, which is used to store input-output data pairs for training. In this paper, the inputs of six key points in 20 frames are represented by the Euclidean distance.
3. HMM model construction, which calculates the HMM parameters using the 15 training data. This process generates 17 models according to the number of trained phonemes.
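The codebook idea in the stages above can be sketched as vector quantization: training feature vectors are clustered into codewords, and each frame is then mapped to its nearest codeword by Euclidean distance, turning a 20-frame utterance into a discrete observation sequence for the HMM. Only the Euclidean-distance representation comes from the paper; the codebook size and the k-means-style construction below are our assumptions.

```python
import numpy as np

def build_codebook(features, n_codes=16, n_iter=10, seed=0):
    """Crude k-means codebook over feature vectors (rows of `features`)."""
    rng = np.random.default_rng(seed)
    codes = features[rng.choice(len(features), n_codes, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign every vector to its nearest codeword (Euclidean distance)
        d = np.linalg.norm(features[:, None] - codes[None], axis=2)
        assign = d.argmin(axis=1)
        # Move each codeword to the mean of its assigned vectors
        for k in range(n_codes):
            if (assign == k).any():
                codes[k] = features[assign == k].mean(axis=0)
    return codes

def quantize(sequence, codes):
    """Map each frame's feature vector to its nearest codeword index."""
    d = np.linalg.norm(sequence[:, None] - codes[None], axis=2)
    return d.argmin(axis=1)
```

A 20-frame utterance quantized this way becomes a length-20 symbol sequence, which is the discrete observation format a one-dimensional HMM with an emission matrix expects.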
[Flow chart: START → six key points files → labelling → label database; phoneme feature extraction → codebook → codebook database; codebook → HMM → HMM database → END]

Figure 8. Flow chart of database construction
2.4. HMM Testing
In this stage, the constructed models are tested using the 10 test data. The steps in this stage can be seen in Figure 9. The general idea of HMM testing is to find the phoneme, represented by a label, which has the maximum log of probability when applied to the HMM model.
Figure 9. Flow chart of HMM testing

[Flow chart: frames from video → feature extraction → normalization → log of probability using the codebook and HMM databases → find maximum LoP → determine phoneme from the label database]
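The maximum-log-probability decision described above can be sketched with the standard forward algorithm in log space: each of the 17 trained models assigns a log-probability to the observation sequence, and the label of the best-scoring model is the recognized phoneme. This is a generic sketch; the actual transition and emission matrices would come from the training stage.

```python
import numpy as np

def log_forward(obs, log_pi, log_A, log_B):
    """Log-likelihood of a discrete observation sequence under an HMM.

    log_pi: (S,) initial state log-probs; log_A: (S, S) transition
    log-probs; log_B: (S, V) emission log-probs over codebook symbols.
    """
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        # log-sum-exp over previous states, then emit symbol o
        m = alpha.max()
        alpha = m + np.log(np.exp(alpha - m) @ np.exp(log_A)) + log_B[:, o]
    m = alpha.max()
    return m + np.log(np.exp(alpha - m).sum())

def recognize(obs, models, labels):
    """Pick the phoneme label whose model gives the maximum log-probability."""
    scores = [log_forward(obs, *m) for m in models]
    return labels[int(np.argmax(scores))]
```

Working in log space avoids the numerical underflow that plain probability products would cause over a 20-frame sequence.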
The performance of the models was evaluated for video inputs consisting of the 17 Indonesian phonemes. An analysis was performed to determine the similarity between the centroids for different people pronouncing the same phonemes. Another analysis was performed to calculate the success rate in recognizing the test data.
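The centroid analysis can be sketched as collapsing the six key points of each frame to their mean, giving a 20-point trajectory per utterance that can be compared across speakers. The comparison metric below, a Pearson correlation of the flattened trajectories, is our assumption for illustration.

```python
import numpy as np

def centroid_trajectory(keypoints):
    """keypoints: (20, 6, 2) array of six (x, y) points per frame.
    Returns the (20, 2) per-frame centroid trajectory."""
    return keypoints.mean(axis=1)

def trajectory_similarity(traj_a, traj_b):
    """Pearson correlation of two flattened centroid trajectories."""
    return np.corrcoef(traj_a.ravel(), traj_b.ravel())[0, 1]
```

Two speakers pronouncing the same phoneme should yield similar trajectories (similarity near 1), while different phonemes by the same speaker should score lower, mirroring the observations in Figure 10.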
3. Results and Analysis
The test shows that there are similarities in the values of the centroids formed by the 20 frames when the same phoneme is pronounced by different speakers, as shown in Figure 10a. On the other hand, each person moves his or her lips differently when pronouncing different phonemes, as shown in Figure 10b.
Figure 10. Centroid patterns of (a) the phoneme /ba/ by three different persons and (b) the phonemes /ba/, /be/, and /bi/ by the same person
The training process performed on the 15 training data for each phoneme creates 17 HMM models. Each model has specific characteristics represented by its transmission and emission matrices.
Figure 11 shows the test results of the models using the 15 training data and 10 testing data. Testing using the training data gives a correlation coefficient of R = 1, meaning that 100% of the training data can be recognized perfectly. Meanwhile, when applying the models to the testing data, the correlation coefficient is R = 0.64. The sources of error can be the lip conditions and articulation, as well as external factors such as lighting and positional changes during video recording, which were not controlled in this study.
Figure 11. Test results of applying the HMM models to (a) training data and (b) test data
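The reported R values can be reproduced with a plain Pearson correlation between the true phoneme indices (1-17) and the indices returned by the recognizer. The numbers below are illustrative, not the paper's data.

```python
import numpy as np

true_idx = np.arange(1, 18)    # phoneme indices 1..17
pred_idx = true_idx.copy()
pred_idx[4] = 11               # one hypothetical misrecognition

# Pearson correlation between input and output indices
r = np.corrcoef(true_idx, pred_idx)[0, 1]
```

A perfect recognizer gives R = 1 (the training-data case above); every misrecognition pulls R below 1, which is how the 0.64 test-data figure should be read.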
The relation between the input and output of the HMM models for varying lip conditions can be seen in Figure 12. The correlation coefficients for red lips, pale lips, and pale lips with a mustache are 0.74, 0.68, and 0.38, respectively. Thus, the HMM models recognize the correct phonemes better for females with red lips compared to the other lip conditions.
Figure 12. Test results for (a) red lips, (b) pale lips, and (c) pale lips with a mustache
Articulation can also be one of the error sources in recognizing phonemes using the HMM models. Figure 13 shows the recognition results for different phoneme articulations, i.e., while pronouncing bilabial, palatal, and dental phonemes. The success rates for bilabial and dental phonemes are 78%, while for palatal phonemes it is 63%. This is because when pronouncing different bilabial and dental phonemes, the lip movements differ, while for palatal phonemes, all phonemes produce similar lip movements.
Figure 13. Test results for (a) bilabial, (b) dental, and (c) palatal phonemes
4. Conclusion
1. The developed HMM models have a performance, in terms of correlation coefficient, of R = 1 for the training data and 0.64 for the test data.
2. Phonemes pronounced by females with red lips can be recognized better, with R = 0.77, compared to pale lips with R = 0.68 and pale lips with a mustache with R = 0.38.
3. Articulation in pronouncing phonemes also affects the recognition rate: bilabial and dental phonemes have a 78% recognition rate, while palatal phonemes give only a 63% recognition rate.
References
[1] Petajan ED. Automatic Lipreading to Enhance Speech Recognition. IEEE Conference on Computer Vision and Pattern Recognition. San Francisco. 1985: 40-47.
[2] Yau WC, Kumar DK, Arjunan SP. Visual Recognition of Speech Consonants using Facial Movement Features. Integrated Computer-Aided Engineering. 2007; 14(1): 49-61.
[3] Ma WJ, Zhou X, Ross LA, Foxe JJ, Parra LC. Lip Reading Aids Word Recognition Most in Moderate Noise: A Bayesian Explanation Using High-Dimensional Feature Space. PLoS ONE. 2009; 4(3): 1-14.
[4] Kim Y, Kang S, Jung S. Design and Implementation of A Lip Reading System in Smart Phone Environment. 10th IEEE International Conference on Information Reuse and Integration. Las Vegas. 2009: 101-104.
[5] Saitoh T, Morishita K, Konishi R. Analysis of Efficient Lip Reading Method for Various Languages. 19th International Conference on Pattern Recognition (ICPR). Tampa. 2008: 1-4.
[6] Shin J, Lee J, Kim D. Real-time Lip Reading System for Isolated Korean Word Recognition. Pattern Recognition. 2011; 44: 559-571.
[7] Bagai A, Gandhi H, Goyal R, Kohli M, Prasad TV. Lip Reading Using Neural Networks. International Journal of Computer Science and Network Security. 2009; 4: 108-111.
[8] Werda S, Mahdi W, Hamadou AB. Lip Localization and Viseme Classification for Visual Speech Recognition. International Journal of Computing and Information Science. 2007; 5(1): 62-75.
[9] Puviarasan N, Palanivel S. Lip Reading of Hearing Impaired Persons using HMM. Expert Systems with Applications. 2011; 38: 4477-4481.
[10] Faridah, Utami SS, Wibowo S, Wijaya E. Speech Therapy Instrument for Deaf People in Indonesia. Media Teknik. 2008; 30(2): 201-206.
[11] Eveno N, Caplier A, Coulon PY. A New Color Transformation for Lips Segmentation. IEEE Workshop on Multimedia Signal Processing (MMSP'01). Cannes. 2001: 3-8.
[12] Hulbert A, Poggio T. Synthesizing A Colour Algorithm from Examples. Science. 1988; 239: 482-485.
[13] Kass M, Witkin A, Terzopoulos D. Snakes: Active Contour Model. Int. Journal Computer Vision. 1988; 4: 321-331.
[14] Rabiner LR. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proceedings of the IEEE. 1989; 77(2): 257-286.