Indonesian Journal of Electrical Engineering and Computer Science
Vol. 10, No. 1, April 2018, pp. 168~175
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v10.i1.pp168-175
Journal homepage: http://iaescore.com/journals/index.php/ijeecs

On the Comparison of Line Spectral Frequencies and Mel-Frequency Cepstral Coefficients Using Feedforward Neural Network for Language Identification

Teddy Surya Gunawan1, Mira Kartiwi2
1Department of Electrical and Computer Engineering, Kulliyyah of Engineering
2Department of Information Systems, Kulliyyah of ICT
International Islamic University Malaysia

Article Info
Article history:
Received Jan 3, 2018
Revised Mar 5, 2018
Accepted Mar 23, 2018

ABSTRACT
Of the many audio features available, this paper focuses on the comparison of the two most popular features, i.e. line spectral frequencies (LSF) and Mel-Frequency Cepstral Coefficients (MFCC). We trained a feedforward neural network with various numbers of hidden layers and hidden nodes to identify five different languages, i.e. Arabic, Chinese, English, Korean, and Malay. LSF, MFCC, and the combination of both features were extracted as the feature vectors. Systematic experiments were conducted to find the optimum parameters, i.e. sampling frequency, frame size, model order, and structure of the neural network. The recognition rate per frame was converted to a recognition rate per audio file using majority voting. On average, the recognition rates for LSF, MFCC, and the combination of both features are 96%, 92%, and 96%, respectively. Therefore, LSF is the most suitable feature for language identification using a feedforward neural network classifier.

Keywords:
Language Identification, LSF, MFCC, Feedforward Neural Network Classifier, Recognition Rate

Copyright © 2018 Institute of Advanced Engineering and Science.
All rights reserved.

Corresponding Author:
Teddy Surya Gunawan,
Department of Electrical and Computer Engineering, Kulliyyah of Engineering,
International Islamic University Malaysia,
Jalan Gombak, 53100 Kuala Lumpur, (+603) 6196 4521.
Email: tsgunawan@iium.edu.my, tsgunawan@gmail.com

1. INTRODUCTION
There are about 7105 living languages spoken by the world's 6.7 billion people [1], and these languages differ from one another. Much research has been conducted in the area of language identification (LID). A tutorial on LID is presented in [2], in which syntactic, morphological, acoustic, phonetic, phonotactic, and prosodic level information is discussed in detail. Around 87 prosodic features were used for an LID system in [3], which improved recognition performance, while [4] utilized visual features with an error rate of less than 10%. In [5], a highly accurate and computationally efficient framework based on the i-vector representation is proposed for rapid language identification. A hierarchical LID framework is proposed in [6], in which a series of classification decisions is performed at multiple levels, with individual languages identified only at the final level.
Although much research has been conducted on LID, most researchers identify only around two to three languages. Therefore, in this paper, five languages, i.e. Arabic, Chinese, English, Korean, and Malay, spoken by both males and females, are analyzed.
For LID systems, the most used features are Mel-Frequency Cepstral Coefficients (MFCC) and Line Spectral Frequencies (LSF) [7]-[9]. Systematic experiments are conducted to find the optimum parameters. The combination of LSF and MFCC features, along with various structures of feedforward neural networks, is evaluated. The main performance criterion is the recognition rate, together with the neural network training time.

2. LANGUAGE IDENTIFICATION SYSTEM
A language identification system contains at least three basic blocks: preprocessing, feature extraction, and a classifier. Preprocessing is the process of refining the speech signal. The raw speech signal is not suitable for direct use as input: a weak signal has to be amplified, long silences have to be removed, and background noise or music has to be separated out before further processing.
Many feature extraction methods can be used in an LID system to characterize the speech signal of different speakers of different languages, for example Line Spectral Frequencies (LSF), Mel-Frequency Cepstral Coefficients (MFCC), Shifted Delta Cepstra (SDC), Perceptual Linear Prediction (PLP), Dynamic Time Warping (DTW), and Bark Frequency Cepstral Coefficients (BFCC). Several classifiers can be used, including Vector Quantization (VQ), Gaussian Mixture Model (GMM), Support Vector Machine (SVM), Ergodic Hidden Markov Model (HMM), K-Means Clustering Algorithm, and Artificial Neural Network (ANN) [2].
In this paper, the two most popular audio features, LSF and MFCC, are evaluated, and a feedforward neural network is used as the classifier, as shown in Figure 1.

Figure 1. Proposed Language Identification System

2.1. Line Spectral Frequencies
A widely used source-filter model of speech is the linear prediction coefficient (LPC) model. LPC models are used for speech coding, recognition, and enhancement. An LPC model of order $p$ can be expressed as shown in Eq. (1).

$$x(m) = \sum_{k=1}^{p} a_k \, x(m-k) + e(m) \qquad (1)$$

where $x(m)$ is the speech signal, $a_k$ are the LP parameters, and $e(m)$ is the speech excitation. Note that the coefficients $a_k$ model the correlation of each sample with the previous $p$ samples, whereas $e(m)$ models the part of speech that cannot be predicted from the past $p$ samples.
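
The LP parameters of Eq. (1) are commonly estimated per frame with the autocorrelation method and the Levinson-Durbin recursion. The following is a minimal pure-Python sketch of that standard technique, not the Matlab lpc() routine used in the paper; the function names and the synthetic AR(2) test signal are assumptions made for illustration only.

```python
import random

def autocorrelation(x, max_lag):
    """Biased autocorrelation r(0..max_lag) of one analysis frame."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return ([a_1..a_p], residual_energy) for x(m) ~ sum_k a_k x(m-k)."""
    a = [0.0] * (order + 1)          # a[0] is unused padding
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                # reflection coefficient of stage i
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k           # prediction error shrinks at each stage
    return a[1:], err

# Hypothetical test signal: x(m) = 1.3 x(m-1) - 0.6 x(m-2) + e(m)
rng = random.Random(0)
x = [0.0, 0.0]
for _ in range(20000):
    x.append(1.3 * x[-1] - 0.6 * x[-2] + rng.gauss(0.0, 1.0))
coeffs, residual = levinson_durbin(autocorrelation(x, 2), 2)
print(coeffs)  # should land near the generating coefficients 1.3 and -0.6
```

In practice each frame would be windowed before the autocorrelation is taken, and the order would be the model order studied in Section 3.2.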
Line spectral frequencies (LSF) are an alternative representation of the linear prediction parameters. LSFs are used in speech coding, and in the interpolation and extrapolation of LP model parameters, because of their good interpolation and quantization properties. LSFs are derived as the roots of the following two polynomials, as shown in Eq. (2) and (3).

$$P(z) = A(z) + z^{-(p+1)} A(z^{-1}) \qquad (2)$$

$$Q(z) = A(z) - z^{-(p+1)} A(z^{-1}) \qquad (3)$$

where $A(z) = 1 - \sum_{k=1}^{p} a_k z^{-k}$ is the inverse linear predictor filter and $A(z) = [P(z) + Q(z)]/2$. For even $p$, the polynomials in Eq. (2) and (3) can be rewritten in factorized form as shown in Eq. (4) and (5).

$$P(z) = (1 + z^{-1}) \prod_{i=1,3,\ldots,p-1} \left(1 - 2\cos\omega_i \, z^{-1} + z^{-2}\right) \qquad (4)$$

$$Q(z) = (1 - z^{-1}) \prod_{i=2,4,\ldots,p} \left(1 - 2\cos\omega_i \, z^{-1} + z^{-2}\right) \qquad (5)$$

where $\omega_i$ are the LSF parameters. It can be shown that all the roots of the two polynomials have a magnitude of one, so they are located on the unit circle, and the roots of $P(z)$ and $Q(z)$ alternate with each other. Hence, in the LSF representation, the linear predictor coefficients $[a_1, \ldots, a_p]$ are converted to the LSF vector $[\omega_1, \ldots, \omega_p]$. The Matlab functions lpc() and poly2lsf() were used for this purpose.
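
As a rough illustration of how the LSFs follow from the two polynomials, the sketch below builds the symmetric and antisymmetric polynomials from the coefficients of $A(z)$ and locates their unit-circle roots by a grid scan followed by bisection. This is an assumed, simplified procedure written for clarity; it is not the algorithm inside Matlab's poly2lsf(), and the function name lpc_to_lsf and the grid size are hypothetical.

```python
import math

def lpc_to_lsf(a_poly, grid=4096):
    """a_poly = [1, c1, ..., cp], the coefficients of A(z) in powers of z^-1.
    Returns the p line spectral frequencies in radians, ascending."""
    p = len(a_poly) - 1
    ext = list(a_poly) + [0.0]
    rev = ext[::-1]
    p_poly = [u + v for u, v in zip(ext, rev)]   # A(z) + z^-(p+1) A(1/z), symmetric
    q_poly = [u - v for u, v in zip(ext, rev)]   # A(z) - z^-(p+1) A(1/z), antisymmetric
    half = (p + 1) / 2.0

    # After removing the linear phase, both polynomials reduce to real
    # functions of the angle w on the unit circle.
    def f_p(w):
        return sum(c * math.cos(w * (half - k)) for k, c in enumerate(p_poly))

    def f_q(w):
        return sum(c * math.sin(w * (half - k)) for k, c in enumerate(q_poly))

    def roots(f):
        found = []
        ws = [math.pi * i / grid for i in range(1, grid)]   # open interval (0, pi)
        for lo, hi in zip(ws, ws[1:]):
            f_lo = f(lo)
            if f_lo * f(hi) < 0.0:                          # sign change: one root inside
                for _ in range(60):                         # bisection refinement
                    mid = 0.5 * (lo + hi)
                    f_mid = f(mid)
                    if f_lo * f_mid <= 0.0:
                        hi = mid
                    else:
                        lo, f_lo = mid, f_mid
                found.append(0.5 * (lo + hi))
        return found

    return sorted(roots(f_p) + roots(f_q))

# A(z) = 1 - 0.9 z^-1 + 0.5 z^-2, a stable second-order example
lsf = lpc_to_lsf([1.0, -0.9, 0.5])
print(lsf)
```

For this second-order example the two polynomials factor by hand into $(1+z^{-1})(1-1.4z^{-1}+z^{-2})$ and $(1-z^{-1})(1-0.4z^{-1}+z^{-2})$, so the two LSFs are $\arccos(0.7)$ and $\arccos(0.2)$. The scan excludes the trivial roots at $z = \pm 1$ by working on the open interval.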
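
2.2. Mel-Frequency Cepstral Coefficients
Mel-Frequency Cepstral Coefficients (MFCC) are computed using a filter bank of $M$ filters ($m = 1, 2, \ldots, M$). Each filter has a triangular shape, and the filters are spaced uniformly on the mel scale using Eq. (6). Each filter is defined as in Eq. (7).

$$\mathrm{mel}(f) = 2595 \log_{10}\left(1 + \frac{f}{700}\right) \qquad (6)$$

$$H_m(k) = \begin{cases} 0, & k < f(m-1) \\ \dfrac{k - f(m-1)}{f(m) - f(m-1)}, & f(m-1) \le k \le f(m) \\ \dfrac{f(m+1) - k}{f(m+1) - f(m)}, & f(m) < k \le f(m+1) \\ 0, & k > f(m+1) \end{cases} \qquad (7)$$

where $f(m)$ is the boundary point of filter $m$ expressed as a DFT bin. The log-energy of the mel spectrum is calculated as:

$$S(m) = \ln\left(\sum_{k=0}^{N-1} |X(k)|^2 \, H_m(k)\right) \qquad (8)$$

where $X(k)$ is the output of the discrete Fourier transform (DFT) of the input signal. Although the traditional cepstrum uses the inverse discrete Fourier transform (IDFT), MFCC is normally implemented using the discrete cosine transform as follows:

$$c(n) = \sum_{m=1}^{M} S(m) \cos\left(\frac{\pi n \left(m - \tfrac{1}{2}\right)}{M}\right) \qquad (9)$$

Typically, the number of filters ranges from 20 to 40, and the number of coefficients is 13.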
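
The MFCC steps described above (mel-spaced triangular filters, log filter-bank energies, discrete cosine transform) can be sketched in a few lines. The code below is an illustrative pure-Python sketch, not the implementation used in the paper: the mel formula with constants 2595 and 700 is one common convention and is assumed here, and all function names, the FFT size, and the flat test spectrum are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters whose edges are equally spaced on the mel scale.
    Returns n_filters rows, each with n_fft // 2 + 1 weights."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(fs / 2.0)
    # n_filters + 2 edge points: left edge, the filter centres, right edge
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    edges = [f * n_fft / fs for f in edges_hz]   # fractional DFT bin positions
    bank = []
    for m in range(1, n_filters + 1):
        left, centre, right = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_bins):
            if left < k <= centre:
                row.append((k - left) / (centre - left))      # rising slope
            elif centre < k < right:
                row.append((right - k) / (right - centre))    # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_from_power_spectrum(power, bank, n_coeffs=13):
    """Log filter-bank energies followed by a DCT-II."""
    log_e = [math.log(max(sum(p * h for p, h in zip(power, row)), 1e-12))
             for row in bank]
    m_count = len(log_e)
    return [sum(s * math.cos(math.pi * n * (m + 0.5) / m_count)
                for m, s in enumerate(log_e))
            for n in range(n_coeffs)]

bank = mel_filterbank(26, 512, 16000)      # 26 filters, within the usual 20 to 40
flat_power = [1.0] * (512 // 2 + 1)        # hypothetical flat power spectrum
mfcc = mfcc_from_power_spectrum(flat_power, bank)
print(len(mfcc))
```

The DCT index here runs from 0, so the term $(m + 0.5)$ in the code corresponds to $(m - \tfrac{1}{2})$ for filters numbered from 1.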

2.3. Feed Forward Neural Network Classifier
In an artificial neural network, the basic processing unit is the perceptron. A feedforward neural network organizes perceptrons into layers and cascades these layers into a network, in which the connections between layers follow only one direction. The layer that receives connections from the input feature vectors is the input layer, the outermost layer is the output layer, which produces the classifier output, and the layers between the input and output layers are called hidden layers.
The computation of a feedforward neural network, or multilayer perceptron, can be described as follows

$$h_l = f_l\left(W_l \, h_{l-1} + b_l\right), \quad l = 1, \ldots, L \qquad (10)$$

where $h_l$ is the output vector of layer $l$ and $L$ is the number of layers in the neural network. $h_0 = x$ is the input, while $W_l$, $b_l$, and $f_l$ are the weight matrix, the bias vector, and the activation function of layer $l$. In a classification problem with $K$ classes, the activation function of the output layer is normally a sigmoid or a softmax function.
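
The layer-by-layer computation described above can be written directly as a loop over layers. The sketch below is a minimal pure-Python illustration with a sigmoid hidden layer and a softmax output layer; the layer sizes and weight values are hypothetical and chosen only to keep the example small, and the paper itself used Matlab's patternnet().

```python
import math

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def softmax(v):
    top = max(v)                              # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def affine(weights, bias, h):
    """Compute W h + b for one layer (weights is a list of rows)."""
    return [sum(w * x for w, x in zip(row, h)) + b for row, b in zip(weights, bias)]

def forward(layers, x):
    """layers: list of (W, b, activation); applies h_l = f_l(W_l h_{l-1} + b_l)."""
    h = x
    for weights, bias, activation in layers:
        h = activation(affine(weights, bias, h))
    return h

# Hypothetical tiny network: 3 inputs -> 2 hidden nodes -> 5 language classes
hidden = ([[0.2, -0.1, 0.4], [0.0, 0.3, -0.5]], [0.1, -0.1], sigmoid)
output = ([[0.5, -0.2], [0.1, 0.1], [-0.3, 0.2], [0.0, 0.4], [0.2, 0.2]],
          [0.0, 0.0, 0.0, 0.0, 0.0], softmax)
probs = forward([hidden, output], [1.0, 0.5, -1.0])
print(probs)
```

With a softmax output layer, the five outputs are non-negative and sum to one, so they can be read as per-frame class probabilities for the five languages.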
Given a set of samples $\{(x_n, t_n)\}$ and a feedforward neural network with initial parameters $\theta$ (characterized by the weight matrices and bias vectors), we would like to train the neural network so that it learns the mapping from $x_n$ to $t_n$. If we see the whole network as the following function

$$y = F(x; \theta) \qquad (11)$$

and define some loss function $L$, then the goal of training the network becomes minimizing $L$. The gradient of $L$ indicates the direction in which $L$ increases, as follows

$$\nabla L(\theta) = \frac{\partial L}{\partial \theta} \qquad (12)$$

Since the gradient specifies the direction in which $L$ increases, at each step the parameters are updated proportionally to the negative of the gradient

$$\theta \leftarrow \theta - \eta \, \nabla L(\theta) \qquad (13)$$

where $\eta > 0$. The training procedure is called gradient descent, and $\eta$ is a small positive training parameter called the learning rate. The cross-entropy error is normally used as the loss function

$$L_n = -\sum_{k=1}^{K} t_{nk} \ln y_{nk} \qquad (14)$$

where $n$ is the index of an arbitrary sample, $K$ is the number of classes, $y_{nk}$ is the $k$-th output, corresponding to the probability of class $k$ for the vector $x_n$, and $t_{nk}$ is the corresponding target (1 for the true class and 0 otherwise). The gradient components of the output layer can be computed directly, while they are harder to compute in the lower layers. Normally, the gradient at a given layer is calculated using the error already computed for the layer that follows it. Since the errors are calculated in the reverse direction, from the output back to the input, this algorithm is known as backpropagation.
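
To make the update rule concrete, the sketch below applies gradient descent with the cross-entropy loss to a single softmax output layer, where the gradient with respect to each pre-activation is simply the output probability minus the target. This is a simplified illustration under stated assumptions (one layer, one training sample, plain gradient descent); the paper trained multi-layer networks with Matlab's scaled conjugate gradient, and the sample values and learning rate here are hypothetical.

```python
import math

def softmax(v):
    top = max(v)
    exps = [math.exp(x - top) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def gradient_step(weights, bias, x, target_index, lr):
    """One gradient-descent update of a softmax layer; returns the loss before the update."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]
    y = softmax(logits)
    loss = -math.log(y[target_index])                        # cross-entropy for a one-hot target
    for k in range(len(weights)):
        delta = y[k] - (1.0 if k == target_index else 0.0)   # dL/d(logit_k)
        for i in range(len(x)):
            weights[k][i] -= lr * delta * x[i]               # parameter minus lr times gradient
        bias[k] -= lr * delta
    return loss

# Hypothetical sample: 2 features, 3 classes, true class index 1
weights = [[0.0, 0.0] for _ in range(3)]
bias = [0.0, 0.0, 0.0]
losses = [gradient_step(weights, bias, [1.0, -2.0], 1, 0.1) for _ in range(20)]
print(losses[0], losses[-1])
```

With all parameters initialised to zero the three classes start equally likely, so the first loss equals $\ln 3$, and each step moves the parameters against the gradient so the loss shrinks. In a multi-layer network the same output-layer error would then be propagated backwards to obtain the hidden-layer gradients.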

3. RESULTS AND DISCUSSION
This section discusses the language database preparation, the experimental setup, the experiments conducted to find the optimum parameters, and the performance evaluation of the proposed LID system.

3.1. Experimental Setup and Language Database
A high performance system was used for processing, i.e. a multicore system with an Intel Core i7 6700K 4.00 GHz (4 cores with 8 threads), 32 GBytes RAM, a 256 GBytes SSD, and a 2 TBytes hard disk, installed with the Windows 10 operating system and Matlab 2017b with the Signal Processing and Neural Network Toolboxes. During simulation, other running applications were minimized as much as possible.
For the language database preparation, audio files of ten speakers of different languages were taken from an online language database. The speakers were six males and four females. The speakers were divided into two groups, for training (four males and one female) and testing (two males and three females), respectively. Each speaker spoke different sentences in one of the languages, i.e. Arabic, Chinese (specifically Mandarin), English, Korean, and Malay. The database presented in [10] was used with some rearrangement, in which 15 files were used for training and 5 files were used for testing.

3.2. Experiments on Sampling Frequencies, Frame Sizes, Model Orders, and Feedforward Neural Network Structures
Many parameters could be optimized to achieve the highest performance in terms of language recognition rate. In this paper, several important parameters are analysed, including the sampling frequency, the frame size, the model order, and the structure of the feedforward neural network. The structure of the feedforward neural network can be varied in terms of the number of hidden layers and the number of nodes in each hidden layer. Note that 50% overlapping windows were used for both LSF and MFCC feature extraction, so that both features have the same number of frames for each audio file. In [8], we used non-overlapping windows for LSF feature extraction.
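
Using the same frame length and the same 50% overlap for both features is what guarantees an equal frame count per file. A minimal sketch of such framing is shown below; the function name frame_signal and the one-second dummy signal are assumptions for illustration, while the 16000 Hz sampling frequency and 30 ms frame size are values examined in this section.

```python
def frame_signal(x, fs, frame_ms, overlap=0.5):
    """Split signal x into frames of frame_ms milliseconds with the given overlap."""
    frame_len = int(round(fs * frame_ms / 1000.0))
    hop = max(1, int(round(frame_len * (1.0 - overlap))))   # 50% overlap -> hop of half a frame
    return [x[s:s + frame_len] for s in range(0, len(x) - frame_len + 1, hop)]

signal = list(range(16000))               # hypothetical 1-second signal at 16000 Hz
frames = frame_signal(signal, 16000, 30)  # 480-sample frames, 240-sample hop
print(len(frames), len(frames[0]))
```

Each frame would then be passed to both the LSF and the MFCC extractor, so the two feature sequences line up frame by frame and can be concatenated.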
Our previous studies reported that the sampling frequency has an effect on the recognition rate in one case [10], while it has a negligible effect in the other [8]. Therefore, the first experiment varies the sampling frequency, i.e. 8000 Hz and 16000 Hz. For this experiment, the other two parameters (model order and frame size) were held fixed, and the structure of the feedforward neural network was fixed to one hidden layer with 20 nodes. Table 1 shows the recognition rate and training time for the two sampling frequencies. Based on Table 1, the recognition rate at 16 kHz is higher than at 8 kHz for LSF features (67.78% versus 63.02%), while for MFCC features it is nearly unchanged (73.01% versus 73.86%). Therefore, the 16 kHz sampling frequency is selected as one of the optimum parameters.

Table 1. Experimental Results on Varying Sampling Frequencies
Sampling          Training Time (s)      Recognition Rate (%)
Frequency (Hz)    LSF       MFCC         LSF       MFCC
8000              2.76      8.28         63.02     73.86
16000             10.37     4.05         67.78     73.01

The next experiment evaluates the effect of varying the window size (frame size) on the recognition rate. For this experiment, the other two parameters (sampling frequency and model order) were held fixed. In addition, the structure of the feedforward neural network was fixed to one hidden layer with 20 nodes. Figure 2 shows the recognition rate and training time for frame sizes from 10 to 100 ms. The red line represents LSF, while the blue line represents MFCC. The square marker represents the recognition rate (see the left axis), while the triangle marker represents the training time (see the right axis). Based on Figure 2, the frame size of 30 ms was selected because it provides a reasonable training time and recognition rate. The frame size of 50 ms was another good candidate; however, a larger window tends not to capture enough of the dynamics of the speech signal. One can argue that the neural network training plays a more significant role in the recognition rate.

Figure 2. Recognition Rate for Various Frame Sizes

The subsequent experiment evaluates the effect of varying the model order of LPC and MFCC on the recognition rate. For this experiment, the other two parameters (sampling frequency and frame size) were held fixed. In addition, the structure of the feedforward neural network was fixed to one hidden layer with 20 nodes. Figure 3 shows the recognition rate and training time for model orders of LPC and MFCC from 6 to 48 in steps of 2. Based on Figure 3, the model order of 42 was selected as one of the optimum parameters, as it provides a high recognition rate for both LSF and MFCC. Furthermore, the neural network training time is not much affected by the increase in model order.

The last experiment concerns the neural network structure. Feedforward neural networks with various hidden layer structures were used. The number of epochs was set to 1000, the maximum number of validation failures was set to 100, and the scaled conjugate gradient was used as the training algorithm. Table 2 shows the recognition rate and training time for the various neural network structures, i.e. networks with one, two, and three hidden layers and different numbers of nodes (the specific configurations are listed in Table 2). The Matlab patternnet() function was used with varying hidden layer configurations. Note that our preliminary results using the learning vector quantization (LVQ) neural network as in [8] were not as promising as those of the simple feedforward neural network with various hidden layer configurations. Moreover, LVQ requires a longer training time than the simple feedforward neural networks.

Figure 3. Recognition Rate for Various Model Orders

Table 2. Experiments of Feedforward Neural Network Structures
Hidden          Training Time (s)        Recognition Rate (%)
Layer(s)        LSF        MFCC          LSF       MFCC
[10]            6.45       4.62          76.62     68.62
[20]            2.23       2.36          81.09     73.78
[30]            3.14       3.56          85.1      80.28
[40]            3.16       3.68          84.23     80.93
[50]            3.81       3.66          84.62     79.02
[10 5]          2.93       5.03          76.09     70.64
[20 10]         3          4.29          82.67     79.18
[40 20]         5.65       5.31          89.19     79.76
[40 20 10]      7.64       7.21          86.73     81.87
[100]           6.23       6.48          87.94     80.87
[200]           12.11      10.77         87.77     81.32
[300]           17.47      15.65         89.91     80.48
[400]           25.07      20.32         91.51     82.43
[500]           25.68      28.33         89.71     84.65
[1000]          60.2       55.28         91.13     86.04
[40 40]         6.99       6.26          88.63     82.73
[40 40 20]      8.65       10.7          87.66     86.36
[40 40 40]      10.56      9.58          87.78     84.03
[2000]          120.2      142.46        90.72     89.01
[3000]          201.21     127.03        93.55     80.36
[4000]          421.92     371.81        93.75     89.25
[5000]          356.5      243.19        92.01     69.91
[10000]         872.93     1533.78       88.88     90.33
[15000]         1592.14    1324.68       92.87     84.74
[20000]         2255.15    2724.14       93.41     88.63
[30000]         3234.22    4016.23       90.39     74.67
[40000]         5379.07    3468.15       94.19     71.81
[50000]         4076.94    6283.14       71.43     71.03

From Table 2, it is found that the optimum number of hidden layers is one, while the optimum number of nodes is 1000 (row [1000] in Table 2). This neural network structure provides a high recognition rate with an acceptable training time. One of the other structures in Table 2 is also a good candidate, but its training time is more than three times longer than that of the selected structure.

3.3. Experiments on the Optimum Parameters on the Training Data
From the previous experiments, the optimum parameters and neural network configuration are a sampling frequency of 16000 Hz, a frame size of 30 ms, a model order of 42, and a feedforward neural network with one hidden layer of 1000 nodes. The neural network is trained using 15 files for each of the 5 languages. Moreover, as the number of frames for LSF and MFCC is now the same, since both use a 50% overlapping window, we combined both features to evaluate whether the recognition rate becomes higher. In this experiment, to allow a longer training time, we further changed the number of epochs to 1000, the maximum number of validation failures to 1000, and the minimum gradient. Figure 4 and Table 3 show the training performance for LSF, MFCC, and the combined features. Note that the input layer of the combined network is the concatenation of the LSF and MFCC input layers, i.e. 84 inputs. It can be concluded that the combined features contribute to the recognition rate, while LSF is the dominant feature.

Figure 4. Neural Network Structure and its Performance for LSF, MFCC, and Combination of LSF and MFCC: (a) LSF net, (b) MFCC net, (c) Combined net, (d) LSF Performance, (e) MFCC Performance, (f) Combined Performance

Table 3. Performance for LSF, MFCC, and Combined Features
Feature                  Training Time (s)   NEpochs   MSE         Recognition Rate (%)
LSF                      239                 1248      7.04e-17    91.58
MFCC                     242                 1277      1.06e-16    85.53
Combined LSF and MFCC    259                 1160      4.89e-17    93.77

3.4. Experiments on the Testing Data
The last experiment evaluates the trained neural network on the unknown (testing) data, i.e. 5 files for each language. The feedforward neural network was trained to classify the current frame into one of the 5 trained languages. In the end, however, we need to decide the identified language for the whole file and not for the current frame. For this purpose, we utilized the majority voting rule as explained in [11], in which the identified language is the one that receives the majority of the frame votes in that particular file.

Table 4 shows the recognition rate for each language per frame and per file after majority voting. Note that a language with a lower recognition rate per frame can still reach a 100% recognition rate per file after majority voting, and vice versa. The detailed analysis revealed that English was wrongly classified as Malay for 1 file using LSF and for 2 files using MFCC. For Malay, 1 file was wrongly classified using the combined LSF and MFCC features. Interestingly, the combined features mostly improved the recognition rate, except for the Malay language. Further experiments with an additional database are required, especially for the English and Malay languages, to validate the obtained results. From the average recognition rate, it is found that using LSF features alone is sufficient for language identification.
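
The conversion from per-frame decisions to a per-file decision can be sketched as follows. This is a minimal illustration of a majority voting rule, not the authors' code; the function names and the frame labels in the example are hypothetical.

```python
from collections import Counter

def majority_vote(frame_labels):
    """Return the label predicted for the largest number of frames in one file."""
    return Counter(frame_labels).most_common(1)[0][0]

def file_recognition_rate(files):
    """files: list of (true_language, [predicted language per frame])."""
    correct = sum(1 for truth, frames in files if majority_vote(frames) == truth)
    return 100.0 * correct / len(files)

# Hypothetical frame-level predictions for five English test files
english_files = [
    ("English", ["English", "Malay", "English", "English"]),
    ("English", ["English", "English", "Korean"]),
    ("English", ["Malay", "Malay", "English"]),          # this file is misclassified
    ("English", ["English", "Arabic", "English"]),
    ("English", ["English", "English", "English"]),
]
print(file_recognition_rate(english_files))
```

Because only the most frequent label per file matters, a file can be identified correctly even when a large share of its frames is misclassified, which is why per-file rates can be much higher than per-frame rates. In this sketch, ties are resolved in favour of the label seen first, which is an arbitrary choice.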

Table 4. Recognition Rate for Each Language on the Unknown/Testing Data
Language    Recognition Rate (%) Per Frame     Recognition Rate (%) Per File
            LSF      MFCC     Combined         LSF     MFCC    Combined
Arabic      58.16    55.21    60.49            100     100     100
Chinese     79.65    43.28    73.81            100     100     100
English     65.39    46.08    65.58            80      60      100
Korean      68.34    67.05    64.47            100     100     100
Malay       65.68    56.79    69.11            100     100     80
Average     67.44    53.68    66.69            96      92      96
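
As a quick arithmetic check, the averages in Table 4 can be recomputed from the per-language values. The short sketch below uses only the numbers reported in the table; the variable names are illustrative.

```python
# Per-language values from Table 4, in the order Arabic, Chinese, English, Korean, Malay
per_frame = {
    "LSF": [58.16, 79.65, 65.39, 68.34, 65.68],
    "MFCC": [55.21, 43.28, 46.08, 67.05, 56.79],
    "Combined": [60.49, 73.81, 65.58, 64.47, 69.11],
}
per_file = {
    "LSF": [100, 100, 80, 100, 100],
    "MFCC": [100, 100, 60, 100, 100],
    "Combined": [100, 100, 100, 100, 80],
}

frame_avg = {name: round(sum(v) / len(v), 2) for name, v in per_frame.items()}
file_avg = {name: sum(v) / len(v) for name, v in per_file.items()}
print(frame_avg)
print(file_avg)
```

With 5 test files per language, each misclassified file lowers that language's per-file rate by 20 percentage points, which accounts for the 80% and 60% entries.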

4. CONCLUSION AND FUTURE WORKS
In this paper, two popular features for language identification, i.e. LSF and MFCC, have been compared and evaluated. A language identification system using a feedforward neural network has been trained on five different languages, i.e. Arabic, Chinese, English, Korean, and Malay. Systematic experiments have been conducted to obtain the optimum parameters, i.e. sampling frequency, frame size, model order, and structure of the neural network. The optimum parameters obtained were a sampling frequency of 16000 Hz, a frame size of 30 ms, a model order of 42, and a feedforward neural network with one hidden layer and 1000 hidden nodes. On average, the recognition rates for LSF, MFCC, and the combination of both features are 96%, 92%, and 96%, respectively. The results showed that LSF alone is the most suitable feature for language identification using a feedforward neural network classifier. Further research includes using a more extensive database, using deep neural networks for feature extraction and classification, and using different audio features and different classifiers.

ACKNOWLEDGEMENT
The authors would like to express their gratitude to the Malaysian Ministry of Higher Education (MOHE), which has provided funding for the research through the Fundamental Research Grant Scheme, FRGS15-194-0435.

REFERENCES
[1] M. P. Lewis, et al., "Ethnologue: Languages of the world," SIL international Dallas, TX, vol. 16, 2009.
[2] E. Ambikairajah, et al., "Language identification: A tutorial," IEEE Circuits and Systems Magazine, vol. 11, pp. 82-108, 2011.
[3] R. W. Ng, et al., "Spoken Language Recognition With Prosodic Features," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, pp. 1841-1853, 2013.
[4] J. L. Newman and S. J. Cox, "Language identification using visual features," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, pp. 1936-1947, 2012.
[5] M. V. Segbroeck, et al., "Rapid language identification," IEEE Transactions on Audio, Speech, and Language Processing, vol. 23, pp. 1118-1129, 2015.
[6] S. Irtza, et al., "A hierarchical framework for language identification," in Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on, pp. 5820-5824, 2016.
[7] K. Simonchik, et al., "Comparative Analysis of Classifiers for Automatic Language Recognition in Spontaneous Speech," in International Conference on Speech and Computer, pp. 174-181, 2016.
[8] T. S. Gunawan, et al., "Development of Language Identification System using Line Spectral Frequencies and Learning Vector Quantization Networks," Journal of Telecommunication, Electronic and Computer Engineering (JTEC), vol. 9, pp. 21-27, 2017.
[9] T. M. H. Asda, et al., "Development of Quran Reciter Identification System Using MFCC and Neural Network," Indonesian Journal of Electrical Engineering and Computer Science, vol. 1, pp. 168-175, 2016.
[10] T. S. Gunawan, et al., "Development of Language Identification System using MFCC and Vector Quantization," in Proceeding of 4th IEEE International Conference on Smart Instrumentations, Measurement, and Applications (ICSIMA) 2017, Putrajaya, pp. 1-4, 2017.
[11] T. S. Gunawan, et al., "Higher-Order Statistics and Neural Network Based Multi-Classifier System for Gene Identification," in Proceedings of International Conference on Signal Processing and Communication Systems (ICSPCS 2007), Australia, pp. 1-7, 2007.