IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 14, No. 4, August 2025, pp. 3300~3310
ISSN: 2252-8938, DOI: 10.11591/ijai.v14.i4.pp3300-3310

Journal homepage: http://ijai.iaescore.com

Music genre classification using Inception-ResNet architecture

Fauzan Valdera, Ajib Setyo Arifin
Department of Electrical Engineering, Faculty of Engineering, Universitas Indonesia, Depok, Indonesia

Article Info

Article history:
Received Mar 28, 2024
Revised Jun 10, 2025
Accepted Jul 10, 2025

ABSTRACT

Music genres help categorize music but lack strict boundaries, emerging from interactions among public, marketing, history, and culture. With Spotify hosting over 80 million tracks, organizing digital music is challenging due to the sheer volume and diversity. Automating music genre classification aids in managing this vast array and attracting customers. Recently, convolutional neural networks (CNNs) have been used for their ability to extract hierarchical features from images, applicable to music through spectrograms. This study introduces the Inception-ResNet architecture for music genre classification, significantly improving performance with 94.10% accuracy, precision of 94.19%, recall of 94.10%, F1-score of 94.08%, and 149,418 parameters on the GTZAN dataset, showcasing its potential in efficiently managing and categorizing large music databases.

Keywords:
Classification
Convolutional neural networks
Genre
Inception-ResNet
Music

This is an open access article under the CC BY-SA license.

Corresponding Author:
Ajib Setyo Arifin
Department of Electrical Engineering, Faculty of Engineering, Universitas Indonesia
Depok 16424, Indonesia
Email: ajib.sa@ui.ac.id

1. INTRODUCTION

Music genre is a label used by humans to categorize and describe the characteristics of a piece of music. Music genres do not have strict definitions and boundaries, as they emerge through complex interactions between the audience, marketing, history, and culture [1]. Observations on music genres have led some researchers to propose new classification definitions purely for the purpose of information retrieval from music [2]. However, with the existing music genres, it is clear that certain genres have characteristics typically associated with instrumentation, rhythmic structure, and musical content.

The process of extracting information from music is becoming increasingly important in organizing and managing the vast amount of digitally available music files on the internet. However, this process has become nearly impossible to do manually because of the continuously growing and increasingly diverse volume of digital music. Therefore, automated music genre classification has become one of the services that will assist music distribution vendors in organizing the multitude of music files and leveraging music information to attract customers.

During the past decade, there has been a surge in the use of convolutional neural network (CNN) architectures, which have achieved satisfactory performance in the field of image recognition [3]. CNNs can effectively extract information from an image due to their hierarchical structure [4]. Low-level features, such as basic textures, are built into high-level semantic information through CNN layers [5]. The specific capabilities of CNNs can assist in tasks such as music classification by leveraging information from spectrograms, which contain texture information from music signals.

Liu et al. [6] proposed an architecture called bottom-up broadcast neural network (BBNN), which adopts a relatively wide and shallow structure. The main idea behind the BBNN architecture is to develop effective blocks and different block-to-block connections to exploit and preserve low-level information to
higher layers. The architecture is designed in such a way that spectrogram information at the lower level can participate in decision-making layers throughout the network. Therefore, BBNN is equipped with a broadcast module (BM) consisting of InceptionV1 blocks and dense connectivity. They reported an accuracy of 93%, indicating an improvement over previous CNN architectures.

However, the BBNN has several drawbacks. The first is the use of InceptionV1 blocks in the BM: InceptionV1 has a high computational cost due to the use of large filters, specifically a 5×5 filter. The second is that BBNN adopts a relatively shallow structure [6]. This can limit its capacity to capture complex features and representations, especially for tasks that require depth and higher-level hierarchical information processing, such as music classification. The third is the use of max-pooling with a large window size, specifically (4, 1), in shallow layers. Down-sampling using max-pooling with a large window size can drastically reduce the input dimensions, resulting in lost information and potential accuracy degradation [7].

In 2016, Szegedy et al. [8] introduced Inception-v4 and Inception-ResNet, combining Inception modules with residual connections to improve deep learning efficiency. Inception-v4 refines the original Inception architecture, while Inception-ResNet integrates residual connections to enhance gradient flow and training speed. Experiments show that Inception-ResNet trains faster and achieves comparable or better accuracy than traditional Inception networks. The study highlights how residual connections mitigate the vanishing gradient problem, leading to more stable learning. Overall, the research demonstrates that combining Inception modules with residual learning results in highly accurate and computationally efficient deep networks.

This research aims to improve the accuracy and reduce the computational complexity of the BBNN architecture. We propose music genre classification using the Inception-ResNet architecture with input in the form of mel-spectrograms of audio signals.

2. LITERATURE REVIEW

Over recent years, the classification of music genres through visual representations such as spectrograms (short-time Fourier transform (STFT) and mel-spectrogram) and mel-frequency cepstral coefficients (MFCC) has seen significant advancements. These visual methods leverage traditional texture descriptors from computer vision, such as local phase quantization, local binary patterns, and Gabor filters, to encapsulate the spectrograms' content, which resembles temporal energy distribution changes across frequency bins. Although traditional classification techniques, including support vector machine (SVM) and Gaussian mixture models (GMM), outperform human accuracy (70%) on various music datasets, they are still heavily reliant on feature engineering [9].

Deep neural networks have significantly reduced the reliance on task-specific prior knowledge, achieving notable successes in computer vision [10], [11] and inspiring applications in music genre classification [12], [13]. In pioneering work by Lee et al. [14], a deep learning framework for audio classification was introduced, employing a convolutional deep belief network to learn from spectrograms and inspiring further research in using deep learning for audio recognition. The researchers in [15], [16] innovated by stacking hidden layers and employing different activation functions and classifiers, achieving up to 84% accuracy on the GTZAN dataset [17]. Despite these advancements, challenges remain in feature learning without classifier supervision, impacting the prediction capabilities of the models and maintaining a two-stage process in the framework.

Recent advancements in music genre classification have seen the integration of feature learning and classification into a single stage, primarily using CNN-based methods. Jakubik [18] introduced recurrent neural network (RNN) architectures, specifically long short-term memory (LSTM) and gated recurrent unit (GRU), from the image domain to music analysis, achieving remarkable accuracies of 91% and 92% on the GTZAN dataset, showcasing their efficacy. The NNet2 model introduced a novel CNN architecture with shortcut connections to all layers, enhancing learning capacity through a combination of max and average pooling, and achieved an 87% accuracy on GTZAN [19]. Additionally, to address the varying significance of different temporal segments in music, the studies in [3], [20], [21] incorporated an attention mechanism with a bidirectional RNN, a technique further refined by integrating stacking attention modules in subsequent work, emphasizing the evolving focus on nuanced temporal analysis in music content.

The most recent work was conducted by Liu et al. [6], who introduced the BBNN, a model featuring a wide and shallow architecture designed to efficiently utilize low-level spectrogram information across its decision-making layers. This network incorporates a BM with InceptionV1 blocks and dense connections, aiming to enhance the flow of information from lower to higher layers. Despite achieving a 93% accuracy, surpassing many conventional CNN architectures, the BBNN faces challenges such as the computationally expensive use of InceptionV1 blocks, its shallow structure, which might limit the extraction of complex features necessary for music classification, and the use of max-pooling with large window sizes, which may lead to significant information loss and potential reductions in accuracy. To overcome these drawbacks, an Inception-ResNet module is proposed in this article.

3. PROPOSED ARCHITECTURE

We propose the use of Inception-ResNet blocks to replace the InceptionV1 blocks and make modifications to the shallow layers in the BBNN model [6]. The Inception-ResNet block is designed using the TensorFlow library and consists of a total of 89 layers, comprising the stem module (5 layers), reduction module (23 layers), Inception-ResNet module (54 layers), and the fully connected module (7 layers). Figure 1 shows the proposed architecture using Inception-ResNet blocks. The stem module serves as a feature extraction component at the beginning of the network. It is responsible for processing the input data and extracting meaningful features that will be further used by the subsequent layers. Figure 2 shows the schematic of the stem module.

Figure 1. Proposed architecture using Inception-ResNet blocks

Figure 2. Stem module

The stem module consists of two downsampling stages, reducing the dimensions from (647, 128) to (323, 128), and then from (323, 128) to (161, 128). In each downsampling stage, there is a 3×3 convolutional
layer with rectified linear unit (ReLU) activation. The convolutional layers extract initial features that serve as the foundation for the subsequent layers. The use of max-pooling with a size of (4, 1) in the BBNN model reduces the input dimension to a quarter of its original size in a single step. This can potentially eliminate some important features or detailed spatial information, especially when the input size is relatively small. In the proposed stem module, the use of two stages of max-pooling with a size of (2, 1) achieves a good balance between reducing the input dimension and retaining important information in the feature representation.
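
The dimension changes quoted for the stem module can be checked with the standard output-size formula for pooling. The following is a minimal sketch, assuming pooling without padding and a stride equal to the window size (a common default, not stated in the text); the function name `pooled_length` is hypothetical.

```python
from typing import Optional


def pooled_length(length: int, window: int, stride: Optional[int] = None) -> int:
    """Output length of unpadded ('valid') pooling along one axis."""
    stride = window if stride is None else stride  # assumed: stride == window
    return (length - window) // stride + 1


frames, mel_bins = 647, 128  # input mel-spectrogram size from the text

# Proposed stem: two (2, 1) pooling stages along the time axis.
stage1 = pooled_length(frames, 2)
stage2 = pooled_length(stage1, 2)

# BBNN-style single (4, 1) pooling stage, for comparison.
single = pooled_length(frames, 4)

print((stage1, mel_bins), (stage2, mel_bins), (single, mel_bins))
```

Under these assumptions both routes end at 161 time frames; the difference is that the two-stage route discards half of the frames at a time, with a convolution between the pooling steps.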

In the proposed architecture, three Inception-ResNet modules are used: Inception-ResNet A (22 layers), Inception-ResNet B (16 layers), and Inception-ResNet C (16 layers). The scheme of the Inception-ResNet module was introduced by Szegedy et al. [8]. Figure 3 shows the Inception-ResNet A module, and Figure 4 shows the Inception-ResNet B and Inception-ResNet C modules.

In Figure 3, the Inception-ResNet A module functions to extract features at the initial stage. The module consists of three branches with a combination of 1×1 and 3×3 convolutions. The outputs of the branches are merged through a 1×1 convolution with a linear activation function. This layer is called the activation scale, which adjusts the magnitude of the module's output adaptively.

In Figure 4(a), the Inception-ResNet B module functions to extract features at the intermediate stage. The module consists of two branches with a combination of 1×1, 1×7, and 7×1 convolutions. The 1×7 and 7×1 filters are used instead of a 7×7 filter to reduce the total number of computations in the module.

In Figure 4(b), the Inception-ResNet C module, which functions to extract features at the final stage, has the same configuration scheme as the Inception-ResNet B module. The module consists of two branches with a combination of 1×1, 1×3, and 3×1 convolutions.

Figure 3. Inception-ResNet A module
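
The way a residual module combines its branch output with the shortcut path can be illustrated on plain lists of numbers. This is only a sketch of the general idea of a scaled residual connection: the scale value of 0.1, the function names, and the toy inputs are assumptions for illustration and are not taken from the proposed model.

```python
from typing import List


def relu(values: List[float]) -> List[float]:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def scaled_residual(shortcut: List[float],
                    branch_output: List[float],
                    scale: float = 0.1) -> List[float]:
    """Add a scaled branch output to the shortcut path, then apply ReLU.

    `scale` is a hypothetical constant that damps the branch output
    before the addition.
    """
    if len(shortcut) != len(branch_output):
        raise ValueError("shortcut and branch output must have the same length")
    summed = [s + scale * b for s, b in zip(shortcut, branch_output)]
    return relu(summed)


x = [1.0, -2.0, 0.5]          # toy shortcut activations
branch = [10.0, 10.0, -10.0]  # toy merged branch output
y = scaled_residual(x, branch, scale=0.1)
print(y)
```

Because the shortcut is added unchanged, the gradient has a direct path around the branches, which is the property the text credits for more stable training.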

Each Inception-ResNet module has a different number and size of filters, allowing for more control over the capacity and complexity of the model. The Inception-ResNet modules differ from the Inception modules in the BBNN model in that they use no convolutional layers with large filters, such as 5×5. Convolutions with large filters are replaced with pairs of convolutions with 1×3 and 3×1 filters, or 1×7 and 7×1 filters, to reduce the total number of parameters. This allows for the creation of deeper models.
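
The parameter saving from factorized filters follows directly from the weight count of a convolutional layer. The sketch below compares a full 7×7 (and 3×3) convolution with its 1×n plus n×1 replacement; the channel width of 32 is a hypothetical value chosen for illustration, and the helper `conv_params` is not part of the original work.

```python
def conv_params(kernel_h: int, kernel_w: int,
                in_channels: int, out_channels: int,
                bias: bool = True) -> int:
    """Number of trainable parameters in a 2D convolutional layer."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    return weights + (out_channels if bias else 0)


channels = 32  # hypothetical channel width, same for input and output

full_7x7 = conv_params(7, 7, channels, channels)
factorized_7 = (conv_params(1, 7, channels, channels)
                + conv_params(7, 1, channels, channels))

full_3x3 = conv_params(3, 3, channels, channels)
factorized_3 = (conv_params(1, 3, channels, channels)
                + conv_params(3, 1, channels, channels))

print(full_7x7, factorized_7)  # 7x7 versus 1x7 followed by 7x1
print(full_3x3, factorized_3)  # 3x3 versus 1x3 followed by 3x1
```

With these assumed widths the 1×7 and 7×1 pair needs roughly 29% of the parameters of a full 7×7 layer, and the saving grows with the filter size.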

The reduction module aims to reduce the dimension of the features while preserving and enhancing important information. Figure 5 shows the scheme of the reduction module. To reduce the size of the feature dimension, a convolutional layer with a stride of two is used, which reduces the feature dimension to half of its original size.

Figure 4. The layer architecture for (a) Inception-ResNet B module and (b) Inception-ResNet C module

Figure 5. Reduction module

4. EXPERIMENTAL SETUP

4.1. Dataset

The dataset used in this study is the GTZAN dataset. GTZAN is the most widely used public dataset for evaluation in music genre recognition (MGR) research [22]. The files were collected between 2000 and 2001 from various sources, such as personal CDs, radio, and microphone recordings, to represent various music recording conditions. GTZAN contains 1,000 music tracks in .wav format, each lasting for
30 seconds [23]. GTZAN consists of 10 genres: blues, classical, country, disco, hip-hop, jazz, metal, pop, reggae, and rock [22].

4.2. Preprocessing

Each 30-second audio track is processed into mel-spectrogram form. There are 1,000 audio samples covering 10 music genres, with 900 samples used for training and 100 samples used for testing. To process the audio dataset into mel-spectrograms, we need to perform STFT, mel-scaling, and triangular filtering, all of which are available in the Python library Librosa. Several parameters are used in audio preprocessing: a window length for the Fourier transform of 512 samples, a hop length (number of samples between frames) of 1,024 samples, and a total of 128 mel bands for mel-scaling. The preprocessing scheme can be seen in Figure 6. The audio preprocessing yields mel-spectrograms with dimensions of 128×647.

Figure 6. Mel-spectrograms by preprocessing
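
The number of time frames in the mel-spectrogram follows from the hop length. The sketch below counts frames for a centered STFT, in which the signal is padded by half a window on each side. The sampling rate of 22,050 Hz and the clip length of 661,794 samples are assumptions (the text states only a duration of 30 seconds), and the function name is hypothetical.

```python
def stft_frame_count(n_samples: int, hop_length: int,
                     n_fft: int = 512, center: bool = True) -> int:
    """Number of STFT frames for a signal of n_samples samples.

    With centered framing (padding of n_fft // 2 on both sides and an
    even n_fft) the count depends only on the hop length.
    """
    if center:
        return 1 + n_samples // hop_length
    return 1 + (n_samples - n_fft) // hop_length


hop_length = 1024  # from the text
n_mels = 128       # from the text

assumed_clip = 661_794    # assumed clip length, slightly over 30 s at 22,050 Hz
exact_30_s = 30 * 22_050  # exactly 30 s at the assumed sampling rate

print(n_mels, stft_frame_count(assumed_clip, hop_length))
print(n_mels, stft_frame_count(exact_30_s, hop_length))
```

Under these assumptions the slightly longer clip yields the 647 frames reported in the text, whereas a clip of exactly 30 seconds would yield 646, so clips may need trimming or padding to a common length.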

4.3. Training and testing

The process of designing the architecture, training, and testing was performed on Google Colaboratory Pro using a Tesla P100 GPU, 15 GB of RAM, and 2 CPU cores. The model training stage was conducted for 100 epochs using the Adam optimizer with a batch size of 8. An initial learning rate of 0.01 was set and was automatically reduced by a factor of 0.5 if the loss did not decrease for three consecutive epochs. Additionally, the training stage implemented an early stopping mechanism, which stopped the training when the monitored loss did not decrease for 5 epochs. The categorical cross-entropy loss function was used to calculate the loss value for the multiclass classification model.
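
The interaction between the learning-rate reduction and early stopping can be traced with a small simulation. This is a simplified sketch of the behaviour described above, not the training code itself: it ignores options such as minimum improvement thresholds, and the loss values in the example are made up.

```python
from typing import List, Tuple


def simulate_schedule(losses: List[float],
                      initial_lr: float = 0.01,
                      factor: float = 0.5,
                      lr_patience: int = 3,
                      stop_patience: int = 5) -> List[Tuple[int, float]]:
    """Return (epoch, learning rate) pairs until early stopping triggers."""
    lr = initial_lr
    best = float("inf")
    epochs_since_best = 0  # counter for early stopping
    lr_wait = 0            # counter for learning-rate reduction
    history = []
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best = loss
            epochs_since_best = 0
            lr_wait = 0
        else:
            epochs_since_best += 1
            lr_wait += 1
            if lr_wait >= lr_patience:
                lr *= factor  # halve the learning rate on a plateau
                lr_wait = 0
        history.append((epoch, lr))
        if epochs_since_best >= stop_patience:
            break  # early stopping
    return history


# Hypothetical monitored-loss values, one per epoch.
example_losses = [1.0, 0.8, 0.9, 0.85, 0.81, 0.82, 0.83, 0.9, 0.7]
history = simulate_schedule(example_losses)
print(history)
```

In this made-up run the learning rate is halved at epoch 5, after three epochs without improvement, and training stops at epoch 7, after five such epochs.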

The model training employed the K-fold cross-validation method. K-fold cross-validation is a commonly used technique in machine learning for more objective model performance evaluation [24]–[26]. In K-fold cross-validation, the dataset is randomly divided into balanced subsets called folds. The method helps address the uncertainty in model evaluation caused by variations in the training and testing data splits. By combining evaluations from independent iterations, we obtain a more objective overview of the model's performance. In this training, K was set to 10, and stratified sampling was used to ensure balanced data.
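
A stratified split keeps the genre proportions identical in every fold. The following minimal sketch builds 10 stratified folds in plain Python, assuming 100 tracks per genre (consistent with 1,000 tracks and 10 genres); the function name and the random seed are hypothetical.

```python
import random
from collections import Counter, defaultdict
from typing import List


def stratified_folds(labels: List[str], k: int = 10, seed: int = 0) -> List[List[int]]:
    """Split sample indices into k folds with equal class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin over the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


genres = ["blues", "classical", "country", "disco", "hip-hop",
          "jazz", "metal", "pop", "reggae", "rock"]
labels = [genre for genre in genres for _ in range(100)]  # assumed 100 per genre

folds = stratified_folds(labels, k=10)
test_indices = folds[0]
train_indices = [i for fold in folds[1:] for i in fold]
per_genre_in_test = Counter(labels[i] for i in test_indices)
print(len(train_indices), len(test_indices), per_genre_in_test["rock"])
```

Each iteration then uses one fold of 100 tracks for testing and the remaining 900 for training, which matches the split sizes given in the preprocessing section.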

5. RESULTS AND ANALYSIS

5.1. Result

The training process of the Inception-ResNet architecture follows the experimental scheme described in Section 4. In each training iteration or epoch, the model is evaluated using validation data that is not used for training the model. Two evaluation metrics are used: accuracy and loss. The main goal of these evaluation metrics is to measure how well the model generalizes and predicts with high accuracy on unseen data. The evaluation metric values obtained from the training process of 100 epochs using the TensorFlow library are shown in Figure 7. Figure 7(a) illustrates the accuracy comparison graph with epoch iterations, where accuracy represents the percentage of correctly predicted data out of the total data. Figure 7(b) displays the loss comparison graph with epoch iterations, where loss is the result of the categorical cross-entropy loss function.
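
The two quantities plotted against the epochs can be computed as follows. This is a minimal sketch of categorical cross-entropy and accuracy for one-hot targets; the three-class targets and predicted probabilities are made-up values for illustration.

```python
import math
from typing import List


def categorical_cross_entropy(targets: List[List[int]],
                              probabilities: List[List[float]],
                              eps: float = 1e-12) -> float:
    """Mean categorical cross-entropy over all samples."""
    total = 0.0
    for target, probs in zip(targets, probabilities):
        # Only the true class contributes, because the target is one-hot.
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(target, probs))
    return total / len(targets)


def accuracy(targets: List[List[int]], probabilities: List[List[float]]) -> float:
    """Fraction of samples whose most probable class is the true class."""
    correct = sum(
        target.index(1) == probs.index(max(probs))
        for target, probs in zip(targets, probabilities)
    )
    return correct / len(targets)


# Hypothetical one-hot targets and predicted class probabilities.
targets = [[1, 0, 0], [0, 1, 0]]
probabilities = [[0.5, 0.25, 0.25], [0.5, 0.25, 0.25]]

loss = categorical_cross_entropy(targets, probabilities)
acc = accuracy(targets, probabilities)
print(round(loss, 4), acc)
```

The loss penalizes low probability on the true class even when the prediction is correct, which is why the loss curve can keep improving after the accuracy curve has flattened.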

After the training process, the Inception-ResNet model is evaluated using several metrics to assess its performance. The evaluation metrics used for the music genre classification task are accuracy, precision, recall rate, and F1-score. Table 1 shows the evaluation metrics calculated by averaging the accuracy, precision, recall rate, and F1-score from the results of the 10-fold cross-validation process. To assess the model's
performance in predicting audio for each music genre, separate metric calculations are performed for each genre, each consisting of 100 test data, as shown in Table 2. Additionally, a confusion matrix is generated to illustrate the number of correct and incorrect predictions for each category, as shown in Figure 8.

Figure 7. Evaluation metrics results: (a) validation and training accuracy and (b) validation and training loss

Table 1. Evaluation metrics

Evaluation metrics (%)
Accuracy    Precision    Recall rate    F1-score
94.10       94.10        94.19          94.08

Table 2. Evaluation metrics per genre

            Evaluation metrics (%)
Genre       Accuracy    Precision    Recall rate    F1-score
Blues       94.0        95.9         94.0           94.9
Classical   99.0        98.1         99.0           98.5
Country     95.0        90.4         95.0           92.7
Disco       90.0        92.7         90.0           91.3
Hip-Hop     98.0        98.9         98.0           98.9
Jazz        94.0        95.9         94.0           95.8
Metal       98.0        92.4         98.0           92.4
Pop         96.0        88.1         96.0           91.8
Reggae      91.0        96.8         91.0           93.8
Rock        86.0        92.5         86.0           89.1

Figure 8. Confusion matrix

The number of parameters in the model refers to the number of weights and other values that the model stores and uses for computations during training. Table 3 shows the number of parameters in the Inception-ResNet architecture obtained from the simulation results using the TensorFlow library. Trainable parameters are the parameters that change during the training process, including the weights and biases in each convolutional layer and fully connected layer. Non-trainable parameters are the parameters that do not change during the training process, including global and constant parameters. Total parameters are the sum of trainable and non-trainable parameters. The results of music genre classification prediction using the Inception-ResNet architecture are shown in Figure 9.

Table 3. Total parameters

Parameters                  Value
Trainable parameters        147,818
Non-trainable parameters    1,600
Total parameters            149,418

Figure 9. Model predictions

5.2. Analysis

The Inception-ResNet model was evaluated using several evaluation metrics to assess its performance. The evaluation metrics used for music genre classification are accuracy, precision, recall rate, and F1-score. These evaluation metrics are built from four components: true positive (TP), true negative (TN), false negative (FN), and false positive (FP). In the case of music genre classification, the genre labels are transformed into binary vectors using one-hot encoding. Each unique category or level of the categorical variable is represented by a binary vector whose length is equal to the number of unique categories. For each data point, only one element in the vector is positive (denoted by 1), indicating the corresponding category, while the other elements are negative (denoted by 0).
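
One-hot encoding of the ten GTZAN genres can be sketched in a few lines. The ordering of the genres in the vector is an assumption (alphabetical, as listed in the dataset description), and the function name is hypothetical.

```python
from typing import List, Sequence

GENRES = ["blues", "classical", "country", "disco", "hip-hop",
          "jazz", "metal", "pop", "reggae", "rock"]


def one_hot(genre: str, classes: Sequence[str] = GENRES) -> List[int]:
    """Binary vector with a single 1 at the position of the given genre."""
    if genre not in classes:
        raise ValueError(f"unknown genre: {genre}")
    return [1 if name == genre else 0 for name in classes]


vector = one_hot("jazz")
print(vector)
```

The vector length equals the number of genres, and its single positive element marks the true class against which TP, TN, FP, and FN are counted.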

Accuracy is a commonly used evaluation metric to measure the performance of a classification model. It calculates the percentage of correct predictions (TP and TN) out of all predictions made by the model. According to Table 1, the accuracy obtained from the model using the test data is 94.10%. This accuracy value indicates that the model performs well and accurately predicts the music genre overall.
However, based on Table 2, there are genres with accuracy under 90%, such as rock, indicating that the model is less accurate in predicting audio of that genre. The confusion matrix in Figure 8 shows that the model misclassified some rock audio as metal, which can be attributed to the similarities between the rock and metal genres.

Precision is an evaluation metric used to measure how accurately a classification model identifies positive predictions. It calculates the percentage of TP predictions out of all positive predictions made by the model. According to Table 1, the precision obtained from the model using the test data is also 94.10%. This precision value indicates that the model has a high level of accuracy in classifying audio into the respective classes. However, based on Table 2, there are genres with precision under 90%, such as pop, indicating that the model tends to assign audio from other genres to that genre.
Recall rate, also known as sensitivity or TP rate, is an evaluation metric used to measure how well a classification model correctly identifies the overall number of positives. Recall calculates the percentage of TP predictions out of the total number of actual positives. According to Table 1, the recall rate obtained from the model using the test data is 94.10%.
F1-score is an evaluation metric used to combine information about precision and recall into a single number that describes the overall performance of a classification model or selection system. F1-score measures how well the model can achieve a balance between precision and recall. According to Table 1, the F1-score obtained from the model using the test data is 94.08%.
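The four metrics defined above can be computed from a multi-class confusion matrix, as in the following sketch. Several things here are assumptions rather than details from the paper: the function name `classification_metrics` is hypothetical, the per-class values are macro-averaged (the paper does not state its averaging scheme), and the three-genre confusion matrix is made-up toy data, not the results in Figure 8.

```python
def classification_metrics(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1-score.

    confusion[i][j] counts samples whose true class is i and whose
    predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[k][k] for k in range(n)) / total

    precisions, recalls, f1_scores = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # TP + FP
        actual_k = sum(confusion[k])                          # TP + FN
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,  # macro average (assumption)
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


# Toy data only: rows are true genres, columns are predicted genres.
toy_confusion = [
    [8, 2, 0],   # true rock: two clips predicted as metal
    [1, 9, 0],   # true metal: one clip predicted as rock
    [0, 0, 10],  # true pop
]
metrics = classification_metrics(toy_confusion)
```

In the toy matrix, the rock row loses recall through its two clips predicted as metal, and those same two clips lower the precision of the metal column. This is the same kind of rock-to-metal confusion described for Figure 8.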
The number of parameters in a model is directly related to computational cost. The more parameters a model has, the more complex it is, and the more computational operations are required during training or prediction. According to Table 3, the total number of parameters obtained is 149,418.
The training graphs in Figure 7 show a consistent increase in validation accuracy and a decrease in validation loss throughout the training process. This indicates that the model can generalize well, as it achieves good performance not only on the training data but also on the validation data. Additionally, there is no significant difference between the validation accuracy and train accuracy values at the final epoch, indicating that the model does not suffer from overfitting.
5.3.  Performance comparison Inception-ResNet and bottom-up broadcast neural network
The evaluation metrics of the trained Inception-ResNet model are compared with the BBNN model. Table 4 presents a comparison of the performance between these two models, which have undergone the same training process and dataset pre-processing [6].
Based on Table 4, the proposed architecture model has higher values in each metric and a smaller total number of parameters compared to the BBNN architecture. The proposed architecture utilizes Inception-ResNet modules, which have fewer parameters compared to the InceptionV1 modules used in the BBNN architecture. This allows the authors to create a deeper architecture to enhance the model's capacity in capturing complex features and representations from mel-spectrograms.
Table 4. Performance comparison
Model             Total parameters   Accuracy (%)   Recall (%)   Precision (%)   F1-score (%)
Inception-ResNet  149,418            94.10          94.10        94.19           94.08
BBNN [6]          185,642            93.90          94.0         93.7            93.7
The reduction in the number of parameters in the Inception-ResNet modules comes from avoiding convolutional layers with larger filters, such as 5×5. Instead, the large filters are replaced with two convolutions using filters of size (1×3 and 3×1) or (1×7 and 7×1) to reduce the computational cost or total number of parameters. Figures 3 and 4 illustrate the Inception-ResNet modules used in the proposed architecture.
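The parameter saving from this factorization can be checked with simple arithmetic. The sketch below counts the weights and biases of a standard 2D convolution. The helper `conv_params` and the channel width of 32 are hypothetical choices for illustration; the paper's actual layer widths are given in its architecture tables, not here.

```python
def conv_params(kernel_h, kernel_w, in_channels, out_channels, bias=True):
    """Number of trainable parameters in a standard 2D convolution layer."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    return weights + (out_channels if bias else 0)


CHANNELS = 32  # hypothetical channel width, chosen only for illustration

# One square convolution with a large filter.
full_5x5 = conv_params(5, 5, CHANNELS, CHANNELS)
full_7x7 = conv_params(7, 7, CHANNELS, CHANNELS)

# The factorized alternative: a 1xk convolution followed by a kx1 convolution.
factorized_3 = (conv_params(1, 3, CHANNELS, CHANNELS)
                + conv_params(3, 1, CHANNELS, CHANNELS))
factorized_7 = (conv_params(1, 7, CHANNELS, CHANNELS)
                + conv_params(7, 1, CHANNELS, CHANNELS))
```

Under these assumptions, the 1×7 and 7×1 pair needs 14,400 parameters against 50,208 for a single 7×7 convolution, and the 1×3 and 3×1 pair needs 6,208 against 25,632 for a single 5×5 convolution.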
In the BBNN architecture, a single stage of max-pooling with a size of (4, 1) is used, which directly reduces the input dimension to a quarter of its original size. This may lead to the loss of some important features or detailed spatial information. On the other hand, in the proposed Inception-ResNet architecture, the use of two stages of max-pooling with a size of (2, 1) strikes a good balance between reducing the input dimension and retaining important information in the feature representation.
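The effect of the two pooling schemes on the feature-map shape can be illustrated with a small pure-Python sketch. It assumes non-overlapping pooling (stride equal to the pool size) and no padding. The helper `max_pool_2d` and the 8×3 toy feature map are hypothetical and are not taken from either architecture.

```python
def max_pool_2d(feature_map, pool_h, pool_w):
    """Non-overlapping max-pooling over a 2D list (stride equals pool size)."""
    rows = len(feature_map) // pool_h
    cols = len(feature_map[0]) // pool_w
    return [
        [
            max(feature_map[r * pool_h + i][c * pool_w + j]
                for i in range(pool_h) for j in range(pool_w))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Toy 8x3 feature map; the values are arbitrary.
feature_map = [[row * 10 + col for col in range(3)] for row in range(8)]

# Single stage with size (4, 1): 8 rows -> 2 rows in one step.
single_stage = max_pool_2d(feature_map, 4, 1)

# Two stages with size (2, 1): 8 rows -> 4 rows -> 2 rows.
first_stage = max_pool_2d(feature_map, 2, 1)
second_stage = max_pool_2d(first_stage, 2, 1)
```

Both schemes end at the same output height. The difference is that the two-stage scheme exposes an intermediate half-resolution map (`first_stage`) that layers placed between the pooling stages can still process.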
The lower total number of parameters can improve the performance of music genre classification on devices with limited computational resources, such as mobile phones. Additionally, the higher accuracy can enhance the performance of applications that utilize the genre classification model, such as recommender systems, data pipelines, and other applications.
6.  CONCLUSION
In this article, we propose a music genre classification framework based on the Inception-ResNet architecture. The proposed framework uses a deeper architecture to capture complex features in mel-spectrograms. Even with a smaller number of parameters, the proposed framework outperforms the existing model on all evaluation metrics, including accuracy, recall, precision, and F1-score, on the GTZAN dataset. With a smaller number of parameters, the proposed framework can potentially be applied to devices with limited computational resources.
FUNDING INFORMATION
This work was supported in part by the Universitas Indonesia under Grant PUTI Q2 NKB-796/UN2.RST/HKP.05.00/2023 and Grant LK NKB-2582/UN2.F4.D/PPM.00.00/2023.
AUTHOR CONTRIBUTIONS STATEMENT
This journal uses the Contributor Roles Taxonomy (CRediT) to recognize individual author contributions, reduce authorship disputes, and facilitate collaboration.
Name of Author     C  M  So  Va  Fo  I  R  D  O  E  Vi  Su  P  Fu
Fauzan Valdera     ✓ ✓ ✓ ✓ ✓ ✓
Ajib Setyo Arifin  ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
C : Conceptualization
M : Methodology
So : Software
Va : Validation
Fo : Formal analysis
I : Investigation
R : Resources
D : Data Curation
O : Writing - Original Draft
E : Writing - Review & Editing
Vi : Visualization
Su : Supervision
P : Project administration
Fu : Funding acquisition
CONFLICT OF INTEREST STATEMENT
Authors state no conflict of interest.
DATA AVAILABILITY
The data that support the findings are available from the corresponding author [ASA] on request.
REFERENCES
[1] G. Cerati, “Difficult to define, easy to understand: the use of genre categories while talking about music,” SN Social Sciences, vol. 1, no. 12, 2021, doi: 10.1007/s43545-021-00296-2.
[2] M. Genussov and I. Cohen, “Musical genre classification of audio signals using geometric methods,” in 2010 18th European Signal Processing Conference, 2010, pp. 497–501.
[3] Y. Wang, X. Lin, L. Wu, and W. Zhang, “Effective multi-query expansions: collaborative deep networks for robust landmark retrieval,” IEEE Transactions on Image Processing, vol. 26, no. 3, pp. 1393–1404, 2017, doi: 10.1109/TIP.2017.2655449.
[4] L. G. Hafemann, L. S. Oliveira, and P. Cavalin, “Forest species recognition using deep convolutional neural networks,” in International Conference on Pattern Recognition, 2014, pp. 1103–1107, doi: 10.1109/ICPR.2014.199.
[5] K. Choi, G. Fazekas, M. Sandler, and K. Cho, “Transfer learning for music classification and regression tasks,” in Proceedings of the 18th International Society for Music Information Retrieval Conference, ISMIR 2017, 2017, pp. 141–149, doi: 10.5281/zenodo.1418015.
[6] C. Liu, L. Feng, G. Liu, H. Wang, and S. Liu, “Bottom-up broadcast neural network for music genre classification,” Multimedia Tools and Applications, vol. 80, no. 5, pp. 7313–7331, 2021, doi: 10.1007/s11042-020-09643-6.
[7] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller, “Striving for simplicity: the all convolutional net,” in arXiv-Computer Science, pp. 1–14, Apr. 2015.
[8] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, “Inception-v4, inception-ResNet and the impact of residual connections on learning,” in 31st AAAI Conference on Artificial Intelligence, AAAI 2017, 2017, pp. 4278–4284, doi: 10.1609/aaai.v31i1.11231.
[9] Y. M. G. Costa, L. S. Oliveira, and C. N. Silla, “An evaluation of convolutional neural networks for music classification using spectrograms,” Applied Soft Computing Journal, vol. 52, pp. 28–38, 2017, doi: 10.1016/j.asoc.2016.12.024.
[10] G. Huang, Z. Liu, and L. van der Maaten, “Densely connected convolutional networks,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 2017, pp. 2261–2269, doi: 10.1109/CVPR.2017.243.
[11] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” Communications of the ACM, vol. 60, no. 6, pp. 84–90, 2017, doi: 10.1145/3065386.
[12] C. Kereliuk, B. L. Sturm, and J. Larsen, “Deep learning and music adversaries,” IEEE Transactions on Multimedia, vol. 17, no. 11, pp. 2059–2071, 2015, doi: 10.1109/TMM.2015.2478068.
[13] J. Lee and J. Nam, “Multi-level and multi-scale feature aggregation using pretrained convolutional neural networks for music auto-tagging,” IEEE Signal Processing Letters, vol. 24, no. 8, pp. 1208–1212, 2017, doi: 10.1109/LSP.2017.2713830.