International Journal of Electrical and Computer Engineering (IJECE)
Vol. 9, No. 6, December 2019, pp. 5185~5191
ISSN: 2088-8708, DOI: 10.11591/ijece.v9i6.pp5185-5191
Journal homepage: http://iaescore.com/journals/index.php/IJECE
Convolutional neural network-based model for web-based text classification

Satyabrata Aich, Sabyasachi Chakraborty, Hee-Cheol Kim
Department of Computer Engineering/Institute of Digital Anti-Aging Healthcare, Inje University, Republic of Korea
Article Info

Article history:
Received Apr 18, 2019
Revised Jul 8, 2019
Accepted Jul 17, 2019

ABSTRACT
There is an increasing amount of text data available on the web with multiple topical granularities; this necessitates proper categorization/classification of text to facilitate obtaining useful information as per the needs of users. Some traditional approaches such as bag-of-words and bag-of-ngrams models provide good results for text classification. However, texts available on the web in their current state contain high event-related granularity on different topics at different levels, which may adversely affect the accuracy of traditional approaches. With the invention of deep learning models, which already have the capability of providing good accuracy in the fields of image processing and speech recognition, the problems inherent in traditional text classification models can be overcome. Currently, there are several deep learning models, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long short-term memory, that are widely used for various text-related tasks; however, among them, the CNN model is popular because it is simple to use and has high accuracy for text classification. In this study, classification of random texts on the web into categories is attempted using a CNN-based model by changing the hyperparameters and sequence of text vectors. We attempt to tune every hyperparameter that is unique to the classification task along with the sequences of word vectors to obtain the desired accuracy; the accuracy is found to be in the range of 85–92%. This model can be considered a reliable model and applied to solve real-world problems or extract useful information for various text mining applications.
Keywords:
Convolutional neural networks
Deep learning
Text classification
Text mining
Web-based text classification
Copyright © 2019 Institute of Advanced Engineering and Science. All rights reserved.
Corresponding Author:
Hee-Cheol Kim,
Department of Computer Engineering/Institute of Digital Anti-Aging Healthcare,
Inje University,
197, Inje-ro, Gimhae-si, Gyeongsangnam-do, Republic of Korea 50834.
Email: heeki@inje.ac.kr
1. INTRODUCTION
Categorization or classification of text is considered to be one of the important topics in the field of natural language processing (NLP); it is also an essential tool in diverse fields such as filtering information, categorization of topics, searching the web, and sentiment analysis [1]. One of the simple ways of explaining text classification is as follows: given a group of documents containing a group of classes, a function is defined that will assign a value to the group of classes for each document [2]. Text classification or text mining approaches are used to extract important information from a large amount of text data in a short time [3]. From the beginning, text classification has been considered to be a complicated problem because text data are mostly unstructured and contain a lot of text vectors. Previous approaches are suitable for a small amount of text with less complexity; however, these approaches are not suitable for a large amount of text because this reduces their accuracy. For this purpose, deep learning models are popular because they provide good accuracy when used for a large amount of data, and these models have already shown their potential in the fields of speech recognition and computer vision [4, 5].
The procedure that most deep learning approaches follow is as follows: First, input sentences are represented as a sequence of words. A term called a "one-hot vector" represents each word in that model. Then, a weight matrix is multiplied by the sequence of words and projected to a vector space that is continuous in nature to form a dense vector that contains a sequence of real values. This sequence of words is then considered as input to the deep neural network, in which multiple layers predict the desired output. Based on the tuning of suitable hyperparameters and the sequence of word vectors, maximum accuracy can be achieved on the training set [6-8].
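The one-hot-to-dense projection described above can be sketched in NumPy as follows; the vocabulary size, embedding dimension, and sentence are illustrative, not the study's values:

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size = 8   # illustrative vocabulary size
embed_dim = 4    # illustrative embedding dimension

# Weight matrix that projects one-hot word vectors into a continuous space.
W = rng.normal(size=(vocab_size, embed_dim))

# A sentence as a sequence of word indices.
sentence = [3, 1, 5]

# One-hot representation: one row per word.
one_hot = np.zeros((len(sentence), vocab_size))
one_hot[np.arange(len(sentence)), sentence] = 1.0

# Multiplying by W projects each word into the dense vector space;
# this is equivalent to a simple row lookup in W.
dense = one_hot @ W

assert np.allclose(dense, W[sentence])
```

The equivalence asserted at the end is why practical implementations replace the explicit matrix product with an embedding lookup.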
Although different deep learning models are available, such as long short-term memory (LSTM), recurrent neural networks (RNNs), and convolutional neural networks (CNNs), we use a CNN because of its simplicity and because it provides high accuracy for text classification. Therefore, in this study, a CNN model is developed with the best possible tuning of hyperparameters and sequencing of word vectors to improve the classification accuracy of texts into different categories.
The paper is organized as follows: Section 2 provides background and related work. Section 3 gives an overview of convolutional neural networks. Section 4 describes the methodology. In Section 5, the evaluation of the model is discussed. Section 6 provides conclusions and future work.
2. BACKGROUND
Kim [9] proposed a method to perform classification of a sentence using a CNN with one convolution layer and very little tuning of hyperparameters; four different models, namely CNN-rand, CNN-static, CNN-non-static, and CNN-multichannel, were used, and an accuracy ranging from 81.5–89.6% was achieved. The author concluded that prior training of unsupervised word vectors is one of the important aspects while performing NLP-related tasks using deep learning. Zhang et al. [10] proposed an empirical study that uses character-level CNNs for large-scale datasets; they found that a CNN that uses character-level features as input is effective when subjected to datasets of a certain size, curated text, and choice of alphabets. The result would be different if there is a change in dataset size, curated texts, or choice of alphabets. Johnson and Zhang [11] proposed a method to verify the effect of word order on text classification accuracy; here, instead of applying a CNN to a low-dimensional word vector, they applied the method to high-dimensional text data to achieve better model accuracy. They used a parallel CNN, in which more than two convolution layers are used in parallel for learning more embedding texts to improve model accuracy; they found this method to be effective. Hughes et al. [12] proposed an approach based on semantic classification performed at the sentence level. They used this method on medical texts and found that deep CNNs facilitate analyzing the semantics of sentences by generating more optimal features, which indirectly improves the accuracy. They found that this method outperformed other approaches used for tasks related to NLP. Rios and Kavuluru [13] proposed an approach to perform text classification of biomedical articles by assigning a medical subject heading (MeSH) term to the articles. They found an improvement of approximately 3% when using MeSH terms for classification tasks compared to previous results on public datasets. They mentioned that this method has a strong potential for classification of texts related to biomedical articles. Zhang et al. [14] proposed a novel method to perform sentiment analysis on text data by using a CNN together with cross-modality consistent regression (CCR) and transfer learning. They used three types of embeddings, namely lexicon embedding, semantic embedding, and sentiment embedding, to encode the texts. To improve the performance, each CNN model contained one of the embeddings, and CCR and transfer learning were performed. It was found that all CNN models perform better compared to the existing models. Wang and Kim [15] proposed an improved CNN model for topic and sentiment classification on four benchmark datasets and found that the new model outperformed all the previous models. Moriya and Shibata [16] proposed a CNN technique with deep layers at the character level and then used a transfer learning method to improve the classification accuracy. Nii et al. [17] proposed a framework that used word vector representation of text and a CNN for text classification. It was found that the performance of the proposed method is better than that of the previous methods. Lidong and Hui [18] proposed a technique named multi-mixed CNN for classification of text sentiments. It was found that this method is more effective compared to support vector machines, Naïve Bayesian classifiers, and other classical methods. Kowsari et al. [19] presented a thorough review of the classification methods along with their advantages and disadvantages. From the abovementioned past work, it can be seen that CNNs have enough potential for text classification-related tasks while achieving high accuracy.
3. CONVOLUTIONAL NEURAL NETWORKS
This section gives an overview of CNNs and their architecture for application to text classification. A typical CNN architecture for recognizing characters is shown in Figure 1 [20]. CNNs were mainly invented for computer vision, and nowadays almost every vision system uses them [9]. A CNN is a neural network-based architecture that basically consists of multiple stages; the network is trainable for performing text classification-related tasks. The stages of a CNN are as follows [21-23]:
a. Convolutional layers: These are some of the most important layers of a CNN. These layers contain a number of kernel matrices. In these layers, convolution is usually performed by the kernel matrices on the input, and an output is generated as a feature matrix with a bias value added. The kernel weights and biases are learned using learning procedures, because the connection weights are shared among neurons.
b. Pooling layers: These layers are fundamental elements of a CNN. The main objective of these layers is to carry out dimensionality reduction of the input, which reduces the number of randomly generated variables so that the data analytic process is faster and simpler. The subsampling of the convolution layer output is performed by the pooling layer by combining neighboring elements. The max-pooling function is the most commonly used pooling function; it usually takes the maximum value among the local neighborhoods.
c. Embedding layer: This is one of the special elements of a CNN for performing text classification-related tasks. The objective of this layer is to convert input text documents into a proper format that is suitable for the CNN. In this layer, each word of the input text document is converted into a dense vector of a fixed size.
d. Fully connected layer: This is a hidden layer of a feed-forward neural network (FNN). This layer can be viewed as a unique convolution layer that contains a kernel matrix of size 1x1. This type of layer belongs to the group of layers that contain trainable weights. It is mostly used in the last stage of a CNN.
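As a minimal illustration of the pooling stage described above, the following sketch downsamples a 1-D feature map with the max-pooling function (the window and stride values are illustrative):

```python
def max_pool_1d(feature_map, window, stride):
    """Downsample a 1-D feature map by keeping the maximum of each window."""
    out = []
    for start in range(0, len(feature_map) - window + 1, stride):
        out.append(max(feature_map[start:start + window]))
    return out

# Six convolution responses collapse to three pooled values.
features = [0.1, 0.9, 0.3, 0.7, 0.5, 0.2]
assert max_pool_1d(features, window=2, stride=2) == [0.9, 0.7, 0.5]
```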
Figure 1. Architecture of a convolutional neural network
A backpropagation algorithm basically uses supervised learning and a continuous-valued function. The numerical weights on each input are assigned based on historical data prior to the training process. In the training process, the optimal weights are finalized by reducing the mean square error. The formula for finding the mean square error is as follows [23]:

$$E = \frac{1}{P \cdot n_o} \sum_{p=1}^{P} \sum_{s=1}^{n_o} e_s^2(p)$$

where $n_o$ is the number of neurons of the output layer and $e_s^2(p)$ is the error of the $s$-th output neuron for the $p$-th pattern of the training set.
To minimize the error function, mini-batch stochastic gradient descent (m-SGD) is widely used [24]. In m-SGD, the model coefficients and model error estimations are computed by dividing the training set into a small number of batches. This type of algorithm has the advantages of both the stochastic gradient descent algorithm and the batch gradient descent algorithm, i.e., robustness from the stochastic gradient descent algorithm and efficiency from the batch gradient descent algorithm. It is the most widely used algorithm in the field of deep learning [25].
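A minimal sketch of m-SGD minimizing a mean square error, shown on a one-parameter linear model rather than a CNN (the model, data, learning rate, and batch size are all illustrative; the study applies the same idea to the network weights):

```python
import random

random.seed(0)
# Synthetic training set for y = w * x with true w = 3.
data = [(x, 3.0 * x) for x in range(1, 21)]

w, lr, batch_size = 0.0, 0.001, 5
for epoch in range(200):
    random.shuffle(data)
    # m-SGD: split the training set into small batches and update per batch.
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]
        # Gradient of the batch mean square error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
        w -= lr * grad

assert abs(w - 3.0) < 1e-3  # converges to the true coefficient
```

Per-batch updates give the robustness of stochastic descent, while averaging the gradient over each batch gives the efficiency and stability of batch descent.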
4. METHODOLOGY
The flowchart of the methodology used in this study is shown in Figure 2.

Figure 2. Flowchart of the text classification using CNN
The following steps can be considered as the key steps to perform text classification using CNNs.
a. Data aggregation and ingestion: The initial development of the model requires a huge amount of data so that the feature vectors can completely convolve through the text matrix and self-optimize the scoring function, that is, the weights. The data was collected from an open-source repository that contained blogs on four different topics, namely, healthcare, sports, movies, and finance. The data initially contained a lot of noise such as advertisement data, images, and links. This noise was removed at the very initial stage to support proper ingestion of data into the model. At the production level, data ingestion is implemented using the BeautifulSoup object, which accepts a hyperlink to any blog or any kind of webpage. After fetching the link, the BeautifulSoup object is processed with a link of the blog or textual data; initially, it captures the complete textual data from the blog by negating all the excess noise such as images and links. After fetching data from the link, the object is processed in the initial stage, for example, by removal of punctuation and leftover noise; further, two-degree lemmatization is performed on the text to make it suitable for processing by the classifier. The scope of the Soup object is such that every time the object runs, it retrains the model further to make it more sophisticated, after being properly tested using a predefined model.
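The study performs this step with a BeautifulSoup object; the following sketch shows the same idea of keeping visible text while negating link and script noise, using only the standard-library parser instead (the HTML snippet is invented for illustration; images are void tags and carry no text, so only container noise tags need skipping):

```python
from html.parser import HTMLParser

class BlogTextExtractor(HTMLParser):
    """Collect visible text while skipping noise tags such as links/scripts."""
    NOISE = {"script", "style", "a"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0  # >0 while inside a noise tag

    def handle_starttag(self, tag, attrs):
        if tag in self.NOISE:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.NOISE and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.parts.append(data.strip())

html = "<p>Match report</p><a href='/ad'>ad link</a><p>Final score 2-1</p>"
parser = BlogTextExtractor()
parser.feed(html)
assert parser.parts == ["Match report", "Final score 2-1"]
```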
b. Data preprocessing and generation of vocabulary: This is the initial and most fundamental part that needs to be performed for the generation of any model for such a use case. Any kind of text that is fetched from the internet contains some noise, which is required to be removed from the input prior to its ingestion into the model. In a CNN, a few aspects always need to be ascertained before going forward with any approach for model development, including the length of the document, the padding required for the input matrix, etc. Therefore, the primary aim is always to analyze the type of data. In the very first step, we remove all kinds of punctuation from the text data to develop a consistent vocabulary that can be used for the model. For vocabulary generation, a few generic functions are created, which first form a proper vocabulary, because neural networks never take strings as input and require working with numerical inputs. Therefore, any kind of input should be converted to a one-hot input so that it perfectly relates to the response variable or the output variable. For the regularization of all the sentences or word vectors in a textual object, the vectors need to be combined with proper bindings. For maintaining proper binding for all word vectors, we perform padding of sentences such that the lengths of all vectors are rendered the same size. Therefore, we usually have a hyperparameter in our system, MAX_DOC_LENGTH, so that proper padding such as <PAD> (in our case) can be used to regularize all the word vectors into a similar segment in terms of length. Moreover, this padding helps in monitoring new words that the classifiers have not observed previously and converting them to a proper segment so that the system maintains consistency.
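The vocabulary generation and <PAD> padding described above can be sketched as follows (the MAX_DOC_LENGTH value, the reserved ids, the <UNK> token for unseen words, and the sample sentences are illustrative, not the study's production values):

```python
import string

MAX_DOC_LENGTH = 10          # illustrative; the study uses a larger value
PAD, UNK = "<PAD>", "<UNK>"  # reserved tokens; <UNK> catches unseen words

def build_vocab(sentences):
    """Strip punctuation and map each word to an integer id."""
    vocab = {PAD: 0, UNK: 1}
    for sentence in sentences:
        clean = sentence.translate(str.maketrans("", "", string.punctuation))
        for word in clean.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def encode(sentence, vocab):
    """Convert a sentence to a fixed-length id sequence, padding with <PAD>."""
    clean = sentence.translate(str.maketrans("", "", string.punctuation))
    ids = [vocab.get(w, vocab[UNK]) for w in clean.lower().split()]
    ids = ids[:MAX_DOC_LENGTH]
    return ids + [vocab[PAD]] * (MAX_DOC_LENGTH - len(ids))

docs = ["The striker scored twice!", "Markets closed higher today."]
vocab = build_vocab(docs)
encoded = encode("The markets fell.", vocab)
assert len(encoded) == MAX_DOC_LENGTH
assert vocab[UNK] in encoded      # "fell" was never seen while building
assert encoded[-1] == vocab[PAD]  # short sentences are padded to fixed length
```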
c. CNN model: First, we generate a sparse matrix from the textual data on the basis of the split ratio of the number of sentences and the number of words. After creating a sparse matrix, we look it up in our vocabulary for matching, which generates the batch size. Our primary default batch size was 10 words, where the smaller sentences were padded with 0 in accordance with the padding provided to them in the earlier step. Therefore, we conclude that the maximum length of the document ingested into the CNN model should be 45 words. In the embedding layer, which is generally the first layer of a CNN model, we map each vocabulary word to low-dimensional text vectors. On the basis of the embedding layer developed in the previous stage, a convolution layer is developed that takes an input from the embedding layer and passes the scalar products of the functions to the max-pooling layer. For the complete system, we developed two convolution layers followed by two max-pooling layers and fully connected layers. For the first convolution layer, the weight matrix or filter had the dimensions of the window size of each convolution frame and the embedding size; for the second convolution layer, the weight matrix had the dimensions of the window size of the second convolution layer and the size of the weight matrix or filter. The sizes of the filter or weight matrix play a major role in determining the performance of the CNN model, as these are initially randomized sequences of numbers, which are then optimized against the loss gradient to reach the minima. After convolution, the matrix is pooled over by a max-pooling layer, which downsamples each index to its maximum value. The fully connected layer at the end accepts an input from the final max-pooling layer to generate the score of each and every class for a batch of word vectors passed into the model. Finally, the output is provided by the output layer. The activation function used in the model was the ReLU function, to generalize the path of the system so that it reinforces itself toward proper prediction.
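A minimal NumPy sketch of the forward path described above: embedding lookup, one convolution stage with ReLU, max-pooling, and a fully connected scoring layer (all sizes and random weights are illustrative, and only one of the study's two convolution/pooling stages is shown):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes, not the study's production values.
doc_len, vocab_size, embed_dim = 8, 50, 6
window, n_filters, n_classes = 3, 4, 4

doc = rng.integers(0, vocab_size, size=doc_len)  # padded word-id sequence

# Embedding layer: map each vocabulary word to a low-dimensional vector.
E = rng.normal(size=(vocab_size, embed_dim))
x = E[doc]                                       # (doc_len, embed_dim)

# Convolution layer: each filter spans `window` words and the embedding size.
F = rng.normal(size=(n_filters, window, embed_dim))
conv = np.array([[np.maximum((x[i:i + window] * f).sum(), 0.0)  # ReLU
                  for i in range(doc_len - window + 1)]
                 for f in F])                    # (n_filters, positions)

# Max-pooling layer: keep the maximum response of each filter.
pooled = conv.max(axis=1)                        # (n_filters,)

# Fully connected layer: score every class from the pooled feature vector.
W, b = rng.normal(size=(n_classes, n_filters)), np.zeros(n_classes)
scores = W @ pooled + b
predicted = int(scores.argmax())
assert scores.shape == (n_classes,)
```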
5. EVALUATIONS
This section compares the performance of our model with some existing state-of-the-art methods. In this study, we used a CNN model with the vocabulary generated from the input text and tuned the hyperparameters; we found the maximum accuracy to be 92%. The hyperparameters used in our study are as follows: number of filters, maximum length of the document, embedding size, window size, pooling window, pooling strides, number of convolution layers, and number of pooling layers. In our case, we achieved maximum accuracy by assigning the following values to the abovementioned hyperparameters: number of filters = 10, maximum length of the document = 80, embedding size = 20, window size = 20, pooling window = 4, pooling stride = 2, number of convolution layers = 2, and number of pooling layers = 2. With the abovementioned combination, we achieved higher accuracy.
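For reference, the reported best configuration can be collected into a single mapping (the values are taken directly from the text above; the key names are illustrative):

```python
# Hyperparameter values that gave the study's maximum (92%) accuracy.
best_config = {
    "num_filters": 10,
    "max_doc_length": 80,
    "embedding_size": 20,
    "window_size": 20,
    "pooling_window": 4,
    "pooling_stride": 2,
    "num_conv_layers": 2,
    "num_pooling_layers": 2,
}

assert best_config["num_conv_layers"] == best_config["num_pooling_layers"]
```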
The method by Hughes et al. [12] used a CNN model with their word2vec approach for medical text classification, and they achieved a maximum accuracy of 68%. The method by Kim [9] used the CNN-rand model, in which words were initialized randomly and modified later in the training phase; the CNN-static model, in which pre-trained vectors were static but other parameters were updated based on the performance; the CNN-non-static model, in which pre-trained vectors as well as other parameters were updated based on the performance; and the CNN-multichannel model, in which each channel has its own setting and is tuned separately. The accuracy of the abovementioned models ranges from 81.5–89.6%. The classification performance is shown in Figure 3. From Figure 3, it is clear that our proposed method achieves the highest accuracy. The slight margin of improvement, of approximately 3%, can also facilitate achieving the desired classification objectives.
Figure 3. Classification performance of different CNN models
6. CONCLUSION
In this study, we proposed a novel CNN-based method to classify texts belonging to different categories collected from the web with higher accuracy compared to other CNN-based models. The proposed CNN model was built by considering different hyperparameters, which were tuned to optimize the results. We found accuracies ranging from 85–92% based on the hyperparameter tuning and shuffling of the sequence of the text vectors. In the future, we will implement our proposed model at a much larger scale with better fine-grained datasets. We hope that our model initiates further studies as well as helps researchers in the field to classify random texts from the web and extract useful information from them.
ACKNOWLEDGEMENTS
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2017R1D1A3B04032905). This paper is a revised and expanded version of a paper entitled "A Convolutional Neural Network Approach for Classification of Text based on the Texts Collected from the Web" presented at the International Conference on Future Information & Communication Engineering (ICFICE), Pattaya, Thailand, 26 June-1 July 2018.
REFERENCES
[1] C. C. Aggarwal and C. Zhai, "A survey of text classification algorithms," in Mining Text Data, Springer, Boston, MA, pp. 163-222, 2012.
[2] M. R. Murty, et al., "Text Document Classification based-on Least Square Support Vector Machines with Singular Value Decomposition," International Journal of Computer Applications, vol. 27, pp. 21-26, 2011.
[3] S. Aich, et al., "A text mining approach to identify the relationship between gait-Parkinson's disease (PD) from PD based research articles," in Inventive Computing and Informatics (ICICI), International Conference on, IEEE, pp. 481-485, 2017.
[4] A. Krizhevsky, et al., "ImageNet Classification with Deep Convolutional Neural Networks," in Proceedings of NIPS, 2012.
[5] A. Graves, et al., "Speech recognition with deep recurrent neural networks," in Proceedings of ICASSP, 2013.
[6] Y. Kim, "Convolutional neural networks for sentence classification," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, pp. 1746-1751, 2014.
[7] Y. Xiao and K. Cho, "Efficient character-level document classification by combining convolution and recurrent layers," arXiv preprint arXiv:1602.00367, 2016.
[8] A. Hassan and A. Mahmood, "Efficient Deep Learning Model for Text Classification Based on Recurrent and Convolutional Layers," in Machine Learning and Applications (ICMLA), 16th IEEE International Conference on, IEEE, pp. 1108-1113, 2017.
[9] K. Yoon, "Convolutional neural networks for sentence classification," arXiv preprint arXiv:1408.5882, 2014.
[10] X. Zhang, et al., "Character-level convolutional networks for text classification," in Advances in Neural Information Processing Systems, pp. 649-657, 2015.
[11] R. Johnson and T. Zhang, "Effective use of word order for text categorization with convolutional neural networks," arXiv preprint arXiv:1412.1058, 2014.
[12] M. Hughes, et al., "Medical text classification using convolutional neural networks," Stud Health Technol Inform, vol. 235, pp. 246-50, 2017.
[13] A. Rios and R. Kavuluru, "Convolutional neural networks for biomedical text classification: application in indexing biomedical articles," in Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics, ACM, pp. 258-267, 2015.
[14] Z. Zhang, et al., "Textual sentiment analysis via three different attention convolutional neural networks and cross-modality consistent regression," Neurocomputing, vol. 275, pp. 1407-1415, 2018.
[15] X. Wang and H. C. Kim, "Text Categorization with Improved Deep Learning Methods," Journal of Information and Communication Convergence Engineering, vol. 16, pp. 106-113, 2018.
[16] S. Moriya and C. Shibata, "Transfer learning method for very deep CNN for text classification and methods for its evaluation," in 2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC), IEEE, vol. 2, pp. 153-158, 2018.
[17] M. Nii, et al., "Nursing-care text classification using word vector representation and convolutional neural networks," in 2017 Joint 17th World Congress of International Fuzzy Systems Association and 9th International Conference on Soft Computing and Intelligent Systems (IFSA-SCIS), IEEE, pp. 1-5, 2017.
[18] H. Lidong and Z. Hui, "A new short text sentimental classification method based on multi-mixed convolutional neural network," in 2018 IEEE 3rd International Conference on Cloud Computing and Big Data Analysis (ICCCBDA), IEEE, pp. 93-99, 2018.
[19] K. Kowsari, et al., "Text classification algorithms: A survey," Information, vol. 10, pp. 150, 2019.
[20] Y. LeCun, et al., "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, pp. 2278-2324, 1998.
[21] K. Fukushima, "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, pp. 193-202, 1980.
[22] S. V. Georgakopoulos, et al., "Convolutional Neural Networks for Toxic Comment Classification," arXiv preprint arXiv:1802.09957, 2018.
[23] X. Pan, et al., "A comparison of neural network backpropagation algorithms for electricity load forecasting," in Intelligent Energy Systems (IWIES), 2013 IEEE International Workshop on, IEEE, pp. 22-27, 2013.
[24] L. Bottou, "Online learning and stochastic approximations," On-line Learning in Neural Networks, vol. 17, pp. 142, 1998.
[25] Jason Brownlee, "A Gentle Introduction to Mini-Batch Gradient Descent and How to Configure Batch Size," on Deep Learning, 2017, [Online], Available: https://machinelearningmastery.com/gentle-introduction-mini-batch-gradient-descent-configure-batch-size/ [extracted on 14 August 2018].
BIOGRAPHIES OF AUTHORS

Satyabrata Aich is working as a researcher in the field of computer engineering and digital healthcare. He has published many research papers in journals and conferences in the realms of machine learning, text mining, and supply chain management. His research interests are natural language processing, machine learning, supply chain management, text mining, and medical informatics.

Sabyasachi Chakraborty is working as a master's student at Inje University. He has worked on many real-life projects related to data mining and text mining. He has also published a few papers related to data analytics and big data. His research interests are natural language processing, machine learning, big data, and text mining.

Hee-Cheol Kim received his BSc at the Department of Mathematics and MSc at the Department of Computer Science at Sogang University in Korea, and his PhD in Numerical Analysis and Computing Science at Stockholm University in Sweden in 2001. He is a Professor at the Department of Computer Engineering and Head of the Institute of Digital Anti-aging Healthcare, Inje University in Korea. His research interests include machine learning, text mining, and medical informatics.