IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 10, No. 2, June 2021, pp. 316~323
ISSN: 2252-8938, DOI: 10.11591/ijai.v10.i2.pp316-323
Journal homepage: http://ijai.iaescore.com
Enhancing the performance of cancer text classification model based on cancer hallmarks
Noha Ali¹, Ahmed H. AbuEl-Atta², Hala H. Zayed³
¹ Department of Computer Science, Modern Academy for Computer Science and Management Technology, Egypt
¹,²,³ Department of Computer Science, Faculty of Computers & Artificial Intelligence, Benha University, Egypt
Article Info
Article history: Received Oct 25, 2020; Revised Feb 24, 2021; Accepted Mar 27, 2021

ABSTRACT
Deep learning (DL) algorithms have achieved state-of-the-art performance in computer vision, speech recognition, and natural language processing (NLP). In this paper, we enhance the convolutional neural network (CNN) algorithm to classify cancer articles according to cancer hallmarks. The model implements a recent word embedding technique in the embedding layer. This technique uses the concept of distributed phrase representation and multi-word phrase embedding. The proposed model enhances the performance of the existing model used for biomedical text classification. It outperforms the previous model, achieving an F-score of 83.87% using PMC vectors (PMCVec) embedding, an unsupervised technique trained on PubMed abstracts. We also ran another experiment on the same dataset using the recurrent neural network (RNN) algorithm with two different word embeddings, Google News and PMCVec, which achieved F-scores of 74.9% and 76.26%, respectively.
Keywords:
Biomedical text classification
Cancer hallmarks
CNN
Deep learning
NLP
Phrase embedding
PMCVec
RNN
This is an open access article under the CC BY-SA license.
Corresponding Author:
Noha Ali
Department of Computer Science
Modern Academy for Computer Science and Management Technology
304 Street, Maadi, Cairo, Egypt
Email: cs.noha.ali@gmail.com
1. INTRODUCTION
Cancer is a harmful disease that has led to millions of human deaths. In the biomedical literature, cancer is regularly described by its hallmarks: a group of related biological behaviors and properties that enable cancer to spread through the body. The major objective of cancer research is to understand the biological mechanisms by which tumors begin within the body, are sustained, and become malignant. Six hallmarks of cancer were first introduced in the seminal paper published in the journal Cell [1]; they were later extended by another four in [2], forming the set of cancer hallmarks known today. The existing set of hallmarks summarizes our knowledge of the disease as a fixed set of changes in cell physiology that influence malignant tumor growth (such as evasion of programmed cell death, self-sufficiency in growth signals, sustained angiogenesis, insensitivity to growth inhibitors, limitless replicative potential, and tissue invasion).
Over 150k cancer research articles are published yearly on PubMed. Cancer researchers and oncologists benefit enormously from text mining of biomedical information sources such as PubMed. In this paper, we enhance the performance of the classification model [3], which was used to classify PubMed articles based on the 10 hallmarks of cancer.
First, text classification tasks can be accomplished using machine learning (ML) or deep learning (DL) techniques, both of which fall under the umbrella of artificial intelligence (AI). DL techniques are able to capture features automatically from the text. On the other hand, ML techniques have to
be fed manually with the extracted features as input. This difference affects performance, making deep learning algorithms outperform ML techniques in the text classification task. Second, natural text in general differs from biomedical text in the following characteristics. A medical term may be written as an abbreviation; for example, "OR" denotes the outer root cell type, not the ordinary word "or". Also, and very importantly, a medical term may consist of a phrase or compound word, such as the protein name "hypoxia-inducible" or a symptom such as "high-blood-pressure". All of these characteristics may cause dispersion problems in classification [4].
Although high-quality vector space models such as Word2vec and GloVe are available, they provide only unigram word representations, so the semantics of multi-word phrases must be approximated through compositional approaches. In biomedical text processing, it is difficult to express technical phrases for symptoms, medications, and diseases as single words that capture the right meaning. To solve this problem, in this work we use a recent unsupervised technique based on multi-word (phrase) embedding, called PMC vectors (PMCVec) [5], which is specific to biomedical articles. It is used in preprocessing to extract distributed semantic phrases from cancer abstracts for better classification performance. PMCVec was implemented in the embedding layer of the convolutional neural network (CNN) algorithm used for biomedical text classification according to cancer hallmarks. We also show that changing the word embedding technique can improve classification performance, and we compare convolutional neural networks with recurrent neural networks on the same dataset using two different embedding concepts: unigram embedding and multi-word embedding.
DL algorithms and architectures have already made significant advances in the fields of speech recognition, computer vision, and natural language processing (NLP) [6]. CNN was first proposed for image processing in [7]; it is still in use and achieves strong results in various computer vision tasks such as object detection [8], image classification [9], medical image analysis [10], improving the performance of breast cancer detection [11], and many other image processing tasks. CNN has also been applied to speech recognition: for example, it was used to recognize baby cries, achieving an accuracy of 78.6% on 5 types of baby cries [12], and to recognize speech emotions [13].
The convolutional neural network (CNN) is also used in general NLP tasks, particularly text classification [14]. Many researchers have applied the CNN algorithm to detect the polarity of a text, which may be a sentence, a paragraph, or a document, that is, to detect whether the opinion expressed is positive, negative, or neutral; this task is called sentiment analysis. In [15], CNN was used for sentence-level classification: the authors applied 4 variants of the model to different datasets, and the algorithm improved results on four of seven tasks, including question classification and sentiment analysis. In biomedical natural language processing (Bio-NLP), the authors of [16] used rule-based features with a knowledge-guided convolutional neural network to classify clinical text. A convolutional neural network was also applied to clinical notes to categorize text fragments; the system [17] outperformed other ML approaches by almost 15%, with a training dataset of 4000 sentences and an accuracy of 68%. In [18], the authors achieved 54.79% accuracy when classifying biomedical abstracts from Ohsumed, using a dataset of 11,566 medical abstracts.
Furthermore, on the topic of cancer, the authors of [19] applied the ML algorithm support vector machine (SVM) to classify 1,852 biomedical abstracts according to the ten hallmarks of cancer with manual feature engineering, achieving an average F-score of 69.2% with a bag-of-words (BOW) approach; they then improved performance using a rich-features technique, achieving an F-score of 76.8%. The SVM results were then compared with the CNN algorithm in [3], which achieved an F-score of 76.6% using the Google News word vectors. The authors of [3] then modified the dataset, the filter sizes of the model, and the word embedding algorithm, which improved their model to an F-score of 81.0% with the Chiu-win-2 word vectors [20].
The paper is organized as follows: Section 2 describes the proposed method, the experimental setup, and the dataset used. Section 3 evaluates and discusses the proposed technique. Finally, Section 4 presents our conclusion.
2. RESEARCH METHOD
2.1. Model layers
The proposed model for classifying cancer articles based on cancer hallmarks is illustrated in Figure 1. The model consists of CNN layers: an embedding layer followed by one convolution layer, one max-pooling layer, and a dense layer. The input articles are pre-processed before entering the CNN layers. In the pre-processing step, we use PMCVec, which extracts useful phrases from the text by removing numbers and then chunking the sentences at predefined stop words. The phrases are first filtered based on frequency statistics, then ranked and filtered again by a ranking algorithm, Information Frequency (Info_Freq). Finally, the phrases are tagged with underscores. After preparing the data in the preprocessing phase,
the extracted phrases pass to the word embedding layer, which maps the vocabulary to vectors of real numbers using language modeling and feature learning methods from NLP.
Figure 1. Proposed model
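The pre-processing steps described above (number removal, chunking at stop words, frequency filtering, underscore tagging) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the PMCVec implementation: the stop-word list, the `min_count` threshold, and the function names `chunk_phrases` and `tag_phrases` are hypothetical, and the Info_Freq ranking step is omitted here.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; PMCVec relies on its own predefined list).
STOP_WORDS = {"the", "of", "in", "a", "an", "and", "is", "are", "by", "to", "with", "for"}

def chunk_phrases(sentence):
    """Split a sentence into candidate phrases at stop words, after dropping numbers."""
    text = re.sub(r"\d+", " ", sentence.lower())      # remove numbers
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text)  # keep words and hyphenated words
    phrases, current = [], []
    for tok in tokens:
        if tok in STOP_WORDS:
            if current:
                phrases.append(tuple(current))
            current = []
        else:
            current.append(tok)
    if current:
        phrases.append(tuple(current))
    return phrases

def tag_phrases(sentences, min_count=2):
    """Keep multi-word phrases seen at least min_count times and join them with underscores."""
    chunks = [chunk_phrases(s) for s in sentences]
    counts = Counter(p for sent in chunks for p in sent if len(p) > 1)
    tagged = []
    for sent in chunks:
        out = []
        for p in sent:
            if len(p) > 1 and counts[p] >= min_count:
                out.append("_".join(p))   # a frequent phrase becomes a single token
            else:
                out.extend(p)             # otherwise fall back to the individual words
        tagged.append(out)
    return tagged

example = tag_phrases([
    "Programmed cell death is impaired in 3 tumors",
    "Evasion of programmed cell death",
])
```

In this toy run, the phrase that occurs in both sentences is emitted as one underscore-joined token, which is the form a phrase-aware embedding lookup would expect.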
The quality of the word vectors can affect the overall quality of the text classification. Many word embeddings are publicly available, such as Google News, GloVe, and BioNLP. They are discussed in the survey paper [21] and were compared with PMCVec on five different datasets in [5]. The main differences between them are:
− Google News [22]: A popular embedding model regarded as state-of-the-art. It is a Word2Vec model trained on the Google News dataset, a general (non-biomedical) corpus. It is a 300-dimensional vector representation.
− GloVe [23]: Combines the power of the Word2Vec model with the effectiveness of global co-occurrence statistics. It is also trained on a general (non-biomedical) corpus, Wikipedia. It is a 300-dimensional vector representation.
− BioNLP [24]: Induced from PubMed, PMC, and their combination using the Word2vec model. It is a 200-dimensional vector representation.
− PMCVec [5]: Recent word-embedding vectors trained on PubMed articles that support both unigram word and multi-word phrase representations. It is a 200-dimensional vector representation.
Therefore, we selected PMCVec because it uses multi-word (phrase) embedding, whereas the other vectors use single-word embedding. As mentioned, biomedical terms, symptoms, and medications are usually written as phrases. PMCVec is therefore the better choice in our case, because the articles in the dataset are about cancer, which belongs to the medical domain.
After the word-embedding layer, the matrix containing the embedding values enters the convolution layer. The convolutional layer applies filters of the given sizes to the text, followed by the ReLU (rectified linear unit) activation function, and passes its results to the max-pooling layer as a 2D array. The max-pooling layer then reduces the features by keeping the maximum value within each pooling window. The model then flattens the 2-D array into 1-D, concatenates all the 1-D arrays, and passes the result to the fully connected (dense) layer, which serves as the output layer and decides whether the given article is positive or negative for the given hallmark. Algorithm 1 describes the steps of the proposed model.
Algorithm 1: Proposed model for cancer text classification based on cancer hallmarks using the CNN algorithm and PMCVec embeddings.
• Suppose that
    D = {label_1, label_2, ..., label_a}: set of 10 files, one for each hallmark.
    AbTr = {ab_1, ab_2, ..., ab_n}: set of n abstracts in the training dataset.
    AbTe = {ab_1, ab_2, ..., ab_n}: set of n abstracts in the testing dataset.
• Training Phase:
    Convert D into XML format.
    For each file in D
        For each Ab_i in AbTr
            1. Remove numbers and special characters.
            2. Identify noun phrases.
            3. Initial filtering by removing any single word that occurred only once.
            4. Rank using the Info_freq ranking algorithm, with this formula for two-word phrases:
               $$\mathrm{info\_freq}(A,B) = \log\frac{p(A,B)}{p(A)\,p(B)} \cdot \log\big(\mathrm{freq}(A,B)\big)$$
               and this formula for three-word phrases:
               $$\mathrm{info\_freq}(A,B,C) = \log\frac{p(A,B,C)}{\mathrm{info\_freq}(A,B)\,p(C)} \cdot \log\big(\mathrm{freq}(A,B,C)\big)$$
            5. Tag phrases and build the embedding matrix.
            6. Apply convolution on the embedding matrix using different filter sizes.
            7. Apply max-pooling on each feature map.
            8. Flatten (convert a 2D array into a 1D array).
            9. Apply a fully-connected layer with dropout.
            10. Save Ab_i in the trained module.
        End For
    End For
• Testing Phase:
    For each Ab_i in AbTe do
        While not EOF(Ab_i) do
            1. Load the trained module.
            2. Evaluate S_i on the trained module.
            3. Calculate the F-score of AbTe.
        End While
    End For
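The two-word Info_freq score in step 4 of Algorithm 1 multiplies a pointwise-mutual-information term by the log of the phrase frequency. The sketch below illustrates that formula only; it assumes probabilities are estimated as relative frequencies over a token list and uses the natural logarithm, neither of which is stated in the text, and the function name `info_freq_bigrams` is hypothetical.

```python
import math
from collections import Counter

def info_freq_bigrams(tokens):
    """Score each adjacent pair (A, B) as log(p(A,B) / (p(A) p(B))) * log(freq(A,B))."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (a, b), freq in bigrams.items():
        p_ab = freq / n_bi            # relative frequency of the pair (assumed estimator)
        p_a = unigrams[a] / n_uni     # relative frequency of each word (assumed estimator)
        p_b = unigrams[b] / n_uni
        scores[(a, b)] = math.log(p_ab / (p_a * p_b)) * math.log(freq)
    return scores

scores = info_freq_bigrams("cell death cell death tumor growth".split())
best_pair = max(scores, key=scores.get)
```

One consequence of the frequency factor is that any pair seen exactly once scores zero, since log(1) = 0, so the ranking naturally favours phrases that recur in the corpus.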
2.2. Setting model parameters
The proposed model is based on the simple CNN architecture of Kim [15]. The neural network (NN) was implemented using Keras [25] with TensorFlow as the backend. The proposed model consists of PMCVec in the embedding layer, followed by one convolution layer with various filter sizes, one max-pooling layer, and finally the output layer. We used the same model hyperparameters as the tuned version in S. Baker's work [3], except for the embedding layer: the filter sizes were 2, 3, and 4, the number of filters was 128, the dropout keep probability was 0.5, and lambda regularization was left at its default. The training parameters were a batch size of 64, 250 training epochs, and evaluation every 100 steps. The parameters are summarized in Table 1.
Table 1. Model parameters
Parameter              Value
Word Vector Size       200 (PMCVec)
Filter Sizes           2, 3, 4
Dropout Probability    0.5
Number of Filters      128
Batch Size             50
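To make the layer sequence concrete, the following is a minimal pure-Python sketch of a forward pass through the convolution, ReLU, max-pooling, and concatenation stages for the filter sizes listed in Table 1. It is not the authors' Keras implementation: the weights are random, the embedding dimension and number of filters are shrunk for readability, one pooled value per filter is assumed (as in Kim-style text CNNs), and all function names are illustrative.

```python
import random

def relu(x):
    return x if x > 0.0 else 0.0

def conv_max_pool(embedded, filt, bias):
    """Slide one filter (h x d weights) over the sentence matrix, apply ReLU,
    then keep the maximum over all positions as a single feature."""
    h = len(filt)                      # filter size = number of consecutive tokens covered
    best = 0.0                         # ReLU outputs are >= 0, so 0 is a safe floor
    for start in range(len(embedded) - h + 1):
        total = bias
        for i in range(h):
            for w, x in zip(filt[i], embedded[start + i]):
                total += w * x
        best = max(best, relu(total))
    return best

def extract_features(embedded, filter_sizes=(2, 3, 4), n_filters=4, seed=0):
    """Return the concatenated max-pooled features for all filters of all sizes."""
    rng = random.Random(seed)
    dim = len(embedded[0])
    features = []
    for h in filter_sizes:
        for _ in range(n_filters):
            filt = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(h)]
            features.append(conv_max_pool(embedded, filt, bias=0.0))
    return features   # length = len(filter_sizes) * n_filters

# Example: a 6-token text fragment with 5-dimensional embeddings (toy sizes).
rng = random.Random(1)
sentence = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(6)]
features = extract_features(sentence)
```

Under the one-value-per-filter assumption, the settings in Table 1 (three filter sizes with 128 filters each) would give a concatenated vector of 3 × 128 = 384 features feeding the dense layer.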
2.3. Dataset
The same corpus as in [19] was used for training and testing our model; it contains 1852 biomedical abstracts. The dataset was annotated by an expert with 15+ years of experience in cancer research. The task is multi-label classification: each abstract may be labeled with zero or more of the ten hallmarks. We split the dataset into 10 binary-labeled datasets (one for every hallmark). The positive samples in each dataset are the abstracts annotated with that hallmark, whereas the negative samples are those not annotated with it. The ten hallmarks are briefly described as follows:
− Sustaining proliferative signaling: Normal cells need molecules that act as signals for them to grow and divide. Cancer cells, on the other hand, are able to grow without these external signals.
− Evading growth suppressors: Non-cancer cells have processes that can stop cell growth or division. In cancer cells, these processes are altered so that they no longer prevent cell division effectively.
− Resisting cell death: Programmed cell death is a mechanism by which cells are programmed to die if damaged. Cancer cells, however, are able to override this mechanism.
− Enabling replicative immortality: Healthy cells die after a particular number of divisions. Cancer cells, however, are able to grow and divide endlessly.
− Inducing angiogenesis: Cancer cells are able to initiate angiogenesis, the process by which new blood vessels are formed, thereby ensuring a supply of oxygen and other nutrients.
− Activating invasion & metastasis: Cancer cells can break away from their site of origin to invade surrounding tissue and spread to distant body parts. Healthy cells do not break away.
− Genome instability & mutation: Cancer cells generally have severe chromosomal abnormalities, which worsen as the disease progresses.
− Tumor-promoting inflammation: Inflammation affects the microenvironment surrounding tumors, contributing to the proliferation, survival, and metastasis of malignant cells.
− Deregulating cellular energetics: Cancer cells mostly use abnormal metabolic pathways to generate energy, for example exhibiting glucose fermentation even when enough oxygen is available to respire properly.
− Avoiding immune destruction: Non-cancer cells are visible to the immune system. Cancer cells, however, are not.
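The split of the multi-label corpus into ten binary-labeled datasets, one per hallmark in the list above, can be sketched as follows. The input representation (a list of text and label-set pairs) and the function name `to_binary_datasets` are assumptions made for illustration; the corpus itself is distributed in its own format.

```python
HALLMARKS = [
    "Sustaining proliferative signaling",
    "Evading growth suppressors",
    "Resisting cell death",
    "Enabling replicative immortality",
    "Inducing angiogenesis",
    "Activating invasion & metastasis",
    "Genome instability & mutation",
    "Tumor-promoting inflammation",
    "Deregulating cellular energetics",
    "Avoiding immune destruction",
]

def to_binary_datasets(abstracts):
    """abstracts: list of (text, set_of_hallmark_labels) pairs.
    Returns {hallmark: [(text, 1 or 0), ...]}, one binary dataset per hallmark,
    where 1 means the abstract is annotated with that hallmark."""
    return {
        h: [(text, 1 if h in labels else 0) for text, labels in abstracts]
        for h in HALLMARKS
    }

# Toy example with three abstracts; an abstract may carry zero or several labels.
sample = [
    ("abstract a", {"Resisting cell death"}),
    ("abstract b", set()),
    ("abstract c", {"Inducing angiogenesis", "Resisting cell death"}),
]
datasets = to_binary_datasets(sample)
```

Every abstract appears in all ten datasets, so an abstract with no hallmark annotation is a negative sample everywhere.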
Furthermore, we changed the distribution of the dataset slightly compared with S. Baker's work [3]. We divided the annotated data into training, validation, and testing subsets (70% for training, 10% for validation, and 20% for testing) using a random sampling strategy. Table 2 shows the distribution of positive and negative samples for each hallmark.
Table 2. Dataset distribution
Hallmark   Train                 Validation            Test                  Total
           Positive   Negative   Positive   Negative   Positive   Negative   Positive   Negative
1st        328        975        43         140        91         275        462        1390
2nd        172        1131       22         161        46         320        240        1612
3rd        303        1000       42         141        84         282        429        1423
4th        81         1222       11         172        23         343        115        1737
5th        99         1204       13         170        31         335        143        1708
6th        208        1095       29         154        54         312        291        1561
7th        227        1076       38         145        68         298        333        1519
8th        169        1143       24         159        47         319        240        1612
9th        74         1229       10         173        21         345        105        1747
10th       77         1226       10         173        21         345        108        1744
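A random 70/10/20 split of the kind described above can be sketched as below. The function name, the fixed seed, and the rounding rule are assumptions; the exact subset sizes depend on rounding and on the random draw, so this sketch is not expected to reproduce the counts in Table 2.

```python
import random

def split_dataset(items, train=0.7, val=0.1, seed=42):
    """Shuffle a copy of items and split it into train/validation/test lists.
    The test set takes whatever remains after the train and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )

# Using integer ids as stand-ins for the 1852 abstracts in the corpus.
train_set, val_set, test_set = split_dataset(range(1852))
```

Fixing the seed keeps the split reproducible across runs, which matters when several models are compared on the same test subset.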
3. RESULTS AND DISCUSSION
First, we compare our model using PMCVec embedding with the CNN model (tuned version) of S. Baker [3]. Figure 2 shows that our method outperforms the previous model for each hallmark, and Table 3 compares the F-score percentages for each hallmark individually and the average F-score for both models.
Figure 2. F-score chart comparison for each hallmark using CNN
Table 3. CNN algorithm test result comparison using the F-score metric
No.   Hallmark                             S. Baker [3]   Proposed Model
1     Sustaining Proliferative Signaling   67.90%         71.60%
2     Evading Growth Suppressors           71.50%         75.80%
3     Resisting Cell Death                 86.70%         88.90%
4     Enabling Replicative Immortality     91.50%         94%
5     Inducing Angiogenesis                79.40%         82%
6     Activating Invasion & Metastasis     82.60%         85.70%
7     Genome Instability & Mutation        81.70%         83%
8     Tumor-Promoting Inflammation         84.20%         87.70%
9     Deregulating Cellular Energetics     88.30%         90%
10    Avoiding Immune Destruction          75.80%         80%
      Average F-score                      81%            83.87%
When evaluated on the test data, the results in Table 3 show that our model outperforms the existing model for each hallmark individually and on the overall average, obtaining an average F-score of 83.87%, which is higher than the 81% of the previous model. The proposed model improves on the existing model by between 1.3 and 4.3 percentage points for each hallmark individually and by almost 3 percentage points on the overall average. We expect that, with a larger dataset, the classification results using the multi-word embedding technique would improve further. The 4th hallmark yields the highest result, because the examples in the dataset are more relevant to this hallmark than to the others. The concept of multi-word embedding using PMCVec is therefore more effective than single-word embedding techniques: it improves the quality of the embedding and, in turn, the classification results.
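The averages and per-hallmark gains discussed above can be checked directly from the values in Table 3. The sketch below also includes the usual F1 definition (the harmonic mean of precision and recall); the text does not state which F-measure variant was used, so treating it as F1 is an assumption, and the function names are illustrative.

```python
def f1_score(tp, fp, fn):
    """F1 from raw counts: harmonic mean of precision and recall (assumed variant)."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_average(scores):
    """Unweighted mean over hallmarks."""
    return sum(scores) / len(scores)

# Per-hallmark F-scores (%) copied from Table 3, in hallmark order 1 to 10.
BAKER = [67.9, 71.5, 86.7, 91.5, 79.4, 82.6, 81.7, 84.2, 88.3, 75.8]
PROPOSED = [71.6, 75.8, 88.9, 94.0, 82.0, 85.7, 83.0, 87.7, 90.0, 80.0]

avg_baker = round(macro_average(BAKER), 2)
avg_proposed = round(macro_average(PROPOSED), 2)
gains = [round(p - b, 1) for p, b in zip(PROPOSED, BAKER)]
```

The unweighted mean of the proposed model's column reproduces the reported 83.87%, and the previous model's column averages to 80.96%, which the table rounds to 81%.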
A further experiment was performed on the same dataset using another DL algorithm, the RNN, to assess its performance on text classification in the biomedical domain. Table 4 presents the results of evaluating the test data using the RNN algorithm with two different word embedding techniques: a unigram word embedding, Google News (the default word vectors), and a phrase embedding, PMCVec. The results in Table 4 show that the RNN with PMCVec achieves an average F-score of 76.26%, outperforming the RNN with Google News, which obtains 74.9%. This is because Google News is trained on general text, whereas PMCVec is trained on biomedical text; in addition, phrase embeddings perform better than unigram embeddings. PMCVec thus gives better word embeddings, and this is reflected in the classification results. Still, both RNN results are lower than the CNN result, with its average F-score of 83.87%, as shown in Table 3.
Table 4. RNN algorithm comparison of test results using F-score
No.   Hallmark                             RNN (Google News)   RNN (PMCVec)
1     Sustaining Proliferative Signaling   66.0%               68.1%
2     Evading Growth Suppressors           67.4%               69.0%
3     Resisting Cell Death                 79.0%               82.1%
4     Enabling Replicative Immortality     82.0%               86.0%
5     Inducing Angiogenesis                74.8%               75.0%
6     Activating Invasion and Metastasis   72.0%               72.4%
7     Genomic Instability and Mutation     76.8%               77.2%
8     Tumor Promoting Inflammation         80.0%               80.3%
9     Cellular Energetics                  81.0%               82.0%
10    Avoiding Immune Destruction          70.0%               70.5%
      Average F-score                      74.9%               76.26%
Table 5 and Figure 3 compare the benchmark algorithms, namely the ML algorithms of [19] and the DL algorithm of [3], with the proposed model using the CNN and RNN algorithms on the same dataset. The comparison is made for each hallmark individually and for the average over all hallmarks, using the F-score metric.
Based on the comparison in Table 5 and Figure 3, we conclude that CNN with PMCVec embedding achieves the highest average F-score among the benchmark models in biomedical text classification. The CNN is therefore highly recommended for text classification in the biomedical natural language domain. In our case, PMCVec also yields higher F-scores than Google News with both the CNN and RNN algorithms, and higher F-scores than Chiu-win-2 with the CNN.
Table 5. Comparison between benchmark algorithms and the proposed model
Hallmark   ML (SVM + BoW)   ML (SVM + Rich features)   RNN (Google News)   RNN (PMCVec)   CNN (Google News)   CNN (Chiu-win-2)   CNN (PMCVec)
1st        70               67.4                       66                  68.1           66.3                67.9               71.6
2nd        53.3             65.3                       67.4                69             66.7                71.5               75.8
3rd        75.9             82.7                       79                  82.1           86.9                86.7               88.9
4th        73.1             90.9                       82                  86             91.2                91.5               94
5th        73.9             85.7                       74.8                75             74.8                79.4               82
6th        72.5             72.7                       72                  72.4           82                  82.6               85.7
7th        71.2             69.2                       76.8                77.2           72.2                81.7               83
8th        69.9             76.6                       80                  80.3           81.6                84.2               87.7
9th        78.1             85.7                       81                  82             76.6                88.3               90
10th       54.3             71.8                       70                  70.5           67.7                75.8               80
Average    69.22            76.8                       74.9                76.26          76.6                81                 83.87
Figure 3. Comparison between benchmark algorithms
4. CONCLUSION
In this paper, we proposed a model that enhances the performance of the CNN algorithm used to classify biomedical articles related to cancer according to the ten hallmarks of cancer, by applying a recent concept in the word embedding layer. This technique uses both uni-word and multi-word (phrase) embeddings instead of uni-word embeddings only, which suits the nature of medical text. The experimental results show that a phrase (multi-word) embedding technique such as PMCVec improves the performance of the existing model, achieving an average F-score of 83.87%, whereas the previous model, which uses uni-word embeddings, achieved 81%. We expect that a larger dataset would improve the classification performance further. The proposed model achieves a higher average F-score than the other ML and DL models compared. The results also show that the CNN performs better than the RNN in biomedical text classification.

Some directions for future work remain open. In addition to changing the word vectors, we can examine the effect of changing the optimizer, the filter sizes, and the number of filters. Using larger text corpora may also offer additional opportunities for enhancement.
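To illustrate the difference between uni-word and multi-word (phrase) embedding, the sketch below shows one possible way to turn known phrases into single tokens before embedding lookup, so that each phrase receives its own vector. It is a minimal sketch under stated assumptions: the function `merge_phrases`, the underscore-joining convention, and the example phrase set are hypothetical and do not describe how PMCVec itself extracts or stores phrases.

```python
def merge_phrases(tokens, phrases, max_len=3):
    """Greedily replace the longest known multi-word phrase with one token.

    tokens  : list of uni-word tokens
    phrases : set of known phrases, words joined by "_" (assumed convention)
    max_len : longest phrase length, in words, to look for
    """
    merged, i = [], 0
    while i < len(tokens):
        # Try the longest candidate first, down to two-word phrases.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = "_".join(tokens[i:i + n])
            if candidate in phrases:
                merged.append(candidate)
                i += n
                break
        else:
            # No phrase starts here; keep the uni-word token.
            merged.append(tokens[i])
            i += 1
    return merged


# Hypothetical phrase vocabulary, for illustration only.
phrases = {"cell_death", "tumor_suppressor_gene"}
tokens = "loss of the tumor suppressor gene prevents cell death".split()
merged = merge_phrases(tokens, phrases)
print(merged)
```

With a uni-word embedding, "tumor", "suppressor", and "gene" would be looked up as three separate vectors; after merging, "tumor_suppressor_gene" is a single token that can be mapped to one phrase vector in the embedding layer.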