Indonesian Journal of Electrical Engineering and Computer Science
Vol. 21, No. 2, February 2021, pp. 757~767
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v21.i2.pp757-767

Journal homepage: http://ijeecs.iaescore.com

Bangla language textual image description by hybrid neural network model

Md. Asifuzzaman Jishan1, Khan Raqib Mahmud2, Abul Kalam Al Azad3, Mohammad Rifat Ahmmad Rashid4, Bijan Paul5, Md. Shahabub Alam6
1,2,3,4,5 Department of Computer Science and Engineering, University of Liberal Arts Bangladesh, Dhaka, Bangladesh
6 Department of Computer Science and Engineering, Ahsanullah University of Science and Technology, Dhaka, Bangladesh

Article Info
Article history:
Received Jul 18, 2020
Revised Sep 20, 2020
Accepted Oct 4, 2020

ABSTRACT
Automatic image captioning in languages other than English is a challenging task that has not yet been well investigated, owing to the lack of datasets and effective models. It also requires a good understanding of the scene and a contextual embedding for robust semantic interpretation of images by a natural language image descriptor. To generate image descriptions in Bangla, we created a new Bangla dataset of images paired with target language labels, named the Bangla natural language image to text (BNLIT) dataset. To deal with image understanding, we propose a hybrid model based on the encoder-decoder architecture, and the model is evaluated on our newly created dataset. The proposed approach achieves a significant performance improvement on the task of semantic retrieval of images. Our hybrid model uses a convolutional neural network as the encoder, whereas a bidirectional long short term memory is used for the sentence representation, which decreases the computational complexity without trading off the exactness of the descriptor. The model yielded benchmark accuracy in recovering Bangla natural language, and we also conducted a thorough numerical analysis of the model performance on the BNLIT dataset.

Keywords:
Bangla natural language descriptors
Convolutional neural network
Hybrid recurrent neural network
Long short-term memory
Bi-directional recurrent neural network

This is an open access article under the CC BY-SA license.

Corresponding Author:
Md. Asifuzzaman Jishan
Department of Computer Science and Engineering
University of Liberal Arts Bangladesh
Dhanmondi, Dhaka-1209, Bangladesh
Email: jishan900@gmail.com

1. INTRODUCTION
A fundamental motivation of computational visual tasks is to imitate the remarkable human capability to recognize and comprehend visual information with astonishing speed and accuracy. For an artificial framework, emulating this ability in image description is not simply a matter of perceiving images; it is imperative to comprehend both the syntactic and the semantic content of the images. In other words, the task must include understanding the objects in the picture as well as the interactions among those objects [1-5]. Image description is essentially the language-based textual description of an image, and it has been an active field of research in computer vision and natural language processing [6-13]. Image captioning has drawn a lot of interest from researchers because of its many practical applications, such as text-based image search, image curation, assisting visually impaired individuals to better understand the real world, and image understanding in social media.
Whereas most studies of image caption generation are in English, we focus on generating captions in another language: Bangla (to 'Bengali' speaking people, the language is mainly known as 'Bangla'). Demographically, Bangla is one of the most widely spoken languages. It is spoken by more than 210 million people as a first or second language, with around 100 million Bengali speakers in Bangladesh, around 85 million in India, mainly in the regions of West Bengal, Assam, and Tripura, and sizable migrant communities in the United Kingdom, the United States, and the Middle East. Given the recent advances in natural language processing, this study aims at generating Bangla textual captions of contextual images to serve the Bangla-speaking community.
The motivational Figure 1 shows an example of model-generated image captioning, where the image is used to extract a single-sentence natural language description from the visual data. Even this straightforward caption reflects a deep understanding of the image in both grammatical and semantic terms: the objects and spatial content of the image (e.g. people and road) are associated semantically and tied to the activity of standing together. The perception of saliency in images can be culture dependent, so it is necessary to generate captions in different languages, which is referred to as cross-lingual image captioning [13, 14].

Figure 1. Extraction of a basic natural language description from visual information

To create an image captioning model, one of the main challenges is to create a dataset in the target language. We therefore first built a new target language dataset of reasonable size, named the BNLIT dataset, by annotating each image with a single annotation and having these annotations refined by experts. To the best of our knowledge, no dataset for image to Bangla caption generation is available in the public literature. Given the logical and functional significance of natural language description of images, the problem has been studied with both traditional and deep machine learning methods. Furthermore, the ever-growing number of image and video datasets raises the bar for computational efforts to produce linguistically and semantically viable natural language descriptions, which have been limited by templates and closed vocabularies.
In order to build an image caption generation model, it is imperative to improve the visual relevance of the generated descriptor, i.e., how well the model understands the image context and then how efficiently it generates descriptive sentences that are coherent with the image content. It is also important to consider how the contextual semantic embedding can be adapted to the different scenarios of an image. To circumvent these complexities in the captioning task, we propose a hybrid encoder-decoder model. The challenging part of encoder-decoder architectures is to design the interface that controls the information flow between the applied CNN [14], long short term memory (LSTM) [15], and bi-directional recurrent neural network (BRNN) [16, 17] components.
The main contributions of the paper are therefore: 1) creating a target language dataset; 2) building a Bangla caption generation model based on a hybrid encoder-decoder model; and 3) experimenting successfully with the proposed model on the task of semantic retrieval of images. The full version of the BNLIT dataset has already been uploaded and published in four different dataverses [18].

2. RELATED WORK
In computer vision, image classification, and image to text generation research, datasets play a critical role. The production of ground truth stereo and optical flow datasets [19, 20] stimulated a surge of interest in these areas. The early development of object recognition datasets [21-23] encouraged the direct comparison of several image recognition algorithms while at the same time pushing the field towards increasingly complex problems. Several datasets exist for image processing, e.g. Flickr8K, Flickr30K, MS COCO, and ImageNet. More recently, the ImageNet dataset [24], which contains a very large number of images, has enabled breakthroughs in both object classification and detection research using a new class of deep learning algorithms [24-26].

Image Classification
The main focus of image classification is to identify objects in images. Early datasets of this kind contained images of a single object against a clear background, for example the MNIST handwritten digits [24] or the COIL household objects. Caltech 101 [21] and Caltech 256 [22] marked the shift to more realistic object images retrieved from the web, while also expanding the number of object classes to 101 and 256, respectively. CIFAR-10 and CIFAR-100, which are popular in the AI, machine learning, and deep learning community because of their larger number of training examples, offered 10 and 100 classes from a dataset of tiny 32 × 32 images [26, 27]. More recently, ImageNet [28] made a striking departure from the gradual increase in dataset sizes.

Image to Text Generation
The main focus of image to text generation is to generate text from a given input image. AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks [28] and ChatPainter: Improving Text to Image Generation using Dialogue [29] were mainly focused on CNN features. Meanwhile, Grounded Compositional Semantics for Finding and Describing Images with Sentences, Exploring Models and Data for Image Question Answering [28-30], and DenseCap: Fully Convolutional Localization Networks for Dense Captioning [27] depend on a DT-RNN model for producing text from image regions. They likewise centered on using a semantic embedding framework and demonstrated how a neural network can operate on and distinguish image regions. They used the COCO-QA and DAQUAR datasets. They used the VGG-16 architecture for its state-of-the-art performance, but the reported result of the model was poor, at only 0.27. These papers are the state of the art for our work [30].

3. DATASET
Image collection is the most important step; it matters for many significant applications and is also challenging. We created a new dataset, named BNLIT, which contains 8,743 images. Flickr8K, Flickr30K, and MS COCO contain images from Western cultures, whereas we use only images that reflect the culture of our own country. The main challenge in creating a new dataset is collecting images from various sources. Because we chose images from a Bangladeshi perspective, we collected images of villages, rivers, humans, animals, shops, cows, dogs, fields, stations, and many more. We collected those images from different sources such as cell phone galleries, cameras, a university gallery, marriage functions, tour images, and internet sources.
Annotation is another important part of this Bangla dataset. We give one annotation for each image, and the caption language is Bangla. Image annotation is the method by which an automatic data processing (ADP) system automatically assigns metadata in the form of captions to a digital image. This application of computer vision techniques is employed in image retrieval systems to organize and find images of interest in a database. These techniques are often regarded as a type of multi-category image classification with a very large number of categories, as large as the vocabulary size. Typically, image analysis in the form of extracted feature vectors, together with the training annotation words, is used by machine learning techniques to attempt to automatically apply annotations to new images.
BNLIT contains 8,743 images covering different image classes. First, we need to classify the whole dataset. For classification we use 30 classes: cat, horse, dog, house, cow, window, village, human, town, chair, table, board, spoon, cake, mirror, bottle, pen, pencil, book, car, truck, sky, train, bus, aeroplane, bird, tree, fish, water, and flower. A larger dataset lets the model learn better and gives better accuracy, but in our country it is difficult to collect images for training the system. We use one sentence for each image. After collecting and annotating every image, all images of the dataset need to be resized: the dataset contains a large number of images with differing pixel dimensions, so before training we resized them all to the same dimensions. We wrote a Python script that resizes all images of the dataset and saves them in a new directory.
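The resizing step described above can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes the images are already decoded into row-major grids of pixel values (decoding and writing image files would be handled by an imaging library and is not shown), it uses nearest-neighbour sampling as a stand-in for whatever interpolation was actually used, and the function names `resize_nearest` and `resize_dataset` are hypothetical. The 224 × 224 default is the size the paper reports in its image processing section.

```python
def resize_nearest(pixels, new_width, new_height):
    """Resize a row-major grid of pixels with nearest-neighbour sampling."""
    old_height = len(pixels)
    old_width = len(pixels[0])
    resized = []
    for y in range(new_height):
        # Map the target row back to the closest source row.
        src_y = min(old_height - 1, y * old_height // new_height)
        row = []
        for x in range(new_width):
            # Map the target column back to the closest source column.
            src_x = min(old_width - 1, x * old_width // new_width)
            row.append(pixels[src_y][src_x])
        resized.append(row)
    return resized


def resize_dataset(images, size=224):
    """Bring every image in a {name: pixel grid} mapping to size x size."""
    return {name: resize_nearest(grid, size, size) for name, grid in images.items()}


if __name__ == "__main__":
    # Hypothetical 2 x 2 "image" upscaled to 4 x 4.
    tiny = [[1, 2], [3, 4]]
    for row in resize_nearest(tiny, 4, 4):
        print(row)
```

Resizing every image to one fixed size is what allows the images to be stacked into equally shaped batches for the CNN.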
Early "stuff" datasets focused on texture classification and contained simple images fully covered by a single textured patch. Each dataset has a specific number of images and classes. In Table 1, we compare our dataset with other existing datasets in terms of the number of classes and images. The MSRC dataset contains 591 images in 21 classes and the KITTI dataset contains 203 images, while CamVid and SIFT Flow contain 700 and 2,688 images respectively. Our BNLIT dataset uses 30 classes for 8,743 images.

Table 1. Overview of datasets with classes

Dataset          Images   Classes   Year
MSRC [31]           591        21   2006
KITTI [32]          203        14   2012
CamVid [33]         700        32   2008
SIFT Flow [34]    2,688        15   2009
Barcelona [35]   15,150        31   2010
ADE20K [36]      25,210     2,693   2017
BNLIT [18]        8,743        30   2019

4. HYBRID ENCODER-DECODER MODEL
Neural systems for the interpretation and handling of visual data are incorporated into computational frameworks to mimic the cognitive functions of the human brain. Our neural system has three basic parts: a convolutional neural network (CNN), a long short-term memory (LSTM), and a bi-directional recurrent neural network (BRNN). Our implemented model is illustrated in Figure 2.

Figure 2. Overview of our proposed model. First, an input image is processed by the CNN. After that, the regions are processed with a fully-connected recognition network and described with a BRNN and LSTM language model. The model is trained end-to-end with stochastic gradient descent

The convolutional neural network is an important tool for image processing and for classifying images with neural networks. The architecture of a CNN consists of an input layer, convolutional layers, pooling layers, fully connected layers, and an output layer [9-12]. The input layer has three dimensions: width, height, and depth. It is followed by the convolutional layer. Only a patch of the image is connected to each unit of the following convolutional layer, because connecting every pixel of the input to the convolutional layer would be too computationally expensive. The convolutional layer is followed by the pooling layer, which reduces the spatial dimensions of the data and the computational complexity of the model; in addition, it helps to control overfitting. The pooling layer is followed by fully connected layers, which connect each neuron in one layer to each neuron in the next layer. The last fully connected layer uses a softmax activation function to classify the extracted features of the input image into the different classes learned from the training dataset, and the result of this layer is the output [7, 25].
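The layer sequence described above can be illustrated with a small pure-Python sketch. It is not the network used in the paper: it assumes a single-channel image, one convolution kernel, non-overlapping 2 × 2 max pooling, and a softmax over arbitrary scores, and the function names are hypothetical. The point is only to show how each stage changes the shape of the data.

```python
import math


def conv2d(image, kernel):
    """Valid 2-D cross-correlation of a grayscale image with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Each output unit only sees a kh x kw patch of the input.
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; divides each spatial dimension by `size`."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[y * size + i][x * size + j]
                for i in range(size) for j in range(size))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to one."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    image = [[1.0] * 5 for _ in range(5)]      # hypothetical 5 x 5 input
    kernel = [[1.0, 1.0], [1.0, 1.0]]          # hypothetical 2 x 2 kernel
    features = conv2d(image, kernel)           # 4 x 4 feature map
    pooled = max_pool(features)                # 2 x 2 after pooling
    print(len(features), len(pooled), softmax([0.0, 0.0]))
```

A 5 × 5 input shrinks to 4 × 4 after the valid convolution and to 2 × 2 after pooling, which is the reduction in spatial dimensions that the pooling layer is credited with above.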
Long short-term memory (LSTM) is a special kind of RNN that is able to learn long-term dependencies. It is widely used because of its ability to remember information for long periods of time [4]. This is done with special modules that are designed to allow information to be gated in and gated out when needed. Unlike a traditional RNN, an LSTM stores information in a memory cell with a linear activation function [5, 6]. The LSTM has the capacity to remove or add information to the cell state, carefully regulated by structures called gates. Gates are a way to optionally let information through. They are composed of a sigmoid neural network layer and a pointwise multiplication operation [7].
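The gating mechanism described above can be sketched for a single scalar LSTM cell. This is the standard textbook formulation with forget, input, and output gates, assumed here for illustration; the paper does not give its exact cell equations, and the weight layout in `w` and the function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell.

    `w` maps each gate name to a tuple (input weight, recurrent weight, bias).
    """
    def pre_activation(gate):
        w_x, w_h, bias = w[gate]
        return w_x * x + w_h * h_prev + bias

    forget = sigmoid(pre_activation("forget"))        # how much old state to keep
    write = sigmoid(pre_activation("input"))          # how much new content to add
    read = sigmoid(pre_activation("output"))          # how much state to expose
    candidate = math.tanh(pre_activation("candidate"))

    # The cell state is updated additively; the gates scale it pointwise.
    c = forget * c_prev + write * candidate
    h = read * math.tanh(c)
    return h, c


if __name__ == "__main__":
    zeros = {g: (0.0, 0.0, 0.0) for g in ("forget", "input", "output", "candidate")}
    print(lstm_step(1.0, 0.0, 2.0, zeros))
```

With a forget gate close to one and an input gate close to zero, the cell state passes through a step almost unchanged, which is how the cell can carry information over long spans.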
A bi-directional recurrent neural network (BRNN) is a variant of the RNN in which every element of the sequence is labeled based on both its past and its future context. A BRNN does this by concatenating the outputs of two RNNs, one processing the sequence from left to right and the other from right to left. Combined with LSTM units, it helps to avoid the vanishing gradient problem, which is a common problem for plain RNN models [7, 8].
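A minimal sketch of the bidirectional idea follows. It assumes scalar inputs and a plain tanh RNN for each direction, with hypothetical weights and function names; the model in the paper uses LSTM units and learned parameters, so this only illustrates how past and future context are paired at every position.

```python
import math


def rnn_pass(inputs, w_x, w_h):
    """Run a scalar tanh RNN over `inputs`; return the hidden state per step."""
    h = 0.0
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states


def bidirectional(inputs, fwd=(0.5, 0.1), bwd=(0.5, 0.1)):
    """Pair each position's left-to-right state with its right-to-left state."""
    forward = rnn_pass(inputs, *fwd)
    # Process the reversed sequence, then flip back so positions line up.
    backward = rnn_pass(inputs[::-1], *bwd)[::-1]
    return list(zip(forward, backward))


if __name__ == "__main__":
    for position, (past, future) in enumerate(bidirectional([1.0, 0.0, 0.0])):
        print(position, past, future)
```

For the input `[1.0, 0.0, 0.0]` the forward state at the last position still carries a trace of the first element, while the backward state there is exactly zero because nothing follows it.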

5. SIMULATION
5.1. Image processing
For image processing, we first resize all images of the dataset so that every image has the same pixel dimensions. The images are color images with pixel values ranging from 0 to 255 and dimensions of 224 × 224, so it is vital to preprocess the data before feeding it into the model. First, we classify the full dataset into 30 classes using a CNN with VGG16 features. The BRNN is mainly used for generating text from the given input images. Finally, we combine the CNN, LSTM, and BRNN components on the features of our dataset and train the full model. We then evaluate the trained model on the dataset.
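The paragraph above states that preprocessing is required but does not say which transformation is applied. As a hedged illustration, the sketch below assumes one common choice, scaling the 0 to 255 pixel values to the range [0, 1], and checks the expected 224 × 224 × 3 layout. The function names are hypothetical, and the actual preprocessing used by the authors may differ.

```python
def scale_pixels(image):
    """Scale an H x W x C image of 0..255 values to floats in [0, 1]."""
    return [
        [[channel / 255.0 for channel in pixel] for pixel in row]
        for row in image
    ]


def image_shape(image):
    """Return (height, width, channels) of a nested-list image."""
    return len(image), len(image[0]), len(image[0][0])


if __name__ == "__main__":
    # Hypothetical all-white colour image of the size reported in the paper.
    white = [[[255, 255, 255] for _ in range(224)] for _ in range(224)]
    scaled = scale_pixels(white)
    print(image_shape(scaled), scaled[0][0])
```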

5.2. Implementation
Representing the image is the most vital part of image processing, and we drew many ideas from reviewing several recent works [22]. We observe that sentence descriptions make frequent references to objects and their attributes [23]. The CNN is pre-trained on ImageNet [24, 25] and fine-tuned on the two hundred classes of the ImageNet Detection Challenge [26]. We follow the approach of detecting objects in every image with a region-based convolutional neural network (RCNN). Following [7], we use the top nineteen detected regions in addition to the whole image, and compute the representation from the pixels inside each bounding box as follows:

$$v = W_m\left[\mathrm{CNN}_{\theta_c}(I_b)\right] + b_m \qquad (1)$$

Here $\mathrm{CNN}_{\theta_c}(I_b)$ transforms the pixels inside the bounding box $I_b$ into the 4096-dimensional activations of the fully connected layer immediately before the classifier. The CNN parameters $\theta_c$ comprise around 60 million parameters. The matrix $W_m$ has dimensions $h \times 4096$, where $h$ is the size of the multimodal embedding space. Each image is thus represented as a set of $h$-dimensional vectors.
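The projection in (1) is an affine map from the CNN activations to the embedding space, which the following sketch makes concrete. It assumes the CNN activations for each region are already available as plain lists of numbers; the toy dimensions (3 activations projected to h = 2) stand in for the 4096 and h of the paper, and the function names are hypothetical.

```python
def embed_region(cnn_activations, w_m, b_m):
    """Compute v = W_m x + b_m for one region's CNN activation vector x."""
    return [
        sum(weight * x for weight, x in zip(row, cnn_activations)) + bias
        for row, bias in zip(w_m, b_m)
    ]


def embed_image(region_activations, w_m, b_m):
    """Represent an image as one embedded vector per detected region."""
    return [embed_region(x, w_m, b_m) for x in region_activations]


if __name__ == "__main__":
    # Toy sizes: 3-dimensional activations projected to h = 2.
    w_m = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 1.0]]
    b_m = [0.5, -1.0]
    regions = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
    print(embed_image(regions, w_m, b_m))
```

Each row of `w_m` produces one coordinate of the embedded vector, so a matrix with h rows and 4096 columns yields an h-dimensional vector per region, as stated above.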
Representing the sentence is also a crucial part of our research. We use a BRNN [6, 7] to compute the word representations. A BRNN is a kind of RNN that uses a finite sequence to predict or label every element of the sequence based on its past and future context. In our model, the BRNN takes a sequence of N words and transforms each one into an h-dimensional vector.
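Before a BRNN can map N words to vectors, each caption has to be turned into a fixed-length sequence of word indices. The paper does not describe this step, so the sketch below is an assumption about a typical preprocessing pipeline: whitespace tokenization, special `<start>`, `<end>`, and `<pad>` tokens, and padding to length N. The two Bangla captions are hypothetical examples and are not taken from BNLIT.

```python
def build_vocab(captions):
    """Assign an integer index to every distinct word in the captions."""
    vocab = {"<pad>": 0, "<start>": 1, "<end>": 2}
    for caption in captions:
        for word in caption.split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(caption, vocab, length):
    """Encode a caption as exactly `length` indices, padding on the right."""
    ids = [vocab["<start>"]] + [vocab[w] for w in caption.split()] + [vocab["<end>"]]
    ids = ids[:length]  # longer captions are truncated
    return ids + [vocab["<pad>"]] * (length - len(ids))


if __name__ == "__main__":
    captions = ["একটি বিড়াল", "একটি কুকুর"]  # hypothetical: "a cat", "a dog"
    vocab = build_vocab(captions)
    print(encode(captions[1], vocab, 6))
```

Each index would then be looked up in an embedding table and passed through the BRNN to obtain the h-dimensional word vectors described above.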

5.3. Optimization
We used stochastic gradient descent (SGD) to optimize the CNN part, with mini-batches of 16 image-sentence pairs. We use a learning rate of 0.01, a decay rate of 1e-6, momentum = 0.9, and nesterov = True. We cross-validate the learning rate and the weight decay. We also use dropout regularization in all layers except the recurrent layers [21]. The loss is measured with the categorical cross-entropy loss, and accuracy is measured with the precision metric. The generative BRNN is more difficult to optimize because of the difference in word frequency between rare words and common words. For the BRNN and LSTM parts of the Bangla caption generation model, we use the Adam optimizer.
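The hyperparameters listed above (learning rate 0.01, decay 1e-6, momentum 0.9, Nesterov enabled) can be made concrete with a scalar sketch of one update step. The paper does not name its framework or the exact update rule; the time-based decay `lr / (1 + decay * step)` and the look-ahead form used here follow the classic Keras SGD formulation and are an assumption. The function names are hypothetical.

```python
import math


def sgd_nesterov_step(param, grad, velocity, step,
                      lr=0.01, decay=1e-6, momentum=0.9):
    """One SGD update with time-based decay and Nesterov momentum (scalar)."""
    lr_t = lr / (1.0 + decay * step)                     # decayed learning rate
    velocity = momentum * velocity - lr_t * grad         # update the velocity
    param = param + momentum * velocity - lr_t * grad    # Nesterov look-ahead
    return param, velocity


def categorical_cross_entropy(probs, target_index):
    """Loss for one example given predicted class probabilities."""
    return -math.log(probs[target_index])


if __name__ == "__main__":
    # Minimise f(x) = x^2 (gradient 2x) as a toy stand-in for the network loss.
    x, v = 5.0, 0.0
    for t in range(200):
        x, v = sgd_nesterov_step(x, 2.0 * x, v, t)
    print(x, categorical_cross_entropy([0.25, 0.75], 1))
```

With such a small decay value the learning rate barely changes over a few thousand updates, so in this setting the momentum term has far more influence on the trajectory than the decay schedule.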

6. RESULTS AND DISCUSSION
We implemented a hybrid neural network framework that is capable of generating a full Bangla sentence from a given input image. We first look at the CNN features, which are very important for image classification. After that, we turn to the BRNN and LSTM portion, which generates Bangla text from the given image.

6.1. Encoder model: convolutional neural network
In this part, we discuss the results of the CNN implementation on the BNLIT dataset. Figure 3 shows the training accuracy and validation accuracy of the CNN against the epoch, plotted for the whole dataset. We ran 10 epochs with a batch size of 16. The CNN already reached good accuracy on the dataset from the first epoch of training. We also plotted accuracy against loss and validation accuracy against validation loss during CNN classification training. After 8 epochs, we obtained a training accuracy of 0.794538, which is the best CNN accuracy for this dataset. We obtained a validation accuracy of 0.782161 on the BNLIT dataset; because this is a new, self-made dataset, this is the benchmark result for the CNN part.

Figure 3. Graphical representation of training and validation accuracy for image classification by the CNN part

6.2. Decoder model: bidirectional long short term memory
After the CNN, we discuss the results of the BRNN and LSTM implementation on the BNLIT dataset. Figure 4 shows the training accuracy against the epoch, and Figure 5 shows the training loss against the epoch for the BRNN and LSTM, both plotted for the whole dataset. We also examined accuracy against loss for the BRNN and LSTM during training. After 50 epochs, we obtained an accuracy of 0.8739, which is the best BRNN and LSTM accuracy for this dataset and the benchmark result for the BNLIT dataset. We used a batch size of 128 when training the BRNN and LSTM on the BNLIT dataset.

Figure 4. Graphical representation of BRNN and LSTM part of BNLIT dataset result (epoch vs. training time accuracy)
Fig
u
r
e
5
.
Gr
ap
h
ical
r
ep
r
esen
ta
tio
n
o
f
B
R
NN
a
n
d
L
ST
M
p
ar
t o
f
B
NL
I
T
d
ataset
r
esu
lt
(
ep
o
ch
v
s
.
tr
ain
i
n
g
ti
m
e
lo
s
s
)
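As a rough worked example of the training budget implied by a batch size of 128 and 50 epochs, the following sketch counts mini-batches. It assumes that all 8,743 BNLIT images are seen in every epoch, which the text does not state explicitly, and the function name is purely illustrative.

```python
import math

def updates_for_run(num_samples, batch_size, epochs):
    """Return (batches per epoch, total batches over the whole run)."""
    # The last batch of an epoch may be partial, hence the ceiling.
    per_epoch = math.ceil(num_samples / batch_size)
    return per_epoch, per_epoch * epochs

# Assumption: the full 8,743-image dataset is used in each epoch.
per_epoch, total = updates_for_run(num_samples=8743, batch_size=128, epochs=50)
print(per_epoch, total)  # 69 3450
```

Under that assumption, each epoch consists of 68 full batches plus one partial batch of 39 images.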
6.3. Hybrid model to generate text
We generated a pickle file from the whole dataset, which contains 8,743 images, and ran 25 epochs for the final training. Each epoch took approximately 1 hour 20 minutes. Accuracy reached 0.942546 on training and 0.758651 on validation, with a loss of approximately 0.197432 on training and 1.615326 on validation. We treat these as the benchmark result. After all epochs were complete, we plotted training against validation accuracy (Figure 6) and training against validation loss (Figure 7). The model was trained for 25 epochs to reduce its loss. The training accuracy was 0.8128 in the first epoch and rose to 0.8296 in the second. Because the accuracy had improved over the first epoch, the model generated at that point was saved in a specific directory.
Figure 6. Graphical representation of training and validation accuracy during final training
Figure 7. Graphical representation of training and validation loss during final training
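Section 6.3 mentions saving the generated model to a directory once accuracy improves over the previous epoch. A minimal sketch of that idea follows. It is not the authors' training code: the file name `best_model.pkl`, the helper `save_if_improved`, and the placeholder dictionary standing in for model weights are all assumptions, and the third accuracy value is invented to show an epoch that does not trigger a save.

```python
import os
import pickle
import tempfile

def save_if_improved(model_state, accuracy, best_accuracy, directory):
    """Pickle model_state into directory when accuracy beats best_accuracy.

    Returns the (possibly updated) best accuracy.
    """
    if accuracy > best_accuracy:
        path = os.path.join(directory, "best_model.pkl")  # hypothetical file name
        with open(path, "wb") as fh:
            pickle.dump({"accuracy": accuracy, "state": model_state}, fh)
        return accuracy
    return best_accuracy

# Per-epoch training accuracies: the first two are quoted in the text,
# the third is a made-up lower value so that no save happens for it.
history = [0.8128, 0.8296, 0.8201]

with tempfile.TemporaryDirectory() as ckpt_dir:
    best = 0.0
    for epoch, acc in enumerate(history, start=1):
        # A placeholder dict stands in for real model weights.
        best = save_if_improved({"epoch": epoch}, acc, best, ckpt_dir)
    with open(os.path.join(ckpt_dir, "best_model.pkl"), "rb") as fh:
        saved = pickle.load(fh)

print(saved["state"]["epoch"], saved["accuracy"])  # 2 0.8296
```

Keeping only the best checkpoint means the file on disk always corresponds to the highest accuracy seen so far, whatever happens in later epochs.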
6.4. Model evaluation
We assessed the capacity of the hybrid deep learning model by investigating how well it can create a reasonable description of the test images. We trained our model to learn the correspondence between parts of the images and the relevant parts of the sentences. We use BLEU and METEOR scores to evaluate the performance of our model. These metrics produce a score that measures how reasonable the image descriptions are: the intuition is to quantify how closely a model-generated sentence matches the reference sentences given in the dataset. We trained our model on the BNLIT dataset and evaluated its full-image predictions on 1000 test images. The BLEU-1, 2, 3, 4 scores and METEOR scores are reported in Table 2 for hidden layer sizes of 64, 128, 256, and 512.
Table 2. BLEU scores and METEOR score for BNLIT dataset
Hidden Layer Size   BLEU-1   BLEU-2   BLEU-3   BLEU-4   METEOR
64                  64.5     45.6     31.8     22.1     19.613227
128                 63.8     42.3     30.4     19.6     18.625489
256                 64.8     46.5     32.3     22.9     19.683625
512                 64.9     46.8     33.1     23.3     19.968532
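To make the BLEU-n columns concrete, here is a minimal sentence-level sketch of the metric: clipped n-gram precision, a brevity penalty, and a geometric mean with uniform weights. It is an illustration under stated assumptions, not the evaluation tooling used for Table 2. It returns values between 0 and 1, whereas the table appears to use a 0 to 100 scale, and the English example sentences are invented placeholders rather than BNLIT captions.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    """Clipped n-gram precision of the candidate against the references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())

def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU with uniform weights over 1..max_n grams."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # no smoothing: any zero precision gives a zero score
    c = len(candidate)
    # Reference length closest to the candidate length (ties: shorter).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)

# Invented placeholder sentences, not taken from the dataset.
candidate = "the cat sat on the mat".split()
references = ["the cat is on the mat".split()]
print(round(bleu(candidate, references, max_n=1), 4))  # 0.8333
print(round(bleu(candidate, references, max_n=2), 4))  # 0.7071
```

With no smoothing, BLEU-4 for this short pair is 0 because no 4-gram matches, which is one reason corpus-level aggregation is usually preferred when reporting results.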
6.5. Discussion
We implemented our hybrid model using the BNLIT dataset and observed that it performs well on this self-made Bangla dataset. In the classification stage, a CNN based on VGG16 shows how well image classification can be learned from the new dataset: we obtained a training accuracy of 0.794538 and a validation accuracy of 0.782161 on the BNLIT dataset. The BRNN and LSTM stage reached a training accuracy of 0.8739. We then combined both models and trained on the full dataset again, and the accuracy finally reached 0.942546 on training and 0.758651 on validation. Figure 8 and Figure 9 show examples of the text generated from a given input image. Finally, our evaluation results are presented in Table 2.
Figure 8. Example of a sentence predicted by our model, showing how accurate the generated Bangla text can be
Figure 9. Example of a sentence predicted by our model. For each test image, we obtained the best-matching sentence
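The scores in Table 2 can also be compared programmatically. The short sketch below transcribes the table values and picks the best hidden layer size per metric. The helper name `best_size` is an assumption made for illustration.

```python
# Scores transcribed from Table 2, keyed by hidden layer size.
table2 = {
    64:  {"BLEU-1": 64.5, "BLEU-2": 45.6, "BLEU-3": 31.8, "BLEU-4": 22.1, "METEOR": 19.613227},
    128: {"BLEU-1": 63.8, "BLEU-2": 42.3, "BLEU-3": 30.4, "BLEU-4": 19.6, "METEOR": 18.625489},
    256: {"BLEU-1": 64.8, "BLEU-2": 46.5, "BLEU-3": 32.3, "BLEU-4": 22.9, "METEOR": 19.683625},
    512: {"BLEU-1": 64.9, "BLEU-2": 46.8, "BLEU-3": 33.1, "BLEU-4": 23.3, "METEOR": 19.968532},
}

def best_size(scores, metric):
    """Hidden layer size with the highest value for the given metric."""
    return max(scores, key=lambda size: scores[size][metric])

metrics = ["BLEU-1", "BLEU-2", "BLEU-3", "BLEU-4", "METEOR"]
winners = {metric: best_size(table2, metric) for metric in metrics}
print(winners)  # 512 for every metric

# Absolute BLEU-4 gain when going from 64 to 512 hidden units.
gain = round(table2[512]["BLEU-4"] - table2[64]["BLEU-4"], 1)
print(gain)  # 1.2
```

On these numbers the 512-unit configuration scores highest on every metric and the 128-unit configuration lowest, although the spread between 64, 256, and 512 units is small.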
7. CONCLUSION
In this study, a complex hybrid neural network model is proposed, which demonstrates exceptional capacity to create a single-sentence Bangla natural language description of a given test image. The model is capable of detecting images with embedded multimodal and semantic complexities, and is able to generate a natural language description based on the context of the images. Our methodology incorporates modifications to the model to capture the visual and language modalities by employing effective LSTM and BRNN counterparts. Moreover, we report acceptable performance and accuracy on our self-made dataset. Our experiments with the model show that better performance across a wider range of datasets may be achieved through model fine-tuning and architectural augmentation.
BIOGRAPHIES OF AUTHORS
Md. Asifuzzaman Jishan completed a Bachelor of Science in Computer Science and Engineering in the Department of Computer Science and Engineering at the University of Liberal Arts Bangladesh (ULAB). He has expertise in the C, Java, Python, MATLAB, and C++ programming languages. He also has working knowledge of web technologies, including HTML, CSS, JavaScript (JS), and the Laravel framework, and of database systems. His research has resulted in one article published in an international journal, another published at an international conference, and one full image dataset published in four different data repositories. His research interests include image processing, artificial intelligence, machine learning, and neural systems.
Khan Raqib Mahmud is currently working as a lecturer in the Department of Computer Science and Engineering at the University of Liberal Arts Bangladesh (ULAB). He completed a Bachelor of Science (Honors) and a Master of Science in Mathematics at Shah Jalal University of Science and Technology, Bangladesh. He received an Erasmus Mundus Scholarship from the Education, Audiovisual and Culture Executive Agency of the European Commission to pursue a double Master of Science degree in Computer Simulation for Science and Engineering and Computational Engineering, in Germany and Sweden. He was an MSc thesis student in the Computational Technology Laboratory of the Department of High Performance Computing and Visualization at KTH Royal Institute of Technology, Sweden. His current research interests include machine learning and pattern recognition, image processing and computer vision, and adaptive dynamic systems.
Abul Kalam Al Azad received his PhD in Applied Mathematics from the University of Exeter, United Kingdom, and his Master of Science in Theoretical Physics and Bachelor of Science in Physics from the University of Dhaka. He is currently an Associate Professor at the Department of Computer Science and Engineering, University of Liberal Arts Bangladesh (ULAB). Previously, he undertook post-doctoral research at the Department of Computing and Mathematics, University of Plymouth, United Kingdom, and the School of Biological Sciences, University of Bristol, United Kingdom, on a BBSRC fellowship. His research interests include theoretical and computational neuroscience, connectomics, multi-timescale dynamics, self-organized criticality (SOC), and artificial intelligence. He has published a number of papers in peer-reviewed international journals and presented original research articles at numerous international conferences.
Mohammad Rifat Ahmmad Rashid is serving as an Assistant Professor in the Department of Computer Science and Engineering of ULAB. Before joining ULAB, he worked as a researcher in the Pervasive Technologies Research Area within the IoT Service Management Unit at the LINKS Foundation, Italy. He received his Ph.D. degree from the Polytechnic University of Turin, Italy, in 2018 with a focus on empirical software engineering. His research interests include energy consumption analysis, model-based process optimization, and data quality analysis.