Indonesian Journal of Electrical Engineering and Computer Science
Vol. 25, No. 2, February 2022, pp. 952~962
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v25.i2.pp952-962
Journal homepage: http://ijeecs.iaescore.com

Dynamic hand gesture recognition of Arabic sign language by using deep convolutional neural networks

Mohammad H. Ismail, Shefa A. Dawwd, Fakhrulddin H. Ali
Department of Computer Engineering, College of Engineering, University of Mosul, Iraq

Article Info

Article history:
Received Aug 28, 2021
Revised Nov 20, 2021
Accepted Dec 9, 2021

ABSTRACT
In computer vision, recognizing human gestures in videos is one of the most difficult problems because of irrelevant environmental variables. Single deep networks have been used to learn spatio-temporal characteristics from video data, but this approach is still insufficient to handle both problems at the same time. As a result, researchers have fused various models to allow for the effective collection of important shape information as well as the precise spatio-temporal variation of gestures. In this study, we collected a dynamic dataset of twenty meaningful words of Arabic sign language (ArSL) using a Microsoft Kinect v2 camera. The recorded data included 7,350 red, green, and blue (RGB) videos and 7,350 depth videos. We proposed four deep neural network models using 2D and 3D convolutional neural networks (CNN) to cover all feature extraction methods, and then passed these features to a recurrent neural network (RNN) for sequence classification. Long short-term memory (LSTM) and gated recurrent unit (GRU) are the two types of RNN used. The research also evaluated fusion techniques for several types of multiple models. The experimental results show that the best multi-model for ArSL recognition on the dynamic dataset achieved 100% accuracy.

Keywords:
Arabic sign language
Convolutional neural network
Deep learning
Dynamic hand gesture
Multi-model

This is an open access article under the CC BY-SA license.

Corresponding Author:
Mohammad H. Ismail
Department of Computer Engineering, College of Engineering
University of Mosul, Mosul, Iraq
Email: mohammad.haqqi@gmail.com

1. INTRODUCTION
People often use hand gestures to communicate their thoughts and feelings. People who are deaf or hard of hearing have traditionally depended on sign language to communicate. Most hearing people do not know this language and have difficulty communicating with deaf people. As a result, an automatic sign language recognition system must be developed to ease communication and close the gap. The organized form of hand movements in sign language aids nonverbal communication between deaf and hard of hearing people. Many words in sign languages have a complicated structure, comparable to that of spoken languages. Handshape, position, direction, movement, and facial expressions all contribute to the expression of sign language movements [1]. Sign language recognition may be split into two categories: static gesture recognition, which focuses on fingerspelling, and dynamic gesture recognition, which includes single-word and continuous sentence recognition. For full sentence recognition, most continuous sign language systems use an enhanced version of the single-word framework [2]. Hand segmentation/detection, hand shape feature representation, and sequence classification are three difficulties faced by independent dynamic sign language recognition systems. The majority of sign language recognition systems fall into two categories: glove-based and image-based methods [3], [4].
To use the glove-based method, signers must wear an electronic sensor glove that monitors and detects hand and finger movements. The main disadvantage of this method is that the signer must execute the signs while wearing a heavy glove with a pair of sensors. The vision-based method needs no gloves or sensors: a camera records the hand gesture as sequential or fixed images. Although it faces numerous difficulties, such as background variations, skin colour, lighting conditions, and the characteristics and settings of the camera, this method is more suitable for deaf and mute people in their everyday lives and simpler for the signer [4]. However, segmenting the hands and fingers in this technique requires a lot of processing. To make the segmentation process easier, the signer may be asked to wear coloured cotton gloves.
Recently, a method based on the Microsoft Kinect (MK) device was presented. MK is a motion-sensing device that Microsoft initially designed to improve the user experience in video gaming and entertainment [5]. The Kinect V2 sensor consists of colour and infrared (IR) cameras. The colour camera outputs a high-resolution red, green, and blue (RGB) video stream with a resolution of 1920x1080 pixels. The IR camera captures the modulated infrared light sent out by the IR projector/emitter and outputs depth images/maps that give the distance between the sensor and each point of the scene, based on the time-of-flight (ToF) and intensity modulation technique. The depth images are encoded using 16 bits and have a resolution of 512x424 pixels [6]. In addition, the Kinect V2 sensor provides information about the signer's skeleton through 25 joint points, and it is equipped with a microphone array to capture sound. Data from this device has been examined in different fields of study, but it has been used only to a limited extent for gesture recognition of Arabic sign language (ArSL). Therefore, we used the MK device to recognize ArSL in this study.
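As a rough illustration of the data volumes implied by the resolutions quoted above, the following sketch computes the raw size of one colour frame and one depth frame. The 16-bit depth encoding comes from the text; the 3 bytes per colour pixel (8 bits per channel) is an assumption, since the text does not state the colour bit depth.

```python
# Raw (uncompressed) frame sizes for the two Kinect V2 streams described above.
# Assumption: the colour stream stores 8 bits per channel (3 bytes per pixel).

RGB_WIDTH, RGB_HEIGHT = 1920, 1080      # colour resolution from the text
DEPTH_WIDTH, DEPTH_HEIGHT = 512, 424    # depth resolution from the text
RGB_BYTES_PER_PIXEL = 3                 # assumed: 8-bit R, G and B
DEPTH_BYTES_PER_PIXEL = 2               # 16-bit depth values (from the text)


def frame_bytes(width: int, height: int, bytes_per_pixel: int) -> int:
    """Return the raw size of one frame in bytes."""
    return width * height * bytes_per_pixel


rgb_frame = frame_bytes(RGB_WIDTH, RGB_HEIGHT, RGB_BYTES_PER_PIXEL)
depth_frame = frame_bytes(DEPTH_WIDTH, DEPTH_HEIGHT, DEPTH_BYTES_PER_PIXEL)

print(f"RGB frame:   {rgb_frame} bytes")
print(f"Depth frame: {depth_frame} bytes")
```

Under these assumptions a colour frame is roughly fourteen times larger than a depth frame, which is one practical reason to crop and resize frames before training.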
The primary purpose of this research is to prepare an ArSL dataset amounting to 7,350 videos of sign language words for each of the RGB and depth modalities. We then propose single models and choose the best one for recognizing ArSL on this dataset. Next, we evaluate different pre-trained models within the architecture of the chosen model on our dataset, first as a single model and then as a multi-model. Finally, we evaluate the fusion methods of the multi-model.

2. RELATED WORKS
Sign language (SL) is a natural and efficient way for hard of hearing or deaf individuals to communicate with others in their society. Interacting with them necessitates learning sign language or another type of communication. With the rapid development of multimedia communication technologies, sign language has become of great interest to researchers seeking to improve social communication among its users. One of the main trends in sign language recognition is vision-based systems, which use depth cameras to capture colour or depth images/videos in a more natural environment. Several machine learning algorithms have been created to analyze and categorize video data.
The fine details of ArSL were handled in [7] using a deep feature extractor for ArSL recognition. That work used a 3D convolutional neural network (CNN) to identify 25 gestures from an Arabic sign language dictionary, with data from depth maps as input to the recognition system. The system was 98% accurate for observed data and 85% accurate on average for new data. Masood et al. [8] used approximately 2300 videos of 46 gesture categories to propose a vision-based system that translates isolated hand gestures from the Argentinean sign language, with an accuracy of 95.217%. They used two methods to categorize the data: a CNN for spatial features and a recurrent neural network (RNN) for temporal features. A real-time automated ArSL recognition system based on the Kinect sensor is described in [9]; it compares signs and recognizes 30 isolated words from standard ArSL signs using the dynamic time warping algorithm. The system obtains a signer-dependent recognition rate of 97.58% and a signer-independent recognition rate of 95.25%.
Liao et al. [10] reported a multi-model dynamic sign language recognition technique based on a deep 3-dimensional residual ConvNet and bi-directional long short-term memory (LSTM) networks (B3D ResNet). Their technique is divided into three parts. The hand object is first located in the video frames. The B3D ResNet is then used to extract spatio-temporal features from the video sequences, and finally the dynamic sign language is recognized by classifying the video sequences. The experiment obtained a recognition accuracy of 89.8% on the DEVISIGN_D dataset and 86.9% on the SLR dataset. Zhang et al. [11] presented a new dynamic gesture recognition technique. First, a motion representation 2D-CNN model was proposed to extract frame-level motion features for gesture categorization; in this model, the sequential motion information may be represented in a single image. Second, a 3D DenseNet model was built to capture spatio-temporal motion information directly from RGB videos. Finally, the prediction results of the 2D-CNN model and the 3D DenseNet model were merged to improve performance. Their approach obtained classification accuracies of 89.1% and 89.5% on the challenging vision for intelligent vehicles and applications (VIVA) dataset and the UTD multimodal human action dataset (UTD-MHAD), respectively.
The 20BN-Jester dataset was used in [12] to train a 3D-CNN architecture for hand gesture detection. The 3D-CNN model comprises four convolution layers, two fully connected layers, and one dropout layer, and the dataset contains 27 categories of hand gestures. They obtained a model with 90% accuracy after three days of training. A novel framework that uses several deep learning architectures, comprising hand semantic segmentation, hand shape feature representation, and a deep recurrent neural network, was presented in [13] for independent ArSL recognition of 23 isolated words. A single-layer convolutional self-organizing map was used to extract hand shape features, and a deep BiLSTM recurrent neural network recognizes the sequence of extracted feature vectors. The system obtains an average accuracy of 89.5% when tested on the Arabic benchmark dataset.
Tran et al. [14] presented a novel technique to identify fingers and recognize hand gestures in real time using an RGB-D camera and a three-dimensional convolutional neural network (3D-CNN). Their network included seven convolutional layers, three max pooling layers, and two fully connected layers, and their system covered seven hand gestures. Their 3D-CNN reached 92.6% accuracy with a training time of 1 hour and 35 minutes, and an ensemble of 15 different 3D-CNNs increased the accuracy to 97.12%. Sarma et al. [15] developed a two-stream network to detect and identify isolated dynamic hand gestures with changing form, size, and colour of the hand. The method used a 3D-CNN to extract motion patterns for gesture classification and a 2D-CNN to capture spatio-temporal information directly from RGB gesture videos. They obtained an accuracy of 92.60% when using only the 2D-CNN, 97.30% when using only the 3D-CNN, and 99.20% when using the fusion model.
Rastgoo et al. [16] presented a deep model for effective hand sign recognition trained in two stages: first, a single shot detector (SSD) model for hand identification, trained on annotated videos from five online sign dictionaries, and second, a combinational model with a CNN and different spatial features. They performed a comprehensive study of sequence learning using various pre-trained models, spatial features, and temporal-based models. They extracted features from still RGB frames using the ResNet50 model, which had an accuracy of 86.32% and a prediction time of 2.58 seconds per sign. Elsayed and Fathy [17] presented a system that used a 3D-CNN and LSTM to enhance dynamic sign language recognition accuracy on three dynamic gesture datasets extracted from colour videos, with an average recognition accuracy of 97.4%. Elhagry and Gla [18] introduced a vision-based system for translating some ArSL gestures into their corresponding isolated words. They obtained an accuracy of 90% using the Inception V3 CNN architecture and 72% using an Inception V3 CNN-LSTM. They demonstrated that a CNN gives high accuracy for isolated sign language recognition, while a CNN-LSTM is an excellent choice for continuous word recognition.
There are various efforts in the present research to construct both single and multiple models to improve the accuracy and performance of ArSL recognition. The highlights of this study are the following:
- A large-scale ArSL dataset that includes RGB and depth data was prepared using Kinect V2, because such data is unavailable or inaccessible.
- Previous research offers no evaluation of the different fusion methods for recognizing ArSL with multi-model fusion.
- The model used to evaluate the fusion methods was selected from several proposed single models and from several pre-trained multi-models unified in the fusion method, after evaluating their performance in ArSL recognition.

3. EXPERIMENTAL METHODOLOGY
3.1. Dataset
The dynamic dataset was collected using a Kinect camera, and each recording contains an RGB video and a depth video. The dataset consists of 7,350 isolated video samples of sign language words, covering ArSL signs for twenty meaningful words plus one no-sign category, "doing other things". The "doing other things" category is a collection of various activities, such as a simple body or head movement. Each word gesture is covered by 350 recorded videos. The signs in the video data are used in Chinese Sign Language with the same meaning or are used with other meanings in ArSL. Of the 350 videos per word, 250 were recorded by 50 people who each repeated the word five times, and a further 100 were recorded by five people who each repeated the word 20 times. The videos were recorded using Python and OpenCV, with variations in surroundings, clothing, and lighting. Gesture duration ranged from 1.5 s to 3 s. The words in the present study are: [Nothing, Cheek, Friend, Plate, Marriage, Moon, Break, Broom, You, Mirror, Table, Truth, Watch, Arch, Successful, Short, Smoking, I, Push, Stingy, and Long]. Figure 1 shows samples of the dynamic word signs for ArSL.
The dataset was divided into three subsets: 85.7% for training (6,300 videos of sign language words), 7.14% for validation (525 videos), and 7.14% for testing (525 videos). The dataset was collected using Microsoft Kinect 2.0; the recorded data included 7,350 RGB videos and 7,350 depth videos.
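The dataset figures quoted in this subsection can be cross-checked with a few lines of arithmetic. The sketch below only recomputes numbers stated in the text; the per-class split at the end assumes a stratified split, which the text does not confirm.

```python
# Consistency check of the dataset figures quoted in the text.

NUM_CLASSES = 21            # twenty words plus the "doing other things" class
VIDEOS_PER_CLASS = 350      # stated number of videos per word gesture

# 50 signers x 5 repetitions plus 5 signers x 20 repetitions.
per_class = 50 * 5 + 5 * 20
total_videos = NUM_CLASSES * VIDEOS_PER_CLASS

splits = {"train": 6300, "validation": 525, "test": 525}
percentages = {name: round(100 * n / total_videos, 2) for name, n in splits.items()}

# Assumption: if the split were stratified, every class would contribute equally.
per_class_split = {name: n // NUM_CLASSES for name, n in splits.items()}

print(per_class, total_videos)
print(percentages)
print(per_class_split)
```

The three subsets add up to the full 7,350 videos, and the percentages round to the 85.7% / 7.14% / 7.14% reported above.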

Figure 1. Dynamic word sign samples from the unified ArSL dictionary

3.2. Data preprocessing
A dynamic gesture dataset is generally composed of many videos. Video resolution and duration may differ, so each gesture video in the dataset must be preprocessed. Each video is split into consecutive frames, and the signer's upper body, including the hands, is cropped from each frame using the captured skeleton. The frames are then normalized, because different users may perform gestures at different speeds while the neural network input must have a fixed size. Since the duration of a recorded video expressing a specific sign language word is variable, a video may exceed 80 frames when converted to frames. In some cases, the video includes additional frames before and after the sign that expresses the word. These additional frames, which did not express a word, were removed. As a result, 32 frames were used to represent each word expressed in sign language. After that, all frames were resized to 200×200. The frames were organized in folders identical to those used for the videos, with the folder names indicating the sign class label.
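One simple way to bring clips of different lengths to the fixed 32-frame input is to pick evenly spaced frame indices. The sketch below illustrates that idea only; uniform sampling and the function name `sample_frame_indices` are assumptions made for illustration, since the text states that extra frames were removed and 32 frames were kept but does not specify the selection rule.

```python
from typing import List


def sample_frame_indices(num_frames: int, target: int = 32) -> List[int]:
    """Pick `target` frame indices spread evenly over a clip of `num_frames` frames.

    Uniform sampling is an assumption for illustration; it is not confirmed
    as the selection rule used for the dataset.
    """
    if num_frames <= 0:
        raise ValueError("clip has no frames")
    if target < 2:
        raise ValueError("target must be at least 2")
    if num_frames == 1:
        return [0] * target
    # Evenly spaced positions from the first to the last frame (inclusive).
    step = (num_frames - 1) / (target - 1)
    return [round(i * step) for i in range(target)]


indices = sample_frame_indices(80)
print(len(indices), indices[0], indices[-1])
```

With this rule a clip shorter than 32 frames repeats some indices, while a longer clip skips frames, so every gesture ends up with the same temporal length regardless of signing speed.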

3.3. Data augmentation
Very strong deep neural networks are typically trained on millions of images. The problem with a small training image set is that, although the neural network memorizes the training data and predicts the training set accurately, the validation accuracy is low. To avoid overfitting and increase the generalization capability of the model, data augmentation was used to address this dataset issue [19]. Numerous data augmentation strategies were used for the dynamic signs: image normalization, zoom range (1.0, 1.2), width shift range (10%), height shift range (10%), rotation range (±10°), and brightness range (0.4-1.2). The data was also augmented by blurring images with averaging, median, and Gaussian filters, by the morphological operations erosion and dilation, by adding salt-and-pepper noise, and by sharpening the images. Through these operations, the training set is enlarged by around 48 times. All of these operations are applied at random inside the mini-batch input to the model. In this work, we performed online spatio-temporal data augmentation during training. For spatial augmentation, we applied the data augmentation techniques listed above. For temporal augmentation, we selected consecutive frames within a range of defined input sizes.
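The online augmentation described above can be pictured as drawing a fresh set of random parameters for each clip in a mini-batch. The following sketch shows only the parameter drawing and the temporal window selection, using the ranges quoted in the text; applying one parameter set to all frames of a clip, and the function names, are assumptions made for illustration.

```python
import random

# Ranges quoted in the text.
ZOOM_RANGE = (1.0, 1.2)
SHIFT_FRACTION = 0.10        # width and height shift, as a fraction of frame size
ROTATION_DEGREES = 10.0      # rotation drawn from [-10, +10] degrees
BRIGHTNESS_RANGE = (0.4, 1.2)


def draw_spatial_params(rng: random.Random) -> dict:
    """Draw one set of spatial augmentation parameters for a whole clip.

    Sharing the parameters across all frames of a clip is an assumption,
    made so that the gesture stays geometrically consistent over time.
    """
    return {
        "zoom": rng.uniform(*ZOOM_RANGE),
        "width_shift": rng.uniform(-SHIFT_FRACTION, SHIFT_FRACTION),
        "height_shift": rng.uniform(-SHIFT_FRACTION, SHIFT_FRACTION),
        "rotation": rng.uniform(-ROTATION_DEGREES, ROTATION_DEGREES),
        "brightness": rng.uniform(*BRIGHTNESS_RANGE),
    }


def temporal_crop(num_frames: int, clip_len: int, rng: random.Random) -> range:
    """Select `clip_len` consecutive frame indices at a random start position."""
    if clip_len > num_frames:
        raise ValueError("clip_len exceeds the number of available frames")
    start = rng.randint(0, num_frames - clip_len)
    return range(start, start + clip_len)


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
params = draw_spatial_params(rng)
window = temporal_crop(num_frames=40, clip_len=32, rng=rng)
print(params)
print(window.start, len(window))
```

Because new parameters are drawn every time a clip is loaded, the network rarely sees exactly the same input twice, which is the mechanism by which online augmentation counters overfitting.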

3.4. Proposed models
A CNN was used to train the model on spatial features, while an RNN was used to train the model on temporal features. CNNs succeed at identifying patterns and applying them to image classification, and they assume that the network's input is an image. Recurrent neural networks (RNNs) use the information in the sequence itself to perform recognition tasks. Because RNNs contain loops, their output depends on a mix of the current input and the previous output. Video is hard to classify because a video sequence includes both temporal and spatial information. The spatial features are extracted from the frames of the video, whereas the temporal features are extracted by relating the frames of the video over time. Dynamic gestures involve a series of frames, not just one frame, and these frames depend on each other over time. Several models can be built to recognize gestures, including a 3D-CNN or a 2D-CNN with an RNN. In the latter case, the CNN extracts the features of each image, and the sequence of these features is then passed to the RNN. A fully connected layer with softmax activation follows the output of the RNN for classification. For the empirical evaluations, we constructed four deep neural networks. Figure 2 illustrates the models that were constructed, trained, and tested on the video dataset. These models cover all methods of feature extraction and of passing those features on to the RNN.
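The statement that an RNN output depends on both the current input and the previous output can be made concrete with a toy recurrence. The sketch below is a scalar Elman-style RNN with hypothetical weights (`w_x`, `w_h`, `b`); it is not one of the proposed models, where each step would receive a CNN feature vector and use an LSTM or GRU cell.

```python
import math


def rnn_step(x_t: float, h_prev: float, w_x: float, w_h: float, b: float) -> float:
    """One step of a scalar Elman-style RNN: h_t = tanh(w_x*x_t + w_h*h_prev + b)."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through the recurrence and return every hidden state."""
    h = 0.0  # initial state before the first frame
    states = []
    for x_t in inputs:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states


# The same input value (1.0) appears at steps 0 and 2 but yields different
# states, because the state also carries what came before.
states = run_rnn([1.0, 0.0, 1.0])
print(states)
```

This dependence on history is what lets the recurrent part of a model distinguish two gestures that share individual frames but differ in the order of movement.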
M
o
de
l
s
1
:
R
e
c
ur
r
e
nt
c
o
n
v
o
l
ut
i
o
n
a
l
n
e
ur
a
l
n
e
t
wo
r
k
(
R
C
NN
)
,
by
i
t
s
n
a
m
e
,
is
a
c
o
m
bi
na
t
i
o
n
of
C
NN
a
n
d
R
NN
.
R
C
NN
us
e
s
a
2D
c
o
n
v
o
l
ut
i
o
n
a
l
ne
t
wor
k
to
e
x
t
r
a
c
t
s
pa
t
i
a
l
f
e
a
t
ur
e
s
on
f
r
a
m
e
s
,
whi
c
h
wa
s
do
n
e
by
u
s
i
n
g
pr
e
-
t
r
a
i
n
e
d
a
r
c
hi
t
e
c
t
ur
e
s
uc
h
as
R
e
s
N
e
t
50
[
20]
,
De
n
s
e
N
e
t
121
[
21]
,
VGG
1
6
[
22]
,
a
n
d
M
o
bil
e
Ne
t
[
23]
.
T
h
e
n
t
h
e
s
e
f
e
a
t
ur
e
s
we
r
e
i
n
j
e
c
t
e
d
to
R
NN
,
whi
c
h
i
nc
l
ud
e
r
e
c
ur
r
e
n
c
e
r
e
l
a
t
i
o
n
s
,
m
e
a
ni
ng
t
h
a
t
each
s
ubs
e
que
n
t
o
u
t
pu
t
is
de
pe
n
de
n
t
on
a
mi
x
t
ur
e
of
pr
e
vi
o
u
s
o
u
t
pu
t
s
.
F
or
R
NN
,
t
h
e
t
w
o
m
e
t
h
o
ds
i
nc
l
ude
d
in
R
N
N
a
r
e
L
S
T
M
a
n
d
ga
t
e
d
r
e
c
ur
r
e
n
t
uni
t
(
GR
U
)
,
whi
c
h
a
r
e
s
t
a
c
ke
d
on
to
p
of
t
h
e
c
o
nvo
l
ut
i
o
n
a
l
n
e
t
wo
r
k
to
m
o
de
l
t
e
m
po
r
a
l
de
p
e
n
de
n
c
i
e
s
.
O
n
e
,
t
wo
or
t
h
r
e
e
R
NN
l
a
y
e
r
s
we
r
e
us
e
d
.
M
o
de
l
2
is
an
a
r
c
hi
t
e
c
t
ur
e
t
h
a
t
i
nc
l
ud
e
s
c
o
nv
2d
to
e
x
tr
a
c
t
f
e
a
t
ur
e
s
a
n
d
u
s
e
L
S
T
M
to
h
a
n
d
l
e
s
e
que
n
c
e
s
b
e
t
we
e
n
f
r
a
m
e
s
.
M
o
de
l
3
:
3D
-
C
NN
w
i
t
h
L
S
T
M
t
h
e
m
o
de
l
c
o
n
s
i
s
t
s
of
a
3D
-
C
NN
a
n
d
an
L
S
T
M
f
o
r
s
e
que
nc
e
c
l
a
s
s
if
i
c
a
t
i
o
n
.
T
h
e
3D
-
C
NN
t
a
ke
s
bl
o
c
k
s
of
32*200*200
f
r
a
m
e
s
of
t
h
r
e
e
c
h
a
nn
e
l
s
as
i
nput
a
n
d
pr
o
duc
e
s
a
f
e
a
t
ur
e
ve
c
t
or
.
T
h
e
L
S
T
M
t
a
ke
s
th
at
v
e
c
t
o
r
s
e
que
n
t
i
a
ll
y
f
o
r
an
e
n
t
i
r
e
ge
s
t
ur
e
a
n
d
g
i
ve
s
t
h
e
f
i
na
l
pr
e
d
i
c
t
i
o
n
.
Model 4: Each sign word gesture is represented by 32 consecutive frames with dimensions 200 × 200 and three channels. The model consists of six 3D convolution layers, which were used to extract features and the relationships between sequences of frames. Each convolution layer is followed by batch normalization and a ReLU activation, and a pooling layer follows every two convolution layers to down-sample by a factor of two. Convolution is performed with filter size 2 and "same" padding. The convolutional stack is followed by three fully connected (FC) layers for classification.
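The down-sampling schedule described for Model 4 can be traced with a short shape calculation. This is only a sketch: it assumes that "same" padding leaves the shape unchanged after each convolution and that every pooling layer halves the frame, height, and width dimensions, neither of which is stated in the text. The function name is hypothetical.

```python
def model4_feature_shape(frames=32, height=200, width=200, conv_layers=6):
    """Trace the (frames, height, width) shape through the Model 4 conv stack.

    Assumptions (not stated in the paper): 'same' padding keeps the shape
    unchanged after each convolution, and each pooling layer halves all
    three dimensions.
    """
    shape = (frames, height, width)
    history = [shape]
    for layer in range(1, conv_layers + 1):
        # Conv3D + batch normalization + ReLU: shape unchanged.
        if layer % 2 == 0:
            # A pooling layer follows every two convolution layers.
            shape = tuple(dim // 2 for dim in shape)
            history.append(shape)
    return history


shapes = model4_feature_shape()
print(shapes)  # [(32, 200, 200), (16, 100, 100), (8, 50, 50), (4, 25, 25)]
```

Under these assumptions the three pooling stages reduce each 32 × 200 × 200 clip to a 4 × 25 × 25 grid per feature map before the fully connected layers.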
Figure 2. The architecture of the proposed models
3.5. Multi-model fusion
The reason for using multi-model data is that complementary information, here RGB and depth, provides a richer representation that can be used to produce much better results than single-model data. The multi-model fusion research community has made significant progress [24]. We tested three kinds of multi-model fusion techniques, data level fusion, feature level fusion, and decision level fusion, as shown in Figure 3.
Data level fusion: The data level fusion of RGB and depth data involves creating a new input matrix with four channels (RGBD). In the presented research, a CNN model with conv2d layers was used as the feature extractor to process the RGBD input, and the temporal relations between frames were captured by an LSTM layer. For this type of fusion, the model needs four-channel input data. When we used the pre-trained ResNet50 model, which accepts only three-channel input, we added one extra conv2D layer in front of it; the 4-channel input passes through this layer so that its output is suitable for the ResNet architecture, and an LSTM then handles the sequence of frames. To predict the probability of each output class, three fully connected (FC) layers were used, followed by an output layer with softmax/sigmoid activation.
Feature level fusion: This concatenates the two separate models at a specific convolutional layer or fully connected layer. Features are extracted from the RGB and depth streams of a sign frame in parallel. The output feature maps from the two models' final convolutional layers are concatenated into one input to the following layers.
Decision level fusion: This combines the predicted probabilities from two separate models [25]. One model takes the RGB video as input, and the other takes the depth video as input. The probabilities of the twenty-one classes are collected from each of the two models and averaged, and the class with the highest averaged probability is adopted as the recognized class.
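The three fusion levels differ mainly in where the RGB and depth information is merged. The sketch below contrasts them on plain Python lists. It is an illustration only: the function names and toy values are hypothetical, three classes are used instead of twenty-one, and a fixed 1x1 projection (an assumed kernel size) stands in for the learned extra conv2D layer.

```python
def data_level_fusion(rgb_pixel, depth_value):
    """Data level: append depth to an [r, g, b] pixel to form an RGBD pixel."""
    return list(rgb_pixel) + [depth_value]


def project_to_three_channels(rgbd_pixel, weights):
    """Stand-in for the extra conv2D layer: a 1x1 convolution, i.e. a
    per-pixel linear map from 4 channels to 3 (kernel size is an assumption)."""
    return [sum(w * v for w, v in zip(row, rgbd_pixel)) for row in weights]


def feature_level_fusion(rgb_features, depth_features):
    """Feature level: concatenate the feature vectors of the two streams."""
    return list(rgb_features) + list(depth_features)


def decision_level_fusion(rgb_probs, depth_probs):
    """Decision level: average class probabilities, then pick the best class."""
    if len(rgb_probs) != len(depth_probs):
        raise ValueError("both models must score the same classes")
    averaged = [(a + b) / 2.0 for a, b in zip(rgb_probs, depth_probs)]
    best_class = max(range(len(averaged)), key=lambda i: averaged[i])
    return best_class, averaged


# Data level: one RGBD pixel, then mapped back to three channels.
rgbd = data_level_fusion([10, 20, 30], 8)
weights = [[1, 0, 0, 0.5],   # hypothetical values; a real layer learns these
           [0, 1, 0, 0.5],
           [0, 0, 1, 0.5]]
projected = project_to_three_channels(rgbd, weights)
print(rgbd)       # [10, 20, 30, 8]
print(projected)  # [14.0, 24.0, 34.0]

# Feature level: two 2-value feature vectors become one 4-value vector.
print(feature_level_fusion([0.1, 0.9], [0.4, 0.6]))  # [0.1, 0.9, 0.4, 0.6]

# Decision level: three classes for brevity.
label, fused = decision_level_fusion([0.10, 0.60, 0.30], [0.20, 0.30, 0.50])
print(label)  # 1
```

Decision level fusion needs two full forward passes (one per model), whereas data level fusion needs only one, which is the trade-off that matters later for inference speed.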
Figure 3. Types of fusion levels
4. RESULTS AND DISCUSSION
For the experiments, we used Kaggle, a cloud service based on Jupyter Notebooks. All training was done on a graphics processing unit (GPU) provided by the Kaggle environment: an Nvidia P100 with 16 GB of VRAM and 3,584 compute unified device architecture (CUDA) cores. For inference, we also used a Legion Y-540 personal laptop. Its graphics card is an Nvidia GeForce RTX 2060 with 6 GB of GDDR5X VRAM and 1,920 CUDA cores, and its processor is an Intel Core i7. We evaluated our models' performance on our new large-scale ArSL dataset by randomly choosing the training, validation, and test sets. The evaluation measures are presented in the results tables of our experiments.
4.1. Evaluation metric and hyperparameters
We first define some of the terms used for the evaluation measures and hyperparameters when analyzing the results of the present study.
Batch size: The number of samples that are passed to the network at once during training. The GPU memory consumed by a deep learning (DL) model is often unknown before training or inference starts, so an unsuitable choice of neural network architecture or hyperparameters can cause the task to fail by running out of memory. Accordingly, the batch size is chosen in proportion to the limited GPU memory available; we usually use the largest batch size our resources allow [26].
Channels: One and three represent gray and RGB frames, respectively.
Frames per second (FPS): In object detection, the most frequent unit of time is frames per second (FPS); this value indicates the maximum number of frames that the network can handle in one second.
Training time: The time it takes to train the network for 20 epochs.
Incorrect testing, validation, and training: The number of misclassified dataset videos in the test, validation, and training sets, respectively.
Total incorrect: The total number of videos that have been misclassified.
Accuracy: The number of correct classifications divided by the total number of classifications.
To evaluate a model's performance, we use the number of misclassified videos, incorrect, defined as

$$\text{incorrect} = \sum_{i=1}^{N} f[y(i), p(i)] \qquad (1)$$

where N is the total number of samples, y(i) is the actual label of sample i, p(i) is the predicted label of sample i, and f[y(i), p(i)] = 1 if y(i) ≠ p(i) and f[y(i), p(i)] = 0 otherwise.
Inference: Inference is the stage of using a trained model to infer, or predict, test samples. We measured the inference speed on each isolated sign-word video for each deep neural network (DNN) model on both GPU and CPU. Inference speed is reported in FPS, taken as the average FPS over 15 runs.
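The evaluation quantities defined in this subsection (the misclassification count, accuracy, and inference FPS averaged over 15 runs) can be sketched in a few lines. This is a minimal illustration, not the measurement code used in the study: the function names are hypothetical, a sample counts as incorrect when its predicted label differs from its actual label, FPS is assumed to be frames processed divided by wall-clock time and averaged per run, and `dummy_predict` is a placeholder for a trained model.

```python
import time


def count_incorrect(actual, predicted):
    """Number of samples whose predicted label differs from the actual label."""
    return sum(1 for y, p in zip(actual, predicted) if y != p)


def accuracy_percent(actual, predicted):
    """Correct classifications divided by all classifications, in percent."""
    total = len(actual)
    return 100.0 * (total - count_incorrect(actual, predicted)) / total


def average_fps(predict, clip, runs=15):
    """Average frames per second of `predict` on one clip over several runs."""
    rates = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(clip)
        elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
        rates.append(len(clip) / elapsed)
    return sum(rates) / len(rates)


actual = [0, 1, 2, 2, 1]
predicted = [0, 1, 1, 2, 1]
print(count_incorrect(actual, predicted))   # 1
print(accuracy_percent(actual, predicted))  # 80.0


def dummy_predict(clip):
    """Placeholder for a trained model; it only sums the pixel values."""
    return sum(sum(frame) for frame in clip)


clip = [[0.0] * 1000 for _ in range(32)]  # 32 toy "frames"
print(average_fps(dummy_predict, clip) > 0)  # True
```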
4.2. Single model
Table 1 shows the test accuracy and the numbers of incorrect cases in training, validation, and testing for the single proposed models, first with RGB video frames and then with depth video frames; training, validation, and testing of the proposed models were done on our ArSL data. Judging by the total number of incorrect cases in testing, validation, and training for both RGB and depth video frames, Model 1 gives the best results of all the models, with 17 incorrect cases in total. Model 1 of the four proposed deep neural networks is therefore the best model; it implements feature extraction through pre-trained models.
Table 1. Test accuracy and incorrect cases in training, validation, and testing for the single proposed models, first with RGB video frames and second with depth video frames

| Input | Model | Batch size | Channels | Training time | Incorrect training | Incorrect validation | Incorrect test | Test accuracy % | Total incorrect |
|---|---|---|---|---|---|---|---|---|---|
| RGB | Model 1: ResNet50-LSTM | 4 | 3 | 8h2min | 0 | 2 | 2 | 99.62 | 4 |
| RGB | Model 2: Conv2D-LSTM | 4 | 3 | 7h38min | 68 | 20 | 17 | 96.76 | 105 |
| RGB | Model 3: Conv3D-LSTM | 8 | 3 | 7h34min | 192 | 34 | 36 | 93.14 | 262 |
| RGB | Model 4: Conv3D | 8 | 3 | 6h57min | 491 | 73 | 66 | 87.43 | 630 |
| Depth | Model 1: ResNet50-LSTM | 4 | 3 | 11h12min | 4 | 8 | 6 | 98.86 | 11 |
| Depth | Model 2: Conv2D-LSTM | 4 | 3 | 10h06min | 65 | 17 | 14 | 97.33 | 96 |
| Depth | Model 3: Conv3D-LSTM | 8 | 1 | 7h | 801 | 86 | 83 | 84.19 | 970 |
| Depth | Model 4: Conv3D | 8 | 1 | 9h50min | 56 | 36 | 24 | 95.43 | 116 |

4.3. Multi-model

The focus here was on Model 1, using the pre-trained models with two different RNNs: ResNet50-LSTM, DenseNet121-LSTM, ResNet50-GRU, MobileNet-LSTM, and VGG16-LSTM. They were trained on our ArSL data with RGB video frames and with depth video frames as single models, and then as multi-models. For the multi-models, the same fusion method was adopted with each of the five pre-trained models mentioned: fusion before the last layer of the architecture, i.e., feature level fusion. Table 2 shows the test accuracy and the numbers of incorrect cases in training, validation, and testing for the single models (RGB video frames first, depth video frames second) and the multi-models in the form of Model 1.

Table 2 shows that, judging by the total number of incorrect cases, the performance of the models is generally close, whether they are used as single models on RGB or depth video frames or as multi-models. It also shows that the best multi-model is ResNet50-LSTM, with a total of 5 incorrect cases.

Table 2. Test accuracy and incorrect cases in training, validation, and testing for the single models (RGB video frames first, depth video frames second) and the multi-models in the form of Model 1

| Input | Model | Batch size | Channels | Training time | Incorrect train | Incorrect validation | Incorrect test | Test accuracy % | Total incorrect |
|---|---|---|---|---|---|---|---|---|---|
| RGB | ResNet50-LSTM | 4 | 3 | 8h2min | 0 | 2 | 2 | 99.62 | 4 |
| RGB | DenseNet121-LSTM | 4 | 3 | 10h1min | 0 | 2 | 3 | 99.43 | 5 |
| RGB | ResNet50-GRU | 4 | 3 | 7h59min | 0 | 3 | 3 | 99.43 | 6 |
| RGB | MobileNet-LSTM | 8 | 3 | 7h44min | 3 | 3 | 1 | 99.81 | 7 |
| RGB | VGG16-LSTM | 8 | 3 | 8h43min | 19 | 6 | 3 | 99.43 | 28 |
| Depth | ResNet50-LSTM | 4 | 3 | 11h12min | 4 | 8 | 6 | 98.86 | 11 |
| Depth | DenseNet121-LSTM | 4 | 3 | 9h50min | 1 | 7 | 4 | 99.24 | 12 |
| Depth | ResNet50-GRU | 4 | 3 | 9h35min | 1 | 6 | 4 | 99.24 | 11 |
| Depth | MobileNet-LSTM | 8 | 3 | 9h41min | 1 | 3 | 7 | 98.67 | 11 |
| Depth | VGG16-LSTM | 8 | 3 | 9h37min | 43 | 17 | 12 | 97.71 | 72 |
| Multi-model | ResNet50-LSTM | 2 | 3 | 23h3min | 0 | 3 | 2 | 99.62 | 5 |
| Multi-model | DenseNet121-LSTM | 2 | 3 | 19h | 1 | 6 | 0 | 100.00 | 7 |
| Multi-model | ResNet50-GRU | 2 | 3 | 19h40min | 0 | 3 | 3 | 99.43 | 6 |
| Multi-model | MobileNet-LSTM | 4 | 3 | 21h43min | 2 | 7 | 7 | 98.67 | 16 |
| Multi-model | VGG16-LSTM | 4 | 3 | 23h2min | 0 | 8 | 3 | 99.43 | 11 |

4.4. Evaluation of multi-model fusion

The types of multi-model fusion were evaluated with the adopted Model 1 (ResNet50-LSTM). We tested several types of multi-model fusion techniques. Figure 4 shows the multi-model architectures for the data level and feature level types (A-H).
We created a new multi-model G with the same architecture as multi-model H, but replaced the concatenation layer with a maximum layer.
We then created a new multi-model I with the same architecture as multi-model H, but replaced the LSTM with a bidirectional LSTM (BLSTM). Bidirectional LSTMs can improve model performance in sequence classification problems. A bidirectional LSTM, an extension of the traditional LSTM, trains two LSTM layers on the input sequence in order to extract extra information from the input and deliver quicker results. The first LSTM uses forward temporal information and is trained on the input sequence, whereas the second uses reverse temporal information and is trained on a reversed copy of the input sequence. To represent the sequence's bidirectional dependence, the forward and backward outputs are concatenated.
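The bidirectional wrapping described above (one pass over the sequence, one pass over its reversed copy, outputs concatenated) can be shown with a toy recurrence. This sketch does not implement an LSTM: an exponential moving average stands in for the recurrent cell, and both function names are hypothetical.

```python
def run_recurrent(sequence, decay=0.5):
    """Toy stand-in for an LSTM: an exponential moving average whose final
    state depends on the order in which the values are seen."""
    state = 0.0
    for value in sequence:
        state = decay * state + (1.0 - decay) * value
    return state


def bidirectional(sequence, decay=0.5):
    """Run the recurrence forward and on the reversed copy, then concatenate."""
    forward = run_recurrent(sequence, decay)
    backward = run_recurrent(list(reversed(sequence)), decay)
    return [forward, backward]


features = [1.0, 2.0, 3.0, 4.0]
print(bidirectional(features))  # [3.0625, 1.625]
```

The forward state is dominated by the last frames and the backward state by the first frames, so the concatenated output carries information from both ends of the gesture.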
We then created a new multi-model J with the same architecture as multi-model H, and added normalization layers for the two models before the concatenation layer. We normalized the values of both feature maps to the same range because the networks of the two models create separate feature maps with different value ranges. After normalization, the concatenation layer combines the values of the feature maps in order to increase the quality of the produced features. The architecture of multi-model K is the same as that of H, with a bidirectional LSTM and normalization.
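The feature level variants differ only in how the two feature vectors are merged: concatenation (H), element-wise maximum (G), and normalization before concatenation (J). The sketch below illustrates the three merges on toy vectors. The function names are hypothetical, and L2 normalization is an assumption, since the text does not say which normalization layer was used.

```python
import math


def concat_fusion(rgb_features, depth_features):
    """Variant H: concatenate the two feature vectors."""
    return list(rgb_features) + list(depth_features)


def max_fusion(rgb_features, depth_features):
    """Variant G: element-wise maximum of two equally sized feature vectors."""
    return [max(a, b) for a, b in zip(rgb_features, depth_features)]


def l2_normalize(features):
    """Scale a vector to unit length (assumed form of normalization)."""
    norm = math.sqrt(sum(v * v for v in features))
    return [v / norm for v in features] if norm > 0 else list(features)


def normalized_concat_fusion(rgb_features, depth_features):
    """Variant J: bring both streams to the same range, then concatenate."""
    return l2_normalize(rgb_features) + l2_normalize(depth_features)


print(max_fusion([1.0, 5.0], [2.0, 3.0]))  # [2.0, 5.0]

# The depth stream is ten times larger in magnitude than the RGB stream.
rgb_features = [3.0, 4.0]
depth_features = [30.0, 40.0]
print(concat_fusion(rgb_features, depth_features))             # [3.0, 4.0, 30.0, 40.0]
print(normalized_concat_fusion(rgb_features, depth_features))  # [0.6, 0.8, 0.6, 0.8]
```

Without normalization the larger-valued stream dominates the concatenated vector; after normalization both streams contribute on the same scale.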
Table 3 shows the test accuracy and the numbers of incorrect cases in training, validation, and testing, as well as the inference results on GPU and CPU, for the ResNet50 multi-models. The ResNet50 multi-models included data level fusion, several feature level fusions, and decision level fusion. It is clear from Table 3 that the accuracy is not less than 99% for all the adopted types except data level fusion. Comparing fusion at the data level, feature level, and decision level, the best is the decision level. The best type of fusion at the feature level is multi-model K, based on the total incorrect: its total incorrect is zero and its test accuracy is 100%. As for FPS, Table 3 shows that the highest FPS is obtained with fusion at the data level, where a single model (Model 1) is used. The lowest FPS is obtained with fusion at the decision level, where two separate models are used. With fusion at the feature level, the FPS lies between these two cases, since the multi-model is larger than a single model and smaller than two separate single models.
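The comparison above can be reproduced from the reported numbers. The snippet below copies the total incorrect and GPU inference FPS columns from Table 3 and selects the most accurate, fastest, and slowest fusion types; the variable names are illustrative.

```python
# (fusion type, total incorrect, GPU inference FPS), copied from Table 3
table3 = [
    ("Data level", 56, 200),
    ("Feature level A", 6, 128),
    ("Feature level B", 6, 133),
    ("Feature level C", 4, 128),
    ("Feature level D", 8, 124),
    ("Feature level E", 8, 129),
    ("Feature level F", 7, 128),
    ("Feature level G", 8, 128),
    ("Feature level H", 5, 125),
    ("Feature level I", 4, 105),
    ("Feature level J", 2, 122),
    ("Feature level K", 0, 104),
    ("Decision level", 0, 96),
]

fewest = min(total for _, total, _ in table3)
most_accurate = [name for name, total, _ in table3 if total == fewest]
fastest = max(table3, key=lambda row: row[2])
slowest = min(table3, key=lambda row: row[2])

print(most_accurate)  # ['Feature level K', 'Decision level']
print(fastest[0])     # Data level
print(slowest[0])     # Decision level
```

Feature level K matches decision level fusion on errors while running slightly faster on GPU (104 versus 96 FPS).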
Figure 4. Different methods of fusing multiple models using Model 1 (ResNet50-LSTM)
Table 3. Test accuracy and incorrect cases in training, validation, and testing, as well as inference on GPU and CPU, for the ResNet50 multi-models

| Multi-model | Batch size | Channels | Training time | Incorrect training | Incorrect validation | Incorrect test | Test accuracy % | Total incorrect | GPU inference: FPS | GPU inference: GPU usage % | GPU inference: CPU usage % | CPU inference: FPS | CPU inference: GPU usage % | CPU inference: CPU usage % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Data level | 4 | 4 | 26h40min | 0 | 26 | 30 | 94.29 | 56 | 200 | 87 | 20 | 26 | 0 | 100 |
| Feature level A | 2 | 3 | 24h33min | 0 | 4 | 2 | 99.62 | 6 | 128 | 95 | 23 | 13 | 0 | 100 |
| Feature level B | 2 | 3 | 25h16min | 1 | 3 | 2 | 99.62 | 6 | 133 | 100 | 21 | 11 | 0 | 100 |
| Feature level C | 2 | 3 | 28h23min | 0 | 2 | 2 | 99.62 | 4 | 128 | 94 | 23 | 14 | 0 | 100 |
| Feature level D | 2 | 3 | 27h25min | 3 | 4 | 1 | 99.81 | 8 | 124 | 96 | 23 | 14 | 0 | 100 |
| Feature level E | 2 | 3 | 23h57min | 1 | 4 | 3 | 99.43 | 8 | 129 | 96 | 19 | 14 | 0 | 100 |
| Feature level F | 2 | 3 | 29h42min | 1 | 5 | 1 | 99.81 | 7 | 128 | 98 | 18 | 14 | 0 | 100 |
| Feature level G | 2 | 3 | 23h42min | 2 | 3 | 3 | 99.43 | 8 | 128 | 93 | 19 | 14 | 0 | 100 |
| Feature level H | 2 | 3 | 23h3min | 0 | 3 | 2 | 99.62 | 5 | 125 | 98 | 24 | 13 | 0 | 100 |
| Feature level I | 2 | 3 | 23h12min | 0 | 4 | 0 | 100.00 | 4 | 105 | 81 | 26 | 10 | 0 | 100 |
| Feature level J | 2 | 3 | 26h30min | 0 | 1 | 1 | 99.81 | 2 | 122 | 96 | 20 | 14 | 0 | 100 |
| Feature level K | 2 | 3 | 23h37min | 0 | 0 | 0 | 100 | 0 | 104 | 81 | 27 | 10 | 0 | 100 |
| Decision level | - | - | - | 0 | 0 | 0 | 100 | 0 | 96 | 92 | 22 | 9 | 0 | 100 |
Figure 5 shows the training and validation loss, as well as the training and validation accuracy, of ResNet50-BiLSTM-Normalization. During the training and validation phases, the accuracy continues to increase as the loss decreases. The confusion matrices in Figure 6 show the evaluation results of the ResNet50-BiLSTM-Normalization multi-model network on the training and test sets. The model is utilized for data augmentation approaches in ArSL video classification, as seen in the 21-class confusion matrix. All correct classifications lie on the diagonal of the square matrix. Columns represent the actual classes, and rows represent the classifier's predictions. Table 4 compares previous ArSL works [7], [9], [13], [18], [27]-[30] with this work. The table shows that both proposed methods, single model and multi-model, perform better than the models presented in the previous studies listed.
Figure 5. Training and validation accuracy and training and validation loss of ResNet50-BiLSTM + normalization

Figure 6. Training and testing confusion matrices of the multi-model ResNet50-BiLSTM + normalization
Table 4. Comparison of previous ArSL works and this work

| Paper no. | Method | Signs/signer | Accuracy |
|---|---|---|---|
| [7] | Single model: 3D-CNN | 25/5 | 98.00% |
| [9] | Single model: Dynamic time warping (DTW) & Kinect sensor | 30/- | 97.58% |
| [13] | Single model: DeepLabv3+ CSOM + BiLSTM Net | 23/3 | 89.59% |
| [18] | Single model: Inception v3 CNN-LSTM | 9/2 | 90.00% |
| [27] | Single model: Euclidean distance | 30/- | 97.00% |
| [28] | Single model: KNN, SVM, MLP | 23/3 | 99.00% |
| [29] | Single model: 3D-CNN; Multi-model: 3D-CNN | 40/4 | 96.69%; 98.12% |
| [30] | Single model: Hidden Markov Model | 20/- | 82.20% |
| This work | Single model: ResNet50-LSTM (RGB) | 21/55 | 99.62% |
| This work | Single model: ResNet50-GRU (Depth) | | 99.24% |
| This work | Multi-model: ResNet50-BiLSTM-Normalization | | 100.00% |
5. CONCLUSION
Various models have been investigated for dynamic hand gesture recognition using RGB and depth information. From the analysis and discussion of their results, the following was concluded:
The research prepared, recorded, and collected data comprising 7,350 RGB videos and 7,350 depth videos as ArSL datasets.
Model 1 of the four proposed deep neural networks is the best model; it implements feature extraction through pre-trained models.
Among the pre-trained models with two different RNNs (ResNet50-LSTM, DenseNet121-LSTM, ResNet50-GRU, MobileNet-LSTM, and VGG16-LSTM), the best multi-model is ResNet50-LSTM when the same fusion method is used.
The test accuracy is greater than 99% for all the adopted multi-model fusion types at the feature level.
The research presented the optimal multi-model, ResNet50-BiLSTM-Normalization, with 100% test accuracy and no incorrect cases in training, validation, or testing.
REFERENCES
[1] P. Kumar, H. Gauba, P. Pratim Roy, and D. Prosad Dogra, "A multimodal framework for sensor-based sign language recognition," Neurocomputing, vol. 259, pp. 21-38, Oct. 2017, doi: 10.1016/j.neucom.2016.08.132.
[2] E. K. Kumar, P. V. V. Kishore, M. T. K. Kumar, and D. A. Kumar, "3D sign language recognition with joint distance and angular coded color topographical descriptor on a 2-stream CNN," Neurocomputing, vol. 372, pp. 40-54, Jan. 2020, doi: 10.1016/j.neucom.2019.09.059.
[3] A. V. Nair and V. Bindu, "A Review On Indian Sign Language Recognition," International Journal of Computer Applications, vol. 73, no. 22, pp. 33-38, 2013, doi: 10.5120/13037-0260.
[4] H. Vo, V. H. Pham, and B. T. Nguyen, "Deep Learning for Vietnamese Sign Language Recognition in Video Sequence," International Journal of Machine Learning and Computing, vol. 9, no. 4, pp. 440-445, 2019, doi: 10.18178/ijmlc.2019.9.4.823.
[5] E. Lachat, H. Macher, T. Landes, and P. Grussenmeyer, "Assessment and Calibration of A RGB-D Camera (Kinect V2 Sensor) Towards A Potential Use For Close-Range 3D Modeling," Remote Sensing, vol. 7, no. 10, pp. 13070-13097, 2015, doi: 10.3390/rs71013070.
[6] S. Paul, S. Basu, and M. Nasipuri, "Microsoft Kinect In Gesture Recognition: A Short Review," Int. J. Control Theory Appl, vol. 8, no. 5, pp. 2071-2076, 2015.
[7] M. ElBadawy, A. S. Elons, H. A. Shedeed, and M. F. Tolba, "Arabic Sign Language Recognition With 3d Convolutional Neural Networks," in 2017 Eighth International Conference on Intelligent Computing and Information Systems (ICICIS), IEEE, pp. 66-71, 2017, doi: 10.1109/INTELCIS.2017.8260028.
[8] S. Masood, A. Srivastava, H. C. Thuwal, and M. Ahmad, "Real-Time Sign Language Gesture (Word) Recognition from Video Sequences Using CNN and RNN," in Intelligent Engineering Informatics, Springer, Singapore, pp. 623-632, 2018, doi: 10.1007/978-981-10-7566-7_63.
[9] S. Abdel, A. Abdel-Rabouh, F. A. Elmisery, A. M. Brisha, and A. H. Khalil, "Arabic Sign Language Recognition Using Kinect Sensor," Research Journal of Applied Sciences, Engineering and Technology, vol. 15, no. 2, pp. 57-67, 2018, doi: 10.19026/rjaset.15.5292.
[10] Y. Liao, P. Xiong, W. Min, W. Min, and J. Lu, "Dynamic Sign Language Recognition Based on Video Sequence With BLSTM-3D Residual Networks," IEEE Access, vol. 7, pp. 38044-38054, 2019, doi: 10.1109/access.2019.2904749.
[11] E. Zhang, B. Xue, F. Cao, J. Duan, G. Lin, and Y. Lei, "Fusion Of 2D-CNN and 3D Densenet For Dynamic Gesture Recognition," Electronics, vol. 8, no. 12, p. 1511, 2019, doi: 10.3390/electronics8121511.
[12] W. Zhang, Wenjin, and J. Wang, "Dynamic hand gesture recognition based on 3D convolutional neural network models," in 2019 IEEE 16th International Conference on Networking, Sensing and Control (ICNSC), IEEE, pp. 224-229, 2019, doi: 10.1109/ICNSC.2019.8743159.
[13] S. Aly and W. Aly, "DeepArSLR: A Novel Signer-Independent Deep Learning Framework for Isolated Arabic Sign Language Gestures Recognition," IEEE Access, vol. 8, pp. 199-212, 2020, doi: 10.1109/ACCESS.2020.2990699.
[14] D. S. Tran, N. H. Ho, H. J. Yang, E. T. Baek, S. H. Kim, and G. Lee, "Real-Time Hand Gesture Spotting and Recognition Using RGB-D Camera And 3D Convolutional Neural Network," Applied Sciences, vol. 10, no. 2, p. 722, 2020, doi: 10.3390/app10020722.
[15] D. Sarma, V. Kavyasree, and M. K. Bhuyan, "Two-Stream Fusion Model For Dynamic Hand Gesture Recognition Using 3d-Cnn And 2d-Cnn Optical Flow Guided Motion Template," arXiv preprint arXiv:2007.08847, 2020.
[16] R. Rastgoo, K. Kiani, and S. Escalera, "Video-Based Isolated Hand Sign Language Recognition Using A Deep Cascaded Model," Multimedia Tools and Applications, vol. 79, pp. 22965-22987, 2020, doi: 10.1007/s11042-020-09048-5.
[17] E. K. Elsayed and D. R. Fathy, "Semantic Deep Learning to Translate Dynamic Sign Language," International Journal of Intelligent Engineering and Systems, vol. 14, no. 1, pp. 316-325, 2021, doi: 10.22266/ijies2021.0228.30.
[18] Elhagry and R. Gla, "Egyptian Sign Language Recognition Using CNN and LSTM," arXiv preprint arXiv:2107.13647, 2021.