International Journal of Electrical and Computer Engineering (IJECE)
Vol. 10, No. 6, December 2020, pp. 6019~6025
ISSN: 2088-8708, DOI: 10.11591/ijece.v10i6.pp6019-6025
Journal homepage: http://ijece.iaescore.com/index.php/IJECE
Video content analysis and retrieval system using video storytelling and indexing techniques
Jaimon Jacob1, M. Sudheep Elayidom2, V. P. Devassia3
1Govt. Model Engineering College, India
2Cochin University of Science and Technology, India
3St. Joseph's College of Engg and Technology, India
Article Info
Article history:
Received Aug 7, 2019
Revised May 5, 2020
Accepted May 20, 2020
ABSTRACT
Videos are often used to communicate ideas, concepts, experiences, and situations, owing to the significant advances made in video communication technology. Social media platforms have rapidly increased video usage. At present, a video is recognized using metadata such as the video title, video description, and video thumbnails. In some situations, however, a searcher requires only a video clip on a specific topic from a long video. This paper proposes a novel methodology that analyses video content and uses video storytelling and indexing techniques to retrieve the intended video clip from a long-duration video. The video storytelling technique is used for video content analysis and to produce a description of the video. The description thus created is used to prepare an index using the Wormhole algorithm, which guarantees that a keyword of definite length L can be searched within the minimum worst-case time. A video searching algorithm can use this index to retrieve the relevant part of the video by virtue of the frequency of the keyword in the video index. Instead of downloading and transferring a whole video, the user can download or transfer only the specifically necessary video clip. The network constraints associated with the transfer of videos are thereby considerably addressed.
Keywords:
Video content analysis
Video indexing
Video searching
Visual storytelling
Copyright © 2020 Institute of Advanced Engineering and Science. All rights reserved.
Corresponding Author:
Jaimon Jacob,
Govt. Model Engineering College,
Kerala, India
Email: jaimon@mec.ac.in
1. INTRODUCTION
Nowadays, social media and networking tools are very common and accessible to all users. Video messages are predominantly used on these platforms for conveying views, concepts and ideas. As an outcome of this exponential growth in video communication, the amount of video data transmitted over communication networks is also increasing at an exponential rate. The visual networking index (VNI) white paper released by Cisco estimates that global IP traffic in 2022 will be three times that of 2017 [1], because of this growth in video data. According to these studies, the video data transmitted over the internet is mostly contributed by video file sharing, video games and video conferences, and by 2022 it will constitute 82% of the total data traffic. These studies show the importance of handling video traffic well, and a provision for downloading only a specific part of an entire video would improve that handling.
Video storytelling is the act of describing a video like a story. It analyses the contents of each frame and identifies significant video clips from a long video. In the first stage, a context-aware multimodal embedding learning framework is implemented for extracting the contextual meaning of event dynamics in each frame. In the second stage, a Narrator model is used for generating a story from the video frames. Video indexing creates an index from an input video using the video storytelling technique. The index is similar to the index that appears in textbooks. When a user searches for a keyword, the video search engine checks the contents of the index file and confirms that the video is relevant to the search keyword using the word count available in the index file. If the video is found relevant, the range of frames that is significant to the search keyword is extracted from the long video and sent to the searcher. Thus, instead of the entire video, only the appropriate part of the video needs to be sent, which reduces the video traffic drastically.
In this paper, an innovative model using ResBRNN-kNN video storytelling and Wormhole indexing techniques is proposed for the content analysis of a video and its retrieval. When a search keyword is raised, an appropriate video clip can be identified, extracted from a long video and transferred to the seeker, which decreases video traffic by avoiding the need to transfer the entire video.
This paper is organised as follows. Section 2 describes contemporary works related to this area. The proposed work is described in detail in Section 3. The results of the implementation are described in Section 4. Section 5 presents the conclusion and the future scope for improvement.
2. RELATED WORKS
Most of the contemporary works in the area of video content analysis and indexing are based on either long short-term memory (LSTM) or convolutional neural networks (CNN). In [2], Venugopalan et al. proposed a framework based on deep image description models for translating videos to natural language using deep recurrent neural networks. In the first stage, the fc7 features [2] are extracted for each frame, and the LSTM network is fed the mean pool of the features across the entire video at every time step. Until the LSTM picks the end-of-sentence tag, it outputs one word at each time step based on the video features (and the previous word).
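The mean-pooling step described for [2] can be illustrated with a minimal sketch. The toy 4-dimensional vectors below are an assumption for readability (fc7 vectors are 4096-dimensional), and the function name `mean_pool` is hypothetical rather than taken from [2].

```python
from typing import List


def mean_pool(frame_features: List[List[float]]) -> List[float]:
    """Average per-frame feature vectors into one video-level vector."""
    if not frame_features:
        raise ValueError("need at least one frame")
    dim = len(frame_features[0])
    if any(len(f) != dim for f in frame_features):
        raise ValueError("all frame vectors must share one dimension")
    n = len(frame_features)
    # Element-wise mean across frames.
    return [sum(f[i] for f in frame_features) / n for i in range(dim)]


# Three toy "fc7-like" vectors, one per frame (illustrative values only).
frames = [
    [1.0, 0.0, 2.0, 4.0],
    [3.0, 0.0, 2.0, 0.0],
    [2.0, 3.0, 2.0, 2.0],
]
video_vector = mean_pool(frames)
print(video_vector)  # [2.0, 1.0, 2.0, 2.0]
```

Because every frame contributes equally, the pooled vector discards frame order, which is the limitation later noted for mean-pooled decoders.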
The work proposed in [3] is grounded video description, which contains three modules to perform language generation, namely a grounding module, a region attention module and a language generation module. The grounding module perceives the visual hints from the video, the region attention module dynamically attends to the visual clues to form a high-level impression of the visual content, and the language generation module performs the language decoding.
In [4], a convolutional relational machine (CRM), an end-to-end deep convolutional neural network, is presented for identifying group actions by exploiting the spatial relations of individual persons in an image or video. It generates an intermediary activity map (a spatial representation) based on the activities of individuals and groups. A multi-stage enhancement component is responsible for reducing incorrect predictions in the activity map. Finally, group activities are identified from the refined information by the accumulation component. Experimental outcomes show the benefit of the information extracted and represented in the form of the activity map.
A novel video captioning framework is proposed in [5], which combines a soft attention mechanism with bidirectional long short-term memory (BiLSTM) to produce enhanced global representations for videos and improved recognition of lasting gestures in the videos. An LSTM is used as a decoder to fully explore global contextual information for producing video captions. The method has the following benefits: 1) the BiLSTM structure systematically preserves visual and global temporal information, and 2) the soft attention mechanism enables a language decoder to distinguish and focus on the principal targets within complex content.
Video description generation using audio and visual cues is proposed in [6]; the system relies on deep models such as CNN and LSTM-RNN. The construction of the visual-only description system is almost the same as that of the audio-only system, differing only in the feature representation module. Feature encoding uses a CNN in the visual-only system and bag-of-acoustic-words in the audio-only system. The two stages of execution are the training phase and the test phase. In the training phase, the LSTM-RNN model is either pre-trained using related auxiliary data and fine-tuned on the target domain data, or trained directly using target domain training data. This method utilizes the LSTM-RNN to model sequence dynamics, processing incoming video frames for visual and acoustic encoding and connecting directly to an acoustic feature extraction module and a CNN. LSTM networks are a variant of the recurrent neural network (RNN) that is able to learn long-term dependencies.
In [7], CNN features are extracted from video frames, and a single feature vector is then generated that conveys the meaning of the entire video. An LSTM is used for creating the video descriptions from the merged images of separate frames. A sequence decoder is used to create the narration from the mean-pooled vector. The LSTM serves as this sequence decoder, but it fails to consider the entire time-based information while generating the narration.
Semantic features are used in [8] for recognising actions, scenes and objects. The method uses a semantic concept space to model events using semantic characteristic features. It exploits the advantages of concept-based event representation (CBER) in the realm of video event interpretation and cataloguing. Web video querying using a semantic signature representation is implemented in [9], specifically for complex events given as video query examples. To compute the variance in the semantic presence of events, this method uses off-the-shelf concept detectors.
An innovative algorithm is proposed in [10] to manage the complex dynamics of real-world videos. In this algorithm, captions are created using an end-to-end sequence-to-sequence model. LSTM is the state-of-the-art technology among recurrent neural networks for generating captions for videos. The algorithm uses an LSTM model to generate the description of an event in a video clip. To connect a sequence of video frames with a sequence of words, the LSTM model is trained on video-sentence pairs. The major merit of this model is that it can analyse and learn the dynamic structure of the frames in a video sequence.
In [11], various video indexing techniques are compared with respect to their merits and demerits. The paper presents a detailed account of video indexing and its significance in content-based information retrieval, event feature extraction and multimodal video indexing. An innovative algorithm for video indexing and retrieval using video content analysis is proposed in [12]. Various semantic features, such as motion features, edges and keyframe texture, are used for video content analysis. These semantic features are abstracted to characterise a video using a feature vector.
A sequence-to-sequence method with a temporal attention mechanism is used in [13] to generate an image caption automatically, considering the temporal dynamics of each frame in the input. A recurrent neural network (RNN) is used for generating the captions. In [14], an innovative content-based video indexing and retrieval algorithm is proposed. This framework uses the correspondence latent Dirichlet allocation (corr-LDA) probabilistic framework for video content analysis and indexing. The major merit of the proposed algorithm is that it automatically annotates the videos stored in a video database and returns meaningful descriptions based on the semantic relations between scenes.
The merits and demerits of different assessment metrics used for video descriptions, such as WMD, CIDEr, SPICE, ROUGE, METEOR and BLEU, are presented in [15] in terms of deep learning models, data sets used, number of classes and different domains. The effective use of linguistic knowledge, extracted from a huge text corpus, in describing video content is presented in [16]. An innovative video summarization framework is suggested in [17]. It uses an attentive encoder-decoder network (AVS) to retrieve the semantic information from the video frames, using bidirectional long short-term memory (BiLSTM). In [18], a novel framework for video summarization is presented for domain-specific videos. The algorithm considers the characteristic features most relevant to the particular domain and generates a summary that describes the contextual meaning of the input video.
The work in [19] addresses various tasks involved in video content analysis and indexing: developing an interactive environment with objects for video, managing video content management tasks, defining a characteristic architecture, developing automated tools and techniques for video content recognition and representation, applying knowledge representation techniques to index construction, and developing methods for restoration. A joint event detection and description network (JEDDi-Net) is proposed in [20] to address the task of dense video description. The model continuously encodes the input video stream with a three-dimensional convolutional layer, and proposes and generates events of varying duration on the basis of pooled features.
A novel video description model is proposed in [21] that allows the use of bounding box annotations and produces grounded subtitles. In this study, the video facts are compared with the subtitle sentence by referencing each noun of the sentence in one of the frames of the video. In [22], a new framework for dense video captioning was proposed, which specifically models the temporal dependency of occurrences in the video and utilizes visual and language contexts from past events for coherent storytelling.
From the detailed literature survey, it is observed that video content analysis and retrieval based on the video storytelling technique has not been addressed.
3. PROPOSED WORK
All video search algorithms mentioned in recent literature consider the whole video, although only a portion of the video may be of interest to the stakeholders. The proposed model, however, allows content-wise retrieval of only the video frames of interest, significantly reducing the bandwidth required.
This approach proposes using a video index to speed up the retrieval of video data. The video index is analogous to the page index found in textbooks: every important word or phrase is arranged in sorted order together with its density and frame range. The major functional units in this concept are the VIG for generating the video index, the ATC for converting audio to text, the TE for extracting text, and the VCG for generating video captions, as illustrated in Figure 1. The ICG generates a caption for each image after image content analysis. The audio to text converter generates keywords based on the audio available with the video. The text extractor extracts the textual information present in the images. The textual information available from these three functional units is given as input to the video index generator to generate the video index, which contains the major keywords, their density, and the range of frames where each major keyword appears. The video search engine (VSE) examines the content of the video index table and confirms whether the video is relevant to the search keyword by checking the density in the index table. If the video is found relevant, the significant part of the video is extracted from the long video using the frame range information available in the index table.
Figure 1. Process flow: video retrieval model using video indexing
Video content analysis considers three types of information present in a video, viz. audio, text and images. This information can be used for analysis and for generating an error-free video index. An input video may contain a combination of one or more of these types of information. The absence of any one or two types of information does not affect the accuracy of the video index generated. Video content analysis comprises three major blocks: the audio to text converter (ATC), the text extractor (TE), and the image caption generator (ICG).
3.1. Audio to text converter
In this module, a recurrent neural network-based technique is used to generate textual information from the audio available with the video. The audio is sliced into chunks of 20 ms and fed into this module as input. The module generates text corresponding to the audio chunks, using the network's ability to memorize earlier inputs while computing its estimates.
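The slicing step can be sketched as follows. This is only an illustration: the 16 kHz sampling rate, the function name `slice_audio` and the choice to drop a trailing partial chunk are assumptions, since the paper states only the 20 ms chunk size.

```python
from typing import List, Sequence


def slice_audio(samples: Sequence[float], sample_rate: int,
                chunk_ms: int = 20) -> List[Sequence[float]]:
    """Split a mono sample sequence into consecutive chunks of chunk_ms."""
    chunk_len = sample_rate * chunk_ms // 1000  # samples per chunk
    if chunk_len <= 0:
        raise ValueError("chunk is shorter than one sample")
    # Assumption: a trailing partial chunk, if any, is dropped.
    n_chunks = len(samples) // chunk_len
    return [samples[i * chunk_len:(i + 1) * chunk_len]
            for i in range(n_chunks)]


# One second of silence at an assumed 16 kHz sampling rate.
sample_rate = 16000
audio = [0.0] * sample_rate
chunks = slice_audio(audio, sample_rate)
print(len(chunks), len(chunks[0]))  # 50 320
```

Each 320-sample chunk would then be turned into acoustic features and passed to the recurrent model one time step at a time.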
3.2. Text extractor
The text extractor generates the textual information present in the images given as input to this module. It consists of three stages, as illustrated in [23]: a line edge detection mask for edge generation, projection profiles for text localization, and segmentation and recognition of text. The method in [23] is an efficient algorithm for localizing and extracting text from graphics as well as from scenes present in video images. In this algorithm, a localized video image frame is converted into an intensity-based edge map by the Sobel edge operator.
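A minimal sketch of the edge-map and projection-profile stages is given below, assuming a grayscale frame stored as a list of rows. It uses the standard 3x3 Sobel kernels; the function names and the toy frame are hypothetical, and the exact masks and thresholds of [23] are not reproduced here.

```python
import math
from typing import List

Image = List[List[float]]

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edge_map(gray: Image) -> Image:
    """Return the gradient magnitude per pixel (border pixels stay 0)."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            out[y][x] = math.hypot(gx, gy)
    return out


def row_profile(edge_map: Image) -> List[float]:
    """Horizontal projection profile: total edge strength in each row."""
    return [sum(row) for row in edge_map]


# Toy 5x5 frame: dark left part, bright right part, so one vertical edge.
frame = [[0, 0, 0, 255, 255] for _ in range(5)]
edges = sobel_edge_map(frame)
print(edges[2])            # strongest response at the dark/bright boundary
print(row_profile(edges))  # rows with high sums are text-line candidates
```

Rows (or columns) whose profile values exceed a chosen threshold would be treated as candidate text regions before segmentation and recognition.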
3.3. Image caption generator
The video storytelling method [24] is used for video caption generation. It involves two subtasks: (a) identifying the most relevant video clips in the input video, and (b) creating a narration from the video clips identified as significant. Multimodal semantic embedding learning is used as a context-aware framework for identifying the relevant parts of the video. The embedding is trained with a local-to-global two-phase process. The first phase models individual clip-sentence pairs to learn a local embedding. The second phase considers the whole video as a sequence of segments. The temporal dynamics of the video are captured by a residual bidirectional RNN (ResBRNN), which integrates relevant past and future information into the multimodal embedding space. Temporal coherence is preserved, and the diversity of the corresponding embeddings is increased.
In the second stage, the stories are created using a Narrator model. Given an input video, the narrator extracts a series of relevant clips from it. A sequence of sentences that suits the clips is then retrieved in the multimodal embedding space to generate a storyline. It is difficult to discover the right clips because there is no simple description of which visual elements are necessary to shape a perfect story. To this end, the narrator is formulated as a reinforcement learning agent that examines the input video sequentially and learns a strategy (policy) for choosing clips that maximizes the outcome (reward). The outcome is constructed as the linguistic measure between the story thus obtained and the reference stories written by humans. By maximizing this textual metric directly, the narrator learns to identify interesting, diverse clips that form a good narrative.
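The retrieval step, in which a sentence is picked for each selected clip in the shared embedding space, can be sketched as a nearest-neighbour search. The 3-dimensional vectors below are hypothetical stand-ins for embeddings that a trained model would produce; the sketch covers only the k-nearest-neighbour lookup (with k = 1), not the ResBRNN training or the reinforcement-learning narrator.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def nearest_sentences(clip_vec: Vector, sentence_bank: Dict[str, Vector],
                      k: int = 1) -> List[Tuple[str, float]]:
    """Return the k sentences whose embeddings are closest to the clip."""
    scored = [(s, cosine(clip_vec, v)) for s, v in sentence_bank.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings; the sentences are taken from the story in Section 4.
sentence_bank = {
    "The friends start a campfire.": [0.9, 0.1, 0.0],
    "The couple hike along a forest trail with their dog.": [0.1, 0.9, 0.1],
    "They get back in the car and drive again.": [0.0, 0.1, 0.9],
}
clip_embeddings = [[0.2, 0.8, 0.0], [0.8, 0.2, 0.1]]  # two selected clips
story = [nearest_sentences(c, sentence_bank)[0][0] for c in clip_embeddings]
print(" ".join(story))
```

With these toy vectors the first clip retrieves the hiking sentence and the second retrieves the campfire sentence, so the story follows the order of the selected clips.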
3.4. Video index generator
Three distinct modules generate a story from the given video: the ATC, the TE, and the VCG. The textual information generated by these modules is provided as input to the VIG, which uses the Wormhole [25] algorithm for indexing; this guarantees a worst-case time complexity of O(log L) when searching for a keyword of definite length L. Along with the density information for each keyword, the video frame range is also stored in the index table, which the VSE can use while checking relevance. The index table for a video follows the format (keyword, density, lower frame no, upper frame no): the keywords in sorted order, the density of each keyword, and the lower and upper frame numbers of the range where the density is found to be high.
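The index table format above can be sketched with a sorted keyword list. Note the assumptions: this sketch uses Python's `bisect` binary search as a simple stand-in, which is logarithmic in the number of keywords and is not the Wormhole structure of [25]; density is taken to be the occurrence count; the stop-word list and the shot boundaries are hypothetical.

```python
import bisect
import re
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional, Tuple

# Assumed, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "on", "the", "their", "with"}


class IndexEntry(NamedTuple):
    keyword: str
    density: int                         # assumed: number of occurrences
    frame_ranges: List[Tuple[int, int]]  # (lower frame no, upper frame no)


class VideoIndex:
    def __init__(self, captions: List[Tuple[int, int, str]]) -> None:
        """captions: one (lower frame, upper frame, caption text) per shot."""
        counts: Dict[str, int] = defaultdict(int)
        ranges: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
        for lo, hi, text in captions:
            words = [w for w in re.findall(r"[a-z]+", text.lower())
                     if w not in STOP_WORDS]
            for w in words:
                counts[w] += 1
            for w in set(words):
                ranges[w].append((lo, hi))
        self.keys = sorted(counts)  # keywords in sorted order
        self.entries = [IndexEntry(k, counts[k], sorted(ranges[k]))
                        for k in self.keys]

    def lookup(self, keyword: str) -> Optional[IndexEntry]:
        """Binary search over the sorted keywords; None if absent."""
        k = keyword.lower()
        i = bisect.bisect_left(self.keys, k)
        if i < len(self.keys) and self.keys[i] == k:
            return self.entries[i]
        return None


# Hypothetical shot boundaries; captions adapted from the story in Section 4.
index = VideoIndex([
    (1, 30, "The couple hike along a forest trail with their dog."),
    (31, 60, "The friends start a campfire."),
    (61, 90, "The hiker and dog are on the hiking trail."),
])
print(index.lookup("dog"))
print(index.lookup("car"))
```

In this toy index, "dog" is found with density 2 and frame ranges (1, 30) and (61, 90), while "car" is absent, so a search engine would treat the video as relevant to the first query only.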
4. RESULTS AND DISCUSSION
The proposed work was implemented in a Python IDE with the TensorFlow deep learning framework. Video captions were generated by the video storytelling method using the Video Story dataset, a new dataset prepared to enable the study. The VIG uses the Wormhole algorithm to keep the search time complexity to a minimum. An accuracy of 86% was achieved on a video of a campfire in a forest using the stories generated by the ResBRNN-kNN method [24]. Figure 2 shows the set of shots in the video clip "Campfire in a forest" used for generating the story. The output of the video caption generator is as follows.
“The nature of the forest is shown while the hiker and dog are on the hiking trail. A girl talks to the camera while laying down her tent. They are preparing the campsite and starting up the trail on a hike. The campers explore the woods and hike up some rocks. The couple hike along a forest trail with their dog. The campers take turns swinging over the water. The friends start a campfire. A man is cooking his food on the fire. The campers are exploring the area around their campsite and the surrounding woods. They get back in the car and drive again.”
Figure 2. Set of shots in the video clip "Campfire in a forest"
The video index generator generates the index shown in Table 1, based on the story generated by the video caption generator. The third column in the table gives the percentage of true positives, which is used to evaluate the accuracy.
Table 1. Video index corresponding to the video shown in Figure 2

Keyword     Frame no. range                       True positive
Camera      15-28, 32-59, 78-99, 124-145          91%
Camper      12-24, 35-49, 56-64                   89%
Campfire    140-164                               79%
Car         19-24, 145-161, 165-199               84%
Dog         23-35, 67-89, 97-123                  92%
Food        134-168                               80%
Forest      1-9, 21-31, 62-87, 99-120             74%
Friend      25-39, 61-99, 103-123                 87%
Girl        25-39, 61-99, 103-123                 68%
Hiker       12-24, 35-49, 56-64                   81%
Man         15-39, 61-69, 101-123, 141-158        89%
Nature      1-24, 34-45, 65-89, 102-156           81%
Water       20-31, 62-87, 109-129                 81%
Wood        111-123, 134-145, 165-196             78%
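The index in Table 1 maps each keyword to the frame ranges in which it occurs, so retrieval reduces to a keyword lookup. The sketch below illustrates that idea with a few rows copied from Table 1. The dictionary layout and the names `VIDEO_INDEX`, `lookup`, `merge_ranges`, and `lookup_any` are assumptions made for illustration and are not taken from the authors' implementation.

```python
from typing import Dict, Iterable, List, Tuple

FrameRange = Tuple[int, int]  # inclusive (start_frame, end_frame)

# A few rows copied from Table 1. Storing the index as a dictionary
# is an assumption; the paper does not specify the storage format.
VIDEO_INDEX: Dict[str, List[FrameRange]] = {
    "campfire": [(140, 164)],
    "dog": [(23, 35), (67, 89), (97, 123)],
    "food": [(134, 168)],
    "water": [(20, 31), (62, 87), (109, 129)],
}


def lookup(index: Dict[str, List[FrameRange]], keyword: str) -> List[FrameRange]:
    """Return the frame ranges indexed under a keyword (case-insensitive)."""
    return index.get(keyword.strip().lower(), [])


def merge_ranges(ranges: Iterable[FrameRange]) -> List[FrameRange]:
    """Merge overlapping or adjacent inclusive frame ranges."""
    merged: List[FrameRange] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def lookup_any(index: Dict[str, List[FrameRange]], keywords: Iterable[str]) -> List[FrameRange]:
    """Return merged frame ranges in which any of the keywords occurs."""
    found: List[FrameRange] = []
    for keyword in keywords:
        found.extend(lookup(index, keyword))
    return merge_ranges(found)
```

For example, `lookup(VIDEO_INDEX, "Dog")` returns the three ranges listed for Dog in Table 1, and `lookup_any(VIDEO_INDEX, ["campfire", "food"])` returns the single merged range 134-168, because the Campfire range lies inside the Food range.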
5. CONCLUSION
As the literature survey shows, the entire video currently has to be transferred to the searcher even when only a portion of it is of interest. In this paper, a video retrieval system is proposed that retrieves the portion of interest from a long video using video content analysis, video storytelling, and video indexing. The information present in the video in the form of images, audio, and text is considered for video content analysis and subsequently for video indexing, which helps improve accuracy. Text generation from the audio input is implemented using an RNN-based audio recognition model. Text is extracted from the video images in several stages: edge generation using mask-based line edge detection, text localization based on projection profiles, text segmentation, and text recognition. The video storytelling method is implemented using the video story dataset to generate captions from the video images. In a video search engine, a successful implementation of this framework would substantially reduce data traffic by reducing the amount of information exchanged. Furthermore, the user needs to view only the intended portion of the video. As a continuation of this work, the same algorithm can be implemented and tested on news videos, which contain several contextual contents on various topics.
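The expected reduction in data traffic can be illustrated with a rough frame count. The sketch below computes the fraction of frames that would be transferred for a keyword, using the Campfire range from Table 1. The total of 199 frames is an assumption (it is the largest frame number that appears in Table 1; the actual clip length is not stated), the function names are hypothetical, and frame counts only approximate transferred bytes for compressed video.

```python
from typing import Iterable, Tuple

FrameRange = Tuple[int, int]  # inclusive (start_frame, end_frame)


def frames_in_ranges(ranges: Iterable[FrameRange]) -> int:
    """Count frames covered by non-overlapping inclusive ranges."""
    return sum(end - start + 1 for start, end in ranges)


def transfer_fraction(ranges: Iterable[FrameRange], total_frames: int) -> float:
    """Fraction of the clip's frames that a retrieval result would transfer."""
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    return frames_in_ranges(ranges) / total_frames


# Campfire occupies frames 140-164 in Table 1.
campfire_ranges = [(140, 164)]
ASSUMED_TOTAL_FRAMES = 199  # assumption: largest frame number in Table 1
fraction = transfer_fraction(campfire_ranges, ASSUMED_TOTAL_FRAMES)
```

Under these assumptions, the Campfire query covers 25 of 199 frames, or roughly one eighth of the clip.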
BIOGRAPHIES OF AUTHORS
Jaimon Jacob received the B.Tech degree in Computer Science and Engineering from the University of Calicut in 2003, the M.Tech degree in Digital Image Processing from Anna University, Chennai, in 2010, the MBA degree in Information Technology from Sikkim Manipal University in 2012, and the M.Tech degree in Computer and Information Science from Cochin University of Science and Technology in 2014. He is currently working as Assistant Professor in Computer Science and Engineering at Govt. Model Engineering College, Kerala. His research interest is video processing. He is associated with the professional bodies ISTE, IETE, and IE.
Prof. (Dr.) Sudheep Elayidom received the B.Tech, M.Tech, and Ph.D. degrees. He is currently working as Professor, Division of Computer Engineering, School of Engineering, Cochin University of Science and Technology, Ernakulam, Kerala. He is a well-known musician in the Malayalam film industry. His research interests are data mining, big data, and related areas.
Prof. (Dr.) V. P. Devassia received the B.Sc. Engineering degree from MA College of Engineering, Kothamangalam, in 1983, the M.Tech degree in Industrial Electronics from Cochin University of Science and Technology, and the Ph.D. degree in Signal Processing from Cochin University of Science and Technology. He has worked as Graduate Engineer (T) at Hindustan Paper Corporation Ltd, as Design Engineer at HMT Limited, and as Principal of Govt. Model Engineering College, Ernakulam. His research interest is signal processing. He is associated with professional bodies as LM-ISTE, FIETE, FIE, and C.Eng. IE(I).