International Journal of Electrical and Computer Engineering (IJECE)
Vol. 9, No. 5, October 2019, pp. 3642~3648
ISSN: 2088-8708, DOI: 10.11591/ijece.v9i5.pp3642-3648
Journal homepage: http://iaescore.com/journals/index.php/IJECE

Speech to text conversion and summarization for effective understanding and documentation

Vinnarasu A., Deepa V. Jose
Department of Computer Science, CHRIST (Deemed to be University), India

Article Info
Article history:
Received Jan 17, 2019
Revised Apr 1, 2019
Accepted Apr 10, 2019

ABSTRACT
Speech is the most powerful means of communication, through which human beings express their thoughts and feelings in different languages. The features of speech differ with each language. Even within the same language, however, pace and dialect vary from person to person, which makes the conveyed message difficult for some people to understand. Lengthy speeches can also be quite difficult to follow because of differences in pronunciation, pace and so on. Speech recognition, an interdisciplinary field of computational linguistics, aids in developing technologies that enable the recognition and translation of speech into text. Text summarization extracts the most important information from a text source and provides an adequate summary of it. The research work presented in this paper describes an easy and effective method for speech recognition. The speech is converted to the corresponding text, from which a summarized text is produced. This has various applications such as lecture notes creation and summarizing catalogues for lengthy documents. Extensive experimentation is performed to validate the efficiency of the proposed method.

Keywords:
Feature extraction
Natural language processing
Natural language toolkit
Speech recognition
Text summarization

Copyright © 2019 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:
Vinnarasu A.,
Department of Computer Science,
CHRIST (Deemed to be University),
Bengaluru, Karnataka, India.
Email: vinnarasu.a@cs.christuniversity.in

1. INTRODUCTION
Speech is the most important part of communication between human beings. Though there are different means of expressing our thoughts and feelings, speech is considered the main medium of communication. Speech recognition is the process of making a machine recognize the speech of different people based on certain words or phrases. Variations in pronunciation are quite evident in each individual's speech. Speech originates as a signal, and the signal is processed such that all the information present in it is converted into text format. Feature extraction is the process of taking a signal and converting it to the required format with certain logic. Even though speech is the easiest way of communication, speech recognition faces problems such as fluency, pronunciation, broken words and stuttering. All of these have to be addressed while processing speech.
Text summarization is one of the major concepts used in the field of documentation. Lengthy documents are difficult to read and understand because doing so consumes a lot of time. Text summarization solves this problem by providing a shortened summary that preserves the semantics.
In the proposed work, a combination of speech to text conversion and text summarization is implemented. This hybrid method will aid applications that require a brief summary of lengthy speeches, which is quite useful for documentation. The flow diagram of the proposed approach is shown in Figure 1, in which speech recognition and text summarization are given as two different modules. The combination of these two modules aids any application in which summarization is required. The first and foremost step in working with NLP (Natural Language Processing) is to extract the features from the speech, which carry certain values. If a word or a sentence is recognized as meaningless, it becomes an obstacle to the summarization process. Even punctuation plays a vital role in summarization, as semantics is important while summarizing the text.

Figure 1. Speech recognition and text summarization process flow

2. RELATED WORKS
Speech to text conversion finds applications in various scenarios. An effective method for gaining fluency in the English language, which enhances the user's way of speaking through correct pronunciation following English phonetics, was developed by Jose et al. [1]. A comparative analysis of the benefits and demerits of speech recognition systems with various vocabulary sizes was done by Shivakumar et al. [2]. This work demonstrated the role of the language model in improving the accuracy of speech to text conversion in different scenarios with noise and broken words. Ghadage and Shelke [3] presented a multilingual speech-to-text conversion system using the Mel-Frequency Cepstral Coefficient (MFCC) feature extraction technique with Minimum Distance Classifier and Support Vector Machine (SVM) methods for speech classification. In [4], a model to convert natural Bengali language to text was proposed, which used the open source framework Sphinx 4. The authors claim an average accuracy of 71.7% for their approach on the tested dataset.
English text summarization based on association semantic rules is proposed by Wan [5]. According to the author, the new extraction scheme proved to have better convergence and precision performance in the extraction process. LDA is the most widely accepted algorithm for text classification based on a particular topic, and an improvement of it is proposed through a novel similarity computation method. Saiyed and Sajja [6] gave a brief introduction to the categories of summarization techniques, highlighting their advantages and drawbacks. This work gives researchers insights for selecting specific methods based on their requirements. The sentence selection process, modelled as a multi-objective optimization problem, was described in [7]; the authors used the human learning optimization algorithm for this purpose. In [8], feature extraction based on neural networks was proposed, which the authors claim to be more effective than the online extractive options. Vythelingum et al. [9] proposed a technique for error detection in grapheme-to-phoneme conversion in text-to-speech synthesis. According to them, their approach gave a better error correction rate, which can aid the human annotator. From the literature reviewed, it is quite evident that speech to text conversion, together with summarization of the resulting text, is a necessity, which motivates this research work.
Zenkert et al. [10] introduced a cross-dimensional text summarization that uses the concept of dimensional selection and filtering. The method was evaluated using the results of a multidimensional knowledge representation database. A text analyzer was developed by Devasena and Hemalatha [11] to identify the structure of the text given as input. The authors claim that the proposed system, which used automatic text categorization and text summarization, was able to give results effectively. Different text summarization techniques exist; a detailed overview of them is given in [12] by Rahimi et al., and a similar study was done by Dalal and Malik [13]. A modified K Nearest Neighbor approach for achieving text summarization was presented by Jo [14]; the author focused more on the reliability aspect. A three-stage, graph-based text summarization approach for the Vietnamese language was proposed by Tran and Nguyen in [15]. The authors claim that the proposed approach was able to gather more meaningful text relevant to native speakers. Vimalaksha et al. [16] provided a method to summarize video so as to save time and space as well as to help in archiving. An overview of text summarization focusing more on the techniques for avoiding redundancy was given in [17]. Matsubayashi et al. [18] proposed a system for effective query-based text retrieval, for which the authors used an automatic text summarization approach.
The rest of the paper is organized as follows. Section 3 gives the details of the proposed model, Section 4 presents the results obtained, and Section 5 gives the conclusions and future scope.

3. PROPOSED MODEL
3.1. Experimental setup
The speech from the source is recorded using a microphone, and the feature is extracted in text format using the Google Application Programming Interface (API). However, the text extracted using the Google API does not include a period (.) at the end of a sentence, which can lead to confusion about where statements terminate. To avoid this, custom code has been written in the proposed approach to insert a period after a pause of 2e+6 µs (2 s) or more. This makes the sentences clearer, and the text is pre-processed to add a period (.) or a question mark (?). For the purpose of adding a period to the extracted text, 2e+6 µs is taken as the minimum pause time. If the pause lasts longer than this, the system will still wait for the user input due to validation.
Since the period plays a vital role in the completion of a sentence, a clause that begins with a conjunction has to continue the previous sentence rather than follow a period. This problem is handled in the proposed model by the use of temporary storage. Whenever there is a pause, a period is added to the text and held in a temporary variable. If the next sentence begins with a conjunction, the temporary variable is cleared and the sentence is appended to the previous sentence using the conjunction. Conversely, if the next sentence begins with a subject, the value in the temporary variable is used and the period is appended to the sentence.
Wh-questions are expected to end with a question mark (?). Hence, whenever a sentence begins with a wh-word, the temporary variable holds a question mark (?) after a pause of 2e+6 µs or more. If the next sentence begins with a question-tag statement, the value in the temporary variable is not used. If the next sentence begins with a new subject, the question mark is appended to the end of the sentence.
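
The temporary-variable rule described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: it assumes the recognizer has already split the speech into segments at pauses of 2 s or more, and the word lists `WH_WORDS` and `CONJUNCTIONS`, the helper `first_word` and the function `punctuate` are hypothetical names introduced here. A comma is used when a segment continues the previous sentence with a conjunction.

```python
# Hypothetical word lists; the paper does not enumerate them.
WH_WORDS = {"what", "when", "where", "which", "who", "whom", "whose", "why", "how"}
CONJUNCTIONS = {"and", "but", "or", "so", "because", "yet", "nor"}


def first_word(segment):
    """Return the first word of a segment in lowercase, or '' if empty."""
    words = segment.split()
    return words[0].lower() if words else ""


def punctuate(segments):
    """Join pause-delimited segments into punctuated text.

    `pending` plays the role of the temporary variable: it holds the mark
    ('.' or '?') that is committed only when the next segment turns out
    to start a new sentence.
    """
    pieces = []
    pending = ""
    for segment in segments:
        segment = segment.strip()
        if not segment:
            continue
        if pieces and first_word(segment) in CONJUNCTIONS:
            # Continuation: the pending mark is not used yet; join with a comma.
            pieces.append(", " + segment)
            continue
        if pieces:
            # New sentence: commit the pending mark of the previous one.
            pieces.append(pending + " ")
        pieces.append(segment)
        pending = "?" if first_word(segment) in WH_WORDS else "."
    return "".join(pieces) + pending if pieces else ""
```

For example, the segments "technology solves problems" and "and in turn creates problems" are joined with a comma, while a following segment that starts with "what" begins a new sentence that ends with a question mark.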
The proposed method summarizes the extracted text according to the rank of the sentences, which is determined from the frequency of occurrence of words. The sentence tokenize and word tokenize techniques from the Python NLTK package are used to find the frequency of words. When the text is extracted from the input using the Google API, the sentences and words in the text are obtained using sentence tokenize and word tokenize respectively.
The input given by the user as speech is converted to a signal, and the signal is converted to text format with the help of the Google API. To process the generated text with the proposed model, word tokenize and sentence tokenize are used. The complete text is given as input to sentence tokenize, which separates the sentences at each occurrence of a period. All the sentences are then given as input to word tokenize, which separates the words at each occurrence of a space.
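
The two tokenization steps can be illustrated with a short standard-library sketch. The paper uses the sentence and word tokenizers from NLTK; the functions below are simplified stand-ins written with regular expressions so that the example runs without downloading NLTK data, and their names (`sentence_tokenize`, `word_tokenize`) are chosen here for illustration only.

```python
import re


def sentence_tokenize(text):
    """Split text into sentences after each '.' or '?'."""
    parts = re.split(r"(?<=[.?])\s+", text.strip())
    return [part for part in parts if part]


def word_tokenize(sentence):
    """Split a sentence into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


text = "Speech is a signal. What is a summary? It is short."
sentences = sentence_tokenize(text)
words = [word_tokenize(sentence) for sentence in sentences]
```

This also shows why the appended period and question mark matter: without them the whole transcript would come back as a single sentence.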
When properly formatted text is used for summarization, it is less complex to process, as it is already in the required format and is often precise and clear. This is not the case when speech is taken as the input: the speech first has to be converted to text and then summarized. The problems to be tackled here include repeated words, broken words, different dialects and synonyms used to convey the message. To overcome such problems, the words with less importance are eliminated. For this, a minimum and a maximum limit are set on the occurrence of any specific word. Even though sentence and word frequencies are used, a ranking model is applied to find the important sentences in the whole content.
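
A minimal sketch of the frequency limits is given below. The cut-off values 0.1 and 0.9 are the ones named in the algorithm of Section 3.2; normalizing each count by the highest count, so that frequencies fall between 0 and 1, is an assumption made here because the normalization is not spelled out in the text. The function name `filtered_frequencies` is hypothetical.

```python
from collections import Counter


def filtered_frequencies(words, min_cut=0.1, max_cut=0.9):
    """Return normalized word frequencies that lie within the cut-offs.

    Assumption: a word's frequency is its count divided by the highest
    count in the text. Words above max_cut (very common) or below
    min_cut (very rare) are dropped.
    """
    counts = Counter(words)
    if not counts:
        return {}
    highest = max(counts.values())
    normalized = {word: count / highest for word, count in counts.items()}
    return {word: freq for word, freq in normalized.items()
            if min_cut <= freq <= max_cut}


sample = ["the"] * 20 + ["speech"] * 10 + ["text"] * 5 + ["um"]
kept = filtered_frequencies(sample)
```

In this sample the dominant word "the" and the one-off filler "um" are both removed, leaving only the mid-frequency content words.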
Finally, after the words are tokenized, the frequency of every word is calculated by the summarization algorithm in the proposed model. The weight of each sentence is computed from the frequencies of its words. The sentence indices are ranked according to sentence weight, and the sentences identified by the top-ranked indices form the summary. Python's nlargest function is used to rank the sentences by weight, so the text is summarized based on the weight of the sentences. The flow chart of the implementation procedure of the proposed model is shown in Figure 2.
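
The ranking step can be sketched as follows, using `heapq.nlargest` from the Python standard library. This is an illustration under stated assumptions rather than the authors' implementation: the sentence weight is taken to be the plain sum of the frequencies of its words, the word frequencies are supplied as a dictionary, and the function name `rank_sentences` is hypothetical.

```python
import re
from heapq import nlargest


def rank_sentences(sentences, frequencies, n=1):
    """Return the n highest-weighted sentences, best first.

    Assumption: the weight of a sentence is the sum of the frequencies
    of its words; words missing from `frequencies` contribute nothing.
    """
    weights = {}
    for index, sentence in enumerate(sentences):
        words = re.findall(r"[a-z0-9']+", sentence.lower())
        weights[index] = sum(frequencies.get(word, 0.0) for word in words)
    top_indices = nlargest(n, weights, key=weights.get)
    return [sentences[index] for index in top_indices]


example_sentences = [
    "Speech is converted to text.",
    "The weather was nice.",
    "Text is summarized from speech text.",
]
example_frequencies = {"speech": 0.5, "text": 0.5, "summarized": 0.25}
summary = rank_sentences(example_sentences, example_frequencies, n=2)
```

The parameter `n` corresponds to the number of summary lines requested by the user.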

Figure 2. Flow chart of the proposed model.

3.2. Proposed procedure
The algorithm for the proposed method is given below.
Step 1 - START
Step 2 - declare microphone as a source
Step 3 - declare three lists audioRecorded, textFormatOfRecord, temporaryList
Step 4 - while sentence != "exit exit"
             audioRecorded = listen(source)
             extractedText = recognize_google(audioRecorded)
             if (pause && next sentence of extractedText starts with subject)
                 sentence = "." + sentence
             else if (pause && next sentence of extractedText starts with conjunction)
                 sentence = "," + sentence
         end while
         [The while loop is exited with the text "exit exit"]
Step 5 - declare webSpoken as file
         webSpoken = sentence
Step 6 - declare two lists sentences, words
Step 7 - sentences = sent_tokenize(webSpoken)
         words = word_tokenize(sentences)
         compute_frequencies(words)
Step 8 - initialize minCut = 0.1 and maxCut = 0.9
         if word frequency > maxCut or word frequency < minCut
             remove the word
         while ranking index !(sorted)
             ranking = nlargest(sorted list of sentences)
Step 9 - print the sentences in the order of ranking
Step 10 - STOP
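
The capture loop of Step 4 can be sketched in Python as below. The loop itself takes the listening and recognition functions as parameters so that it can be exercised without audio hardware; `capture_until_exit` and `microphone_session` are hypothetical names. The second function shows one possible wiring to the SpeechRecognition package, whose `listen` and `recognize_google` calls match the names used in the algorithm; that this is the package the authors used, and the choice of `pause_threshold = 2.0` to mirror the 2e+6 µs pause, are assumptions.

```python
def capture_until_exit(listen, recognize, stop_phrase="exit exit"):
    """Collect recognized segments until the stop phrase is spoken."""
    segments = []
    while True:
        audio = listen()
        text = recognize(audio).strip()
        if text.lower() == stop_phrase:
            break
        if text:
            segments.append(text)
    return segments


def microphone_session():
    """Run the loop on a real microphone.

    Assumes the third-party SpeechRecognition package (and a working
    microphone backend) is installed; imported lazily for that reason.
    """
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    # Seconds of silence after which a phrase is considered complete.
    recognizer.pause_threshold = 2.0
    with sr.Microphone() as source:
        return capture_until_exit(
            lambda: recognizer.listen(source),
            recognizer.recognize_google,
        )


# Dry run with stand-ins for the microphone and the recognizer.
fake_audio = iter(["hello world", "", "Exit Exit", "never reached"])
collected = capture_until_exit(lambda: next(fake_audio), lambda audio: audio)
```

The returned list of segments is the natural input for the punctuation and summarization steps that follow in Steps 5 to 9.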

4. RESULTS
The recorded speech can be converted to text with the help of the Google API. It is difficult to separate the text generated by the Google API into sentences, because the extracted text does not contain a period (.). To make the sentences distinct, the proposed model appends a period at the end of a sentence when there is a pause. If the sentence is a wh-sentence, a question mark (?) is appended to the end of the sentence. If there is a pause and the next sentence begins with a conjunction, a comma (,) is appended to the end of the sentence instead. This makes it easier to tokenize the sentences, as Python string tokenization uses the period to differentiate sentences. The proposed model thus considers the punctuation marks '.', ',' and '?' in the recognized text.
Recognition with the proposed model is faster than recognition with the basic model (sentences without '.', ',' and '?'). The basic model summarizes the recognized text without any pre-processing. In the proposed approach, pre-processing is used to add a period (.) at the end of each sentence to indicate its termination. In Python sentence tokenization, sentences are tokenized based on the presence of a period. Though many punctuation marks can be included in a sentence, the focus in the proposed model is only on the period and the question mark. Table 1 shows the time taken to recognize sentences with and without the period and question mark. Based on the recognition time, the sentences that include a period or question mark are recognized faster than the sentences without them.

Table 1. Recognition time for sentences with and without (.) and (?)
Recognized sentence | Recognition time for sentence without appending (.) or (?) (in µs) | Recognition time for sentence appending (.) or (?) (in µs)
Technology solves problems and in turn creates problems | 3 6 0 9 9 6 1 8 3 9 8 1
Standards are always out of state that is what makes them standard | 9 8 2 5 5 4 5 2 1 0 7
The greatest enemy of knowledge is not ignorance it is the illusion of knowledge | 2 4 0 1 7 8 1 1 0 3 6 7
We have to stop optimizing for programmers and start optimizing for users | 3 9 6 6 5 8 7 8 1 3 4
Low level programming is not that easy for beginners | 6 6 1 4 1 1 0 8 3 2 6

The graph of the time taken to recognize sentences with and without a period (.) is shown in Figure 3. The graph plots the number of sentences against the recognition times listed in Table 1. The blue line represents the recognition time for text without a period, and the orange line represents recognition with a period. The recognition time is computed as the difference between the end time and the start time of the recognition process. The time library in Python is used to record the time. The difference can be computed using (1).

$$t_{\text{recognition}} = t_{\text{end}} - t_{\text{start}} \qquad (1)$$
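
The measurement in (1) can be sketched with Python's `time` module. The paper states only that the time library is used; the choice of `time.perf_counter`, the conversion to microseconds and the helper name `timed` are assumptions made for this illustration.

```python
import time


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed time in microseconds)."""
    start = time.perf_counter()          # start time
    result = func(*args, **kwargs)
    end = time.perf_counter()            # end time
    return result, (end - start) * 1_000_000


total, elapsed_us = timed(sum, [1, 2, 3])
```

Wrapping the recognition call and the summarization call in such a helper yields times of the kind reported in Tables 1 and 2.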

Table 2 shows the time taken by the Gensim library and by the proposed method to summarize text. To validate the text summarization, continuous speech, documents of fifteen pages and various websites were given as input data for evaluating the performance. According to the data in the table, the proposed method takes comparatively less time than the Gensim library to summarize the same number of lines.

Figure 3. Comparison of speech recognition with dot and without dot

Table 2. Summarization time using Gensim library and proposed method
Number of lines | Summarized lines | Summarization time for Gensim library (in µs) | Summarization time for proposed method (in µs)
10 | 1 | 163680 | 6190
5 | 1 | 2336 | 1886
8 | 1 | 3675 | 3640
5 | 1 | 873887 | 3279
6 | 1 | 5576 | 3279

Compared with Gensim summarization, the proposed algorithm consumes less time to produce the result. In Gensim summarization, even when the document is very large, the result of the algorithm is mostly a single line. The proposed algorithm can summarize the whole document into a number of lines according to the user's requirement. The frequency of words, which is used to rank the sentences, is computed using (2).
(2)
By applying this formula, the frequency of a single word can be found. From this, the index of the sentence can be found by the ranking method. In Figure 4, the blue line stands for the Gensim library and the orange line stands for the proposed approach. The x-axis shows the number of lines given as input, as listed in Table 2, and the y-axis shows the time taken. The time consumed and the performance of the proposed approach are consistent across different inputs.

Figure 4. Comparison of summarization time for Gensim library and proposed method

5. CONCLUSION
Speech recognition and text summarization are two vast areas to be explored. The proposed research work aims to reduce the time and effort of manually documenting lengthy speeches at an event. Speech recognition and text summarization can ease the work of documentation. Even for verifying the summarized content, the system can be automated to read out the summary with the help of text to speech conversion. So far, speech summarization has been experimented with for sentences terminating with a full stop or containing a small pause indicated by a comma. Future work is to include all punctuation marks in the recognized speech, which would help improve text summarization performance. This model can be used wherever there is a requirement to summarize lengthy lectures into precise documents, as the automated system will convert the speech to text and also summarize the content. It can be of great help for students to archive lecture notes from classes, conferences or seminars.

REFERENCES
[1] Jose D. V., Alfateh Mustafa, Sharan R., "A Novel Model for Speech to Text Conversion," International Refereed Journal of Engineering and Science (IRJES), vol. 3, no. 1, 2014.
[2] K. M. Shivakumar, V. V. Jain and P. K. Priya, "A study on impact of language model in improving the accuracy of speech to text conversion system," 2017 International Conference on Communication and Signal Processing (ICCSP), Chennai, pp. 1148-1151, 2017.
[3] Y. H. Ghadage and S. D. Shelke, "Speech to text conversion for multilingual languages," 2016 International Conference on Communication and Signal Processing (ICCSP), Melmaruvathur, pp. 0236-0240, 2016.
[4] Umar Nasib Abdullah, Kabir Humayun, Ahmed Ruhan, Uddin Jia., "A Real Time Speech to Text Conversion Technique for Bengali Language," 2018 International Conference on Computer, Communication, Chemical, Material and Electronic Engineering (IC4ME2), pp. 1-4, 2018.
[5] L. Wan, "Extraction Algorithm of English Text Summarization for English Teaching," 2018 International Conference on Intelligent Transportation, Big Data & Smart City (ICITBS), Xiamen, China, pp. 307-310, 2018.
[6] Saiyed S., Sajja P. S., "Review on text summarization evaluation methods," Indian Journal of Computer Science and Engineering, vol. 8, no. 4, pp. 497, 2017.
[7] R. Alguliyev, R. Aliguliyev and N. Isazade, "A sentence selection model and HLO algorithm for extractive text summarization," 2016 IEEE 10th International Conference on Application of Information and Communication Technologies (AICT), Baku, pp. 1-4, 2016.
[8] Jain D. Bhatia and M. K. Thakur, "Extractive Text Summarization Using Word Vector Embedding," 2017 International Conference on Machine Learning and Data Science (MLDS), Noida, pp. 51-55, 2017.
[9] K. Vythelingum, Y. Estève and O. Rosee, "Error detection of grapheme-to-phoneme conversion in text-to-speech synthesis using speech signal and lexical context," 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Okinawa, pp. 692-697, 2017.
[10] J. Zenkert, A. Klahold and M. Fathi, "Towards Extractive Text Summarization Using Multidimensional Knowledge Representation," 2018 IEEE International Conference on Electro/Information Technology (EIT), Rochester, MI, pp. 0826-0831, 2018.
[11] C. Lakshmi Devasena and M. Hemalatha, "Automatic Text categorization and summarization using rule reduction," IEEE-International Conference on Advances in Engineering, Science and Management (ICAESM-2012), Nagapattinam, Tamil Nadu, pp. 594-598, 2012.
[12] S. R. Rahimi, A. T. Mozhdehi and M. Abdolahi, "An overview on extractive text summarization," 2017 IEEE 4th International Conference on Knowledge-Based Engineering and Innovation (KBEI), Tehran, pp. 0054-0062, 2017.
[13] V. Dalal and L. Malik, "A Survey of Extractive and Abstractive Text Summarization Techniques," 2013 6th International Conference on Emerging Trends in Engineering and Technology, Nagpur, pp. 109-110, 2013.
[14] T. Jo, "K nearest neighbor for text summarization using feature similarity," 2017 International Conference on Communication, Control, Computing and Electronics Engineering (ICCCCEE), Khartoum, pp. 1-5, 2017.
[15] T. Tran and D. T. Nguyen, "Text Generation from Abstract Semantic Representation for Summarizing Vietnamese Paragraphs Having Co-references," 2018 5th NAFOSTED Conference on Information and Computer Science (NICS), Ho Chi Minh, Vietnam, pp. 93-98, 2018.
[16] A. Vimalaksha, S. Vinay, A. Prekash and N. S. Kumar, "Automated Summarization of Lecture Videos," 2018 IEEE Tenth International Conference on Technology for Education (T4E), Chennai, India, pp. 126-129, 2018.
[17] S. Biswas, R. Rautray, R. Dash and R. Dash, "Text Summarization: A Review," 2018 2nd International Conference on Data Science and Business Analytics (ICDSBA), Changsha, China, pp. 231-235, 2018.
[18] Matsubayashi A. Yamashita H. Nonaka and Y. Konno, "A Research on Document Summarization and Presentation System Based on Feature Word Extraction from Stored Informations," 2018 Conference on Technologies and Applications of Artificial Intelligence (TAAI), Taichung, Taiwan, pp. 60-63, 2018.