IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 14, No. 5, October 2025, pp. 3599~3612
ISSN: 2252-8938, DOI: 10.11591/ijai.v14.i5.pp3599-3612
Journal homepage: http://ijai.iaescore.com

Automatic essay scoring: leveraging Jaccard coefficient and Cosine similarity with n-gram variation in vector space model approach

Andharini Dwi Cahyani1, Moh. Wildan Fathoni1, Fika Hastarita Rachman1, Ari Basuki2, Salman Amin3, Bain Khusnul Khotimah1
1Department of Informatics Engineering, Faculty of Engineering, Universitas Trunojoyo Madura, Bangkalan, Indonesia
2Department of Industrial Engineering, Faculty of Engineering, Universitas Trunojoyo Madura, Bangkalan, Indonesia
3School of Media and Communication Studies, Minhaj University, Lahore, Pakistan

Article Info

Article history:
Received Jun 9, 2024
Revised Jul 2, 2025
Accepted Aug 6, 2025

ABSTRACT

Automated essay scoring (AES) is a vital area of research that aims to provide efficient and accurate assessment tools for evaluating written content. This study investigates the effectiveness of two popular similarity metrics, the Jaccard coefficient and Cosine similarity, within the context of vector space models (VSM) employing unigram, bigram, and trigram representations. The data used in this research were obtained from the formative essays of the citizenship education subject in a junior high school. Each essay undergoes preprocessing to extract features using n-gram models, followed by vectorization to transform text data into numerical representations. Then, similarity scores are computed between essays using both the Jaccard coefficient and Cosine similarity. The performance of the system is evaluated by analyzing the root mean square error (RMSE), which measures the difference between the scores given by human graders and those generated by the system. The results show that Cosine similarity outperformed the Jaccard coefficient. In terms of n-grams, unigrams have a lower RMSE than bigrams and trigrams.

Keywords:
Automated essay scoring
Cosine similarity
Jaccard coefficient
N-gram variation
Vector space model

This is an open access article under the CC BY-SA license.

Corresponding Author:
Andharini Dwi Cahyani
Department of Informatics Engineering, Faculty of Engineering, Universitas Trunojoyo Madura
Bangkalan, Indonesia
Email: andharini.cahyani@trunojoyo.ac.id

1. INTRODUCTION

Conventional essay tests give pupils a chance to demonstrate their intellectual diversity by presenting original ideas and points of view. Proficiency in written communication is an essential competency in both academic and professional contexts [1]. Students can show that they can analyse difficult situations, formulate arguments, and suggest answers by writing essays. Essay writing evaluates their ability to answer problems, which is crucial in many real-world situations. Essays assess a student's ability for persuasive and lucid idea expression [2]. Organisation, coherence, and clarity are all essential communication skills in a variety of academic and professional settings. Because essay evaluations are subjective, teachers are able to take into account students' individual writing preferences, viewpoints, and inventiveness. This adaptability is useful for assessing a range of answers. Nevertheless, despite its benefits, manual essay test assessment poses serious difficulties for teachers [3]. Researchers are increasingly addressing biases and ethical issues in manual essay scoring, working to detect and eliminate unfairness. Automated essay scoring (AES) systems have emerged as a
game-changing solution, offering a more efficient and objective assessment method [4]–[6]. The limitations of manual grading may be addressed with an AES system, which provides a quick and efficient way to assess students' written work. The subject of AES research is vigorous, examining a range of methods, strategies, and technological improvements for automatically grading and scoring essays [7]. There is a rising interest in making AES models more comprehensible and explainable. Understanding how models arrive at specific ratings is critical for increasing trust in automated systems, particularly in educational settings.
AES research has benefited greatly from the application of natural language processing (NLP) and text mining techniques [8]. Essays can yield valuable information through the use of sentiment analysis, syntactic analysis, document resemblance, and semantic analysis [9], [10]. These methods aid in comprehending the text's sentiment, content, similarity, and organisation.
Andersen et al. [11] proposed an AES framework for assessing Danish writing proficiency in terms of text structure, sentence form, and modifier usage. They explored NLP and machine learning approaches to solve the problem. The research methodology employed in their study was mainly based on the analytical framework suggested by Kabel et al. [12] for analysing early writing. Within this architecture, each text goes through two phases: statistical Rasch modelling for scoring, and annotation by a human expert following a predefined classification scheme. They carried out experiments to compare and assess the two approaches. Their results show that the scores generated by the automatic technique and the ones established by human experts have a strong and statistically significant correlation.
Süzen et al. [13] explored automatic grading of short answers and the provision of insightful feedback using a dataset from the University of North Texas's Introductory Computer Science course. They applied the vector space model (VSM) to measure the similarity between student responses and model answers based on commonly used terms. They analyzed the correlation between these similarities and the scorers' assigned grades. Additionally, they used the k-means clustering method to group student responses, assigning the same score and identical feedback to answers within each cluster. The clusters represented groups of students with similar performance, determined by comparing terms in student answers with those in the model answer.
According to the research cited above, text mining techniques can be used to develop objective scoring standards based on quantifiable linguistic elements collected from essays. Essays and other textual data can be analysed, and their content understood, using text mining approaches that utilize sophisticated NLP algorithms [14], [15]. This makes it possible for the system to comprehend, parse, and extract relevant elements from the essays, including sentence structure, semantic significance, coherence, and language usage [16], [17].
This study contributes to the field of AES by evaluating Cosine similarity and Jaccard coefficient metrics with various n-gram variations (unigrams, bigrams, trigrams) to improve essay scoring accuracy. It explores semantic similarity between student essays and model answers while analyzing how n-grams capture both individual words and word sequences for better context. The research identifies which metric and n-gram combination best correlates with human scoring.
In relation to AES, the VSM has been extensively researched. With VSM, essay documents are modelled as vectors in a high-dimensional space [18], [19]. Every dimension is associated with a distinct term (word or n-gram), and the vector's values reveal the significance or occurrence of those terms in the essay documents (refer to Figure 1).

Figure 1. Illustration of document similarity in a vector space model

When creating these vectors, we usually utilize vectorization techniques, such as bag-of-words (BoW) [20], [21] and term frequency-inverse document frequency (TF-IDF) [22], [23]. Essays can be more
nuancedly analysed and compared by using NLP techniques that capture the semantic relationships between words and phrases and transform them into vector representations using the vector matrix [19], [24]–[26]. The VSM is commonly used for transforming textual data into numerical forms. VSM allows for the comparison of articles based on the similarity of their vector representations [27]–[29]. Because of its simplicity, VSM may be used with a wide range of machine learning models, providing flexibility in the selection of algorithms [30], [31].
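
To make the vector space idea concrete, the following minimal sketch builds bag-of-words count vectors over a shared vocabulary. It is only an illustration under stated assumptions: the function name `bow_vectors` and the toy token lists are hypothetical and do not come from the study.

```python
from collections import Counter


def bow_vectors(docs):
    """Map tokenized documents onto count vectors over one shared vocabulary.

    docs: list of token lists (hypothetical toy input, not the study's data).
    Returns (vocabulary, vectors); every vector has one entry per vocabulary term.
    """
    # One dimension per distinct term, in a fixed (sorted) order.
    vocabulary = sorted({token for doc in docs for token in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)  # a missing term counts as 0
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


vocab, vecs = bow_vectors([["law", "state", "law"], ["law", "citizen"]])
print(vocab)  # ['citizen', 'law', 'state']
print(vecs)   # [[0, 2, 1], [1, 1, 0]]
```

Because both documents are expressed in the same dimensions, they can be compared directly with a vector similarity measure.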

2. RESEARCH METHOD

VSM is a mathematical model used to describe documents in vector form [32], [33]. Each document is represented as a vector in the same dimensional space, with the number of dimensions equivalent to the number of words [34], [35]. Our research focuses on the usage of VSM for AES, which allows for the extraction of meaningful features from essays, capturing the relationships between words and their importance in the context of scoring. The TF-IDF weighting helps in emphasizing words that are important in a specific essay while downweighting common terms [36]–[38].
Once the essays are represented as TF-IDF vectors, Cosine similarity is commonly used to measure the similarity between essays. Cosine similarity calculates the cosine of the angle between two vectors and, in general, ranges from -1 (completely dissimilar) to 1 (completely similar) [39], [40]; for vectors with non-negative weights, such as TF-IDF vectors, it lies between 0 and 1. In some cases, Jaccard similarity may be used, especially if the focus is on the binary presence or absence of terms rather than their frequency [41], [42]. Figure 2 shows the process that is carried out in our study.

Figure 2. Research method

2.1. Data collection

The study collected data from 30 eighth-grade students at Junior High School Asa Cendekia Sidoarjo, specifically from the citizenship education subject. For formative assessment, the teacher gave the students 5 essay questions to assess their understanding of the material, resulting in a total of 150 essay responses in the dataset. The test was conducted on paper, and the students' answers were then converted into Excel format for further analysis.

2.2. Preprocessing

Text preprocessing is a crucial step in text mining, transforming raw text into an analyzable format. Its main goals are to enhance data quality, reduce noise, and extract meaningful information [43]–[45]. Here are the specific steps used in this study.
− Text cleaning: the goal of text cleaning is to improve the data quality by removing irrelevant or noise elements [46]. The process of text cleaning might vary based on the nature of the data and the needs of the research [47]. In our study, we used a text cleaning step to get rid of multiple spaces, punctuation marks, and non-alphabetic characters. This aids in keeping the dataset more structured.
− Case folding: in this step, all characters, both uppercase and lowercase letters, are converted to lowercase [48]. The goal of case folding is to standardize the text data, making it easier to compare, search, and analyze [49]. This process is important in NLP tasks where case sensitivity is usually not required and could lead to unnecessary complexity.
− Tokenization: in this phase, the document is divided into smaller parts called tokens. Tokenization is mainly used to break up continuous text into readily processed discrete parts, which are words [50], [51]. NLP analysis is built on tokens [52].
− Normalization: this step includes replacing slang and misspelled words with formal words taken from a dictionary. We also expand contractions and abbreviations to ensure consistency in text representation. The goal of this step is to transform text data into a more consistent format [53].
− Stopword removal: stopwords are words that often contribute little to the meaning of a text [54], [55]. Stopwords are common in most documents, and their high frequency can lead to unnecessary computation time during text analysis [56]. Removing stopwords can reduce the computational difficulty, focus the analysis on more meaningful content [57], and speed up text processing.

To sum up, text preprocessing in our study refers to preparing raw text data for analysis. The steps in text preprocessing depend on the goals of the text mining task and the nature of the dataset [58]–[60]. Implementing these stages effectively helps to prepare the text data for subsequent analysis, lower the computational complexity, and make it more suitable for machine learning models and other text mining techniques.
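
The five preprocessing steps listed above can be sketched as a single function. This is a minimal illustration, not the study's implementation: the `SLANG` dictionary, the `STOPWORDS` set, and the English example sentence are hypothetical placeholders, since the actual dictionary and stopword list are not given in the text.

```python
import re

# Hypothetical resources for illustration only.
SLANG = {"govt": "government", "ppl": "people"}
STOPWORDS = {"the", "of", "and", "a", "is", "are", "by", "for"}


def preprocess(text):
    """Apply cleaning, case folding, tokenization, normalization, stopword removal."""
    # Text cleaning: replace non-alphabetic characters, then collapse repeated spaces.
    cleaned = re.sub(r"[^A-Za-z\s]", " ", text)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Case folding: convert everything to lowercase.
    folded = cleaned.lower()
    # Tokenization: split on whitespace into word tokens.
    tokens = folded.split()
    # Normalization: replace slang/abbreviations; a replacement may span several tokens.
    normalized = []
    for token in tokens:
        normalized.extend(SLANG.get(token, token).split())
    # Stopword removal: drop tokens found in the stopword set.
    return [token for token in normalized if token not in STOPWORDS]


print(preprocess("The Govt  is chosen by the ppl!!"))
# ['government', 'chosen', 'people']
```

The order of the steps follows the list above; in practice the order and the resources would be adapted to the language and dataset at hand.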

2.3. N-gram variation

NLP tasks normally use n-grams to identify and comprehend patterns in textual documents [61]. NLP applications that use n-grams include language modelling, machine translation, document similarity comparison, and text generation [62], [63]. N-grams are essential to essay scoring, which aims to measure the similarity between students' answers and the model answer. N-gram models aid in capturing the links between words in a sequence and the surrounding information [64]. During preprocessing, the text is tokenized to separate it into individual words or tokens before producing n-grams [65], [66].
In our study, n-grams are employed as features to represent the content and structure of the text. The VSM is a mathematical approach that transforms each essay document into a vector in a high-dimensional space, where each dimension represents a unique feature [67]. Each unique n-gram becomes a feature in the VSM. The presence or absence of these features is then used to represent the essay [68], [69].
The n-gram formation process converts an essay document into a high-dimensional vector, where each dimension corresponds to the occurrence or absence of a specific n-gram. N-gram feature extraction captures word sequences from a text document to describe its linguistic structure. Figure 3 illustrates how the feature space expands exponentially with an increase in "n" in n-grams, which can result in larger dimensionality and more computing complexity. Which n-gram size to use depends on the particular task at hand as well as the properties of the data being examined. Various n-gram sizes may be appropriate for different tasks.

Figure 3. N-gram tokenization model

Unigrams are often used for basic text analysis tasks and initial feature extraction. Bigrams capture some level of local context and are useful for tasks like sentiment analysis. Trigrams provide a bit more context and are employed in tasks where understanding the relationships between three consecutive words is important.
‒ Unigram (1-gram): a unigram is a single word, representing the simplest form of n-gram. In unigram feature extraction, each word in the document is treated as a separate feature.
‒ Bigram (2-gram): a bigram is a sequence of two adjacent words. In bigram feature extraction, pairs of consecutive words are considered as features.
‒ Trigram (3-gram): a trigram is a sequence of three adjacent words. In trigram feature extraction, triplets of consecutive words are treated as features.
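
The three n-gram sizes described above can be produced by one small helper. This is a sketch under stated assumptions: the function name `ngrams` and the example tokens are hypothetical, and n-grams are represented as space-joined strings purely for readability.

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as space-joined strings) of a token list."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Slide a window of width n over the tokens; too-short inputs yield [].
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = ["citizens", "obey", "the", "law"]
print(ngrams(tokens, 1))  # ['citizens', 'obey', 'the', 'law']
print(ngrams(tokens, 2))  # ['citizens obey', 'obey the', 'the law']
print(ngrams(tokens, 3))  # ['citizens obey the', 'obey the law']
```

A document of m tokens yields m - n + 1 n-grams, so higher n produces fewer but more specific features per essay.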

2.4. Vectorization

Vectorization in NLP converts text data into numerical representations for machine learning algorithms. A common method is TF-IDF, which assigns weights to words based on their frequency in a document (TF) and rarity across a corpus (IDF) [70]. This method reflects the significance of terms not just within a document, but across an entire collection [71]. TF-IDF applies to unigrams, bigrams, and trigrams, with context and meaning increasing with higher n-grams. Unigrams focus on individual words, bigrams on word pairs, and trigrams on more complex patterns.

TF: quantifies how often a term appears in a document [72]. It is calculated by dividing the number of times a term occurs by the document's total number of terms, as shown in (1), where $f(t, A)$ represents the number of times term $t$ appears in document $A$, and $\sum_{t'} f(t', A)$ denotes the total count of all terms in document $A$.

IDF: assesses a term's importance across a document collection [73]. It is calculated by taking the logarithm of the ratio of the total number of documents to the number of documents containing the term, as shown in (2), where $|D|$ represents the total number of documents in the corpus and $df(t)$ denotes the number of documents that contain the term $t$.

TF-IDF calculation: is obtained by multiplying the TF and the IDF for each term in the essay [74]. The formula to calculate TF-IDF is written as in (3). The resulting TF-IDF scores create a weighted representation of terms in the essay, emphasizing terms that are both frequent in the document and rare in the overall corpus.

$$TF(t, A) = \frac{f(t, A)}{\sum_{t'} f(t', A)} \tag{1}$$

$$IDF(t, D) = \log\frac{|D|}{df(t)} \tag{2}$$

$$TF\text{-}IDF(t, A, D) = TF(t, A) \times IDF(t, D) \tag{3}$$
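
A direct, minimal translation of (1) to (3) into code is sketched below. The function names, the toy corpus, and the use of the natural logarithm are assumptions made for illustration; the text does not state the logarithm base or any smoothing.

```python
import math
from collections import Counter


def term_frequency(doc_tokens):
    """Eq. (1): occurrences of each term divided by the document's total term count."""
    counts = Counter(doc_tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def inverse_document_frequency(corpus):
    """Eq. (2): log(|D| / df(t)); natural log is an assumption of this sketch."""
    n_docs = len(corpus)
    doc_freq = Counter(term for doc in corpus for term in set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tfidf_vectors(corpus):
    """Eq. (3): TF multiplied by IDF, one sparse dict per document."""
    idf = inverse_document_frequency(corpus)
    return [
        {term: tf * idf[term] for term, tf in term_frequency(doc).items()}
        for doc in corpus
    ]


corpus = [["law", "state", "law"], ["law", "citizen"]]
vectors = tfidf_vectors(corpus)
print(vectors[0]["law"])    # 0.0, because "law" occurs in every document
print(vectors[0]["state"])  # (1/3) * ln(2)
```

Note that without smoothing, a term that appears in every document receives weight zero, which matters when the corpus is as small as one student answer and one model answer.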

2.5. Similarity metric

Before scoring student essays, a set of reference answers is created to represent exemplary responses. A similarity metric is then applied to compare student essays with these references, generating a score that reflects their alignment. The choice of metric depends on the task and data characteristics, as text similarity metrics are widely used in NLP for various applications [75]. These metrics are interpretable, scalable for large datasets, and require careful selection based on task-specific needs [76]–[78]. Jaccard and Cosine similarity are both text similarity measurement methods but serve different purposes [39], [41]. Jaccard similarity measures the proportion of shared terms between two documents relative to their total unique terms, making it useful for assessing text overlap and identifying near-duplicate responses. In contrast, Cosine similarity evaluates the angle between document vector representations in a high-dimensional space, making it effective for capturing semantic similarity even when documents have different lengths.

2.6. Cosine similarity

Cosine similarity is a similarity metric between two vectors in a dimensional space that measures the angle between the document vector and the query vector [79]. The vectors represent the document being compared and the words in the query. The Cosine similarity score varies continuously between 0 and 1 [34], where 0 indicates that the two documents are entirely dissimilar, while 1 signifies that they are perfectly aligned in direction, regardless of their length differences. The Cosine similarity formula between two vectors can be written as in (4), where $d_j$ is the vector of document $j$, $q$ is the vector of the query document, $W_{ij}$ is the weight of word $i$ in document $j$, $W_{iq}$ is the weight of word $i$ in $q$, and the sums run over the $t$ terms.

$$Sim(d_j, q) = \frac{d_j \cdot q}{|d_j| \cdot |q|} = \frac{\sum_{i=1}^{t} W_{iq} \cdot W_{ij}}{\sqrt{\sum_{i=1}^{t} (W_{iq})^2 \cdot \sum_{i=1}^{t} (W_{ij})^2}} \tag{4}$$
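
Equation (4) can be sketched for sparse term-weight vectors as follows. This is an illustrative implementation, not the study's code: the function name, the dict-based vector representation, and the convention of returning 0.0 when either vector has zero length are assumptions.

```python
import math


def cosine_similarity(query_weights, doc_weights):
    """Eq. (4) for sparse vectors stored as {term: weight} dicts."""
    # Dot product: only terms present in both vectors contribute.
    dot = sum(w * doc_weights.get(term, 0.0) for term, w in query_weights.items())
    norm_q = math.sqrt(sum(w * w for w in query_weights.values()))
    norm_d = math.sqrt(sum(w * w for w in doc_weights.values()))
    if norm_q == 0.0 or norm_d == 0.0:
        return 0.0  # assumed convention for an all-zero vector
    return dot / (norm_q * norm_d)


model_answer = {"law": 1.0, "state": 1.0}
student = {"law": 2.0, "state": 2.0}  # same direction, twice the length
print(round(cosine_similarity(model_answer, student), 6))           # 1.0
print(cosine_similarity(model_answer, {"citizen": 3.0}))            # 0.0
```

The first example shows the length invariance mentioned above: scaling every weight by the same factor leaves the similarity unchanged.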

2.7. Jaccard similarity

Jaccard similarity is a similarity metric that can be applied to various text data representations, making it suitable for tasks where the presence or absence of terms is crucial [80]. Jaccard provides a straightforward measure of similarity based on set operations, making it interpretable and easy to understand. TF-IDF takes into account not only the frequency of terms but also their importance in the context of the entire corpus. This allows Jaccard similarity to capture meaningful term overlaps. The Jaccard similarity coefficient is then calculated based on the TF-IDF vectors of two essays [81]. The Jaccard score ranges continuously between 0 and 1 [80]; 0 means no shared terms between the two documents, while 1 means both documents contain exactly the same terms. The Jaccard similarity between essays A and B is given in (5). In the context of TF-IDF, the "terms in common" refer to the set of terms that have non-zero TF-IDF values in both essays.

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|} \tag{5}$$
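
A minimal sketch of the Jaccard coefficient on TF-IDF vectors is given below, treating each essay as the set of terms with non-zero weight, as described above. The function name, the dict-based vectors, and returning 0.0 when both sets are empty are assumptions of this sketch.

```python
def jaccard_similarity(weights_a, weights_b):
    """Jaccard coefficient over the sets of terms with non-zero weight."""
    terms_a = {term for term, w in weights_a.items() if w != 0}
    terms_b = {term for term, w in weights_b.items() if w != 0}
    union = terms_a | terms_b
    if not union:
        return 0.0  # assumed convention when neither essay has any weighted term
    return len(terms_a & terms_b) / len(union)


essay_a = {"law": 0.3, "state": 0.1, "the": 0.0}  # "the" has zero weight, so it is ignored
essay_b = {"law": 0.2, "citizen": 0.4}
print(jaccard_similarity(essay_a, essay_b))  # 1 shared term / 3 unique terms
```

Unlike Cosine similarity, the magnitude of each weight plays no role here; only whether a term is present in each essay affects the score.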

2.8. Essay scoring system

The scoring system in this AES is based on a similarity metric that compares a student's essay with model answers obtained from the teacher [6]. The chosen similarity metric (e.g., Cosine similarity or Jaccard similarity) is applied to compare the vector representation of the student's essay with each teacher's essay, which results in a similarity score for each question. The individual similarity scores are multiplied by the weight of each question, and then aggregated to obtain an overall score for the student's essay.
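
The weighted aggregation described above can be sketched as follows. The question weights used here (20 points for each of five questions) are hypothetical; the text does not state the actual weights, and summation is assumed as the aggregation rule.

```python
def overall_score(similarities, weights):
    """Multiply each per-question similarity by its weight and sum the results."""
    if len(similarities) != len(weights):
        raise ValueError("one weight is required per question")
    return sum(s * w for s, w in zip(similarities, weights))


# Hypothetical example: five questions worth 20 points each.
per_question_similarity = [1.0, 0.5, 0.0, 0.5, 1.0]
print(overall_score(per_question_similarity, [20] * 5))  # 60.0
```

With similarities in the range 0 to 1, the overall score is bounded by the sum of the weights, which makes it directly comparable to a teacher's mark on the same scale.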

2.9. Testing with root mean square error

Root mean square error (RMSE), mean squared error (MSE), and mean absolute error (MAE) are all metrics used to evaluate the performance of a predictive model [82], [83]. The choice among these metrics depends on the problem characteristics. In this study, we employ RMSE to evaluate the proposed AES model because RMSE is more sensitive to large errors than MAE and, unlike MSE, is expressed in the same units as the scores. RMSE penalizes larger errors more heavily due to the squaring operation. This sensitivity can be advantageous where large errors are considered more critical and have a significant impact on the overall performance of the model [84]. RMSE is consistent with the standard deviation of the target variable [85]. It also allows for a direct comparison with the standard deviation, providing a sense of scale for the errors. This makes it easier to interpret the error relative to the variability of the data.

The RMSE is the final metric that quantifies the average magnitude of the errors made by the model in predicting the scores. A lower RMSE shows better performance, as it signifies that the predictions are closer to the actual scores. Conversely, a higher RMSE suggests larger discrepancies between predicted and actual scores. The RMSE formula is as in (6), where $Y_t$ is the score from the teacher for each student, $U_t$ is the aggregated score from the system, and $n$ is the total number of students.

$$RMSE = \sqrt{\frac{1}{n}\sum_{t=1}^{n}(Y_t - U_t)^2} \tag{6}$$
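
Equation (6) translates into a few lines of code. The sketch below uses hypothetical teacher and system scores for three students; the function name is likewise an assumption.

```python
import math


def rmse(teacher_scores, system_scores):
    """Eq. (6): root of the mean squared difference between paired scores."""
    if len(teacher_scores) != len(system_scores) or not teacher_scores:
        raise ValueError("need two non-empty score lists of equal length")
    squared_errors = [(y - u) ** 2 for y, u in zip(teacher_scores, system_scores)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical scores: errors of 3, -4, and 0 points.
print(rmse([80, 70, 90], [77, 74, 90]))  # sqrt(25 / 3), about 2.89
```

In this example the MAE would be 7/3, about 2.33; the RMSE is larger because the 4-point error is weighted more heavily than the 3-point one.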

3. RESULTS AND DISCUSSION

3.1. Results

This study used descriptive statistics and a one-way repeated measures ANOVA to test the null hypothesis that there is no statistically significant difference between the mean scores assigned by the AES VSM system and human teachers. Table 1 presents a comparison of these mean scores.
In the context of evaluating scoring methods such as AES VSM with Cosine similarity, AES VSM with Jaccard coefficient, and human grading, a one-way ANOVA helps to assess whether the observed differences in their average scores are due to true differences in the methods or merely a result of random variation. Using a one-way ANOVA is crucial in experimental analysis because it provides a rigorous method for testing whether the means of different groups differ significantly, while controlling for variability and reducing the risk of errors. This allows researchers to make confident, data-driven decisions about the effectiveness or reliability of different methods.
Table 2 presents the results of a one-way repeated measures ANOVA comparing AES with Cosine similarity to human grading. The F-value of 6.48 indicates that the differences in scores between the two methods are notable compared to the variability within each group. With a significance value (p = 0.025) below the threshold of 0.05, it is clear that the difference between AES with Cosine similarity and human grading is statistically significant.
Similarly, Table 3 highlights the results of a one-way repeated measures ANOVA comparing AES with Jaccard coefficient to human grading. The test shows a statistically significant difference (p = 0.031), meaning the two methods produce distinct scoring patterns. The large effect size (η² = 0.535) and low Wilks's lambda (Λ = 0.236) further underscore that the grading method strongly influences the scores. These results demonstrate that scores from AES with Jaccard coefficient differ significantly from those assigned by human grading.
In this study, we implement the VSM method for the AES system and investigate the usage of Cosine similarity and Jaccard similarity for unigram, bigram, and trigram. We compare the students' answers and the model answer to get the similarity scores. Those scores are then evaluated against the score from the teacher, yielding the RMSE score. The RMSE scores from our experiment are shown in Table 4. Lower RMSE values indicate better performance, as they indicate smaller errors between predicted and actual scores.
The lowest RMSE score across all testing scenarios in Table 4 is highlighted. For Cosine similarity, the minimum RMSE is 2.04, achieved with trigram. In contrast, for Jaccard coefficient, the lowest RMSE is 1.72, obtained using unigram. This indicates that the Cosine similarity performs better in this testing scenario, as it achieves a lower RMSE compared to Jaccard coefficient. Additionally, the performance seems influenced by the feature representation method (unigram, trigram), with trigram being more effective for Cosine similarity and Jaccard coefficient.
Table 1. Descriptive statistics

| Source | Means | Standard deviation | Coefficient of variation |
|---|---|---|---|
| AES VSM with Cosine similarity | 79 | 3.46 | 4.385 |
| AES VSM with Jaccard coefficient | 81 | 3.74 | 4.619 |
| Human grading | 78 | 2.45 | 3.140 |
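As a rough check on the last column of Table 1, the coefficient of variation can be recomputed from the reported means and standard deviations. This is a sketch under the assumption that the coefficient of variation is defined in the usual way, as 100 × standard deviation / mean; the paper does not state the formula, and the function name is hypothetical.

```python
def coefficient_of_variation(mean, std_dev):
    """Return the coefficient of variation as a percentage: 100 * SD / mean."""
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return 100.0 * std_dev / mean

# (mean, standard deviation) pairs as reported in Table 1.
table_1 = {
    "AES VSM with Cosine similarity": (79, 3.46),
    "AES VSM with Jaccard coefficient": (81, 3.74),
    "Human grading": (78, 2.45),
}

for source, (mean, std_dev) in table_1.items():
    print(f"{source}: {coefficient_of_variation(mean, std_dev):.2f}%")
```

The recomputed values agree with the reported ones to within about 0.01 percentage points; the small residual differences are presumably due to rounding of the published means and standard deviations.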
Table 2. One-way repeated measures ANOVA (AES Cosine similarity vs human grading)

| Source | F | Wilks's Λ | Sig | η² |
|---|---|---|---|---|
| Grading methods (AES with Cosine similarity vs human grading) | 6.128 | 0.312 | 0.025 | 0.572 |

Error df: 8.70
Table 3. One-way repeated measures ANOVA (AES Jaccard similarity vs human grading)

| Source | F | Wilks's Λ | Sig | η² |
|---|---|---|---|---|
| Grading methods (AES with Jaccard coefficient vs human grading) | 7.128 | 0.236 | 0.031 | 0.535 |

Error df: 9.00
Table 4. Comparison of RMSE value for Cosine similarity and Jaccard similarity

| Question | RMSE cosine, unigram | RMSE cosine, bigram | RMSE cosine, trigram | RMSE Jaccard, unigram | RMSE Jaccard, bigram | RMSE Jaccard, trigram |
|---|---|---|---|---|---|---|
| Question 1 | 2.33 | 2.06 | 2.04 | 2.72 | 2.91 | 2.3 |
| Question 2 | 2.89 | 6.09 | 8.32 | 3.4 | 5.89 | 7.95 |
| Question 3 | 3.2 | 4.56 | 5.31 | 3.62 | 4.92 | 5.94 |
| Question 4 | 5.09 | 3.1 | 4.05 | 2.73 | 4.35 | 5.46 |
| Question 5 | 2.38 | 3.79 | 6.76 | 3.38 | 6.33 | 8.43 |
Overall, based on Figure 4, Cosine similarity tends to perform better than Jaccard similarity, as indicated by the generally lower RMSE values across all n-gram representations and questions. This suggests that, based on the provided data, Cosine similarity might be providing a better fit to the actual scores.
From our analysis, the Cosine similarity considers the frequency of tokens (words, n-grams) in the essays. It considers both the presence and frequency of words in the vectors. However, Jaccard similarity considers only the presence or absence of tokens, without accounting for their frequency. Using our dataset, it seems that the frequency of specific words or n-grams is crucial for scoring essays. Therefore, Cosine similarity may better capture these nuances, leading to lower RMSE values.
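The difference between frequency-aware and presence-only comparison can be sketched as follows. This is an illustrative simplification, not the authors' implementation: it uses raw n-gram counts instead of the TF-IDF weighting used in the study, and the helper names and sample answers are hypothetical.

```python
import math
from collections import Counter

def ngrams(text, n):
    """Split text on whitespace and return its list of word n-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def cosine_similarity(grams_a, grams_b):
    """Cosine of the angle between two n-gram count vectors."""
    counts_a, counts_b = Counter(grams_a), Counter(grams_b)
    dot = sum(counts_a[g] * counts_b[g] for g in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def jaccard_coefficient(grams_a, grams_b):
    """Size of the intersection over size of the union of the n-gram sets."""
    set_a, set_b = set(grams_a), set(grams_b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)

# Hypothetical model answer and two student answers with the same vocabulary.
model_answer = "citizens have rights and duties"
student_a = "citizens have rights"
student_b = "citizens citizens citizens have rights"

for name, answer in [("student_a", student_a), ("student_b", student_b)]:
    cos = cosine_similarity(ngrams(model_answer, 1), ngrams(answer, 1))
    jac = jaccard_coefficient(ngrams(model_answer, 1), ngrams(answer, 1))
    print(f"{name}: cosine={cos:.3f}, jaccard={jac:.3f}")
```

Both student answers contain the same set of unigrams, so the Jaccard coefficient is identical for them, while the cosine value drops for the answer that repeats one word. Passing `2` or `3` as the second argument of `ngrams` gives the bigram and trigram variants.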
Figure 4. Comparison of RMSE in unigram, bigram, and trigram
3.2. Discussion
Our experiment shows that Cosine similarity performs better and has higher similarity to human grading compared to Jaccard coefficient. This result of our study is in line with that reported by
Wahyuningsih et al. [86]. Their research showed that the Cosine similarity performs similarly to the Dice coefficient method and better than the Jaccard coefficient method. Alobed et al. [41] also reported that the Cosine similarity has the lowest error compared with the Jaccard and Euclidean similarity in their automated Arabic essay scoring (AAES) application. Madatov and Sattarova [87] conducted an experiment to find the best-performing similarity metric among Cosine similarity, Jaccard similarity, and the combination of them. Their experiment shows that Cosine similarity outperformed the other two methods.
In general, Cosine similarity compares the direction of the vectors that represent the data points in a high-dimensional space, independent of their magnitude [88]. This means that even if the data is sparse and contains many zero values, Cosine similarity can still capture the similarity in direction between non-zero values, which is essential in high-dimensional spaces. Cosine similarity normalizes the vectors before computing the similarity, which mitigates the effect of varying magnitudes between data points. This normalization ensures that the similarity measure is not biased by the overall magnitude of the vectors, making it suitable for sparse data [89]. Moreover, in sparse data, where most of the values are zero (e.g., in text data represented as BoW or TF-IDF vectors), Jaccard similarity may not capture the similarity well because it only considers the presence or absence of non-zero values [90]. Cosine similarity, on the other hand, focuses on the angles between vectors and is less affected by the sparsity of the data. Since Cosine similarity focuses on the angles between vectors rather than the specific elements, it can handle high-dimensional, sparse data more effectively than Jaccard similarity.
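The normalization property described above can be checked with a small sketch. The vectors are hypothetical term-count vectors over a six-term vocabulary, chosen only for illustration; scaling an answer vector (for example, an essay that repeats the same content three times) changes its magnitude but not its angle to the model answer.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical sparse count vectors (illustrative values only).
model_answer = [1, 1, 1, 0, 0, 1]
short_answer = [1, 0, 2, 0, 0, 1]
long_answer = [3 * x for x in short_answer]  # same direction, three times the magnitude

print(cosine(short_answer, model_answer))
print(cosine(long_answer, model_answer))
print(cosine(short_answer, long_answer))
```

The first two printed values coincide up to floating-point rounding, and the third is 1 up to rounding, which is the sense in which the measure is not biased by overall vector magnitude.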
Our experiment in Figure 4 shows that the impact of n-gram size varies across questions in Cosine similarity. For some questions (question 4), increasing the n-gram size leads to an increase in RMSE, which indicates that a greater n-gram size is less accurate in capturing the desired similarities between texts. Similarly, Citawan et al. [91] reported that their research in AES using latent semantic analysis (LSA) shows that unigrams have higher accuracy compared to bigrams and trigrams. Their research implied that variations of n-gram size show a positive correlation in the AES system. Combining neighbouring words into bigrams or trigrams captures more complicated text patterns and sentences. Compared to unigrams, bigrams and trigrams have more information, which could increase the model's complexity. Bigrams and trigrams often lead to feature spaces with higher dimensions, which in turn result in a greater level of sparsity in the representation [92]. The presence of sparsity might pose difficulties in the modelling process and may necessitate a larger amount of data to achieve good generalisation. If the model has difficulty capturing significant patterns in the data, the higher dimensionality can lead to increasing RMSE values.
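The growth of the feature space with n-gram size can be seen even on a toy example. The sketch below is illustrative only: the two sentences are hypothetical, and `vocabulary_stats` is a helper introduced here, not part of the study's pipeline. It counts how many distinct n-grams two answers produce together and how many of them they share.

```python
def ngram_set(text, n):
    """Return the set of word n-grams (as tuples) found in text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def vocabulary_stats(doc_a, doc_b, n):
    """Return (number of distinct n-grams in either document, number shared by both)."""
    grams_a, grams_b = ngram_set(doc_a, n), ngram_set(doc_b, n)
    return len(grams_a | grams_b), len(grams_a & grams_b)

# Hypothetical model answer and student answer using almost the same words.
model_answer = "the government protects the rights of the citizens"
student_answer = "the citizens respect the rights of the government"

for n, label in [(1, "unigram"), (2, "bigram"), (3, "trigram")]:
    vocab, shared = vocabulary_stats(model_answer, student_answer, n)
    print(f"{label}: {vocab} features, {shared} shared ({shared / vocab:.0%})")
```

For these two sentences the feature count grows from 7 to 9 to 10 while the shared fraction falls from 71% to 56% to 20%, so the vectors become both longer and sparser as n increases.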
Yazdani et al. proposed that unigrams create a lower-dimensional feature space compared to bigrams and trigrams, which helps reduce the risk of sparsity and overfitting [93]. Likewise, Li et al. indicated that a model with fewer dimensions is more likely to generalize effectively to previously unseen data, resulting in lower RMSE values [94]. In most essay documents, unigrams usually occur more frequently than bigrams or trigrams. Unigrams produce more abundant tokens and therefore may offer a denser representation. Higher frequency may result in more stable and reliable representations, contributing to lower RMSE.
Moreover, it is essential to note that the effectiveness of each similarity metric may vary depending on the specific characteristics of the NLP task and the nature of the dataset. Therefore, further experimentation and evaluation may be needed to identify the best similarity metric for the AES task.
3.3. Future research
Future research on AES could explore several promising directions. First, incorporating more advanced NLP techniques, such as transformer-based models like bidirectional encoder representations from transformers (BERT) or generative pre-trained transformer (GPT), could improve the system's ability to capture complex linguistic and semantic patterns. BERT, being a transformer-based model pre-trained on vast amounts of text data, excels at understanding context and generating rich, contextualized word and sentence embeddings. BERT works as a feature extractor, where the pre-trained model processes essay text to produce high-dimensional vectors that capture deep semantic meaning, coherence, and even subtle nuances that simple n-gram or TF-IDF representations miss. These sophisticated embeddings can then be fed into a separate regression or classification model to predict essay scores.
Second, expanding the datasets to include essays from diverse subjects and languages would enhance the system's generalizability and robustness. Additionally, integrating explainable AI methods can provide transparency into the scoring process, helping educators trust and adopt AES systems more widely. Finally, research can also focus on optimizing computational efficiency to ensure scalability for real-time applications in large educational settings. These advancements will help AES systems become more reliable, equitable, and accessible.
3.4. Implication
The adoption of AES has significant implications for educational practices and policy development. By providing an efficient and objective method for evaluating written assessments, AES systems have the potential to address long-standing challenges associated with manual grading, such as bias, inconsistency, and high workloads for educators [95]. This efficiency allows teachers to allocate more time to personalized instruction and mentoring, thus enhancing the overall quality of education. Additionally, AES fosters scalability in assessment practices, enabling institutions to evaluate large volumes of student essays in a timely manner without compromising fairness or accuracy.
The implementation of AES systems requires careful attention to ethical considerations, especially in terms of transparency, data privacy, and the risk of excessive reliance on automated tools [96]. For instance, transparency is crucial to ensure that students and educators understand how AES systems generate scores. Without clear explanations of the algorithms and criteria used, these systems could face skepticism or mistrust from stakeholders [97]. Moreover, there is a risk that over-reliance on AES could marginalize the role of educators, reducing their involvement in assessing student learning and providing valuable feedback [98]. For example, while AES can quickly score a large number of essays, it might struggle to recognize creative or nuanced responses that require human judgment.
Policymakers and institutions must ensure that AES systems are deployed as supportive tools that enhance, rather than replace, the professional expertise of teachers [99]. Ultimately, AES represents a transformative tool in modern education, with the potential to enhance the objectivity and efficiency of assessments while supporting equitable educational opportunities. Ongoing research and collaboration between educators, technologists, and policymakers will be crucial in realizing its full potential while mitigating associated risks.
4. CONCLUSION
This study offers valuable insights into the domain of AES by investigating the performance of the Jaccard coefficient and Cosine similarity metrics within the framework of VSM with n-gram variations. This research validates the preprocessing techniques and TF-IDF vectorization used to obtain the document features, using a dataset of formative essays in citizenship education at the junior high school level. The comparison of Jaccard coefficient and Cosine similarity demonstrates that the latter surpasses the former in assessing semantic similarity between documents. Moreover, the analysis of n-gram variations shows that unigrams lead to lower RMSE values compared to bigrams and trigrams, suggesting their ability to capture the main textual features. These findings highlight the importance of selecting the right similarity metrics and n-gram representations to lower the RMSE score of AES systems. Further research could study other factors influencing AES performance and investigate techniques for refining computational efficiency without compromising performance. Ultimately, advancements in AES methodologies have the potential to revolutionize educational assessment practices, offering educators and stakeholders trustworthy tools for evaluating written content successfully.
ACKNOWLEDGMENTS
The authors gratefully acknowledge the support provided by the Department of Informatics Engineering, Universitas Trunojoyo Madura, which facilitated the completion of this research.

FUNDING INFORMATION
The authors declare that no external funding was received for conducting this study.

AUTHOR CONTRIBUTIONS STATEMENT
This journal uses the Contributor Roles Taxonomy (CRediT) to recognize individual author contributions, reduce authorship disputes, and facilitate collaboration.
Name of Author | C M So Va Fo I R D O E Vi Su P Fu
Andharini Dwi Cahyani: ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Moh. Wildan Fathoni: ✓ ✓ ✓ ✓ ✓ ✓
Fika Hastarita Rachman: ✓ ✓ ✓ ✓ ✓
Ari Basuki: ✓ ✓ ✓ ✓ ✓
Salman Amin: ✓ ✓ ✓ ✓
Bain Khusnul Khotimah: ✓ ✓ ✓ ✓ ✓ ✓ ✓
C: Conceptualization; M: Methodology; So: Software; Va: Validation; Fo: Formal analysis; I: Investigation; R: Resources; D: Data Curation; O: Writing - Original Draft; E: Writing - Review & Editing; Vi: Visualization; Su: Supervision; P: Project administration; Fu: Funding acquisition
CONFLICT OF INTEREST STATEMENT
The authors state that there are no known competing financial interests or personal relationships that could have influenced the work reported in this paper.

INFORMED CONSENT
The authors affirm that informed consent was obtained from all individual participants included in this study, and written permission was secured prior to their inclusion.

DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author, [ADC], upon reasonable request.
REFERENCES
[1] C. Kiessling et al., “Does it make sense to use written instruments to assess communication skills? Systematic review on the concurrent and predictive value of written assessment for performance,” Patient Education and Counseling, vol. 108, Mar. 2023, doi: 10.1016/j.pec.2022.107612.
[2] J. W. G. Putra, S. Teufel, and T. Tokunaga, “Improving logical flow in English-as-a-foreign-language learner essays by reordering sentences,” Artificial Intelligence, vol. 320, Jul. 2023, doi: 10.1016/j.artint.2023.103935.
[3] A. M. Azmi, M. F. Al-Jouie, and M. Hussain, “AAEE – automated evaluation of students’ essays in Arabic language,” Information Processing and Management, vol. 56, no. 5, pp. 1736–1752, Sep. 2019, doi: 10.1016/j.ipm.2019.05.008.
[4] C. T. Lim, C. H. Bong, W. S. Wong, and N. K. Lee, “A comprehensive review of automated essay scoring (AES) research and development,” Pertanika Journal of Science and Technology, vol. 29, no. 3, pp. 1875–1899, Jul. 2021, doi: 10.47836/pjst.29.3.27.
[5] Q. Wang, “A multifaceted architecture to automate essay scoring for assessing english article writing: Integrating semantic, thematic, and linguistic representations,” Computers and Electrical Engineering, vol. 118, Aug. 2024, doi: 10.1016/j.compeleceng.2024.109308.
[6] A. Pack, A. Barrett, and J. Escalante, “Large language models and automated essay scoring of English language learner writing: Insights into validity and reliability,” Computers and Education: Artificial Intelligence, vol. 6, Jan. 2024, doi: 10.1016/j.caeai.2024.100234.
[7] M. N. I. Susanti, A. Ramadhan, and H. L. H. S. Warnars, “Automatic essay exam scoring system: a systematic literature review,” Procedia Computer Science, vol. 216, pp. 531–538, 2022, doi: 10.1016/j.procs.2022.12.166.
[8] A. Mizumoto and M. Eguchi, “Exploring the potential of using an AI language model for automated essay scoring,” Research Methods in Applied Linguistics, vol. 2, no. 2, Aug. 2023, doi: 10.1016/j.rmal.2023.100050.
[9] Q. Wang, “The use of semantic similarity tools in automated content scoring of fact-based essays written by EFL learners,” Education and Information Technologies, vol. 27, no. 9, pp. 13021–13049, Nov. 2022, doi: 10.1007/s10639-022-11179-1.
[10] V. Ramnarain-Seetohul, V. Bassoo, and Y. Rosunally, “Similarity measures in automated essay scoring systems: a ten-year review,” Education and Information Technologies, vol. 27, no. 4, pp. 5573–5604, May 2022, doi: 10.1007/s10639-021-10838-z.
[11] M. R. Andersen, K. Kabel, J. Bremholm, J. Bundsgaard, and L. K. Hansen, “Automatic proficiency scoring for early-stage writing,” Computers and Education: Artificial Intelligence, vol. 5, 2023, doi: 10.1016/j.caeai.2023.100168.
[12] K. Kabel, J. Bremholm, and J. Bundsgaard, “A framework for identifying early writing development,” Writing and Pedagogy, vol. 13, no. 1–3, pp. 51–87, Jul. 2021, doi: 10.1558/wap.21467.
[13] N. Süzen, A. N. Gorban, J. Levesley, and E. M. Mirkes, “Automatic short answer grading and feedback using text mining methods,” Procedia Computer Science, vol. 169, pp. 726–743, 2020, doi: 10.1016/j.procs.2020.02.171.
[14] C. C. Lin, E. S. J. Cheng, A. Y. Q. Huang, and S. J. H. Yang, “DNA of learning behaviors: a novel approach of learning performance prediction by NLP,” Computers and Education: Artificial Intelligence, vol. 6, Jun. 2024, doi: 10.1016/j.caeai.2024.100227.
[15] F. Chiarello, V. Giordano, I. Spada, S. Barandoni, and G. Fantoni, “Future applications of generative large language models: a data-driven case study on ChatGPT,” Technovation, vol. 133, May 2024, doi: 10.1016/j.technovation.2024.103002.
[16] R. Gao, H. E. Merzdorf, S. Anwar, M. C. Hipwell, and A. R. Srinivasa, “Automatic assessment of text-based responses in post-secondary education: a systematic review,” Computers and Education: Artificial Intelligence, vol. 6, Jun. 2024, doi: 10.1016/j.caeai.2024.100206.
[17] B. Abu-Salih and S. Alotaibi, “A systematic literature review of knowledge graph construction and application in education,” Heliyon, vol. 10, no. 3, Feb. 2024, doi: 10.1016/j.heliyon.2024.e25383.
[18] T. S. Walia, G. S. Josan, and A. Singh, “An efficient automated answer scoring system for Punjabi language,” Egyptian Informatics Journal, vol. 20, no. 2, pp. 89–96, Jul. 2019, doi: 10.1016/j.eij.2018.11.001.
[19] J. Á. Martínez-Huertas, G. Jorge-Botana, A. Martínez-Mingo, J. D. Moreno, and R. Olmos, “Modeling personality language use with small semantic vector subspaces,” Personality and Individual Differences, vol. 219, Mar. 2024, doi: 10.1016/j.paid.2023.112514.
[20] A. P. Castro, G. A. Wainer, and W. P. Calixto, “Weighting construction by bag-of-words with similarity-learning and supervised training for classification models in court text documents,” Applied Soft Computing, vol. 124, Jul. 2022, doi: 10.1016/j.asoc.2022.108987.