International Journal of Electrical and Computer Engineering (IJECE)
Vol. 5, No. 2, April 2015, pp. 318~332
ISSN: 2088-8708
Journal homepage: http://iaesjournal.com/online/index.php/IJECE
Ontology-based Why-Question Analysis Using Lexico-Syntactic Patterns
A.A.I.N. Eka Karyawati, Edi Winarko, Azhari, Agus Harjoko
Department of Computer Science and Electronics, Gadjah Mada University, Yogyakarta, Indonesia
Article Info
Article history:
Received Jan 2, 2015
Revised Feb 13, 2015
Accepted Feb 27, 2015
ABSTRACT
This research focuses on developing a method to analyze why-questions. Previous studies on why-question analysis usually used morphological and syntactical approaches without considering the expected answer types. Moreover, they rarely involved a domain ontology to capture the semantics or conceptualization of the content. Consequently, semantic mismatches occurred and led to inappropriate answers. The proposed method considers the expected answer types and involves a domain ontology. It adapts the simple bag-of-words-like model by using semantic entities (i.e., concepts/entities and relations) instead of words to represent a query. The proposed method expands the question by adding the additional semantic entities obtained by executing the constructed SPARQL query of the why-question over the domain ontology. The major contribution of this research is the development of an ontology-based why-question analysis method that considers the expected answer types. Experiments have been conducted to evaluate each phase of the proposed method. The results show good performance for all performance measures used (i.e., precision, recall, undergeneration, and overgeneration). Furthermore, a comparison against two keyword-based baseline methods (i.e., the term-based and the phrase-based method) shows that the proposed method obtained better performance in terms of MRR and P@10 values.
Keyword:
Lexico-syntactic pattern
NLP-based text mining
Question analysis
Question answering system
Typed dependency parse
Why-question
Copyright © 2015 Institute of Advanced Engineering and Science. All rights reserved.
Corresponding Author:
A.A.I.N. Eka Karyawati,
Lecturer Staff of Department of Computer Science, Faculty of Mathematics and Natural Sciences, Udayana University, Bali, Indonesia
Doctoral Student of Department of Computer Science and Electronics, Faculty of Mathematics and Natural Sciences, Gadjah Mada University, Yogyakarta, Indonesia.
Email: eka.karyawati@mail.ugm.ac.id
1. INTRODUCTION
Question analysis is the process of analyzing a natural language question in order to convert it into a formal query representation. The formal query representation is constructed so that the information contained in the question can be processed by a question answering system. Question analysis is a fundamental component of a question answering system because the query representation represents the user's information need. Thus, the system will yield accurate answers (i.e., answers that satisfy the information need) if the user's information need is represented accurately.
This research focuses on developing a method to analyze a why-question (i.e., a why-question analysis method). According to Aristotelian philosophy, there is a close relation between understanding and the why-question [1]. Humans do not think they understand something until they grasp the why of it. In other words, it is necessary to know the answer to the why-question in order to understand something. This is the reason why developing a method to analyze a why-question is important. A good why-question analysis method will yield accurate answers, so that users can understand something accurately.
A why-question is a question that begins with a why-question word followed by the topic of the question. For example, for the question "Why is a vector space model used in information retrieval?", the topic of the question is "A vector space model is used in information retrieval". Why-question analysis also needs to determine answer types. Verberne stated that it is necessary to split the answer type of a why-question into sub-types to obtain a more specific answer type, in order to select potential answer sentences or paragraphs [2]. However, there are still few studies on why-question analysis that consider the expected answer types.
To get more accurate answers, a question analysis method should also involve a semantic approach to capture the conceptualizations associated with user information needs and contents. Nevertheless, most question analysis approaches only analyze the question syntactically and morphologically, without considering the semantics of the question content [3]-[5]. Based on the above facts, the research problem is formulated as how to analyze a why-question by considering the expected answer types and by utilizing a domain ontology in order to capture the conceptualization of the question content. This research focuses on developing a method that combines part-of-speech (POS) tagging, typed-dependency parsing, verb classification, and a domain ontology.
Some studies have used a domain ontology to formulate a query and to capture the conceptualization of the query content [6], [7], [9]-[13], but they did not focus on why-questions. The proposed method utilizes lexico-syntactic patterns applied over typed dependency parse trees. Typed dependency parsing is used because the dependencies or relations between elements of a sentence are clearly defined. Therefore, typed dependency parsing together with POS tagging makes it easier to construct the patterns used for extracting terms and relations of a why-question. Furthermore, by considering the verb classification, the lexico-syntactic patterns can also be used to identify the expected answer types of the why-question. The proposed method adapts the simple bag-of-words-like model by using semantic entities (i.e., concepts/entities and relations) instead of words to form the formal query representation. In addition, the proposed method expands the question by adding the additional semantic entities obtained by executing the constructed SPARQL query of the why-question over the domain ontology. Thus, the major contribution of this research is the development of an ontology-based why-question analysis method that uses lexico-syntactic patterns and considers the expected answer types.
The method uses OWL for building the ontology, saves the data in RDF format, and uses SPARQL for query processing. In SPARQL construction, the proposed method considers two answer types of the why-question: the cause answer type and the motivation answer type. Experiments have been conducted to evaluate each phase of the proposed method, using evaluation measures such as precision, recall, undergeneration, and overgeneration [8]. The results show good performance for all performance measures used. Furthermore, comparisons against two keyword-based baseline methods (i.e., the term-based and the phrase-based method) show that the proposed method obtained better performance than both baselines in terms of MRR and P@10 values [35], [36], [37].
The main assumption used in this research is that the questions must be in grammatically correct English. Another assumption is that the queried terms and relations are restricted to a specific scope, because the proposed method is implemented in a specific domain (e.g., the text retrieval domain). In addition, the questions asked should match the patterns already defined. The proposed method is therefore expected to perform best under those conditions.
The rest of this paper is organized as follows. Section 2 presents work related to this research. The theoretical basis and the proposed why-question analysis method are given in Section 3 and Section 4, respectively. Section 5 presents the research method. Discussions of the results are presented in Section 6. Finally, conclusions are given in Section 7.
2. RELATED WORK
The proposed why-question analysis method involves the domain ontology to grasp the conceptualization of the question contents by identifying the semantic annotations. Consequently, the question analysis adapts an ontology-based question answering model involving three main components: term/relation extraction, semantic entity mapping, and formal query construction. Most ontology-based question answering methods used a linguistic approach for extracting terms and relations [9]-[13]. Moreover, for semantic entity mapping, some researchers used string similarity matching and WordNet [10], [11], [13]. Similar to the previous works, the proposed method uses a linguistic approach for extracting terms and relations and uses string similarity matching for semantic mapping. However, the proposed method does not employ a general-domain lexicon (e.g., WordNet); it employs a list of synonyms (i.e., a domain-specific lexicon) instead. Furthermore, for constructing the formal queries that are compliant with the domain ontology, most
of the studies used triple-based data representations, referred to as query-triples, used OWL to build the ontology, and saved the data in RDF format [9]-[12], [14]. Moreover, they employed SPARQL (i.e., an SQL-like query language suitable for accessing data in RDF format) to perform the query processing. For practical reasons, the proposed method also builds the domain ontology in OWL format, saves the knowledge base in RDF format (i.e., a triple-based representation), and uses SPARQL for query processing.
The proposed method uses NLP-based text mining for extracting terms and relations. NLP-based text mining is performed by applying patterns (i.e., lexico-syntactic patterns) on parse trees to generate a structural representation of free text. In this research, the lexico-syntactic patterns are constructed over typed dependency parse trees. Lexico-syntactic patterns have been widely used by researchers for extracting information (i.e., terms and relations) from sentences of free text. Kim employed hand-crafted patterns on typed-dependency parses [14] to identify terms and relations. Zouaq combined POS tagging and typed-dependency parsing in lexico-syntactic patterns in order to extract the terms and relations contained in a sentence [15], [17]. Moreover, some other studies employed patterns on dependency parse trees to extract terms and relations from natural language text [16]-[18]. Mousavi, on the other hand, applied NLP-based text mining by employing patterns on phrase structure parse trees to generate a structural representation of free text [19]. In contrast to these previous studies, which focused on representations of free text (i.e., not questions), the proposed method focuses on representations of why-questions.
3. THE THEORETICAL BASIS
3.1. Definitions
Definition 1 (Typed Dependency Parse)
A typed dependency parse is a kind of dependency parse that represents dependencies between words and labels the dependencies with grammatical relations [20].
Definition 2 (Action Verb)
Action verbs are verbs that express an action. Action means something happening or something changing. Most action verbs refer to physical actions, but some are verbs of reporting or verbs of thinking [21]. Examples of action verbs are ‘use’, ‘utilize’, ‘employ’, ‘apply’, ‘perform’, and others.
Definition 3 (Process Verb and Causative/Inchoative Alternation)
Process verbs are verbs that express a process. In this context, process means a change of state or a change of position. The causative/inchoative alternation is a transitivity alternation where the transitive use of a verb V can be paraphrased as roughly “cause to V-intransitive” [22]. Moreover, verbs undergoing the causative/inchoative alternation can be characterized as verbs of change of state or change of position. Thus, process verbs are verbs from the causative/inchoative alternation, especially in an intransitive context. Examples of process verbs are ‘appear’, ‘arise’, ‘occur’, ‘happen’, ‘change’, ‘compress’, ‘collect’, ‘improve’, ‘increase’, and others.
Definition 4 (Edit Distance)
Edit distance is defined as the minimum number of edit operations needed to transform one string into the other. Two prevalent edit distance algorithms are the Levenshtein distance [23] and the Damerau-Levenshtein distance [24]. The Levenshtein distance defines the edit operations as insertions, deletions, and substitutions. The Damerau-Levenshtein distance is a variation of the Levenshtein distance with the additional operation of transposition.
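As an illustration of Definition 4, the following Python sketch computes both distances. It is not taken from the paper: the function names are hypothetical, and the second function implements the restricted (optimal string alignment) variant of the Damerau-Levenshtein distance, in which a swap of two adjacent characters counts as one operation.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein distance (adds adjacent transposition)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            # transposition of two adjacent characters
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]
```

For a misspelling such as "retreival" for "retrieval", the transposition is counted as two substitutions by the first function and as a single operation by the second, which is why the variant with transposition is often preferred for matching typed terms.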
Definition 5 (Domain Ontology)
A domain ontology is an explicit specification of a conceptualization of domain knowledge [25]. It can be described as $O = (C, R)$, where $C$ is the set of concepts and $R$ is the set of semantic relationships between concepts.
Definition 6 (SPARQL Query)
A SPARQL query is based around graph pattern matching, where the graphs are RDF graphs [26]. More complex graph patterns can be formed by combining smaller patterns, including basic graph patterns, group graph patterns, optional graph patterns, alternative graph patterns, and patterns on named graphs.
Definition 7 (RDF Graph, Basic Graph Pattern, and Alternative Graph Pattern)
An RDF graph is a set of RDF triples $(s, p, o) \in (I \cup B) \times I \times (I \cup B \cup L)$, where $I$, $B$, and $L$ are the infinite sets of IRIs, blank nodes, and literals, respectively [26]. In this triple, $s$ is the subject, $p$ the predicate, and $o$ the object. A basic graph pattern is a set of triple patterns, where a triple pattern is a triple $(s, p, o) \in (I \cup V) \times I \times (I \cup V \cup L)$ [26]. $V$ is a set of variables disjoint from the sets $I$, $B$, and $L$. A leading question mark in a triple pattern (e.g., ?v) indicates that v is a variable. In an alternative graph pattern, two or more possible basic graph patterns are tried.
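To make Definitions 6 and 7 concrete, the sketch below matches a basic graph pattern against a small set of triples in plain Python and treats an alternative graph pattern as the collection of solutions of its member patterns. It is a toy illustration only, not a SPARQL engine and not the authors' implementation; the function names and the example triples are hypothetical.

```python
def match_triple(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None on failure."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):              # variable position
            if new.setdefault(p, t) != t:  # already bound to a different value
                return None
        elif p != t:                       # constant position must be equal
            return None
    return new


def match_bgp(patterns, graph, binding=None):
    """All bindings under which every triple pattern matches a triple of the graph."""
    binding = binding or {}
    if not patterns:
        return [binding]
    first, rest = patterns[0], patterns[1:]
    results = []
    for triple in sorted(graph):           # sorted only to make the order deterministic
        extended = match_triple(first, triple, binding)
        if extended is not None:
            results.extend(match_bgp(rest, graph, extended))
    return results


def match_alternative(alternatives, graph):
    """Alternative graph pattern: gather the solutions of each basic graph pattern."""
    return [b for bgp in alternatives for b in match_bgp(bgp, graph)]


# Hypothetical knowledge-base fragment, written as (subject, predicate, object).
graph = {
    ("word_mismatch", "cause", "low_recall"),
    ("query_expansion", "hasPurpose", "improve_recall"),
}
solutions = match_alternative(
    [[("?y", "cause", "?x")], [("?x", "hasPurpose", "?y")]], graph
)
```

Here `solutions` holds one binding from each alternative, which mirrors how a UNION of two basic graph patterns behaves in SPARQL.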
3.2. Characteristic of Natural Language Question
This research addresses some basic characteristics of a natural language question, including the expected answer type, question topic, question terms, and relations [9]. Two expected answer types of why-questions have been observed: the cause answer type and the motivation answer type. The question topic is the declarative sentence following the ‘why’-question word, from which the question terms and relations are extracted. Question terms can be a single word or multiple words (i.e., noun phrases) that are the focus of the why-question and are identified as concepts/instances. Relations are often the verbs of the question topic. The concepts/instances and relations are used to construct a set of intermediate representations of the why-question. These are triple-based representations referred to as query-triples.
3.3. Expected Answer Type of a Why-Question
A why-question is a question answered by a cause [1]. There are four types of causes (i.e., Aristotle's four causes): the material, the formal, the efficient, and the final cause. The material cause is about what a thing is made of, the formal cause about its form or what it is, the efficient cause about who made it or how it came to be what it is, and the final cause about what a thing is made for or what its purpose is [1], [27]. Alvarez stated that the efficient cause is what most people think of as a cause [27]. In other words, the efficient cause relates to the reason clause, and the final cause relates to the purpose clause. This research is concerned with these two cause types.
According to Quirk, there are four types of reason clause: the cause-effect, the reason-consequence, the motivation-result, and the circumstances-consequence clause [28]. In addition, because the result relation in the result clause is the converse of that of motivation [28], the result clause is also considered in this research. However, the circumstances-consequence clause is not taken into account, because it seldom arises in a why-question collection. Thus, the proposed method observes two expected answer types of a why-question: the cause answer type and the motivation answer type. These types involve five clauses, where the cause answer type relates to the cause-effect clause, and the motivation answer type relates to the reason-consequence, the motivation-result, the result, and the purpose clause.
3.4. NLP-Based Text Mining
The proposed method uses NLP-based text mining to extract terms and relations. The NLP-based approach considers the morphological structure by parsing the sentences into parse trees [19]. A parse tree provides a morphological structure for text analysis. Text mining through NLP-based techniques is usually performed by employing lexico-syntactic patterns. The proposed method constructs the lexico-syntactic patterns using a combination of POS tagging, typed dependency parsing, verb classification, and ontology.
One of the popular typed dependency representations is the Stanford typed dependencies. The Stanford typed dependencies are generated from phrase structure parse trees through a two-phase method consisting of a dependency extraction phase and a dependency typing phase [20]. The dependency extraction phase extracts dependencies by applying rules (i.e., Collins' head rules [29]) on phrase structure trees. The dependency typing phase then labels the dependencies with a grammatical relation that is as specific as possible. The grammatical relation used to label a dependency is identified by patterns over the phrase structure parse tree, defined using a tree-expression syntax, where the tree-expression syntax is defined by tregex [30].
4. THE PROPOSED ONTOLOGY-BASED WHY-QUESTION ANALYSIS METHOD
As can be seen in Figure 1, the proposed method includes three main components. The term/relation extraction component extracts the terms and relations contained in a why-question in order to construct an intermediate representation (i.e., query-triples). The semantic mapping component maps the extracted terms and relations to semantic entities (i.e., ontological elements of the domain ontology) in order to identify the semantic annotations of the original query and to construct ontology-compliant query-triples. The SPARQL query construction and processing component constructs a SPARQL query of the why-question and then processes the query over the knowledge base in order to identify additional semantic annotations. The query expansion expands the original semantic annotations with the additional ones, where the semantic annotations of a question are defined as the set of semantic entities (i.e., ontological elements including concepts/instances and relations) used to represent the question.
Figure 1. Graphical representation of the proposed ontology-based why-question analysis method
4.1. Terms and Relations Extraction
The proposed question analysis method employs patterns (named query-triple construction patterns) represented using the convention ‘Grammatical Relation(Head-Index/POS, Dependent-Index/POS) → Transformation’, where Grammatical Relation represents a dependency relation, Head and Dependent are variable names, POS represents a part-of-speech, Index represents the position of the word in the sentence, and Transformation describes the resulting expression [15]. The proposed method employs the Stanford POS tagger for tagging words, and the Stanford parser for constructing typed dependency parse trees. Table 1 shows examples of the lexico-syntactic patterns for identifying noun phrases (i.e., terms). Table 2 shows examples of the lexico-syntactic patterns for extracting relations. In these tables, Agent is an agentive noun (e.g., we, researcher, user, and others), NN stands for all noun POS tags (i.e., NN, NNS, NNP, NNPS), and VB stands for all verb POS tags (i.e., VB, VBZ, VBD, VBG, VBN, VBP). All POS labels can be seen in [31], and all dependency relation labels can be seen in [32].
Table 1. The lexico-syntactic patterns for identifying concepts
| Lexico-Syntactic Pattern | Example |
| nn(X/NN, Y/NN) → Y_X | nn(retrieval-2/NN, information-1/NN) → information_retrieval |
| nn(X/NN, Y/NN), nn(X/NN, Z/NN) → Y_Z_X | nn(model-4/NN, vector-2/NN), nn(model-4/NN, space-3/NN) → vector_space_model |
| amod(X/NN, Y/JJ) → Y_X | amod(system-5/NN, smart-4/JJ) → smart_system |
| amod(X/NN, Y/JJ), nn(X/NN, Z/NN) → Y_Z_X | amod(engine-4/NN, semantic-2/JJ), nn(engine-4/NN, search-3/NN) → semantic_search_engine |
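The term patterns of Table 1 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the dependencies are assumed to be already available as tuples (in the paper they come from the Stanford parser), the tuple layout and the function name `extract_terms` are hypothetical, and only the `nn` and `amod` relations are handled.

```python
from collections import defaultdict


def extract_terms(dependencies):
    """Build compound terms from nn/amod dependencies.

    Each dependency is (relation, head, dependent), where head and dependent
    are (word, index, pos) tuples. Modifiers of the same noun head are ordered
    by sentence position and joined with the head, e.g. vector_space_model.
    """
    modifiers = defaultdict(list)
    for rel, head, dep in dependencies:
        head_word, head_idx, head_pos = head
        dep_word, dep_idx, dep_pos = dep
        if not head_pos.startswith("NN"):      # the head must be a noun
            continue
        is_noun_modifier = rel == "nn" and dep_pos.startswith("NN")
        is_adj_modifier = rel == "amod" and dep_pos.startswith("JJ")
        if is_noun_modifier or is_adj_modifier:
            modifiers[(head_idx, head_word)].append((dep_idx, dep_word))
    return [
        "_".join([word for _, word in sorted(mods)] + [head_word])
        for (_, head_word), mods in sorted(modifiers.items())
    ]


deps = [
    ("amod", ("engine", 4, "NN"), ("semantic", 2, "JJ")),
    ("nn", ("engine", 4, "NN"), ("search", 3, "NN")),
]
terms = extract_terms(deps)  # ["semantic_search_engine"]
```

Grouping modifiers by head and sorting them by index covers the one-modifier and two-modifier rows of Table 1 with the same rule.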
Table 2. The lexico-syntactic patterns for extracting relations
| Lexico-Syntactic Pattern | Example: Question Topic | Example: Relation Extraction |
| nsubj(X1/VB, X2/Agent), dobj(X1/VB, X3/NN), prep(X3, X4) → Be_V3(X1)_Prep(X3, X4) | We use a vector_space_model for text_retrieval. | nsubj(use-2/VB, We-1/PRP), dobj(use-2/VB, vector_space_model-4/NN), prep_for(vector_space_model-4/NN, text_retrieval-6/NN) → is_used_for(vector_space_model, text_retrieval) |
| nsubj(X1/JJ, X2/NN), cop(X1/JJ, X3/VB) → hasQuality(X2, X1) | A vector_space_model is useful. | nsubj(useful-4/JJ, vector_space_model-2/NN), cop(useful-4/JJ, is-3/VB) → hasQuality(vector_space_model, useful) |
| expl(X1/VB, there/EX), nsubj(X1/VB, X2/NN), prep_in(X2/NN, X3/NN) → occur_in(X2/NN, X3/NN) | There are word_mismatches in search_engine. | expl(are-2/VB, there-1/EX), nsubj(are-2/VB, word_mismatches-3/NN), prep_in(word_mismatches-3/NN, search_engine-5/NN) → occur_in(word_mismatches, search_engine) |
4.2. Expected Answer Type Identification and Query-Triple Construction
Two answer types of a why-question are considered in this research, the cause answer type and the motivation answer type, and their identification is based on verb classification. A question has the cause answer type if the main verb of the question is classified as a process verb. In addition, a question has the cause answer type if the main verb is an affect verb such as ‘affect’ or ‘influence’, if the main verb has the feature ‘intens’, if it is an existential “there” question, if it has an adjective phrase as subject complement, or if it has the modal auxiliary ‘can/could’ or ‘have/has to’. On the other hand, a question has the motivation answer type if the main verb of the question is classified as an action verb. Moreover, a question has the motivation answer type if the main verb is a need verb such as ‘need’ or ‘require’, if the main verb is a consider verb such as ‘consider’ or ‘take-into-account’, or if the question has the modal auxiliary ‘shall/should’. Some ideas for this identification are taken from [2].
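The identification rules above can be sketched as a small rule-based classifier. The sketch makes several assumptions that are not in the paper: the verb lists contain only the examples quoted in the text, the inputs (lemmatized main verb, modal auxiliary, and two boolean flags) are assumed to have been derived from the parse beforehand, the ‘intens’ feature is omitted, and the order in which the rules are checked is a guess, since the text does not state how conflicts are resolved.

```python
# Example verbs quoted in the text; the real lexicons are presumably larger.
PROCESS_VERBS = {"appear", "arise", "occur", "happen", "change",
                 "compress", "collect", "improve", "increase"}
AFFECT_VERBS = {"affect", "influence"}
ACTION_VERBS = {"use", "utilize", "employ", "apply", "perform"}
NEED_VERBS = {"need", "require"}
CONSIDER_VERBS = {"consider", "take-into-account"}


def expected_answer_type(main_verb, modal=None, existential=False,
                         adjective_complement=False):
    """Return 'cause', 'motivation', or None if no rule applies.

    Rule order (modal first, then sentence form, then verb class) is an
    assumption of this sketch.
    """
    if modal in {"shall", "should"}:
        return "motivation"
    if modal in {"can", "could", "have to", "has to"}:
        return "cause"
    if existential or adjective_complement:
        return "cause"
    if main_verb in PROCESS_VERBS | AFFECT_VERBS:
        return "cause"
    if main_verb in ACTION_VERBS | NEED_VERBS | CONSIDER_VERBS:
        return "motivation"
    return None


# "Why is a vector space model used in information retrieval?"
answer_type = expected_answer_type("use")  # "motivation"
```

Returning `None` for unmatched questions keeps the sketch consistent with the paper's assumption that questions should match the predefined patterns.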
Moreover, the lexico-syntactic patterns are used as a basis for constructing SPARQL templates. Thus, the patterns involve elements of the domain ontology. In SPARQL template construction, the proposed method considers the two answer types involving causalities. Consequently, the domain ontology is designed so that the causalities can be easily detected. Although five clauses are considered (see sub-section 3.3), all of them are represented by two relation representations: the cause relation (i.e., for the cause-effect, the reason-consequence, the motivation-result, and the result clause) and the purpose relation (i.e., for the purpose clause). The causality representations are designed around the hasComponent relation, which separates a sentence into sentence components (i.e., terms, which are noun phrases, and relations, which are usually verbs) that are referred to as the semantic entities of a sentence. This suits the goal of the why-question analysis method, which is to identify the semantic annotations (i.e., the semantic entities) of a question.
Table 3 shows the representation of the causalities in the knowledge base of the domain ontology. In this case, X is the question topic and Y is the answer. For example, a question “why X?” may have the answer “because Y” (i.e., the cause relation, Y cause X) or “in order to Y” (i.e., the purpose relation, X hasPurpose Y), depending on the answer type of the question. If a question has the cause answer type, its answers contain only cause relations. On the other hand, if the question has the motivation answer type, its answers may contain both cause and purpose relations.
Table 3. The representation of causality in domain ontology
| Cause Relation | Purpose Relation |
| Y cause X; | X hasPurpose Y; |
| X hasComponent A1; | X hasComponent A1; |
| X hasComponent A2; | X hasComponent A2; |
| … | … |
| Y hasComponent B1; | Y hasComponent B1; |
| Y hasComponent B2; | Y hasComponent B2; |
| … | … |
Together with the term/relation extraction patterns, the causality representations are used to construct the query-triples of a why-question. The construction is performed in two steps. First, the terms of the why-question are extracted using the term extraction patterns (see Table 1). Then, once the terms have been extracted, the relation extraction patterns (see Table 2), the verb classification (i.e., identification of the expected answer type), and the causality representations (see Table 3) are applied together to construct the query-triples. Examples of the patterns for constructing the query-triples of a why-question are shown in Table 4 and Table 5.

Table 4. The lexico-syntactic patterns for constructing query-triples

| Lexico-Syntactic Pattern (left-hand side of pattern, LHS) | Answer Type | Query-Triples (right-hand side of pattern, RHS) |
|---|---|---|
| nsubj(X1/VB, X2/Agent), dobj(X1/VB, X3/NN), prep(X3/NN, X4/NN), X1 is an action verb | Motivation | Be_V3(X1)_Prep(X3, X4); cause(A1, Q); hasPurpose(Q, A2); hasComponent(Q, X3); hasComponent(Q, X4) |
| nsubj(X1/JJ, X2/NN), cop(X1/JJ, X3/VB), X1 is an adjective | Cause | hasQuality(X2, X1); cause(A, Q); hasComponent(Q, X1); hasComponent(Q, X2) |
| expl(X1/VB, there/EX), nsubj(X1/VB, X2/NN), prep-in(X2/NN, X3/NN), an existential "there" question | Cause | occur_in(X2, X3); cause(A, Q); hasComponent(Q, X2); hasComponent(Q, X3) |


Table 5. Examples of query-triples construction

| Question Topic | Lexico-Syntactic Pattern (LHS) | Answer Type | Query-Triples (RHS) |
|---|---|---|---|
| We employ a vector_space_model for text_retrieval. | nsubj(employ-2/VB, We-1/PRP), dobj(employ-2/VB, vector_space_model-4/NN), prep_for(vector_space_model-4/NN, text_retrieval-6/NN), 'employ' is an action verb | Motivation | is_employed_for(vector_space_model, text_retrieval); cause(A1, Q); hasPurpose(Q, A2); hasComponent(Q, vector_space_model); hasComponent(Q, text_retrieval) |
| A vector_space_model is useful. | nsubj(useful-4/JJ, vector_space_model-2/NN), cop(useful-4/JJ, is-3/VB), 'useful' is an adjective | Cause | hasQuality(vector_space_model, useful); cause(A, Q); hasComponent(Q, useful); hasComponent(Q, vector_space_model) |

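
To make the pattern-to-triple step more concrete, the following is a minimal Python sketch of how the adjective pattern of Table 4 (second row) could be applied to typed dependencies. It is an illustration only, not the implementation used in this research: the tuple encoding of a dependency and the function name `adjective_pattern` are assumptions introduced here.

```python
from typing import List, Optional, Tuple

# Assumed encoding of one typed dependency:
# (relation, (governor_word, governor_tag), (dependent_word, dependent_tag))
Dep = Tuple[str, Tuple[str, str], Tuple[str, str]]


def adjective_pattern(deps: List[Dep]) -> Optional[List[str]]:
    """Match nsubj(X1/JJ, X2/NN) and cop(X1/JJ, X3/VB); emit Cause query-triples."""
    # Find the nominal subject whose governor is an adjective (X1) and
    # whose dependent is a noun (X2).
    nsubj = next(
        (d for d in deps if d[0] == "nsubj" and d[1][1] == "JJ" and d[2][1] == "NN"),
        None,
    )
    if nsubj is None:
        return None
    x1, x2 = nsubj[1][0], nsubj[2][0]
    # The same adjective must also govern a copula verb (X3).
    has_cop = any(
        d[0] == "cop" and d[1] == (x1, "JJ") and d[2][1] == "VB" for d in deps
    )
    if not has_cop:
        return None
    # Right-hand side of the pattern, as listed in Table 4.
    return [
        f"hasQuality({x2}, {x1})",
        "cause(A, Q)",
        f"hasComponent(Q, {x1})",
        f"hasComponent(Q, {x2})",
    ]


# Question topic "A vector_space_model is useful." from Table 5.
deps = [
    ("nsubj", ("useful", "JJ"), ("vector_space_model", "NN")),
    ("cop", ("useful", "JJ"), ("is", "VB")),
]
triples = adjective_pattern(deps)
```

For the second question topic of Table 5, the sketch yields the same four query-triples as the table. The other patterns could be handled by analogous functions, one per row of Table 4.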

4.3. Semantic Mapping, SPARQL Construction, and Semantic Annotation
In this research, semantic mapping is performed in two main phases. First, the extracted terms and relations are matched against all labels defined in the domain ontology using edit distance. Then, they are mapped to semantic entities (i.e., object properties and instances) of the domain ontology. Some researchers have used WordNet as a lexical resource. However, general-domain lexical resources such as WordNet are not practicable here, because they miss several terms that belong to the specific domain. Thus, instead of WordNet, the proposed method employs manually compiled lists of synonyms of terms and relations as a domain-specific lexicon. In the implementation, the synonyms are stored in the knowledge base in RDF format, where each instance and relation (i.e., object property) has its list of synonyms stored as label elements. Moreover, the proposed method uses the Damerau-Levenshtein edit distance, because transpositions of characters often occur when users type a question. The semantic entities of a why-question are used to identify the semantic annotations of the original query and to construct the ontology-compliant query-triples that form the basis of SPARQL construction. Table 6 presents an example of ontology-compliant query-triples, where OP(x) is the object property with label x, and I(y) is the instance with label y.
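
The label-matching step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this research: it uses the restricted (optimal string alignment) variant of the Damerau-Levenshtein distance, and the label dictionary, the entity names, and the threshold `max_distance` are hypothetical.

```python
from typing import Dict, List, Optional


def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all characters of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters counts as one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def match_label(term: str, labels: Dict[str, List[str]],
                max_distance: int = 2) -> Optional[str]:
    """Return the entity whose label is closest to `term`, or None if too far."""
    best = None  # (entity, distance) of the best candidate so far
    for entity, synonyms in labels.items():
        for label in synonyms:
            dist = osa_distance(term.lower(), label.lower())
            if dist <= max_distance and (best is None or dist < best[1]):
                best = (entity, dist)
    return best[0] if best else None


# Hypothetical label lists; the real lists are defined manually in the ontology.
labels = {
    "TR:TextRetrieval": ["text retrieval", "document retrieval"],
    "TR:SearchEngine": ["search engine"],
}
entity = match_label("text retreival", labels)  # "ei" typed instead of "ie"
```

Because the swapped letters in "retreival" count as a single transposition, the misspelled term is still within the threshold of the label "text retrieval".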

Table 6. Example of ontology-compliant query-triples construction

| Expected Answer Type | Query-Triples | Ontology-Compliant Query-Triples |
|---|---|---|
| Motivation | Be_V3(X1)_Prep(X3, X4); cause(A1, Q); hasPurpose(Q, A2); hasComponent(Q, X3); hasComponent(Q, X4); hasComponent(Q, Gerund(X1)) | OP(Be_V3(X1)_Prep)(I(X3), I(X4)); cause(A1, Q); hasPurpose(Q, A2); hasComponent(Q, I(X3)); hasComponent(Q, I(X4)) |
| Cause | hasQuality(X2, X1); cause(A, E); hasComponent(E, X1); hasComponent(E, X2) | hasQuality(I(X2), I(X1)); cause(A, Q); hasComponent(Q, I(X1)); hasComponent(Q, I(X2)) |

Table 7 presents an example of a SPARQL template for a why-question that has the motivation answer type. SPARQL queries are constructed using SPARQL templates. The templates are constructed manually, based on the query-triple construction patterns. A SELECT query form is employed because it is the most suitable form for representing a why-question. To retrieve more potential answers, the proposed method considers taxonomical relations between concepts. Even when the knowledge base does not contain a causality for the concepts asked about in a question, the system can still identify additional semantic annotations of the question. For instance, for the question "Why do some word mismatches arise in IR?", even though the knowledge base does not contain a causality between the concepts WordMismatch and IR (e.g., {<X, cause, WordMismatch>, <WordMismatch, OccurIn, IR>}), the semantic annotations can still be identified if the knowledge base contains a causality between the concept WordMismatch and sub-concepts of IR, such as TextRetrieval, SearchEngine, and others.
Furthermore, the query body of the SPARQL query is obtained by transforming the query-triples into alternative graph patterns. These represent the alternatives among sub-classes (i.e., sub-concepts) and the alternatives among the relations included in the motivation answer type, which covers two relations: the cause relation and the purpose relation (see sub-chapters 3.3 and 4.2).
As can be seen in Table 7, the SPARQL template represents the alternatives among sub-classes and the alternatives among the relations included in the motivation answer type, where Instance1, Instance2, and Gerund are slots for the instances of term1 (i.e., concept1), term2 (i.e., concept2), and the present participle of a verb (i.e., the relation), respectively. Term1, term2, and the verb are extracted from the why-question. Moreover, TR denotes the Text Retrieval ontology.
After the SPARQL query has been constructed, the additional semantic annotations are identified by executing the query against the knowledge base of the domain ontology. The semantic annotations are all semantic entities (i.e., instances and object properties) that satisfy the SPARQL query.

Table 7. Example of SPARQL template

Ontology-Compliant Query-Triples:

    relation(Instance1, Instance2); cause(A1, Q); hasPurpose(Q, A2);
    hasComponent(Q, Instance1); hasComponent(Q, Instance2)

SPARQL Template:

    SELECT ?instance
    WHERE {
      { TR:Instance1 TR:relation TR:Instance2.
        ?A1 TR:cause ?Q.
        ?Q TR:hasComponent TR:Instance1.
        ?Q TR:hasComponent TR:Instance2.
        ?A1 TR:hasComponent ?instance }
      UNION
      { ?x TR:relation TR:Instance2.
        ?A1 TR:cause ?Q.
        ?Q TR:hasComponent ?x.
        ?Q TR:hasComponent TR:Instance2.
        ?x rdf:type ?c1.
        ?c1 rdfs:subClassOf ?c2.
        TR:Instance1 rdf:type ?c2.
        ?A1 TR:hasComponent ?instance }
      UNION
      { TR:Instance1 TR:relation ?x.
        ?A1 TR:cause ?Q.
        ?Q TR:hasComponent TR:Instance1.
        ?Q TR:hasComponent ?x.
        ?x rdf:type ?c1.
        ?c1 rdfs:subClassOf ?c2.
        TR:Instance2 rdf:type ?c2.
        ?A1 TR:hasComponent ?instance }
      UNION
      { ?x TR:relation ?y.
        ?A1 TR:cause ?Q.
        ?Q TR:hasComponent ?x.
        ?Q TR:hasComponent ?y.
        ?x rdf:type ?c1.
        ?c1 rdfs:subClassOf ?c2.
        TR:Instance1 rdf:type ?c2.
        ?y rdf:type ?c3.
        ?c3 rdfs:subClassOf ?c4.
        TR:Instance2 rdf:type ?c4.
        ?A1 TR:hasComponent ?instance }
      UNION
      { TR:Instance1 TR:relation TR:Instance2.
        ?Q TR:hasPurpose ?A2.
        ?Q TR:hasComponent TR:Instance1.
        ?Q TR:hasComponent TR:Instance2.
        ?A2 TR:hasComponent ?instance }
      UNION
      { ?x TR:relation TR:Instance2.
        ?Q TR:hasPurpose ?A2.
        ?Q TR:hasComponent ?x.
        ?Q TR:hasComponent TR:Instance2.
        ?x rdf:type ?c1.
        ?c1 rdfs:subClassOf ?c2.
        TR:Instance1 rdf:type ?c2.
        ?A2 TR:hasComponent ?instance }
      UNION
      { TR:Instance1 TR:relation ?x.
        ?Q TR:hasPurpose ?A2.
        ?Q TR:hasComponent TR:Instance1.
        ?Q TR:hasComponent ?x.
        ?x rdf:type ?c1.
        ?c1 rdfs:subClassOf ?c2.
        TR:Instance2 rdf:type ?c2.
        ?A2 TR:hasComponent ?instance }
      UNION
      { ?x TR:relation ?y.
        ?Q TR:hasPurpose ?A2.
        ?Q TR:hasComponent ?x.
        ?Q TR:hasComponent ?y.
        ?x rdf:type ?c1.
        ?c1 rdfs:subClassOf ?c2.
        TR:Instance1 rdf:type ?c2.
        ?y rdf:type ?c3.
        ?c3 rdfs:subClassOf ?c4.
        TR:Instance2 rdf:type ?c4.
        ?A2 TR:hasComponent ?instance }
    }

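
The regular structure of the template in Table 7 (two relations, each combined with four ways of relaxing the instances to sub-classes) suggests that it could be generated from its slots. The Python sketch below is one possible way to do so and is not the template mechanism used in this research: the function names and the example slot values are assumptions. It also always uses ?y, ?c3, and ?c4 for a relaxed second instance, whereas Table 7 reuses ?x, ?c1, and ?c2 when only the second instance is relaxed; the patterns are equivalent apart from variable names. PREFIX declarations for TR, rdf, and rdfs are omitted, as in Table 7, and would have to be added before execution.

```python
def build_branch(relation: str, inst1: str, inst2: str,
                 causal: bool, relax1: bool, relax2: bool) -> str:
    """Build one group graph pattern of the UNION.

    relaxN replaces instance N by a variable whose class is a sub-class
    of the class of instance N.
    """
    s = "?x" if relax1 else f"TR:{inst1}"
    o = "?y" if relax2 else f"TR:{inst2}"
    if causal:
        answer, link = "?A1", "?A1 TR:cause ?Q."
    else:
        answer, link = "?A2", "?Q TR:hasPurpose ?A2."
    lines = [
        f"{s} TR:{relation} {o}.",
        link,
        f"?Q TR:hasComponent {s}.",
        f"?Q TR:hasComponent {o}.",
    ]
    if relax1:
        lines += ["?x rdf:type ?c1.", "?c1 rdfs:subClassOf ?c2.",
                  f"TR:{inst1} rdf:type ?c2."]
    if relax2:
        lines += ["?y rdf:type ?c3.", "?c3 rdfs:subClassOf ?c4.",
                  f"TR:{inst2} rdf:type ?c4."]
    lines.append(f"{answer} TR:hasComponent ?instance")
    return "{ " + " ".join(lines) + " }"


def build_motivation_query(relation: str, inst1: str, inst2: str) -> str:
    """Combine the cause and purpose branches, in the order used in Table 7."""
    relaxations = ((False, False), (True, False), (False, True), (True, True))
    branches = [
        build_branch(relation, inst1, inst2, causal, r1, r2)
        for causal in (True, False)
        for r1, r2 in relaxations
    ]
    return "SELECT ?instance\nWHERE {\n" + "\nUNION\n".join(branches) + "\n}"


# Hypothetical slot values for illustration.
query = build_motivation_query("isEmployedFor", "VectorSpaceModel", "TextRetrieval")
```

The resulting string contains eight alternative graph patterns joined by UNION, four for the cause relation and four for the purpose relation.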

5. RESEARCH METHOD
Developing the proposed method requires supporting data, namely a question collection and a domain ontology (i.e., an ontology schema and a knowledge base). The question collection is constructed in three steps: first, collecting why-questions (i.e., general-domain questions) from the web and from Verberne's why-question collection [33]; second, analyzing the questions to identify general patterns of why-questions; and third, generating why-questions in a specific domain (i.e., Text Retrieval) using the patterns. By default, the questions are in well-ordered form (i.e., the questions have correct English grammar, their patterns have already been defined, and their terms and relations are already covered).
For the domain ontology, a Text Retrieval (TR) ontology is defined to represent the concepts and relations used to construct the SPARQL translation of the why-questions. The Text Retrieval ontology is also used to identify the additional semantic annotations of the why-questions by executing the SPARQL query against the knowledge base of the domain ontology.
In the Text Retrieval ontology, each concept generally has one instance, and each instance has a set of labels that serve as its synonyms. These labels also represent synonyms of the concept. The reason is that in scientific domains such as Information Science or Computer Science, it is difficult to identify instances of a concept. This differs from other domains; in the academic domain, for instance, there is a student concept whose instances are represented by the names of the students. An instance of a concept in the Text Retrieval ontology is defined as a term appearing in the information sources of the knowledge base (i.e., papers). Because the terms representing a concept can take various forms, an instance is labeled with several manually defined synonyms. In addition, relations are also labeled with several synonyms, defined manually based on an English thesaurus. The taxonomical relations of the domain ontology use the taxonomy of Information Retrieval Models [34] as a starting point. Expansion of the taxonomical relations, identification of non-taxonomical relations, and identification of the terms of the domain ontology are performed by studying the terms and relations of the Text Retrieval domain in IR (i.e., Information Retrieval) textbooks [35], [36] and some IR journals.
The proposed method is implemented in Java using the NetBeans IDE. Several API libraries are embedded in the system, such as the Stanford parser and Apache Jena. The Stanford parser API (i.e., stanford-parser.jar) is used for POS tagging and typed dependency parsing. For SPARQL processing, ARQ, a query engine for Jena, is employed. The ARQ API is bundled in the Jena packages (i.e., jena-arq.jar). In addition, Protégé is used to support the construction of the ontology schema, whereas the knowledge base is developed through the NetBeans IDE.
Two kinds of evaluation have been conducted. The first is an evaluation of each phase of the method (see Figure 1): phase 1, the term/relation extraction phase (whose output is a set of query-triples); phase 2, the semantic entity phase (whose output is a set of ontology-compliant query-triples); and phase 3, the SPARQL construction and processing phase (whose output is a set of semantic annotations). The second is an evaluation that compares the proposed method (i.e., retrieving documents based on the ontology-based why-question analysis) against two baseline methods, the term-based and the phrase-based methods.
The first evaluation is performed by comparing the output of the system against a manually identified set of query-triples, set of ontology-compliant query-triples, and set of semantic entities for each why-question (i.e., the gold standard). Thus, there are three evaluation datasets: the first consists of pairs of a why-question and a set of query-triples, the second of pairs of a why-question and a set of ontology-compliant query-triples, and the third of pairs of a why-question and a set of semantic annotations. In this research, the evaluation measures of Barker [8] are used for phase 1 and phase 2. They comprise four measures: precision, recall, undergeneration, and overgeneration. The formulas of the evaluation measures are
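
$$\text{Precision } (P) = \frac{\text{correct} + 0.5 \times \text{partial}}{\text{actual}} \tag{1}$$

$$\text{Recall } (R) = \frac{\text{correct} + 0.5 \times \text{partial}}{\text{possible}} \tag{2}$$

$$\text{Undergeneration } (U) = \frac{\text{missing}}{\text{possible}} \tag{3}$$

$$\text{Overgeneration } (O) = \frac{\text{spurious}}{\text{actual}} \tag{4}$$
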
where Correct is the number of triples of a question in the output of the proposed method that match a triple in the gold standard; Partial is the number of triples of a question in the output that almost match the gold standard (i.e., reasonable triples that differ by at most one element); Actual is the total number of triples of a question in the output; Possible is the total number of triples of a question in the gold standard; Missing is the number of triples of a question in the gold standard that have no counterpart in the output; and Spurious is the number of triples of a question in the output that have no counterpart in the gold standard.
The evaluation of phase 3, however, uses the four measures (precision, recall, undergeneration, and overgeneration) without the partial count, because the outputs of this phase are not in triple-based form.
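
As a worked illustration of Equations (1) to (4), the following Python sketch computes the four measures from the six counts defined above. The function name and the example counts are assumptions chosen for illustration and are not results of this research; for phase 3, the partial count would simply be set to zero.

```python
from typing import Dict


def extraction_measures(correct: int, partial: int, actual: int,
                        possible: int, missing: int, spurious: int) -> Dict[str, float]:
    """Precision, recall, undergeneration, and overgeneration, Eqs. (1)-(4)."""

    def ratio(numerator: float, denominator: float) -> float:
        # Guard against empty outputs or an empty gold standard.
        return numerator / denominator if denominator else 0.0

    credited = correct + 0.5 * partial  # a partial match earns half credit
    return {
        "precision": ratio(credited, actual),
        "recall": ratio(credited, possible),
        "undergeneration": ratio(missing, possible),
        "overgeneration": ratio(spurious, actual),
    }


# Hypothetical counts: 10 output triples (8 correct, 2 partial, 0 spurious)
# against 11 gold-standard triples (1 of them missing from the output).
scores = extraction_measures(correct=8, partial=2, actual=10,
                             possible=11, missing=1, spurious=0)
```

With these counts, precision is (8 + 0.5 × 2) / 10 = 0.9 and recall is 9 / 11, while undergeneration is 1 / 11 and overgeneration is 0.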
This evaluation is performed through experiments that randomly generate 100, 200, and 300 questions, each repeated for 20 iterations. The reported performance is the average value of each measure for each phase. The average of a measure M is

$$\overline{M} = \frac{\sum_{i=1}^{20} \sum_{j=1}^{n} M(Q_{ji})}{20 \times n} \tag{5}$$

where M is the measure P, R, O, or U; n is 100, 200, or 300; and $Q_{ji}$ is the j-th question of the i-th iteration.
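
The averaging in Equation (5) can be sketched in Python as below. This is a minimal illustration that assumes the per-question values of one measure are kept in a nested list, one inner list per iteration; the function name and the toy values are hypothetical.

```python
from typing import List


def average_measure(per_iteration: List[List[float]]) -> float:
    """Average a measure over all questions of all iterations, as in Eq. (5).

    per_iteration[i][j] holds M(Q_ji), the value of the measure for the
    j-th question of the i-th iteration.
    """
    total = sum(sum(values) for values in per_iteration)
    count = sum(len(values) for values in per_iteration)  # 20 * n in the paper
    return total / count


# Toy example with 2 iterations of 2 questions each (the paper uses 20 and n).
avg = average_measure([[1.0, 0.5], [1.0, 1.0]])
```

When every iteration contains the same number n of questions, as in the experiments described here, the count equals the number of iterations times n, which matches the denominator of Equation (5).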
Furthermore, the second evaluation is performed by comparing the results of searching for documents (i.e., documents that contain answers to why-questions) using the proposed ontology-based why-question analysis method against the results of searching for documents using keyword-based methods (i.e., the term-based (one-word) and the phrase-based (multi-word) methods). This evaluation uses a dataset consisting of pairs of a why-question and a set of relevant documents that contain its answers. The evaluation measures used are two standard measures, MRR (Mean Reciprocal Rank) and P@10 (precision at 10) [35], [36], [37].

$$\text{MRR} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{\text{rank}(\text{first\_relevant\_passage}) \text{ for question}_i} \tag{6}$$

$$P@10 = \frac{1}{N} \sum_{i=1}^{N} (\text{precision at 10 for question}_i) \tag{7}$$

where precision at 10 for a question is 1 if the answer to the question is found in the top-10 documents, and 0 otherwise.
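
Equations (6) and (7) can be illustrated with the short Python sketch below. It assumes that, for each question, the 1-based rank of the first relevant document is known, with None when no relevant document is retrieved; treating such a question as contributing 0 to MRR is an assumption made here, and the function names and example ranks are hypothetical.

```python
from typing import List, Optional


def mean_reciprocal_rank(first_relevant_ranks: List[Optional[int]]) -> float:
    """Eq. (6): mean of 1/rank of the first relevant document per question."""
    reciprocal = [1.0 / r if r is not None else 0.0 for r in first_relevant_ranks]
    return sum(reciprocal) / len(first_relevant_ranks)


def precision_at_10(first_relevant_ranks: List[Optional[int]]) -> float:
    """Eq. (7): share of questions answered within the top-10 documents."""
    hits = [1 if r is not None and r <= 10 else 0 for r in first_relevant_ranks]
    return sum(hits) / len(first_relevant_ranks)


# Hypothetical ranks for four questions: answered at rank 1, at rank 2,
# not answered at all, and answered only at rank 12.
ranks = [1, 2, None, 12]
mrr = mean_reciprocal_rank(ranks)
p10 = precision_at_10(ranks)
```

In this toy example only the first two questions count towards P@10, while the fourth still contributes 1/12 to MRR.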
The second evaluation is performed through experiments that retrieve documents satisfying why-questions, where 20, 40, 60, 80, and 100 questions are generated randomly. The experiments are conducted for 20 iterations. The reported performance is the average value of each measure (i.e., MRR and P@10) over the 20 iterations.
The proposed method has been tested on 5367 why-questions in the Text Retrieval domain. In addition, for the first evaluation, experiments are also conducted by entering some questions manually, especially questions that are not in well-ordered form, in order to analyze errors further.

6. RESULTS AND DISCUSSION
Table 8 presents the results of the first evaluation, which is the evaluation of each phase of the proposed why-question analysis method (see Chapter 5 on the evaluation method). As can be seen in Table 8, the results show good performance on all measures for all phases. The average precision and recall are greater than 99%. Moreover, the average undergeneration and overgeneration are less than 1%.

Table 8. Evaluation results for phase 1, phase 2, and phase 3 in 20 iterations

| Data | Metrics | Phase 1 | Phase 2 | Phase 3 |
|---|---|---|---|---|
| 100 | Precision | 99.35% | 99.34% | 99.40% |
| 100 | Recall | 99.31% | 99.30% | 99.29% |
| 100 | Undergeneration | 0.72% | 0.73% | 0.71% |
| 100 | Overgeneration | 0.14% | 0.15% | 0.00% |
| 200 | Precision | 99.24% | 99.23% | 99.26% |
| 200 | Recall | 99.22% | 99.21% | 99.21% |
| 200 | Undergeneration | 0.80% | 0.81% | 0.79% |
| 200 | Overgeneration | 0.06% | 0.07% | 0.01% |
| 300 | Precision | 99.58% | 99.57% | 99.61% |
| 300 | Recall | 99.55% | 99.55% | 99.54% |
| 300 | Undergeneration | 0.48% | 0.49% | 0.46% |
| 300 | Overgeneration | 0.10% | 0.12% | 0.02% |
