Indonesian Journal of Electrical Engineering and Computer Science
Vol. 13, No. 3, March 2019, pp. 1175~1183
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v13.i3.pp1175-1183
Journal homepage: http://iaescore.com/journals/index.php/ijeecs

Modified framework for sarcasm detection and classification in sentiment analysis

Mohd Suhairi Md Suhaimin¹, Mohd Hanafi Ahmad Hijazi², Rayner Alfred³, Frans Coenen⁴
¹,²,³ Faculty of Computing and Informatics, Universiti Malaysia Sabah, Malaysia
¹ Kuching Community College, Ministry of Education, Malaysia
⁴ Department of Computer Science, University of Liverpool, United Kingdom

Article Info
Article history:
Received Sep 15, 2018
Revised Dec 6, 2018
Accepted Dec 17, 2018

ABSTRACT
Sentiment analysis is directed at identifying people's opinions, beliefs, views and emotions in the context of the entities and attributes that appear in text. The presence of sarcasm, however, can significantly hamper sentiment analysis. In this paper a sentiment classification framework is presented that incorporates sarcasm detection. The framework was evaluated using a non-linear Support Vector Machine and Malay social media data. The results obtained demonstrated that the proposed sarcasm detection process could successfully detect the presence of sarcasm, in that better sentiment classification performance was recorded. A best average F-measure score of 0.905 was recorded using the framework; a significantly better result than when sentiment classification was performed without sarcasm detection.

Keywords:
Classification
Framework
Malay social media
Sarcasm detection
Sentiment analysis

Copyright © 2019 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:
Mohd Suhairi Md Suhaimin,
Kuching Community College, Ministry of Education,
Petra Jaya, 93050 Sarawak Malaysia.
Email: mi1511003t@alum.ums.edu.my / suhairisuhaimin@kkkg.edu.my

1. INTRODUCTION
User generated content, acquired from social media, has been extensively analysed so as to identify people's emotions, perspectives, views, beliefs and sentiments towards situations, products, services, other individuals and organisations [1]. Such Sentiment Analysis (SA) focuses on the positive and negative labelling of comments. However, the presence of sarcasm in user comments can adversely affect the quality of the SA. When sarcastic content is included in what would be considered to be a positive statement, the meaning is intended to be negative, and vice versa [2]. The use of sarcasm is particularly prevalent in the context of political exchanges, such as in the case of discussion forums. The overall effect of sarcasm is to 'flip' the expressed sentiment [3]. A failure to detect sarcasm will clearly affect the output from SA systems [4].

Extensive work directed at overcoming the sarcasm problem, using a range of techniques, has been reported [5]. However, to the best knowledge of the authors, no work has been conducted whereby sarcasm detection and classification have been incorporated into a sentiment classification framework as proposed in this paper.

The objective of the work presented in this paper can be summarized as: given an opinionated text comment x, determine whether x expresses a positive or negative sentiment after considering whether sarcasm is present or not. More specifically, this paper presents a sentiment classification framework that incorporates sarcasm detection and classification. Initial sentiment classification is performed on the preprocessed texts, from which features are extracted and selected. Sarcasm detection and classification are performed later. The aim is firstly to detect the presence of sarcasm, and then, as a consequence, to flip the initial classification. Actual sentiment classification is thus performed at the end of the process.

2. RELATED WORK
Sarcasm is a type of verbal irony that implies the opposite meaning of the literal meaning of what was said [6], [7]. Along with hyperbole, jocularity, rhetorical questions and understatements, the idea is to convey a combination of obvious and more subtle interpersonal meanings. However, the study of sarcasm in linguistics [8], [9] and computation [10]–[12] has indicated that the presence of sarcasm in a negative text does not always indicate the opposite of what the speaker meant, hence when undertaking SA we cannot simply reverse the 'polarity'.

There has been some previous work directed at extracting opinions from text (commentary) that may feature sarcastic content. The system proposed in [13] adopts a Natural Language Processing (NLP) approach and features a process comprising eight steps, which detects sarcastic opinions and consequently flips polarity at the fifth step, and determines the final polarity value for each target opinion at the eighth step. To detect sarcastic statements, the authors suggested adopting the 'contextual valence shifter' proposed in [14]. Valence calculation was performed at the sentence level using a positive and negative valence corpus to flip polarity. The authors also employed the work presented in [15], where pattern and punctuation based features were used, and the work of [16], where lexical and pragmatic features were used, to characterise a sarcastic word. However, no experiments were reported concerning the evaluation of the proposed framework.

In [17] a parsing-based unsupervised approach was proposed directed at Twitter data. The system adopted two approaches to identify sarcastic tweets, the first using parsing-based lexicon generation and the second using interjection word enrichment. Polarity identification was conducted in an automated manner. Supervised learning was then used to classify sentiment as negative, positive or neutral. The proposed system was tested using two sets of tweets, one that featured the sarcasm hashtag and one that did not. The best F-score recorded for sarcastic hashtag tweets was 0.84 using the lexicon generation approach, and 0.90 for the interjection word enrichment approach, outperforming results obtained using the set of tweets without the sarcastic hashtag.

The system presented in [18] was also directed at detecting the presence, or otherwise, of sarcasm in social media comment. The system operated in the following manner: comment acquisition, post-processing of the acquired comments, corpus creation, feature extraction and selection, and final classification. For the evaluation reported, tweets were used which had been annotated and indexed according to the hashtags produced by users. The lexical features used were n-grams (unigrams and bigrams) contained in LIWC [19] and WordNet-Affect [20]. Emoticons, punctuation and common ground (user reply and name characteristics) were used as pragmatic features. Chi-squared feature selection was applied to identify useful features. Naive Bayes, Logistic Regression and Support Vector Machines (SVM) were employed for the classification. It was found that the binary classification outperformed the three-way classification when an SVM classifier was employed. A best accuracy of 0.783 was reported in the context of polarity based classification (positive versus negative), and an accuracy of 0.730 in the case of sarcastic versus negative classification.

The systems briefly described above [13], [17], [18] use unsupervised or supervised approaches to identify sarcastic text (commentary). The systems classified text as being either positive, negative or sarcastic. In this paper some of the ideas presented with respect to these three systems have been adapted so that a positive or negative sentiment classification can be arrived at regardless of whether the text includes sarcasm or not.

3. THE PROPOSED FRAMEWORK
In this paper, we propose a framework to support SA that consists of six main modules: (i) preprocessing of text, (ii) extraction of features, (iii) feature selection, (iv) initial sentiment classification, (v) sarcasm detection and classification, and (vi) final (actual) sentiment classification. Figure 1 shows the proposed framework. Initial sentiment classification refers to regular sentiment classification before considering sarcasm detection, while final sentiment classification refers to sentiment classification after performing sarcasm detection and classification. In this framework, the most critical module is sarcasm detection and classification. It was conjectured that the ability of the proposed work to identify accurately the presence of sarcasm in texts would result in better final sentiment classification with respect to initial sentiment classification. Details concerning each of the above modules, with respect to the proposed framework, are given in Sub-sections 3.1 to 3.6.

[Figure 1: flowchart. Comment, Preprocessing, Feature Extraction, Feature Selection, Initial Sentiment Classification, decision "Positive?" (Yes/No) branching to Sarcasm Detection and Classification (Positivity) or Sarcasm Detection and Classification (Negativity), then Actual Sentiment Classification.]
Figure 1. The framework to support SA using sarcasm detection and classification

3.1. Preprocessing
Social media text contains a significant amount of noise including: spelling errors, non-standard words, stylistic words, short form words and repetitions. Classification accuracy degrades as the amount of noise increases [21]. The presence of noisy text also causes 'dispersion', where the same feature is treated as several different features, which results in poor performance when building a classifier [22]. The preprocessing module employed in this paper involved tokenization, spell checking and stopword removal. Tokenization breaks the corpus into words and symbols such as punctuation marks and hashtags (#). A correspondence dictionary was used to correct misspelled words. Stopword removal was performed to eliminate meaningless words.
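
The three preprocessing steps can be sketched as follows. This is a minimal illustration only: the correction dictionary and stopword set are small hypothetical stand-ins for the resources used in the paper, the tokenizer is a simple regular expression, and lowercasing is an added assumption.

```python
import re

# Hypothetical stand-ins; the paper's dictionaries and stopword lists are not reproduced here.
CORRECTIONS = {"sgt": "sangat", "gud": "good"}
STOPWORDS = {"yang", "dan", "the", "is"}

# A hashtag, a word, or a single non-space symbol (e.g. one punctuation mark).
TOKEN_RE = re.compile(r"#\w+|\w+|[^\w\s]")

def preprocess(text):
    """Tokenize, correct misspellings via dictionary lookup, drop stopwords."""
    tokens = TOKEN_RE.findall(text.lower())            # tokenization
    tokens = [CORRECTIONS.get(t, t) for t in tokens]   # spell correction
    return [t for t in tokens if t not in STOPWORDS]   # stopword removal

print(preprocess("The servis ni sgt gud!! #sarcasm"))
```

Punctuation marks and hashtags survive as tokens here because later modules use them as features.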

3.2. Feature Extraction
Three categories of NLP based features were considered: syntactic, pragmatic and prosodic. This feature combination mechanism, proposed in [23], was adopted because it had demonstrated improved sarcasm detection in comparison with comparator mechanisms. The output of this module was a set of feature vectors (one per text), each comprised of Term Frequency-Inverse Document Frequency (TF-IDF) values normalized to document (text) length.
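
The exact TF-IDF weighting is not specified in the text. The sketch below assumes one common variant (term count divided by text length, multiplied by the natural log of the inverse document frequency) purely to illustrate what a length-normalized TF-IDF vector looks like; the function name is hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of token lists. Returns (vocabulary, one vector per doc).

    Assumed weighting: tf = count / text length, idf = ln(N / df).
    """
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))              # document frequency of each term
    vocab = sorted(df)
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        length = len(doc) or 1           # guard against empty texts
        vectors.append([(counts[t] / length) * math.log(n_docs / df[t])
                        for t in vocab])
    return vocab, vectors

vocab, vecs = tfidf_vectors([["good", "good", "!"], ["bad", "!"]])
print(vocab)
print(vecs)
```

Under this variant a term that occurs in every text (here "!") receives a weight of zero.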

3.2.1. Syntactic Feature Extraction
Syntactic features play an important role in providing information concerning the syntactic structure of documents. In this paper, a common type of syntactic feature, Part of Speech (POS) tags, was used. Four groups of POS tags were selected: NOUN, VERB, ADJECTIVE and ADVERB. The Penn Treebank POS tagset [24] was chosen to tag the tokenized words. Each of the tags was then mapped to its corresponding group. Only the tokenized words associated with the four selected POS groups, as described above, were retained in the text. A word-tag pair representation was used to represent the syntactic features, as this has been shown to produce improved sentiment classification performance compared to using words or tags alone [25].
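
The mapping from Penn Treebank tags to the four groups, and the construction of word-tag pairs, might look like the sketch below. It assumes the text has already been tagged (for instance with a Penn Treebank tagger such as NLTK's `nltk.pos_tag`); the prefix-based mapping and the `word_GROUP` string format are illustrative assumptions, not the paper's stated representation.

```python
# Penn Treebank tags share a two-letter prefix within each group:
# NN/NNS/NNP/NNPS, VB/VBD/VBG/VBN/VBP/VBZ, JJ/JJR/JJS, RB/RBR/RBS.
PENN_PREFIX_TO_GROUP = {
    "NN": "NOUN",
    "VB": "VERB",
    "JJ": "ADJECTIVE",
    "RB": "ADVERB",
}

def word_tag_pairs(tagged_tokens):
    """Keep only words in the four POS groups and emit word-tag pair features."""
    pairs = []
    for word, tag in tagged_tokens:
        group = PENN_PREFIX_TO_GROUP.get(tag[:2])
        if group is not None:            # drop determiners, punctuation, etc.
            pairs.append(f"{word.lower()}_{group}")
    return pairs

tagged = [("The", "DT"), ("service", "NN"), ("was", "VBD"),
          ("really", "RB"), ("great", "JJ"), ("!", ".")]
print(word_tag_pairs(tagged))
```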

3.2.2. Pragmatic Feature Extraction
Pragmatic features are intended to emphasize the meaning of the content of sentences that may include sarcasm [26]. Emoticons, 'heavy' punctuation, hashtags (#) and repeated words are examples of pragmatic features. Punctuation marks are considered to be pragmatic features, instead of sentence segmenters, because of their potential to indicate sarcasm [27]. Heavy punctuation, for example a high number of occurrences of various punctuation marks, is often an indicator of the presence of sarcasm in text (comment). The punctuation marks considered in this framework were: question marks (?), exclamation marks (!) and quotation marks (' ' and " "). Hashtags (#) were also considered, as they are commonly used to indicate the presence of sarcasm [28]. The length of a sequence of punctuation marks was reduced to a maximum of three characters to avoid dispersion.
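
One possible reading of the truncation rule is sketched below: runs of the same mark longer than three characters are shortened to three, after which the four pragmatic marks are counted. Whether the original work truncated same-mark runs or mixed runs, and whether it used counts, is not stated, so both choices are assumptions; the function names are hypothetical.

```python
import re

MARKS = "?!'\""   # question, exclamation and quotation marks

def truncate_runs(text, max_run=3):
    """Shorten any run of one repeated mark to at most max_run characters."""
    pattern = "([" + re.escape(MARKS) + "])\\1{" + str(max_run) + ",}"
    return re.sub(pattern, lambda m: m.group(1) * max_run, text)

def pragmatic_counts(text):
    """Count the four pragmatic marks after truncation."""
    t = truncate_runs(text)
    return {
        "question": t.count("?"),
        "exclamation": t.count("!"),
        "quotation": t.count("'") + t.count('"'),
        "hashtag": t.count("#"),
    }

print(truncate_runs("Bagus!!!!! Really??"))
print(pragmatic_counts('Bagus!!!!! "Really"?? #sarcasm'))
```

Note that this simple counter also counts apostrophes inside words as quotation marks; a fuller implementation would need to distinguish the two.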

3.2.3. Prosodic Feature Extraction
Prosodic features involve different pitches, loudness, timing and tempos in writing [29]. Interjections are an example of prosodic features. A list of interjections [30] was employed to compare and extract the interjections from the text.
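
Extraction against an interjection list amounts to a lexicon lookup, as in this small sketch. The entries shown are illustrative examples only and are not taken from the list cited as [30].

```python
# Illustrative entries only; the actual interjection list [30] is not reproduced here.
INTERJECTIONS = {"wow", "amboi", "aduh", "yay"}

def extract_interjections(tokens, lexicon=INTERJECTIONS):
    """Return the tokens that appear in the interjection lexicon (case-insensitive)."""
    return [t for t in tokens if t.lower() in lexicon]

print(extract_interjections(["Wow", "servis", "bagus", "amboi"]))
```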

3.3. Feature Selection
Feature selection was applied to retain only the most significant features. To this end, Pearson's correlation coefficient was chosen. The features were ranked based on the generated coefficients, and only the top N features were used for classification purposes.
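
A ranking of this kind can be sketched as follows, assuming each feature column is correlated with a numeric class label and features are ranked by the absolute value of the coefficient. The use of the absolute value, and the treatment of constant features as having zero correlation, are assumptions of this sketch.

```python
import math

def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0                       # constant column: treated as uncorrelated
    return cov / (sx * sy)

def top_n_features(X, y, n):
    """Return the indices of the n features most correlated with the label."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:n]

X = [[1, 5, 0], [2, 5, 1], [3, 5, 0], [4, 5, 1]]   # 4 texts, 3 features
y = [0, 0, 1, 1]                                    # binary class labels
print(top_n_features(X, y, 1))
```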

3.4. Initial Sentiment Classification
The initial sentiment classification module classifies text as displaying either positive or negative sentiment. In the work presented in this paper, a non-linear SVM was used to generate the classifier because it has been shown to perform well in the context of supervised classification [18]. The variation of non-linear SVM used was LibSVM [31], as provided within the Weka [32] data mining workbench.

3.5. Sarcasm Detection and Classification
This module was derived from an approach to detect and classify sarcasm reported in [12]. It has two sub-processes: sarcasm detection and classification. Figure 2 shows the process of sarcasm detection and classification of a given text after initial sentiment classification has been performed. The aim of this module was to identify and classify texts that contain sarcastic features. Texts that have been identified as positive by the initial sentiment classification will be further classified as either positive sarcastic (sarcastic features occur in the text) or true positive. Similarly, texts that have been identified as negative by the initial sentiment classification module will be further classified as either negative sarcastic or true negative.

[Figure 2: decision diagram. A Negative Comment is labelled Negative sarcastic (Sarcastic) or True negative (Non-sarcastic); a Positive Comment is labelled Positive sarcastic (Sarcastic) or True positive (Non-sarcastic).]
Figure 2. Sarcasm detection and classification of sentiment

3.6. Actual Sentiment Classification
As mentioned in the foregoing sections, sarcastic content tends to reverse the actual sentiment of the texts. Therefore, once sarcasm has been detected, actual sentiment classification will be performed using polarity flipping [33], [34]. Polarity flipping is employed to reverse the initial sentiment classification results based on linguistic hypotheses. Two hypotheses were considered in this paper: to flip all texts identified as containing sarcasm (positive and negative sarcastic), or to flip only positive texts containing sarcasm (positive sarcastic).

3.6.1. Flip Both Positive Sarcastic and Negative Sarcastic
The first hypothesis considers sarcasm as indicating the opposite of what was literally said [6], [7]. When sarcastic content is used in a positive statement, the speaker is actually saying something negative, and vice versa [2], [28]. Based on this hypothesis, the polarity of positive sarcastic text will be flipped to negative, and negative sarcastic text will be flipped to positive. Figure 3(a) shows the polarity flip based on this hypothesis.
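
[Figure 3: two panels, (a) and (b), each mapping Positive sarcastic, True positive, True negative and Negative sarcastic texts to a final Positive or Negative label.]
Figure 3. Polarity flipping based on (a) the presence of both positive and negative sarcasm (Hypothesis 1) and (b) the presence of positive sarcasm only (Hypothesis 2)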

3.6.2. Flip Positive Sarcastic Only
The second hypothesis states that sarcasm in a negative statement does not always deliver the opposite of what the speaker meant. This hypothesis was derived from the linguistic studies of sarcasm presented in [8], [9] and the computational experiments reported in [10]-[12]. Based on this hypothesis, only the polarity of positive sarcastic text should be flipped to negative, whilst negative sarcastic text should remain negative. Figure 3(b) shows the polarity flip based on this second hypothesis.
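
The two flipping rules can be written as one small function. The label strings and the function name are hypothetical; the logic follows the two hypotheses as described in Sub-sections 3.6.1 and 3.6.2.

```python
def actual_sentiment(initial, sarcastic, hypothesis=2):
    """Derive the actual sentiment from the initial label and the sarcasm flag.

    hypothesis=1: flip both positive sarcastic and negative sarcastic texts.
    hypothesis=2: flip positive sarcastic texts only.
    """
    if initial not in ("positive", "negative"):
        raise ValueError("initial must be 'positive' or 'negative'")
    if hypothesis not in (1, 2):
        raise ValueError("hypothesis must be 1 or 2")
    if not sarcastic:
        return initial                   # true positive / true negative: unchanged
    if initial == "positive":
        return "negative"                # positive sarcastic: flipped under both
    # negative sarcastic: flipped only under Hypothesis 1
    return "positive" if hypothesis == 1 else "negative"

for h in (1, 2):
    print(h, actual_sentiment("negative", True, hypothesis=h))
```

The two hypotheses differ only in how negative sarcastic texts are treated, which is the case the loop above prints.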

4. EXPERIMENTAL SETUP AND RESULTS
This section describes the dataset used to evaluate the framework and the nature of the experiments conducted, and discusses the results obtained.

4.1. The Dataset
To evaluate the proposed framework, a Malay social media dataset was used [12]. The texts were annotated by three human annotators. Three annotations were produced: (i) the sentiment of the text (positive, negative or neutral), (ii) the existence of sarcasm or otherwise (sarcastic vs. non-sarcastic), and (iii) where sarcasm was considered to exist, the positivity or negativity of the sarcasm (positive sarcasm vs. positive sentiment, and negative sarcasm vs. negative sentiment). Only texts labeled as positive or negative, and agreed on by all annotators, were considered in the experiments. Hence, a subset of 1970 texts was derived from the original 3000 texts.

4.2. Experimental Setup
The aim of the experiments was to evaluate the effectiveness of the proposed framework in supporting sentiment analysis. To achieve this, five sets of experiments were conducted. The first was to identify the performance of the initial sentiment analysis on the dataset and features used. The second was to measure the performance of sarcasm detection. The third and fourth sets of experiments were conducted to evaluate the performance of positive and negative sarcasm classification. The final set of experiments was conducted to evaluate the performance of actual sentiment classification, where polarity flipping was used to reverse the initial sentiment of texts in which the presence of sarcasm had been detected. All experiments were conducted using 10-fold cross validation and the Weka Knowledge Flow [35].

4.2.1. Preprocessing
The dataset was first tokenized, followed by spellchecking and stopword removal as described in Sub-section 3.1. Malay and English dictionaries were used to correct misspelled words. Stopword removal was applied using both Malay [36] and English [37] stopword lists.

4.2.2. Feature Extraction and Selection
During the feature extraction stage, both the original bilingual dataset and its translation to English were considered. The reason was to preserve the original and translated features from the text that might include sarcasm in Malay and/or English. The process of feature extraction consists of two main steps: (i) extraction of the pragmatic and prosodic (Malay) features from the original bilingual data, and (ii) translation of the bilingual data to English and extraction of English prosodic and syntactic features.

In the first step, the pragmatic and Malay prosodic features were extracted from the preprocessed original bilingual dataset. Four pragmatic features were extracted: question marks (?), exclamation marks (!), quotation marks (' ' and " "), and hashtags (#). The Malay prosodic features were identified using the Malay list of interjections. In the end, 40 prosodic features were extracted from the original bilingual dataset.

In the second step, the original bilingual dataset was translated to English using Google Translate [38]. Although the resulting translations were by no means perfect, they were judged to be sufficiently accurate to support further analysis, and better than the translations obtained using Moses or Bing [39]. The English prosodic features were identified using an English list of interjections. In the end, a total of 26 prosodic features were extracted from the translated dataset. Next, syntactic features were extracted. Four POS tag groups (NOUN, VERB, ADJECTIVE and ADVERB) were extracted using the Python Natural Language Toolkit (NLTK) [40]. Word-tag pairs were extracted to represent each of the texts. The total number of syntactic features obtained was 3695.

All the extracted features were then vectorized and normalized to the individual text lengths (TF-IDF). With respect to feature selection, the top 25%, 50% and 75% of the features were selected based on the Pearson's correlation coefficient ranking. Details of the number of features for each set of experiments, and the size of the dataset in each case, are given in Table 1.
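
As a quick consistency check, the feature counts reported in this sub-section (4 pragmatic, 40 Malay prosodic, 26 English prosodic and 3695 syntactic) sum to the full feature set size of 3765 listed in Table 1, and the 25%, 50% and 75% subset sizes in Table 1 match those fractions rounded half up. The rounding rule is an inference from the reported numbers, not something stated in the text.

```python
import math

def subset_size(total, fraction):
    """Round half up (Python's built-in round() rounds halves to even)."""
    return math.floor(total * fraction + 0.5)

# Feature counts reported in Sub-section 4.2.2.
total = 4 + 40 + 26 + 3695
print(total)

# Full feature set sizes listed in Table 1, with their 25% / 50% / 75% subsets.
for full in (3765, 2056, 2744):
    print(full, [subset_size(full, f) for f in (0.25, 0.50, 0.75)])
```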

Table 1. The Number of Features Used for Experimentation

Experiment                          Dataset size   % Feature Selection size
                                                   25%     50%     75%     Full
Initial sentiment classification    1970           941     1883    2824    3765
Sarcasm detection
Sarcasm positivity classification   802            514     1028    1542    2056
Sarcasm negativity classification   1168           686     1372    2058    2744
Actual sentiment classification     1970           941     1883    2824    3765

4.3. Experimental Results
The experiments were designed to consider only binary classification. Average F-measure ($F_{avg}$) was used to measure classification performance, formulated as:

$$F_{avg} = \frac{F_i \times c_i + F_j \times c_j}{c_i + c_j} \qquad (1)$$

where $F_i$ is the F-measure for class $i$ and $c_i$ is the number of documents in class $i$, while $F_j$ is the F-measure for class $j$ and $c_j$ is the number of documents in class $j$. The F-measure ($F$) is the harmonic mean of precision and recall for each class $i$ and $j$.
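
Equation (1) is a class-size-weighted mean of the two per-class F-measures. A minimal sketch, with hypothetical function names and made-up input values chosen only for illustration:

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall for one class."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def f_avg(f_i, c_i, f_j, c_j):
    """Equation (1): per-class F-measures weighted by class sizes."""
    return (f_i * c_i + f_j * c_j) / (c_i + c_j)

# Made-up values: class i has 100 documents, class j has 300.
print(f_avg(f_measure(0.8, 0.8), 100, f_measure(0.6, 0.6), 300))
```

Because the larger class carries more weight, the result lies closer to the F-measure of class j in this example.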

4.3.1. Initial Sentiment Classification
In this set of experiments, the texts were classified as having either positive or negative sentiment. Table 2 shows the results obtained. The best sentiment classification performance was recorded when using the top 25% of the features, with an $F_{avg}$ score of 0.839. The worst was recorded when all features were used for classification ($F_{avg}$ = 0.611).

Table 2. Results of Initial Sentiment Classification (Positive vs. Negative)

Experiment                          Average F-measure (Favg) by % Feature Selection (FS) size
                                    25%     50%     75%     Full
Initial sentiment classification    0.839   0.623   0.754   0.611

4.3.2. Sarcasm Detection
For the second set of experiments, the texts were classified as being either sarcastic or non-sarcastic. The best sarcasm detection was recorded when using the top 50% of the features, with an $F_{avg}$ score of 0.852, as shown in Table 3. As in the case of the results obtained with respect to sentiment classification, sarcasm detection with all features produced the worst performance, with an $F_{avg}$ score of 0.664.

Table 3. Results of Sarcasm Detection and Sarcasm Classification

Experiment                          Average F-measure (Favg) by % Feature Selection (FS) size
                                    25%     50%     75%     Full
Sarcasm detection                   0.755   0.852   0.737   0.664
Sarcasm positivity classification   0.942   0.787   0.776   0.767
Sarcasm negativity classification   0.909   0.797   0.614   0.593

4.3.3. Sarcasm Positivity and Negativity Classification
Table 3 also shows the results for sarcasm classification. For positive sarcasm classification, the texts were classified as being either positive sarcastic or true positive. The best result was again recorded when using the top 25% of the features, with an $F_{avg}$ score of 0.942. Negative sarcasm classification, where texts were classified as being negative sarcastic or true negative, produced a lower best $F_{avg}$ score of 0.909. In fact, negative sarcasm classification produced lower performance than positive sarcasm classification in most cases, regardless of the number of features used. This may be because negative sarcastic features are harder to distinguish from true negative texts than positive sarcastic features are from true positive texts.

4.3.4. Actual Sentiment Classification
The results of the actual sentiment classification are shown in Table 4. In this experiment, the texts were classified in terms of positive or negative sentiment. When polarity flipping was applied to both positive and negative sarcastic texts, the best $F_{avg}$ score recorded was 0.899, using the top 25% of the features. However, the best $F_{avg}$ result with respect to actual sentiment classification was recorded when polarity flipping was applied only to the positive sarcastic texts, where an $F_{avg}$ score of 0.905 was recorded using the top 25% of the features. In most cases, the best performing results were obtained when using only the top 25% of the features. With flipping applied only to positive sarcastic texts, the actual sentiment classification improved on the initial sentiment classification at every feature selection size; with flipping applied to both, it improved at every size except 75% (0.671 against 0.754 in Table 2).

Table 4. Results of Actual Sentiment Classification

Experiment                                             Average F-measure (Favg) by % Feature Selection (FS) size
                                                       25%     50%     75%     Full
Actual sentiment classification
(Flip both positive sarcastic & negative sarcastic)    0.899   0.715   0.671   0.666
Actual sentiment classification
(Flip positive sarcastic only)                         0.905   0.903   0.903   0.900

4.4. Analysis of Results
Based on the results shown in Tables 2, 3 and 4, it can be observed that the performance of the sentiment classification improved by 0.066 in $F_{avg}$ (6.6 percentage points) after considering sarcastic texts: the initial sentiment classification produced a best $F_{avg}$ of 0.839, while the actual sentiment classification produced a best $F_{avg}$ of 0.905.

5. CONCLUSION
This paper has presented a framework to support SA by utilizing sarcasm detection and classification. A framework comprised of six modules is proposed: preprocessing, feature extraction, feature selection, initial sentiment classification, sarcasm detection and classification, and actual sentiment classification. A non-linear SVM was used for classification purposes with respect to the reported experiments. Comparison of SA without sarcasm detection (initial sentiment classification) against classification with sarcasm detection (actual sentiment classification) demonstrated that the latter produced a better classification performance.

ACKNOWLEDGEMENTS
This work was supported by Universiti Malaysia Sabah (UMS) through a grant GUG0061-TK-2/2016 and Artificial Intelligence Research Unit, UMS.
BIOGRAPHIES OF AUTHORS
Mohd Suhairi Md Suhaimin is a lecturer of Information Technology and General Studies at Kuching Community College, Sarawak. He has completed postgraduate study in Computer Science at Universiti Malaysia Sabah (UMS). His research interests span both data mining and data visualization. He currently leads the project 'Sarcasm Detection and Classification to Support Sentiment Analysis', focusing on bilingual Malay and English content. He has explored the presence and implications of sarcasm in sentiment analysis.
Mohd Hanafi Ahmad Hijazi is an Associate Professor of Computer Science at the Faculty of Computing and Informatics, Universiti Malaysia Sabah in Malaysia. His research work addresses the challenges in knowledge discovery and data mining to identify patterns for prediction on structured and/or unstructured data; his particular application domains are medical image analysis and understanding, and sentiment analysis on social media data. He has authored/co-authored more than 30 journal papers, book chapters and conference papers, most of which are indexed by Scopus and ISI Web of Science. He has also served on the program and organizing committees of numerous national and international conferences.
Rayner Alfred is an Associate Professor of Computer Science at the Faculty of Computing and Informatics, Universiti Malaysia Sabah in Malaysia. He leads and defines projects around knowledge discovery and information retrieval that focus on building smarter mechanisms that enable knowledge discovery in structured and unstructured data. His work addresses the challenges related to big data problems. He has authored and co-authored more than 85 journal papers, book chapters, conference papers and editorials.
Frans Coenen has a general background in AI and Machine Learning and over the last ten years has been working in the field of Big Data Analytics as applied to unusual data sets, such as: (i) graphs and social networks, (ii) time series, (iii) free text of all kinds, (iv) 2D and 3D images, particularly medical images, and (v) video data. He is also interested in data mining over encrypted data. He currently leads a small research group working on many aspects of machine learning. He has some 380 refereed publications on Machine Learning and AI related research. Frans Coenen is currently a professor within the Department of Computer Science at the University of Liverpool.