International Journal of Electrical and Computer Engineering (IJECE)
Vol. 8, No. 6, December 2018, pp. 5311~5317
ISSN: 2088-8708, DOI: 10.11591/ijece.v8i6.pp5311-5317

Journal homepage: http://iaescore.com/journals/index.php/IJECE

Complaint Analysis in Indonesian Language Using WPKE and RAKE Algorithm

Rini Wongso, Novita Hanafiah, Jaka Hartanto, Alexander Kevin, Charles Sutanto, Fiona Kesuma
Computer Science Department, School of Computer Science, Bina Nusantara University, Indonesia

Article Info

Article history:
Received Dec 25, 2018
Revised Jul 3, 2018
Accepted Jul 22, 2018

Keyword:
Complaint analysis
RAKE Algorithm
Twitter
WPKE Algorithm

ABSTRACT

Social media provides convenience in communicating and enables two-way communication that allows companies to interact with their customers. Companies can use information obtained from social media to analyze how communities respond to their services or products. The biggest challenge in processing information from social media such as Twitter is the unstructured sentences, which can lead to incorrect text processing. However, this information is very important for companies’ survival. In this research, we propose WPKE, a method to extract keywords from tweets in the Indonesian language. We compare it with RAKE, a language-independent algorithm that is commonly used for keyword extraction. Finally, we develop a clustering method to group the topics of complaints, using a data set obtained from Twitter with the “komplain” hashtag. Our method obtains an accuracy of 72.92%, while RAKE obtains only 35.42%.

Copyright © 2018 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:
Rini Wongso,
Computer Science Department, School of Computer Science,
Bina Nusantara University,
Jl K. H. Syahdan No. 9, Palmerah, Jakarta, 11480, Indonesia.
Email: rwongso@binus.edu

1. INTRODUCTION
Social networks have become an influential means of communication, as they allow interaction between acquaintances in different societies [1]. Social media is currently a lifestyle for most people around the world [2]. Twitter, one of the social media applications, has approximately 500 million tweets and 307 million active users, as stated in Live Stats in 2017 [3]. Twitter is known to be used for many purposes, such as protests, political campaigns, marketing, and commenting on services or products [4]. According to G. Ghedin, the high diffusion of Twitter clearly reflects what happens in Indonesia’s marketing world [5]. Dozens of companies use Twitter as the perfect medium to interact with their clients. However, it is not easy to evaluate the popularity or acceptance rate of products or services, because the information is scattered and there is no way to manage it well. Given the amount of information that goes through Twitter, it takes time for managers to analyze the core of the complaints, and some tweets are not meaningful. The process becomes more efficient when a machine is used to extract the core of a sentence (the keyword).

Extracting a keyword from a short sentence is one of the challenges in the area of natural language processing. N. Hanafiah states that people tend to use unstructured sentences when expressing their thoughts on social media: sentences with incorrect grammar, many abbreviations, typographical errors, and emoticons [6]. The unstructured sentences need to be normalized so that the machine can understand the words. Afterwards, the keyword extraction algorithm can be applied to get the core of the complaints in the tweets. Despite the difficulties stated above, these data are certainly very useful for a company that wants to know the communities’ responses towards its products or services. With these data, companies can make appropriate decisions for their sustainability [4].

Research on keyword extraction has combined natural language processing approaches that identify part-of-speech (POS) tags with supervised learning, machine learning algorithms, or statistical methods. R. Mihalcea and P. Tarau describe a system that applies syntactic filters to identify POS tags [7]. POS tags are used to select the words to be evaluated as keywords. The co-occurrences of the selected words are accumulated within a word co-occurrence graph, and TextRank (a graph-based ranking algorithm) is used to rank the words based on their associations in the graph. Keywords are then selected from the top-ranking words. The research reported that TextRank performed best when only nouns and adjectives are selected as candidate keywords.

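The graph-based ranking described above can be illustrated with a minimal sketch. It assumes the word list has already been filtered by POS tag (for example, nouns and adjectives only) and builds an unweighted, undirected co-occurrence graph. The function name `textrank`, the window size, the damping factor, and the number of iterations are illustrative choices and are not taken from this paper.

```python
from collections import defaultdict


def textrank(words, window=2, damping=0.85, iterations=50):
    """Rank words by a PageRank-style score on a co-occurrence graph.

    `words` is assumed to be pre-filtered (e.g. nouns and adjectives only).
    Two words are linked when they appear within `window` consecutive words.
    """
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        for j in range(i + 1, min(i + window, len(words))):
            if words[j] != word:
                neighbors[word].add(words[j])
                neighbors[words[j]].add(word)

    # Start from a uniform score and apply the ranking update repeatedly.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical, already filtered word sequence.
ranking = textrank(["jaringan", "lambat", "jaringan", "hilang", "jaringan", "putus"])
print(ranking[0][0])
```

In this toy input every other word co-occurs only with "jaringan", so that word collects the highest rank. The repeated update loop is what distinguishes TextRank from single-pass scoring methods.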
In text processing, an algorithm is needed to obtain keywords, and one of the algorithms often used is RAKE (Rapid Automatic Keyword Extraction). The research in [8] compares the performance of RAKE and TextRank using the same data set as in [7]. S. Rose et al. describe RAKE as an unsupervised, domain-independent, and language-independent method for extracting keywords from individual documents [8]. It is based on the observation that keywords frequently contain multiple words but rarely contain standard punctuation or stop words, such as the function words "and", "the", "of", or other words with minimal lexical meaning. The input parameters for RAKE are a list of stop words and a set of phrase delimiters, which are used to parse the document text into candidate keywords. Co-occurrences of words within these candidate keywords are meaningful and are used to score the candidates. RAKE begins keyword extraction on a document by parsing the text into a set of candidate keywords. After every candidate keyword is identified and the graph of co-occurrences is complete, a score is calculated for each candidate keyword as the sum of its member word scores. Next, the top T scoring candidates are selected as keywords of the document. In short, RAKE first removes the stop words from the document and defines the candidate keywords, then calculates word scores based on the degree and frequency of word vertices in the graph: word frequency, word degree, and the ratio of degree to frequency [9]. In the experiment of [8], RAKE achieves higher precision and similar recall in comparison to TextRank. RAKE can also score keywords in a single pass, while TextRank requires repeated iterations to achieve convergence on word ranks.

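The RAKE steps summarized above can be sketched as follows. This is a minimal illustration, not the reference implementation from [8]: the function name `rake`, the punctuation set used as phrase delimiters, and the one-word stop list in the example are assumptions made for the sketch. Each word is scored by the ratio of degree to frequency, and each candidate by the sum of its word scores.

```python
import re
from collections import defaultdict


def rake(text, stop_words):
    """Return candidate keywords with scores, highest first."""
    # 1. Split on punctuation, then on stop words, to get candidate phrases.
    phrases = []
    for fragment in re.split(r"[.,;:!?()\"]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in stop_words:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)

    # 2. Word frequency and word degree (co-occurrences within phrases,
    #    counting the word itself).
    frequency = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            frequency[word] += 1
            degree[word] += len(phrase)

    # 3. Candidate score = sum of degree/frequency over its member words.
    scored = {}
    for phrase in phrases:
        scored[" ".join(phrase)] = sum(degree[w] / frequency[w] for w in phrase)
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical input with a one-word stop list ("dan" means "and").
result = rake("jaringan internet lambat dan sinyal hilang", {"dan"})
print(result)
```

Because the score grows with phrase length, an incomplete stop word list leaves long phrases unsplit and those phrases then dominate the ranking.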
Research by Jungiewicz and Łopuszyński uses RAKE for keyword extraction from Polish documents in the procurement field [10]. RAKE is largely language-independent, as it was not developed for one particular language. RAKE depends on the stop word list, with the general idea of separating a text into groups of words at each separator or word from the stop word list. Each group of words is considered a candidate keyword, and a score is calculated based on the co-occurrence graph.

According to the survey by S. Siddiqi and A. Sharan, various techniques can be used in text mining to extract keywords and keyphrases [11]. Both keywords and keyphrases are needed to analyze large amounts of material in text form. Keywords and keyphrases are word representations of a document that give a high-level specification of its content, and they are usually used for index generation, query refinement, and text summarization. In this method, significant words in a document are chosen without depending on any vocabulary or extracted words from the document.

J. Greenberg et al. compare four open source algorithms for keyword extraction: RAKE, Tagger, Kea, and Maui [12]. According to their experiments, RAKE produces 98.57% unique words (69 unique words of 70 extracted words), while the best result is obtained by Tagger (100%, 50 unique words of 50 extracted words).

RAKE is a language-independent system. However, the development of stop word lists in Indonesian is not as complete as in English. Hence, we propose the WPKE (Weight Priority Keyword Extraction) algorithm, which has a higher accuracy on Indonesian tweets. The ranking process is done by giving each word an initial weight, which we derived from an analysis of complaint tweets. Next, the weight is adjusted by considering the relationship between words in Indonesian grammar. The keyword is then passed to a grouping phase, which counts the keywords that appear in the tweets to produce the chart. Our WPKE algorithm works well on Indonesian tweets compared with the RAKE algorithm.

2. RESEARCH METHOD
Figure 1 illustrates the method proposed in this research. It begins by collecting the input text from Twitter. Our data set consists of tweets in the Indonesian language that mention the accounts of companies’ customer relation centers. The text is pre-processed to make the unstructured sentences more understandable by a machine. The normalization technique used is based on the previous research in [6], which developed a technique to normalize Indonesian text in the complaint category using data from Twitter and achieved an accuracy of around 90%. The steps are divided into a cleaning process, out-of-vocabulary (OOV) detection, and word replacement. Keywords are then extracted using the WPKE algorithm, and words with similar meaning are grouped to produce the complaint categories. The output is visualized in a chart to simplify further analysis.

[Flowchart: Input (Text) -> Text Normalization -> Keyword Extraction -> Keyword Grouping -> Output (Groups)]
Figure 1. The proposed method

a. Keyword Extraction
Keyword extraction begins after the input text is normalized. We experimented with two algorithms. First, we apply the RAKE algorithm [8] to extract keywords from the tweets by splitting the tweets into sentences and removing the less meaningful words using a stop word list. This step generates a list of candidate keywords. Next, a score is calculated for each candidate keyword according to the frequency and degree of its words. An example of a RAKE result is shown in Table 1.

Table 1. Keywords extracted by using RAKE
Input: “kecewa order Goxxx tapi tidak ada tnggapan dari supir supir saya telepon tidak menjawab si supir sndri juga tidak menelpon saya”
Keywords: [('kecewa order goxxx', 10.0), ('si supir sndri', 7.026666666666667), ('tnggapan', 2.0), ('menelpon', 2.0), ('telepon', 2.0), ('supir supir', 0.7200000000000001)]

There are 6 candidate keywords extracted from the tweet, each with its score. Based on the scores, the final keyword of this sentence is the candidate with the highest value, which is “kecewa order goxxx”. We then evaluated the algorithm by asking several people who understand the Indonesian language to review the keywords it obtained, and the result was not satisfactory. After some experiments and analysis, we propose a method called WPKE (Weight Priority Keyword Extraction), which works based on certain weighting schemes. The steps of the method are illustrated in Figure 2.

[Flowchart: Input (Text in Normal Form) -> Pattern Analysis -> Weight Initialization -> Weight Adjustment -> Sorting and Ranking -> Output (Keyword)]
Figure 2. Our proposed WPKE method

The WPKE method begins by defining the pattern of the tweet. Based on our analysis, a keyword in a tweet is typically constructed from one of the following patterns: (1) Noun + Noun, (2) Noun, (3) Verb + Noun, (4) Pronoun. First, we give an initial weight to each word type according to the rules described in Table 2. We give a value of 1 to Noun, because the thing discussed in a sentence is usually an object. Meanwhile, the initial weight for an adjective is 0, since an adjective is not a typical keyword that can show the essence of a tweet, although it is usually used in a sentence in relation to a noun. Values of 0.5 and 0.1 are given to Verb and Pronoun respectively, according to the possibility that a keyword in a tweet is a Verb or a Pronoun. Other types of words that are not described in Table 2 are ignored.

Table 2. Initial Weight Rules in WPKE method
Type         Initial Weight
Noun         1
Verb         0.5
Pronoun      0.1
Adjective    0

Furthermore, the current weight is adjusted based on the order in which the words are formed, as shown in Table 3. The candidate score is calculated using the formula:

Candidate Score = Initial Weight - Word Distance + Additional Weight.

For example, the normalized tweet “Saya kecewa nasinya bau” has the word types “Pronoun Adjective Noun Adjective”. Each word in the tweet is processed, starting from the first word. The first word “saya” produces the phrase combinations “saya kecewa”, “saya nasinya”, and “saya bau”. The second word “kecewa” produces the phrase combinations “kecewa nasinya” and “kecewa bau”, while the third word produces only one phrase combination, “nasinya bau”. The score for the candidate “saya kecewa” is -0.9: the initial weight of the word “saya” (Pronoun) is 0.1, the word distance between “saya” and “kecewa” is 1, and the additional weight is 0 (because none of the patterns shown in Table 3 is formed). The remaining candidates “saya nasinya”, “saya bau”, “kecewa nasinya”, “kecewa bau”, and “nasinya bau” have candidate scores of -1.9, -2.9, -1, -2, and 0.5 respectively. The final keyword is the candidate with the highest score.

Table 3. Weight Adjustment Rules in WPKE method
Patterns                          Additional Weight
Noun + “tidak” + Adjective        1
Noun + Noun/Verb + Adjective      0.5
Noun + Noun                       0.5
Others                            0
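
The candidate scoring in the worked example above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the names `wpke_candidates` and `additional_weight` are hypothetical, only two-word candidates are generated, and POS tags are supplied by hand. The worked example gives “nasinya bau” a score of 0.5, which implies an additional weight of 0.5 for a Noun directly followed by an Adjective. The sketch adopts that reading of Table 3 as an assumption.

```python
# Initial weights from Table 2; word types not listed are ignored.
INITIAL_WEIGHT = {"Noun": 1.0, "Verb": 0.5, "Pronoun": 0.1, "Adjective": 0.0}


def additional_weight(words, tags):
    """Additional weight for the span from the first to the last candidate word."""
    if len(tags) == 3 and tags[0] == "Noun" and tags[2] == "Adjective":
        if words[1] == "tidak":
            return 1.0  # Noun + "tidak" + Adjective
        if tags[1] in ("Noun", "Verb"):
            return 0.5  # Noun + Noun/Verb + Adjective
    if tags == ["Noun", "Noun"]:
        return 0.5  # Noun + Noun
    if tags == ["Noun", "Adjective"]:
        return 0.5  # ASSUMPTION: inferred from the worked example, not a Table 3 row
    return 0.0  # Others


def wpke_candidates(tagged):
    """Score every ordered word pair: initial weight - distance + additional weight."""
    scores = {}
    for i, (first, first_tag) in enumerate(tagged):
        if first_tag not in INITIAL_WEIGHT:
            continue
        for j in range(i + 1, len(tagged)):
            second, second_tag = tagged[j]
            if second_tag not in INITIAL_WEIGHT:
                continue
            span = tagged[i : j + 1]
            words = [w for w, _ in span]
            tags = [t for _, t in span]
            scores[f"{first} {second}"] = (
                INITIAL_WEIGHT[first_tag] - (j - i) + additional_weight(words, tags)
            )
    return scores


tweet = [("saya", "Pronoun"), ("kecewa", "Adjective"),
         ("nasinya", "Noun"), ("bau", "Adjective")]
scores = wpke_candidates(tweet)
best = max(scores, key=scores.get)
print(best)  # nasinya bau
```

The distance penalty keeps adjacent words together, so the additional weight only has to break ties among short candidates.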
b. Keyword Grouping
Typically, several tweets share the same main topic, so we want to collect similar keywords into groups and rank them. The grouping method is based on a given input, which is a Twitter account or a topic. Prior to keyword grouping, we prepared the data by querying the following fields from the database: “kepada”, “perihal”, “tweet_hasil”, and “inti”. These data are obtained from the normalization process, except for “inti”, which comes from the keyword extraction process described in the Keyword Extraction subsection above. The query result is filtered according to two criteria: (1) tweets are selected when “kepada” or “perihal” contains the input, without requiring a perfect match (%input%); (2) the input must not be preceded by any other characters, to avoid processing irrelevant tweets. The details of the second criterion can be seen in Table 4 below. The last row shows the “Rejected” status, since “ab” appears inside the word “akrab” in the tweet.

Table 4. Tweet Processing Criteria
Input    Tweet                        Status
ab       saya kecewa dengan ab        Accepted
ab       saya kecewa dengan abcare    Accepted
ab       ab mengecewakan              Accepted
ab       abcare mengecewakan          Accepted
ab       Saya kecewa tidak akrab      Rejected
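
The second criterion can be expressed as a small text filter. The sketch below is an assumption about how such a check might be written; it is not the database query used in this research. The function name `mentions_input` is hypothetical, and "not preceded by any other characters" is interpreted as "at the start of the text or directly after whitespace".

```python
import re


def mentions_input(tweet, term):
    """True when `term` starts a word in `tweet` (it may be followed by more letters)."""
    # (?<!\S) means the match must not be preceded by a non-space character.
    pattern = r"(?<!\S)" + re.escape(term)
    return re.search(pattern, tweet, flags=re.IGNORECASE) is not None


# The rows of Table 4.
rows = [
    "saya kecewa dengan ab",
    "saya kecewa dengan abcare",
    "ab mengecewakan",
    "abcare mengecewakan",
    "Saya kecewa tidak akrab",
]
for tweet in rows:
    status = "Accepted" if mentions_input(tweet, "ab") else "Rejected"
    print(f"{tweet}: {status}")
```

A plain substring match (%input%) would accept all five rows. The look-behind is what rejects “akrab”.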
Keyword grouping begins by adding a flag that records whether a tweet has been processed. The keyword phrase obtained from “inti” is divided into words, so that a keyword can be searched for without requiring a perfect phrase match. These words are used to search for the same group of words in other tweets. The algorithm calculates how frequently a keyword occurs among the keywords of other tweets. This process produces an ordering of the words that appear most often, and words with the same frequency are ordered by word length in ascending order.

An example result of keyword grouping is shown in Figure 3. The keyword in tweet-5 does not contribute to word 1 or word 2, so this tweet goes through keyword grouping again (loop). The original sentence of tweet-5 is “internetnya lambat payah buat kecewa berat aja”, where the word “internet” causes this tweet to appear in the query results. When querying word 1 and word 2 of tweet-5 returns no result, this keyword goes to the Extended Grouping phase.

Figure 3. Keyword Grouping Example

We enhanced the grouping process so that similar tweets that have not yet been processed can also be grouped together. This is needed to reduce the number of complaint keywords shown in the chart. Extended grouping uses the list of keywords that has the most similar set of words to search for the right keyword in the tweet. This process reinforces keywords that have a high score and lets keywords with a low score be filtered out. The sorting in the previous step is important because it determines which keyword is given the first opportunity to be matched against the tweets that have not been processed. The extended grouping process takes the keyword with the highest score and searches for a match among the sentences that contain the same keyword. As in the previous keyword grouping process, the score is then added and the flag status is updated.

3. RESULTS AND ANALYSIS
Our experiment was done on a small data set (50 tweets) retrieved from tweets that mention two telecommunication service companies in Indonesia and use the hashtag “komplain” (#komplain). In this paper the company names are disguised as ‘@com1’ and ‘@com2’. Another company discussed as an example in this section is a commercial company in Indonesia, disguised as ‘@comm3’. All the tweets are in the Indonesian language. Based on the keyword extraction experiment with RAKE and WPKE, we obtained the result shown in Table 5.

Table 5. Comparison of Keyword Extraction using RAKE and WPKE
         Number of Correct Keywords    Number of Incorrect Keywords    Percentage
RAKE     6                             44                              12%
WPKE     18                            32                              36%

Based on the result in Table 5, the WPKE method obtained 18 correct keywords (36%) from 50 tweets, while the RAKE algorithm obtained only 6 correct keywords (12%). Checking was done manually by selected reviewers, who were given the tweet data and asked for the keywords. The correct keywords of 26 tweets (52%) could not be found by either RAKE or WPKE. For example, for the tweet “(@com1) jaringan comm1 kenapa nih leletnya super (#komplain)”, RAKE gave the keyword “leletnya super” while WPKE gave “jaringan comm1”. In this example, WPKE is considered correct and RAKE incorrect. The error in the RAKE algorithm comes from the selection of words that failed to be normalized because of the English word “super”, while WPKE successfully extracts the complaint keyword because the pattern “Noun + Noun” is found. According to our analysis, in several cases RAKE tends to take most of the tweet as the keyword. For the tweet “(@comm3) kalau emang tidak bisa nanogram barang pas hari sabtu tidak usah email pesan kirim barang buang-buang waktu aja (#SAMPAH) (#Komplain)”, RAKE gives the long output “email pesan kirim barang buang-buang”, which affects the keyword search result.

Based on the result, WPKE provides more precise output because it recognizes patterns that have been adjusted to the keyword patterns of complaints in the Indonesian language. The RAKE algorithm is at a disadvantage because it relies on a stop word list, and no complete list of stop words is available for the Indonesian language. The lack of stop words makes the RAKE algorithm tend to produce longer, less precise keywords, and it leads to larger computational loads, as the number of word checks increases with the length of the keyword obtained. The RAKE algorithm is also better suited to keyword extraction from a document than from text such as microblogs, which have a maximum length of only 140 characters. Moreover, RAKE tokenizes words according to a list of stop words and then calculates the frequency, the degree, and the appearance of words in the document.

We obtained the result shown in Table 6 for the grouping process using both keyword grouping and extended grouping.

Table 6. Comparison of Keyword Grouping and Extended Grouping
              Keyword Grouping    Extended Grouping
Total         21                  28
Percentage    52.25%              70%

The result in Table 6 was obtained from a total of 40 data items, taken randomly from Twitter with the “komplain” hashtag in the Indonesian language. This is a different data set from the one used for keyword extraction above. Table 6 shows that running the process in two passes, keyword grouping followed by extended grouping, gives an increase of 17.25%. "Total" is the number of data items successfully grouped together, and "Percentage" is that number in proportion to the total number of data items. There are 12 data items that extended grouping cannot place, because not enough data pass to create a new set. For example, there is a complaint about the name of com2, expressing disappointment about why that name is used. Because only a few tweets (in this case only 1) raise this complaint, the data are not sufficient and a new group is not created, which leaves the tweet unprocessed.

Based on the results in Tables 5 and 6, we ran another experiment that compares RAKE combined with Extended Grouping against WPKE with Keyword Grouping and Extended Grouping. The result is shown in Table 7 below. It uses another data set of 48 data items from the Twitter accounts @com1 and @com2, both of which are telecommunication service companies in Indonesia.

Table 7. Experiment Result
              WPKE + Keyword Grouping    WPKE + Keyword Grouping + Extended Grouping    RAKE + Extended Grouping
Total         27                         35                                             17
Percentage    56.25%                     72.92%                                         35.42%
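
The percentages in Table 7 follow directly from the totals and the data set size of 48. The short check below recomputes them. The helper name `accuracy` is only illustrative.

```python
def accuracy(correct, total):
    """Percentage of correctly grouped items, rounded to two decimals."""
    return round(100 * correct / total, 2)


DATASET_SIZE = 48  # number of tweets used for Table 7
totals = {
    "WPKE + Keyword Grouping": 27,
    "WPKE + Keyword Grouping + Extended Grouping": 35,
    "RAKE + Extended Grouping": 17,
}
for method, correct in totals.items():
    print(f"{method}: {accuracy(correct, DATASET_SIZE)}%")
```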
An example of a tweet successfully processed by extended grouping is “(@com2) saya mau tanya ini kenapa jaringan com2 di telepon saya di daerah kabupaten Kendal kok sinyalnya tidak ada (#komplain) plggn”. With keyword grouping we obtained the keyword “daerah kabupaten”, while after the extended grouping process we obtained “jaringan”. This happens because a keyword grouped from an earlier tweet can change the keyword to “jaringan”: the tweet contains the word “jaringan”, and “jaringan” has the highest score, so it is prioritized in the search.

We visualized the results of the complaint analysis in a chart in which each horizontal bar shows the number of tweets that contain complaints about a certain topic, as can be seen in Figure 4.

Figure 4. Bar Chart for Complaint Analysis towards @com2

4. CONCLUSION
Based on the experiments conducted, we conclude the following: (1) The WPKE method proposed in this research shows a significant increase in accuracy compared to the RAKE algorithm, because the stop word list for the Indonesian language is not yet well developed. (2) The result obtained using WPKE + Keyword Grouping + Extended Grouping has an accuracy of 72.92%, which exceeds RAKE + Extended Grouping with only 35.42%. In the future, we plan to develop a stop word list for the Indonesian language, as it can lead to a significant improvement for all natural language processing in the Indonesian language.

REFERENCES
[1] K. A. Al-Enezi, I. F. T. Al Shaikhli, and S. S. M. AlDabbagh, “The Influence of Internet and Social Media on Purchasing Decisions in Kuwait,” Indonesian Journal of Electrical Engineering and Computer Science (IJEECS), vol. 10, no. 2, pp. 792–797, 2018.
[2] C.-L. Hsu, C.-C. Yu, and C.-C. Wu, “Exploring the continuance intention of social networking websites: an empirical research,” Inf. Syst. E-bus. Manag., vol. 12, no. 2, pp. 139–163, 2014.
[3] R. A. Setiawan and D. B. Setyohadi, “Analisis Komunikasi Sosial Media Twitter sebagai Saluran Layanan Pelanggan Provider Internet dan Seluler di Indonesia,” J. Inf. Syst. Eng. Bus. Intell., vol. 3, no. 1, pp. 16–25, 2017.
[4] P. K. Kumar and S. Nandagopalan, “Insights to Problems, Research Trend and Progress in Techniques of Sentiment Analysis,” International Journal Electrical and Computer Engineering (IJECE), vol. 7, no. 5, pp. 2818–2822, 2017.
[5] G. Ghedin, “From customer care to religion, the Twitter explosion in Indonesia | Digital in the round.” [Online]. Available: http://www.digitalintheround.com/indonesia-twitter/. [Accessed: 03-Jul-2018].
[6] N. Hanafiah, A. Kevin, C. Sutanto, Y. Arifin, and J. Hartanto, “Text Normalization Algorithm on Twitter in Complaint Category,” Procedia Comput. Sci., vol. 116, pp. 20–26, 2017.
[7] R. Mihalcea and P. Tarau, “Textrank: Bringing order into text,” in Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 2004.
[8] S. Rose, D. Engel, N. Cramer, and W. Cowley, “Automatic keyword extraction from individual documents,” Text Min. Appl. Theory, pp. 1–20, 2010.
[9] N. Naw and E. E. Hlaing, “Relevant words extraction method for recommendation system,” Bulletin of Electrical Engineering and Informatics (BEEI), vol. 2, no. 3, pp. 169–176, 2013.
[10] M. Jungiewicz and M. Łopuszyński, “Unsupervised keyword extraction from Polish legal texts,” in International Conference on Natural Language Processing, 2014, pp. 65–70.
[11] S. Siddiqi and A. Sharan, “Keyword and keyphrase extraction techniques: a literature review,” Int. J. Comput. Appl., vol. 109, no. 2, 2015.
[12] J. Greenberg, Y. Zhang, A. Ogletree, G. J. Tucker, and D. Foley, “Threshold Determination and Engaging Materials Scientists in Ontology Design,” in Research Conference on Metadata and Semantics Research, 2015, pp. 39–50.

BIOGRAPHIES OF AUTHORS
Rini Wongso completed her bachelor and master degrees, majoring in Computer Science, at Bina Nusantara University, Jakarta, Indonesia in 2014. She is a lecturer and researcher in the Artificial Intelligence field at Bina Nusantara University, Jakarta, Indonesia. She previously worked as a Java Developer, developing Banking and HR Systems. She is interested in the fields of Machine Learning, Computer Vision, Natural Language Processing, Artificial Intelligence Applications, and Software Systems.

Novita Hanafiah received the M.Sc degree in software system engineering from KMNUTNB, Thailand, in 2013. Her research on entity recognition was conducted at RWTH Aachen in 2012. She is currently a lecturer and subject content coordinator at Bina Nusantara University. Her main areas of research interest are artificial intelligence, natural language processing, and software systems.

Jaka Hartanto completed his bachelor degree, majoring in Computer Science, and his master degree, majoring in General Management, at Bina Nusantara University, Jakarta, Indonesia in 2007. He is a lecturer and researcher in the Software Engineering field at Bina Nusantara University. He is also a founder of PT BIG and a System Analyst at JJ knowit.