Indonesian Journal of Electrical Engineering and Computer Science
Vol. 12, No. 1, October 2018, pp. 46~50
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v12.i1.pp46-50
Journal homepage: http://iaescore.com/journals/index.php/ijeecs
Random Forest Approach for Sentiment Analysis in Indonesian Language
M. Ali Fauzi
Faculty of Computer Science, Brawijaya University, Malang, Indonesia
Article Info
Article history:
Received May 5, 2018
Revised Jul 6, 2018
Accepted Jul 10, 2018

ABSTRACT
Sentiment analysis has become very useful since the rise of social media and online review websites, which created the need to analyze the sentiment they contain in an effective and efficient way. Sentiment analysis can be treated as a text classification problem with sentiments as its categories. In this study, we explore the use of Random Forest for sentiment classification in the Indonesian language. We also explore bag-of-words (BOW) features with several term weighting methods: Binary TF, Raw TF, Logarithmic TF, and TF.IDF. The experimental results show that the sentiment analysis system using Random Forest performs well, with an average OOB score of 0.829. The results also show that all four term weighting methods give competitive results. Since the score differences are small, we can say that the choice of term weighting method in this study has no remarkable effect on sentiment analysis using Random Forest.
Keywords:
Random forest
Sentiment analysis
Term weighting
Text classification
TF.IDF
Copyright © 2018 Institute of Advanced Engineering and Science. All rights reserved.
Corresponding Author:
M. Ali Fauzi,
Faculty of Computer Science,
Brawijaya University, Malang, Indonesia.
Email: moch.ali.fauzi@ub.ac.id
1. INTRODUCTION
Nowadays, people tend to write about their experiences, feelings, opinions, and views on events, products, or services on online platforms such as social media, blogs, forums, shopping sites, and review sites. This makes online platforms a source of highly valuable information for both consumers and producers. Customers get second opinions before purchasing products or services. Producers, on the other hand, learn what people think about their products or services and can predict the level of public acceptance. This information can be very useful for improvement and marketing strategies [1].
Sentiment analysis is the task of analyzing people's opinions in a piece of text in order to determine whether the sentiments are positive, negative, or neutral. Sentiment analysis has been gaining popularity over the past years as a result of the rise of social media and online review websites and, thus, the need to analyze their sentiment in an effective and efficient way. Sentiment analysis is currently a major research field with many applications in a large number of domains, such as election result prediction [2]-[4], stock market prediction [5], [6], product and merchant ranking [7], movie revenue prediction [8]-[10], and learning evaluation [11], [12].
We can consider sentiment analysis as a text classification problem with sentiments as its categories. Therefore, we can use supervised machine learning approaches to tackle this problem. This approach is very popular in sentiment analysis and has proven to work very well in this field. Machine learning approaches that have been used in this field include Naive Bayes [13]-[17], Support Vector Machines [18]-[19], Maximum Entropy [20], Neural Networks [21], [22], decision tree and K-Nearest Neighbor (KNN) [23]-[26].
In this study, we explore the use of Random Forest for sentiment classification in the Indonesian language. Random Forest is an ensemble learning technique based on the decision tree algorithm [27]. Random Forests have been gaining popularity in recent years, since the performance of this type of algorithm has
surpassed SVMs, Naïve Bayes, and other machine learning algorithms for classification tasks in some domains, such as bioinformatics and computational biology [28]. We examine whether this type of ensemble method is also outstanding on sentiment analysis tasks. In this study, we also explore bag-of-words (BOW) features with several term weighting methods, namely Binary TF, Raw TF, Logarithmic TF, and TF.IDF.
2. RESEARCH METHOD
As depicted in Figure 1, the sentiment analysis system in this study consists of three main stages: preprocessing, feature extraction, and classification using Random Forest. The output of the classification is one of two categories, positive or negative.
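Figure 1. System main flowchart (Document → Preprocessing → Feature Extraction → Sentiment classification using Random Forest → Sentiment Analysis Result)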
2.1. Preprocessing
The first stage of this system is preprocessing. This stage involves several processes: tokenization, case folding, and cleaning. Tokenization is the task of splitting the review text into smaller units called tokens or terms [29], [30]. Case folding is the task of converting all characters in the review text to lowercase [31], [32]. Cleaning is the task of removing punctuation, numbers, HTML tags, and characters outside of the alphabet. In this study, we do not employ stemming and filtering, since in some previous works on sentiment analysis, stemming and filtering did not improve classification performance.
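The three preprocessing steps can be illustrated with a minimal Python sketch. The function name `preprocess`, the regular expressions, and the sample review are assumptions made for this example only; the paper does not state how each step was implemented. The sketch applies case folding and cleaning before splitting on whitespace, which should give the same tokens as tokenizing first for simple inputs like this one.

```python
import re
from typing import List

def preprocess(review: str) -> List[str]:
    """Case-fold, clean, and tokenize one review (illustrative sketch only)."""
    text = review.lower()                    # case folding: everything to lowercase
    text = re.sub(r"<[^>]+>", " ", text)     # cleaning: drop HTML tags
    text = re.sub(r"[^a-z\s]", " ", text)    # cleaning: drop digits, punctuation, non-alphabet characters
    return text.split()                      # tokenization: split on whitespace

# Hypothetical review text, not taken from the dataset used in the paper.
print(preprocess("Produk ini <b>BAGUS</b> banget, 100% recommended!"))
# ['produk', 'ini', 'bagus', 'banget', 'recommended']
```

No stemming or stop-word filtering is applied, in line with the choice described above.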
2.2. Feature Extraction
Bag-of-words (BOW) features are used in this study. Each document is represented as a vector in a term space, in which the unique terms from the preprocessing stage become its features. The feature vector values are determined using a term weighting method. The most popular term weighting methods are Term Frequency (TF), Inverse Document Frequency (IDF), and the combination of the two, Term Frequency-Inverse Document Frequency (TF.IDF) [33].
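To make the bag-of-words representation concrete, the following Python sketch builds a vocabulary from tokenized documents and maps each document to a vector over that vocabulary. The helper names `build_vocabulary` and `to_count_vector` and the two toy documents are hypothetical; the values stored here are plain occurrence counts, and any of the weighting schemes discussed in this section could be substituted.

```python
def build_vocabulary(tokenized_docs):
    """Collect the unique terms of the corpus in a fixed (sorted) order."""
    return sorted({term for doc in tokenized_docs for term in doc})

def to_count_vector(tokens, vocabulary):
    """Represent one document as a vector with one entry per vocabulary term."""
    return [tokens.count(term) for term in vocabulary]

# Hypothetical tokenized reviews (output of a preprocessing stage).
docs = [["bagus", "banget"], ["mahal", "banget", "mahal"]]
vocab = build_vocabulary(docs)
print(vocab)                                         # ['bagus', 'banget', 'mahal']
print([to_count_vector(d, vocab) for d in docs])     # [[1, 1, 0], [0, 1, 2]]
```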
Term Frequency assigns weights by assuming that each term has a contribution proportional to the number of its occurrences in the document [34], [35]. There are some popular variations of TF, such as Binary TF, Raw TF, and Logarithmic TF. Using Binary TF, each document is represented as a binary vector. A term that occurs in a document gets the value 1 in the document vector, whereas a term that never occurs in the document gets the value 0. This kind of term weighting does not consider the number of term occurrences, only 0/1 values. In contrast to Binary TF, the Raw TF method does consider the number of term occurrences. A term gets a value based on how many times it appears in the document.
Meanwhile, Logarithmic TF also considers the number of term occurrences. The difference is that Logarithmic TF
assumes that the importance of a term in a document does not increase proportionally with how many times it occurs. The weight of term t in document d using Logarithmic TF can be computed as follows:
\[ \mathrm{TF}(t,d) = 1 + \log(f_{t,d}) \tag{1} \]
where $f_{t,d}$ is the number of times term t appears in document d.
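The three TF variants can be compared side by side with a short Python sketch. The function name `tf_weights` and the toy vocabulary are hypothetical. Two points are assumptions rather than statements from the paper: the logarithm is taken to base 10 (the base is not specified), and a term that does not occur in the document is given weight 0 under Logarithmic TF, since the logarithm of zero is undefined.

```python
import math
from collections import Counter

def tf_weights(tokens, vocabulary, scheme="raw"):
    """Weight each vocabulary term for one document with Binary, Raw, or Logarithmic TF."""
    counts = Counter(tokens)
    weights = []
    for term in vocabulary:
        f = counts[term]                              # number of occurrences of the term
        if f == 0:
            weights.append(0.0)                       # absent terms get 0 in every scheme (assumption for "log")
        elif scheme == "binary":
            weights.append(1.0)                       # presence only
        elif scheme == "raw":
            weights.append(float(f))                  # plain occurrence count
        elif scheme == "log":
            weights.append(1.0 + math.log10(f))       # 1 + log(f), base 10 assumed
        else:
            raise ValueError(f"unknown scheme: {scheme}")
    return weights

vocab = ["bagus", "banget", "mahal"]                  # hypothetical vocabulary
doc = ["bagus", "bagus", "banget"]                    # hypothetical tokenized review
for scheme in ("binary", "raw", "log"):
    print(scheme, tf_weights(doc, vocab, scheme))
```

For the term that occurs twice, Raw TF gives 2 while Logarithmic TF gives roughly 1.3, which shows how the logarithm dampens repeated occurrences.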
Meanwhile, Inverse Document Frequency is a global term weighting computed from the distribution of the term across the dataset. This term weighting gives a higher value to a rare term, that is, a term that appears only in certain documents. The weight of term t using IDF is formulated as follows:
\[ \mathrm{IDF}(t) = 1 + \log\left(\frac{N_d}{df_t}\right) \tag{2} \]
where $N_d$ is the number of documents in the dataset and $df_t$ is the number of documents in the dataset in which term t appears.
The most popular term weighting is TF.IDF. TF.IDF is the product of TF and IDF. The combined weight of term t in document d can be computed as follows [36]:
\[ \mathrm{TF.IDF}(t,d) = \mathrm{TF}(t,d) \times \mathrm{IDF}(t) \tag{3} \]
where $\mathrm{TF}(t,d)$ is the TF value of term t in document d and $\mathrm{IDF}(t)$ is the IDF value of term t.
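A small worked example may help here. The Python sketch below reads Equation (2) as IDF(t) = 1 + log(N_d / df_t) and Equation (3) as the product TF(t,d) × IDF(t). The function names, the toy corpus, the base-10 logarithm, and the use of Raw TF as the TF component are all assumptions made for illustration; the paper does not state which TF variant or log base enters its TF.IDF.

```python
import math

def idf(term, documents):
    """IDF(t) = 1 + log10(N_d / df_t); base 10 is an assumption."""
    n_docs = len(documents)
    df = sum(1 for doc in documents if term in doc)   # documents containing the term
    if df == 0:
        raise ValueError(f"term {term!r} does not occur in the corpus")
    return 1.0 + math.log10(n_docs / df)

def tf_idf(term, doc, documents):
    """TF.IDF(t, d) = TF(t, d) * IDF(t), with Raw TF assumed as the TF component."""
    return float(doc.count(term)) * idf(term, documents)

# Hypothetical tokenized corpus of four short reviews.
corpus = [
    ["bagus", "banget"],
    ["bagus", "mahal"],
    ["kurang", "bagus"],
    ["mahal", "banget", "mahal"],
]
print(idf("bagus", corpus))    # common term (3 of 4 documents): lower IDF
print(idf("kurang", corpus))   # rare term (1 of 4 documents): higher IDF
print(tf_idf("mahal", corpus[3], corpus))
```

The rare term receives the larger IDF, which matches the description of IDF as favoring terms that appear in only a few documents.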
2.3. Sentiment Classification using Random Forest
The last stage is sentiment classification. Each review is classified into the positive or negative category. In this study, we employ Random Forest for the classification task. The Random Forest algorithm is a supervised classification algorithm. It is an ensemble learning technique based on the decision tree algorithm [27]. This ensemble technique combines the predictions of several base estimators constructed with the decision tree algorithm to improve robustness over an individual estimator. Random Forest grows many classification trees, which together are called a forest. To classify a new data instance, each tree gives its category prediction as one vote. The forest chooses the category that receives the majority of the votes. In general, the more trees in the random forest, the higher the resulting accuracy.
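The voting step can be sketched in a few lines of Python. This is only an illustration of majority voting, not the tree-growing procedure itself: the three "trees" are hand-written stand-in rules, and the name `forest_predict` and the feature names are hypothetical.

```python
from collections import Counter

def forest_predict(trees, x):
    """Let every tree cast one vote and return the category with the most votes."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]

# Stand-ins for trained decision trees: each maps a bag-of-words dict to a category.
toy_trees = [
    lambda x: "positive" if x.get("bagus", 0) > 0 else "negative",
    lambda x: "negative" if x.get("mahal", 0) > 0 else "positive",
    lambda x: "positive" if x.get("suka", 0) > 0 else "negative",
]

review = {"bagus": 1, "mahal": 1}            # hypothetical term counts of one review
print(forest_predict(toy_trees, review))     # negative (two of three votes)
```

With two categories, an odd number of trees avoids tied votes.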
Random Forests have been gaining popularity in recent years, since this type of algorithm has shown outstanding performance on classification tasks in some domains, such as bioinformatics and computational biology. There are also some works on text classification using Random Forest, such as hate speech detection [37] and authorship profiling [38].
3. RESULTS AND ANALYSIS
The experiment was conducted using 386 reviews taken from FemaleDaily. All of the reviews are in the Indonesian language. Instead of using cross-validation, Random Forest uses the out-of-bag (OOB) error estimate to obtain an unbiased estimate of classification performance. The OOB score ranges from 0 to 1. The higher the OOB score, the better the classification performance; conversely, a lower OOB score indicates worse classification performance. In the experiment, Random Forest is tested using several term weighting methods: Binary TF, Raw TF, Logarithmic TF, and TF.IDF. The experiment is conducted using the Scikit-learn library [39]. The results can be seen in Figure 2.
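The idea behind the out-of-bag estimate can be sketched with the standard library alone. The code below is not the Scikit-learn implementation used in the experiment: it replaces full decision trees with one-split stumps, uses a tiny made-up dataset of binary bag-of-words vectors, and all names (`train_stump`, `oob_score`, `X_toy`, `y_toy`) are hypothetical. What it does show is the mechanism: each tree is trained on a bootstrap sample, and every review is scored only by the trees whose bootstrap sample did not contain it.

```python
import random
from collections import Counter

def train_stump(X, y, rng):
    """Fit a one-split 'tree' on the best of a random subset of features."""
    n_features = len(X[0])
    k = max(1, int(n_features ** 0.5))               # features tried per tree
    best = None                                      # (hits, feature, label_if_present)
    for j in rng.sample(range(n_features), k):
        for label_if_present in (0, 1):
            hits = sum((label_if_present if row[j] else 1 - label_if_present) == t
                       for row, t in zip(X, y))
            if best is None or hits > best[0]:
                best = (hits, j, label_if_present)
    _, feature, label = best
    return lambda row: label if row[feature] else 1 - label

def oob_score(X, y, n_trees=50, seed=0):
    """Fraction of samples classified correctly by trees that never saw them."""
    rng = random.Random(seed)
    n = len(X)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_trees):
        bag = [rng.randrange(n) for _ in range(n)]   # bootstrap: sample with replacement
        in_bag = set(bag)
        tree = train_stump([X[i] for i in bag], [y[i] for i in bag], rng)
        for i in range(n):
            if i not in in_bag:                      # sample i is out-of-bag for this tree
                votes[i][tree(X[i])] += 1
    scored = [i for i in range(n) if votes[i]]       # samples with at least one OOB vote
    if not scored:
        return float("nan")
    correct = sum(votes[i].most_common(1)[0][0] == y[i] for i in scored)
    return correct / len(scored)

# Made-up binary vectors over the terms (bagus, suka, mahal, jelek); 1 = positive.
X_toy = [[1, 0, 0, 0], [1, 1, 0, 0], [0, 1, 0, 0], [1, 0, 1, 0],
         [0, 0, 1, 1], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 1]]
y_toy = [1, 1, 1, 1, 0, 0, 0, 0]
print(oob_score(X_toy, y_toy))
```

In Scikit-learn, the corresponding estimate is available by constructing `RandomForestClassifier` with `oob_score=True` and reading the fitted model's `oob_score_` attribute; the hyperparameters used in the paper's experiment are not reported.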
Figure 2. Sentiment analysis experiment result using random forest
[Bar chart "Sentiment Analysis using Random Forest": x-axis, Term Weighting Method (Binary TF, Raw TF, Logarithmic TF, TF.IDF); y-axis, OOB Score (ticks 0.8, 0.82, 0.84)]
Figure 2 shows that sentiment analysis using Random Forest performs well, with an average OOB score of 0.829. We can also see from Figure 2 that all four term weighting methods give competitive results; their OOB scores differ only slightly. The best OOB score, 0.837, is obtained by Raw TF. The lowest OOB score, 0.821, is obtained by Logarithmic TF. In second place is Binary TF with an OOB score of 0.829, and in third place is TF.IDF with an OOB score of 0.828. This result is somewhat surprising, because TF.IDF can usually outperform the other term weighting methods. However, since the score differences are small (at most 0.016), we can say that the choice of term weighting method in this study has no remarkable effect on sentiment analysis using Random Forest.
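The reported average and the size of the gap between methods can be checked directly from the four scores given above. The short Python snippet below only restates that arithmetic; the variable names are arbitrary.

```python
# OOB scores as reported in the text for each term weighting method.
oob_scores = {
    "Binary TF": 0.829,
    "Raw TF": 0.837,
    "Logarithmic TF": 0.821,
    "TF.IDF": 0.828,
}

mean_score = sum(oob_scores.values()) / len(oob_scores)
spread = max(oob_scores.values()) - min(oob_scores.values())

print(round(mean_score, 3))   # 0.829
print(round(spread, 3))       # 0.016
```

The mean of 0.82875 rounds to the reported 0.829, and the best and worst methods differ by 0.016.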
4. CONCLUSION
In this study, we explore Random Forest with several term weighting methods for sentiment analysis in the Indonesian language. The system in this study consists of three main stages: preprocessing, feature extraction, and classification using Random Forest. The output of the classification is one of two categories, positive or negative. The experimental results show that sentiment analysis using Random Forest performs well, with an average OOB score of 0.829. The results also show that all four term weighting methods give competitive results. Since the score differences are small, we can say that the choice of term weighting method in this study has no remarkable effect on sentiment analysis using Random Forest.
REFERENCES
[1] Jansen BJ, Zhang M, Sobel K, Chowdury A. Twitter power: Tweets as electronic word of mouth. Journal of the Association for Information Science and Technology. 2009 Nov 1;60(11):2169-88.
[2] Tumasjan A, Sprenger TO, Sandner PG, Welpe IM. Predicting elections with twitter: What 140 characters reveal about political sentiment. Icwsm. 2010 May 23;10(1):178-85.
[3] Bermingham A, Smeaton A. On using Twitter to monitor political sentiment and predict election results. In Proceedings of the Workshop on Sentiment Analysis where AI meets Psychology (SAAIP 2011) 2011 (pp. 2-10).
[4] Sang ET, Bos J. Predicting the 2011 dutch senate election results with twitter. In Proceedings of the workshop on semantic analysis in social media 2012 Apr 23 (pp. 53-60). Association for Computational Linguistics.
[5] Bollen J, Mao H, Zeng X. Twitter mood predicts the stock market. Journal of computational science. 2011 Mar 31;2(1):1-8.
[6] Zhang X, Fuehres H, Gloor PA. Predicting stock market indicators through twitter "I hope it is not as bad as I fear". Procedia-Social and Behavioral Sciences. 2011 Jan 1;26:55-62.
[7] McGlohon M, Glance NS, Reiter Z. Star Quality: Aggregating Reviews to Rank Products and Merchants. In ICWSM 2010 May 16.
[8] Mishne G, Glance NS. Predicting Movie Sales from Blogger Sentiment. In AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs 2006 Mar 27 (pp. 155-158).
[9] Joshi M, Das D, Gimpel K, Smith NA. Movie reviews and revenues: An experiment in text regression. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics 2010 Jun 2 (pp. 293-296). Association for Computational Linguistics.
[10] Sadikov E, Parameswaran AG, Venetis P. Blogs as Predictors of Movie Success. In ICWSM 2009 Mar 20.
[11] Ortigosa A, Martín JM, Carro RM. Sentiment analysis in Facebook and its application to e-learning. Computers in Human Behavior. 2014 Feb 28;31:527-41.
[12] Munezero M, Montero CS, Mozgovoy M, Sutinen E. Exploiting sentiment analysis to track emotions in students' learning diaries. In Proceedings of the 13th Koli Calling International Conference on Computing Education Research 2013 Nov 14 (pp. 145-152). ACM.
[13] Kang H, Yoo SJ, Han D. Senti-lexicon and improved Naïve Bayes algorithms for sentiment analysis of restaurant reviews. Expert Systems with Applications. 2012 Apr 30;39(5):6000-10.
[14] Antinasari P, Perdana RS, Fauzi MA. Analisis Sentimen Tentang Opini Film Pada Dokumen Twitter Berbahasa Indonesia Menggunakan Naive Bayes Dengan Perbaikan Kata Tidak Baku. Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer. 2017;1(12):1733-41.
[15] Gunawan F, Fauzi MA, Adikara PP. Analisis Sentimen Pada Ulasan Aplikasi Mobile Menggunakan Naive Bayes Dan Normalisasi Kata Berbasis Levenshtein Distance (Studi Kasus Aplikasi BCA Mobile). Systemic: Information System and Informatics Journal. 2017 Dec 31;3(2):1-6.
[16] Fauzi MA, Afirianto T. Improving Sentiment Analysis of Short Informal Indonesian Product Reviews using Synonym Based Feature Expansion. TELKOMNIKA (Telecommunication Computing Electronics and Control). 2018 Jun 1;16(3).
[17] Fanissa S, Fauzi MA, Adinugroho S. Analisis Sentimen Pariwisata di Kota Malang Menggunakan Metode Naive Bayes dan Seleksi Fitur Query Expansion Ranking. Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer. 2018;2(8):2766-70.
[18] Mullen T, Collier N. Sentiment Analysis using Support Vector Machines with Diverse Information Sources. In EMNLP 2004 Jul (Vol. 4, pp. 412-418).
[19] Rofiqoh U, Perdana RS, Fauzi MA. Analisis Sentimen Tingkat Kepuasan Pengguna Penyedia Layanan Telekomunikasi Seluler Indonesia Pada Twitter Dengan Metode Support Vector Machine dan Lexicon Based Features. Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer. 2017;1(12):1725-32.
[20] Batista F, Ribeiro R. Sentiment analysis and topic classification based on binary maximum entropy classifiers.
[21] Munir MM, Fauzi MA, Perdana RS. Implementasi Metode Backpropagation Neural Network berbasis Lexicon Based Features dan Bag of Words Untuk Identifikasi Ujaran Kebencian Pada Twitter. Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer e-ISSN. 2017;2548:964X.
[22] Lam SL, Lee DL. Feature reduction for neural network based text categorization. In Database Systems for Advanced Applications, 1999. Proceedings., 6th International Conference on 1999 (pp. 195-202). IEEE.
[23] Bilal M, Israr H, Shahid M, Khan A. Sentiment classification of Roman-Urdu opinions using Naïve Bayesian, Decision Tree and KNN classification techniques. Journal of King Saud University-Computer and Information Sciences. 2016 Jul 31;28(3):330-44.
[24] Nurjanah WE, Perdana RS, Fauzi MA. Analisis Sentimen Terhadap Tayangan Televisi Berdasarkan Opini Masyarakat pada Media Sosial Twitter menggunakan Metode K-Nearest Neighbor dan Pembobotan Jumlah Retweet. Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer. 2017;1(12):1750-57.
[25] Mentari ND, Fauzi MA, Muflikhah L. Analisis Sentimen Kurikulum 2013 Pada Sosial Media Twitter Menggunakan Metode K-Nearest Neighbor dan Feature Selection Query Expansion Ranking. Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer. 2018;2(8):2739-43.
[26] Claudy YI, Perdana RS, Fauzi MA. Klasifikasi Dokumen Twitter Untuk Mengetahui Karakter Calon Karyawan Menggunakan Algoritme K-Nearest Neighbor (KNN). Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer. 2018;2(8):2761-65.
[27] Breiman L. Random forests. Machine learning. 2001 Oct 1;45(1):5-32.
[28] Statnikov A, Aliferis CF. Are random forests better than support vector machines for microarray-based cancer classification?. In AMIA annual symposium proceedings 2007 (Vol. 2007, p. 686). American Medical Informatics Association.
[29] Fauzi MA, Arifin AZ, Gosaria SC. Indonesian News Classification Using Naïve Bayes and Two-Phase Feature Selection Model. Indonesian Journal of Electrical Engineering and Computer Science. 2017 Dec 1;8(3).
[30] Rosi F, Fauzi MA, Perdana RS. Prediksi Rating Pada Review Produk Kecantikan Menggunakan Metode Naïve Bayes dan Categorical Proportional Difference (CPD). Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer. 2018;2(5):1991-97.
[31] Lestari AR, Perdana RS, Fauzi MA. Analisis Sentimen Tentang Opini Pilkada Dki 2017 Pada Dokumen Twitter Berbahasa Indonesia Menggunakan Näive Bayes dan Pembobotan Emoji. Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer. 2017;1(12):1718-24.
[32] M. Ali Fauzi, Djoko Cahyo Utomo, Budi Darma Setiawan, and Eko Sakti Pramukantoro. 2017. Automatic Essay Scoring System Using N-Gram and Cosine Similarity for Gamification Based E-Learning. In Proceedings of the International Conference on Advances in Image Processing (ICAIP 2017). ACM, New York, NY, USA, 151-155. DOI: https://doi.org/10.1145/3133264.3133303
[33] Pramukantoro ES, Fauzi MA. Comparative analysis of string similarity and corpus-based similarity for automatic essay scoring system on e-learning gamification. In Advanced Computer Science and Information Systems (ICACSIS), 2016 International Conference on 2016 Oct 15 (pp. 149-155). IEEE.
[34] Fauzi MA, Arifin A, Yuniarti A. Term Weighting Berbasis Indeks Buku dan Kelas untuk Perangkingan Dokumen Berbahasa Arab. Lontar Komputer: Jurnal Ilmiah Teknologi Informasi. 2013;5(2).
[35] Suharno CF, Fauzi MA, Perdana RS. Klasifikasi Teks Bahasa Indonesia Pada Dokumen Pengaduan Sambat Online Menggunakan Metode K-Nearest Neighbors Dan Chi-Square. Systemic: Information System and Informatics Journal. 2017 Dec 7;3(1):25-32.
[36] Fauzi MA, Arifin AZ, Yuniarti A. Arabic Book Retrieval using Class and Book Index Based Term Weighting. International Journal of Electrical and Computer Engineering (IJECE). 2017 Dec 1;7(6):3705-10.
[37] Alfina I, Mulia R, Fanany MI, Ekanata Y. Hate Speech Detection in the Indonesian Language: A Dataset and Preliminary Study.
[38] Palomino-Garibay A, Camacho-González AT, Fierro-Villaneda RA, Hernández-Farias I, Buscaldi D, Meza-Ruiz IV. A random forest approach for authorship profiling. Cappellato et al.[8]. 2015.
[39] Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research. 2011;12(Oct):2825-30.