International Journal of Electrical and Computer Engineering (IJECE)
Vol. 10, No. 4, August 2020, pp. 3537~3549
ISSN: 2088-8708, DOI: 10.11591/ijece.v10i4.pp3537-3549
Journal homepage: http://ijece.iaescore.com/index.php/IJECE
Measuring information credibility in social media using combination of user profile and message content dimensions
Erwin B. Setiawan, Dwi H. Widyantoro, Kridanto Surendro
School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Indonesia
Article Info

Article history:
Received Mar 19, 2019
Revised Feb 3, 2020
Accepted Feb 12, 2020

ABSTRACT
Information credibility in social media is becoming the most important part of information sharing in society. The literature shows that no existing work labels information credibility based on user competencies and their posted topics. This paper improves the measurement of information credibility by adding 17 new features for Twitter and 49 new features for Facebook. In the first step, we perform a labeling process based on user competencies and their posted topics to classify the users into two groups, credible and not credible users, with respect to their posted topics. These approaches are evaluated over ten thousand samples of real-field data obtained from the Twitter and Facebook networks using Naive Bayes (NB), Support Vector Machine (SVM), Logistic Regression (Logit), and J48 Algorithm (J48) classifiers. With the proposed new features, the measured credibility of information provided in social media improves significantly, as indicated by better accuracy than the existing technique for all classifiers.
Keywords:
Facebook
Information credibility
Social media
Twitter
Copyright © 2020 Institute of Advanced Engineering and Science. All rights reserved.
Corresponding Author:
Erwin B. Setiawan,
School of Electrical Engineering and Informatics,
Institut Teknologi Bandung,
Ganesha Street no. 10, Bandung, Indonesia.
Email: erwinbudisetiawan@telkomuniversity.ac.id
1. INTRODUCTION
It cannot be denied that the popularity of social media has increased rapidly in recent years. Currently, about 320 million users are active each month on the micro-blogging site Twitter. Twitter is a global phenomenon: 77% of Twitter accounts are outside the United States, and Twitter supports 33 languages. Because of the efficiency, volume, and timeliness of their information, Online Social Networking (OSN) sites, for example twitter.com, have become an important source of information [1].

According to the Twitter blog, an average of about 340 million tweets were generated per day as of March 2012. In addition to receiving information from the people they "follow", people are increasingly looking for relevant topical tweets, which amounts to more than 1.6 billion requests to the Twitter search portal per day. In particular, learning about news is often an important motivation for people to read tweets [2], for example, in order to stay continuously updated about local emergencies [3].
One of the functions of OSN is to serve as a medium for sharing and searching for information [4, 5]. Each user can act as a source and a spreader of information, which is forwarded either in full or with modifications and additions. The role of OSN as a source of information is even more prominent in emergencies, in particular accidents, natural disasters, and incidents of terrorism, because it provides faster reports than conventional media [6-14].

However, false information that spreads on social media has serious consequences. Thus, a mechanism to automatically determine the credibility of a tweet is required.
Morris et al. conducted a survey to understand the perceptions of user credibility on Twitter [3]. Morris et al. also conducted an experiment to uncover the user-based and content-based features used to assess credibility. According to their results, user-based features can be grouped into three categories: influence, topical expertise, and reputation. The influence features include the number of followers, retweets, and mentions. The topical expertise features are obtained from the author's homepage, the author's imaging history, outside web pages that discuss the topic the author is conveying, and whether the author is in a location that is relevant to the topic. The reputation-based features show the user's familiarity with the Twitter author: whether the author is followed by the user, whether the author is someone the user has heard of before, or whether the author's account has been verified by Twitter. The content-based features that reveal the most about the credibility of a tweet are whether the tweet contains a reputable URL link, whether other tweets make the same claim as the tweet in question, whether it uses standard grammar, whether the author uses a personal profile photo or images related to the topics they are interested in, and the structure of the author's username.
Shariff [15] conducted a study to analyze how online social media users rate the credibility of tweets. In this study, 98 evaluators assessed the credibility level of 400 tweets. Shariff reveals that topics involving politics contain a number of tweets with low credibility. In addition, tweets that do not contain links, such as URLs, are often difficult for users to assess.
In addition, one of the earliest works that automatically predicted the credibility of news and tweets was conducted by Castillo [16]. This work applied two stages of data collection. First, the tweets considered newsworthy were labeled and saved. Second, 7 evaluators labeled the newsworthy tweets with credibility values. To obtain this annotation, Castillo used Amazon Mechanical Turk and labeled the tweets based on newsworthiness and credibility.
Furthermore, Gupta [17] used SVM ranking and Pseudo-Relevance Feedback (PRF) to rank the credibility of tweets. Gupta categorizes the features into two groups: content-based features and source-based features. In that study, manual labeling was carried out for the credibility level of tweets that propagated fake images of Hurricane Sandy, but the competency of the source/user who spread the tweet was not taken into account. Some key observations were made about the tweet features that correlate with credibility: tweets that have a large number of unique characters and contain URLs tend to be more trusted.
The latest research was conducted by Ross in 2016, with the aim of creating and selecting a range of features that would produce better performance when the training and testing datasets originate from two different years with different topics. The data used in that study are the manually labeled data used by Gupta in two different studies [18, 19].
Facebook poses more challenges in terms of information credibility than Twitter. Therefore, research on information credibility on Facebook is rarely conducted; one such study was conducted by Saikaew in 2015. Facebook is more challenging for several reasons. First, Twitter content is convenient to access through the Twitter API. Although Facebook has a Graph API with the ability to access content, access to the information is also limited by the Graph API itself. Second, Facebook has more active users than Twitter. In September 2017, about 2.061 billion users were active on Facebook, while 328 million users were active on Twitter [20]. Indonesia is ranked second, at 48%, among the countries with the most active social media users. Finally, compared to Twitter, Facebook has richer features, such as features that allow users to simply click and comment easily.
Several studies have discussed the credibility of information on popular social networking sites such as Twitter. However, Saikaew's research is the only one that focuses on calculating the value of information credibility on Facebook, which has more users. Saikaew uses only 8 features [21], whereas we use 54 features to increase the accuracy of credibility measurement. The labeling is done manually, and the rating is then updated systematically by the users who can access the application. However, in Saikaew's work the user's competence is still not considered. This paper applies a different approach, i.e., labeling information credibility based on user competencies and their posted topics, and introduces 17 new features for Twitter and 49 new features for Facebook. For the feature dimensions, we use two dimensions, consisting of the user profile dimension and the message content dimension.
Our contributions are summarized as follows:
a. The paper introduces 17 new features for Twitter and 49 new features for Facebook to improve the measurement of information credibility.
b. We present a labeling process that classifies users into two groups, credible and not credible user groups, depending on their posted topics.
The findings of this paper are expected to help organizations and practitioners make better decisions, because accurate credibility assessment is achievable with a large number of features. Furthermore, organizations and practitioners are kept informed about updated topics by means of an automatic tool.
2. RESEARCH METHOD
The proposed information credibility model is shown in Figure 1. The dataset is divided into two parts, i.e., training data and testing data. The training data are labeled manually, while the testing data are pre-processed, including their feature extraction. The features extracted from the training data go into the feature selection process and then into the credibility classification modeling process; the resulting model is then used to predict the testing data. Finally, a Twitter credibility class with good accuracy is expected to be obtained.
Figure 1. The proposed information credibility model
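The flow described above (split the labeled data, fit a classifier, predict the held-out part) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the split ratio, the seed, and the function names are assumptions, the feature selection stage is omitted, and a small Gaussian Naive Bayes written from scratch stands in for the NB classifier named in the abstract.

```python
import math
import random
from collections import defaultdict


def train_test_split(rows, labels, test_ratio=0.3, seed=0):
    """Shuffle the samples and split them into training and testing parts."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(idx) * test_ratio)  # assumed ratio, not from the paper
    test, train = idx[:n_test], idx[n_test:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def fit_gaussian_nb(rows, labels):
    """Estimate the prior, feature means and feature variances per class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-6
                     for col, m in zip(columns, means)]
        model[label] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one sample."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        for x, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))


def accuracy(model, rows, labels):
    """Fraction of samples whose predicted class matches the label."""
    hits = sum(predict(model, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)
```

In practice, each row would hold the numeric features extracted for one tweet or post message, and each label would be "credible" or "not credible".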
2.1. Labeling
Labeling is applied based on the compatibility between user competencies and the tweet or message. In this paper, we adopt the concept that a posted tweet whose topic is correlated with the competence of the posting account is more likely to be credible than a posted tweet whose topic is uncorrelated with the competence of the posting account. This concept gives a higher probability of determining whether a posted tweet is credible or not. We also define a tweet as a message posted on Twitter and a message as a message posted on Facebook.
We label the tweet and message categories manually, while for user competencies we conduct a survey. The objective of the survey is to collect information about user competencies. We ran an online survey through the website www.surveymonkey.com in January-March 2017. Respondents were asked for their opinion of 256 famous people and their corresponding competencies. The information displayed in the survey includes photos, bio profiles, the five tweets and five messages with the highest engagement, the number of followers, the number of tweets, and the number of following.
The survey was conducted with 188 respondents, 137 men and 51 women. The job distribution is shown in Figure 2. The four largest groups of respondents are private employees (28.19%), lecturers (27.13%), students (19.15%), and the self-employed (15.43%). The distribution of respondents by education is shown in Figure 3. The largest group is the 98 respondents (52%) with a Bachelor degree, followed by 62 respondents (33%) with a Master degree, 13 respondents (7%) at Senior High School level, 4 respondents (2%) with a 3-year Diploma, and 1 respondent with pharmacist education.
Whether a user is competent or not is determined by the highest number of opinions given by the respondents for each of the 256 famous people provided. The survey was conducted to obtain the competencies of 256 famous people, including 115 famous people whose data are taken from Twitter. Sample competence data for 10 people are shown in Table 1.
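One way to read the vote-counting step is as a ranking of competence categories by the number of respondents who chose them. The sketch below assumes that reading; the function name, the vote list, and the cut-off of four categories (matching the four competence columns of Table 1) are illustrative assumptions rather than the survey's actual procedure.

```python
from collections import Counter


def rank_competencies(votes, top_k=4):
    """Rank competence categories by how many respondents chose them."""
    counts = Counter(votes)
    # Sort by descending vote count, then alphabetically so ties are stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [category for category, _ in ranked[:top_k]]


# Hypothetical respondent opinions for one famous person.
example_votes = ["political", "political", "governmental", "social",
                 "political", "governmental", "economic"]
print(rank_competencies(example_votes))
# -> ['political', 'governmental', 'economic', 'social']
```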
Figure 2. Job distribution of respondents
Figure 3. Education distribution of respondents
Table 1. Sample of 10 famous people competencies
No | Name | Competence 1 | Competence 2 | Competence 3 | Competence 4
1 | Abdullah Gymnastiar | religious | motivational | social | education
2 | Aburizal Bakrie | political | governmental | economic | social
3 | Acha Septriasa | entertainment | social | general | advertising
4 | Addie MS | entertainment | cultural | social | general
5 | Ade Komarudin | political | governmental | social | general
6 | Adhicipta R. Wirawan | general | political | financial | economic
7 | Adhie M Massardi | political | general | governmental | social
8 | Adhyaksa Dault | political | governmental | sport | social
9 | Adi Amran Sulaiman | social | political | governmental | general
10 | Adib Hidayat | general | entertainment | social | journalism
Two credibility labels are used in this study, i.e., “credible” and “not credible”. Information is considered credible when a famous person posts a tweet or message that matches their competencies. On the other hand, when the tweet or message is posted outside the famous person's competencies, the information is considered not credible. The process is shown in Figure 4.
Figure 4. A labelling information credibility process by combining competence corpus and tweet topic
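The labeling rule can be sketched as a simple membership test between the topic of a post and the competencies of its author. This is a minimal sketch under the assumption that both are expressed with the same category names; the function name is hypothetical, and the corpus entry is copied from row 2 of Table 1.

```python
def label_credibility(author_competencies, post_topic):
    """Label a post credible when its topic is among the author's competencies."""
    normalized = {c.strip().lower() for c in author_competencies}
    if post_topic.strip().lower() in normalized:
        return "credible"
    return "not credible"


# Competence corpus entry taken from Table 1 (row 2).
competence_corpus = {
    "Aburizal Bakrie": ["political", "governmental", "economic", "social"],
}

print(label_credibility(competence_corpus["Aburizal Bakrie"], "economic"))  # credible
print(label_credibility(competence_corpus["Aburizal Bakrie"], "sport"))     # not credible
```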
The data resulting from labeling are shown as follows:
a. Twitter social media
The distribution of information credibility labeling for Twitter social media is shown in Table 2.
b. Facebook social media
The distribution of information credibility labeling for Facebook social media is shown in Table 3.
Table 2. Information credibility distribution in Twitter
Class | Number | %
Credible | 12439 | 64.12
Not Credible | 6962 | 35.88
Total | 19401 |
Table 3. Information credibility distribution in Facebook
Class | Number | %
Credible | 15677 | 66.74
Not Credible | 7812 | 33.26
Total | 23489 |
2.2. Pre-processing
Taking the text of the original tweet (Twitter) or post message (Facebook) as input, pre-processing consists of case folding, tokenization, stop-word removal, and stemming. Case folding is the process by which the words or phrases in a tweet or post message are converted into lowercase letters (a to z). This is expected to solve the problem of the same word being written with different letter cases.
Tokenization is applied to cut the input tweet or post message into its composing words. In principle, it separates each word in the text of the tweet or post message. This process includes deleting numbers, punctuation, and characters other than alphabetical letters. These characters are treated as word separators, so they are removed to prevent "noise" in further processes. Meanwhile, stop-word removal removes non-topical words that are not considered important, such as "and", "this", "that", "is", "or", "which", "through", and so on. This pre-processing helps reduce irrelevant features in the data. Finally, stemming is the process of finding root words by removing prefixes, infixes, suffixes, and confixes (combinations of prefixes and suffixes) from derivative words. With stemming, variations of words that have the same root are treated as the same feature. It helps improve retrieval performance in Information Retrieval.
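The four pre-processing steps can be sketched as a small pipeline. This is an illustrative sketch only: the stop-word list contains just the examples quoted above, and the suffix-stripping stemmer is a hypothetical stand-in that handles neither prefixes, infixes, nor confixes, so it is not the stemmer used in this study.

```python
import re

# Illustrative stop words taken from the examples quoted in the text.
STOP_WORDS = {"and", "this", "that", "is", "or", "which", "through"}

# Hypothetical suffix list; a real stemmer would also handle prefixes,
# infixes and confixes.
SUFFIXES = ("ing", "ed", "s")


def case_fold(text):
    """Convert the whole text to lowercase."""
    return text.lower()


def tokenize(text):
    """Keep runs of the letters a to z; every other character separates words."""
    return re.findall(r"[a-z]+", text)


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Strip the first matching suffix if at least three letters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run case folding, tokenization, stop-word removal and stemming in order."""
    tokens = remove_stop_words(tokenize(case_fold(text)))
    return [stem(t) for t in tokens]


print(preprocess("This is the FLOODING that reached 3 villages!"))
# -> ['the', 'flood', 'reach', 'village']
```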
2.3. Feature extraction
This section elaborates on feature extraction for Twitter and Facebook. The feature distribution for both Twitter and Facebook is given, and the user profile dimension features and message content dimension features are also presented.
2.3.1. Features used on Twitter
This paper uses two dimensions of features, namely the user profile dimension and the message content dimension. The most popular existing features used in previous works are also summarized in this study. In total, 33 features obtained from 5 different papers are discussed in this paper. These features were collected from works that use classifiers to predict credibility [3, 15, 16, 22, 18]. Furthermore, 17 new features are proposed, indicated in Table 4 by underlined bold features.
Table 4. Feature distribution used in Twitter
No | Feature | Castillo (2011) | Morris (2012) | Gupta (2014) | Syariff (2014) | Ross (2016) | The Proposed
1 | display_name | V V
2 | age_account_day | V V V
3 | check_web_institution | V
4 | has_bio | V V V V
5 | words_desc | V
6 | #positive_desc | V
7 | #negative_desc | V
8 | #sentiment_desc | V
9 | numPosWordDesc | V
10 | numNegWordDesc | V
11 | Check_personal_web | V
12 | Check_location | V
13 | is_verified | V V V V V
14 | number_follower | V V V V V
15 | number_statuses | V V V V V
16 | number_following | V V V V
17 | NumFollowingNumFollower | V
18 | #likes_user | V
19 | NumLikesNumFollower | V
20 | length_tweet | V V V V
21 | #words_tweet | V V V V
22 | #stock_char | V V
23 | hasStockChar | V
24 | #colon_char | V V
25 | hasColonChar | V
26 | #char | V V
27 | NumCharPanjangTweet | V V V
28 | NumCharNumKata | V V V
29 | #mention | V V V V V V
30 | #hashtag | V V V V V V
31 | #url | V V V V V V
32 | #emot_happy | V V V
33 | has_happy | V V
34 | #emot_sad | V V V
35 | has_sad | V V
36 | check_spam | V
37 | source | V
38 | is_url | V V V V V
39 | is_mention | V V V V
40 | is_hashtag | V V V V
41 | is_retweet | V V V V V V
42 | #like_tweet | V
43 | retweet_counted | V V V V V V
44 | #pos_tweet | V V V V
45 | #neg_tweet | V V V V
46 | ratioPosNumTweet | V
47 | ratioNegNumTweet | V
48 | #sentimen_tweet | V V
49 | sentiment_tweet | V V
Of the 49 available features, only 45 are used. They are divided into two dimensions, namely 19 features in the user profile dimension and 26 features in the message content dimension. The most widely used features for measuring the credibility of tweets are retweeting, tweet length, number of words, number of mentions, number of hashtags, number of URLs, whether the tweet has URLs, number of retweets, having happy emoticons, having sad emoticons, and sentiment value [22]. The description of each of the 45 features is shown in Tables 5 and 6.
Table 5. User profile dimension feature on Twitter
No | Feature | Description | New Feature
1 | display_name | Whether or not the display name uses the real name of the account owner. This is closely related to the level of trust. | No
2 | age_account_day | The age of the user's account. The older the account, the higher the level of trust. | No
3 | check_web_institution | Having a URL that connects to the original website of the user's institution, which can be used to assess credibility. | Yes
4 | has_bio | If there is a description of the user's authenticity in the profile, it can be a basis for assessing the user's credibility. | No
5 | words_desc | The number of words in the bio profile, which shows how far the user explains the bio profile. A detailed explanation makes it easier to assess a person's credibility. | Yes
6 | #positive_desc | The number of positive sentiment words in an account's bio profile. | Yes
7 | #negative_desc | The number of negative sentiment words in an account's bio profile. | Yes
8 | #sentiment_desc | The number of sentiments in an account's bio profile. | Yes
9 | numPosWordDesc | The ratio of the number of positive sentiments to the number of words in an account's bio profile. The larger the ratio, the higher the account's credibility. | Yes
10 | numNegWordDesc | The ratio of the number of negative sentiments to the number of words in an account's bio profile. The smaller the ratio, the higher the account's credibility. | Yes
11 | check_web_personal | Having a URL that connects to the user's original website, which can be used to assess credibility. | Yes
12 | check_location | Having a location in the description can guarantee the authenticity of the user's original area. | Yes
13 | is_verified | A verified account is an official account that has been authenticated by Twitter. | No
14 | number_follower | The number of followers shows how many other users want to see/follow the trail of information from the user. It can indicate the credibility level of the user's information: the more followers, the higher the level of trust. | No
15 | number_statuses | The number of statuses indicates the level of the user's activity on Twitter. Users who are more active have more credibility. | No
16 | number_following | The number of Following shows that the user has many friends who might provide more sources of information. | No
17 | numFollowingNumFollower | The ratio of the number of Following to the number of Followers of an account. | Yes
18 | #likes_user | The number of likes shows how active the user is on Twitter. It can also indicate the number of truthful tweets that are liked by users. | Yes
19 | numLikesNumFollower | The ratio of the number of Likes to the number of an account's followers. | Yes
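The ratio features in Table 5 can be derived from raw profile counts as sketched below. The profile dictionary layout, the sentiment word sets, and the convention of returning 0.0 for a zero denominator are assumptions made for illustration; the paper does not specify them.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def profile_ratio_features(profile, positive_words, negative_words):
    """Derive the ratio features of Table 5 from raw profile counts."""
    bio_tokens = profile["bio"].lower().split()
    n_pos = sum(t in positive_words for t in bio_tokens)
    n_neg = sum(t in negative_words for t in bio_tokens)
    return {
        "numFollowingNumFollower": safe_ratio(profile["following"], profile["followers"]),
        "numLikesNumFollower": safe_ratio(profile["likes"], profile["followers"]),
        "numPosWordDesc": safe_ratio(n_pos, len(bio_tokens)),
        "numNegWordDesc": safe_ratio(n_neg, len(bio_tokens)),
    }


# Hypothetical profile and sentiment lexicons.
example_profile = {"bio": "Trusted honest economics lecturer",
                   "following": 200, "followers": 1000, "likes": 500}
features = profile_ratio_features(example_profile, {"trusted", "honest"}, {"fake"})
print(features)
```

Guarding against a zero denominator matters here because new or empty accounts can have no followers or no bio text.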
Table 6. Message content dimension feature on Twitter
No | Feature | Description | New Feature
1 | length_tweet | The length in characters or words, which shows whether the user gives a short or long message to influence the perception of others. | No
2 | #words_tweet | The number of words, which shows whether the user gives a short or long message to influence the perception of others. | No
3 | #char | The number of characters in a tweet. | No
4 | numCharLengthTweet | The ratio of the number of characters to the length of a tweet. | No
5 | numCharNumWords | The ratio of the number of characters to the number of words in a tweet. | No
6 | #mention | The number of mentions in a tweet. | No
7 | #hashtag | The number of hashtags in a tweet. By clicking the #Hashtag in Twitter, the same information with the same hashtag will appear, so that people are assisted in finding the uniformity of the information and in digesting its truth with a detailed and clear history. | No
8 | #url | The number of URLs in a tweet. | No
9 | #emot_happy | The number of happy emoticons. | No
10 | has_happy | The existence of an emoticon that contains a happy expression. | No
11 | #emot_sad | The number of sad emoticons. | No
12 | has_sad | The existence of an emoticon that contains a sad expression. | No
13 | check_spam | Whether a tweet has some words listed as spam. | Yes
14 | Source | The means used to share a tweet, divided into two: via a smartphone or via a PC client. | Yes
15 | is_url | A tweet with a URL helps deliver more information, so it can provide trust by giving the tweet source. The more URLs given in a tweet, the more credible the information is. | No
16 | is_mention | A tweet that contains a mention indicates that its source was taken from someone else, which provides better source certainty. The mention can indicate whether the mentioned user provides evidence of the authenticity of the news, for example, when the user included photos of the evidence. | No
17 | is_hashtag | The existence of a #hashtag helps to ensure and view the news history in order to be able to seek information credibility. By clicking the #Hashtag in Twitter, the same information with the same hashtag will appear, so that people are assisted in finding the uniformity of the information and in digesting its truth with a detailed and clear history. | No
18 | is_retweet | Whether the tweet is posted by the users themselves or reposted (re-tweeted) from others. | No
19 | #like_tweet | The number of users' likes on a tweet. | Yes
20 | retweet_counted | The number of users who re-tweet a tweet. | No
21 | #pos_tweet | The number of positive sentiment words in a tweet. | No
22 | #neg_tweet | The number of negative sentiment words in a tweet. | No
23 | ratioPosNumTweet | The ratio of the number of positive sentiments to the number of words in a tweet. The larger the ratio, the higher the account's credibility. | Yes
24 | ratioNegNumTweet | The ratio of the number of negative sentiments to the number of words in a tweet. The smaller the ratio, the higher the account's credibility. | Yes
25 | #sentiment_tweet | The number of sentiments in a tweet. | No
26 | sentiment_tweet | The existence of positive, neutral, and negative sentiments, used to select the information whose credibility level is going to be assessed. Positive sentiments usually describe more credible information. | No
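A handful of the message content features in Table 6 can be computed directly from the tweet text, as in the sketch below. The regular expressions, the emoticon lists, and the "RT @" prefix test for retweets are simplifying assumptions for illustration; an actual extractor would more likely read mentions, hashtags, URLs, and the retweet flag from the structured fields returned by the Twitter API.

```python
import re

# Hypothetical emoticon lists; the paper does not enumerate them.
HAPPY = (":)", ":-)", ":D")
SAD = (":(", ":-(")


def message_content_features(tweet):
    """Compute a handful of the message content features listed in Table 6."""
    n_happy = sum(tweet.count(e) for e in HAPPY)
    n_sad = sum(tweet.count(e) for e in SAD)
    n_url = len(re.findall(r"https?://\S+", tweet))
    n_mention = len(re.findall(r"@\w+", tweet))
    n_hashtag = len(re.findall(r"#\w+", tweet))
    return {
        "length_tweet": len(tweet),
        "#words_tweet": len(tweet.split()),
        "#mention": n_mention,
        "#hashtag": n_hashtag,
        "#url": n_url,
        "is_url": n_url > 0,
        "is_mention": n_mention > 0,
        "is_hashtag": n_hashtag > 0,
        "#emot_happy": n_happy,
        "has_happy": n_happy > 0,
        "#emot_sad": n_sad,
        "has_sad": n_sad > 0,
        # Assumption: a retweet is recognized by the conventional "RT @" prefix.
        "is_retweet": tweet.startswith("RT @"),
    }


# Hypothetical tweet text.
example_tweet = "RT @bmkg: Flood warning for #Bandung http://example.org/alert :)"
print(message_content_features(example_tweet))
```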
2.3.2. Features used in Facebook
This paper develops a Facebook API application with 54 features (8 user profile dimension features and 46 message content features). Of these, 49 are new features added to those of previous research [21]. Table 7 shows the user profile dimension features in Facebook, while Table 8 shows the message content dimension features in Facebook.
Table 7. User profile dimension features in Facebook
No | Feature | Description | New Feature
1 | check_bio | The authenticity description in the user's profile can become a basis for knowing the user's credibility. | Yes
2 | #word_bio | The number of words describing the user's profile (bio profile). A detailed description can make it easier to know someone's credibility. | Yes
3 | length_bio | The length in characters and words, which shows whether the user gives a short or long message that could influence someone's perception. | Yes
4 | #positive_desc | The number of positive sentiment words in an account bio profile. | Yes
5 | #negative_desc | The number of negative sentiment words in an account bio profile. | Yes
6 | sentiment_desc | The existence of positive, neutral, and negative sentiments, used to select the information whose credibility level is going to be assessed. Positive sentiments usually describe more credible information. | Yes
7 | #url_institution | Having a URL that connects to the original website of the user's institution, which can be used to assess credibility. | Yes
8 | engagement_count | The number of engagements shows the number of other users who want to see/follow the user's trail of information. It can indicate the credibility level of the user's information: the more engagement, the higher the trust. | Yes
Table 8. Message content features in Facebook

| No | Feature | Description | New Feature |
|---|---|---|---|
| 1 | type | The classification of post message types (photo, link, status, note, video, event). | Yes |
| 2 | #url_post | The number of URLs in a post message. | No |
| 3 | #char | The number of characters in a post message. | Yes |
| 4 | ratioCharLenghtWordPost | The ratio of the number of characters to the length of a post message. | Yes |
| 5 | ratioCharNumWord | The ratio of the number of characters to the number of words in a post message. | Yes |
| 6 | #mention | The number of mentions in a post message. | Yes |
| 7 | #hashtag | The number of hashtags in a post message. Clicking a #hashtag in Twitter lists the information that shares the same hashtag, which helps people find consistent information and judge its truth from a detailed and clear history. | No |
| 8 | #emot_happy | The number of happy emoticons. | Yes |
| 9 | has_happy | The presence of an emoticon with a happy expression. | Yes |
| 10 | #emot_sad | The number of sad emoticons. | Yes |
| 11 | has_sad | The presence of an emoticon with a sad expression. | Yes |
| 12 | #word | The number of words, indicating whether the user gives a short or long message to influence the perception of others. | Yes |
| 13 | length_message | The length in characters and words, indicating whether the user gives a short or long message to influence the perception of others. | Yes |
| 14 | check_spam | Whether the post message contains words included in the spam list. | Yes |
| 15 | check_full_picture | The presence or absence of the full picture in a post message. | Yes |
| 16 | link_domain | A post message with a URL delivers more information and can build trust by giving the source of the post message. The more URLs given in a post message, the greater the certainty about the credibility of the information. | Yes |
| 17 | post_published | The age of a post message in days, based on when the last post message was taken. | Yes |
| 18 | likes_count_fb | The number of likes for a post message. | No |
| 19 | likes_count_fb_per_day | The number of likes for a post message per day, based on the age of the post message. | Yes |
| 20 | comments_count_fb | The number of comments in a post message. | No |
| 21 | comments_count_fb_per_day | The number of comments for a post message per day, based on the age of the post message. | Yes |
| 22 | reactions_count_fb | The number of short response activities with certain icons (like, none, love, wow, haha, sad, angry, thankful) in a post message. | Yes |
| 23 | reactions_count_fb_per_day | The number of short response activities with certain icons (like, none, love, wow, haha, sad, angry, thankful) per day, based on the age of the post message. | Yes |
| 24 | shares_count_fb | The number of users who share a post message. | No |
| 25 | shares_count_fb_per_day | The number of users who share a post message each day, based on the age of the post message. | Yes |
| 26 | engagement_fb | The number of interactions in a post message (share, like, comment). | Yes |
| 27 | engagement_fb_per_day | The number of interactions in a post message (share, like, comment) each day, based on the age of the post message. | Yes |
| 28 | comments_retrieved | The number of comments in a post or by the user. | Yes |
| 29 | comments_base | The number of base-level comments. | Yes |
| 30 | comments_replies | The number of reply-level comments. | Yes |
| 31 | comment_likes_count | The number of likes on a comment of a post message. | Yes |
| 32 | rea_NONE | The number of NONE short response activities in a post message. | Yes |
| 33 | rea_LIKE | The number of LIKE short response activities in a post message. | Yes |
| 34 | rea_LIKE_per_day | The number of LIKE short response activities in a post message each day, based on the age of the post message. | Yes |
| 35 | rea_LOVE | The number of LOVE short response activities in a post message. | Yes |
| 36 | rea_WOW | The number of WOW short response activities in a post message. | Yes |
| 37 | rea_HAHA | The number of HAHA short response activities in a post message. | Yes |
| 38 | rea_SAD | The number of SAD short response activities in a post message. | Yes |
| 39 | rea_ANGRY | The number of ANGRY short response activities in a post message. | Yes |
| 40 | rea_THANKFUL | The number of THANKFUL short response activities in a post message. | Yes |
| 41 | #positive | The number of positive sentiment words in a post message. | Yes |
| 42 | ratioPosNumWord | The ratio of the number of positive sentiments to the number of words in a post message. A larger ratio corresponds to a higher post message credibility. | Yes |
| 43 | ratioNegNumWord | The ratio of the number of negative sentiments to the number of words in a post message. A smaller ratio corresponds to a higher post message credibility. | Yes |
| 44 | #negative | The number of negative sentiments in a post message. | Yes |
| 45 | #sentiment | The number of sentiments in a post message. | Yes |
| 46 | sentiment | The presence of positive, neutral, and negative sentiments, used to select the information whose credibility level is to be assessed. Positive sentiments usually describe more credible information. | Yes |
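Several features in Table 8 are per-day versions of a raw count, normalised by the age of the post message. The following Python sketch illustrates one way such features could be derived. The function names, the choice to floor the age at one day, and the definition of engagement as the plain sum of likes, comments, and shares are assumptions made for illustration and are not taken from the paper.

```python
from datetime import date


def per_day(count: int, age_days: int) -> float:
    """Divide an interaction count by the post age in days.

    The age is floored at one day (an assumption) so that a post
    retrieved on its publication day does not divide by zero.
    """
    return count / max(age_days, 1)


def engagement_features(likes: int, comments: int, shares: int,
                        published: date, retrieved: date) -> dict:
    """Build the count and per-day features for one post message."""
    age_days = (retrieved - published).days   # post_published
    engagement = likes + comments + shares    # engagement_fb (assumed to be a sum)
    return {
        "post_published": age_days,
        "likes_count_fb_per_day": per_day(likes, age_days),
        "comments_count_fb_per_day": per_day(comments, age_days),
        "shares_count_fb_per_day": per_day(shares, age_days),
        "engagement_fb": engagement,
        "engagement_fb_per_day": per_day(engagement, age_days),
    }


# Hypothetical post: 120 likes, 30 comments, 10 shares, retrieved 10 days after publication.
features = engagement_features(120, 30, 10, date(2019, 1, 1), date(2019, 1, 11))
```

Normalising by age keeps an old post with many accumulated likes from automatically outranking a recent one, which appears to be the motivation for the per-day variants.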
In addition, this paper applies a new approach to spam prediction and sentiment prediction, described as follows:
a. Spam prediction (check_spam)
We use two corpora of spam words or phrases: 200 English spam words or phrases and 100 Bahasa Indonesia spam words or phrases, as used in our previous study [23]. The two corpora are developed based on Indonesian spam-words. Table 9 lists 12 examples of Bahasa Indonesia spam words or phrases [23].
b. Sentiment prediction
This paper uses a corpus containing a list of 354 sentiment words [24]. The sentiment is obtained by searching for words that are categorized as negative, positive, and neutral. Some sample data are shown in Table 10 [24].
Table 9. Samples of 12 Indonesian spam-words

| No | Indonesian Spam-words |
|---|---|
| 1 | kredit dp |
| 2 | paket kredit |
| 3 | cicilan ringan |
| 4 | dp ringan |
| 5 | cash/kredit |
| 6 | dana tunai |
| 7 | proses cepat |
| 8 | dana cepat |
| 9 | pinjaman uang |
| 10 | pinjaman dana |
| 11 | pinjaman |
| 12 | gadai |
Table 10. Ten data survey of sentiment

| No | Word | Positive (%) | Negative (%) | Neutral (%) | Quality |
|---|---|---|---|---|---|
| 1 | buruk | 0 | 78.3 | 21.7 | 3 |
| 2 | jelek | 0 | 78.3 | 21.7 | 3 |
| 3 | lama | 4.3 | 30.4 | 65.3 | 0 |
| 4 | lamban | 4.3 | 78.3 | 17.4 | 3 |
| 5 | lambat | 13 | 52.2 | 34.8 | 1 |
| 6 | baik | 82.6 | 0 | 17.4 | 4 |
| 7 | berani | 82.6 | 0 | 17.4 | 4 |
| 8 | benar | 82.6 | 0 | 17.4 | 4 |
| 9 | sudah | 56.5 | 0 | 43.5 | 1 |
| 10 | ayo | 65.2 | 4.3 | 30.5 | 2 |
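The corpus lookups behind check_spam and the sentiment counts can be sketched as follows. This is a minimal illustration, not the authors' implementation: the word sets contain only a few entries from Tables 9 and 10, treating a word as positive or negative when a clear majority of survey responses points that way is an assumption, and the tokenizer is a simple hypothetical one.

```python
import re

# A few entries from Table 9; the full corpora (200 English, 100 Indonesian) are not reproduced here.
SPAM_PHRASES = {"kredit dp", "paket kredit", "cicilan ringan", "dana tunai", "pinjaman"}

# Words from Table 10 with a clear majority label (assumed labelling rule).
POSITIVE_WORDS = {"baik", "berani", "benar"}
NEGATIVE_WORDS = {"buruk", "jelek", "lamban"}


def tokenize(text: str) -> list:
    """Lowercase the text and keep runs of letters, digits, and '/'."""
    return re.findall(r"[a-z0-9/]+", text.lower())


def check_spam(text: str, phrases=SPAM_PHRASES) -> bool:
    """Return True if any spam word or phrase occurs as whole words in the text."""
    padded = " " + " ".join(tokenize(text)) + " "
    return any(f" {phrase} " in padded for phrase in phrases)


def sentiment_counts(text: str) -> dict:
    """Count positive and negative words and their ratios to the word count."""
    tokens = tokenize(text)
    n_words = len(tokens)
    positive = sum(token in POSITIVE_WORDS for token in tokens)
    negative = sum(token in NEGATIVE_WORDS for token in tokens)
    return {
        "#positive": positive,
        "#negative": negative,
        "ratioPosNumWord": positive / n_words if n_words else 0.0,
        "ratioNegNumWord": negative / n_words if n_words else 0.0,
    }
```

For example, `check_spam("Promo paket kredit motor, proses cepat!")` matches the phrase "paket kredit", while a message with no listed phrase returns `False`.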
2.4. Classification algorithm
The four learning algorithms explored are Naive Bayes (NB), Support Vector Machine (SVM), Logistic Regression (Logit), and J48. As illustrated in Fig. 1, the four algorithms are used to model the topic classification of tweets during the training phase. The topic model of tweets is then used to classify the credibility of new information, using the same algorithm that was used to build the classification model. Each algorithm is described below.
a. Naive Bayes (NB)
Naive Bayes (NB) is a classification model built from the probability of each attribute value given the class; new data are assigned to the class with the maximum probability given their attribute values [25]. Naive Bayes is easy to construct, because it does not require several complex parameters, and it is scalable. In addition, the method is described as an algorithm with the properties of simplicity, elegance, robustness, and high accuracy [26].
b. Support vector machine (SVM)
The idea of the Support Vector Machine (SVM) for classification is to find the optimal hyperplane (line/boundary field) that separates the data into two classes in the n-dimensional feature space. With this concept, the optimal hyperplane solution in SVM has no local optimum, and as a result the solution is unique [25]. SVM can be implemented easily and is one of the suitable methods for solving high-dimensional problems within the limitations of the available data samples.
c. Logistic regression (Logit)
Logistic Regression (Logit) is a probabilistic classification model with a real-valued input vector. The dimensions of the input vector are called features. No restrictions are imposed on correlated features. Logistic Regression is used whenever an input needs to be assigned to one of several classes. The logistic function is applied to a linear combination of the features. The output is usually binary, but Logistic Regression can also be applied to multiclass classification problems [25].
d. J48 algorithm (J48)
J48 is an implementation of the C4.5 algorithm, itself a development of the ID3 algorithm, that produces a decision tree. The algorithm classifies data with decision tree methods, which have the advantages of being able to process numerical (continuous) and discrete data, handling missing attribute values, and producing rules that are easily interpreted. Each data item is described by the values of its attributes, and classification can be seen as a mapping from a set of attribute values to a particular class. The decision tree classifies the given data using the values of the attributes [27].
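For reference, the decision rules behind the first three classifiers can be written in their standard textbook forms. The notation below (feature vector $x = (x_1, \dots, x_n)$, class $c$, weight vector $w$, bias $b$, training pairs $(x_j, y_j)$ with $y_j \in \{-1, +1\}$) is introduced here for illustration and does not appear in the paper; the SVM is shown in its linear hard-margin form, which is an assumption about the variant used.

```latex
% Naive Bayes: choose the class with maximum posterior probability,
% assuming attributes are conditionally independent given the class
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)

% Linear SVM (hard margin): maximise the margin of the separating hyperplane
\min_{w,\,b} \; \tfrac{1}{2}\,\lVert w \rVert^{2}
\quad \text{subject to} \quad y_j \left( w^{\top} x_j + b \right) \ge 1 \;\; \text{for all } j

% Logistic regression: logistic function applied to a linear combination of features
P(y = 1 \mid x) = \frac{1}{1 + \exp\!\left( -\left( w^{\top} x + b \right) \right)}
```

The SVM objective is a convex quadratic program, which is why its solution has no local optimum other than the global one.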
3. RESULTS AND ANALYSIS
This section provides the results and analysis of the data set and labeling scheme for Twitter and Facebook.
3.1. Data set for experiment
The Twitter data, written in the Indonesian language, are the same as in [28] and involve 115 accounts with 19401 tweets. Table 11 provides sample labelings of tweet topics from Law, Politics, and Entertainment [28]. Table 12 shows the distribution of the Twitter data. It consists of 19 topics, and the distribution is not balanced, ranging from 0.2% to 15.3% [28].
The Facebook data used in this study consist of 56 accounts with 23489 messages. Because some users have no Facebook account, not all of the Twitter accounts (115 accounts) could be retrieved. Table 13 describes the distribution of the Facebook data, which consists of 19 topics and is also unbalanced, ranging from 0.17% to 18.38%.
Table 11. Some samples of category labeling in Twitter

| Tweet | Label |
|---|---|
| Yg disoal Saripin cm apakah KPK berwenang sidik #BG, dugaan korupsinya tdk diusik. Bagi saya, BG tetap "tersangka" mestinya #JKW jg demikian | Law |
| DPR Akan Gelar Paripurna Sahkan Revisi UU Pilkada Hari Ini http://t.co/jcxclL9faO @detikcom | Political |
| Studio Denny JA, MTV dan Mizan bersama Hanung Bramantyo membuat 5 film layar lebar bertema Islam Cinta: http://t.co/BrdHfhBsub | Entertainment |
Table 12. Twitter data distribution by topic/category

| No | Label | Number | Percentage |
|---|---|---|---|
| 1 | Religion | 1025 | 5.28% |
| 2 | Business | 460 | 2.37% |
| 3 | Culture | 235 | 1.21% |
| 4 | Economy | 235 | 1.21% |
| 5 | Entertainment | 1742 | 8.98% |
| 6 | Law | 1557 | 8.03% |
| 7 | Advertisement | 485 | 2.50% |
| 8 | Journalism | 2420 | 12.47% |
| 9 | Health | 74 | 0.38% |
| 10 | Finance | 35 | 0.18% |
| 11 | Motivation | 927 | 4.78% |
| 12 | Sports | 431 | 2.22% |
| 13 | Government | 1935 | 9.97% |
| 14 | Education | 466 | 2.40% |
| 15 | Transportation | 149 | 0.77% |
| 16 | Political | 2959 | 15.25% |
| 17 | Social | 1238 | 6.38% |
| 18 | Technology | 1218 | 6.28% |
| 19 | General | 1810 | 9.33% |
| | Total | 19401 | |
Table 13. Facebook data distribution by topic/category

| No | Category | Number | Percentage |
|---|---|---|---|
| 1 | Religion | 1952 | 8.31% |
| 2 | Business | 1267 | 5.39% |
| 3 | Culture | 87 | 0.37% |
| 4 | Economy | 1421 | 6.05% |
| 5 | Entertainment | 2977 | 12.67% |
| 6 | Law | 1329 | 5.66% |
| 7 | Advertisement | 331 | 1.41% |
| 8 | Journalism | 96 | 0.41% |
| 9 | Health | 313 | 1.33% |
| 10 | Finance | 41 | 0.17% |
| 11 | Motivation | 1169 | 4.98% |
| 12 | Sports | 495 | 2.11% |
| 13 | Government | 3613 | 15.38% |
| 14 | Education | 775 | 3.30% |
| 15 | Transportation | 43 | 0.18% |
| 16 | Political | 4287 | 18.25% |
| 17 | Social | 630 | 2.68% |
| 18 | Technology | 2022 | 8.61% |
| 19 | General | 641 | 2.73% |
| | Total | 23489 | |
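The percentages in the distribution tables follow directly from the per-topic counts. As a small check, the Python sketch below recomputes the Twitter shares from the counts in Table 12; the function name and the rounding to two decimals are illustrative choices, not part of the paper.

```python
def distribution(counts: dict) -> dict:
    """Return each topic's share of the total, as a percentage rounded to 2 decimals."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Per-topic tweet counts as listed in Table 12.
twitter_counts = {
    "Religion": 1025, "Business": 460, "Culture": 235, "Economy": 235,
    "Entertainment": 1742, "Law": 1557, "Advertisement": 485,
    "Journalism": 2420, "Health": 74, "Finance": 35, "Motivation": 927,
    "Sports": 431, "Government": 1935, "Education": 466,
    "Transportation": 149, "Political": 2959, "Social": 1238,
    "Technology": 1218, "General": 1810,
}

twitter_share = distribution(twitter_counts)
smallest = min(twitter_share, key=twitter_share.get)  # least represented topic
largest = max(twitter_share, key=twitter_share.get)   # most represented topic
```

The gap between the smallest and largest topic is what makes the data set unbalanced, which is worth keeping in mind when reading accuracy figures, since accuracy alone can hide weak performance on rare classes.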
3.2. Experiment
The experiments have three objectives: (i) to compare the proposed technique with previous research on information credibility on Twitter and Facebook, (ii) to evaluate the effect of adding new features on Twitter and Facebook, and (iii) to evaluate the effects of the feature dimensions used on both Twitter and Facebook. Our experiments split the data into training data and testing data with a composition of 80:20.
3.2.1. Twitter social media
In this study, each cell reports the average accuracy over 5 runs, with the training and testing composition of the Twitter data drawn randomly in each run. The results of the proposed method and of previous research are shown in Table 14. Table 14 shows that this paper succeeded in increasing the accuracy of previous studies in almost all classifiers. Compared to previous studies, the highest accuracy is 88.42%, achieved by the J48 classifier, with the lowest increase being 5.93% and the highest 27.17%.
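The evaluation protocol described above (a random 80:20 split, repeated 5 times and averaged) can be sketched in a few lines of Python. The helper names are hypothetical, and the majority-class scorer is only a stand-in so that the sketch runs without external libraries; in the paper the scorer would be one of NB, SVM, Logit, or J48.

```python
import random
from statistics import mean


def split_80_20(samples: list, rng: random.Random) -> tuple:
    """Shuffle a copy of the samples and cut it into 80% training, 20% testing."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]


def average_accuracy(samples: list, train_and_score, runs: int = 5, seed: int = 0) -> float:
    """Average the accuracy of `train_and_score` over several random splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(runs):
        train, test = split_80_20(samples, rng)
        scores.append(train_and_score(train, test))
    return mean(scores)


def majority_baseline(train: list, test: list) -> float:
    """Stand-in classifier: always predict the most frequent training label."""
    labels = [label for _, label in train]
    majority = max(set(labels), key=labels.count)
    return sum(label == majority for _, label in test) / len(test)


# Hypothetical toy data: (feature, label) pairs with a 75/25 class balance.
samples = [(i, "credible" if i % 4 else "not_credible") for i in range(100)]
accuracy = average_accuracy(samples, majority_baseline)
```

Averaging over several random splits reduces the dependence of the reported accuracy on any single partition of the data.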
Table 14. The proposed and previous research results in Twitter (accuracy, %)

| Classifier | Castillo (2011) | Morris (2012) | Gupta (2014) | Syariff (2014) | Ross (2016) | Proposed: User Profile | Proposed: Message Profile | Proposed: User Profile + Message Profile |
|---|---|---|---|---|---|---|---|---|
| NB | 60.79 | 58.41 | 60.79 | 65.95 | 60.52 | 66.97 | 62.93 | 66.42 |
| SVM | 77.70 | 82.36 | 76.73 | 67.86 | 77.13 | 82.70 | 73.41 | 87.36 |
| Logit | 77.26 | 77.24 | 76.27 | 66.25 | 76.56 | 78.25 | 64.57 | 78.04 |
| J48 | 83.09 | 83.47 | 83.00 | 69.53 | 82.16 | 82.77 | 79.36 | 88.42 |
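One reading that is consistent with the J48 row of Table 14 is that the reported lowest and highest increases (5.93% and 27.17%) are relative improvements of the combined feature set over each previous study, rather than differences in percentage points. The Python sketch below checks this interpretation; it is an inference from the numbers, not a definition stated in the text, and the function name is illustrative.

```python
def relative_increase(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent, rounded to 2 decimals."""
    return round(100 * (new - old) / old, 2)


# J48 accuracies of previous studies, as listed in Table 14.
previous_j48 = {
    "Castillo (2011)": 83.09,
    "Morris (2012)": 83.47,
    "Gupta (2014)": 83.00,
    "Syariff (2014)": 69.53,
    "Ross (2016)": 82.16,
}
proposed_j48 = 88.42  # User Profile + Message Profile

gains = {study: relative_increase(proposed_j48, acc) for study, acc in previous_j48.items()}
lowest_gain = min(gains.values())
highest_gain = max(gains.values())
```

Under this reading, the lowest gain is measured against Morris (2012) and the highest against Syariff (2014).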
Table 14 also compares the accuracy of the user profile dimension and the message content dimension across the 4 classifiers. The accuracy of the user profile dimension is higher than that of the message content dimension for all classifiers. The highest accuracy on the user profile dimension is 82.77%, obtained with the J48 classifier. Merging the features of both dimensions used in this study increases the accuracy for the SVM and J48 classifiers, while for the two other classifiers, i.e., NB and Logit, the accuracy decreases.
The new features are classified according to their influence on the accuracy. Features that increase the accuracy after being added to the baseline features are classified as the increased group, features that decrease it are classified as the decreased group, and features that have no effect are classified as the mixed group. Here, the baseline features are the set of features used in Ross and Thirunarayan [22]. Table 15 shows the effect of the 17 proposed new features, consisting of 12 features based on the user profile dimension and 5 features based on the message content dimension, in each classifier on Twitter. All features proposed on Twitter in both feature dimensions increase the accuracy of each classifier. Across all classifiers, the new features increase the accuracy by 6.60% on average, with 6.67% for features based on the user profile and 6.45% for features based on the message content dimension. The largest average increase for a single feature is 8.55%, achieved by the NumFollowingNumFollower feature. In terms of the effect in each classifier, the #sentiment_desc feature provides the highest improvement in accuracy, +13.41%, achieved on the SVM classifier.
Table 15. New features distribution by influence of accuracy based on features dimension on Twitter

| Influence of accuracy | User Profile | Message Content |
|---|---|---|
| Increased | check_web_institution, #sentiment_desc, numPosWordDesc, check_web_personal, NumFollowingNumFollower, NumLikesNumFollower, word_desc, #positive_desc, #negative_desc, check_location, #likes_user, numNegWordDesc | source, ratioNegNumTweet, #like_tweet, check_spam, ratioPosNumTweet |
| Decreased | - | - |
| Mixed | - | - |
3.2.2. Facebook social media
This paper has carried out two developments. First, a Facebook API was developed that can retrieve datasets online. Second, 49 new features based on users and content were added. Table 16 shows the highest accuracy increase compared to Saikaew's study. This paper succeeded in increasing the accuracy of previous studies in almost all classifiers. The increase is 9.91%, with an accuracy of 78.61% using the J48 classifier.
Table 16 also compares the accuracy of the user profile dimension and the message content dimension across the 4 classifiers. The accuracy of the user profile dimension is higher than that of the message content dimension for all classifiers. The highest accuracy on the user profile dimension is 76.50%, obtained with the SVM classifier. Merging the features of both dimensions used in this study increases the accuracy only for the J48 classifier, while for the three other classifiers the accuracy decreases.
Table 16. Saikaew vs the proposed in Facebook (accuracy, %)

| Classifier | Saikaew (2015) | Proposed: User Profile | Proposed: Message Profile | Proposed: User Profile + Message Profile |
|---|---|---|---|---|
| NB | 65.02 | 66.58 | 62.32 | 65.39 |
| SVM | 71.10 | 76.50 | 71.38 | 71.83 |
| Logit | 69.93 | 73.41 | 70.54 | 72.57 |
| J48 | 71.52 | 76.46 | 74.61 | 78.61 |