International Journal of Electrical and Computer Engineering (IJECE)
Vol. 11, No. 2, April 2021, pp. 1613–1626
ISSN: 2088-8708, DOI: 10.11591/ijece.v11i2.pp1613-1626
MTVRep: A movie and TV show reputation system based on fine-grained sentiment and semantic analysis
Abdessamad Benlahbib, El Habib Nfaoui
Computer Science Department, LISAC Laboratory, Faculty of Sciences Dhar EL Mehraz (F.S.D.M), Sidi Mohamed Ben Abdellah University, Fez, Morocco
Article Info

Article history:
Received Jul 21, 2020
Revised Sep 8, 2020
Accepted Sep 28, 2020

Keywords:
Decision making
Fine-grained sentiment analysis
Natural language processing
Reputation generation
Text mining
ABSTRACT

Customer reviews are a valuable source of information from which we can extract very useful data about different online shopping experiences. For trendy items (products, movies, TV shows, hotels, services, etc.), the number of available user and customer opinions could easily surpass thousands. Therefore, online reputation systems could aid potential customers in making the right decision (buying, renting, booking, etc.) by automatically mining textual reviews and their ratings. This paper presents MTVRep, a movie and TV show reputation system that incorporates fine-grained opinion mining and semantic analysis to generate and visualize reputation toward movies and TV shows. Differently from previous studies on reputation generation that treat the task of sentiment analysis as a binary classification problem (positive, negative), the proposed system identifies the sentiment strength during the phase of sentiment classification by using fine-grained sentiment analysis to separate movie and TV show reviews into five discrete classes: strongly negative, weakly negative, neutral, weakly positive and strongly positive. Besides, it employs embeddings from language models (ELMo) representations to extract semantic relations between reviews. The contribution of this paper is threefold. First, movie and TV show reviews are separated into five groups based on their sentiment orientation. Second, a custom score is computed for each opinion group. Finally, a numerical reputation value is produced toward the target movie or TV show. The efficacy of the proposed system is illustrated by conducting several experiments on a real-world movie and TV show dataset.

This is an open access article under the CC BY-SA license.
Corresponding Author:
Abdessamad Benlahbib
Computer Science Department, LISAC Laboratory
Faculty of Sciences Dhar EL Mehraz (F.S.D.M), Sidi Mohamed Ben Abdellah University
Fez, B.P. 1796 Fes-Atlas, 30003, Morocco
Email: abdessamad.benlahbib@usmba.ac.ma
1. INTRODUCTION

The exponential growth of Web 2.0 has dramatically impacted the evolution of e-commerce platforms [1–4]. On the one hand, recent statistics show that 72% of customers will not take action until they read reviews, and only 6% of consumers do not trust customer reviews at all; on the other hand, the number of user-generated reviews attached to an online entity could easily exceed thousands [5, 6]. Thus, a potential customer does not have the time or effort to examine all the reviews manually in order to make a decision toward the entity [7, 8]. Little research has been conducted on mining customer and user reviews with regard to feature-based summarization and reputation generation for the purpose of supporting the customer decision-making process in e-commerce (buying, renting, booking, etc.). Over the last two decades, a few opinion summarizer systems

Journal homepage: http://ijece.iaescore.com
have been proposed to produce a summary for product reviews [9], movie reviews [3], hotel reviews [1] and local service reviews [10]. Turning back to the reputation generation task, to the best of our knowledge, very few reputation systems have been proposed to compute a single reputation value toward different entities based on fusing and mining user and customer reviews expressed in natural language [11–15]. Yan et al. [11] applied opinion mining and fusion techniques on product reviews. Benlahbib and Nfaoui [12] used the K-Means clustering algorithm on movie reviews. The same authors [13] incorporated semantic and sentiment analysis to generate a single reputation value from user and customer reviews expressed in natural language (English).

An important issue that was neglected in past research on reputation generation is identifying the sentiment strength during the phase of sentiment classification and opinion fusion. In fact, existing works have only focused on classifying reviews into positive or negative before generating a single reputation value, disregarding the sentiment strength.

In this paper, we propose MTVRep, a movie and TV show reputation system that applies fine-grained opinion mining to separate reviews into five opinion groups: strongly negative, weakly negative, neutral, weakly positive and strongly positive. Then, it computes a custom score for each group based on the acquired statistics of each group, i.e., the number of reviews in each group, the sum of their ratings and the sum of their semantic similarity (ELMo and cosine metric). Finally, a numerical reputation value is produced toward the target movie or TV show using the weighted arithmetic mean. In this manner, this study addresses the following research question: with the combination of fine-grained opinion mining and semantic analysis, can the proposed reputation system offer better results in terms of reputation generation than previous reputation systems (which consider only semantic relations)?

The remainder of this paper is organized as follows: related works are provided in Section 2. Section 3 illustrates the workflow of the reputation system. Section 4 presents all the experimental results and discusses comparative performance. Finally, conclusions are drawn in Section 5.
2. LITERATURE REVIEW

This section describes and examines previous research done in the area of natural language processing (NLP) techniques for decision making in e-commerce and fine-grained sentiment analysis.

2.1. Fine-grained sentiment analysis on the 5-class Stanford sentiment treebank (SST-5) dataset

Xu et al. [16] proposed Emo2Vec, word-level representations that encode emotional semantics into fixed-sized, real-valued vectors. Mu et al. [17] presented a simple post-processing operation that renders word representations even stronger by eliminating the top principal components of all words. Socher et al. [18] introduced recursive neural tensor networks and the Stanford sentiment treebank. Wang et al. [19] proposed RNN-Capsule, a capsule model based on recurrent neural networks (RNN) for sentiment analysis. Yang [20] presented RNFs, a new class of convolution filters based on recurrent neural networks. McCann et al. [21] introduced an approach for transferring knowledge from an encoder pretrained on machine translation to a variety of downstream natural language processing (NLP) tasks. Munikar et al. [22] used the pretrained BERT [23] model and fine-tuned it for the fine-grained sentiment classification task on the SST-5 dataset. Table 1 summarizes the latest works on fine-grained opinion mining applied to the Stanford sentiment treebank dataset (SST-5).
Table 1. State-of-the-art results for sentiment analysis on SST-5 fine-grained classification

Method | Authors and Year | Accuracy %
BCN+Suffix BiLSTM-Tied+CoVe | Brahma (2018) [24] | 56.2
BERT large | Munikar et al. (2019) [22] | 55.5
BCN+ELMo | Peters et al. (2018) [25] | 54.7
BCN+Char+CoVe | McCann et al. (2017) [21] | 53.7
CNN-RNF-LSTM | Yang (2018) [20] | 53.4
RNN-Capsule | Wang et al. (2018) [19] | 49.3
SWEM-concat | Shen et al. (2018) [26] | 46.1
RNTN | Socher et al. (2013) [18] | 45.7
GRU-RNN-WORD2VEC | Mu et al. (2017) [17] | 45.02
GloVe+Emo2Vec | Xu et al. (2018) [16] | 43.6
Emo2Vec | Xu et al. (2018) [16] | 41.6
2.2. NLP techniques for decision making in e-commerce

It has been well recognized that user reviews attached to an entity (movie, product, etc.) contain valuable information about it. Recently, a few approaches have been proposed to help potential customers during the decision-making process on e-commerce websites by automatically mining user and customer reviews. The most popular approaches are feature-based summarization and reputation generation.

Feature-based summarization approaches aim to produce a feature-based summary for a target entity, as shown in Figure 1. The first feature-based summarizer system was proposed by Hu and Liu [9], in which they applied association rule mining to extract product features, and they used a set of seed adjectives to identify the semantic orientation of opinion words. Zhuang et al. [3] built a multi-knowledge based system that aims to generate a feature-based summary for online movie reviews. Blair-Goldensohn et al. [10] presented a feature-based summarizer for local service reviews. Kangale et al. [27] proposed a feature-based summarizer system for product reviews that produces a rating as well as a review summary for each product feature, as shown in Figure 1.

Figure 1. Feature-based summary [27]
Reputation generation systems are interested in providing potential customers with sufficient information toward the target entity (product, movie, hotel, etc.) to help them make the right decision toward it (buying, renting, booking, etc.). Currently, a few reputation systems have been proposed to tackle the task of reputation generation using opinion mining techniques on user and customer reviews expressed in natural language. Yan et al. [11] were the first to propose a reputation system that combines opinion mining and opinion fusion techniques for the purpose of producing a single reputation value toward various products. The system firstly eliminates irrelevant reviews [28]; then, the remaining reviews are grouped into different sets based on their semantic relations (latent semantic analysis and cosine metric); and finally, a single numerical reputation value is produced. Benlahbib and Nfaoui [12] used the K-Means clustering algorithm to group similar movie reviews into the same cluster based on their semantic relations before generating a reputation value. The same authors [13] designed and built a hybrid reputation system that firstly combines Naïve Bayes and linear support vector machine (SVM) to separate user and customer reviews into positive and negative (document-level sentiment analysis); then, it groups them into different sets based on semantic relations; and finally, a single reputation value is computed using the weighted arithmetic mean.
3. PROPOSED SYSTEM
3.1. System overview

The proposed approach consists mainly of four steps:

- We collect movie and TV show reviews from the IMDb website (https://www.imdb.com/) using the web scraping tool ScrapeStorm (https://www.scrapestorm.com/); then, we preprocess them.
- We train a Multinomial Naïve Bayes model on the 5-class Stanford sentiment treebank (SST-5) dataset in order to perform fine-grained sentiment analysis. The model classifies the collected reviews into five opinion groups: strongly negative, weakly negative, neutral, weakly positive and strongly positive.
- For each opinion group, we acquire the sum of user ratings and the sum of review semantic similarity. The semantic similarity between two reviews is computed as the cosine between their deep contextualized word embeddings (ELMo). These acquired statistics are used to compute a custom score for each opinion group.
- We compute the movie or TV show numerical reputation value based on the opinion groups' scores by applying the weighted arithmetic mean.
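The semantic-similarity step reduces to a cosine between two review embedding vectors. A minimal sketch in pure Python, assuming each review has already been encoded into a fixed-size vector (e.g., by averaging its ELMo token embeddings; the toy 3-dimensional vectors below are illustrative only, real ELMo vectors are much larger):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy "review embeddings" (hypothetical, for illustration only).
review_a = [0.2, 0.1, 0.7]
review_b = [0.3, 0.0, 0.6]
print(round(cosine(review_a, review_b), 4))
```

Since the cosine of non-negative vectors lies in [0, 1], the group-level similarity averages computed later stay in that range as well.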
Figure 2 illustrates the workflow of the reputation system (MTVRep).

Figure 2. Reputation system pipeline
3.2. Fine-grained sentiment analysis

We classify the collected reviews into five opinion groups based on their sentiment intensities by applying the Multinomial Naïve Bayes model trained on the 5-class Stanford sentiment treebank (SST-5) dataset. The reasons behind using the Multinomial Naïve Bayes model are discussed in Section 4.2.
3.3. Opinion groups custom scores

After separating movie and TV show reviews into five opinion groups (strongly negative, weakly negative, neutral, weakly positive and strongly positive), we compute a custom score for each opinion group based on the sum of their ratings and the sum of their reviews' semantic similarity. The statistics of the opinion groups are acquired by applying Algorithm 1.
Algorithm 1: Opinion groups statistics acquisition

Define:
G_polarity = {r_polarity_1, r_polarity_2, ..., r_polarity_n}: the opinion group that contains reviews which hold the sentiment orientation polarity.
R_polarity = {rr_polarity_1, rr_polarity_2, ..., rr_polarity_n}: the set of ratings attached to G_polarity reviews.
SS_polarity: the sum of semantic similarity for G_polarity reviews.
SR_polarity: the sum of ratings for G_polarity reviews.
NR_polarity: the number of reviews in G_polarity.
ELMo(r_polarity_i): ELMo embeddings for review i from G_polarity.
cos(ELMo(r_polarity_i), ELMo(r_polarity_j)): the cosine similarity between the ELMo embeddings of reviews i and j from G_polarity.

Input: Opinion groups, their lengths and their user ratings: G_polarity, NR_polarity and R_polarity.
Output: Opinion groups' statistics: SS_polarity and SR_polarity

1   polarity ∈ [strongly negative, weakly negative, neutral, weakly positive, strongly positive]
2   /* After applying the trained model on the collected movie and TV show reviews, we separate them into five opinion groups: strongly negative, weakly negative, neutral, weakly positive and strongly positive. For each opinion group, we acquire the sum of their reviews' semantic similarity (cosine metric and ELMo embeddings) and the sum of their ratings */
3   for i in polarity do
4       SS_i ← 0
5       SR_i ← 0
6       for j ← 1 to NR_i do
7           SS_i ← SS_i + cos(ELMo(r_i_1), ELMo(r_i_j))
8           SR_i ← SR_i + rr_i_j
9       end for
10  end for
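Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration, assuming reviews arrive as precomputed embedding vectors grouped by polarity; the `cosine` helper and the toy data are hypothetical, not part of the paper. As in line 7 of the algorithm, each review's similarity is taken against the first review of its group:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def acquire_statistics(groups):
    """groups: {polarity: [(embedding, rating), ...]} -> {polarity: (SS, SR)}.

    SS sums the cosine similarity of every review in the group against the
    group's first review (as in Algorithm 1); SR sums the user ratings."""
    stats = {}
    for polarity, reviews in groups.items():
        ss, sr = 0.0, 0.0
        first_emb = reviews[0][0]
        for emb, rating in reviews:
            ss += cosine(first_emb, emb)
            sr += rating
        stats[polarity] = (ss, sr)
    return stats

# Toy input: one group with two identical embeddings and ratings 9 and 10.
groups = {"strongly positive": [([1.0, 0.0], 9), ([1.0, 0.0], 10)]}
print(acquire_statistics(groups))  # SS = 2.0, SR = 19.0
```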
By applying Algorithm 1, we retrieve, for each group, the sum of their ratings and the sum of their semantic similarity. We propose formula (1) to compute a custom score for each opinion group:

CS(G_polarity) = (maxR × (SS_polarity / NR_polarity) + (SR_polarity / NR_polarity)) / 2    (1)
Formula (1) could also be written as follows:
CS(G_polarity) = (maxR × SS_polarity + SR_polarity) / (2 × NR_polarity)    (2)
We denote:
- maxR: highest value of user ratings (5 or 10) depending on the range of ratings (1 to 5 or 1 to 10).
- SS_polarity: sum of similarity for reviews contained in opinion group G_polarity.
- SR_polarity: sum of user ratings in opinion group G_polarity.
- NR_polarity: number of reviews contained in opinion group G_polarity.
.
The
custom
score
of
each
opinion
group
ranges
between
1
and
5
or
1
and
10
depending
on
the
range
of
user
rating
v
alues.
Since
the
cosine
metric
returns
v
alues
in
the
r
ange
of
[0,1],
the
a
v
erage
of
the
sum
of
semantic
similarity
for
an
opinion
group
is
also
between
0
and
1,
therefore,
we
multiply
the
a
v
erage
of
the
sum
of
semantic
similarity
by
5
or
10
(
maxR
)
to
get
a
numerical
v
alue
between
0
and
5
or
0
and
10,
then,
we
add
this
v
alue
to
the
a
v
erage
of
sum
of
ratings
and
we
di
vide
them
by
2.
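Formula (2) is a one-liner in code. A small sketch; the group statistics below are made-up numbers, not from the paper's datasets:

```python
def custom_score(ss, sr, nr, max_r=10):
    """Custom score of an opinion group, formula (2):
    CS = (maxR * SS + SR) / (2 * NR)."""
    return (max_r * ss + sr) / (2 * nr)

# Hypothetical group: 4 reviews, similarity sum 3.2, rating sum 34 (1-10 scale).
print(custom_score(ss=3.2, sr=34, nr=4))  # -> 8.25
```

Note how the similarity term is scaled by maxR so that both halves of the average live on the same rating scale.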
3.4. Reputation generation

We propose formula (3) (weighted arithmetic mean) to compute the movie or TV show reputation value:

Rep(E) = Σ_polarity [CS(G_polarity) × NR_polarity] / Σ_polarity NR_polarity    (3)

CS(G_polarity) is the custom score for opinion group G_polarity, computed by applying formula (1) or (2). The movie or TV show reputation value lies in the range [1, 5] or [1, 10] depending on the range of user ratings.
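Formula (3) weighs each group's custom score by its review count. A minimal sketch with invented custom scores and group sizes (not from the paper's experiments):

```python
def reputation(groups):
    """Weighted arithmetic mean of the opinion groups' custom scores,
    formula (3): Rep(E) = sum(CS * NR) / sum(NR)."""
    total_weighted = sum(cs * nr for cs, nr in groups)
    total_reviews = sum(nr for _, nr in groups)
    return total_weighted / total_reviews

# (custom score, number of reviews) per opinion group, ratings on a 1-10 scale.
groups = [(2.0, 5), (4.0, 10), (6.0, 20), (8.0, 40), (9.0, 25)]
print(reputation(groups))  # -> 7.15
```

Weighting by NR_polarity ensures that a large, strongly positive group moves the reputation value more than a handful of outlier reviews.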
4. EXPERIMENTAL EVALUATION
4.1. Dataset gathering

We collect movie and TV show reviews and their numerical ratings from the IMDb website using the web scraping tool ScrapeStorm. Figure 3 depicts the structure of IMDb user reviews.

Figure 3. IMDb user reviews structure

The first ten datasets contain movie reviews and the remaining ten datasets contain TV show reviews. Table 2 shows statistical information about the collected datasets.
Table 2. Statistical information of the collected datasets

                   | Movies | TV shows | Total
Number of reviews  | 1000   | 1000     | 2000
Number of entities | 10     | 10       | 20
After collecting the reviews, we replace the missing rating values with the average of the ratings; then, we lowercase the reviews and remove punctuation marks and numbers.
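The cleaning steps above can be sketched as follows (stdlib only; the sample review and ratings are invented for illustration):

```python
import string

def fill_missing_ratings(ratings):
    """Replace None ratings with the average of the observed ratings."""
    observed = [r for r in ratings if r is not None]
    avg = sum(observed) / len(observed)
    return [avg if r is None else r for r in ratings]

def clean(text):
    """Lowercase, then strip punctuation marks and digits."""
    text = text.lower()
    drop = string.punctuation + string.digits
    return text.translate(str.maketrans("", "", drop))

print(fill_missing_ratings([8, None, 10]))  # -> [8, 9.0, 10]
print(clean("Loved it!! 10/10, a MASTERPIECE."))
```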
4.2. Training phase and fine-grained opinion mining

We train the Multinomial Naïve Bayes model on the SST-5 dataset. The training set contains 1092 strongly negative reviews, 2218 weakly negative reviews, 1624 neutral reviews, 2322 weakly positive reviews and 1288 strongly positive reviews. The test set contains 279 strongly negative reviews, 633 weakly negative reviews, 389 neutral reviews, 510 weakly positive reviews and 399 strongly positive reviews. Figure 4 depicts the distribution of training and test samples over the five classes.

Figure 4. Number of samples in the SST-5 training and test sets
Before feeding the data to the classifier for training, we preprocess it by removing punctuation marks, numbers and extra whitespace; then, we lowercase and lemmatize it. After preprocessing the data, we must choose which classifier to apply and which features to use. Since deep learning models require substantial computing power (high-performance CPUs, GPUs and RAM), we decided to work with one of
the four models: Random Forest, Logistic Regression, Multinomial Naïve Bayes and linear support vector machine (SVM). The last two classifiers (Naïve Bayes and SVM) have been recognized as the most popular supervised machine learning algorithms for polarity classification [29]. For feature selection, we tried many combinations: unigrams, bigrams, trigrams, tf-idf unigrams, tf-idf bigrams and tf-idf trigrams. We discarded some popular models such as word2vec and doc2vec because Wang et al. [30] conducted experiments on Naïve Bayes, Logistic Regression and the linear support vector classifier (SVC) for short text classification using tf-idf weighting, word2vec and paragraph2vec (doc2vec), and they reported that the tf-idf/counter feature has the highest accuracy, with word2vec next and doc2vec the lowest. Table 3 summarizes the classification results of the four classifiers on the SST-5 dataset.
Table 3. Sentiment analysis classification results

Model (features) | Macro avg precision | Macro avg recall | Macro avg f1-score | Weighted avg precision | Weighted avg recall | Weighted avg f1-score | Accuracy
Random Forest (unigrams) | 0.40 | 0.31 | 0.30 | 0.40 | 0.36 | 0.33 | 0.36
Random Forest (bigrams) | 0.34 | 0.29 | 0.28 | 0.34 | 0.32 | 0.31 | 0.32
Random Forest (trigrams) | 0.29 | 0.23 | 0.20 | 0.31 | 0.23 | 0.22 | 0.23
Random Forest (tf-idf unigrams) | 0.40 | 0.30 | 0.28 | 0.39 | 0.35 | 0.31 | 0.35
Random Forest (tf-idf bigrams) | 0.34 | 0.29 | 0.28 | 0.34 | 0.32 | 0.31 | 0.32
Random Forest (tf-idf trigrams) | 0.28 | 0.22 | 0.20 | 0.29 | 0.23 | 0.21 | 0.23
Multinomial Naïve Bayes (unigrams) | 0.43 | 0.38 | 0.38 | 0.43 | 0.43 | 0.41 | 0.43
Multinomial Naïve Bayes (bigrams) | 0.36 | 0.30 | 0.29 | 0.36 | 0.35 | 0.32 | 0.35
Multinomial Naïve Bayes (trigrams) | 0.31 | 0.26 | 0.24 | 0.31 | 0.29 | 0.26 | 0.29
Multinomial Naïve Bayes (tf-idf unigrams) | 0.48 | 0.34 | 0.29 | 0.46 | 0.41 | 0.34 | 0.41
Multinomial Naïve Bayes (tf-idf bigrams) | 0.38 | 0.29 | 0.24 | 0.38 | 0.35 | 0.29 | 0.35
Multinomial Naïve Bayes (tf-idf trigrams) | 0.29 | 0.24 | 0.19 | 0.30 | 0.29 | 0.23 | 0.29
Logistic Regression (unigrams) | 0.42 | 0.37 | 0.37 | 0.42 | 0.41 | 0.39 | 0.41
Logistic Regression (bigrams) | 0.38 | 0.28 | 0.23 | 0.37 | 0.34 | 0.27 | 0.34
Logistic Regression (trigrams) | 0.36 | 0.23 | 0.18 | 0.35 | 0.28 | 0.22 | 0.28
Logistic Regression (tf-idf unigrams) | 0.42 | 0.35 | 0.34 | 0.41 | 0.40 | 0.37 | 0.40
Logistic Regression (tf-idf bigrams) | 0.43 | 0.28 | 0.23 | 0.41 | 0.35 | 0.27 | 0.35
Logistic Regression (tf-idf trigrams) | 0.30 | 0.23 | 0.17 | 0.32 | 0.29 | 0.21 | 0.29
Linear SVM (unigrams) | 0.38 | 0.37 | 0.37 | 0.39 | 0.40 | 0.39 | 0.40
Linear SVM (bigrams) | 0.33 | 0.31 | 0.31 | 0.34 | 0.34 | 0.33 | 0.34
Linear SVM (trigrams) | 0.31 | 0.25 | 0.22 | 0.32 | 0.29 | 0.25 | 0.29
Linear SVM (tf-idf unigrams) | 0.38 | 0.38 | 0.38 | 0.39 | 0.41 | 0.39 | 0.41
Linear SVM (tf-idf bigrams) | 0.33 | 0.31 | 0.31 | 0.34 | 0.34 | 0.33 | 0.34
Linear SVM (tf-idf trigrams) | 0.31 | 0.27 | 0.25 | 0.31 | 0.30 | 0.27 | 0.30
From Table 3, we can see that the Multinomial Naïve Bayes classifier achieves the best classification results when trained with unigrams. The Logistic Regression and linear SVM classifiers also gave good results when trained with unigrams or tf-idf unigrams. The worst results are obtained by Random Forest, which achieves an accuracy of 0.36 at best. Figure 5 depicts the confusion matrix of Multinomial Naïve Bayes (unigrams) on the SST-5 test set.
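The winning configuration (unigram counts fed to Multinomial Naïve Bayes) can be illustrated with a tiny from-scratch classifier. This is only a sketch with Laplace smoothing on invented toy data, not the paper's SST-5 pipeline, which in practice would use a library such as scikit-learn:

```python
import math
from collections import Counter, defaultdict

class TinyMultinomialNB:
    """Minimal Multinomial Naive Bayes over unigram counts (Laplace smoothing)."""
    def fit(self, docs, labels):
        self.vocab = {w for d in docs for w in d.split()}
        self.word_counts = defaultdict(Counter)   # label -> word -> count
        self.class_counts = Counter(labels)       # label -> number of docs
        for d, y in zip(docs, labels):
            self.word_counts[y].update(d.split())
        return self

    def predict(self, doc):
        n_docs = sum(self.class_counts.values())
        best, best_lp = None, -math.inf
        for y, n_y in self.class_counts.items():
            lp = math.log(n_y / n_docs)  # log prior
            total = sum(self.word_counts[y].values())
            for w in doc.split():
                if w in self.vocab:  # ignore out-of-vocabulary words
                    lp += math.log((self.word_counts[y][w] + 1)
                                   / (total + len(self.vocab)))
            if lp > best_lp:
                best, best_lp = y, lp
        return best

# Invented two-class toy corpus; real training uses the five SST-5 classes.
clf = TinyMultinomialNB().fit(
    ["great brilliant masterpiece", "awful boring mess", "great fun"],
    ["strongly positive", "strongly negative", "strongly positive"])
print(clf.predict("brilliant fun"))  # -> "strongly positive"
```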
We note that BERT base achieves a 0.45 accuracy and a 0.40 macro average f1-score, GRU-RNN-WORD2VEC achieves a 0.45 accuracy and the recursive neural tensor network achieves a 0.46 accuracy. Besides, deep learning algorithms take a long time to train, as shown in Table 4, due to their large number of parameters. Based on that, we chose to apply the Multinomial Naïve Bayes classifier, since it achieves an accuracy of 0.43 and does not require substantial computing power to be trained. Table 4 depicts the training times of bidirectional gated recurrent unit (Bi-GRU), bidirectional long short-term memory (Bi-LSTM), recurrent neural network (RNN) and Multinomial Naïve Bayes (MNB) on the SST-5 dataset.

One of the benefits of fine-grained opinion mining is that it provides a better understanding of the distribution of reviews over the five emotion classes; therefore, visualizing these five classes will help users and customers make up their minds about the target item (buying, renting).
Figure 5. Confusion matrix of Multinomial Naïve Bayes (unigrams) for the SST-5 test set
Table 4. Training time of bidirectional gated recurrent unit (Bi-GRU), bidirectional long short-term memory (Bi-LSTM), recurrent neural network (RNN) and Multinomial Naïve Bayes (MNB) for the SST-5 dataset

Model   | Epochs | Batch size | Training time (seconds)
Bi-GRU  | 50     | 64         | 210.10
Bi-LSTM | 50     | 64         | 180.25
RNN     | 50     | 64         | 85.26
MNB     | –      | –          | 3.77
4.3. Reputation evaluation

MTVRep offers a holistic reputation visualization form, as shown in Figure 6, by depicting the numerical reputation value and the distribution of reviews over the five emotion classes. Table 5 shows comparison results between MTVRep and previous studies in terms of visualizing reputation. An important issue that was neglected in past research on reputation generation is identifying the sentiment strength during the phase of opinion mining. Actually, existing studies have only focused on classifying reviews as positive or negative, disregarding sentiment intensity. Therefore, we propose MTVRep, a movie and TV show reputation system that combines fine-grained sentiment analysis and semantic analysis for the purpose of generating and visualizing reputation toward movies and TV shows. Table 6 depicts the features exploited by previous studies and MTVRep during reputation generation and visualization.

In order to evaluate the performance of MTVRep in generating accurate reputation values toward various movies and TV shows, we compared it with the reputation system of Yan et al. [11]. We set the opinion fusion threshold t0 to 0.15 since the authors mentioned that their reputation system performs at its best when t0 = 0.15. We applied the two reputation systems to the twenty collected datasets. The chosen evaluation measure is the squared error between the movie or TV show IMDb weighted average rating and the numerical reputation value computed by one of the two reputation systems. The formula of the squared error is:

SE = (x_i − y_i)²

where x_i is the reputation value returned by one of the two systems and y_i is the IMDb weighted average rating toward the target movie or TV show.
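A small sketch of this evaluation measure, with invented reputation values and an invented IMDb rating:

```python
def squared_error(x, y):
    """SE = (x_i - y_i)^2 between a system's reputation value x and
    the IMDb weighted average rating y."""
    return (x - y) ** 2

# Hypothetical outputs on one dataset, against an IMDb rating of 8.8.
imdb_rating = 8.8
print(squared_error(8.5, imdb_rating))   # close estimate, small error
print(squared_error(7.2, imdb_rating))   # larger deviation penalized more
```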
Figure 7 depicts the IMDb weighted average rating for the movie Forrest Gump. According to IMDb (https://help.imdb.com/article/imdb/track-movies-tv/weighted-average-ratings/GWT2DSBYVT2F25SK?ref=helpsectpro28#): "IMDb publishes weighted vote averages rather than raw data averages. Various filters are applied to the raw data in order to eliminate and reduce attempts at vote
stuffing by people more interested in changing the current rating of a movie than giving their true opinion of it. The exact methods we use will not be disclosed. This should ensure that the policy remains effective. The result is a more accurate vote average."

The motivation behind choosing the squared error instead of the absolute error resides in the fact that reputation systems do not tolerate high error values. Consequently, the squared error penalizes large errors more. Figures 8 and 9 show the comparison results between the two reputation systems over the twenty datasets. As illustrated in Figure 8, MTVRep produces the nearest reputation values to the IMDb weighted average ratings for the first ten datasets (movie reviews) compared to the reputation system of [11]. We observe that the squared error of the reputation system [11] exceeds 2.5 on dataset 1, dataset 4, dataset 7 and dataset 9. We also observe that the squared error of MTVRep does not surpass 0.1 on datasets 3, 5 and 10, which implies that the system generates accurate reputation values toward movies, since the highest squared error achieved by MTVRep is 1.87 (dataset 6).

Figure 6. Reputation visualization