International Journal of Electrical and Computer Engineering (IJECE)
Vol. 3, No. 6, December 2013, pp. 751~761
ISSN: 2088-8708
Journal homepage: http://iaesjournal.com/online/index.php/IJECE
Recommender System Based
on Semantic Similarity
Karamollah Bagheri Fard1, Mehrbakhsh Nilashi2, Mohsen Rahmani3, Othman Ibrahim4
1 Dept. of Computer Engineering, Islamic Azad University, Yasooj branch, Yasooj, Iran
2,4 Faculty of Computing, Universiti Teknologi Malaysia, Skudai, Johor, Malaysia
1,3 Dept. of Computer Engineering, Faculty of Engineering, Arak University, Arak, Iran
Article Info
Article history:
Received Jul 16, 2013
Revised Oct 4, 2013
Accepted Oct 22, 2013

ABSTRACT
In electronic commerce, helping users find their favourite products requires a system that classifies products according to the users' interests and needs and recommends them to the users. For the same reason, recommendation systems are designed to help users find information in large websites. They are developed to offer products to customers in an automated fashion so that they can shop conveniently. Developing such systems is important because purchasing a product often involves a large number of factors, which makes it difficult for the customer to make the best decision. Finding relationships among users and relationships among products is an important issue in these systems. One such relation is similarity. Measures of similarity among users and products are used in the pure methods for calculating the degree of similarity. In this paper, semantic similarity is used to find a set of k nearest neighbours to the target user or target item. By incorporating semantic similarity in the proposed recommendation system, the experimental results show that high accuracy was obtained on a private building company dataset in comparison with state-of-the-art recommender systems.
Keywords:
Similarity
Semantic similarity
Recommender systems
Ontology
Copyright © 2013 Institute of Advanced Engineering and Science.
All rights reserved.
Corresponding Author:
Karamollah Bagheri Fard,
Department of Computer Engineering,
Islamic Azad University, Yasooj branch, Yasooj, Iran
Bfkaramollah2@live.utm.my
1. INTRODUCTION
Recommendation systems were created to recommend products to customers and help them purchase, because a customer is unlikely to make an optimal buying decision unaided [1]. The recommendation systems presented so far have many problems, and this has made it difficult for large websites to recommend products to their users. In the past two decades, we have witnessed a significant increase in the number of e-commerce sites that can guide users in the decision-making process. In addition to benefiting users, e-commerce sites benefit companies as well, by giving them access to information about user interests and choices, and ultimately increasing their sales and profits. Given the large number of products/items available online, the big challenge that these e-commerce sites face today is how to effectively identify items that users might be interested in purchasing and to recommend such items to users. Recommender systems can help here. The history of recommender systems dates back to the year 1979, in relation to cognitive science [2]. Recommender systems gained prominence among other application areas such as approximation theory [3], information retrieval [4], forecasting theories [5], management science [6] and consumer choice modelling in marketing [7]. In the mid-1990s, recommender systems became an active research domain, and emerged as an independent research area, when researchers shifted their focus to recommendation problems that explicitly rely on the user rating structure [8-10]. RS's make use of previous user likes and dislikes and statistical methods to extract patterns about users and items. These patterns can then be
employed to suggest items of interest to users. Given the advantages that recommender systems offer, they have become an integral part of many business models and are used very extensively in many e-commerce websites such as Amazon.com, eBay, Reel.com, etc.
In this paper, semantic similarity is used to find a set of k nearest neighbours to the target user or target item. The objective of this paper is to incorporate semantic similarity in the developed recommendation system, evaluate its accuracy using a private building company dataset, and compare it with state-of-the-art recommender systems.
2. RECOMMENDATION TECHNIQUES
Recommendation methods can be categorized in a variety of ways [11, 12]. To give a first overview of the different kinds of RSs, we cite the taxonomy offered by [13], which has become a traditional way of distinguishing between recommender techniques and referring to them. Burke [13] differentiates between six classes of recommendation approaches, the three main ones of which are explained as follows:
2.1. Content-Based Filtering (CBF)
The content-based approach provides recommendations based on information about the content of items rather than on other users' opinions. It uses a machine learning algorithm to induce a profile of the user's preferences from examples, based on a feature description of the content. The content of an item can be structured or unstructured. If we consider the content of a movie as director, writer, cast, etc., then each of these attributes can be considered a feature. In the case of unstructured items such as text data, however, deciding on the feature set is more difficult.
Content-based recommenders treat suggestions as a user-specific classification problem and learn a classifier for the customer's preferences depending on product traits.
According to Ziegler [14], techniques applying a content-based recommendation strategy evaluate a set of documents and/or details of products previously rated by a user, and develop a model or user profile of the user's interests depending on the features of the items rated by that user.
Content-based RS's can be used in a variety of domains, e.g., recommending web pages, news articles, jobs, television programs, and products for sale.
2.2. Collaborative Filtering
In the original and most common form of this strategy [15], the items that other users with similar tastes liked in the past are recommended to the target user. The similarity in taste of two customers is computed from the similarity in the rating history of the users. All collaborative filtering methods share a capability to utilize the past ratings of users in order to predict or recommend new content that an individual user will like [16]. The underlying assumption is rooted in the idea of likeness between users or between products, with the similarity being expressed as a function of agreement between past ratings or preferences. Two basic variants of the collaborative filtering approach can be distinguished: user-based and item-based.
2.3. Hybrid Recommender Systems
Hybrid RS's can be obtained by blending two or more of the techniques mentioned above in an attempt to fix their individual disadvantages. The most widely used hybrid approaches combine collaborative and content-based methods, which tries to eliminate the shortcomings of both [13, 17, 18]. Moreover, the combination chosen for developing a hybrid recommender system depends on the domain and data characteristics. Seven categories of hybrid recommendation systems, namely weighted, switching, mixed, feature combination, feature augmentation, cascade, and meta-level, have been introduced by [19].
3. SIMILARITY METRICS
One crucial step in the collaborative filtering algorithm is to calculate the similarity between items or between users and finally to choose a group of nearest neighbours as recommendation partners for an active user. Once the recommender system has established a set of profiles, it is possible to reason about the similarities between users or items. Because of the importance of similarity metrics, some of the popular similarity metrics used in collaborative filtering will be examined in detail.
3.1. Cosine Similarity
In information retrieval, the cosine similarity metric is usually used to estimate the similarity between two instances a and b, where the objects are represented as vectors x_a and x_b [20, 21]; the Cosine Vector (CV) (or Vector Space) similarity between these vectors indicates how close they are to each other [22, 23]:
$$\cos(x_a, x_b) = \frac{x_a \cdot x_b}{\|x_a\|_2 \, \|x_b\|_2} \qquad (1)$$
In the context of item recommendation, this measure can be employed for computing user similarities. A user u is represented by a vector x_u ∈ R^{|I|}, where x_ui = r_ui if user u has rated item i and x_ui = 0 for unrated items. The similarity between two users u and v would then be calculated as:
$$CV(u, v) = \cos(x_u, x_v) = \frac{\sum_{i \in I_{uv}} r_{ui}\, r_{vi}}{\sqrt{\sum_{i \in I_u} r_{ui}^2 \, \sum_{j \in I_v} r_{vj}^2}} \qquad (2)$$
Here I_uv denotes the set of items rated by both u and v, and I_u and I_v denote the items rated by u and by v. A shortcoming of this measure is that it does not take into account the differences in the mean and variance of the ratings made by users u and v. Cosine similarity is calculated on a scale between -1 and +1, where -1 implies the objects are completely dissimilar, +1 implies they are completely similar and 0 implies that the objects do not have any relationship to each other. In prior research, vector similarity has been proven to work well in information retrieval [4], but it has not been found to perform as well as Pearson's for user-based CF [24].
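The user-user cosine measure of Equation 2 can be illustrated with a minimal Python sketch. The dictionary-based rating profiles, the item names and the convention of returning 0 for a user without ratings are assumptions made for this illustration only; they are not taken from the system described in this paper.

```python
import math

def cosine_similarity(ratings_u, ratings_v):
    """Cosine Vector similarity between two users (Equation 2).

    ratings_u, ratings_v: dicts mapping item id -> rating. An item that is
    missing from a dict is an unrated item (value 0 in the rating vector).
    """
    common = set(ratings_u) & set(ratings_v)  # I_uv: items rated by both
    numerator = sum(ratings_u[i] * ratings_v[i] for i in common)
    norm_u = math.sqrt(sum(r * r for r in ratings_u.values()))
    norm_v = math.sqrt(sum(r * r for r in ratings_v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: a user with no ratings matches nobody
    return numerator / (norm_u * norm_v)

# Hypothetical rating profiles on a 1-5 scale.
user_u = {"cement": 5, "brick": 3, "paint": 4}
user_v = {"cement": 4, "brick": 2, "tile": 5}
print(cosine_similarity(user_u, user_v))
```

Because only co-rated items contribute to the numerator while every rated item contributes to the norms, users with little overlap receive a low score even when they agree on the few items they share.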
3.2. Pearson Correlation
Pearson Correlation (PC) is a well-known metric that compares ratings after the effects of mean and variance have been eliminated [25, 26]:
$$PC(u, v) = \frac{\sum_{i \in I_{uv}} (r_{ui} - \bar{r}_u)(r_{vi} - \bar{r}_v)}{\sqrt{\sum_{i \in I_{uv}} (r_{ui} - \bar{r}_u)^2 \, \sum_{i \in I_{uv}} (r_{vi} - \bar{r}_v)^2}} \qquad (3)$$
Similarly, to obtain the similarity between two items i and j, the ratings given by the users who have rated both of these items are compared:
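$$PC(i, j) = \frac{\sum_{u \in U_{ij}} (r_{ui} - \bar{r}_i)(r_{uj} - \bar{r}_j)}{\sqrt{\sum_{u \in U_{ij}} (r_{ui} - \bar{r}_i)^2 \, \sum_{u \in U_{ij}} (r_{uj} - \bar{r}_j)^2}} \qquad (4)$$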
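As an illustration of Equations 3 and 4, the following Python sketch computes the Pearson correlation over co-rated entries. It is a sketch under stated assumptions: the mean ratings are taken over the co-rated entries only (one common convention), and fewer than two shared entries or zero variance are treated as "no evidence" and mapped to 0.

```python
import math

def pearson(ratings_x, ratings_y):
    """Pearson correlation between two rating profiles (dict key -> rating).

    For Equation 3 the profiles are users and the keys are items; for
    Equation 4 the profiles are items and the keys are users.
    """
    common = sorted(set(ratings_x) & set(ratings_y))
    if len(common) < 2:
        return 0.0  # assumption: too little overlap to correlate
    mean_x = sum(ratings_x[k] for k in common) / len(common)
    mean_y = sum(ratings_y[k] for k in common) / len(common)
    dx = [ratings_x[k] - mean_x for k in common]
    dy = [ratings_y[k] - mean_y for k in common]
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        return 0.0  # assumption: constant ratings carry no information
    return sum(a * b for a, b in zip(dx, dy)) / denominator

# Hypothetical profiles: v always rates twice as high as u.
u = {"a": 1, "b": 2, "c": 3}
v = {"a": 2, "b": 4, "c": 6}
print(pearson(u, v))
```

Subtracting the means removes the effect of users who rate systematically high or low, which is exactly the shortcoming of the cosine measure noted in Section 3.1.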
3.3. Spearman's Correlation Coefficient
Spearman's correlation coefficient is a rank coefficient that, independently of the actual item rating values, estimates the difference in the ranking of the items in the profiles [27]. First, each user's list of ratings is turned into a list of ranks, where the user's highest rating takes the rank of 1, and tied ratings take the average of the ranks for their spot [28, 29]. Herlocker [29] showed that Spearman's performs similarly to Pearson's for user-based CF.
$$SRC(a, b) = \frac{\sum_{i \in I} (r_{a,i} - \bar{r}_a)(r_{b,i} - \bar{r}_b)}{\sqrt{\sum_{i \in I} (r_{a,i} - \bar{r}_a)^2 \, \sum_{i \in I} (r_{b,i} - \bar{r}_b)^2}} \qquad (5)$$
The Spearman Correlation Coefficient for user-user similarity between two users a and b is given in Equation 5. It is defined over the set of all co-rated items (I), where r_{a,i} and r_{b,i} indicate the rank each user gave to item i, and \bar{r}_a and \bar{r}_b indicate each user's average rank. Once again, the
correlation is measured on a scale from -1 to +1, where -1 implies the objects are completely dissimilar, +1 implies they are completely similar and 0 implies that the objects do not have any relationship to each other.
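The ranking step described above (highest rating gets rank 1, ties share the average rank) followed by Equation 5 can be sketched in Python as below. The function names and the handling of degenerate cases (fewer than two co-rated items, or all ranks equal) are assumptions of this sketch.

```python
import math

def average_ranks(ratings):
    """Map item -> rank; the highest rating gets rank 1 and tied ratings
    share the average of the rank positions they occupy."""
    ordered = sorted(ratings, key=lambda item: -ratings[item])
    ranks = {}
    pos = 0
    while pos < len(ordered):
        end = pos
        while (end + 1 < len(ordered)
               and ratings[ordered[end + 1]] == ratings[ordered[pos]]):
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2  # positions are 1-based
        for item in ordered[pos:end + 1]:
            ranks[item] = mean_rank
        pos = end + 1
    return ranks

def spearman(ratings_a, ratings_b):
    """Equation 5: Pearson correlation applied to ranks of co-rated items."""
    common = set(ratings_a) & set(ratings_b)
    if len(common) < 2:
        return 0.0  # assumption: not enough co-rated items
    rank_a = average_ranks({i: ratings_a[i] for i in common})
    rank_b = average_ranks({i: ratings_b[i] for i in common})
    mean_a = sum(rank_a.values()) / len(common)
    mean_b = sum(rank_b.values()) / len(common)
    num = sum((rank_a[i] - mean_a) * (rank_b[i] - mean_b) for i in common)
    den = math.sqrt(sum((rank_a[i] - mean_a) ** 2 for i in common)
                    * sum((rank_b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

print(average_ranks({"x": 5, "y": 3, "z": 3}))
```

Because only the ordering matters, two users whose ratings differ in scale but agree in order obtain a correlation of +1.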
4. SEMANTIC SIMILARITY
Three types of semantic similarity measures are used in calculating the similarities between items serving as ontology-based metadata instances: Taxonomy Similarity (TS), Attribute Similarity (AS) and Relation Similarity (RS). For each pair of items, these semantic similarity measures are combined using weights [30]. The semantic similarity between instances I_i and I_j is denoted by SS(I_i, I_j) and is calculated as the weighted arithmetic mean of TS, RS, and AS, with weights a, b and c:
$$SS(I_i, I_j) = \frac{a \cdot TS(I_i, I_j) + b \cdot RS(I_i, I_j) + c \cdot AS(I_i, I_j)}{a + b + c} \qquad (6)$$
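A small worked example of the weighted mean in Equation 6 is sketched below in Python. The component values and the weights are hypothetical numbers chosen for illustration; the paper does not state which weights were used.

```python
def semantic_similarity(ts, rs, as_, a=1.0, b=1.0, c=1.0):
    """Equation 6: weighted arithmetic mean of TS, RS and AS.

    ts, rs, as_ are similarity values in [0, 1]; a, b, c are their weights.
    """
    total = a + b + c
    if total <= 0:
        raise ValueError("the weights must sum to a positive value")
    return (a * ts + b * rs + c * as_) / total

# Hypothetical component similarities for one pair of instances.
print(semantic_similarity(0.8, 0.5, 0.2))                   # equal weights
print(semantic_similarity(0.8, 0.5, 0.2, a=2.0, b=1.0, c=1.0))  # TS counts double
```

Raising one weight pulls the combined score toward that component, so the weights express how much the taxonomy, the relations and the attributes are trusted in a given domain.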
4.1. Taxonomy Similarity
Taxonomy Similarity (TS) between two instances is determined according to the positions of their corresponding concepts in the concept hierarchy (H^c) specified in the ontology model [31]. In TS, the closer two concepts are in the taxonomy, the stronger the similarity between them. After computing the similarities between concepts in the ontology, it is possible to calculate the similarity between two instances by considering the similarities between the concepts related to these instances. To calculate the taxonomy similarity between two concepts (TSC), four different measures can be used: TSC_Wu&Palmer, TSC_CM, TSC_Lin and TSC_Mclean. Following Maedche and Zacharias [32], TSC_CM, the taxonomy similarity between concepts using concept match, is used to calculate TSC. It is defined based on the distance between two concepts in the ontology. The Concept Match (CM) between two concepts, on which TSC_CM is built, is determined as:
$$CM(C_i, C_j) = \frac{|UC(C_i, H^c) \cap UC(C_j, H^c)|}{|UC(C_i, H^c) \cup UC(C_j, H^c)|} \qquad (7)$$
where UC (Upwards Cotopy) is determined as:
$$UC(C_i, H^c) = \{\, C_j \in C \mid H^c(C_i, C_j) \lor C_j = C_i \,\} \qquad (8)$$
UC gives the set of concepts that form the path from a given concept to the root of a given concept hierarchy. Subsequently, TSC_CM can be defined as follows:
$$TSC_{CM}(C_i, C_j) = \begin{cases} 1, & \text{if } C_i = C_j \\ \dfrac{CM(C_i, C_j)}{2}, & \text{otherwise} \end{cases} \qquad (9)$$
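Equations 7 to 9 can be traced on a small example with the following Python sketch. The five-concept hierarchy is a hypothetical toy taxonomy, not the ontology used in this study, and it assumes every concept has a single parent.

```python
# Hypothetical toy hierarchy: concept -> parent concept (root has None).
PARENT = {
    "Feature": None,
    "Person": "Feature",
    "Actor": "Person",
    "Director": "Person",
    "Genre": "Feature",
}

def upward_cotopy(concept, parent=PARENT):
    """Equation 8: the concept itself plus all of its super-concepts."""
    result = set()
    while concept is not None:
        result.add(concept)
        concept = parent[concept]
    return result

def concept_match(ci, cj, parent=PARENT):
    """Equation 7: overlap of the two upward cotopies."""
    uc_i, uc_j = upward_cotopy(ci, parent), upward_cotopy(cj, parent)
    return len(uc_i & uc_j) / len(uc_i | uc_j)

def tsc_cm(ci, cj, parent=PARENT):
    """Equation 9: 1 for identical concepts, CM / 2 otherwise."""
    return 1.0 if ci == cj else concept_match(ci, cj, parent) / 2

print(upward_cotopy("Actor"))
print(tsc_cm("Actor", "Director"), tsc_cm("Actor", "Genre"))
```

In this toy hierarchy Actor and Director share two of four concepts on their paths (Person and Feature), while Actor and Genre share only the root, so the first pair scores higher.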
TSC_Wu&Palmer is the second measure, which was proposed by Wu and Palmer [33]. Wu and Palmer's measure of similarity between concepts is defined as follows:
$$TSC_{Wu\&Palmer}(C_i, C_j) = \begin{cases} 1, & \text{if } C_i = C_j \\ \dfrac{2 \cdot N_3}{N_1 + N_2 + 2 \cdot N_3}, & \text{otherwise} \end{cases} \qquad (10)$$
N_1 and N_2 are the numbers of subConceptOf links from C_i and C_j, respectively, to their most specific common concept C_k, which subsumes both of them. N_3 is the number of subConceptOf links from C_k to the root of the ontology (root concept). Like TSC_CM, TSC_Wu&Palmer is based on the distance between concepts in the ontology. Lin's taxonomy similarity, presented by [34], is chosen as the third measure for computing TSC. Lin's taxonomy similarity is an information-theoretic approach based on a probabilistic model. The taxonomy similarity between concepts by Lin's taxonomy similarity (TSC_Lin) is presented as:
$$TSC_{Lin}(C_i, C_j) = \begin{cases} 1, & \text{if } C_i = C_j \\ \dfrac{2 \cdot \log \Pr(C_k)}{\log \Pr(C_i) + \log \Pr(C_j)}, & \text{otherwise} \end{cases} \qquad (11)$$
Pr(C_n) is the probability that a randomly chosen instance belongs to concept C_n, and C_k is the most specific concept subsuming both C_i and C_j. The Movie concept and the Feature concept are the two root concepts utilized in this study, and the values of their instances have no effect on each other's probabilities. For example, only the Movie instances are considered when computing the probability of a concept that belongs to the Movie concept. Pr(C_n) is therefore represented as follows:
$$\Pr(C_n) = \begin{cases} \dfrac{|ISET(C_n)|}{|ISET(Movie)|}, & \text{if } Movie \in UC(C_n, H^c) \\[2ex] \dfrac{|ISET(C_n)|}{|ISET(Feature)|}, & \text{if } Feature \in UC(C_n, H^c) \end{cases} \qquad (12)$$
ISET(C_n) is the set of instances of the concepts that are linked to the concept C_n by subConceptOf links. ISET(C_n) can be defined as follows:
$$ISET(C) = \{\, I \in \mathcal{I} \mid C \in UC(CSET(I), H^c) \,\}, \qquad CSET(I) = \{\, C \in \mathcal{C} \mid C(I) \,\} \qquad (13)$$
CSET(I) indicates the set of concepts to which instance I is linked by instanceOf links.
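To make Equations 11 to 13 concrete, the Python sketch below estimates Pr(C_n) from instance counts and evaluates Lin's measure on a toy ontology. The hierarchy, the instance names and the single root concept are hypothetical, and the sketch assumes that every concept has at least one instance (otherwise the logarithm is undefined).

```python
import math

# Hypothetical toy ontology: concept -> parent concept (root has None).
PARENT = {"Feature": None, "Person": "Feature", "Actor": "Person",
          "Director": "Person", "Genre": "Feature"}
# Hypothetical instanceOf links: instance -> concept.
INSTANCE_OF = {"actor_1": "Actor", "actor_2": "Actor",
               "director_1": "Director", "genre_1": "Genre"}

def upward_cotopy(concept):
    """The concept and its ancestors, ordered from most specific to root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path

def iset(concept):
    """Equation 13: instances of `concept` or of any of its sub-concepts."""
    return {i for i, c in INSTANCE_OF.items() if concept in upward_cotopy(c)}

def prob(concept, root="Feature"):
    """Equation 12, restricted to one root concept for this sketch."""
    return len(iset(concept)) / len(iset(root))

def tsc_lin(ci, cj):
    """Equation 11: Lin's information-theoretic similarity."""
    if ci == cj:
        return 1.0
    # C_k: the first ancestor of ci that also subsumes cj.
    ck = next(c for c in upward_cotopy(ci) if c in upward_cotopy(cj))
    return 2 * math.log(prob(ck)) / (math.log(prob(ci)) + math.log(prob(cj)))

print(prob("Actor"), prob("Person"))
print(tsc_lin("Actor", "Director"))
```

When the only common subsumer is the root, Pr(C_k) is 1 and its logarithm is 0, so the similarity collapses to 0; the more specific (rarer) the shared concept, the higher the score.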
The fourth measure comes from [35], where varied strategies of similarity calculation are analysed and the similarity measure defined in the following equation, here called taxonomy similarity between concepts using Mclean's taxonomy similarity (TSC_Mclean), gives the best performance.
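$$TSC_{Mclean}(C_i, C_j) = \begin{cases} 1, & \text{if } C_i = C_j \\ e^{-\alpha l} \cdot \dfrac{e^{\beta h} - e^{-\beta h}}{e^{\beta h} + e^{-\beta h}}, & \text{otherwise} \end{cases} \qquad (14)$$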
The work carried out in [35] reveals that Mclean's taxonomy similarity measure produced the best performance, with optimal values of 0.2 and 0.6 for the parameters α and β respectively, when separate similarity calculation strategies were evaluated. Here l is the shortest path length between C_i and C_j, and h is the depth of the most specific concept subsuming both in the ontology. As stated above, TSC_CM, TSC_Wu&Palmer, and TSC_Mclean are based on the distance between concepts, while TSC_Lin is based on an information-theoretic approach.
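The two link-counting measures of Equations 10 and 14 can be compared on a toy hierarchy with the Python sketch below. The hierarchy is hypothetical, each concept is assumed to have a single parent, and h is taken to be the number of subConceptOf links from the common subsumer to the root (the same quantity as N_3); the default α = 0.2 and β = 0.6 are the values reported above.

```python
import math

# Hypothetical toy hierarchy: concept -> parent concept (root has None).
PARENT = {"Feature": None, "Person": "Feature", "Actor": "Person",
          "Director": "Person", "Genre": "Feature"}

def path_to_root(concept):
    """Concepts from `concept` up to the root, most specific first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path

def link_counts(ci, cj):
    """Return (N1, N2, N3): links from ci and cj to their most specific
    common subsumer C_k, and links from C_k to the root."""
    path_i, path_j = path_to_root(ci), path_to_root(cj)
    ck = next(c for c in path_i if c in path_j)
    return path_i.index(ck), path_j.index(ck), len(path_to_root(ck)) - 1

def tsc_wu_palmer(ci, cj):
    """Equation 10."""
    if ci == cj:
        return 1.0
    n1, n2, n3 = link_counts(ci, cj)
    return 2 * n3 / (n1 + n2 + 2 * n3)

def tsc_mclean(ci, cj, alpha=0.2, beta=0.6):
    """Equation 14; the fraction of exponentials equals tanh(beta * h)."""
    if ci == cj:
        return 1.0
    n1, n2, n3 = link_counts(ci, cj)
    path_length, depth = n1 + n2, n3  # l and h in Equation 14
    return math.exp(-alpha * path_length) * math.tanh(beta * depth)

print(tsc_wu_palmer("Actor", "Director"), tsc_mclean("Actor", "Director"))
```

Both measures fall to 0 when the only common subsumer is the root, and both grow as the shared concept lies deeper in the hierarchy; Mclean's measure additionally decays exponentially with the path length between the two concepts.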
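Given a TSC measure, the taxonomy similarity between two instances I_i and I_j is computed as:

$$TS(I_i, I_j) = \begin{cases} 1, & \text{if } I_i = I_j \\ SSIM(CSET(I_i), CSET(I_j)), & \text{otherwise} \end{cases} \qquad (15)$$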
The CSET used in Equation 15 was defined in Equation 13. SSIM(S_1, S_2) indicates the similarity between two sets S_1 and S_2. The similarity between two sets can be calculated from the similarities between their elements, in this case the TSC of concepts, together with a method that specifies how these similarities are combined.
4.2. Relation Similarity
Relation similarity (RS) is another similarity measure that uses ontology-based metadata [36]. In ontology-based metadata, RS between two instances is based on their relations to other instances. Assume that Director Z is the director of Movie α and Movie β, and Director Y is the director of Movie γ. It is clear that the RS between Movie α and Movie β is higher than the RS between Movie β and Movie γ, because Movie α and Movie β have the same director. For the RS measure, the modified version of Maedche and Zacharias's RS measure from [37] is used. RS between instances I_i and I_j can be computed as follows:
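$$RS(I_i, I_j) = \begin{cases} 1, & \text{if } I_i = I_j \\ \dfrac{\sum_{p \in P_{co\text{-}I}} OR(I_i, I_j, p, IN) + \sum_{p \in P_{co\text{-}O}} OR(I_i, I_j, p, OUT)}{|P_{co\text{-}I}| + |P_{co\text{-}O}|}, & \text{otherwise} \end{cases} \qquad (16)$$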
P_co-I and P_co-O stand for incoming relations and outgoing relations respectively. The former is the set of relations that allow UC(C(I_i), H^c) and UC(C(I_j), H^c) as ranges, while the latter is the set of relations that allow UC(C(I_i), H^c) and UC(C(I_j), H^c) as domains. The relation similarity between instances is the average of the similarities calculated for each incoming and outgoing relation of the instances. OR(I_i, I_j, P, DIR) denotes the similarity for relation P and direction DIR between instances I_i and I_j, where DIR ∈ {IN, OUT}, and is calculated by considering the associated instances of I_i and I_j with respect to P and DIR. For example, for the similarity of relation hasDirector and direction OUT between two movie instances in the Movie Ontology, the directors of the two movies are considered. In a similar fashion, for the similarity of relation hasDirector and direction IN between two directors, the movies are considered. The associated instances (A_s) of instance I_n with respect to the relation P and direction DIR are the following:
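$$A_s(P, I_n, DIR) = \begin{cases} \{\, I_k : I_k \in \mathcal{I} \land P(I_k, I_n) \,\}, & \text{if } DIR = IN \\ \{\, I_k : I_k \in \mathcal{I} \land P(I_n, I_k) \,\}, & \text{if } DIR = OUT \end{cases} \qquad (17)$$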
A_s(P, I_n, DIR) is defined as the associated instances (A_s) of instance I_n with regard to the relation P and direction DIR. The calculation of OR(I_i, I_j, P, DIR) is thus reduced to the similarity between two sets of associated instances.
$$OR(I_i, I_j, P, DIR) = \begin{cases} 0, & \text{if } A_s(P, I_i, DIR) = \emptyset \lor A_s(P, I_j, DIR) = \emptyset \\ SSIM(A_s(P, I_i, DIR), A_s(P, I_j, DIR)), & \text{otherwise} \end{cases} \qquad (18)$$
Recall from the previous section that the similarity between two sets (SSIM) is derived from the similarities between their elements using some method. RS is used when calculating SS between two instances, and SS is employed in calculating RS between instances; this leads to infinite cycles, and to avert this, a maximum recursion depth has to be defined. Relation similarity is advantageous because similarities between associated instances are taken into account. For a movie instance, the associated instances are the feature values of that movie. Consider movies that have only one feature, the actor who starred in the movie, and suppose we want to find the similarity between MovieX and MovieY, which have the feature values Actor α and Actor β respectively. If the user has rated only movies casting Actor α, predicting the rating of MovieY becomes impossible when only identical feature values are matched. The relation similarity between MovieX and MovieY depends on the semantic similarity between Actor α and Actor β, and also on the semantic similarity between other instances with relations to Actor α and Actor β. As such, a similarity value for the movies can be found and a rating prediction made.
4.3. Attribute Similarity
For calculating semantic similarities of ontology-based metadata, Attribute Similarity (AS) is used as a third similarity measure [38]. In contrast to relation similarity, AS compares the attribute values of two objects. Hence, AS between two instances I_i and I_j is defined as:
$$AS(I_i, I_j) = \begin{cases} 1, & \text{if } I_i = I_j \\ \dfrac{\sum_{a \in P_A} OA(I_i, I_j, a)}{|P_A|}, & \text{otherwise} \end{cases} \qquad (19)$$
P_A denotes the set of attributes that includes the attributes of both UC(C(I_i), H^c) and UC(C(I_j), H^c). The similarity between objects I_i and I_j for attribute a is determined by OA(I_i, I_j, a). Thus, attribute similarity between two instances is calculated by computing the similarities for each attribute in the set P_A and taking the average of these similarities. Similar to the computation of OR, OA(I_i, I_j, a) is calculated by considering the associated literals of I_i and I_j with respect to the attribute a. The associated literal (A_l) of an instance I_n with regard to the attribute A is as follows:
$$A_l(A, I_n) = \begin{cases} L_x, & \text{if } L_x \in L \land A(I_n, L_x) \\ 0, & \text{otherwise} \end{cases} \qquad (20)$$
The difference between A_l and A_s is that A_l can include at most one literal, unlike A_s. Thus, to calculate OA, the similarity between two attribute values is calculated rather than the similarity between two sets.
$$OA(I_i, I_j, a) = \begin{cases} 0, & \text{if } A_l(a, I_i) = 0 \lor A_l(a, I_j) = 0 \\ LSIM(L_i, L_j, a), & \text{otherwise} \end{cases} \qquad (21)$$

$$L_i = A_l(a, I_i) \quad \text{and} \quad L_j = A_l(a, I_j) \qquad (22)$$
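Equations 19 to 22 can be sketched in Python as follows. The literal similarity LSIM is not specified in this paper, so the `literal_similarity` function below is a hypothetical stand-in (scaled absolute difference for numbers, exact match otherwise); the attribute names and value ranges are likewise illustrative.

```python
def literal_similarity(x, y, value_range=None):
    """Hypothetical LSIM: numeric literals are compared by their distance
    relative to `value_range`; all other literals by exact match."""
    numeric = isinstance(x, (int, float)) and isinstance(y, (int, float))
    if numeric and value_range:
        return max(0.0, 1.0 - abs(x - y) / value_range)
    return 1.0 if x == y else 0.0

def attribute_similarity(inst_i, inst_j, attributes, ranges=None):
    """Equation 19: average of OA over the attribute set P_A.

    inst_i, inst_j: dicts attribute -> literal (at most one literal each).
    A missing literal makes OA equal to 0 for that attribute (Equation 21).
    """
    if inst_i is inst_j:
        return 1.0
    if not attributes:
        return 0.0  # assumption: no attributes means no evidence
    ranges = ranges or {}
    total = 0.0
    for a in attributes:
        lit_i, lit_j = inst_i.get(a), inst_j.get(a)
        if lit_i is None or lit_j is None:
            continue  # OA = 0: contributes nothing to the sum
        total += literal_similarity(lit_i, lit_j, ranges.get(a))
    return total / len(attributes)

# Hypothetical movie instances; "runtime" is missing for both.
movie_x = {"year": 1994, "language": "en"}
movie_y = {"year": 1999, "language": "en"}
attrs = ["year", "language", "runtime"]
print(attribute_similarity(movie_x, movie_y, attrs, ranges={"year": 50}))
```

Dividing by the size of the whole attribute set, rather than by the number of attributes both instances actually have, means that missing values lower the score instead of being ignored.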
5. RECOMMENDER SYSTEM BASED ON SEMANTIC SIMILARITY
Collaborative filtering applies a similarity method to find the K nearest neighbour users of the target user. It then utilizes the past ratings of the neighbour users to predict or recommend new content that the target user will like. In this paper, we use semantic similarity among users to find the k nearest neighbour users. It is worth mentioning that the users' profiles must be constructed based on the ontology.
All activities of a user can be collected and saved in a web proxy. The system can classify the records of the user's activities using a machine learning algorithm and the ontology of the items. Some attributes of the items that a user browses and searches for can be used to develop the initial user profile ontology. Finally, a user's feedback on the recommendation results can be used as an important factor to adjust the user's profile.
To develop the profile ontology, the items ontology is needed first, as elaborated in the previous steps. After that, the user's interests and preferences are derived from the content of the items previously browsed and searched by the user. The ontology generator uses the user's previous activities regarding the various items to develop the initial user profile ontology. The user's profile is therefore built from a subset of the reference ontology nodes, and each node has an attribute called the interest value. This profile is updated with regard to the user's new activities such as shopping, visiting pages, explicit rating, browsing and searching. Figure 1 shows the user profiling module used in this study.
Figure 1. User Profiling Module (components shown: Web Proxy, Web Logs, Classifier, Movie Ontology, Ontology Generator, User Profile Ontology)
In this study, to build a recommendation list by collaborative filtering, the K nearest neighbours of the active user (target user) must first be obtained. For this purpose, semantic similarity methods are applied. To obtain the K-NN users of the active user, semantic similarity between ontologies is used [32]. In this similarity method, both lexical similarity and conceptual similarity are considered for measuring the similarity between two ontologies. The conceptual comparison level includes comparing two taxonomies and comparing the relations between corresponding concepts of the two taxonomies. After the K nearest neighbour users have been produced, all items that the neighbour users have purchased but the target user has not purchased are recommended to the target user.
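The neighbour selection and recommendation step just described can be sketched in Python as below. The similarity function is passed in as a parameter, so any user-user measure (for example a semantic similarity between profile ontologies) can be plugged in; the lookup table of similarity values, the user and item names, and the ranking of candidate items by summed neighbour similarity are assumptions of this sketch rather than details given in the paper.

```python
def recommend(target, purchases, similarity, k=2, n=3):
    """Recommend up to n items bought by the k nearest neighbours of
    `target` but not by `target` itself.

    purchases: dict user -> set of purchased items.
    similarity: function (user_a, user_b) -> float, higher means closer.
    """
    others = [u for u in purchases if u != target]
    neighbours = sorted(others, key=lambda u: similarity(target, u),
                        reverse=True)[:k]
    scores = {}
    for u in neighbours:
        weight = similarity(target, u)
        for item in purchases[u] - purchases[target]:
            # Assumption: items are ranked by summed neighbour similarity.
            scores[item] = scores.get(item, 0.0) + weight
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]

# Hypothetical data: precomputed user-user similarity values.
SIM = {("u1", "u2"): 0.9, ("u1", "u3"): 0.4, ("u1", "u4"): 0.1}

def lookup_similarity(a, b):
    return SIM.get((a, b), SIM.get((b, a), 0.0))

purchases = {
    "u1": {"cement", "brick"},
    "u2": {"cement", "tile", "paint"},
    "u3": {"brick", "paint"},
    "u4": {"sand"},
}
print(recommend("u1", purchases, lookup_similarity, k=2, n=3))
```

With k = 2 the neighbours of u1 are u2 and u3, so the item bought only by the distant user u4 is never considered, and an item bought by both neighbours is ranked first.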
In content-based filtering systems, items that are highly similar to the user's profile can be recommended to the user by considering the items' content. In this study, content-based filtering uses semantic similarity among items in the item ontology domain in order to predict the unknown rating for the target user based on his/her profile. In this stage, a list of top-N recommended items is prepared for recommendation to the target user based on the user's history record.
6. EVALUATION
To evaluate how accurately the proposed methods work in recommender systems, it is better to use the transactions (selling and buying) of a store with various products. In this study, the bills of a construction materials supplier were used. The data include 2266 buyers, 2581 products, and 21662 sales invoices.
To evaluate the recommender system, the items purchased by each user were first divided into two randomly selected sets. The first set was called the training set and the second one was called the test set. The proposed algorithms were first run on the training set in order to select N items to be recommended to users. The N items recommended to the target user are called Top-N. Then, the items in Top-N were compared with the items in the test set. The items common to the test set and Top-N were called the Hit Set. After obtaining the test set, training set, and Hit Set, the final step is to determine the accuracy of the algorithm using evaluation criteria. Here, two evaluation criteria called Precision and Recall are used.
$$Precision = \frac{\text{size of hit set}}{\text{size of top-N set}} \qquad (23)$$

$$Recall = \frac{\text{size of hit set}}{\text{size of test set}} \qquad (24)$$
For a better assessment of performance, F1, which combines the two criteria above, was used:
$$F1 = \frac{2 \cdot Recall \cdot Precision}{Recall + Precision} \qquad (25)$$
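The three criteria of Equations 23 to 25 can be computed for one user with the short Python sketch below. The item names are hypothetical, and returning 0 when a denominator would be zero is an assumption of the sketch.

```python
def evaluate_top_n(top_n, test_set):
    """Return (precision, recall, f1) for one user.

    top_n: list of recommended items (assumed to contain no duplicates).
    test_set: set of items the user actually purchased in the test split.
    """
    hit_set = set(top_n) & set(test_set)
    precision = len(hit_set) / len(top_n) if top_n else 0.0
    recall = len(hit_set) / len(test_set) if test_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * recall * precision / (recall + precision)
    return precision, recall, f1

# Hypothetical example: 2 of the 4 recommended items are in the test set.
top_n = ["cement", "brick", "tile", "paint"]
test_set = {"brick", "paint", "sand"}
print(evaluate_top_n(top_n, test_set))
```

In this example the precision is 2/4 and the recall is 2/3, which gives F1 = 4/7; averaging the per-user F1 values yields the overall accuracy criterion.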
F1 was computed for each user, and the average F1 over all users was taken as the criterion for determining the accuracy of the algorithm. To compare the proposed methods with previous methods, they are compared with the recommender system that has been designed based on association rules. The following diagrams show the results of these algorithms. In the following evaluations, values of Top-N from 10 to 130 were considered.
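The per-user averaging described above might look like the following Python sketch. Everything named here is hypothetical: `recommend` stands in for any of the compared recommenders, and the toy recommender and data exist only to make the example runnable. Returning 0 when the hit set is empty is an assumption, since equation (25) is undefined in that case.

```python
def f1_score(top_n, test_set):
    """Equation (25) for one user, from Precision and Recall."""
    hits = len(set(top_n) & set(test_set))
    if hits == 0:
        return 0.0  # Precision and Recall are both 0; avoid dividing by zero
    p = hits / len(top_n)
    r = hits / len(test_set)
    return 2 * r * p / (r + p)


def mean_f1(recommend, train_sets, test_sets, n):
    """Average the per-user F1 for a given Top-N size n.

    recommend(user, train_items, n) is a hypothetical callable that returns
    a list of at most n recommended items.
    """
    scores = [
        f1_score(recommend(user, train_sets[user], n), test_sets[user])
        for user in test_sets
    ]
    return sum(scores) / len(scores)


# Toy stand-in recommender: the first n catalogue items the user has not bought.
def toy_recommender(user, train_items, n, catalogue=("a", "b", "c", "d", "e")):
    return [item for item in catalogue if item not in train_items][:n]


train_sets = {"u1": {"a"}, "u2": {"b", "c"}}
test_sets = {"u1": {"b", "z"}, "u2": {"a"}}

# The paper sweeps Top-N from 10 to 130; the toy catalogue only supports small n.
for n in (1, 2, 3):
    print(n, mean_f1(toy_recommender, train_sets, test_sets, n))
```

With the real data, the same loop would run over `range(10, 131, 10)` and produce one averaged F1 value per Top-N size for each method.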
Experimental results demonstrate that the accuracy of collaborative filtering based on semantic similarity (CF+SeSi) is higher than that of collaborative filtering based on Pearson correlation similarity (CF+PC). Further, the results show that the accuracy of content-based filtering based on semantic similarity (CBF+SeSi) is higher than that of content-based filtering based on cosine similarity (CBF+CS) (see Figures 2 and 3).
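For reference, the two baseline measures named above, cosine similarity and Pearson correlation, have standard textbook definitions, sketched here in Python. How the vectors are built from the purchase data is not described in this section, so the input vectors below are purely illustrative, and returning 0 for an all-zero vector is a convention chosen for this example.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Pearson correlation: cosine similarity of the mean-centred vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_similarity([a - mean_x for a in x], [b - mean_y for b in y])


u = [1, 2, 3, 4]
v = [2, 4, 6, 8]
w = [4, 3, 2, 1]
print(cosine_similarity(u, v))    # parallel vectors, so close to 1.0
print(pearson_correlation(u, w))  # perfectly decreasing relation, close to -1.0
```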
Figure 2. Comparison of the F1 metric between CBF based on cosine similarity and CBF based on semantic similarity
Figure 3. Comparison of the F1 metric between CF based on Pearson correlation and CF based on semantic similarity
7. CONCLUSION
In this paper, we proposed two new recommendation methods by incorporating semantic similarity into both the CF and CBF recommendation approaches. In the CF approach, to find a set of k nearest neighbours to the target user, users' profiles were formed based on an ontology, and the semantic similarity among users' profiles was then used. In the CBF approach, semantic similarity was used to find items similar to those purchased in the past by the target user. Using F1, one of the most widely used evaluation metrics, the two methods were then compared with CF based on Pearson correlation and CBF based on cosine similarity, respectively.
To evaluate how accurately the proposed methods work in recommender systems, we used the transactions (selling and buying) of a store with various products. In this study, the bills of a construction materials supplier were used. The dataset contained 2266 buyers, 2581 products, and 21662 sales invoices, and evaluations were made for values of Top-N from 10 to 130. Experimental results on this private building company dataset demonstrated that high accuracy is obtained in both CBF and CF by incorporating semantic similarity.
[Figure 2 plot: F1 measure (y-axis, 30 to 70) against Top-N (x-axis, 10 to 130); series CBF+CS and CBF+SeSi]
[Figure 3 plot: F1 measure (y-axis, 62 to 80) against Top-N (x-axis, 10 to 130); series CF+PC and CF+SeSi]
ACKNOWLEDGEMENTS
I would like to acknowledge the financial support from the Research University facilities of the Islamic Azad University of Yasooj. Thanks also go to the Research Management Center of the Islamic Azad University of Yasooj for providing an excellent research environment in which to complete this work.