TELKOMNIKA Indonesian Journal of Electrical Engineering
Vol.12, No.6, June 2014, pp. 4631 ~ 4638
DOI: 10.11591/telkomnika.v12i6.5445
Received December 29, 2013; Revised February 28, 2014; Accepted March 17, 2014
An Efficient System for Information Recommendation
Zhenhua Huang*, Qiang Fang
Department of Computer Science, Tongji University
1239 Siping Road, Shanghai, P.R. China
*Corresponding author, e-mail: shtj08.zhh@gmail.com
Abstract
A recommendation system is one of the most effective tools for tackling the problem of information overload. However, with the maturity of Web 2.0 and the emergence of massive information, existing information recommendation systems have serious drawbacks in real-time performance, robustness, and self-adaptability. Motivated by these facts, in this paper we design SIRSCA, an efficient semantic-driven information recommendation system under the cloud architecture. Specifically, the SIRSCA system mainly includes four modules: semantic representation of foundation data and user preference information; an indexing mechanism for massive semantic information under the cloud architecture; recommendation approaches based on semantic computation theory; and technologies for dynamic migration under the cloud architecture. We present extensive experiments which demonstrate that our system is both efficient and effective.
Keywords: cloud computing, recommendation system, semantics, dynamic migration
Copyright © 2014 Institute of Advanced Engineering and Science. All rights reserved.
1. Introduction
The number of servers and web pages connected to the Internet is increasing exponentially with the in-depth application of information and network technologies, and we must face the massive information produced by the rapid development of Internet technology. For example, there are millions of books on Dangdang, millions of films on Netflix, millions of newly arrived items on eBay, and over fifteen hundred million pages on the social network del.icio.us [1]. Information overload has appeared, and users cannot accurately find the items they are interested in. Information overload also reduces the economic benefit and market competitiveness of enterprises.
According to [2], information recommendation systems are one of the most effective tools for solving the problem of information overload. Such systems are not only a commercial marketing tool but can also effectively improve user stickiness. According to a McKinsey report, information recommendation systems account for 47% and 35% of product sales for eBay and Amazon, respectively [3].
Recently, researchers have focused on the discussion and design of information recommendation methods, because these methods are the core of information recommendation systems [4]. To the best of our knowledge, there are three main kinds of information recommendation methods: content-based methods [5-7], collaborative filtering methods [8-10], and hybrid methods [11-13]. Table 1 shows the three traditional recommendation methods used in typical information recommendation systems. These existing methods mainly focus on how to build a model of user interest in order to improve the precision of recommendation results. However, with the maturity of Web 2.0 and the emergence of massive information, existing information recommendation systems have at least three drawbacks: (1) since the information about users and products changes dynamically, they need to rebuild their models repeatedly; (2) due to the open nature of Web 2.0 networks, they are often attacked by malicious users, and their software modules often behave abnormally; (3) they build models according to current user preferences and do not consider the evolution of user preferences, which largely affects the quality of information recommendation and the effect of adaptive personalized recommendation.
To overcome the above drawbacks of existing information recommendation systems, in this paper we propose SIRSCA (Semantic-driven Information Recommendation System under Cloud Architecture), an efficient information recommendation system that is based on the underlying knowledge of data and user preference information and that introduces semantic computation into information recommendation systems. Meanwhile, to improve the real-time performance and robustness of information recommendation systems, we present an efficient system architecture based on the cloud computing platform, focusing on the index mechanism for massive semantic data and the technique for distributed migration. Moreover, we present extensive experiments which demonstrate that our system is both efficient and effective.
Table 1. Three Traditional Methods Used in Typical Information Recommendation Systems
Methods | Information recommendation systems
content-based recommendation | Personal Web Watcher, AdROSA, SIFT, LyricTime, News Weeder, Google Alerts, etc.
collaborative filtering recommendation | Amazon, eBay, CDNOW, GroupLens, Ringo, Video Recommendation, MovieLens, ACF, SERF, Connotea, Dangdang, etc.
hybrid recommendation | Fab, Daily Learner, CWAdvisor, Quickstep, Foxtrot, OARs, PBTango, Yoda, Open Bookmark, etc.
2. System Framework Overview
With the development of Web 2.0 technology and the emergence of massive information, existing information recommendation systems have serious drawbacks in real-time performance, robustness, and self-adaptability. To overcome these drawbacks, we design and develop SIRSCA, an efficient semantic-driven information recommendation system under the cloud architecture. The system framework of SIRSCA is shown in Figure 1.
Figure 1. The System Framework of SIRSCA
Our system mainly includes four modules:
Module 1: semantic representation of foundation data and user preference information. In this module, we first define and describe the formal semantic representation of foundation data, and then propose an ontology representation method for foundation data based on the parallel construction of an imaginary domain ontology. Meanwhile, we present the dominant and recessive semantic representations of user preference information, and then construct the characteristic ontology of user preferences. Furthermore, we analyse the time and space complexities of these semantic algorithms.
Module 2: indexing mechanism for massive semantic information under the cloud architecture. In this module, we first propose the HCS (Hierarchy Combined Surrogate) encoding of semantic information for information recommendation systems, and then design a two-stage distributed index structure based on the C2 (CAN and CHORD) hybrid routing protocol [9]. Moreover, to tackle the problem of overloaded network nodes, we present a strategy for index splitting and reconstruction.
Module 3: information recommendation based on semantic computing. In this module, we first design the EOAS ontology algebra system, and then propose information recommendation methods based on semantic computing between the ontology of foundation data and the characteristic ontology of user preferences. Furthermore, we present a social recommendation mechanism based on the characteristic ontology of user preferences obtained from social networks. Finally, we propose an efficient approach for mining the evolution chains of user preferences based on time-sequence computing operators.
Module 4: dynamic system migration under the cloud architecture. In this module, we first design a system model suited to dynamic migration, together with dynamic migration rules. Then we propose an efficient approach for the intelligent selection of target server clusters. Finally, we present system dynamic migration methods that run in polynomial time and guarantee migration security.
3. Specific Realization of Our SIRSCA System
In this section, we give the specific realization of our SIRSCA system.
3.1. Realization for Module 1
Endowing foundation data with machine-understandable semantics is the starting point for semantic-driven information recommendation technologies. The semantics of foundation data can be defined by the concepts, concept-to-concept relations, attributes, instances, and rules of the specific domain. In our SIRSCA system, the ontology of foundation data is defined as $O = (C, R, P, I, A)$, where $C$ denotes the set of concepts and terms of foundation data; $R$ is a multivariate mapping from $C \times C$ to Α, that is, $R$ is the set of relationships between concepts; $P$ is the attribute set of concept features; $I$ is the instance set of concepts; and $A$ is the set of rules.
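The five-tuple above can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the class name `Ontology`, its field layout, the helper methods, and the sample book-domain data are all hypothetical illustrations and are not taken from the SIRSCA implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Illustrative container for a foundation-data ontology O = (C, R, P, I, A)."""
    concepts: set = field(default_factory=set)      # C: concepts and terms
    relations: dict = field(default_factory=dict)   # R: (concept, concept) -> relation label
    attributes: dict = field(default_factory=dict)  # P: concept -> set of attribute names
    instances: dict = field(default_factory=dict)   # I: concept -> set of instances
    rules: list = field(default_factory=list)       # A: rules (kept as opaque objects here)

    def add_relation(self, source, target, label):
        # A relation is only meaningful between concepts that are already in C.
        if source not in self.concepts or target not in self.concepts:
            raise ValueError("both endpoints must be known concepts")
        self.relations[(source, target)] = label

    def related(self, concept):
        # Concepts reachable from `concept` through one relation.
        return {t for (s, t) in self.relations if s == concept}


# Hypothetical sample data for a book domain.
books = Ontology(concepts={"Book", "Novel", "Author"})
books.add_relation("Novel", "Book", "kind-of")
books.attributes["Book"] = {"title", "price"}
books.instances["Novel"] = {"Dream of the Red Chamber"}
```

Storing $R$ as a dictionary keyed by concept pairs mirrors its description as a mapping on $C \times C$; a real system would more likely use an ontology language and a reasoner.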
To improve intelligence and the knowledge reuse rate, our SIRSCA system constructs the ontology of foundation data in parallel using the imaginary domain ontology technology. This method takes the domain description documents written in the DODL language and applies operations from the evolution theory of biological populations (including selection, cloning, variation, crossover, composition, and transgenosis) to combine or delete them. In this way, it can hierarchically process the most fundamental foundation data initially stored in the system and dynamically construct the global ontology.
To construct the characteristic ontology of user preferences, our SIRSCA system borrows the idea of the NER (Named Entity Recognition [14]) process from the area of biological knowledge acquisition, and employs a two-stage module to express the semantics of user preference information. In the first stage of semantic expression, our SIRSCA system regards the specific user preference information as a document fragment and uses the Latent Semantic Index (LSI) [15] and Support Vector Machine (SVM) [16] technologies to effectively choose the concepts of the document fragment. This completes the dominant semantics extraction. In the second stage of semantic expression, our SIRSCA system exploits the existing foundation data ontology to find related concepts, relations, attributes, and instances. This completes the latent semantics extraction. Based on the dominant and latent semantics, our SIRSCA system automatically constructs the characteristic ontology of user preferences.
3.2. Realization for Module 2
In the cloud computing environment, the efficiency of information recommendation depends to a great extent on how massive semantic information is organized and accessed, and a distributed indexing mechanism is one of the most effective approaches for tackling this problem. Hence, in our SIRSCA system, we design the two-level distributed indexing structure C2-DISINX, which is based on the C2 (CAN and CHORD) hybrid routing protocol and uses this routing protocol to appoint specific server clusters for storing the corresponding local indexes. In this way, we can guarantee the scalability and efficiency of the distributed index.
Since the massive semantic information is updated frequently, we need to solve the maintenance problem of the distributed index. According to the theory of distributed databases, the main work of index maintenance is to process the splitting and merging of C2-DISINX index nodes. In our SIRSCA system, we propose an approximately optimal strategy to select the index nodes which need to be split and merged. Let W and S be the sets of global and local index nodes, respectively. Our approximately optimal strategy can then be described as follows. We first construct a weighted directed bipartite graph WDBG: we map W and S to the vertex set of WDBG, and map the routing cost between nodes, which is obtained by sample evaluation, to the edge set of WDBG. Then, based on shortest path theory, we introduce a virtual vertex to convert WDBG into the Steiner weighted path graph SWPG in constant time. Finally, we produce the Steiner tree [17] from SWPG in polynomial time, and obtain the approximately optimal solution for the index nodes which need to be split and merged. According to the theory of directed Steiner trees, the time complexity and the optimization lower bound can be adjusted and balanced by a parameter in [0, 1].
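The graph construction described above can be sketched in a few lines. The following is a hedged illustration only: it uses the simple shortest-path heuristic for directed Steiner trees (the union of cheapest root-to-terminal paths) as a stand-in, not the approximation algorithm of [17] and not the parameter-balanced procedure of SIRSCA. The function names, the virtual root label `"VIRTUAL"`, and the sample routing costs are assumptions made for this example.

```python
import heapq


def dijkstra(graph, source):
    """Single-source shortest paths on a dict-of-dicts weighted digraph."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def shortest_path_steiner(routing_cost, global_nodes, terminals, root="VIRTUAL"):
    """Heuristic Steiner tree: union of shortest paths from a virtual root.

    routing_cost: {global_node: {local_node: cost}} (edges of the bipartite graph)
    global_nodes: the set W; terminals: the local nodes in S that must be covered.
    """
    graph = {u: dict(nbrs) for u, nbrs in routing_cost.items()}
    # The virtual vertex reaches every global index node at zero cost.
    graph[root] = {w: 0.0 for w in global_nodes}
    dist, prev = dijkstra(graph, root)
    tree_edges = set()
    for t in terminals:
        if t not in dist:
            raise ValueError(f"terminal {t!r} is unreachable")
        node = t
        while node != root:  # walk back to the root, collecting edges
            parent = prev[node]
            tree_edges.add((parent, node))
            node = parent
    cost = sum(graph[u][v] for u, v in tree_edges)
    return tree_edges, cost


# Hypothetical routing costs between global (w*) and local (s*) index nodes.
W = {"w1", "w2"}
S = {"s1", "s2", "s3"}
routing_cost = {"w1": {"s1": 2.0, "s2": 5.0}, "w2": {"s2": 1.0, "s3": 4.0}}
edges, total = shortest_path_steiner(routing_cost, W, S)
```

In this toy instance each local node is attached to its cheapest global node, so `s2` is reached through `w2` rather than `w1`. How the resulting tree is translated into concrete split and merge decisions is not specified in the text and is left out here.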
In addition, the semantic information is usually massive, so loading index nodes and their data into memory directly would cause huge I/O overhead and memory consumption. To solve this problem, our SIRSCA system encodes the semantic information with the HCS encoding, which is based on the Hierarchy Combined Surrogate. The prominent advantage of HCS encoding is that it improves the efficiency of information recommendation by using a smaller and uniform number of bits to store more data.
3.3. Realization for Module 3
In our SIRSCA system, we use the ONION (ONtology CompositION [18]) ontology algebra theory for ontology algebra semantic computation. However, the three operations of ONION are set operations, which lack quantitative arithmetic capability. It is therefore necessary to improve and extend the ONION ontology algebra system, because the semantic similarity computation in our SIRSCA system relies on set operations, arithmetic operations, logic operations, and others. We thus propose EOAS (Extension Ontology Algebra System), which is defined by $\Sigma = (O, R, Op')$, where $\Sigma$ is the algebra over the ontology, $O$ is the concept set of the ontology, $R$ consists of the four relations (part-of, kind-of, attribute-of, and instance-of), and $Op'$ contains the set operations of intersection, union, and difference, to which we add arithmetic operators such as addition, subtraction, and multiplication, as well as logical operators.
Furthermore, our SIRSCA system proposes a set of sequential computing operators that realize the parallel, ordered, interrupt, recovery, and suspend behaviours of semantics in sequential semantic computing. At the same time, we define rules, such as the laws of the calculus, which ensure the correctness of temporal semantic computing.
The essence of information recommendation methods is to find the items which are similar to the description of user preferences and to recommend these items to the user. Based on this observation, we employ the following ideas.
The SIRSCA system takes the foundation data ontology and the user preference ontology as the input of the information recommendation methods, and uses the EOAS extension ontology algebra system for semantic computing between the foundation data ontology and the user preference ontology. The SIRSCA system then retains the items that are highly similar to the user preference ontology and discards the items that are less similar to it. In particular, to effectively integrate the social recommendation mechanism, the SIRSCA system first obtains from social networks the preference ontology related to the user, and performs semantic computing between this ontology and the user preference ontology. Then, the SIRSCA system obtains the final recommendation result by performing semantic computing between the resulting relevance preference ontology and the foundation data ontology.
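The retain-or-discard step described above can be illustrated with a deliberately simplified sketch. It assumes that each ontology is flattened to a set of concept names, that similarity is the Jaccard coefficient, and that social preferences are merged by set union; none of these choices comes from the paper, whose EOAS-based semantic computing is not specified in enough detail to reproduce. The function names, the threshold value, and the sample data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two concept sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(items, user_pref, social_pref=None, threshold=0.3):
    """Keep items whose concepts are similar enough to the user's profile.

    items: {item_name: set of concepts from the foundation-data ontology}
    user_pref: set of concepts from the user preference ontology
    social_pref: optional set of concepts obtained from social networks
    """
    # Assumption: the "relevance preference ontology" is the union of both sets.
    profile = set(user_pref) | set(social_pref or ())
    scored = [(name, jaccard(concepts, profile)) for name, concepts in items.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    # Most similar first; ties broken by name for a stable order.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical sample data.
items = {
    "i1": {"jazz", "piano"},
    "i2": {"rock", "guitar"},
    "i3": {"jazz", "guitar", "live"},
}
without_social = recommend(items, {"jazz", "piano"})
with_social = recommend(items, {"jazz", "piano"}, social_pref={"live"})
```

In this toy data, adding the social concept `live` lifts `i3` above the threshold, while `i2` is discarded in both cases.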
To efficiently achieve adaptive personalized recommendation, the SIRSCA system introduces the concept and technologies of the evolution chain of user preferences. The evolution chain of user preferences consists of the user preference ontologies at different time nodes; it records and tracks the changes of user preferences over different periods. By analyzing and mining the knowledge in this chain, the system can accurately predict and adjust the items the user is interested in.
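As a rough illustration of an evolution chain, the sketch below stores one concept set per time node, reports what changed between consecutive nodes, and weights concepts so that recent preferences count more. The exponential decay, the function names, and the sample chain are assumptions made for this example; the paper's time-sequence computing operators are not described in enough detail to reproduce.

```python
from collections import Counter


def changes(chain):
    """For each step in the chain, return (time_node, gained, lost) concept sets."""
    steps = []
    for (_, old), (t_new, new) in zip(chain, chain[1:]):
        steps.append((t_new, new - old, old - new))
    return steps


def predict_preferences(chain, decay=0.5):
    """Weight each concept by recency: the newest node counts 1, the one before `decay`, ...

    chain: list of (time_node, set of concepts), oldest first.
    """
    weights = Counter()
    for age, (_, concepts) in enumerate(reversed(chain)):
        for concept in concepts:
            weights[concept] += decay ** age
    return weights


# Hypothetical chain of user preference snapshots, oldest first.
chain = [
    ("2013-Q1", {"rock", "guitar"}),
    ("2013-Q2", {"rock", "jazz"}),
    ("2013-Q3", {"jazz", "piano"}),
]
drift = changes(chain)
weights = predict_preferences(chain)
top_two = weights.most_common(2)
```

Here the chain shows the user drifting from rock towards jazz, and the recency weighting ranks `jazz` first because it appears in the two most recent time nodes.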
3.4. Realization for Module 4
Important indicators for methods that migrate information recommendation systems between server clusters are the system outage probability, the downtime, and the function recovery time of software modules. Therefore, in our SIRSCA system, we need to solve two technical difficulties: (1) how to select the target server cluster when the system needs to be migrated; (2) how to perform the system migration effectively and safely.
For the first technical difficulty, our SIRSCA system uses the query optimizer of an existing distributed database management system to periodically collect metadata, including the C2-DISINX indexing mechanism, the value of the Hierarchy Combined Surrogate, the joint probability or density function of the underlying data, and the repeatability of user preference information. On this basis, the system chooses the node with the minimum cost as the target server cluster for migration.
For the second technical difficulty, our SIRSCA system completes the live migration of the information recommendation system from the source server cluster NSRC to the target server cluster NDST in three stages. In the first stage, our SIRSCA system creates snapshots of the metadata of the information recommendation system, migrates the snapshots to NDST, and lets the information recommendation algorithms run on NDST. At the same time, these recommendation algorithms are still running on NSRC, so the recommendation results on NDST lag behind those on NSRC. Hence, in the second stage, our SIRSCA system iteratively synchronizes the recommendation results of NDST and NSRC. In the third stage, NSRC stops running the information recommendation algorithms and copies the remaining differences in the recommendation results to NDST. In addition, we prove the correctness and effectiveness of the system migration method theoretically.
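The three stages can be walked through with a small in-memory simulation. This is only a sketch under stated assumptions: the `Cluster` class, the `live_migrate` function, the way concurrent updates are fed in as a list, and the sample data are all hypothetical, and no networking, failure handling, or security checks are modelled.

```python
import copy


class Cluster:
    """Toy stand-in for a server cluster running the recommendation algorithms."""

    def __init__(self, name):
        self.name = name
        self.metadata = {}
        self.results = {}
        self.running = False


def diff(src, dst):
    """Recommendation results present on src that dst lacks or holds stale."""
    return {k: v for k, v in src.results.items() if dst.results.get(k) != v}


def live_migrate(src, dst, updates_during_sync, max_rounds=10):
    # Stage 1: snapshot the metadata, ship it to dst, start the algorithms there.
    dst.metadata = copy.deepcopy(src.metadata)
    dst.running = True

    # Stage 2: src keeps serving, so synchronise the differences round by round.
    pending = list(updates_during_sync)
    for _ in range(max_rounds):
        dst.results.update(copy.deepcopy(diff(src, dst)))
        if not pending:
            break
        src.results.update(pending.pop(0))  # simulated new results on the source

    # Stage 3: stop the source and copy whatever still differs.
    src.running = False
    final_delta = diff(src, dst)
    dst.results.update(copy.deepcopy(final_delta))
    return len(final_delta)


# Hypothetical clusters and workload.
nsrc, ndst = Cluster("NSRC"), Cluster("NDST")
nsrc.running = True
nsrc.metadata = {"index": "C2-DISINX"}
nsrc.results = {"u1": ["a"]}
copied_in_stage3 = live_migrate(nsrc, ndst, [{"u2": ["b"]}, {"u1": ["a", "c"]}])
```

The value returned by `live_migrate` is the number of result entries that had to be copied while the source was stopped, which is a crude proxy for downtime: the more rounds stage 2 is allowed, the less is left for stage 3.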
4. Experimental Evaluation
This section presents an empirical study of our SIRSCA system using the synthetic benchmark datasets ITEMS and USERS. ITEMS is a set of items, each with 20 characteristic attributes, and USERS is a set of users, each with 10 preference attributes. We evaluate the efficiency and the scalability of our SIRSCA system.
In the first group of experiments, we fix the cardinality of USERS to $10^6$, and let the cardinality of ITEMS vary in the range $[1 \times 10^6, 9 \times 10^6]$. Figure 2 shows the experimental results for this group.
Figure 2. The First Group of Experiments (x-axis: the cardinality of ITEMS, 1M to 9M; y-axis: recommendation success rate, 0.0% to 100.0%; series: SIRSCA)
From Figure 2, we can observe that our SIRSCA system performs well. Specifically, the recommendation success rate of our SIRSCA system is greater than 80% in every case. For example, in Figure 2, when the cardinality of ITEMS equals $10^6$, the recommendation success rate of our SIRSCA system is 90.2%, and when the cardinality of ITEMS equals $9 \times 10^6$, the recommendation success rate is 91.5%.
In the second group of experiments, we fix the cardinality of ITEMS to $5 \times 10^6$, and let the cardinality of USERS vary in the range $[2 \times 10^5, 1 \times 10^6]$. Figure 3 shows the experimental results for this group.
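Figure 3. The Second Group of Experiments (x-axis: the cardinality of USERS, 0.2M to 1M; y-axis: recommendation success rate, 0.0% to 100.0%; series: SIRSCA)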
From Figure 3, we can observe that our SIRSCA system performs well. As in the first group of experiments, the recommendation success rate of our SIRSCA system is greater than 80% in every case. For example, in Figure 3, when the cardinality of USERS equals $2 \times 10^5$, the recommendation success rate of our SIRSCA system is 88.6%, and when the cardinality of USERS equals $1 \times 10^6$, the recommendation success rate is 92.8%.
In the third group of experiments, we let the cardinalities of ITEMS and USERS vary in the ranges $[1 \times 10^6, 9 \times 10^6]$ and $[2 \times 10^5, 1 \times 10^6]$, respectively. Figure 4 shows the experimental results for this group.
Figure 4. The Third Group of Experiments (x-axis: the cardinalities of ITEMS and USERS, 1M/0.2M to 9M/1M; y-axis: recommendation success rate, 0.0% to 100.0%; series: SIRSCA)
From Figure 4, we can observe that our SIRSCA system performs well. Unlike in the previous two groups of experiments, the recommendation success rate of our SIRSCA system is greater than 85% in every case in this group. For example, in Figure 4, when the cardinalities of ITEMS and USERS equal $1 \times 10^6$ and $2 \times 10^5$, respectively, the recommendation success rate of our SIRSCA system is 90.3%, and when the cardinalities of ITEMS and USERS equal $9 \times 10^6$ and $1 \times 10^6$, respectively, the recommendation success rate is 91.5%.
5. Conclusion
Information overload has appeared with the maturity of Web 2.0, and information recommendation systems play an important role in mining potential consumption tendencies and finding the items that users are interested in. In this paper, we design the SIRSCA system, a new-generation efficient information recommendation system based on the cloud computing platform architecture and on the semantic-driven connotation of foundation data and user preferences. Specifically, in our SIRSCA system, we propose four modules: semantic representation of foundation data and user preference information; an indexing mechanism for massive semantic information under the cloud architecture; recommendation approaches based on semantic computation theory; and technologies for dynamic migration under the cloud architecture. Our SIRSCA system changes the status quo in which existing information recommendation systems focus on the mathematical characteristics of data and ignore the underlying knowledge semantics of data, and it provides a novel theoretical and technical approach to information recommendation.
Acknowledgements
This work is supported by the New Century Excellent Talents in University (No. NCET-12-0413), the National Natural Science Foundation of China (No. 61272268), and the Fundamental Research Funds for the Central Universities (Tongji University).
References
[1] Beilin L. A Study of Personalized Recommendation Evaluation based on Customer Satisfaction in E-commerce. Proceedings of the International Conference on Computer Science and Service System (CSSS). Nanjing. 2011: 129-132.
[2] Kobayashi I, Saito M. A Study on an Information Recommendation System that Provides Topical Information Related to User's Inquiry for Information Retrieval. New Generation Computing. 2007; 26(1): 39-48.
[3] Porat AD. Mass Communication on Social Media: Strategy for Scaling up Personal Conversations. Journal of Digital & Social Media Marketing. 2013; 1(1): 74-81.
[4] Liu NH. Comparison of Content-based Music Recommendation using Different Distance Estimation Methods. Applied Intelligence. 2013; 38(2): 160-174.
[5] Debnath S, Ganguly N, Mitra P. Feature Weighting in Content based Recommendation System Using Social Network Analysis. Proceedings of the 17th International Conference on World Wide Web (WWW). Beijing. 2008: 1041-1042.
[6] Wartena C, Slakhorst W, Wibbels M. Selecting Keywords for Content based Recommendation. Proceedings of the 19th ACM international conference on Information and knowledge management (CIKM). Toronto. 2010: 1522-1536.
[7] Cantador I, Bellogín A, Vallet D. Content-based Recommendation in Social Tagging Systems. Proceedings of the fourth ACM conference on Recommender systems (RecSys). Barcelona. 2010: 237-240.
[8] Shi J, Long M, Liu Q, Ding G, Wang J. Twin Bridge Transfer Learning for Sparse Collaborative Filtering. Proceedings of the 17th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD). Gold Coast. 2013; 7818: 496-507.
[9] Wei S, Ye N, Zhang S, Huang X, Zhu J. Collaborative Filtering Recommendation Algorithm Based on Item Clustering and Global Similarity. Proceedings of the 5th International Conference on Business Intelligence and Financial Engineering (BIFE). Lanzhou. 2012: 69-72.
[10] Karatzoglou A, Amatriain X, Baltrunas L, Oliver N. Multiverse Recommendation: n-Dimensional Tensor Factorization for Context-aware Collaborative Filtering. Proceedings of the fourth ACM conference on Recommender systems (RecSys). Barcelona. 2010: 79-86.
[11] Burke R. Hybrid Recommender Systems: Survey and Experiments. User Modeling and User-Adapted Interaction. 2002; 2(4): 331-370.
[12] Esfahani MH, Alhan FK. New Hybrid Recommendation System based On C-Means Clustering Method. Proceedings of the 5th Conference on Information and Knowledge Technology (IKT). Shiraz. 2013: 145-149.
[13] Chen X, Liu X, Huang Z, Sun H. RegionKNN: A Scalable Hybrid Collaborative Filtering Algorithm for Personalized Web Service Recommendation. Proceedings of IEEE International Conference on Web Services (ICWS). Miami. 2010: 9-16.
[14] Nadeau D, Sekine S. A Survey of Named Entity Recognition and Classification. Lingvisticae Investigationes. 2007; 30(1): 3-26.
[15] Hofmann T. Probabilistic Latent Semantic Analysis. Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence (UAI). Stockholm. 1999: 289-296.
[16] Ha M, Wang C, Chen J. The Support Vector Machine based on Intuitionistic Fuzzy Number and Kernel Function. Soft Computing. 2013; 17(4): 635-641.
[17] Byrka J, Grandoni F, Rothvoss T, Sanità L. Steiner Tree Approximation via Iterative Randomized Rounding. Journal of the ACM. 2013; 60(1): 1-35.