International Journal of Electrical and Computer Engineering (IJECE)
Vol. 5, No. 5, October 2015, pp. 1153~1157
ISSN: 2088-8708
Journal homepage: http://iaesjournal.com/online/index.php/IJECE
A User-Based Recommendation with a Scalable Machine Learning Tool
Ch. Veena*, B. Vijaya Babu**
* CSE Dept, Bomma Institute of Engineering and Technology, India
** CSE Department, K L University, India
Article Info
Article history:
Received Feb 10, 2015
Revised Jun 26, 2015
Accepted Jul 12, 2015

ABSTRACT
Recommender systems have proven to be a valuable way of recommending information items like books, videos and songs to online users. Collaborative filtering methods make their predictions from historical data. In this paper we introduce Apache Mahout, an open source library that provides a rich set of components for constructing a customized recommender system from a selection of machine learning algorithms. This paper also focuses on addressing the challenges in collaborative filtering, such as scalability and data sparsity [1]. To deal with scalability problems, we use a distributed framework, Hadoop. We then present a customized user-based recommender system.
Keyword:
Recommendations
Distributed
Machine learning
MapReduce
Similarity
Copyright © 2015 Institute of Advanced Engineering and Science.
All rights reserved.
Corresponding Author:
Ch. Veena,
CSE Dept, Bomma Institute of Engineering and Technology, India
email: cvpchoudary12@gmail.com
1. INTRODUCTION
1.1 Recommender Systems
A recommender system plays a major role in internet technology for gathering and rating data. The most popular and widely used technique is collaborative filtering [1]. Even though there are four types of filtering techniques, we have focused on collaborative filtering, because collaborative filtering methods are able to collect and analyze large amounts of information on users' behavior, activities or preferences, and can also predict what users would like based on their similarities to others. To overcome the challenges in collaborative filtering, we present the collaborative filtering framework of Apache Mahout [2], a library for scalable data mining and machine learning. Mahout also provides algorithm implementations to compute recommendations in batch on a MapReduce cluster [3]; here we put our focus on the functionality it offers for developing single-machine user-based recommendation systems [4].
1.2 Collaborative Filtering
Collaborative filtering systems are broadly classified into two types:
User-Based Collaborative Filtering: User-based collaborative filtering works in two steps [5]:
1 Find the users who share the same rating patterns with the active user (the user whom the prediction is for).
2 Use the ratings from those like-minded users found in step 1 to calculate a prediction for the active user.
Collaborative filtering can also be based on implicit observations of normal user behavior, which is different from explicit feedback like ratings. These systems observe what a user has done and match it with what all users have done (what music they have listened to, what
items they have bought) and use that data to predict the user's behavior in the future, or to predict how a user might like to behave given the chance.
Item-based collaborative filtering: Builds an item-item matrix determining relationships between pairs of items, and estimates the tastes of the current user by examining the matrix and matching that user's data. Relying only on a scoring or rating system that is averaged across all users ignores the specific demands of an individual user, and is particularly poor in tasks where there is large variation in interest.
1.3 MapReduce Framework
Hadoop MapReduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte data-sets) in parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner [6]. A MapReduce job usually splits the input data-set into independent chunks which are processed by the map tasks in a completely parallel manner. The framework sorts the outputs of the maps, which are then input to the reduce tasks. Typically both the input and the output of the job are stored in a file-system. The framework takes care of scheduling tasks, monitoring them and re-executing the failed tasks. The MapReduce framework operates exclusively on <key, value> pairs, that is, the framework views the input to the job as a set of <key, value> pairs and produces a set of <key, value> pairs as the output of the job, conceivably of different types.
2. COMPUTATIONAL MODEL
Memory-based user rating data is used in computing similarity between users or items and also for making recommendations. Typical examples of this mechanism are (a) neighborhood-based CF and (b) item-based/user-based top-N recommendations.
Here, U denotes the set of the top N users that are most similar to user u and who rated item i. Some examples of the aggregation function are defined in terms of a normalizing factor k and of the average rating of user u over all the items rated by that user.
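The equations this passage refers to appear to have been lost from this copy. Assuming the authors follow the standard memory-based formulation (an assumption: the symbols r_{u,i}, simil and \bar{r}_u below are the conventional ones, not recovered from the source), the prediction for user u on item i and three common aggregation functions can be written as:

```latex
\begin{align*}
r_{u,i} &= \operatorname{aggr}_{u' \in U} \, r_{u',i} \\[4pt]
r_{u,i} &= \frac{1}{N} \sum_{u' \in U} r_{u',i} \\
r_{u,i} &= k \sum_{u' \in U} \operatorname{simil}(u,u') \, r_{u',i} \\
r_{u,i} &= \bar{r}_u + k \sum_{u' \in U} \operatorname{simil}(u,u') \, \bigl(r_{u',i} - \bar{r}_{u'}\bigr) \\[4pt]
k &= \frac{1}{\sum_{u' \in U} \lvert \operatorname{simil}(u,u') \rvert}
\end{align*}
```

The last line is the usual choice for the normalizing factor k; it keeps the weighted sums on the same scale as the original ratings.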
(a) The neighborhood-based algorithm calculates the similarity between two users or items and generates a prediction for the user by taking the weighted average of all the ratings. Multiple mechanisms, such as Pearson correlation and vector cosine-based similarity, are used for the similarity computation between items or users, which is an important part of this approach [7].
The Pearson correlation similarity of two users x, y is defined over I_xy, the set of items rated by both user x and user y.
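The Pearson formula itself is missing from this copy. A likely reconstruction, assuming the standard definition over the co-rated items I_xy (r_{x,i} is the rating of user x for item i and \bar{r}_x the mean rating of user x; both symbols are assumed notation), is:

```latex
\[
\operatorname{simil}(x,y) =
  \frac{\sum_{i \in I_{xy}} (r_{x,i} - \bar{r}_x)(r_{y,i} - \bar{r}_y)}
       {\sqrt{\sum_{i \in I_{xy}} (r_{x,i} - \bar{r}_x)^2}\;
        \sqrt{\sum_{i \in I_{xy}} (r_{y,i} - \bar{r}_y)^2}}
\]
```

The value lies between -1 and 1; subtracting each user's mean compensates for users who rate systematically high or low.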
The cosine-based approach defines the cosine similarity between two users x and y as the cosine of the angle between their rating vectors.
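The cosine formula is also missing here. Under the same assumed notation (I_x and I_y are the sets of items rated by x and by y, and the vectors are the users' rating vectors), the standard form would be:

```latex
\[
\operatorname{simil}(x,y) = \cos(\vec{x},\vec{y})
  = \frac{\vec{x} \cdot \vec{y}}{\lVert \vec{x} \rVert \, \lVert \vec{y} \rVert}
  = \frac{\sum_{i \in I_{xy}} r_{x,i} \, r_{y,i}}
         {\sqrt{\sum_{i \in I_x} r_{x,i}^2}\; \sqrt{\sum_{i \in I_y} r_{y,i}^2}}
\]
```

Unlike the Pearson correlation, this measure does not subtract the users' mean ratings.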
(b) The user-based top-N recommendation algorithm identifies the k most similar users to an active user using a similarity-based vector model. After the k most similar users are found, their corresponding user-item matrices are aggregated to identify the set of items to be recommended.
In this work, we present Mahout's flexible collaborative filtering framework, which offers a broad range of algorithm implementations, and use its API in Eclipse to implement item-based and user-based recommendations.
3. PROBLEM STATEMENT
Let A be a |U| × |I| matrix holding all known interactions between a set of users U and a set of items I. The history of item interactions of a user u is represented by a_u•, the u-th row of A [8]. The top-N recommendations for user u correspond to the first N items selected from a ranking r_u of all items, based on how strongly they would be preferred by the user. This ranking is derived from patterns found in A.
3.1 Sequential Approach for Computing User Cooccurrences
For a pairwise comparison between users, a dot product of rows of A gives the number of items that the corresponding users have in common. First, a search for other users with similar taste is to be conducted [9]:
r_u = A^T (A a_u•)
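To make the ranking formula concrete, here is a minimal sketch in Java that evaluates r_u = A^T (A a_u•) on a small dense binary matrix. The class name `UserBasedRanking` and the dense `int[][]` representation are illustrative assumptions, not part of the paper or of Mahout; a real interaction matrix would be stored sparsely.

```java
/** Sketch: score items for one user via r_u = A^T (A a_u) on a dense binary matrix. */
final class UserBasedRanking {

    /** a[u][i] is 1 if user u interacted with item i, else 0. Returns one score per item. */
    static int[] rank(int[][] a, int user) {
        int users = a.length;
        int items = a[0].length;

        // Step 1: w = A a_u, the number of items each user shares with the active user.
        int[] overlap = new int[users];
        for (int v = 0; v < users; v++) {
            for (int i = 0; i < items; i++) {
                overlap[v] += a[v][i] * a[user][i];
            }
        }

        // Step 2: r_u = A^T w, every user votes for their own items, weighted by overlap.
        int[] scores = new int[items];
        for (int v = 0; v < users; v++) {
            for (int i = 0; i < items; i++) {
                scores[i] += a[v][i] * overlap[v];
            }
        }
        return scores;
    }
}
```

For the matrix {{1,1,0},{1,1,1},{0,0,1}} and user 0, the scores are 4, 4 and 2. The active user is included in the sum, and items the user already interacted with are scored too; a practical recommender would filter both before taking the top N.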
Algorithm 1: the standard sequential approach for computing the user similarity matrix S = A A^T [10].
foreach user u do
    foreach item i interacted with by user u do
        foreach user v that also interacted with item i do
            s_uv = s_uv + 1
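The three nested loops of Algorithm 1 can be sketched in Java as follows. The class name `SequentialCooccurrence`, the list-of-lists input and the item-to-users index are assumptions made for illustration; the sketch also skips the diagonal (v = u) and assumes each user's item list has no duplicates, neither of which the pseudocode states.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of Algorithm 1: sequential counting of user cooccurrences. */
final class SequentialCooccurrence {

    /**
     * itemsByUser.get(u) lists the items user u interacted with.
     * Returns s where s[u][v] is the number of items shared by users u and v.
     */
    static int[][] count(List<List<Integer>> itemsByUser) {
        int users = itemsByUser.size();

        // Inverted index item -> users, needed to find "every user v
        // that also interacted with item i" in the innermost loop.
        Map<Integer, List<Integer>> usersByItem = new HashMap<>();
        for (int u = 0; u < users; u++) {
            for (int item : itemsByUser.get(u)) {
                usersByItem.computeIfAbsent(item, k -> new ArrayList<>()).add(u);
            }
        }

        int[][] s = new int[users][users];
        for (int u = 0; u < users; u++) {                  // foreach user u
            for (int item : itemsByUser.get(u)) {          // foreach item i of user u
                for (int v : usersByItem.get(item)) {      // foreach user v of item i
                    if (v != u) {
                        s[u][v] += 1;                      // s_uv = s_uv + 1
                    }
                }
            }
        }
        return s;
    }
}
```

The lookups into both `itemsByUser` and `usersByItem` are exactly the random accesses to rows and columns of A that make this form awkward to distribute.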
Counting user cooccurrences in MapReduce
Let us start by optimizing our algorithmic framework with distributed cooccurrence counting. We take a simple model with binary data, i.e. (yes = 1, no = 0). If we want to implement Algorithm 1 in a distributed framework like MapReduce and share the work among several nodes, the common issue is that the algorithm requires random access to both users and items. The complexity of the user-based approach is quadratic in the number of users, because each user has to be compared with every other user. Therefore we need to parallelize Algorithm 1 so that the work is spread in proportion to the number of machines in the MapReduce cluster.
A standard algorithm does not suit this distributed framework, because it needs random access to the rows and columns of A in the inner loops of Algorithm 1. Here comes the advantage of Mahout, whose algorithms are built to scale on a distributed framework like Hadoop [11].
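Writing a_•i for the i-th column of A (the users who interacted with item i), the similarity matrix can be rewritten as a sum over the item columns:
S = A A^T = \sum_{i=1}^{|I|} a_{\bullet i} \, a_{\bullet i}^T, with entries s_{uv} = \sum_{i=1}^{|I|} a_{u,i} \, a_{v,i}
We get the outer product formulation of the matrix multiplication. We partition A by columns (the items) and store it in the distributed file system [3].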
Mapper: this function reads a single column of A, computes the column's outer product and sends it to the reducer.
Reducer: this function simply sums up the individual counts of the mapper and consolidates a row of the similarity matrix S per invocation.
Here we also make use of this distributed approach to address the data sparsity problem of collaborative filtering. The interaction matrix A is usually very sparse and contains only a small fraction of cells with non-zero elements. This limits the number of cooccurring user pairs to a small fraction of all possible pairs. So the map function, which returns the intermediate outer product matrices, is formulated in a way that it returns only non-zero entries.
Combiner: the intermediate results of the mapper are combined locally, and only the consolidated data is sent over the network to the reducer by the combiner function. This reduces network overhead.
Algorithm 2: counting user co-occurrences
Function map(a_•i):
    foreach u ∈ a_•i do
        c ← sparse_vector()
        foreach v ∈ a_•i with v > u do
            c[v] ← 1
        emit(u; c)
Function combine(u, c_1, …, c_n):
    c ← vector_add(c_1 … c_n)
    emit(u; c)
Function reduce(u, c_1, …, c_n):
    s ← vector_add(c_1 … c_n)
    emit(u; s)
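The data flow of Algorithm 2 can be imitated in a single JVM, without Hadoop, to check the logic. The sketch below is only such a simulation: the class name `CooccurrenceMapReduce`, the use of `TreeMap` as a sparse vector and the `run` driver that stands in for the framework's grouping step are all assumptions for illustration, not Mahout or Hadoop code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory simulation of the map / combine / reduce steps of Algorithm 2. */
final class CooccurrenceMapReduce {

    /** map: for one item column, emit (u, sparse vector holding 1 for each v > u). */
    static Map<Integer, Map<Integer, Integer>> map(List<Integer> usersOfItem) {
        Map<Integer, Map<Integer, Integer>> emitted = new TreeMap<>();
        for (int u : usersOfItem) {
            Map<Integer, Integer> c = new TreeMap<>();
            for (int v : usersOfItem) {
                if (v > u) {
                    c.put(v, 1);   // only non-zero entries are stored
                }
            }
            emitted.put(u, c);
        }
        return emitted;
    }

    /** combine and reduce share this body: element-wise sum of sparse vectors. */
    static Map<Integer, Integer> vectorAdd(List<Map<Integer, Integer>> vectors) {
        Map<Integer, Integer> sum = new TreeMap<>();
        for (Map<Integer, Integer> vector : vectors) {
            vector.forEach((v, count) -> sum.merge(v, count, Integer::sum));
        }
        return sum;
    }

    /** Stand-in for the framework: map every column, group by key u, then reduce. */
    static Map<Integer, Map<Integer, Integer>> run(List<List<Integer>> columns) {
        Map<Integer, List<Map<Integer, Integer>>> grouped = new TreeMap<>();
        for (List<Integer> column : columns) {
            map(column).forEach((u, c) ->
                    grouped.computeIfAbsent(u, k -> new ArrayList<>()).add(c));
        }
        Map<Integer, Map<Integer, Integer>> s = new TreeMap<>();
        grouped.forEach((u, vectors) -> s.put(u, vectorAdd(vectors)));
        return s;
    }
}
```

With the three item columns [0, 1], [0, 1] and [1, 2], user 0 ends up with the row {1: 2} and user 1 with {2: 1}. Because only pairs with v > u are emitted, the result is the upper triangle of S, which halves the data that has to be shuffled.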
4. RELATED WORK
Measuring distances and similarities:
Machine learning algorithms like K-Nearest-Neighbor and clustering use similarity or distance measures between two data points to find how similar they are [12].
Distance between categorical data points
We can calculate a ratio to determine how similar two data points are using the simple matching coefficient: noOfMatchAttributes / noOfAttributes. To measure the number of attributes that have to be changed for two data points to match each other, the Hamming distance is used [13].
Given that most users only see a very small portion of all movies, it does not indicate any similarity between two users if neither of them has seen a movie, i.e. both values are zero. On the other hand, if both users saw the same movie (both values are one), it implies a lot of similarity between the users. This leads to the Jaccard similarity:
noOfOnesBoth / (noOfOnesInA + noOfOnesInB - noOfOnesBoth)
Besides matching or not, if the category is structured as a tree hierarchy, then the distance between two categories can be calculated by measuring the path length to their common parent.
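The three measures for binary attribute vectors described above (simple matching coefficient, Hamming distance and Jaccard similarity) can be sketched in a few lines of Java. The class name `BinarySimilarity` is hypothetical, the vectors are assumed to have equal length, and returning 0 for two all-zero vectors in `jaccard` is a convention chosen here, not something the paper specifies.

```java
/** Sketch of similarity measures on equal-length 0/1 attribute vectors. */
final class BinarySimilarity {

    /** Hamming distance: number of positions that must change for a to equal b. */
    static int hamming(int[] a, int[] b) {
        int differing = 0;
        for (int k = 0; k < a.length; k++) {
            if (a[k] != b[k]) {
                differing++;
            }
        }
        return differing;
    }

    /** Simple matching coefficient: matching attributes / all attributes. */
    static double simpleMatching(int[] a, int[] b) {
        return (double) (a.length - hamming(a, b)) / a.length;
    }

    /** Jaccard similarity: onesInBoth / (onesInA + onesInB - onesInBoth). */
    static double jaccard(int[] a, int[] b) {
        int onesInA = 0;
        int onesInB = 0;
        int onesInBoth = 0;
        for (int k = 0; k < a.length; k++) {
            if (a[k] == 1) onesInA++;
            if (b[k] == 1) onesInB++;
            if (a[k] == 1 && b[k] == 1) onesInBoth++;
        }
        int union = onesInA + onesInB - onesInBoth;
        return union == 0 ? 0.0 : (double) onesInBoth / union;
    }
}
```

For a = {1,1,0,0,0} and b = {1,0,0,0,0} the simple matching coefficient is 0.8 while the Jaccard similarity is 0.5: the three shared zeros inflate the former but are ignored by the latter, which is the point made about unseen movies.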
Similarity between instances containing mixed types of attributes
If a data point contains mixed types of attributes, we can calculate the similarity of each attribute (or of each group of attributes of the same type) and then combine them using a weighted average:
combined_similarity(x, y) = Σ_k [w_k * δ_k * similarity(x_k, y_k)] / Σ_k (δ_k), where Σ_k (w_k) = 1
5. METHODOLOGY
Mahout-based collaborative filtering takes users' preferences from a small subset of data and predicts future preferences from past ones [2]. Preferences are expressed through implicit and explicit ratings of the products, e.g. purchasing a book, reading a news article or rating a product with stars [4].
Creating a user-based recommender API
In this approach we compute recommendations for particular users: we look for other users with a similar taste and pick the recommendations from their items. Mahout reads a text file derived from the user/item matrix, in which each entry is identified by a userID and an itemID, and a value denotes the strength of the interaction (e.g. the rating given to a movie).
Loading the data from a text file into the Mahout DataModel interface:
    DataModel M = new FileDataModel(new File("/home/user/desktop/music.csv"));
To compute the correlation between the users' interactions we use the Pearson correlation coefficient:
    UserSimilarity siml = new PearsonCorrelationSimilarity(M);
The recommender is then built from a neighborhood of the most similar users (the neighborhood size of 10 is only an illustrative value):
    UserNeighborhood neighborhood = new NearestNUserNeighborhood(10, siml, M);
    Recommender recommender = new GenericUserBasedRecommender(M, neighborhood, siml);
Top-N recommendations and rating predictions are obtained using:
    List<RecommendedItem> topItems = recommender.recommend(userID, 10);
    float preference = recommender.estimatePreference(userID, itemID);
Evaluation
Provide a statement that what is expected, as stated in the "Introduction" chapter, can ultimately result in the "Results and Discussion" chapter, so there is compatibility. Moreover, it can also be added the
prospect of the development of research results and application prospects of further studies into the next (based on result and discussion).
6. CONCLUSION AND FUTURE WORKS
In this paper we have presented the importance and the challenges of collaborative filtering. We reviewed the statistical methods, data mining and machine learning algorithms used to build a recommender system. We demonstrated how versatile Mahout is for performing user-based or item-based recommendations. We showed how a scalable similarity-based recommender system on a MapReduce framework overcomes the scalability challenge. We have also presented the Mahout evaluator tool and how it measures the quality of prediction. Most of the algorithms of Mahout are already implemented in a way that they are scalable across Hadoop, but there are still a few challenges, like sparsity and cold-start. Moreover, some of the algorithms that Mahout implements, like stochastic gradient descent and support vector machines, cannot be parallelized over Hadoop. In our future work we will focus on the above issues and on how our algorithms can solve these challenges and scale on large parallel and distributed networks.
REFERENCES
[1] "A survey on recommender systems based on collaborative filtering techniques", International Journal of Innovation in Engineering and Technology, volume 2, issue 2, April 2013.
[2] Sebastian Schelter and Sean Owen, "Collaborative Filtering with Apache Mahout", Technische Universität Berlin, Germany (ssc@apache.org) and Myrrix Ltd (srowen@apache.org).
[3] Sebastian Schelter, Christoph Boden, and Volker Markl, "Scalable Similarity-Based Neighborhood Methods with MapReduce", Technische Universität Berlin, Germany, firstname.lastname@tu-berlin.de.
[4] Sebastian Schelter, "Distributed Itembased Collaborative Filtering with Apache Mahout", ssc@apache.org, twitter.com/sscdotopen, 7 October 2010.
[5] "Collaborative filtering", Wikipedia, the free encyclopedia.
[6] "apache.org".
[7] "Item-based collaborative filtering recommendation algorithms", WWW, pp. 285-295, 2001.
[8] Y. Koren, R. Bell, and C. Volinsky, "Matrix Factorization Techniques for Recommender Systems", Computer, 42:30-37, 2009.
[9] J. Jiang, J. Lu, G. Zhang, and G. Long, "Scaling-up item-based collaborative filtering recommendation algorithm based on hadoop", SERVICES, pp. 490-497, 2011.
[10] G. Linden, B. Smith, and J. York, "Amazon.com recommendations: item-to-item collaborative filtering", Internet Computing, IEEE, 7(1): 76-80, 2003.
[11] "Inverted indexing in Big Data using HADOOP", International Journal of Advanced Computer Science and Applications, volume 4, issue 11, 2013.
[12] "Pragmatic Programming Techniques", rickyro, August 8, 2012.
[13] "Incremental learning for dynamic collaborative filtering", International Journal of Advanced Computer Science and Applications, volume 6, issue 6, June 2011.
BIOGRAPHIES OF AUTHORS
Mrs. Ch. Veena, B.Tech, M.Tech, (PhD), is working as Asst. Prof in the CSE department of Bomma Institute of Technology and Science, KHAMMAM. She is pursuing a Ph.D in the CSE department of K L University, Vaddeswaram, Guntur (Dt), Andhra Pradesh, INDIA.
Dr B. Vijaya Babu is presently working as Professor in the CSE department of K L University, Vaddeswaram, Guntur (Dt), Andhra Pradesh, INDIA. He obtained his B.Tech. (ECE) degree from JNTU College of Engineering, KAKINADA, his M.Tech. (CSE) degree from JNTU College of Engineering, Anantapur, and his PhD degree from Andhra University, Visakhapatnam. He has published several research papers in various international journals and attended international conferences conducted in India.