International Journal of Electrical and Computer Engineering (IJECE)
Vol. 5, No. 4, August 2015, pp. 821~831
ISSN: 2088-8708
Journal homepage: http://iaesjournal.com/online/index.php/IJECE
A Study on Efficient Design of a Multimedia Conversion Module in PESMS for Social Media Services
Jongjin Jung¹, Myoungjin Kim², Hanku Lee³
¹ Korea Electronics Technology Institute, Center for Social Media Cloud Computing, Konkuk University, South Korea
²,³ Center for Social Media Cloud Computing, Konkuk University, South Korea
Email: mozzalt@keti.re.kr¹, tough105@konkuk.ac.kr², hlee@konkuk.ac.kr³
Article Info
Article history:
Received Jun 12, 2015
Revised Aug 20, 2015
Accepted Aug 26, 2015

ABSTRACT
The main contribution of this paper is to present the Platform-as-a-Service (PaaS) Environment for Social Multimedia Service (PESMS), derived from the Social Media Cloud Computing Service Environment. The main role of our PESMS is to support the development of social networking services that include audio, image, and video formats. In this paper, we focus in particular on the design and implementation of PESMS, including the transcoding function for processing large amounts of social media in a parallel and distributed manner. PESMS is designed to improve the quality and speed of multimedia conversions by incorporating a multimedia conversion module based on Hadoop, consisting of Hadoop Distributed File System for storing large quantities of social data and MapReduce for distributed parallel processing of these data. In this way, our PESMS has the prospect of exponentially reducing the encoding time for transcoding large numbers of image files into specific formats. To test system performance for the transcoding function, we measured the image transcoding time under a variety of experimental conditions. Based on experiments performed on a 28-node cluster, we found that our system delivered excellent performance in the image transcoding function.
Keyword:
PESMS
Image Conversion
MapReduce
Hadoop
Media Transcoding
Copyright © 2015 Institute of Advanced Engineering and Science. All rights reserved.
Corresponding Author:
Jongjin Jung and Hanku Lee
Smart Media Research Center, Korea Electronics Technology Institute, and Center for Social Media Cloud Computing, Konkuk University
JongJin, Jung
Seoul, South Korea
Email: mozzalt@keti.re.kr
1. INTRODUCTION
Today, nearly everyone uses smart devices such as smart phones, tablet computers, and smart TVs. In addition, they use social networking service (SNS) applications that enable them to take high quality pictures, record videos, and upload and download them. These images and videos have high resolution; therefore, the files are quite large. Users also want seamless, real-time service; furthermore, they want all multimedia services to be independent of the platform operating system (OS), the display device resolution, and so on. In order to meet these requirements, the development environments of SNSs need to be supported with larger storage systems, a larger database system, more elastic computing resources, and data processing techniques for numerous multimedia devices.

Service providers (SPs) must provide increasingly massive storage spaces to support the volume of social media created daily by users. SPs must also install more powerful hardware, only to have to change it again some time later. To handle the range of user device types, SPs must have all kinds of multimedia. However, this may not be the best solution.
This paper proposes the Platform-as-a-Service (PaaS) Environment for Social Multimedia Service (PESMS), which was derived as a PaaS model of the Social Media Cloud Computing Service Environment (SMCCSE) [1], [2]. The main objective of PESMS is to support development environments for developing or implementing social media services [3], [4], [5], [6] using cloud computing techniques and elastic computing resources in a cloud computing environment. In this study, we focused on the implementation of PESMS. We designed and implemented partially functional social media modules for converting images into JPEG, GIF, PNG, BMP, TIFF, and other formats and modules for resizing images using PESMS.
2. RELATED WORKS
Cloud computing: Cloud computing technology has already become a popular service for efficient system maintenance, thin-client devices, powerful software services and other applications. According to the National Institute of Standards and Technology (NIST), cloud computing is defined as “a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction” [7]. Thus, the developer, the system manager, or a common user can easily use the platform or service without the need for complex installation and associated major constraints of time, place, and occasion.
Social-media-service model using cloud computing: Social media-based social networking services (SNSs) have also become popular. Most people use one or more social service applications, and they want elastic social media service. People share their news, photographs, and multimedia with social media services, and they want to enjoy high-quality multimedia and high-speed service. Furthermore, because people have different devices, several versions are needed, even though the content is the same. More specifically, there may be different versions of the same content that differ in resolution, quality (e.g., thumbnail or mobile phone version), or codec. For example, it is known that Facebook had 70 billion images in 2012. To find the most appropriate content and deliver it to the user (or more precisely, the user’s device), accurate information about each multimedia object is very important. To obtain this information, social multimedia retrieval tools (to extract features such as color, texture, and histogram) could be required. To address these issues, the system or platform for a social media service must have more powerful computing equipment. However, some time later, as more and more people use social media services, more multimedia data will be generated, thus necessitating more storage, larger database systems, and much speedier networks. As a result, the trend is for many service providers to implement social media services with cloud computing.
Hadoop: Hadoop, developed by Apache, is an open source software project. It enables distributed processing of large data sets across clusters of commodity servers [8]. It is designed to scale up from a single server to thousands of machines, with very high degree of fault tolerance. Rather than relying on high-end hardware, the resiliency of these clusters comes from the software’s ability to detect and handle failures at the application layer. Thus, it is useful for data-intensive distributed applications, which are capable of handling thousands of nodes and petabytes of data. Hadoop changes the economics and the dynamics of large-scale computing. Its impact can be summarized in four notable characteristics, listed in Table 1. These four characteristics enable computing solutions that are scalable, cost effective, flexible, and fault tolerant.
Table 1. Characteristics of Hadoop
Characteristic | Description
Scalability | A cluster can be expanded by adding new servers or resources without having to move, reformat, or change the dependent analytic workflows or applications.
Cost-effectiveness | Hadoop brings massively parallel computing to commodity servers. The result is a sizeable decrease in the cost per terabyte of storage, which in turn makes it affordable to model all your data.
Flexibility | Hadoop is schema-less and can absorb any type of data, structured or not, from any number of sources. Data from multiple sources can be joined and aggregated in arbitrary ways enabling deeper analyses than any one system can provide.
Fault tolerance | When you lose a node, the system redirects work to another location of the data and continues processing without missing a beat.
Hadoop has four main components: Hadoop Common, Hadoop Distributed File System (HDFS), MapReduce, and “Yet Another Resource Negotiator” (YARN). These are explained in Table 2. Two of these components, Hadoop Distributed File System (HDFS) and MapReduce, are utilized in PESMS. As mentioned previously, Hadoop has scalability. MapReduce is the heart of Hadoop, and it is this programming paradigm that allows for massive scalability across hundreds or thousands of servers in a Hadoop cluster. The MapReduce framework provides a specific programming model and a runtime system for processing and creating large data sets that are suitable for various real-world tasks [9].
This framework also handles automatic scheduling, communication, and synchronization during the processing of huge data sets, and it has a fault tolerance capacity. The MapReduce programming model is executed in two main steps, called mapping and reducing, performed by the mapper and reducer functions, respectively. Each phase requires a list of key and value pairs as the input and output.
In the mapping step, MapReduce receives the input data set and feeds each data element to the mapper in the form of key and value pairs. In the reducing step, all of the outputs from the mapper are processed, and the final result is generated by the reducer using the merging process. The HDFS component further contributes to Hadoop’s scalability. All data in Hadoop are configured with HDFS; data in a Hadoop cluster are broken down into smaller pieces (called blocks) and distributed throughout the cluster. In this way, the mapping and reducing functions can be executed on smaller subsets of the user’s larger data sets, and this provides the scalability that is needed for processing of big data.
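The two-step model described above can be illustrated with a minimal single-process sketch in Python. This is not Hadoop code and it is not taken from PESMS: the function names (`run_map_reduce`, `count_format_mapper`, `sum_reducer`) and the sample records are assumptions made purely for illustration. The sketch only shows how key/value pairs flow from the mapper, through grouping by key, to the reducer.

```python
from collections import defaultdict


def run_map_reduce(records, mapper, reducer):
    """Feed each (key, value) record to the mapper, group the emitted
    pairs by key, then let the reducer merge each group."""
    grouped = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            grouped[out_key].append(out_value)
    return {key: reducer(key, values) for key, values in sorted(grouped.items())}


def count_format_mapper(file_name, size_mb):
    """Hypothetical mapper: emit (extension, 1) for each image file.
    The size value is not used in this example."""
    extension = file_name.rsplit(".", 1)[-1].lower()
    yield extension, 1


def sum_reducer(key, values):
    """Hypothetical reducer: merge all counts emitted for one key."""
    return sum(values)


if __name__ == "__main__":
    # Illustrative records only: (file name, size in MB).
    sample = [("a.jpg", 19.8), ("b.JPG", 20.1), ("c.png", 3.2)]
    print(run_map_reduce(sample, count_format_mapper, sum_reducer))
```

In a real cluster the records would be read from HDFS blocks and the mapper calls would run on different nodes; here everything runs in one loop so that the data flow is easy to follow.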
Table 2. The Four Core Components of Hadoop Architecture
Core Component | Description
Hadoop Common | A module containing the utilities that support the other Hadoop components.
Hadoop Distributed File System (HDFS) | A file system that provides reliable data storage and access across all the nodes in a Hadoop cluster. It links together the file systems on many local nodes to create a single file system.
MapReduce | A framework for writing applications that process large amounts of structured and unstructured data in parallel across a cluster of thousands of machines, in a reliable, fault-tolerant manner.
Yet Another Resource Negotiator (YARN) | The next-generation MapReduce, which assigns CPU, memory and storage to applications running on a Hadoop cluster. It enables application frameworks other than MapReduce to run on Hadoop, opening up a wealth of possibilities.
3. SMCCSE (SOCIAL MEDIA CLOUD COMPUTING SERVICE ENVIRONMENT)
SMCCSE is a development environment with which an SP can easily develop or implement cloud-based social media services [5] [10] [11]. It provides an environment for supporting the development of SNSs, addressing numerous SNSs and providing more effective algorithms for processing high volumes of social media data and a set of mechanisms to manage the entire infrastructure [1], [2]. It is composed of four layers: the social media applications layer, the social service software library layer, the distributed social data processing layer, and the cloud virtualization platform layer. Figure 1 shows the SMCCSE architecture.
Figure 1. The architecture of the Social Media Cloud Computing Service Environment (SMCCSE)
In the social media applications layer, many social applications (SNS, multimedia, games, and so on) can be developed using modules of the layer below, which is the social service software library. This layer has a software development kit (SDK) and many APIs, functions that are provided to the developer or SP as a form of software as a service (SaaS) for developing the various social media applications. In the distributed social data processing layer, many module-related distributed social data processes are provided as a form of PaaS [12]. In the cloud virtualization platform layer, huge hardware equipment is provided as a virtualized platform as a form of infrastructure as a service (IaaS).
4. PESMS (PAAS ENVIRONMENT FOR SOCIAL MULTIMEDIA SERVICE)
As mentioned in the Introduction, PESMS is a PaaS model of SMCCSE. PESMS is composed of a social media data analysis platform (social common algorithms library), a cloud distributed and data processing platform, and a cloud infra management platform, as shown in Figure 2.
Figure 2. The architecture of the Platform-as-a-Service (PaaS) Environment for Social Multimedia Service (PESMS)
Social media data analysis platform: The main role of the social media data analysis platform is to analyze usage patterns and relationships between the users and the social media data they demand and to provide encoding, decoding, transcoding, and transmoding functions in a library form. Transcoding is the conversion of a media file into file types suitable for numerous digital devices, and transmoding is the conversion of a media file into multiple files of more suitable sizes.
Cloud distributed and parallel data processing platform: This is a core part of PESMS. It can store, distribute, and process social media data created by users by applying HDFS, MapReduce, and a Hadoop database system (HBase) [13] [14] [15]. The social media data are delivered to various user devices such as mobile phones, smart pads, PCs, and TVs. The distributed and parallel data processing system has two subsystems: a distributed data system and a distributed parallel processing system. In the first, HDFS is adopted for a distributed social multimedia data system. In the second, MapReduce is adopted for a distributed parallel programming model. All functions of this platform are performed by social media common algorithms from the libraries in the social media data analysis platform [16] [17]. The generated social media data (text, images, audio, and video) and database are stored in HDFS or HBase. Then, the stored data are processed in two steps using MapReduce.
Cloud infra management platform: The cloud infra management platform involves the concepts of cloud quality of service (QoS) and a green Internet data center (IDC), and it is used to manage and monitor computing resources that do not depend on a specific OS or platform. It includes resource scheduling, resource information management, resource monitoring, and virtual machine management. These functions are provided on Web services based on Eucalyptus. In addition, our IaaS is designed to offer flexible computing resources including servers, storage, and bandwidth using virtualization techniques based on Xen [18].
The most important function of PESMS is image and video conversion via transcoding and transmoding; by this means, large volumes of social media objects created by SNS users are efficiently transmitted to end-user devices. To accomplish this, we designed and implemented a partially functional image conversion module based on Hadoop. Figure 3 shows the architecture of the image conversion module.
The transcoding and transmoding of images using Hadoop proceeds as follows. First, user-created images are automatically distributed and stored in each node running on HDFS [19], [20]. Next, batch processing by MapReduce converts the images stored in HDFS into other appropriate image formats. Our conversion module uses only a map step because the converted images do not need to be merged, so the reduce step is unnecessary [21]. The map function is implemented by the SequenceFiles method in the map phase; this is used to input the file names and the file contents themselves as the keys and values, respectively, in a set of intermediate key/value pairs. Then, the image conversion module sets the file contents into byte type using a BytesWritable class. Finally, image data sets are processed in parallel in each node. Figure 4 shows the programming elements of the image conversion module and the implementation of the proposed module.
Figure 3. Image Conversion Module using PESMS
Figure 4. Programming Elements of the Image Conversion Module
The steps used for programming these processes are as follows. First, the conversion module reads image data in HDFS using the RecordReader method of the InputFormat class. InputFormat transforms the image data into sets of keys (file names) and values (bytes). Second, InputFormat passes the sets of keys and values to the Mapper class. Mapper processes the image data using the user-defined settings and methods for image conversion via the JAI library. Then, the conversion module converts the image data into specific formats suitable for a variety of devices such as smart phones, tablets, and personal computers in a fully distributed manner. Mapper completes the image conversion and passes the results to the OutputFormat class as the key (file name) and value (bytes). Finally, the RecordWriter method of the OutputFormat class writes the result as a file to HDFS.
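The map-only flow described above can be sketched in a few lines of Python. This is a local illustration under stated assumptions, not the authors' Java/Hadoop implementation: `run_map_only_job`, `conversion_mapper`, `rename_extension`, and `placeholder_convert` are hypothetical names, a thread pool stands in for the cluster nodes, and `placeholder_convert` merely tags the bytes where the real module would re-encode the image through the JAI library.

```python
from concurrent.futures import ThreadPoolExecutor


def rename_extension(file_name, new_extension):
    """Return file_name with its extension replaced by new_extension."""
    stem = file_name.rsplit(".", 1)[0]
    return f"{stem}.{new_extension}"


def conversion_mapper(record, convert, target_format):
    """Map one (file name, bytes) pair to (new file name, converted bytes)."""
    file_name, content = record
    return rename_extension(file_name, target_format), convert(content, target_format)


def run_map_only_job(records, convert, target_format, workers=4):
    """Apply the mapper to every record in parallel. There is no reduce
    step because each converted image is already a final output."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(
            pool.map(lambda r: conversion_mapper(r, convert, target_format), records)
        )


def placeholder_convert(content, target_format):
    """Stand-in for a real image codec (assumption): it only prefixes the
    target format so that the data flow can be observed."""
    return target_format.encode("ascii") + b":" + content


if __name__ == "__main__":
    images = [("sun1.jpg", b"\xff\xd8"), ("sun2.jpg", b"\xff\xd9")]
    for name, data in run_map_only_job(images, placeholder_convert, "png"):
        print(name, data)
```

The key stays the file name and the value stays a byte string throughout, which mirrors the key/value roles the text assigns to InputFormat, Mapper, and OutputFormat.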
5. SIMPLE USE CASE OF PESMS
An example of a simple social media service with PESMS is shown in Figure 5. The flow is as follows.
Figure 5. Simple use case of social media service using PESMS
1. SNS users create and upload their social content.
2. This content is stored in the distributed data system using social media APIs. Content is then processed and classified so that it can be found efficiently by a user search pattern and recommended to users as appropriate.
3. Classified content is passed to the distributed and parallel processors by MapReduce so that it can be delivered to other users in an appropriate format.
4. Reformatted content is then delivered to the appropriate users by social media APIs and the service delivery platform.
5. As a result, large amounts of social multimedia data are efficiently shared in the SNS application because of PESMS.
6. PERFORMANCE EVALUATION
The experiments were conducted on a 28-node test bed, which is a single-enterprise scale cluster consisting of 27 data nodes (slave nodes) and 1 head node (master node). The only way to access the cluster is through the master node. All nodes run on Linux OS (Ubuntu 10.04 LTS). Each node is equipped with two 2.13 GHz Intel Xeon Quad-Core processors, 4 GB of registered ECC DDR memory, and a 1 TB SATA-2 HDD (7200 RPM). The machines were interconnected using a 1000 Mbps Ethernet adapter. Excluding the hardware specifications, the experimental environments used are as follows:
To build a variety of experimental conditions, we used image datasets (Table 3) from eight groups. The average size of each image file is approximately 19.8 MB. To implement the image conversion function, we also used JAI 1.1.3 (Java Advanced Imaging) APIs and Java 1.6.0_23.
To process the large datasets on our test bed, we selected Hadoop-0.20.2. The default options selected for Hadoop are as follows: 1) the number of block replications is set to three, and 2) the block size is set to 64 MB.
We conducted seven sets of experiments for our performance evaluation of the partially implemented MapReduce-based cloud distributed and parallel data processing platform used in the PESMS. These experiments were chosen to provide an overview of the transcoding and transmoding functions. In particular, we measured the run time for the image conversion function based on MapReduce to convert large amounts of image datasets into specific formats (e.g., from JPG to PNG) suitable for a variety of mobile devices under a variety of conditions.
6.1. Performance of Transcoding
The objective of the first experiment was to measure the transcoding time and speedup for the image conversion function under varying cluster sizes. As shown in Figure 6, the run times decrease when the number of nodes increases. In particular, the elapsed times decrease dramatically for the first eight nodes. From 8 to 28 nodes, the run times are reduced gradually. In addition, we also measured the parallel speedup. The experiments were conducted with different numbers of parallel nodes to allow the parallel speedup to be calculated. The parallel speedup measures how much faster the parallel and distributed execution is than running the same MapReduce-based image conversion function on a single node. If the speedup is greater than 1, it indicates that there is at least some gain from carrying out the conversion in a parallel manner. If the speedup is equal to the number of machines, it indicates that our cloud server and MapReduce programming have perfect scalability and an ideal performance. The calculated speedups are shown in Figure 7.

As the results show, using 2, 4, and 8 nodes in parallel yields ideal scalability. Although the performance is no longer ideal from 10 nodes onward, we can see that our cloud server achieves high throughput from the use of distributed processing. Moreover, for very large or very small image datasets, we can see that the throughput of the distributed processing is reduced in our cloud server. In fact, the calculated speedups of the eight datasets on 28 nodes are approximately 11, 15, 19, 22, 23, 26, 27, and 27, respectively.
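The speedup definition used above, and how far each reported value is from the ideal of 28, can be made concrete with a short Python sketch. The speedup list is copied from the text; the function names and the notion of "parallel efficiency" (speedup divided by node count) are additions for illustration and are not part of the original evaluation.

```python
def speedup(single_node_time, parallel_time):
    """Parallel speedup: single-node run time divided by parallel run time."""
    return single_node_time / parallel_time


def efficiency(speedup_value, node_count):
    """Fraction of the ideal speedup achieved (1.0 means perfect scalability)."""
    return speedup_value / node_count


# Approximate speedups on 28 nodes as reported in the text, one per dataset.
REPORTED_SPEEDUPS_28_NODES = [11, 15, 19, 22, 23, 26, 27, 27]

if __name__ == "__main__":
    for value in REPORTED_SPEEDUPS_28_NODES:
        print(f"speedup {value:>2} on 28 nodes -> efficiency {efficiency(value, 28):.2f}")
```

Under this measure the reported speedups of 27 correspond to an efficiency of about 0.96, while the speedup of 11 corresponds to about 0.39.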
Figure 6. Transcoding Time versus Cluster Size
Figure 7. Speedup versus Cluster Size
6.2. Performance Based on Changes in Block Replication Options
The second experiment was conducted to measure the run times for the image conversion function according to the changes in the number of block replications. MapReduce splits large datasets into fixed-sized blocks, allowing a quick data search and processing. In fact, using the default replication value of 3, the replicated data is stored in three nodes of the HDFS to rebalance the data, move copies around, and keep the data replication high when system faults such as a disk failure, network connection problems, or other issues occur. The purpose of this experiment is to verify how block replication materially affects the performance. The numbers of block replications used in the experiment were 1, 2, 4, and 5, in addition to the default value of 3. The experiment results are shown in Table 4. The experimental results indicate that the execution times are reduced when the block replication numbers increase. In particular, there is little difference in the execution times for small datasets of 1 to 8 GB, whereas for larger datasets, the difference in the execution times is bigger. In fact, for a 50 GB dataset, the execution times are 1,092, 437, and 428 s for replication factors of 1, 2, and 3, respectively (Table 4).
We wondered whether selecting as large a number of block replications as possible would be the best way to increase the performance and handle large amounts of datasets in our conversion module based on MapReduce. Through this experiment, we found that processing datasets in an HDFS using a large number of block replicas would bring about a significant amount of extra overhead. First, if the number of block replications is larger, the storage capacity needed to store the block replicas increases. The time required to store the block replicas also increases steadily with the number of replicas. For this reason, we evaluated the times required to store replicated blocks in an HDFS depending on the increase in the number of block replications. Table 5 shows the results of this experiment. As shown in the table, for a 50 GB dataset, the time taken to store three block replicas in HDFS is 1,984 s, while the amount of time taken for five block replicas is 3,587 s. The difference in the storage time between three and five block replicas is approximately 1,600 s. This difference is much greater than the difference in the run time for an image conversion under the same conditions. Furthermore, overhead related to the job scheduling in the HDFS is generated. The point of this experiment is to determine whether the form and size of the datasets, programming techniques, business logic, and configuration of the cluster systems are effective for the processing when using a selected number of block replications.
Table 3. Flickr Image Datasets Used for the Performance Evaluation
Section | Content
Name | Flickr dataset
Format | JPG
Source | Keyword: Sun
Size | 1 GB, 2 GB, 4 GB, 8 GB, 10 GB, 20 GB, 40 GB, 50 GB
Table 4. Total Image Transcoding Times (s) for the Block Replication Factors

| Block replication factor (rows) / image dataset size (columns) | 1 GB | 2 GB | 4 GB | 8 GB | 10 GB | 20 GB | 40 GB | 50 GB |
|---|---|---|---|---|---|---|---|---|
| 1 | 28 | 40 | 63 | 114 | 135 | 496 | 975 | 1,092 |
| 2 | 25 | 32 | 50 | 85 | 103 | 192 | 358 | 437 |
| 3 | 27 | 32 | 50 | 85 | 102 | 186 | 359 | 428 |
| 4 | 30 | 33 | 50 | 85 | 103 | 188 | 358 | 437 |
| 5 | 30 | 33 | 50 | 85 | 103 | 192 | 359 | 437 |
Table 5. Execution Time (s) for Storing All Block Replications in HDFS

| Block replication factor (rows) / image dataset size (columns) | 1 GB | 2 GB | 4 GB | 8 GB | 10 GB | 20 GB | 40 GB | 50 GB |
|---|---|---|---|---|---|---|---|---|
| 1 | 17 | 34 | 69 | 130 | 172 | 377 | 581 | 672 |
| 2 | 33 | 68 | 129 | 257 | 332 | 754 | 1,120 | 1,392 |
| 3 | 41 | 84 | 163 | 344 | 428 | 837 | 1,790 | 1,984 |
| 4 | 69 | 137 | 237 | 512 | 667 | 1,490 | 2,234 | 2,878 |
| 5 | 110 | 215 | 301 | 671 | 842 | 1,890 | 2,829 | 3,587 |
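The trade-off discussed for the block replication factor can be checked directly against the 50 GB columns of Tables 4 and 5. The following is a minimal sketch of that arithmetic; the dictionary and function names are illustrative and not part of PESMS, and only the numbers are taken from the tables.

```python
# Values copied from the 50 GB columns of Table 4 and Table 5,
# keyed by block replication factor. All names here are illustrative.
TRANSCODE_50GB_S = {1: 1092, 2: 437, 3: 428, 4: 437, 5: 437}  # Table 4
STORE_50GB_S = {1: 672, 2: 1392, 3: 1984, 4: 2878, 5: 3587}   # Table 5


def delta(times, low, high):
    """Change in time (s) when the replication factor goes from low to high."""
    return times[high] - times[low]


def relative_to_one_replica(times):
    """Each time divided by the time for replication factor 1, rounded."""
    base = times[1]
    return {factor: round(t / base, 2) for factor, t in times.items()}


def raw_storage_gb(dataset_gb, replication):
    """Raw HDFS capacity consumed: every block is stored `replication` times."""
    return dataset_gb * replication


if __name__ == "__main__":
    print("extra storage time, 3 -> 5 replicas (s):", delta(STORE_50GB_S, 3, 5))
    print("transcoding change, 3 -> 5 replicas (s):", delta(TRANSCODE_50GB_S, 3, 5))
    print("storage time relative to 1 replica:", relative_to_one_replica(STORE_50GB_S))
    print("raw capacity for 50 GB at 5 replicas (GB):", raw_storage_gb(50, 5))
```

With these values, storing five replicas instead of three costs 1,603 s more, while the transcoding time changes by only 9 s. The storage time for the 50 GB dataset grows roughly in step with the replication factor.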
6.3. Performance Based on Changes in the Block Size Option
The purpose of the third experiment was to measure the execution times according to the block size. By default, Hadoop processes large datasets after splitting them into blocks of 64 MB. We measured the run times for an image conversion using block sizes of 16, 32, 64, 128, 256, and 512 MB. The results are shown in Table 6. As the results show, for the 50 GB dataset the 64 MB block size gives the best run time among the six cases studied, although all block sizes of 32 MB and larger perform within a few seconds of one another. If a developer sets the block size in Hadoop to a value smaller than the size of the files in the image dataset (in our case, approximately 20 MB), the execution time increases because a larger number of blocks are created in the HDFS.
Table 6. Total Image Transcoding Time (s) Based on the Block Size

| Block size (rows) / image dataset size (columns) | 1 GB | 2 GB | 4 GB | 8 GB | 10 GB | 20 GB | 40 GB | 50 GB |
|---|---|---|---|---|---|---|---|---|
| 16 MB | 30 | 51 | 70 | 155 | 138 | 349 | 732 | 902 |
| 32 MB | 23 | 30 | 49 | 84 | 102 | 185 | 356 | 437 |
| 64 MB | 23 | 32 | 49 | 84 | 102 | 186 | 356 | 428 |
| 128 MB | 23 | 31 | 50 | 84 | 102 | 185 | 356 | 436 |
| 256 MB | 24 | 30 | 49 | 85 | 103 | 185 | 356 | 436 |
| 512 MB | 23 | 32 | 50 | 85 | 102 | 185 | 356 | 436 |
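The effect of a block size smaller than the file size can be illustrated with a short calculation. HDFS stores each file in its own blocks, so a file needs the ceiling of (file size / block size) blocks, and a file smaller than one block still occupies one block. The sketch below assumes, purely for illustration, that the 50 GB dataset consists of equally sized 20 MB files; the function names are hypothetical.

```python
MB = 1024 * 1024
GB = 1024 * MB


def ceil_div(a, b):
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def blocks_per_file(file_size, block_size):
    """HDFS blocks needed for one file; blocks are never shared between files."""
    return max(1, ceil_div(file_size, block_size))


def total_blocks(dataset_size, file_size, block_size):
    """Blocks for a dataset of equally sized files (a simplifying assumption)."""
    n_files = ceil_div(dataset_size, file_size)
    return n_files * blocks_per_file(file_size, block_size)


if __name__ == "__main__":
    for block_mb in (16, 32, 64, 128, 256, 512):
        n = total_blocks(50 * GB, 20 * MB, block_mb * MB)
        print(f"{block_mb:>3} MB block size -> {n} blocks")
```

Under this assumption, only the 16 MB setting splits each file into two blocks, which doubles the block count. From 32 MB upward every file fits in a single block, which is consistent with the nearly identical rows for 32 MB and larger in Table 6.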
6.4. Image Transcoding Performance Based on Changes in the mapred.tasktracker.map.tasks.maximum Option
In the fifth set of experiments, we explored and analyzed different values for mapred.tasktracker.map.tasks.maximum. This option sets the maximum number of map tasks that run simultaneously on a single data node. For example, if the value is set to 4, up to four map tasks of the MapReduce job run simultaneously on a single data node. Before performing this set of experiments, we expected that the best value for the maximum number of map slots would depend on the number of CPUs in the physical machine, and therefore that a value of 8 would perform better than the other values. The experimental results shown in Figure 8 confirm this: the best transcoding performance is achieved when the option is set to 8, because our system has eight CPUs in each node.
Figure 8. Total Transcoding Time in Hadoop for 5, 10, and 20 GB Datasets for Various Values of mapred.tasktracker.map.tasks.maximum
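The expectation that the best slot count matches the number of CPUs can be sketched with a simple model. The code below is an illustration under stated assumptions, not the authors' analysis: it assumes CPU-bound map tasks, so that slots beyond the core count add no real parallelism, and it uses a hypothetical job of 2,560 map tasks on 28 worker nodes. All function names are made up for this example.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def effective_parallelism(map_slots, cpus_per_node):
    """Simplified model: CPU-bound tasks cannot exceed one running task per core."""
    return min(map_slots, cpus_per_node)


def map_waves(n_map_tasks, n_nodes, map_slots):
    """Rounds of map tasks needed when every node runs map_slots tasks at once."""
    return ceil_div(n_map_tasks, n_nodes * map_slots)


if __name__ == "__main__":
    cpus = 8  # eight CPUs per node, as stated in the text
    for slots in (2, 4, 8, 12, 16):
        print(
            f"slots={slots:>2}  effective parallelism per node="
            f"{effective_parallelism(slots, cpus)}  "
            f"waves for 2560 tasks on 28 nodes={map_waves(2560, 28, slots)}"
        )
```

In this model, raising the slot count up to 8 reduces the number of task waves, while values above 8 only oversubscribe the eight cores on each node.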
6.5. Image Transcoding Performance for Resizing an Image Dataset
For this performance evaluation, we measured the total transcoding times required to resize an original image dataset to QVGA (320 × 240), VGA (640 × 480), and WVGA (800 × 480) resolutions. The results are shown in Figure 9. They show no significant difference in transcoding performance among the target resolutions.
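To put the three target resolutions in perspective, the following sketch compares their output pixel counts. The names are illustrative, and WVGA is taken here as the standard 800 × 480 size.

```python
# Target resolutions as (width, height) in pixels. Names are illustrative.
TARGETS = {
    "QVGA": (320, 240),
    "VGA": (640, 480),
    "WVGA": (800, 480),  # assumption: standard WVGA dimensions
}


def pixel_count(name):
    """Number of output pixels for the named target resolution."""
    width, height = TARGETS[name]
    return width * height


def relative_pixels(base="QVGA"):
    """Output pixel count of each target relative to the base target."""
    base_px = pixel_count(base)
    return {name: pixel_count(name) / base_px for name in TARGETS}


if __name__ == "__main__":
    for name in TARGETS:
        print(name, pixel_count(name), "pixels")
    print(relative_pixels())
```

The largest target has five times as many output pixels as the smallest. A flat result across resolutions therefore suggests that output size is not the dominant cost in this workload, although the paper does not report which processing stage dominates.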
7. CONCLUSION
This paper introduced an efficient multimedia processing approach for SNS multimedia services using a cloud computing system, more precisely a Hadoop system. To this end, we designed PESMS, which provides an environment for developing and building social services by adopting cloud computing technologies and elastic computing resources. This study also presented an image conversion module for transcoding and transmoding, based on MapReduce running on an HDFS within PESMS. The aim of proposing and implementing the image conversion module in PESMS was to process large amounts of social multimedia more efficiently.
Figure 9. Total Image Transcoding Time for Resizing an Image Resolution
We verified the excellent performance of the image conversion module under varying experimental conditions on a 28-node test bed. Specifically, we measured the run times for an image conversion in seven sets of experiments: changes in the cluster size, block replication factor, block size, JVM reuse factor, mapred.tasktracker.map.tasks.maximum option, resizing function, and conversion formats. Our conversion module can reduce the run times for converting image datasets into specific formats suitable for various devices. In particular, the experimental results on the Hadoop options show that MapReduce programmers should carefully select the options related to the block size, block replication, JVM reuse, and mapred.tasktracker.map.tasks.maximum, depending on the form and size of the datasets, the programming techniques, the business logic, and the configuration of the cluster systems. Finally, because PESMS is a PaaS platform, service providers and developers can easily develop social multimedia services without a complex installation; they only need to consider the application itself. We conclude that PESMS is well suited to processing large amounts of social multimedia data.
ACKNOWLEDGEMENTS
This work was supported by the IT R&D program of MSIP/IITP [B0101-15-0559, Developing On-line Open Platform to Provide Local-business Strategy Analysis and User-targeting Visual Advertisement Materials for Micro-enterprise Managers] and by MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2015-H8501-15-1004) supervised by the NIPA (National IT Industry Promotion Agency).