TELKOMNIKA, Vol.12, No.4, December 2014, pp. 751~752
ISSN: 1693-6930, accredited A by DIKTI, Decree No: 58/DIKTI/Kep/2013
DOI: 10.12928/TELKOMNIKA.v12i4.957
Received September 5, 2014; Revised October 23, 2014; Accepted November 6, 2014
Editorial
Fortifying Big Data infrastructures to Face Security and
Privacy Issues
Tole Sutikno*1, Deris Stiawan2, Imam Much Ibnu Subroto*3
1 Department of Electrical Engineering, Universitas Ahmad Dahlan, Yogyakarta, Indonesia
2 Department of Computer System Engineering, Universitas Sriwijaya, Palembang, Indonesia
3 Department of Informatics Engineering, Universitas Islam Sultan Agung, Semarang, Indonesia
*Corresponding author, email: tole@ee.uad.ac.id (1), deris.stiawan@gmail.com (2), imam.utm@gmail.com (3)
The volume of data available on the internet has grown explosively in recent years. In 2013, human beings created more than 4 quintillion bytes of data every day, coming from individual archives, sensors embedded in smartphones, social networks, the internet of things, enterprises, and the internet at large, in all scales and formats [1],[2]. One of the most challenging issues is how to manage such a large amount of data effectively and to identify new ways to analyze it and unlock the information it holds [3]. This issue is known as Big Data, and it has emerged as a hot topic in current information and communication technology research because it raises many challenges, such as efficient encryption and decryption algorithms, encrypted information retrieval, attribute-based encryption, and attacks on availability, reliability and integrity [4],[5].
Big data comes in many forms and can vary widely. It is commonly characterized by the four HVs: high-volume, high-variety, high-velocity, and high-veracity [6]. Big Data is a term for data that has four main characteristics [7]: 1) it involves a great volume of data; 2) the data cannot be structured into regular database tables; 3) the data is produced with great velocity and must be captured and processed rapidly; and 4) sometimes a very large volume of data must be processed before the valuable information needed is found. Google, Microsoft, Yahoo, YouTube, Twitter, and Facebook are early innovators in big data infrastructure. Although these companies have been developing big data infrastructure since their inception, only more recently have big data workloads been running in the public cloud [8]. Facebook reports about 6 billion new photos every month, and 72 hours of video are uploaded to YouTube every minute [9]. Managing data at this scale is a big challenge. Organizations must therefore find a way to manage their data in accordance with all relevant privacy regulations without making the data inaccessible or unusable.
The Cloud Security Alliance (CSA) has released its top 10 big data security and privacy challenges, which are as follows [10]: 1) secure computations in distributed programming frameworks, 2) security best practices for non-relational data stores, 3) secure data storage and transaction logs, 4) end-point input validation/filtering, 5) real-time security monitoring, 6) scalable and composable privacy-preserving data mining and analytics, 7) cryptographically enforced data-centric security, 8) granular access control, 9) granular audits, and 10) data provenance. These challenges can be organized into four distinct aspects of the Big Data ecosystem [10].
1. Infrastructure Security (includes the 1st and 2nd challenges)
Distributed computations and data stores must be secured in order to secure the infrastructure of Big Data systems [10]. One way of supporting this security, currently deployed in critical infrastructures, is to detect anomalies that constitute threats to the system by using behavioural observation and big data analysis techniques [11].
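As a rough illustration of anomaly detection by behavioural observation, the sketch below flags readings that deviate strongly from a trailing window of recent values. It is not the technique of [11]; the rolling z-score rule, the function name `detect_anomalies`, the window size, the threshold, and the sample readings are all assumptions chosen for illustration.

```python
from collections import deque
from math import sqrt


def detect_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value deviates from the trailing-window mean
    by more than `threshold` standard deviations (hypothetical rule)."""
    recent = deque(maxlen=window)  # sliding baseline of recent behaviour
    flagged = []
    for i, v in enumerate(values):
        if len(recent) == window:
            mean = sum(recent) / window
            std = sqrt(sum((x - mean) ** 2 for x in recent) / window)
            if std == 0:
                # Flat baseline: any change at all counts as a deviation.
                is_anomaly = v != mean
            else:
                is_anomaly = abs(v - mean) / std > threshold
            if is_anomaly:
                flagged.append(i)
        recent.append(v)
    return flagged


# Made-up sensor readings with one obvious spike at index 6.
readings = [10, 11, 10, 12, 11, 10, 95, 11, 10, 12]
print(detect_anomalies(readings))  # → [6]
```

A real deployment would need a baseline that is robust to the anomalies themselves, since here the spike enters the window and inflates the deviation for the following readings.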
2. Data Privacy (includes the 6th, 7th and 8th challenges)
To secure the data itself, information dissemination must be privacy-preserving, and cryptography and granular access control must be used to protect sensitive data [10]. Cloud computing provides a promising, scalable IT infrastructure to support the processing of a variety of big data applications, but it raises privacy concerns if information is released or shared with third parties in the cloud [12].
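Granular access control can be pictured as deciding visibility per field rather than per record. The following is a minimal sketch under stated assumptions: the roles, the field names, the `FIELD_POLICY` table, and the function `visible_fields` are hypothetical and do not come from [10].

```python
# Hypothetical policy: which roles may read which fields of a record.
FIELD_POLICY = {
    "patient_id": {"clinician", "auditor"},
    "diagnosis": {"clinician"},
    "postcode": {"clinician", "analyst"},
    "age_band": {"clinician", "analyst", "auditor"},
}


def visible_fields(record, role, policy=FIELD_POLICY):
    """Return a copy of `record` holding only the fields `role` may read.
    Fields that are absent from the policy are denied by default."""
    return {k: v for k, v in record.items() if role in policy.get(k, set())}


# Made-up record; "notes" has no policy entry, so no role can see it.
record = {
    "patient_id": "p-001",
    "diagnosis": "asthma",
    "postcode": "55166",
    "age_band": "30-39",
    "notes": "free text",
}
print(visible_fields(record, "analyst"))
# → {'postcode': '55166', 'age_band': '30-39'}
```

Denying unlisted fields by default means that a newly added column stays hidden until someone grants access to it explicitly.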
3. Data Management (includes the 3rd, 9th and 10th challenges)
Big Data is a colossal amount of data that cannot be handled by traditional data management systems [13]-[15]. A modern data management system is therefore needed for the storage and retrieval of big data.
4. Integrity and Reactive Security (includes the 4th and 5th challenges)
For integrity purposes, the streaming data emerging from diverse end-points must be checked; for reactive security purposes, the streaming data can be used to perform real-time analytics in order to ensure the health of the infrastructure [10].
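One common way to check data arriving from end-points is to validate its shape and verify a keyed message authentication code before accepting it. The sketch below assumes each end-point shares a secret key with the collector and sends a JSON-like message with `payload` and `tag` fields; the message layout, the field names `sensor_id` and `value`, and the helper names are illustrative assumptions, not something prescribed by [10].

```python
import hashlib
import hmac
import json


def sign(payload: dict, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over a canonical JSON encoding of payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def is_valid(message: dict, key: bytes) -> bool:
    """Accept a message only if it is well-formed and its tag verifies."""
    payload = message.get("payload")
    tag = message.get("tag")
    # Input validation: reject anything that does not have the expected shape.
    if not isinstance(payload, dict) or not isinstance(tag, str):
        return False
    if not isinstance(payload.get("sensor_id"), str):
        return False
    value = payload.get("value")
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    # Integrity check: constant-time comparison of expected and received tags.
    expected = sign(payload, key).encode("utf-8")
    return hmac.compare_digest(expected, tag.encode("utf-8"))


key = b"demo-key-not-for-production"  # hypothetical shared secret
payload = {"sensor_id": "s1", "value": 3.5}
message = {"payload": payload, "tag": sign(payload, key)}
tampered = {"payload": {"sensor_id": "s1", "value": 99}, "tag": message["tag"]}
print(is_valid(message, key), is_valid(tampered, key))  # → True False
```

A shared-key tag only shows that the sender holds the key; it does not by itself stop a compromised end-point from sending plausible but false readings, which is where the real-time analytics mentioned above would come in.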
This journal is sponsoring an international conference entitled "The 2015 International Conference on Science in Information Technology" on October 27-28, 2015, which will be conducted under the theme "The Role of Business Intelligence in Big Data Management". The conference is hoped to be a forum for dialogue on these issues, aimed at finding the right solutions so that the initial round of big data management technologies is well suited to real-time operations, in light of the top 10 challenges above.
References
[1] Veiga Neves, M., et al. Pythia: Faster Big Data in Motion through Predictive Software-Defined Network Optimization at Runtime. in Parallel and Distributed Processing Symposium, 2014 IEEE 28th International. 2014.
[2] Yangqing, Z., Z. Jun. Probe into setting up big data processing specialty in Chinese universities. in Computer Science & Education (ICCSE), 2014 9th International Conference on. 2014.
[3] Malik, P. Governing Big Data: Principles and practices. IBM Journal of Research and Development. 2013; 57(3/4): 1-13.
[4] Takaishi, D., et al. Towards Energy Efficient Big Data Gathering in Densely Distributed Sensor Networks. Emerging Topics in Computing, IEEE Transactions on. 2014; PP(99): 1-1.
[5] Juan, Z., Z. Yanqin, J. Fangyuan. Practical and Secure Outsourcing of Linear Algebra in the Cloud. in Advanced Cloud and Big Data (CBD), 2013 International Conference on. 2013.
[6] Courtney, M. Puzzling out big data [Information Technology Analytics]. Engineering & Technology. 2013; 7(12): 56-60.
[7] Garlasu, D., et al. A big data implementation based on Grid computing. in Roedunet International Conference (RoEduNet), 2013 11th. 2013.
[8] Collins, E. Big Data in the Public Cloud. Cloud Computing, IEEE. 2014; 1(2): 13-15.
[9] Bari, N., D. Liao, S. Berkovich. Organization of Meta-knowledge in the Form of 23-Bit Templates for Big Data Processing. in Computing for Geospatial Research and Application (COM.Geo), 2014 Fifth International Conference on. 2014.
[10] Expanded Top Ten Big Data Security and Privacy Challenges. 2013.
[11] Hurst, W., M. Merabti, P. Fergus. Big Data Analysis Techniques for Cyber-threat Detection in Critical Infrastructures. in Advanced Information Networking and Applications Workshops (WAINA), 2014 28th International Conference on. 2014.
[12] Zhang, X., et al. Proximity-Aware Local-Recoding Anonymization with MapReduce for Scalable Big Data Privacy Preservation in Cloud. Computers, IEEE Transactions on. 2014; PP(99): 1-1.
[13] Padhy, R., U. Berhampur. Big Data Processing with HadoopMapReduce in Cloud Systems. International Journal of Cloud Computing and Services Science (IJ-CLOSER). 2013; 2(1): 16-27.
[14] Han, H., et al. A Privacy Data Oriented Hierarchical MapReduce Programming Model. TELKOMNIKA Indonesian Journal of Electrical Engineering. 2013; 11(8): 4587-4593.
[15] Liu, Y., et al. Checkpoint and Replication Oriented Fault Tolerant Mechanism for MapReduce Framework. TELKOMNIKA Indonesian Journal of Electrical Engineering. 2014; 12(2): 1029-1036.