International Journal of Electrical and Computer Engineering (IJECE)
Vol. 5, No. 6, December 2015, pp. 1492~1499
ISSN: 2088-8708
Journal homepage: http://iaesjournal.com/online/index.php/IJECE
De-Identified Personal Health
Care System Using Hadoop
Dasari Madhavi, B. V. Ramana
Department of Information Technology, AITAM, Tekkali, A.P.
Article Info
Article history:
Received Jun 12, 2015
Revised Aug 18, 2015
Accepted Sep 4, 2015
ABSTRACT
Hadoop technology plays a vital role in improving the quality of healthcare by delivering the right information to the right people at the right time while reducing cost and time. Most health care functions, such as admission, discharge, and transfer, rely on patient data maintained in Computer-based Patient Records (CPR), Personal Health Information (PHI), and Electronic Health Records (EHR). The use of medical Big Data is increasingly popular in health care services and clinical research. The biggest challenge in health care centers is the huge amount of data that flows into their systems daily. Crunching this Big Data and de-identifying it with traditional data mining tools is problematic. Therefore, to provide a solution for de-identifying personal health information, a Map Reduce application uses jar files that contain a combination of MR code and Pig queries. The application also uses an advanced mechanism, the UDF (User Defined Function), to protect the health care dataset. The de-identified personal health care system uses Map Reduce and Pig queries, which are executed on the health care dataset. The application takes an input dataset that contains patient information and de-identifies the patients' personal health information. De-identification using Hadoop is also suitable for social and demographic data.
Keyword:
Base64
Big Data
Hadoop
Health care records
Map Reduce
Copyright © 2015 Institute of Advanced Engineering and Science.
All rights reserved.
Corresponding Author:
Dasari Madhavi,
Department of Information Technology,
AITAM, Tekkali, A.P.
Email: dasarimadhavi.it@gmail.com
1. INTRODUCTION
Big Data is any combination of large and complex datasets that is difficult to process using existing data management tools or traditional data processing applications. Big Data is about real-time analysis and data-driven decision making. Big Data plays a crucial role in health care, and several health institutes use it to analyze large volumes of information. Today a huge amount of patient data is generated in health care organizations, and it can be used to improve the quality of patient care and program analysis. Traditional systems cannot normalize this data, because the increasing digitization of health care data means that organizations often add terabytes' worth of patient records to their data centers annually.
1.1. Hadoop Innovation in Health Care Intelligence
Many organizations have discovered that their existing data mining and analysis techniques are simply not up to the task of handling Big Data. One possible solution to this problem is to build a Hadoop cluster. Hadoop is an open-source distributed data storage and analysis framework that can handle large volumes of data, which may be structured, unstructured, or semi-structured. Health care data tends to reside in multiple places such as EMRs or EHRs, radiology, pharmacy, etc.
Data that comes from all over the organization can be aggregated into a central system such as an Enterprise Data Warehouse (EDW) to make it accessible and actionable. Hadoop tools in the health care industry provide secure results when analyzing large volumes of patient data and, at the same time, can improve the reliability of clinical outcomes. One example of a successful outcome is the renewal of a prescription within the expected time period. Hadoop can store renewal information and tie it to social media content and online reminders. Hadoop technology can play a major role in the health care industry. It is very useful to the public sector, and it can improve patient safety and security.
1.2. Literature Survey
The increasingly digitized health care information is being analyzed with new techniques to improve the quality of care and health care results and to minimize costs. Organizations must analyze internal and external patient information to measure risk and outcomes more accurately. At the same time, many clients are working to increase data transparency to produce new insights and knowledge.
Praveen Kumar et al. [1] proposed that Hadoop, which is based on Map Reduce, is a powerful tool for managing huge amounts of data, and that its ecosystem can use fault-tolerant techniques. Emad A. Mohammed et al. [2] state that big clinical data analytics would emphasize modelling of whole interacting processes in clinical settings, and that clinical datasets can evolve into ultra-large-scale datasets. Arantxa Duque Barrachina et al. [3] proposed using Hadoop techniques for identification tasks on large datasets. K. Divya et al. [4] used a progressive encryption scheme to protect data. Hongsong Chen and Zhongchuan Fu [6] present a novel Hadoop-based architecture that combines a Sun SPOT biosensor wireless network, an ECC digital signature algorithm, a MySQL database, and Hadoop HDFS cloud storage; a security administrator can use it to protect and manage key data. Lidong Wang et al. [7] base their work on a SWOT (Strengths, Weaknesses, Opportunities, Threats) analysis and Radio Frequency Identification (RFID) technology.
2. RESEARCH METHOD
The following platforms and tools for Big Data analytics in health care are used to de-identify the dataset. This work follows this procedure:
1) Data collection
2) Hadoop cluster and Map Reduce
3) Experiments
2.1. Data Collection
In this paper, the minimum size of Big Data is considered to be a petabyte. Big Data is available in many sectors, such as web and social networking, machine-to-machine data, large transaction data, biometric sensor data, human-generated data, the gaming industry, agriculture, and education departments.
2.2. Hadoop Cluster and Map Reduce
Hadoop is a software framework that allows large datasets to be processed across large clusters of computers. The Hadoop Distributed File System (HDFS) is a Java-based distributed file system that can store all kinds of data without prior organization. Map Reduce is a programming model for processing large sets of data in parallel. A Hadoop cluster combines HDFS and Map Reduce, so programs can be implemented to run on the cluster.
2.3. Experiment
Hadoop is an open-source framework written in Java by the Apache Software Foundation. It is used to write software applications that need to process huge amounts of data. It works in parallel on large clusters, which may have thousands of computer nodes, and it processes the data in a reliable and fault-tolerant manner. Hadoop can be installed on a Cloudera system; after the Hadoop installation completes, the HDFS processes are started automatically along with the daemons. HDFS has two main layers: the Master Node and the Data Nodes. The Master Node, or Name Node, is the master of the system; it maintains and manages the Data Nodes and distributes the data across the slave nodes. A Data Node (slave node) provides the actual storage and is responsible for serving read and write requests from clients. Map Reduce is an algorithm or concept for processing huge amounts of data quickly. As its name suggests, it is divided into a Mapper and a Reducer.
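The split into a Mapper and a Reducer can be illustrated with a minimal sketch in plain Java, with no Hadoop dependency. It is an illustration only and not the paper's `DeIdentifyData` job: the class name `MapReduceSketch`, the sample records, and the assumption that the disease name is the sixth comma-separated field are all hypothetical. The sketch counts patients per disease by emitting (disease, 1) pairs in the map step, grouping them by key as the framework's shuffle would, and summing each group in the reduce step.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the map -> shuffle -> reduce flow (no Hadoop needed).
final class MapReduceSketch {

    // Map step: turn one CSV record into a (disease, 1) pair.
    // Assumption: the disease name is the sixth comma-separated field.
    static Map.Entry<String, Integer> map(String csvLine) {
        String[] fields = csvLine.split(",");
        return new AbstractMap.SimpleEntry<>(fields[5].trim(), 1);
    }

    // Shuffle step: group all emitted values by key, as the framework would.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: sum the values collected for one key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs the three steps in sequence on an in-memory list of records.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.add(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(pairs).entrySet()) {
            result.put(group.getKey(), reduce(group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical records: name, id, date of birth, email, gender, disease, disease id.
        List<String> lines = Arrays.asList(
            "Patient A,P001,1980-01-15,a@example.com,F,Diabetes,D10",
            "Patient B,P002,1975-06-02,b@example.com,M,Asthma,D20",
            "Patient C,P003,1990-11-23,c@example.com,F,Diabetes,D10");
        System.out.println(run(lines));
    }
}
```

On a real cluster the map and reduce steps run on different nodes and the grouping is done by the framework; here all three steps run in one process so that the data flow is easy to follow.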
Figure 1. Hadoop application and infrastructure interactions
2.4. Typical Forms of Data De-identification
The HIPAA rule protects most individually identifiable health information. A couple of common data de-identification techniques can be deployed to improve protection in a Hadoop environment, such as storage-level encryption and data masking.
2.4.1. Storage-Level Encryption
With storage-level encryption, the entire volume on which the data set is stored is encrypted at the disk-volume level while the data is "at rest" in the data store. This prevents unauthorized personnel who have physically obtained the disk from reading anything from it. This is a valuable control in a Hadoop cluster or any large data store because disk repairs and swap-outs are commonplace. However, it does not protect the data from access while the disk is running within the system. Decryption is applied automatically when the data is read by the running system, and live, vulnerable data is fully exposed to any user or process that gains access to the system.
2.4.2. Data Masking
Data masking is a valuable technique for obfuscating sensitive data, most commonly used to create test and development data from live production data. However, masked data is intended to be irreversible, which limits its value for many analytic functions and post-processing requirements. Furthermore, there is no guarantee that the specific masking transformation chosen for a given sensitive data field fully protects it from re-identification, particularly when it is correlated with other data in the Hadoop "data lake" [7]. Certain masking techniques may or may not be accepted by auditors and assessors, which affects whether they truly meet regulatory compliance requirements and provide safe harbor in the event of a data breach.
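One way to picture an irreversible masking transformation is a salted one-way hash. The sketch below is an assumption chosen for illustration, not the masking method of any particular product or of this paper; the class name `MaskingSketch` and the salt value are hypothetical. It replaces a sensitive value with its SHA-256 digest, using only the JDK's `java.security.MessageDigest`.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative one-way masking of a sensitive field with a salted SHA-256 digest.
final class MaskingSketch {

    // Returns the lowercase hex SHA-256 digest of salt + value.
    static String mask(String value, String salt) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest((salt + value).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is not expected.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        String salt = "example-salt"; // hypothetical; a real salt must be kept secret
        System.out.println(mask("patient@example.com", salt));
    }
}
```

The same input always produces the same token, so masked records can still be joined on the masked field. That determinism is also why masking alone may not prevent re-identification: a low-entropy field such as a date of birth can be guessed by hashing candidate values, especially if the salt leaks.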
2.4.3. Cryptography Base 64 algorithm
Base64 is a technique designed to represent an arbitrary sequence of octets (8-bit bytes) in a printable text form, which allows binary data to pass through channels designed for plain ASCII (American Standard Code for Information Interchange) text, such as SMTP (Postel, 1982). It also permits embedding binary data in media that support only ASCII text, such as XML files. The Base64 Content-Transfer-Encoding is defined in RFC 2045.
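A minimal sketch of the Base64 round trip follows. It uses the JDK's `java.util.Base64` so that it runs without extra jars; the class path set up later in the paper includes `commons-codec`, so the authors' code may have used a different Base64 class. The class name `Base64Sketch` is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Round trip through the standard Base64 alphabet (RFC 4648, as used by RFC 2045 MIME).
final class Base64Sketch {

    // Encodes the UTF-8 bytes of the input as a Base64 string.
    static String encode(String plain) {
        return Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
    }

    // Decodes a Base64 string back to the original text.
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String encoded = encode("Hadoop");
        System.out.println(encoded);         // prints SGFkb29w
        System.out.println(decode(encoded)); // prints Hadoop
    }
}
```

Base64 involves no key: anyone who has the encoded text can decode it, as the `decode` call shows. It hides values from casual reading, and any stronger protection would need a keyed cipher in addition.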
3. RESULTS AND DISCUSSION
3.1. Preliminary Data Preparation
This work uses a dummy patient health dataset collected from the HSCIC (Health & Social Care Information Centre). It contains the fields patient name, patient id, date of birth, email id, gender, disease, and disease id. The data is maintained as a CSV (Comma Separated Values) file.
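To make the record layout concrete, the sketch below de-identifies one CSV record by Base64-encoding its identifying fields and leaving the clinical fields readable. Several points are assumptions made for illustration: the column order follows the field list above, the first four columns (name, id, date of birth, email) are treated as the identifying ones, and the class name `DeIdentifyRecordSketch` is hypothetical. The paper does not list which columns its job encodes, and this is not its `DeIdentifyData` code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Base64-encodes the identifying columns of one CSV record (illustrative only).
final class DeIdentifyRecordSketch {

    // Assumed identifying columns: 0 = name, 1 = patient id, 2 = date of birth, 3 = email.
    private static final int[] IDENTIFYING_COLUMNS = {0, 1, 2, 3};

    static String deIdentify(String csvLine) {
        // The -1 limit keeps trailing empty fields so the column count is preserved.
        String[] fields = csvLine.split(",", -1);
        for (int column : IDENTIFYING_COLUMNS) {
            if (column < fields.length) {
                byte[] raw = fields[column].getBytes(StandardCharsets.UTF_8);
                fields[column] = Base64.getEncoder().encodeToString(raw);
            }
        }
        // Base64 output never contains a comma, so the record stays valid CSV.
        return String.join(",", fields);
    }

    public static void main(String[] args) {
        // Hypothetical record in the assumed column order.
        String record = "Patient A,P001,1980-01-15,a@example.com,F,Diabetes,D10";
        System.out.println(deIdentify(record));
    }
}
```

The naive `split(",")` assumes that no field contains a quoted comma; a dataset with quoted fields would need a proper CSV parser.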
Figure 2. Data Preparation Procedure
3.2. Preliminary Data Analysis
The dataset is in CSV (Comma Separated Values) format. The results in this project are produced by using the Base64 encoding algorithm to convert the plain text into encrypted data. A single-node Hadoop system is used here, and the class path is set for the Hadoop jar files:
Figure 3. Healthcare_sample_Dataset1 (plain text)
export CLASSPATH=${HADOOP_HOME}/hadoop-core-0.20.2-cdh3u6.jar:${CLASSPATH}
export CLASSPATH=${HADOOP_HOME}/commons-codec-1.4.jar:${CLASSPATH}
After that, the Java code for the mapper, reducer, and combiner is compiled:
javac DeIdentifyData.java
Figure 4. Result of added manifest
3.3. Hadoop Cluster Result
After that, the dataset is placed into the Hadoop distributed file system, the Hadoop job is run, and finally the output file is obtained.
hadoop fs -put healthcare_Sample_dataset1.csv /health/healthcare_Sample_dataset1.csv
hadoop jar /home/cloudera/DeIdentifyData1.jar DeIdentifyData /health deintout124
Internally, the Mapper and Reducer processes are started.
Figure 5.1. Result of Mapper & Reducer initial stages
Figure 5.2. Result of status of the Mapper and Reducer
Figure 5.3. Result of Mapper & Reducer final stage
Then we can check the name node status and the job tracker in the browser.
Figure 5.4. Result of the output job file details
Figure 6. Result of Hadoop Counters
Figure 7. Result of Mapper and Reducer completion graph
Figure 8. Result of final decrypted output (Encrypted Text)
4. CONCLUSION
Big Data analytics has the potential to change the way health care providers use sophisticated technologies to gain insight from their clinical and other data repositories and make informed decisions. In the future, we will see rapid, widespread implementation and use of Big Data analytics across health care organizations and the health care industry. To that end, the challenges highlighted above must be addressed. As Big Data analytics becomes more mainstream, issues such as guaranteeing privacy, safeguarding security, establishing standards and governance, and continually improving the tools and technologies will attract attention. Big Data applications in health care are at an early stage of development, but rapid advances in platforms and tools can accelerate their maturation.
REFERENCES
[1] Praveen Kumar, et al., "Efficient Capabilities of Processing of Big Data using Hadoop Map Reduce", International Journal of Advanced Research in Computer and Communication Engineering, Vol. 3, No. 6, 2014.
[2] Emad A. Mohammed, et al., "Applications of the MapReduce programming Framework to clinical Big Data analysis: current landscape and future trends", Big Data Mining, 2014.
[3] Arantxa Duque Barrachina and Aisling O'Driscoll, "A Big Data methodology for categorising technical support requests using Hadoop and Mahout", Journal of Big Data, 2014.
[4] K. Divya, N. Sadhasivam, "Secure Data Sharing in Cloud Environment Using Multi Authority Attribute Based Encryption", International Journal of Innovative Research in Computer and Communication Engineering, Vol. 2, No. 1, 2014.
[5] Sabia and Sheetal Kalra, "Applications of Big Data: Current Status and Future Scope", International Journal of Computer Applications, Vol. 3, No. 5, pp. 2319-2526, 2014.
[6] Hongsong Chen and Zhongchuan Fu, "Hadoop-Based Healthcare Information System Design and Wireless Security Communication Implementation", Hindawi Publishing Corporation Mobile Information Systems, 2015.
[7] Lidong Wang, Cheryl Ann Alexander, "Medical Applications and Healthcare Based on Cloud Computing", International Journal of Cloud Computing and Services Science (IJ-CLOSER), Vol. 2, No. 4, pp. 217-225, 2014.
[8] Priyanka K, Prof. Nagarathna Kulennavar, "A Survey on Big Data Analytics in Health Care", International Journal of Computer Science and Information Technologies, Vol. 5, No. 4, pp. 5865-5868, 2014.
[9] Aditi Bansal, Ankita Deshpande, Priyanka Ghare, Seema Dhikale, and Balaji Bodkhe, "Healthcare Data Analysis using Dynamic Slot Allocation in Hadoop", International Journal of Recent Technology and Engineering (IJRTE), Vol. 3, No. 5, pp. 2277-3878, 2014.
[10] A Technical Review on, "Protecting Big Data Protection Solutions for the Business Data Lake", White paper, 2015.
[11] D. Peter Augustine, "Leveraging Big Data Analytics and Hadoop in Developing India's Health care Services", International Journal of Computer Applications, Vol. 89, No. 16, 2014.
[12] Muni Kumar N, Manjula R., et al., "Role of Big Data Analytics in Rural Health Care - A Step Towards Svasth Bharath", International Journal of Computer Science and Information Technologies, Vol. 5, No. 6, pp. 7172-7178, 2014.
BIOGRAPHIES OF AUTHORS
D. Madhavi received the B.Tech degree in Information Technology from JNTU, Kakinada. She is currently an M.Tech Information Technology student at Aditya Institute of Technology and Management, Tekkali, India, an autonomous institute affiliated to JNTU, Kakinada. Her areas of interest are Data Mining and Network Security.
Dr. B. V. Ramana obtained his doctor's degree from Andhra University, India. He is currently working as Professor and Head of the Department of Information Technology, Aditya Institute of Technology and Management, India. He has published 18 papers in international journals and conferences.