TELKOMNIKA Indonesian Journal of Electrical Engineering
Vol.12, No.5, May 2014, pp. 3297 ~ 3302
DOI: http://dx.doi.org/10.11591/telkomnika.v12i5.4949
Received October 16, 2013; Revised November 27, 2013; Accepted December 16, 2013
Research on Operation-based Correlation in Personal Dataspace
Shuo Jiang, Jiajin Le, Yefeng Li
DongHua University, No.2999 North Renmin Road of Shanghai, 0086-21-62378595-11 of DongHua University
Corresponding author, e-mail: iamprotoss@163.com
Abstract
User operations are defined and the weight of each operation is expressed. The variable quantity of user behavior is computed from these weights. The 3-ary vector definition of data is expanded, and a data item in a personal dataspace is defined by a 4-ary vector. The correlation of data for a user is defined by weight: the current weight of a data item is defined by its initial weight and the variable quantity of user operations. A library dataspace model is designed, and the weights of data are verified with a sample in the library dataspace over ten days. The result shows that the correlation of data is very important and useful in a personal dataspace.
Keywords: personal dataspace, data item, correlation.
Copyright © 2014 Institute of Advanced Engineering and Science. All rights reserved.
1. Introduction
The rapid development of information technology and internet technology has brought huge volumes of data, a greater variety of data, and closer relationships among data [1]. These new features make data management difficult and pose a great challenge for data management researchers. A new theory of data management, named dataspace, was proposed to face this challenge [2]. A dataspace is a set of data related to a subject, and all the data in the dataspace can be controlled by that subject [3]. Dataspace is a focus of current research on data management technology, and there are many achievements, such as data models [4-6], indexing [7], data integration [8, 9] and prototype systems [10].
Dataspace was proposed to solve the data integration problem, but a dataspace has no strict schema, and its data are heterogeneous and stored in distributed data sources. Therefore, a personal dataspace must be able to judge the correlation of data for the user, and must be monitored to catch changes in correlation so that it can decide whether to keep each data item. The steps of judging correlation are shown in Figure 1.
Figure 1. Judging Correlation of Personal Dataspace
ISSN: 2302-4046 TELKOMNIKA Vol. 12, No. 5, May 2014: 3297 – 3302
This paper studies dataspace integration, which is the basis of building a dataspace. User operations and the weight of each operation are defined. An operation-based variable quantity is defined from the weights of operations. The 3-ary vector is examined and expanded: we propose a 4-ary vector, derived from the 3-ary vector, to describe a data item in a personal dataspace. This paper also defines the correlation of data by the weight in the 4-ary vector. A library dataspace model is designed, and an experiment based on this model verifies the operation-based correlation of data items by weight. The result of the experiment shows that the correlation of data items is very important for building a personal dataspace.
2. Personal Dataspace Correlation
A personal dataspace integrates data for a user, and every data item in it has a correlation with the user. Before saving a data item, the personal dataspace needs to compute the correlation between the data and the user, to ensure that the saved items are associated with the user and to keep useless data out. This ensures the value of the personal dataspace. The Corespace model, which is based on the Vertical data model, presents the idea of a core space [11]: it obtains a core space for the user by applying a threshold to data correlation, so the correlations of the data in this core space are very high. This indicates that the data in the core space are very important to the user, as shown in Figure 2.
Figure 2. Corespace of Personal Dataspace
Most correlation research for personal dataspaces, such as Corespace, assumes that a correlation evaluation method exists and builds on it without saying how correlation is evaluated. We show a way to evaluate correlation.
Definition 1: Dataspace S = {d_1, d_2, …, d_m}, where d is a data item in this dataspace and m is the number of data items.
The user operation set for dataspace S is A = {a_1, a_2, …, a_n}, where a is a single operation of the user and n is the number of user operations.
Definition 2: Each user operation produces a weight set V = {v_1, v_2, …, v_m}, where v is the weight of a data item for that operation. The number of items in V is the same as the number of items in S.
Definition 2 means that each user operation produces a weight for each data item in the dataspace.
Equation 1. The variable quantity after j operations for data item i in dataspace S is shown in Equation 1.
$$V_c = v_{i1} + v_{i2} + \cdots + v_{ij} = \sum_{i=1}^{j} V_i \tag{1}$$
In Equation 1, 1 ≤ i ≤ m and 1 ≤ j ≤ n. V_i is positive when the user acts on the data item, and V_i is negative when the user acts on other data.
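As a minimal sketch of Equation 1, the variable quantity for one data item can be accumulated from a list of operation targets. The function name `variable_quantity` and the per-operation weights `gain` and `penalty` are assumptions made for illustration; the paper does not state the numeric weight of a single operation.

```python
from typing import Iterable


def variable_quantity(targets: Iterable[str], item_id: str,
                      gain: float = 0.05, penalty: float = 0.02) -> float:
    """Sum signed per-operation weights for one data item (Equation 1).

    An operation on `item_id` contributes +gain; an operation on any other
    data item contributes -penalty. Both values are hypothetical.
    """
    return sum(gain if target == item_id else -penalty for target in targets)


# Five hypothetical operations, three of them on item "A".
operations = ["A", "A", "B", "A", "C"]
v_c = variable_quantity(operations, "A")
print(round(v_c, 2))
```

The sign convention follows the text: operations on the item raise its variable quantity, and operations on other data lower it.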
Definition 3: A 3-ary vector describes each attribute of data in a dataspace. The 3-ary vector is defined as (ObjectID, AttributeName, AttributeValue). ObjectID is the identification of the data object, AttributeName is a set which contains the names of all attributes, and AttributeValue is a set of the values of all attributes [12].
Definition 3 expresses data in a dataspace simply, but it does not show the relationship between data and user, so we expand Definition 3:
Definition 4: We take a 4-ary vector to describe data as (ObjectID, AttributeName, AttributeValue, Weight). ObjectID, AttributeName and AttributeValue are the same as in Definition 3. Weight is introduced to express correlation, and the weight changes when the user changes or when the data itself changes.
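One possible way to represent the 4-ary vector in code is a small record type. This is only a sketch: the class name `DataItem`, its field names, and the example attribute values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class DataItem:
    """4-ary vector (ObjectID, AttributeName, AttributeValue, Weight)."""
    object_id: str
    attribute_names: Tuple[str, ...]   # names of all attributes
    attribute_values: Tuple[str, ...]  # values, in the same order as the names
    weight: float                      # correlation of the item for the user

    def attributes(self) -> Dict[str, str]:
        """Pair each attribute name with its value."""
        return dict(zip(self.attribute_names, self.attribute_values))


# Hypothetical item; the attribute names and values are made up for the example.
item = DataItem("A", ("title", "year"), ("Article A", "2010"), weight=1.0)
print(item.attributes()["year"], item.weight)
```

Keeping the weight as a mutable field reflects the statement that the weight changes over time while the first three components stay as in Definition 3.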
Equation 2. The correlation of data for the user is shown in Equation 2.
$$W_d = W_0 + V_c = W_0 + \sum_{i=1}^{j} V_i \tag{2}$$
W_d is the current weight of the data, i.e. the data correlation. W_0 is the initial weight, which is given when the data first enters the personal dataspace. V_c is the variable quantity of user actions.
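The relation between the current weight, the initial weight and the variable quantity can be sketched in a few lines. The function name `current_weight` and the sample operation weights are assumptions for illustration only.

```python
from typing import Sequence


def current_weight(initial_weight: float, operation_weights: Sequence[float]) -> float:
    """Return W_d = W_0 + V_c, where V_c is the sum of the operation weights."""
    variable_quantity = sum(operation_weights)   # V_c, the variable quantity
    return initial_weight + variable_quantity    # W_d, the current weight


# Hypothetical signed weights of three operations, starting from W_0 = 1.
w_d = current_weight(1.0, [0.1, -0.05, 0.1])
print(round(w_d, 2))
```

With no operations the current weight equals the initial weight, which matches the reading that W_0 is the weight assigned when the data first enters the personal dataspace.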
Data are saved according to a weight threshold that has already been set in the personal dataspace, and a data item is important to the user if its weight is over the correlation threshold. A change in the user or in the data changes the weight between the user and the data. We remove the data whose weight is lower than the threshold to keep the correlation in the user's personal dataspace high. This is shown in Figure 3.
Figure 3. Data Flow for Personal Dataspace
We give a data item an initial weight when we first extract information from the data sources, and take the data item into the personal dataspace when its weight is over the threshold. A monitor watches the changes of all data items in the personal dataspace and judges whether each data item is useful or useless.
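The monitoring step can be sketched as a split of the items by threshold. The function name `partition_by_threshold`, the sample weights and the sample threshold are hypothetical; the treatment of a weight exactly equal to the threshold is also an assumption, since the text only covers weights over or lower than the threshold.

```python
from typing import Dict, Tuple


def partition_by_threshold(weights: Dict[str, float],
                           threshold: float) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Split items into (kept, removed) by comparing each weight to the threshold.

    Assumption: an item whose weight equals the threshold is kept.
    """
    kept = {item: w for item, w in weights.items() if w >= threshold}
    removed = {item: w for item, w in weights.items() if w < threshold}
    return kept, removed


# Hypothetical current weights for two items and a hypothetical threshold.
kept, removed = partition_by_threshold({"A": 1.15, "B": 0.6}, threshold=0.8)
print(sorted(kept), sorted(removed))
```

A monitor in this style would rerun the split whenever weights change, so items that fall below the threshold leave the personal dataspace.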
3. Research Method
We use some articles in the library as test data, and use an Oracle 10g database as the experimental data storage environment. The library dataspace model is shown in Figure 4.
Figure 4. Library Dataspace Model
Figure 4 shows a dataspace model for a library. The library dataspace extracts information from articles to create a large dataspace. When a user performs operations on the personal dataspace, the personal dataspace accesses the library dataspace to fetch information for the user. Dataspace management records the user's actions to support user behavior analysis or to compute the weight of a data item.
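A minimal sketch of such action recording is given below. The class `ActionLog` and its methods are hypothetical names, and the logged session is invented for the example; the paper does not describe how actions are stored.

```python
from collections import Counter
from typing import List, Tuple


class ActionLog:
    """Record which data item each user operation touched."""

    def __init__(self) -> None:
        self._targets: List[str] = []

    def record(self, target_id: str) -> None:
        """Append one operation on the item identified by `target_id`."""
        self._targets.append(target_id)

    def counts_for(self, item_id: str) -> Tuple[int, int]:
        """Return (operations on `item_id`, operations on all other items)."""
        on_item = Counter(self._targets)[item_id]
        return on_item, len(self._targets) - on_item


log = ActionLog()
for target in ["A", "B", "A", "A", "C"]:   # hypothetical session
    log.record(target)
print(log.counts_for("A"))
```

The two counts are the inputs a weight computation would need: operations on the item count positively and operations on other data count negatively.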
4. Results and Analysis
We select article A, published in 2010, from the library dataspace as a sample. We choose a reader, build a personal dataspace for this reader, and record the reader's behavior to compute the variable quantity of the reader's operations for correlation on the first day. Let the initial weight be 1. The operations on the first day are shown in Figure 5.
Figure 5. Variable Quantity of Operation for Article A on the First Day
The reader accesses the library 14 times on the first day. Figure 5 shows that 9 of the reader's operations are on article A and 5 are on other data. The variable quantity of the reader's operations is computed to be 0.15. Therefore, by Equation 2, the weight resulting from the reader's behavior on the first day is 1.15.
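Written out with the values reported above (initial weight $W_0 = 1$ and variable quantity $V_c = 0.15$), the first-day weight follows from $W_d = W_0 + V_c$:

```latex
W_d = W_0 + V_c = 1 + 0.15 = 1.15
```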
We evaluate the correlation of article A for the reader's personal dataspace using the weight of article A and the record of the reader's accesses to article A over ten days. Let the initial weight be 1 and the weight threshold be 0.8. We compute the weight for every day by Equation 2, and the result is shown in Figure 6.
Figure 6. Weights of Article A for a Reader in Ten Days
Figure 6 shows that this reader accessed article A many times from the first day to the second day, but did not read article A from the second day to the fourth day and researched other data instead. The reader accessed article A again from the fourth day to the seventh day, and never touched article A after the eighth day. The reader did not even access the library from the eighth day to the ninth day, so the weight of article A for this reader did not change in that period. Figure 6 shows that article A is very important to this reader throughout the ten days, and article A must remain in this user's personal dataspace during the ten days.
5. Conclusion
The new features of data have motivated research on dataspace, and there have already been many achievements in several aspects of dataspace. In this paper, we studied dataspace integration and pointed out the necessity of correlation. User behavior was defined, and an operation-based variable quantity was expressed by the weights of operations. The 3-ary vector was expanded, and a 4-ary vector derived from it was proposed to describe a data item and the correlation of data. A library dataspace model was designed, and we verified the correlation with this model. The result shows that the correlation of data is very important and useful in a personal dataspace.
References
[1] Meng XF. From Database to Dataspace, From Enterprise to People. School of Information, Renmin University of China. 2006.
[2] Franklin M, Halevy A, Maier D. From databases to dataspaces: A new abstraction for information management. ACM SIGMOD Record. 2005; 34(4): 27-33.
[3] Jones W, Bruce H. A report on the NSF-sponsored workshop on personal information management. Seattle. 2005.
[4] Dong X, Halevy A. A platform for personal information management and integration. Proceedings of the 2nd Conference on Innovative Data Systems Research. Asilomar. 2005: 119-130.
[5] Dittrich JP, Antonio M. iDM: A unified and versatile data model for personal dataspace management. Proceedings of the 32nd Int’l conference on Very Large Data Bases. New York. 2006: 367-378.
[6] Karger DR, Bakshi K, Huynh D, Quan D, Sinha V. Haystack: A customizable general-purpose information management tool for end users of semistructured data. Proceedings of the 2nd Conference on Innovative Data Systems Research. Asilomar. 2005: 13-26.
[7] Dong X, Halevy A. Indexing dataspaces. Proceedings of the 27th Int’l Conference on Management of Data. New York. 2007: 43-54.
[8] Dong X, Halevy A, Yu C. Data Integration with Uncertainties. The International Journal on Very Large Data Bases. 2007; 18(2): 469-500.
[9] Sarma AD, Dong X, Halevy A. Bootstrapping Pay-as-you-go Data Integration Systems. Proceedings of the 2008 ACM SIGMOD international conference on Management of data. New York. 2008: 861-874.
[10] Blunschi L, Dittrich JP, Girard OR, Karakashian SK, Salles AV. A dataspace odyssey: The iMeMex personal dataspace management system. Proceedings of the 3rd Conference on Innovative Data Systems Research. Asilomar. 2007: 114-119.
[11] Li YK, Meng XF. Exploring Personal Core Space for Data Space Management. Proceedings of the 2009 Fifth International Conference on Semantics, Knowledge and Grid. Zhuhai. 2009: 168-175.
[12] Li YK, Meng XF. Research on personal dataspace management. Proceedings of the 2nd SIGMOD PhD workshop on Innovative database research. Vancouver. 2008: 7-12.