TELKOM
NIKA Indonesia
n
Journal of
Electrical En
gineering
Vol.12, No.5, May 2014, pp
. 4030 ~ 40
3
7
DOI: http://dx.doi.org/10.11591/telkomni
ka.v12i5.5100
4030
Re
cei
v
ed
No
vem
ber 2, 20
13; Re
vised
De
cem
ber 2
7
,
2013; Accep
t
ed Jan
uary 1
6
, 2014
Resear
ch of Probability Symmetric Allocation Storage
in Distributed Storage System
Shengrong Lu*
1
, Hanming Li
2
Schoo
l of Math
ematics an
d C
o
mput
er Sci
e
n
c
e, Long
ya
n U
n
iversit
y
,
Don
g
x
ia
o Rd. N, Xin
l
u
o
Distri
c
tLong
ya
n, F
u
ji
an, Chi
na, 36
4
012, +
86-0
5
9
7
-
279
37
78
*Corres
p
o
ndi
n
g
author, e-ma
i
l
: strun@live.c
n
1
, lhm399
@q
q.com
2
A
b
st
r
a
ct
T
he goa
l of o
p
timal a
l
l
o
catio
n
is to incr
eas
e the store
d
d
a
ta ava
ila
bil
i
ty subj
ect to mini
mi
z
e
t
h
e
storage
bu
dget
. T
he symmetric all
o
cati
on
b
a
sed
on th
e n
e
tw
ork codin
g
is prov
ed to
b
e
opti
m
al w
i
th
out
consi
deri
n
g
th
e n
o
d
e
s av
ai
la
bility
in
distri
bu
ted stor
a
g
e
sy
stem. B
e
caus
e
of n
e
tw
ork co
nditi
ons
an
d
n
ode
inh
e
rent
prop
e
r
ty, each no
de
has d
i
fferent
avail
a
b
ilit
y. Thi
s
pap
er focus
e
s on the
opti
m
i
z
a
t
i
on
distrib
u
ted
data storag
e p
r
obl
em w
i
th no
des ava
ila
bil
i
ty. Bas
ed on pr
o
bab
ility
mod
e
l
of st
orage system, w
e
re-defi
n
e
the symmetric
alloc
a
tio
n
as
the pro
b
a
b
il
ity symmetr
ic a
llocati
on,
a
nd prop
osed pro
b
abil
i
ty
symmetric
alloc
a
tio
n
mo
d
e
l
and
strateg
y
w
h
ich
are
pr
oved
to
be
op
tima
l i
n
th
e g
e
nera
l
co
nditi
on
bas
ed
on
SV
M.
Co
mp
arin
g to the symmetric alloc
a
tio
n
pro
p
o
sed by
L
e
o
n
g
D. et al.,
T
he prop
osed
prob
abil
i
ty symmetr
ic
a
l
l
o
ca
ti
on
sche
m
e
imp
r
o
v
e
s
th
e
d
a
t
a
a
v
ail
a
b
i
l
i
t
y, a
n
d
i
s
m
o
re
p
r
a
c
ti
cal
me
th
od
fo
r di
stri
b
u
t
ed
sto
r
age
system
.
Ke
y
w
ords
: dis
t
ributed stora
g
e
all
o
catio
n
, ne
tw
ork coding, p
r
oba
bil
i
ty symmetric
Copy
right
©
2014 In
stitu
t
e o
f
Ad
van
ced
En
g
i
n
eerin
g and
Scien
ce. All
rig
h
t
s reser
ve
d
.
1. Introduc
tion
The e
n
viron
m
ent in the
netwo
rk,
se
n
s
or
net
work,
data cente
r
netwo
rk ba
se
d on the
data cente
r
, data
di
strib
u
ted stor
a
ge
secu
rity and
e
fficiency i
s
th
e focus of
current resea
r
ch.
Data
ba
ckup
method
be
ca
use
the
traditi
onal
co
st m
o
re; data
syn
c
h
r
oni
zation
diff
icultie
s
a
r
e
n
o
t
suitabl
e for d
i
stribute
d
n
e
twork sto
r
ag
e
syste
m
with
high
reli
abili
ty. Netwo
r
k
codi
ng [1]
da
ta
stora
ge to
p
r
ovide di
strib
u
t
ed net
work
stora
ge
of
hi
gh reliability, low cost
ba
sed,
wid
e
spread
con
c
e
r
n
and
resea
r
ch. In t
he di
strib
u
ted
data
storage
in [2] ba
se
d
on n
e
two
r
k
codi
ng, o
r
igin
al
data is
extended
co
de i
n
to a plu
r
alit
y of enco
d
e
d
data bl
ocks, and th
e coded
data bl
ock
allocated
to d
i
fferent
data
stora
ge nod
e
s
.
When
users a
c
ce
ss
dat
a, use t
he da
ta
re
ceive no
de
from the data
stora
ge no
d
e
obtainin
g
e
n
co
ding d
a
ta
blocks co
rrespondi
ng to th
e [3], restore
the
origin
al data by calculating
.
The use of cod
ed dat
a bl
ock allocation
strategy is th
e key pro
b
le
m
in distrib
u
ted
data sto
r
age
based on n
e
twork codin
g
.
In order to
so
lve the all
o
ca
tion, co
ded
d
a
ta
blo
c
k of
L
eong
D., A.
G. Dim
a
ki
s
p
r
opo
se
d
the unifo
rm code bl
ock all
o
catio
n
mod
e
l
of [4
], the enco
d
ing
of the data bl
ock
amount
assig
n
e
d
to each
storage node, and the model i
s
proved to
be optimal, whi
c
h ensures t
he availability of
data, data all
o
catio
n
overh
ead is mi
nim
a
l. In furt
her
study, Leon
g
D., A. G. Dimaki
s et al. set
th
e
same failu
re
rate for data storage
node,
namely t
he d
a
ta re
ceive n
ode with the
same
pro
babi
lity
of access dat
a storage n
o
de to retrieve
a data blo
ck.
At this time,
uniform all
o
cation strategy
is
still an optim
al allocatio
n
strat
egy. Because the storage nod
e itself
has a ce
rtain failure ra
te,
stora
ge
nod
e
s
of diffe
rent
types, different time
a
n
d
the failu
re
rate of differe
nt use of th
e
environ
ment. In con
s
ide
r
in
g the nod
e availability, Leo
ng D., unifo
rm cod
e
blo
c
k allocatio
n
m
odel
prop
osed by
Dima
kis A.
G. for lack of
con
s
id
er
in
g
the availabilit
y of storag
e
node
s, or if t
h
e
same n
ode fa
ilure rate, rather than the
optimal mod
e
.
In reality, inh
e
rent
prope
rties
of the
st
o
r
age
no
de
n
e
twork enviro
n
ment [11]
a
nd the
node
itself
d
e
termin
es th
e data
re
ceiv
er a
c
ce
ss
storag
e n
ode
s have
a
cert
ain p
r
ob
abilit
y of
failure. Fo
r e
x
ample, network
com
m
uni
cation lin
k fail
ure, the no
de
itself online
stora
ge du
rat
i
on,
hacke
r attack cau
s
e
s
the
data re
ceive
r
can
not
a
c
ce
ss
data
stora
ge no
de. At this time, dat
a
storage nodes with a ce
rtai
n probability are the data
receiver
access node ava
ilability, i.e...
This pap
er fo
cu
sed
on
the
con
s
id
eratio
n
of
di
stribute
d
data sto
r
age
optimizatio
n storage
node
availa
bil
i
ty in case of.
Thro
ugh
the
method
of p
r
obability the
o
r
y, esta
blish t
he
storage
no
de
Evaluation Warning : The document was created with Spire.PDF for Python.
TELKOM
NIKA
ISSN:
2302-4
046
Re
sea
r
ch of Proba
bility Symm
etric Allocation St
orag
e
in Distrib
u
ted
Storage…
(Sheng
ron
g
Lu
)
4031
prob
ability distributio
n ba
sed on data m
odel. Bas
ed
on the re
defi
ned unifo
rm
distrib
u
tion m
odel,
the pro
babilit
y of network
codi
ng ba
se
d
on unifo
rm distrib
u
tion
m
odel,
present
s data
storag
e
probability di
stribution
strategi
es and
methods, an
d dem
onstrat
e
s the pr
oposed method is
optimal. Com
pare
d
with
Le
ong
D., A. G. Dima
ki
s
pro
posed a
unifo
rm di
strib
u
tio
n
mod
e
l, mo
del
and the meth
od pro
p
o
s
ed
in this pape
r con
s
id
ers the
node failure
rate, improve
the availability
of data storag
e, distribute
d
stora
ge sy
ste
m
accords
with the actual.
2. Rese
arch
Metho
d
2.1. Relate
d Work
In distri
buted
storage
syst
ems, n
e
two
r
k c
odi
ng i
s
a
comm
on
codi
ng mo
de, in
orde
r t
o
improve th
e
data availabili
ty and relia
bil
i
ty. The curre
n
t aca
demi
c
resea
r
ch an
d
discu
ss
ho
w to
utilize the da
ta redun
dan
cy, to improve the reli
abilit
y of data. Bu
ild redu
nda
nt data inclu
d
i
ng
simple
b
a
cku
p
, erasure
co
ding
and
net
work codi
ng
method.
Co
mpared to
th
e si
mple
ba
ckup
mode,
in
the same sto
r
ag
e
overh
ead, era
s
u
r
e c
ode
s can p
r
ovid
e highe
r d
a
ta
reliability [5]. At
the same tim
e
, network co
ding is ap
plie
d to dist
ribute
d
stora
ge, ca
n balan
ce the
storag
e spa
c
e
and ba
nd
widt
h. Distrib
u
ted
data storage
based
on n
e
twork codin
g
, is a potential
way.
With diffe
rent
me
cha
n
ism,
for
a
simpl
e
ba
ckup,
dat
a sou
r
ce o
n
l
y
need
s to b
e
fully
repli
c
ated
ori
g
inal data
st
ored i
n
the d
a
ta st
orage
n
ode. Fo
r the
distri
b
u
ted storag
e
ba
se
d on
netwo
rk codi
ng, data
sto
r
age n
ode
s re
quire
s a
data
block re
ceiv
ed
code
d d
a
ta, storage
sp
ace,
data ban
dwi
d
th usa
ge a
r
e better th
a
n
the simp
l
e
backup [3].
Based o
n
the MDS (n, K)
distrib
u
ted d
a
ta storage
method codi
ng ca
n e
ffectively reduce
redun
dant d
a
ta, balan
ce
the
stora
ge
spa
c
e and n
e
two
r
k ba
nd
width, can
effect
ivel
y redu
ce the
data di
stributi
on, storage
a
nd
acce
ss the u
s
e of stora
ge spac
e and n
e
twork ba
nd
wid
t
h, improv
e the reliability of [6] data.
Distri
buted
st
orag
e meth
o
d
ba
sed
on
n
e
twork
co
din
g
ha
s
re
ceive
d
wid
e
spread
attention
in acade
mic
circle
s, the [1
-3] di
stribute
d
data e
n
c
odin
g
and
sto
r
ag
e metho
d
of
massive. But
the
distrib
u
tion
of co
de
blo
c
ks of data
rese
arch,
still less. Th
e tra
d
itional m
e
thod
is to
ea
ch
da
ta
block
co
de
st
ored
in
ea
ch
stora
g
e
nod
e
,
in fa
ct is
the
assig
n
ment
of [4, 7] d
a
ta
stora
g
e
meth
od
of equivalent.
In fact, in distributed d
a
ta storage, ba
se
d
on availabili
ty, network st
orag
e nod
e topolo
g
y
environment
can improve
the efficiency
and usab
ility
of the data
storage block allocation code
data. The
research wo
rk in this area i
s
still relatively
sma
ll. The mai
n
work
concentrates on two
asp
e
ct
s: 1) T
he data
distri
bution meth
o
d
ba
sed
on
e
fficiency; 2
)
Data all
o
catio
n
method
ba
sed
on usability.
In the data
distri
bution
method
b
a
se
d
on
efficien
cy, MSR (mi
n
imum
-sto
rag
e
rege
ne
rating,
minimum m
e
mory re
gen
eration
co
d
e
), is to reduce the data storage
spa
c
e
,
improve th
e repairi
ng effici
ency d
a
ta co
ding mo
de
s
o
f
distributio
n
as the ta
rget.
MBR (mini
m
um-
band
width
re
gene
rating,
m
i
nimum
ban
d
w
idth
reg
ene
rating
cod
e
) i
s
co
nsi
dered f
r
om the
an
gle
of
cha
nnel
ban
d
w
idth, d
a
ta
coding
to redu
ce th
e lin
k
ba
ndwi
d
th allo
cation
strategy
for ta
rget,
se
e
[8, 9]. Tree type data allo
cation method
base
d
on th
e [10] Jun p
r
opo
sed by Li
et al., this paper
put forwa
r
d a
data allocati
on strate
gie
s
from the
perspe
c
tive of band
width, in orde
r to improve
the repai
r efficien
cy of data.
In the data di
stributio
n met
hod ba
se
d o
n
av
ailability, Leong
D. an
d Dima
ki
s A.G. gives
the definition
of uniform
distrib
u
tion,
and p
r
op
ose
s
the [4, 7]
stora
ge
strat
egy in di
strib
u
ted
stora
ge
syste
m
with
uniform dist
ribution
.
Unifor
m di
stribution
st
rate
gy ba
sed
on t
he ide
a
l m
o
d
e
l,
without
con
s
i
derin
g the
storag
e spa
c
e,
stora
ge
n
o
d
e
failure
pro
bability, the numbe
r of e
a
ch
stora
ge no
de
storag
e eq
ual data. In
the ideal
mo
del, the author prove
s
th
at the uniform
distrib
u
tion i
s
optimal.
Dat
a
re
ceivin
g n
ode
ac
ce
ss
data
stora
g
e
nod
es for th
e data
blo
ck to
resto
r
e the
origin
al data
is com
p
lete
ly succe
ssful
. In consi
dering the wo
rking nod
e to its
efficien
cy, Le
ong
D. p
r
op
o
s
ed
the
data
dist
ribut
ion
method
of th
e proba
bility of [7, 12]. T
h
is
method i
s
that the pro
bab
ility of failure for all
sto
r
ag
e node
s i
s
the sam
e
; the essen
c
e of t
h
i
s
method with t
he method of
literature [4] is in fact
co
n
s
iste
nt. The a
u
thor p
r
ove
s
that the method
is only the premise of the
same n
ode fa
ilure p
r
ob
abili
ty is the optimal.
The advanta
g
e
s of the two method
s is th
e stor
a
ge allo
cation meth
o
d
is simpl
e
; the main
disa
dvantag
e
is the lack o
f
data distrib
u
tion nod
e
t
o
con
s
ide
r
it
s
own
cha
r
a
c
t
e
rist
ic
s,
net
wo
rk
environ
ment,
does n
o
t meet the need
s of pra
c
ti
cal
application
s
, in these me
thods, a
s
sum
i
ng
ideal conditio
n
s.
Evaluation Warning : The document was created with Spire.PDF for Python.
ISSN: 23
02-4
046
TELKOM
NI
KA
Vol. 12, No. 5, May 2014: 4030 – 40
37
4032
In reality, the
individual
st
orag
e n
ode
d
e
termin
es th
e data vi
sitors
can
not a
c
cess d
a
ta
stora
ge no
de
s, all with the same p
r
o
babi
lity,
failure probability and
the pre
s
ent st
udy did not pay
attention to t
he d
a
ta
storage
nod
e. Data sto
r
a
ge
node
failure
prob
ability of
differe
nt directly
affects the
reliability of recove
ring th
e origi
nal da
ta and the p
r
oba
bility of system, unif
o
rm
distrib
u
tion
st
rategy i
s
no
l
onge
r o
p
tima
l alloca
tion
st
rategy. Syste
m
relia
bility is an
impo
rta
n
t
goal of distri
buted sto
r
ag
e syst
em, is an importan
t
problem of
distribute
d
stora
ge sy
stem.
Method
of all
o
catin
g
fo
cusing
codin
g
d
a
t
a from g
ene
ral blo
ck,
co
nsiderin
g the
failure
pro
babilit
y
of sto
r
age
n
ode
s in
the
data di
strib
u
tion, a
c
co
rdi
n
g to the
sto
r
age
nod
e fail
ure
proba
bility of
different sto
r
a
ge nod
e data
distrib
u
tion m
e
thod.
2.2. Probabilit
y
Distributi
on Model Based on Net
w
ork Coding
Distri
buted storage system
s, storage nodes
are
different in different failure probability.
Uniform data
distrib
u
tion m
e
thod of th
e tradition
al
is n
o
long
er
opti
m
al. Data
distribution m
e
th
od
is mo
re fea
s
i
b
le to assig
n
a different n
u
m
ber of d
a
ta
as the d
a
ta storage
nod
e according to t
h
e
node availa
bi
lity of differen
t. Based on consi
deri
ng th
e stora
ge no
de availability, the probabili
ty
of network
co
ding ba
se
d o
n
uniform di
st
ribution m
ode
l.
2.2.1. Establish the
H
y
pothesis and M
odel
Distri
buted network storage
sy
st
em is stu
d
ie
d in
this pa
pe
r by th
e
data
sou
r
ce
node,
M
data sto
r
age
node
s an
d da
ta receivin
g n
ode
s.
The use
of M
D
S
(
n
,
k
) [2,
3] en
codi
ng,
encodin
g
o
r
ig
inal d
a
ta to
b
e
code
d d
a
ta
blo
ck.
May be the o
r
iginal
data u
n
it size, whi
c
h is divid
ed i
n
to k bl
ocks,
F
1
,
…
,
F
k
use become
n bl
ock
codi
ng m
a
trix cod
e
,
B
1
,
…
,
B
n
These blo
c
ks,
F
1
,
…
,
F
k
the
linea
r co
mbination
of The coeffici
e
n
t
vec
t
or
Bi is
(a
i
1
,
…
,
a
ik
)
T
, so we can get:
11
1
1
1
1
k
nn
k
k
n
aL
a
F
B
MO
M
M
M
aL
a
F
B
Data allo
cati
on assign
s the
B
i to data stora
ge no
de is differen
t, the code allocatio
n
scheme a
s
shown in Figu
re 1.
Figure 1. Dat
a
Allocation
Cod
e
Base
d on the MDS (
n
,
k
)
The sou
r
ce d
a
ta obje
c
t D f
o
r sto
r
e
d
dat
a. Thro
ugh M
D
S (
n
,
k
) d
a
ta of D
codi
ng
method,
codi
ng
and
d
a
ta blo
c
k allo
cated
to e
a
ch sto
r
ag
e n
o
de. Assign
ed
to ea
ch
sto
r
age
nod
e dat
a is
r
e
pr
es
e
n
t
ed
a
s
x
i
, 1
≤
i
≤
m
Availability of storage node is the
probability of nodes
are avail
a
ble,
and th
e n
ode
by man
u
fact
ure
r
s, te
ch
nol
ogy, ope
ra
tio
n
time, op
era
t
ion environm
ent an
d
so
o
n
,
can be obtained by a node sampli
ng. The availab
ility of each
storage node
is dif
f
erent, but in
a
relatively sho
r
t pe
riod
of time is fixed.
The p
r
ob
abili
ty of failure d
a
ta sto
r
ag
e n
ode
x
i
is
F
xi
, the
availability probability is
p
i
=
1
-F
x
i
Evaluation Warning : The document was created with Spire.PDF for Python.
TELKOM
NIKA
ISSN:
2302-4
046
Re
sea
r
ch of Proba
bility Symm
etric Allocation St
orag
e
in Distrib
u
ted
Storage…
(Sheng
ron
g
Lu
)
4033
2.2.2. Uniform Distributi
on of Probabilit
y
Model
This paper propose
s to use for data distribution node
availability st
rategy based on the
proposed m
o
del, probability distribu
tion.
Probability di
stribution m
o
del
i
s
a
generalization
of the
traditional
uni
form di
stri
buti
on, the
di
strib
u
ted
st
orage
system
con
s
i
derin
g n
ode
f
a
ilure
rate. T
h
e
allocation mo
del as follo
ws:
{}
1
{
,
...
,
}
1
{
(
)
,
...,
(
)
}
1
m
Xx
ii
xx
m
fp
fp
m
(
1
)
11
()
mm
ii
ii
x
fp
T
(
2
)
(
)
(
)
,
,
[0
,
1
],
ij
i
j
i
j
fp
f
p
p
p
p
p
(
3
)
Us
e
1
{}
m
ii
Xx
to repre
s
ent the
data
distrib
u
tion,
di
strib
u
tion fu
nction f (pi) i
s
assign
ed
to the i enco
ded d
a
ta nod
e,
f
(
p
i
) is
def
ined a
s
the i
n
terval [0, 1] one-way in
creasi
ng fun
c
tion,
cha
nge the n
ode
p
i
availability.
T is
storage sp
ace for data sto
r
age
node.
The
use of
d
a
ta di
strib
u
tion mo
del i
s
p
r
opo
se
d in
t
h
is
pap
er,
a
data bl
ock fo
r hi
gh-
availability no
de allo
catio
n
more, m
a
ke the data
re
cei
v
er can bl
ock acce
ss to hi
gher reli
abilit
y,
with a higher
probability of re
coveri
ng the original dat
a object.
2.2.3. Data
Recov
e
r
y
Method
Data receivin
g nod
e can
succe
ssfully recove
r
the o
r
iginal data, th
e acce
ss to the data
stora
ge n
ode
acq
u
ire
s
d
a
ta is n
o
t less t
han the
origi
n
al data. The
origin
al data
unit si
ze, na
mely
the data re
cei
v
e node a
c
ce
ss d
a
ta qua
ntity of not less than 1.
Assu
ming th
at the data receiv
e n
ode
su
ccessfully resto
r
e
d
the origin
al data
need to
acce
ss
r d
a
ta sto
r
ag
e no
de,
r
sub
s
et,
use
|r
|
to re
pre
s
ent th
e
R sub
s
et of
the num
ber
of
element
s, na
mely the storage nod
e nu
mber. S wa
s
defined a
s
the succe
s
sful recovery of data
events.
1
[1
]
m
i
ri
r
P(
S)
P(
r
)
I
x
Among them,
1
ij
ir
jM
\
r
P(
r
)
p
(
p
)
Is
true
I
[
G
]
=1
, G is
fals
e
I
[
G
]
=0
.
Distri
buted
st
orag
e
syste
m
of the
m
data sto
r
a
g
e
node,
com
p
ose
d
of a
pl
urality of
r
su
bset,
M\r
repre
s
e
n
ts a r sub
s
et of a M set. The set was defin
e
d
r sub
s
et of all the succe
s
sful
repai
r of th
e
set
S
r
, said t
o
be
able
to
set a
su
bset
of su
cce
ssful
re
cove
ry,
|S
r
|
is the
numb
e
r of
element
s in t
he collectio
n
of su
cces
sful
repai
r that is
comp
osed of
m
data node
s
r
the num
be
r o
f
sub
s
et
s.
3. Results a
nd Analy
s
is
This sectio
n
based o
n
the stora
g
e
node
avail
ability, analysis u
s
ing p
r
obability
distrib
u
tion m
odel, p
r
ob
abil
i
stic d
a
ta receive nod
e
su
ccessful re
co
very,
dat
a av
ailability. At the
same
time, th
is a
r
ticle i
n
th
e presen
ce of
node fa
il
ures, data availab
ility, uniform distrib
u
tion a
n
d
prob
ability distribution eval
uation with p
r
obability form
.
Evaluation Warning : The document was created with Spire.PDF for Python.
ISSN: 23
02-4
046
TELKOM
NI
KA
Vol. 12, No. 5, May 2014: 4030 – 40
37
4034
3.1. Data
Availabilit
y
For di
strib
u
te
d memo
ry sy
stem
s in the
pre
s
en
ce
of n
ode avail
ability, data storag
e nod
e
exists failure rate, so the d
a
ta receive node su
cce
ssf
ully recove
r the pro
bability of original d
a
ta
objects is not the only value. For the distributed
dat
a storage, the reliabilit
y of the system is a
distrib
u
ted sy
stem is
the m
o
st impo
rtant goal.
For the dist
ri
buted sto
r
ag
e system, the
use
of prob
a
b
ility distributi
on model, the
amount
of data the
m
data stora
ge nod
es to
store i
s
diff
erent, a
c
cord
ing to the n
ode availa
bili
ty
cha
nge
s, therefore, a num
ber
of ea
ch subset of elements
of su
cce
ss i
s
un
certai
n,
[1
,
]
m
r
, as
long as the
subset of the
node
s on the
amount of data of
1
i
ir
x
can
meet. In distributed
storage sy
stems, data avai
lability in each subset
of successful
recovery,
data availability of each
r
sub
s
et of
r
concentratio
n
prob
ability of all node
s is a
v
ailable.
Theo
rem 1: The
exi
s
ten
c
e
of stora
ge node
fa
ilure rate of di
strib
u
ted
stora
ge
system,
data availabili
ty for (can recover data probability):
1
mi
n
,
1
(
)
m
rT
Pr
m
r
Proof: Th
e d
a
ta re
ceive
n
ode
ca
n be
succe
ssfully
restored th
e o
r
iginal
data, t
he d
a
ta
s
i
ze mus
t
meet the following c
o
nditions
:
()
1
ii
ir
ir
xf
p
S
r
said t
o
set
suc
c
e
s
sf
ul s
ubs
et
,
whic
h co
n
s
i
s
ts of a sub
s
et of the set of all
r
.
|S
r
|
is the
numbe
r of th
e eleme
n
ts of
su
ccess in t
he collectio
n.
S
was
defin
ed a
s
the su
ccessful reco
very
of data events,
A
i
is exactl
y
i
nodes are acce
ss to thin
gs.
1
1
\
1
()
(
|
)
(
)
(
1
)
(
)
m
ii
r
m
r
ij
r
ir
j
M
r
m
r
r
PS
P
S
A
P
A
S
p
p
m
r
S
Pr
m
r
For ea
ch
sub
s
et of
r
, can b
e
rep
r
e
s
ente
d
by inequaliti
e
s:
12
...
it
t
t
r
ir
r
nod
e
s
x
xx
x
All the el
eme
n
ts in
the
S
r
are
a su
cce
s
sful re
covery of
a sub
s
et
of
S
r
, ca
n b
e
e
x
presse
d
by the followi
ng ineq
ualitie
s:
12
12
...
1
...
1
rr
r
tt
t
r
SS
S
r
xx
x
xx
x
With ineq
uali
t
ies,
11
...
mm
r
ax
a
x
S
, becau
se ea
ch
storage no
de b
e
long
s to a
sub
s
et of R, different
1
0
1
i
m
a
r
, and on the type:
Evaluation Warning : The document was created with Spire.PDF for Python.
TELKOM
NIKA
ISSN:
2302-4
046
Re
sea
r
ch of Proba
bility Symm
etric Allocation St
orag
e
in Distrib
u
ted
Storage…
(Sheng
ron
g
Lu
)
4035
11
1
...
1
1
1
1
rm
m
m
i
i
Sa
x
a
x
m
x
r
m
T
r
Bec
a
us
e
r
m
S
r
,
is
1
mi
n
,
1
r
mm
ST
rr
Therefore, th
e pre
s
en
ce of
reliability on dist
rib
u
ted st
orag
e syste
m
node availa
b
ility for:
1
||
()
m
i
n
,
1
(
)
m
rT
PS
P
r
m
r
System reliability in distri
but
ed
storage system
s, namely, data
receiving node to the
su
ccess prob
ability of reco
very
is determined by a ra
ndom vari
abl
e
p
i
m
.
3.2. Probabilit
y
Distributi
on Model an
d Distribution Model Comparison
Distri
buted
st
orag
e sy
stem
, data di
strib
u
tion strategy
is the main f
a
ctor
of affecting the
distrib
u
ted st
orag
e system
.
Unifor
m all
o
cation
strateg
y
for Di
ma
kis
A.G. is p
r
op
o
s
ed, th
e o
p
timal
allocation
strategy in the
ideal m
ode
l. For
the d
i
stribute
d
sto
r
age
sy
stem
storage
no
de
availability, the p
r
ob
ability distri
bution
model i
s
p
r
o
posed in
this pape
r, ba
se
d on th
e
storag
e
node failu
re
rate, the use
of distributio
n functi
on fo
r the allocatio
n
of stora
ge
node d
o
e
s
n
o
t
equal am
ount
s of data. In the storage n
ode is availa
ble, uniform
prob
ability model is eq
uivalent
to the unifo
rm distri
butio
n model
in li
terature [4], in all the
sto
r
age
nod
e a
v
ailability eq
ual,
uniform
di
stri
bution m
odel
pro
bability
model i
s
equ
ivalent to the
do
cume
nt [7]. The u
n
ifo
r
m
prob
ability model is mo
re
suitabl
e to the univers
al, uniform di
stri
bution case model is u
n
iform
distrib
u
tion m
odel proba
bili
ty.
Theorem 2: the presence of
distributed storage sy
stem
node availability, uniform
distrib
u
tion
o
f
uniform
di
stribution
mod
e
l definitio
n
model
is su
perio
r to
[4]
and
[7] in
t
h
e
probability of literature.
The proof of theorem 2 bef
ore, first ne
ed
a lemma.
Lemma
1: T
he minim
u
m
number
of
nodes with the least
num
ber of node probability
distrib
u
tion m
odel
s of su
ccessful
re
cove
ry using le
ss than unifo
rm d
i
stributio
n def
ined [4] and [7]
use a
su
ccessful re
cove
ry.
Proof: We u
s
e redu
ction a
d
absurd
u
m p
r
oof.
The mi
nimu
m num
ber of
nod
es to
e
v
enly dist
ri
bu
te the
succe
ssful
re
cove
ry of the
assumed probability usi
n
g
r
1
use the
a
ppro
p
ri
ate di
stributio
n
fun
c
tion, ma
ke
s the sto
r
ed
d
a
ta
stora
ge no
de
is
f
(
p
1
) >
f
(
p
2
) >
…
> f
(
p
r
1
).
Minimum n
u
m
ber
of nod
e
s
to evenly di
stribute
t
he succe
ssful re
covery
of
the use of
a
r
2
, memo
ry d
a
ta sto
r
a
ge
n
ode
wei
ght i
s
T
m
,
r
1
>r
2
。
The
numbe
r of su
ccessf
ul re
co
very
no
de
s
with the lea
s
t at the receiving nod
e, data
access to the amount of 1
:
1
1
12
12
2
12
1
(
)
(
)
...
(
)
1
(
)
(
)
...
(
)
0
r
r
i
ir
r
T
fp
fp
f
p
r
m
TT
fp
fp
fp
mm
Becau
s
e
of t
he
()
i
T
fp
m
, and
1
12
1
()
0
r
i
ir
r
fp
, so
can
not b
e
e
qual to
ze
ro,
and th
e
assumptio
n
o
f
cont
radi
ctio
n. The
r
e
are
r
1
<r
2
, minim
u
m num
be
r of
node
s
with th
e lea
s
t
numb
e
r
Evaluation Warning : The document was created with Spire.PDF for Python.
ISSN: 23
02-4
046
TELKOM
NI
KA
Vol. 12, No. 5, May 2014: 4030 – 40
37
4036
of nod
es i
s
uniform di
stribution
mo
del
su
ccessf
ully re
stored
usi
ng p
r
o
b
ability less t
han
su
ccessful re
covery u
s
ing
a uniform di
st
ribution.
No
w the pro
o
f
of theorem 2
Proof: Uniform distributio
n
model, all data st
ora
ge n
ode
s to store
the same a
m
ount of
data, witho
u
t
con
s
id
erin
g t
he em
pty no
des, the
amo
unt of data
of
all sto
r
ag
e n
ode
s to
store
is
T
m
, an ideal
situ
ation succe
ssfully resto
r
ed
the num
ber
o
f
node
s requi
res at lea
s
t
1/
Tm
mT
,
at this time,
1
rT
m
. Existing di
stributed
sto
r
a
ge sy
stem
n
ode avail
abili
ty, uniform di
stributio
n
model of data
availability for
()
m
m
r
T
Pr
.
The minimu
m
numbe
r of node
s with pro
bability
distrib
u
tion model
succe
ssfully re
store
d
to use fo
r
t
,
m
t
T
the avail
ability of dat
a, an
d
a unifo
rm
dist
ribution
of m
o
del d
a
ta avail
ability is
the differen
c
e
:
1
m
i
n
,
1
(
)
(
)
(
)
1
(
)
(
)
1
mm
m
rt
r
T
m
T
rt
mm
m
rt
r
T
rT
Pr
P
r
m
rT
Pr
m
rT
rT
Pr
Pr
mm
When
1
rT
m
,
1
()
0
m
T
rt
Pr
,
Proba
bility distributio
n, dat
a availability is
greate
r
tha
n
the unifo
rm
distrib
u
tion
of data
availability.
Whe
n
1
rT
m
, a uniform di
stributi
on of the mod
e
l failed to re
store th
e origi
nal data.
To sum up, the
proba
bility
distrib
u
tion
model
of d
a
t
a availability is
gre
a
ter than th
e
uniform di
stri
bution of dat
a availability, there
is a di
stribute
d
sto
r
age sy
stem
node avail
abi
lity,
prob
ability distribution st
rat
egy is bette
r t
han the unifo
rm allocatio
n
strategy.
4. Conclusio
n
Optimizatio
n
of
dist
ributed
data sto
r
ag
e
is
the
goa
l of en
su
ring
safe
sto
r
a
g
e
dat
a
recovery, to i
m
prove th
e reliability of da
ta stor
a
ge
system. With
out
con
s
id
erin
g the sto
r
ag
e no
de
availability, n
e
twork codin
g
data
ba
se
d
on u
n
iform
di
stributio
n i
s
shown to
be
o
p
timal. But d
u
e
to
node fail
ure
s
and oth
e
r fa
ctors, the
storage n
ode
ha
s different failu
re rate, ba
se
d on the
sto
r
age
node
proba
bi
lity model, d
e
fines the
un
iform di
st
rib
u
t
ion proba
bili
ty model, ma
de the
sto
r
a
ge
data pro
babili
ty distribution
strategie
s
a
nd met
hod
s,
and dem
on
strate the prop
ose
d
method
is
optimal. Com
pare
d
with
Le
ong
D., A. G. Dima
ki
s pro
posed a
unifo
rm di
strib
u
tio
n
mod
e
l, mo
del
and the met
hod propo
se
d in this pa
per con
s
id
e
r
s the availa
bility of nodes, improves th
e
effectivene
ss
of data stor
ag
e system, ha
s more reali
s
t
i
c sig
n
ifica
n
ce.
Ackn
o
w
l
e
dg
ements
The work is supp
orted
by the B class
Natu
ral Scien
c
e p
r
oje
c
t of Fujian
provin
ce
Educatio
n De
partme
n
t of China (JB12
2
0
8
).
Referen
ces
[1]
Dimakis A
g
, Pr
abh
akara
n
V,
Ramch
andr
an
K. De
centra
lize
d
Erasur
e C
o
d
e
s for Distri
but
ed N
e
t
w
orke
d
Storage.
IEEE Transactio
n
s o
n
Information T
heory
. 20
06; 5
2
(6): 2809-
28
16.
Evaluation Warning : The document was created with Spire.PDF for Python.
TELKOM
NIKA
ISSN:
2302-4
046
Re
sea
r
ch of Proba
bility Symm
etric Allocation Storag
e
in Distrib
u
ted
Storage…
(Sheng
ron
g
Lu
)
4037
[2]
Dimakis A
g
,
Ramch
andr
an
K, W
u
Y.
A Survey o
n
Netw
ork Co
des for D
i
stri
buted St
orag
e
.
Procee
din
g
s of
the IEEE. 2011; 99(3): 47
6-4
89.
[3]
Dimakis Ag, Godfrey
Pb, Wu Y. Net
w
or
k Co
din
g
for Distrib
uted Stora
ge S
y
stems.
IEEE Transactions
on Infor
m
atio
n
T
heory
. 201
0; 56(9): 45
39-
45
51.
[4]
Leo
ng D, D
i
makis Ag, Ho T
.
Distribute
d
storag
e al
locati
o
n
pro
b
le
ms.
N
e
t
w
o
r
k Co
di
ng,
T
heor
y
,
an
d
Appl
icatio
ns. W
o
rkshop o
n
NetCo
d
'
09. La
usan
ne. 20
09.
[5]
Aceda
nski
S,
Deb
S, Me
dar
d M.
H
o
w
go
od is
ra
nd
o
m
line
a
r
c
odi
ng
base
d
distrib
u
ted netw
o
rked
storage
. Proce
edi
ngs of the 1
s
t. W
o
rkshop on Ne
t
w
ork C
o
d
i
ng. Riv
a del G
a
rda. 20
05.
[6]
Cadambe Vr,
Jafar Sa, Maleki H.
Mini
mu
m Repa
ir Ban
d
w
idth for Exact Regen
erati
o
n
in Distrib
uted
Storage
. Wirel
e
ss Net
w
ork C
odi
ng Co
nfere
n
ce
(WiNC), IEEE. Bostn, MA, USA. 2010.
[7]
Leo
ng D
Dima
k
is Ag, Ho T
.
Distribut
ed Sto
r
age A
lloc
a
tio
n
for High
Rel
i
a
b
ility
. Comm
un
icatio
ns (ICC).
IEEE Internatio
nal C
onfere
n
ce
on.
Cape T
o
w
n
, South Africa
. 2010.
[8]
Rashmi KV, Shah NB, Kuma
r PV.
Optima
l Exact-Reg
ener
ating C
odes
fo
r Distributed S
t
orage at th
e
MSR a
nd MB
R Poi
n
ts vi
a a
Prod
uct-Matri
x
Co
nstruction
.
Informatio
n
T
heor
y
.
IEEE T
r
ansacti
ons
on,
201
1; 8(57): 52
27-5
239.
[9]
Rashmi KV, S
hah NB, Kum
a
r PV.
Explic
it
and
Opti
mal
E
x
act-Reg
ener
a
t
ing C
o
d
e
s for
the Mi
ni
mu
m-
Bandw
idth
Poi
n
t in Distri
but
ed Stora
g
e
. Information T
h
eory
Proceeding
s (ISI
T
)
. IEE
E
International
S
y
mp
osi
u
m on
. Austin,
T
X
. 2010.
[10]
Eong
D, Dimak
is Ag, Ho T
.
Symmetric A
lloc
a
tions for D
i
strib
u
ted Stor
age
. Globa
l
T
e
leco
mmunicati
on
s
Confer
ence
(GLOBECOM 20
10). Mia G
L
OBECOM 201
0,
Miami, F
L
, US
A 201
0 Mo
ha
n
N, Und
e
l
a
n
d
T
M
, Robbins
W
P
. Po
w
e
r El
e
c
tronics. Ne
w
York: John W
i
l
e
y
& So
ns. 200
5.
[1
1
]
X
i
a
o
y
i
n
g
Wa
ng
. Me
ta
-se
r
vi
ce
D
e
sig
n
an
d Imp
l
e
m
e
n
t
a
t
io
n
fo
r C
o
n
t
en
tD
e
l
i
v
e
r
y
N
e
tw
ork C
l
oud
Enviro
nments.
T
E
LKOMNIKA Indon
esi
a
Jour
nal of Electric
al
Engin
eeri
n
g
. 2
012; 10(
7): 183
3-18
42.
[12]
Ren S
h
u
x
i
a
, Z
hao Z
h
eng, Z
o
u Xia
o
ji
an.
An I
ndeterm
i
nac
y
T
e
mporal Data
Model
bas
ed
on Pro
bab
ilit
y.
T
E
LKOMNIKA Indon
esi
a
Jour
nal of Electric
al
Engin
eeri
ng.
2
013; 11(
11): 66
86-6
692.
Evaluation Warning : The document was created with Spire.PDF for Python.