TELKOMNIKA Telecommunication, Computing, Electronics and Control
Vol. 18, No. 2, April 2020, pp. 783~791
ISSN: 1693-6930, accredited First Grade by Kemenristekdikti, Decree No: 21/E/KPT/2018
DOI: 10.12928/TELKOMNIKA.v18i2.14883
Journal homepage: http://journal.uad.ac.id/index.php/TELKOMNIKA
Genomic repeats detection using Boyer-Moore algorithm on Apache Spark Streaming
Lala Septem Riza1, Farhan Dhiyaa Pratama2, Erna Piantari3, Mahmoud Fahsi4
1,2,3 Department of Computer Science Education, Universitas Pendidikan Indonesia, Indonesia
4 EEDIS Laboratory, Djillali Liabes University, Algeria
Article Info
Article history: Received Jul 24, 2019; Revised Jan 2, 2020; Accepted Feb 6, 2020
Keywords: Apache Spark Streaming, DNA, Genomic repeats, Human genome, String matching

ABSTRACT
Detecting genomic repeats, i.e., pattern searching in string processing to find repeated base pairs in a deoxyribonucleic acid (DNA) sequence, requires a long processing time. This research builds a big-data computational model to look for patterns in strings by modifying and implementing the Boyer-Moore algorithm on Apache Spark Streaming for human DNA sequences from the Ensembl site. Moreover, we perform experiments on cloud computing by varying the specifications of the computer clusters on datasets of human DNA sequences. The results obtained show that the proposed computational model on Apache Spark Streaming is faster than standalone computing and parallel computing with multicore. Therefore, it can be stated that the main contribution of this research, which is to develop a computational model for reducing the computational costs, has been achieved.

This is an open access article under the CC BY-SA license.
Corresponding Author:
Lala Septem Riza,
Department of Computer Science Education,
Universitas Pendidikan Indonesia, Indonesia.
Email: lala.s.riza@upi.edu
1. INTRODUCTION
DNA repeats occur in the eukaryotic genome [1]. Repeat identification and classification are important fundamental annotation tasks for several reasons. First, repetition is believed to play an important role in the evolution of genomes and in diseases [2]. Second, cellular elements (transposons and retrotransposons) may contain coding regions that are difficult to distinguish from other gene types. Finally, repetition often causes many local alignments and complicates sequence assembly, comparison between genomes, and the analysis of large-scale duplication and rearrangement [3]. In the past decade, scientists have carried out laboratory research lasting 3 years to analyze DNA [4]. One case of DNA analysis that requires time and energy on a large scale is the analysis of diseases caused by repetitive genomic patterns, called genomic repeats [3], such as three repeated base pairs that can cause diseases in the trinucleotide repeat disorders category [5].
The task of detecting genomic repeats, which is basically string matching or pattern matching, is carried out to look for a pattern in a large text. The basic algorithm for searching strings or patterns matches all the possibilities contained in the data, from the first index in the text to the end of the sequence. This Brute Force (Naïve) algorithm has the worst possible complexity, O(mn), where m is the pattern length and n is the text length, so it becomes very time consuming as more and more text is searched for strings or patterns [6]. The need to reduce the computational cost of string matching on large datasets has therefore led scientists to develop algorithms that are more efficient than the brute force algorithm, such as the Knuth-Morris-Pratt (KMP) [7] and Boyer-Moore (BM) [8] algorithms.
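To make the O(mn) bound concrete, the following is a minimal sketch of the brute-force scan that also counts character comparisons. The function name `naive_search` and the sample strings are illustrative assumptions, not taken from the paper.

```python
def naive_search(text, pattern):
    """Brute-force scan: return (match_positions, character_comparisons)."""
    n, m = len(text), len(pattern)
    positions, comparisons = [], 0
    for start in range(n - m + 1):          # try every alignment of the pattern
        matched = 0
        while matched < m:
            comparisons += 1
            if text[start + matched] != pattern[matched]:
                break                        # mismatch: move to the next alignment
            matched += 1
        if matched == m:
            positions.append(start)
    return positions, comparisons


if __name__ == "__main__":
    # Worst case: every alignment fails only on the last pattern character.
    print(naive_search("AAAAAAAAAA", "AAAB"))
    # A DNA-like example with two occurrences of 'CCG'.
    print(naive_search("ACCGCCG", "CCG"))
```

In the first call every one of the n - m + 1 alignments needs m comparisons, which is the behaviour the O(mn) bound describes.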
The KMP algorithm is a string matching algorithm that works by shifting the pattern along the text from left to right while matching strings in the text. The KMP algorithm was first developed by Donald E. Knuth in 1967 and then continued by James H. Morris with Vaughan R. Pratt in 1966. Then, in 1977, the KMP algorithm was published. The BM algorithm, which is one of the most efficient algorithms compared to other string matching algorithms, was then proposed; it creates two tables known as the BM bad-character (bmBc) table and the BM good-suffix (bmGs) table [9]. For each character in the alphabet, the bad-character table stores a shift value based on where that character appears in the pattern. This algorithm forms the basis for several pattern matching algorithms. The KMP and BM algorithms can be used as tools for searching for identical sequences in source sequences or for repeated subsequences [10]. A BM-type algorithm has also been used for matching patterns in compressed text within a collage system, showing that an instance of the algorithm searching in byte pair encoding (BPE) compressed text is 1.2∼3.0 times faster than the agrep software package (the fastest pattern matching tool) on the original text [11].
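The bad-character idea can be sketched as follows. This is a simplified illustration that uses only the bad-character rule (no good-suffix table) and stores the last index of each character rather than a precomputed shift; the names `bad_character_table` and `boyer_moore_bad_char` are hypothetical and this is not the authors' modified implementation.

```python
def bad_character_table(pattern):
    """Map each pattern character to the index of its last occurrence."""
    return {ch: i for i, ch in enumerate(pattern)}


def boyer_moore_bad_char(text, pattern):
    """Return all start positions of pattern in text (bad-character rule only)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    last = bad_character_table(pattern)
    positions, s = [], 0
    while s <= n - m:
        j = m - 1
        # Compare from the right end of the pattern towards the left.
        while j >= 0 and pattern[j] == text[s + j]:
            j -= 1
        if j < 0:
            positions.append(s)
            s += 1  # conservative shift so overlapping matches are not skipped
        else:
            # Align the mismatched text character with its last occurrence
            # in the pattern, or jump past it if it does not occur at all.
            s += max(1, j - last.get(text[s + j], -1))
    return positions


if __name__ == "__main__":
    print(bad_character_table("TTAGGG"))
    print(boyer_moore_bad_char("ACCGCCGTTAGGG", "CCG"))
```

Longer patterns tend to allow larger jumps on a mismatch, which is one reason pattern length can affect search speed.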
Moreover, scientists and biologists have begun to face issues related to large datasets, so challenges in handling, processing, and transferring information have appeared. This means that biologists need to move from traditional data processing towards big data analysis to investigate biological problems. For example, a modification of the KMP algorithm for parallel computing with multicore, using the R package pbdMPI, was introduced by Riza et al. for searching genomic repeats [12]. The R package pbdMPI was also utilized for parallel random projection in order to handle planted motif search [13].
On the other side, developments in open-source software, namely the Hadoop project, have provided scalable storage (e.g., petabytes of data) in the Hadoop distributed file system (HDFS), combined with a programming model called MapReduce [14]. However, because of the Hadoop-based I/O access pattern, the intermediate calculation results are not cached. Therefore, Hadoop is only suitable for batch data processing and shows poor performance for repetitive data processing [15]. To overcome this problem, Apache Spark was developed, which is a faster platform designed to handle large amounts of data [16]. Apache Spark is an open-source cluster computing framework for large data processing. It has emerged as a next-generation large data processing engine, overtaking Hadoop MapReduce, which helped revive the Big Data revolution. It maintains the linear scalability and fault tolerance of MapReduce but extends it in several important ways. Unlike Apache Hadoop, which is disk-based, Apache Spark computes in memory by introducing a powerful concept, i.e., the resilient distributed dataset (RDD). Because it is possible to store results in memory, it is more efficient for repetitive operations. In terms of performance, Apache Spark can be up to 100 times faster than Apache Hadoop in terms of memory access [16]. The gap between Apache Spark and Apache Hadoop is more than 10 times, even if we compare the two based on disk performance [17, 18]. In terms of flexibility, Apache Spark provides a high-level application programming interface (API) in Java, Scala, Python, and R. In general terms, Apache Spark provides structured data processing, machine learning, graph computing, and stream computing capabilities by supporting several sophisticated components.
In contrast to batch-based large-volume data processing, stream processing takes a more advanced step towards data streaming. With the exponential growth in continuous data streams, it has gained a lot of popularity. Apache Spark Streaming is one of the open-source frameworks for reliable, high-throughput, and low-latency stream processing [19]. It is an extension of the Apache Spark API that is intended to process data streams. Although Apache Spark is a batch processing engine, with Spark Streaming it is able to process streaming data from various sources, including Twitter. Here the incoming data flow is divided into small groups, which are then processed by the Spark engine. In Apache Spark Streaming, discretized streams (DStreams), which are sequences of RDDs, represent continuous data streams. The operations on DStreams are converted to basic RDD transformations, which are then computed by the Spark engine [20]. Moreover, Apache Spark Streaming, Apache Storm, and Yahoo! S4 [21] are three typical platforms that support the streaming computation model for direct data processing. Apache Storm is a free and open-source distributed real-time computation system that makes it easy to reliably process unbounded data flows. Unlike Apache Storm, Apache Spark Streaming takes a very different approach and processes events in batches. Most traditional stream processing systems are designed to process records one by one. This is known as the continuous operator model, a simple model that works very well on a small scale but faces several challenges with large-scale and real-time analysis. To overcome these challenges, Apache Spark Streaming uses a micro-batch architecture [22-24], where the data stream is treated as a series of small batches of data and streaming computation is done through a series of continuous batch computations on these batches.
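The micro-batch idea can be illustrated without Spark. The sketch below is a plain-Python analogy under a stated assumption: batches are cut by record count, whereas Spark Streaming cuts them by a time interval. The helper names `micro_batches` and `count_pattern` are hypothetical.

```python
from itertools import islice


def micro_batches(records, batch_size):
    """Yield successive small batches from a (possibly unbounded) record stream."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def count_pattern(batch, pattern):
    """Ordinary batch computation applied to one micro-batch.

    str.count counts non-overlapping occurrences, which is enough here.
    """
    return sum(line.count(pattern) for line in batch)


if __name__ == "__main__":
    stream = ["ACCG", "TTCAG", "CCGCCG", "GGGG", "CAGCAG"]
    for batch in micro_batches(stream, batch_size=2):
        print(batch, count_pattern(batch, "CCG"))
```

Each batch is handled by the same batch-style function, which is the sense in which streaming is reduced to a series of small batch computations.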
Therefore, this research is aimed at building a computational model and implementing the BM algorithm to find string patterns in the human chromosome genome data provided on the Ensembl pages. This study consisted of three stages, namely entering data into the system, BM processing, and analyzing the results.
2. RESEARCH METHOD
The computational model built in this research can be seen in Figure 1. There are several stages in the computational model for detecting genomic repeats using the BM algorithm on Apache Spark Streaming. It starts with the preprocessing stage and continues with the data input stage, in which the data is uploaded. Then, the uploaded data is moved into the streaming folder in the Hadoop distributed file system (HDFS), where it is captured by Apache Spark Streaming and processed by the BM algorithm. After processing is complete, the resulting data can be downloaded to a personal computer.
Figure 1. Research computational model
Based on the environment in which the computational model works, we divide it into four system environments, as follows:
− Working on personal computers: the model starts in the environment of the personal computer where the data is located, then goes into the preprocessing stage, which is carried out in the personal computer
environment. The environment changes when the process enters the data upload stage, where data is transferred from the personal computer into the virtual machine in the Google Cloud Project.
− Working on virtual machines in Google Cloud Project: the data is then uploaded to the cloud computing service, which is Google Cloud Project. In this step, we set certain parameters, such as the number of cores and the number of worker nodes.
− Working on HDFS: the large datasets are then copied to HDFS in a streaming manner, block by block, into a designated folder in HDFS.
− Working with Apache Spark Streaming: after a block of data is obtained, the modified BM algorithm runs on Apache Spark Streaming. This work keeps running until there are no more blocks in the HDFS folder. It should be noted that the tasks run on all worker nodes.
According to Figure 2, the first step in this algorithm is to call the packages needed in the program code, such as SparkContext, SparkConf, and StreamingContext, and to initialize them so that they can be used in the program code.
Input: Jupyter Notebook accessed
Output: Number of genomic repeats in one data file, number of pattern repeats in one data file, process time
Algorithm:
Call the packages needed in the system
Initialize the Spark configuration and Spark context
Initialize the Spark streaming context
Specify the folder used as the streaming folder
Transform the incoming data from DStream form into RDDs
Repartition the incoming data into one partition and zip it with an index
Call the pattern from the folder in HDFS
Take an action on the pattern
Set the start time for data processing
Transform the incoming data and pattern by passing them into the Boyer-Moore algorithm function
Transform the processed data by filtering out any RDD with no results
Transform the processed data by sorting it in ascending order of the number of genomic repeats
Subtract the start time of data processing from the current time
Specify the directory in HDFS to be used as the destination for storing results
Perform an action to save the number of genomic repeats into HDFS and display the number of pattern repeats in one data file and the duration of the process
Figure 2. Pseudocode of the BM algorithm on Apache Spark Streaming
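The processing steps of the pseudocode (index, match, filter, sort, time) can be mirrored in plain Python for a single block of data. This is only a sketch under stated assumptions: it does not use Spark, the stand-in matcher `find_all` relies on `str.find` instead of the paper's Boyer-Moore function, and the names `find_all` and `process_block` are hypothetical.

```python
import time


def find_all(text, pattern):
    """Stand-in matcher based on str.find (the paper uses a Boyer-Moore function)."""
    positions, start = [], text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def process_block(lines, pattern):
    """Index the lines, match, drop empty results, sort ascending, and time it."""
    started = time.perf_counter()                      # set the start time
    indexed = list(enumerate(lines))                   # zip each line with an index
    matched = [(i, find_all(line, pattern)) for i, line in indexed]
    non_empty = [(i, pos) for i, pos in matched if pos]            # filter
    ordered = sorted(non_empty, key=lambda item: len(item[1]))     # ascending
    elapsed = time.perf_counter() - started            # current time minus start
    return ordered, elapsed


if __name__ == "__main__":
    block = ["CCGCCGCCG", "AAAA", "TCCGT"]
    results, seconds = process_block(block, "CCG")
    print(results, f"{seconds:.6f} s")
```

In the streaming setting the same function body would be applied to every incoming block rather than to one in-memory list.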
Then, we set the directory in HDFS that will be used as the streaming folder, so that every block of data entering the folder after the program starts is processed immediately. To process the blocks of data, they must first be represented as the DStream data type in Apache Spark Streaming. It should be noted that DStream data cannot be handled directly as RDD data, so a transformation is needed to convert the DStream data type into the RDD data type. The code for these processes is illustrated in Figure 3.
Figure 3. Code for inputting the blocks of data in Apache Spark Streaming
After transforming the streaming data, the next step is to repartition the incoming data and assign an index to each RDD element in the data. Data entering the streaming folder is partitioned automatically by Apache Spark Streaming, so if it is not repartitioned, the number of partitions for one data file is determined automatically by Apache Spark Streaming. Repartitioning is useful in finding genomic repeats so that we can obtain the right number of genomic repeats. In this model, the pattern is selected and converted into a text data type; because the pattern must be smaller than the incoming chromosome data, it is selected and converted before the process time is specified. When the search for genomic repeats using the Boyer-Moore function has been completed, the next step is to transform the results by removing the results of any RDD in which no genomic repeats were found and by sorting the remaining results in ascending order of the number of genomic repeats, from the fewest to the most. This is done to make it easier to see or record the number of genomic repeats that occur. At this stage the search process has ended, so the process is timed by subtracting the start time, which was initialized in the previous stage, from the current time. The code for the main processes and for saving the results can be seen in Figure 4.
Figure 4. Code to perform the main processes and save the results in Apache Spark Streaming
3. RESULTS AND DISCUSSION
This section presents the results of the research together with a comprehensive discussion.
3.1. Data collection
The data used in this study are human DNA sequences, which can be downloaded freely from ftp://ftp.ensembl.org/pub/release-95/fasta/homo_sapiens/dna/. These data are examples of human DNA sequences in release number 95 provided on the Ensembl file transfer protocol (FTP) website. On that page there are 24 chromosome DNA sequence files, which are listed in Table 1.
3.2. Experimental scenario
In this experiment, we ran an experimental scenario using several worker nodes, each with 4 CPU cores. The data comprise all the files mentioned in section 3.1 Data collection, multiplied by the number of experiments carried out, giving a total of 576 files with a total size of 75360 MB. The first pattern used is 'CCG'; if this pattern is found repeated as many as 200-900 times, it can be concluded that the person has Fragile XE Syndrome, whereas normally the pattern 'CGG' repeats only 4-39 times. We also use the 'CAG' pattern, which causes diseases in the polyglutamine category. The 'CCG' and 'CAG' patterns differ not only in one character in the middle but also in the prefix that is generated: the prefix for 'CCG' is '00', while the prefix for 'CAG' is '000'. We need to know the difference in speed caused by these prefixes. Then, there is the 'TTAGGG' pattern, which marks a telomere, the very tip of linear DNA. The search for 'TTAGGG' is intended to show the effect of pattern length on computational speed. In addition, 'TTAGGG' was selected because it is confirmed to exist in every human DNA sequence [12]. Table 2 shows that the experiments use different numbers of worker nodes, with the same number of cores on each node, on the Google Cloud Platform. In the other experimental scenarios the master does not do computing.
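The reported totals can be cross-checked with a short calculation. The sketch below assumes that each of the 24 files is used the same number of times, that the "Total" figure of Table 1 is 3,139,770 KB, and that 1 MB is taken as 1000 KB; these are assumptions for the check, not statements from the paper.

```python
FILES_PER_RUN = 24             # chromosome files listed in Table 1
TOTAL_KB_PER_RUN = 3_139_770   # "Total" row of Table 1, read as KB
REPORTED_FILES = 576           # total number of files stated in the text
REPORTED_MB = 75_360           # total size stated in the text

# Number of runs implied by the file count (assumes equal use of every file).
runs = REPORTED_FILES // FILES_PER_RUN

# Total volume implied by that number of runs, assuming 1 MB = 1000 KB.
estimated_mb = runs * TOTAL_KB_PER_RUN / 1000

relative_gap = abs(estimated_mb - REPORTED_MB) / REPORTED_MB

if __name__ == "__main__":
    print(f"runs implied by file count: {runs}")
    print(f"estimated volume: {estimated_mb:.0f} MB")
    print(f"relative gap to reported volume: {relative_gap:.4%}")
```

Under these assumptions the file count implies 24 runs over the full dataset, and the estimated volume agrees with the reported 75360 MB to well within 0.1 percent, which is consistent with the per-run total being rounded to 3140 MB before multiplying.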
Table 1. Data used in experiments
Files Name | File Size (KB)
Homo_sapiens.GRCh38.chromosome.1.fa | 253,105
Homo_sapiens.GRCh38.chromosome.2.fa | 246,230
Homo_sapiens.GRCh38.chromosome.3.fa | 201,600
Homo_sapiens.GRCh38.chromosome.4.fa | 193,384
Homo_sapiens.GRCh38.chromosome.5.fa | 184,563
Homo_sapiens.GRCh38.chromosome.6.fa | 173,652
Homo_sapiens.GRCh38.chromosome.7.fa | 162,001
Homo_sapiens.GRCh38.chromosome.8.fa | 147,557
Homo_sapiens.GRCh38.chromosome.9.fa | 140,701
Homo_sapiens.GRCh38.chromosome.10.fa | 136,027
Homo_sapiens.GRCh38.chromosome.11.fa | 137,338
Homo_sapiens.GRCh38.chromosome.12.fa | 135,496
Homo_sapiens.GRCh38.chromosome.13.fa | 116,270
Homo_sapiens.GRCh38.chromosome.14.fa | 108,827
Homo_sapiens.GRCh38.chromosome.15.fa | 103,691
Homo_sapiens.GRCh38.chromosome.16.fa | 91,884
Homo_sapiens.GRCh38.chromosome.17.fa | 84,645
Homo_sapiens.GRCh38.chromosome.18.fa | 81,712
Homo_sapiens.GRCh38.chromosome.19.fa | 59,594
Homo_sapiens.GRCh38.chromosome.20.fa | 65,518
Homo_sapiens.GRCh38.chromosome.21.fa | 47,488
Homo_sapiens.GRCh38.chromosome.22.fa | 51,665
Homo_sapiens.GRCh38.chromosome.X.fa | 158,641
Homo_sapiens.GRCh38.chromosome.Y.fa | 58,181
Total | 3,139,770
Table 2. Experimental scenario
No | Master Nodes | Worker Nodes | Each Master Core | Each Worker Core
1 | 1 | 2 | 4 | 4
2 | 1 | 4 | 4 | 4
3 | 1 | 5 | 4 | 4
4 | 1 | 11 | 4 | 4
3.3. Result and analysis of experiments
Based on the scenario designed in the previous section, Table 3 shows the results of the experiment. Table 3 has 7 columns, namely pattern, file, number of worker nodes, total pattern, genomic repeats, index pattern repeats, and time cost. The pattern column is the pattern that we look for in the sequences. After conducting the experiment, we compared the outputs of the system that we built for different numbers of nodes, which is the variable of this study that influences the speed of the pattern search. Apart from the number of nodes, we did not change other variables, such as the number of cores on the master or worker nodes and the number of master nodes.
In Figures 5 and 6, we compare the results of the experiment, using 4 cores on the master and on each worker node, for the CCG and CAG patterns on chromosome 1. Although the number of worker nodes influences the speed of the computational process, the effect is not significant. This can be seen from the computational speed with 2, 4, and 5 nodes, where the data processing times differ by only a few seconds. The most significant computational difference occurs when the number of worker nodes is increased to 11 nodes. In Figure 6 it is increasingly evident that the speed difference is not only inconsistent, but that an increasing number of worker nodes can even make the search for the CAG pattern on chromosome 1 slower. This is because the CAG pattern is the most frequently found pattern among those used in the experiment, so its time cost is higher than that of the other patterns. Given the number of patterns found in the computational process, the number of worker nodes must be chosen effectively. The histogram shows that the time cost actually increases from 4 to 5 worker nodes, so 4 worker nodes are more effective than 5 worker nodes for searching the CAG pattern. However, when the numbers of worker nodes are far apart, such as 4 worker nodes versus 11 worker nodes, the time cost decreases significantly.
Table 3. Experimental result
Pattern | File | Number of Worker Nodes | Total Pattern | Genomic Repeats | Index Pattern Repeats | Time Cost (Seconds)
CCG | Chromosome 1 | 2 | 647,388 | 12 | (127633684, 127633687, 127633690, 127633693, 127633696, 127633699, 127633702, 127633705, 127633708, 127633711, 127633714, 127633717) | 79.50
CCG | Chromosome 2 | 2 | 571,882 | 13 | (210171362, 210171365, 210171368, 210171371, 210171374, 210171377, 210171380, 210171383, 210171386, 210171389, 210171392, 210171395, 210171398) | 72.24
CCG | Chromosome 3 | 2 | 423,073 | 9 | (129605344, 129605347, 129605350, 129605353, 129605356, 129605359, 129605362, 129605365, 129605368) | 59.87
CCG | Chromosome 4 | 2 | 374,004 | 7 | (3075005, 3075008, 3075011, 3075014, 3075017, 3075020, 3075023) (2059458, 2059461, 2059464, 2059467, 2059470, 2059473, 2059476) (576301, 576304, 576307, 576310, 576313, 576316, 576319) | 56.62
... | ... | ... | ... | ... | ... | ...
TTAGGG | Chromosome Y | 11 | 4,518 | 3 | (24009196, 24009202, 24009208) | 9.54
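In the index lists of Table 3, consecutive positions differ by exactly the pattern length (3 for 'CCG', 6 for 'TTAGGG'), which suggests that the "Genomic repeats" column counts back-to-back copies of the pattern. The following sketch groups match positions into such runs under that reading; the function name `tandem_runs` is hypothetical and the interpretation is an assumption, not something stated explicitly in the paper.

```python
def tandem_runs(positions, pattern_length):
    """Group match positions into runs of back-to-back occurrences.

    Two occurrences are back to back when their start positions differ by
    exactly the pattern length.
    """
    runs, current = [], []
    for pos in sorted(positions):
        if current and pos - current[-1] == pattern_length:
            current.append(pos)            # extends the current run
        else:
            if current:
                runs.append(current)       # close the previous run
            current = [pos]
    if current:
        runs.append(current)
    return runs


if __name__ == "__main__":
    # Index list reported for 'CCG' on chromosome 1 (12 positions, step 3).
    ccg_chr1 = [127633684 + 3 * k for k in range(12)]
    # Index list reported for 'TTAGGG' on chromosome Y.
    ttaggg_chr_y = [24009196, 24009202, 24009208]
    print(max(len(run) for run in tandem_runs(ccg_chr1, 3)))
    print(max(len(run) for run in tandem_runs(ttaggg_chr_y, 6)))
```

With this grouping, the longest run for each of the two rows has the same length as the value in the "Genomic repeats" column of that row.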
Moreover, we also made a comparison with the previous research [12], as illustrated in Figure 7. It can be seen that the proposed model involving data streaming on Apache Spark Streaming is faster than the computational model on parallel computing with multicore from the previous research [12]. Even though the proposed computational model is faster than the previous research [12], it has a drawback,
i.e., the number of genomic repeats may not be accurate. This happens when matching patterns lie on the splitting locations. In terms of comparison with methods that use parallel computing, previous research also performed data streaming by using an R package and Sequential K-Means to determine trending topics on Twitter [25]. Other machine learning methods that could also be used in data streaming for genomic repeats are Naive Bayes [26], various intelligent classifiers [27], and the bootstrap method [28].
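The drawback noted above, matches being lost at splitting locations, can be reproduced on a toy sequence. The Python sketch below is an illustration under stated assumptions, not the authors' implementation: it uses the Boyer-Moore-Horspool simplification (bad-character shift only, without the good-suffix rule of the full Boyer-Moore algorithm), and the names `horspool_search` and `chunked_search`, the chunk size, and the sample sequence are all hypothetical. Overlapping neighbouring chunks by one character less than the pattern length is one common remedy; the paper does not state that it uses it.

```python
def horspool_search(text, pattern):
    """Return the start indices of all occurrences of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Bad-character table: for each character of the pattern except the last,
    # the distance from its rightmost occurrence to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    hits = []
    pos = 0
    while pos <= n - m:
        # The window is compared as a whole for brevity; a textbook
        # implementation compares characters from right to left.
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Shift according to the text character under the last pattern position.
        pos += shift.get(text[pos + m - 1], m)
    return hits


def chunked_search(text, pattern, chunk_size, overlap):
    """Search fixed-size chunks independently and merge the global indices."""
    hits = set()
    for start in range(0, len(text), chunk_size):
        chunk = text[start:start + chunk_size + overlap]
        hits.update(start + i for i in horspool_search(chunk, pattern))
    return sorted(hits)


sequence = "ACCGCCGCCGTT"   # hypothetical toy sequence
pattern = "CCG"
print(horspool_search(sequence, pattern))                   # whole sequence
print(chunked_search(sequence, pattern, 5, 0))              # no overlap
print(chunked_search(sequence, pattern, 5, len(pattern) - 1))  # with overlap
```

Without overlap, the occurrence that straddles the boundary between the first two chunks is missed, which also breaks the run of consecutive repeats. With an overlap of `len(pattern) - 1` characters, the chunked result equals the result on the whole sequence.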
Figure 5. Speed comparison with the variation of worker nodes on chromosome 1 for the CCG pattern

Figure 6. Speed comparison with the variation of worker nodes on chromosome 1 for the CAG pattern

Figure 7. Speed comparison with the previous research [12]
4. CONCLUSION
The main contributions of this research are (i) to provide a computational model for big data that detects genomic repeats using the Boyer-Moore algorithm with Apache Spark Streaming. This model contains several stages, such as the preprocess model, input data, the Boyer-Moore algorithm system, and output download; and (ii) to conduct several experiments by varying the number of worker nodes used. Based on the results obtained, the proposed model can be used to obtain the number of patterns found on a chromosome, the number of genomic repeats found on a chromosome, and the locations of the genomic repeats. Moreover, the comparisons show that the proposed model is faster than standalone computing and multicore parallel computing. In the future, we plan to improve the development of this model and its use of knowledge so that it can be extended to various science sectors or case studies that can utilize Big Data technology with stream processing using Apache Spark Streaming.
REFERENCES
[1] Charlesworth B., Sniegowski P., and Stephan W., "The Evolutionary Dynamics of Repetitive DNA in Eukaryotes," Nature, vol. 371, pp. 215-220, 1994.
[2] Buard J. and Jeffreys A. J., "Big, Bad Minisatellites," Nature Genetics, vol. 15, pp. 327-328, 1997.
[3] Edgar R. C. and Myers E. W., "PILER: Identification and Classification of Genomic Repeats," Bioinformatics, vol. 21, pp. 152-158, 2005.
[4] Pahadia M., Srivastava A., Srivastava D., and Patil N., "Genome Data Analysis Using MapReduce Paradigm," Second Int. Conf. Adv. Comput. Commun. Eng., pp. 556-559, 2015.
[5] Orr H. T. and Zoghbi H. Y., "Trinucleotide Repeat Disorders," Encycl. Neurol. Sci., vol. 30, pp. 525-533, 2014.
[6] Al Kindhi B. and Sardjono T., "Pattern Matching Performance Comparisons as Big Data Analysis Recommendations for Hepatitis C Virus (HCV) Sequence DNA," 3rd Int. Conf. Artif. Intell. Model. Simul., pp. 99-104, 2015.
[7] Knuth D. E., Morris J. H., and Pratt V. R., "Fast Pattern Matching in Strings," SIAM J. Comput., vol. 6, no. 2, pp. 323-350, 1977.
[8] Boyer R. S. and Moore J. S., "A Fast String Searching Algorithm," Commun. ACM, vol. 20, pp. 762-772, 1977.
[9] Sheik S. S., Aggarwal S. K., Poddar A., Balakrishnan N., and Sekar K., "A Fast Pattern Matching Algorithm," J. Chem. Inf. Comput. Sci., vol. 44, pp. 1251-1256, 2004.
[10] Haponiuk M., Pawelkowicz M., Przybecki Z., and Nowak R. M., "CuGene as a Tool to View and Explore Genomic Data," Photonics Applications in Astronomy, Communications, Industry, and High Energy Physics Experiments, vol. 10445, pp. 1-8, 2017.
[11] Shibata Y., et al., "A Boyer-Moore Type Algorithm for Compressed Pattern Matching," Annual Symposium on Combinatorial Pattern Matching, pp. 181-194, 2000.
[12] Riza L. S., Rachmat A. B., Munir T. H., and Nazir S., "Genomic Repeat Detection Using the Knuth-Morris-Pratt Algorithm on R High-Performance-Computing Package," Int. J. of Adv. in Soft Comp. and Its App., vol. 11, pp. 94-111, 2019.
[13] Riza L. S., Dhiba T. F., Setiawan W., Hidayat T., and Fahsi M., "Parallel Random Projection Using R High Performance Computing for Planted Motif Search," TELKOMNIKA Telecommunication Computing Electronics and Control, vol. 17, no. 3, pp. 1352-1359, 2019.
[14] Taylor R. C., "An Overview of the Hadoop/MapReduce/HBase Framework and its Current Applications in Bioinformatics," BMC Bioinformatics, vol. 11, pp. 1-6, 2010.
[15] Guo R., Zhao Y., Zou Q., Fang X., and Peng S., "Bioinformatics Applications on Apache Spark," GigaScience, vol. 7, pp. 1-10, 2018.
[16] Zaharia M., et al., "Spark: Cluster Computing with Working Sets," HotCloud, vol. 10, pp. 1-7, 2010.
[17] Shanahan J. G., Street H., Street H., and Francisco S., "Large Scale Distributed Data Science using Apache Spark," Proc. 21st ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., pp. 2323-2324, 2015.
[18] Han Z. and Sql A. S., "Spark: A Big Data Processing Platform Based On Memory Computing," Seventh Int. Symp. Parallel Archit. Algorithms Program., pp. 172-176, 2016.
[19] Liao X., Gao Z., Ji W., and Wang Y., "An Enforcement of Real Time Scheduling in Spark Streaming," 2015 Sixth International Green and Sustainable Computing Conference (IGSC), pp. 1-6, 2015.
[20] Lekha R. N., Sujala D. S., and Siddhanth D. S., "Applying Spark Based Machine Learning Model on Streaming Big Data for Health Status Prediction," Comput. Electr. Eng., vol. 65, pp. 393-399, 2018.
[21] Leonardo N., Bruce R., Anish N., and Anand K., "S4: Distributed Stream Computing Platform," Proc. IEEE Int. Conf. Data Mining, ICDM, pp. 170-177, 2010.
[22] Karau H., Andy K., Patrick W., and Matei Z., "Learning Spark: Lightning-Fast Big Data Analysis," O'Reilly Media, Inc., 2015.
[23] Chintapalli S., Dagit D., Evans B., Farivar R., Graves T., Holderbaugh M., Liu Z., Nusbaum K., Patil K., Peng B. J., and Poulosky P., "Benchmarking Streaming Computation Engines: Storm, Flink and Spark Streaming," 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 1789-1792, 2016.
[24] Kroß J. and Krcmar H., "Modeling and Simulating Apache Spark Streaming Applications," Softwaretechnik-Trends, vol. 36, no. 4, pp. 1-3, 2016.
[25] Mediayani M., Wibisono Y., Riza L. S., and Pérez A. R., "Determining Trending Topics in Twitter with a Data-Streaming Method in R," Indonesian Journal of Science and Technology, vol. 4, no. 1, pp. 148-157, 2019.
[26] Mulyani Y., Rahman E. F., and Riza L. S., "A New Approach on Prediction of Fever Disease by Using a Combination of Dempster Shafer and Naïve Bayes," 2016 2nd International Conference on Science in Information Technology (ICSITech), pp. 367-371, 2016.
[27] Alasker H., Alharkan S., Alharkan W., Zaki A., and Riza L. S., "Detection of Kidney Disease Using Various Intelligent Classifiers," 2017 3rd International Conference on Science in Information Technology (ICSITech), pp. 681-684, 2017.
[28] Riza L. S., Utama J. A., Putra S. M., Simatupang F. M., and Nugroho E. P., "Parallel Exponential Smoothing Using the Bootstrap Method in R for Forecasting Asteroid's Orbital Elements," Pertanika Journal of Science & Technology, vol. 26, no. 1, pp. 441-462, 1 Jan 2018.