Int
ern
at
i
onal
Journ
al of Ele
ctrical
an
d
Co
mput
er
En
gin
eeri
ng
(IJ
E
C
E)
Vo
l.
8
,
No.
6
,
D
ece
m
ber
201
8
, pp.
4321
~
43
33
IS
S
N: 20
88
-
8708
,
DOI: 10
.11
591/
ijece
.
v8
i
6
.
pp
4321
-
43
33
4321
Journ
al h
om
e
page
:
http:
//
ia
es
core
.c
om/
journa
ls
/i
ndex.
ph
p/IJECE
The Imp
r
oved H
ybrid
Al
gorithm
for
the
Ath
eer
and
Berry
-
r
avind
ra
n
A
lgorith
ms
At
he
er
Ak
r
am
A
b
dul
Ra
z
zaq
1
, Nur
’
Aini
Ab
d
ul R
as
hid
2
, A
l
aa A
h
me
d Abb
ood
3
,
Z
urinah
ni
Z
aino
l
4
1,3
Businesses
Inform
at
ic
s Col
le
ge
,
Univer
si
t
y
of
I
nform
at
ion
T
ec
h
nolog
y
a
nd
Com
m
unic
at
ions,
Ira
q
2
Coll
ege of
Com
pute
r & Inf
orm
a
ti
on
Sci
ences,
Pr
inc
ess Nourahbint
Abdulra
hm
an
Univer
sit
y
,
Saud
i
Arabi
a
4
School
of
Com
pute
r
Sc
ie
nc
es,
Unive
rsit
y
Sa
ints
Malay
sia
,
Ma
l
a
y
si
a
Art
ic
le
In
f
o
ABSTR
A
CT
Art
ic
le
history:
Re
cei
ved
A
ug
26
, 201
7
Re
vised
Ju
l
1
2
,
201
8
Accepte
d
J
ul
26
, 2
01
8
Exa
c
t
String
m
a
tc
hing
conside
rs
is
one
of
the
important
w
a
y
s
in
solving
th
e
basic
proble
m
s
in
computer
science
.
Th
is
rese
ar
ch
proposed
a
hy
brid
ex
act
string
m
at
chi
ng
al
gorit
hm
c
al
l
ed
E
-
Athee
r
.
Th
is
a
lgori
thm
depe
nd
ed
on
good
fea
tur
es;
sea
rch
ing
and
shifti
n
g
te
chni
qu
es
in
the
Athee
r
and
Berr
y
-
Ravi
ndra
n
al
gor
it
hm
s,
respe
c
ti
v
ely
.
The
propos
ed
a
lgori
thm
sh
owed
better
per
form
anc
e
in
num
ber
of
attempts
and
ch
aracte
r
compari
sons
c
om
par
ed
to
the
orig
ina
l
and
recent
and
sta
ndar
d
al
go
rit
hm
s.
E
-
Athe
er
a
lg
orit
hm
used
s
eve
ral
t
y
pes
of
dat
ab
ase
s,
whic
h
are
DN
A,
Protei
n
,
XM
L,
Pit
c
h,
English,
and
Source
.
Th
e
best
per
form
a
nce
in
th
e
num
ber
of
at
t
empts
is
when
the
al
gorit
hm
is
execut
ed
using
th
e
pit
ch
d
ataset
.
Th
e
wors
t
per
fo
rm
anc
e
is
whe
n
it
is
used
with
DN
A
dat
ase
t.
T
he
best
and
wors
t
dat
aba
ses
in
t
he
num
ber
of
cha
ra
cter
compa
risons
with
the
E
-
Athee
r
al
gor
ithm
are
the
Sour
ce
and
DN
A
dat
ab
ase
s,
r
espe
ct
iv
ely
.
Ke
yw
or
d:
E
-
At
heer al
gor
it
h
m
Exact stri
ng m
at
ching
a
lgorit
hm
s
Ty
pe of datab
a
se
Copyright
©
201
8
Instit
ut
e
o
f Ad
vanc
ed
Engi
n
ee
r
ing
and
S
cienc
e
.
Al
l
rights re
serv
ed
.
Corres
pond
in
g
Aut
h
or
:
Athee
r Akram
Abd
ul
Ra
zzaq
,
Businesse
s In
f
or
m
at
ic
s
Coll
e
ge
,
Un
i
ver
sit
y o
f I
nfor
m
at
ion
Te
chnolo
gy a
nd
Com
m
un
ic
at
io
ns
,
Al
-
Ni
dhal
Stre
et
, Bag
hd
a
d, I
r
aq
.
Em
a
il
: at
hp
roo
f@uo
it
c.e
du.iq
1.
INTROD
U
CTION
Strin
g
m
at
ching
is
the
proces
s
of
ide
ntifyi
ng
al
l
occ
urre
nc
es
of
al
ign
m
ents
by
com
par
in
g
tw
o
finite
-
le
ng
th
stri
ng
s
[
1].
Strin
g
m
at
c
hing
is
a
m
on
g
the
m
os
t
i
m
po
rtant
pr
oble
m
s
app
li
ed
in
m
any
co
m
pu
te
r
sci
enc
e
app
li
cat
io
ns
,
s
uch
as
we
b
s
earch
e
ng
i
nes
[2]
,
ope
r
at
in
g
syst
em
s,
com
pi
le
rs,
com
m
and
inter
pr
et
ers
[
3
]
,
intru
si
on
detec
ti
on
syst
em
s
[
4
]
,
inf
or
m
at
ion
retrieval
an
d
arti
fici
al
intel
l
igence
[
5
]
.
Strin
g
m
at
ching
involve
s
a
m
a
tc
hin
g
process
i
nvolv
i
ng
patte
rn
s
an
d
te
xts
to
i
den
t
ify
the
ide
ntica
l
char
act
er
s
betwee
n
them
.
The
m
at
ching
cha
r
act
er
com
par
ison
s
,
an
d
the
num
ber
of
at
te
m
pts;
these
factor
s
are
cha
ng
eable
dep
e
ndin
g
on
the
ty
pe
of alg
or
it
hm
u
sed [
6]
,
[7
]
.
Perm
anen
t
ch
al
le
ng
es
re
quire
the
us
e
of
the
m
os
t
effi
ci
ent
string
m
at
ching
al
go
rithm
s
with
increasin
g
m
e
m
or
y
siz
e
and
com
pu
te
r
sp
e
ed
[
8]
,
[9
]
.
T
hus,
strin
g
m
at
c
hing
al
go
rithm
s
ha
ve
bee
n
re
centl
y
pro
po
se
d
to
m
i
nim
iz
e
these
pr
oble
m
s
[
10
]
.
The
hy
br
id
st
ring
m
at
ching
al
gorithm
is
the
appr
opriat
e
so
luti
on
to
m
itigate
dis
adv
a
ntage
s
of
or
i
gin
al
stri
ng
m
at
ching
al
gor
it
h
m
s
[1
1
]
.
T
he
pro
posed
al
gorithm
in
this
pap
e
r
dep
e
nds
on
t
he
good
a
dva
ntages
of
t
w
o
e
xact
stri
ng
m
at
ching
a
lgorit
hm
s
wh
i
ch
a
re
(A
t
he
er
a
nd
Be
rr
y
-
r
avi
ndra
n)
an
d
dec
reasi
ng
the
disa
dva
ntages
of
them
.
All
ty
pes
of
da
ta
bases
in
be
nc
hm
ark
sta
nd
a
rd
are
us
e
d
in
this
res
earch
to
fin
d
the
su
it
ed
a
nd
unsu
it
e
d
data
ba
ses
with
pr
opose
d
al
gorithm
.
The
ob
j
ect
ive
of
this
researc
h o
ver
c
om
e the w
eak
ne
sses a
nd im
pr
ov
e
s the
p
e
rform
ance of e
xact strin
g
m
at
ching
alg
ori
thm
s.
The
or
i
gin
al
al
gorithm
s:
Two
or
i
gin
al
al
gori
thm
s
wer
e
us
e
d
as r
efe
ren
ce
d
in
this
re
searc
h,
w
hic
h
ar
e
Athee
r
an
d
Be
rr
y
-
Ra
vindra
n
al
gorithm
s.
Atheer
al
go
rithm
is
a
hybr
id
al
gorithm
of
thr
ee
al
go
rithm
s
wh
ic
h
are
Ra
it
a,
Sm
ith
,
a
nd
Karp
-
R
abin
[
3
]
.
Th
ere
are
th
ree
funct
ion
s
prep
roces
sing
of
the
At
he
er
al
gori
thm
wh
ic
h
Evaluation Warning : The document was created with Spire.PDF for Python.
IS
S
N
:
20
88
-
8708
In
t J
Elec
&
C
om
p
En
g,
V
ol.
8
, N
o.
6
,
Dece
m
ber
2
01
8
:
4321
-
4333
4322
are,
the
BM
ba
d
c
har
act
er
(
bm
Bc
)
functi
on,
the
quic
k
sear
ch
bad
cha
racter
(
qs
Bc
)
f
un
ct
ion
a
nd
the
ha
sh
i
ng
functi
on.
I
n
th
e
searching
phase
of
the
At
he
er
al
gorithm
,
al
l
the
co
m
par
ison
ste
ps
de
pe
nd
i
ng
on
th
e
has
h
process
wh
ic
h
wer
e
der
i
ved
from
the
Kar
p
-
Ra
bin
al
gorithm
.
The
com
par
iso
n
te
ch
nique
sta
rt
bet
we
en
th
e
has
h
values
of
the
th
ree
c
har
a
ct
ers
(
rig
htm
os
t,
le
ftm
os
t,
and
m
idd
le
)
i
n
the
patte
r
n
with
t
he
has
hing
value
of
the
three
c
harac
te
rs
in
the
t
ext
wind
ow.
I
f
m
a
tc
hin
g
oc
cur
s
betwee
n
them
,
then
one
by
one
thes
e
three
char
act
e
rs
of
t
he
te
xt
window
ve
rs
us
the
three
c
ha
racters
of
t
he
patte
rn
will
be
c
om
par
ed
.
Wh
e
n
m
at
chin
g
occurs,
t
hen
c
om
par
ison
starts
in
the r
em
ai
ni
ng
c
har
act
e
rs,
bu
t
without co
m
par
ing
the m
i
dd
le
c
har
act
er
again
.
Wh
e
n
a
m
at
ch
ing
or
m
is
m
atch
in
g
occ
urs,
the
sh
ifti
ng
of
the
patte
rn
w
ould
de
pe
nd
on
the
m
axi
m
u
m
value
betwee
n
the
m
v
al
ue
i
n
th
e
bm
Bc
table and
on the
m
+1
val
ue
in
the
qs
Bc
ta
ble.
The
Be
r
ry
-
r
a
vi
ndran
al
go
rith
m
is
a
hybr
id
appro
ac
h
a
nd
is
char
act
eriz
ed
by
le
ft
-
rig
ht
char
act
er
com
par
isons
[
1
2
]
.
T
his
al
gor
it
h
m
is
a
hybr
id
of
the
Z
hu
-
t
akaoka
a
nd
Q
uick
-
s
earc
h
al
gorithm
s
and
has
tw
o
ph
a
ses,
nam
ely,
pr
e
processi
ng
a
nd
searc
hi
ng
.
T
he
pr
e
processin
g
ph
as
e
of
this
al
gor
it
h
m
dep
en
ds
on
t
he
Be
rr
y
-
r
avi
ndra
n
ba
d
cha
racter
(brBc)
f
unct
ion.
Th
e
sea
rchi
ng
phase
of
t
he
Be
rr
y
-
Ra
vi
ndra
n
al
gorith
m
has
le
ft
-
rig
ht c
har
a
ct
er m
ov
em
ent
s.
T
his alg
or
it
hm
d
epen
ds o
n
t
he
s
hiftin
g op
e
rati
on
of the
n
e
xt tw
o
c
har
act
e
rs
i
n
the text w
in
dow,
which
depe
nd
s
on the m
+
1
an
d
m
+
2
of
the text w
in
do
w.
T
he
sh
ifti
ng
v
al
ue
i
s o
btai
ne
d
fro
m
the
br
Bc
ta
ble
in
the
pre
proc
essing
phase
.
The
c
om
par
iso
n
process
sta
rt
s
f
ro
m
the
le
ft
m
os
t
char
act
er
in
t
he
patte
rn
with
th
e
le
ftm
os
t
cha
r
act
er
in
t
he
te
xt
window
.
I
f
a
m
at
ch
is
fo
und,
the
com
par
is
on
will
con
ti
nue
to
ano
t
her
c
ha
rac
te
r
unti
l
al
l
char
act
ers
a
re
m
at
ched
.
Wh
e
n
a
m
a
tc
hin
g
or
m
is
m
a
tc
hin
g
occurs,
the
s
hi
ftin
g
dep
e
nds
on
the
ne
xt
tw
o
c
ha
r
act
ers
of
th
e
te
xt
wi
ndow
(m
+
1
a
nd
m
+
2)
and
the
obta
ine
d
value
from
t
he
br
Bc
ta
ble in t
he pre
processi
ng pha
se.
2.
PROP
OSE
D E
-
ATHEE
R ALGO
RITH
M
The
pro
po
se
d
E
-
At
heer
al
gorithm
con
sist
s
of
t
wo
phase
s,
nam
el
y,
the
pr
e
proces
sin
g
phase
a
nd
searchi
ng phas
e.
2.
1.
Prepr
oces
sing
P
ha
s
e
The
pr
e
proces
sing
phase
c
onta
ins
t
he
te
c
hn
i
qu
e
s
sel
ect
ed
f
r
om
the
A
theer
a
nd
Be
r
r
y
-
Ra
vindra
n
al
gorithm
s.
These
te
chn
iq
ues
are
regulat
ed
in
functi
ons
to
ob
ta
in
the
ex
a
ct
string
m
a
tc
hin
g
of
the
E
-
A
theer
al
gorithm
. Th
e
se fun
ct
io
ns ar
e presente
d
as
fo
ll
ows:
a
)
Boye
r
-
m
oo
r
e Bad C
har
act
e
r
(Bm
bc)
F
unc
ti
on
The
te
ch
nique
sel
ect
ed
from
t
his
functi
on
is
si
m
il
ar
to
that
in
the
prep
ro
ce
ssing
ph
a
se
of
the
Athee
r
al
gorithm
.
The
m
ai
n
purpose
of
us
in
g
th
e
bm
Bc
ta
ble
in
this
functi
on
i
s
to
determ
ine
and
c
hoose
th
e
best
sh
ifti
ng
f
or
ea
ch
cha
racter
in
the
m
at
ching
op
e
rati
on
as
sh
ow
n
in
Lines
21
–
26,
Fi
g
ure
1
.
The
form
of
th
e
bm
Bc
f
un
ct
io
n i
s d
e
fine
d by the e
qu
at
io
n bel
ow.
m
in
{i
: 1
≤
i <
m
−
1
and
x
[
m
−1−i]
=c}
if c o
ccur
s in
x,
bm
Bc
[
c]
(1)
m
oth
erw
ise
b) Berry
-
r
a
vindra
n
Ba
d
C
harac
te
r
(Br
bc
)
F
unct
ion
The
te
ch
nique
in
this
f
unct
io
n
is
sel
ect
ed
from
the
Be
rr
y
-
Ra
vindra
n
al
gorithm
.
The
m
ai
n
pur
pose
of
us
in
g
t
he
br
Bc
ta
ble
in
this
functi
on
is
to
determ
ine
and
choose
t
he
m
a
xim
u
m
value
in
the
s
hiftin
g
proces
s
as
show
n
in
Li
nes
5
-
19,
Fig
ure
1
.
The
f
or
m
of
th
e
brBc
f
unct
ion
is
def
i
ne
d
by
the
e
qua
ti
on
bel
ow.
T
he
m
ai
n
te
xt for
m
at
co
nsi
sts of a
flat
left
-
ri
gh
t
.
1
if P
[m
−1] = u
,
m
–
i + 1
if
P[i
]
P[
i+
1] =
uv,
(2)
m
+ 1
if P
[
0] = v
,
m
+ 2
oth
erw
ise
.
br
Bc
[
u,
v
]
=
min
c)
Hash
i
ng F
unct
ion
This
f
unct
io
n
is
der
i
ve
d
f
r
om
the
Athee
r
al
gorithm
and
the
has
hing
te
ch
nique
of
E
-
Athee
r
al
gorithm
,
wh
i
ch
is
sim
il
ar
to
the
has
hing
te
chn
i
qu
e
of
t
he
Athee
r
al
go
rith
m
.
The
pro
pos
ed
hybri
d
al
gorithm
us
es
t
he
(
Fh)
a
nd
(Fh
w)
sym
bo
ls
f
or
first
ha
sh
in
g
ste
p
in
patte
rn
a
nd
fi
r
st
has
hing
ste
p
in
the
te
xt
wi
ndow,
Evaluation Warning : The document was created with Spire.PDF for Python.
In
t J
Elec
& C
om
p
Eng
IS
S
N: 20
88
-
8708
Th
e
Impr
ove
d Hyb
ri
d
Al
go
rit
hm for
t
he
At
he
er an
d
..
.
(
Ath
eer Akr
am A
.
R
.
)
4323
resp
ect
ively
.
It
us
es
the
(Sh)
sy
m
bo
l
for
sec
ond
has
hing
a
nd
(T
h)
for
thi
rd
has
hing
ste
ps
in
t
he
patte
rn
as
sh
ow
n
in
Line
s
28
–
34; Fig
ure
1.
The
h
as
hing
value
is cal
cul
at
ed
usi
ng the
foll
ow
i
ng equat
ion
s:
First ha
sh
i
ng step
:
(
wF
[0, m
∕
2, m
-
1])
= (
w
F [0] ×2
u
-
1 +
wF
[m
/2]
× 2
u
-
2 +
wF [m
-
1]
×20) m
od
q
(3)
Seco
nd h
a
sh
i
ng ste
p:
(
wS
[
1… m
∕
2
-
1])
=
(
wS
[1
]
×
2u
-
1 +wS
[2
]
×
2u
-
2
+
...
+
wS [m
∕
2
-
1] ×
20)
m
od
q
(4)
Thir
d has
hing
ste
p:
(
wT [m
∕
2+1... m
-
2])
=
(
w
T
[m
∕
2+1
]
×2u
-
1
+
wT [m
∕ 2+2]×2
u
-
2 +.
..+ wT
[m
-
2]
×20)
m
od
q
(5)
Fig
ure
1. Pr
e
pr
ocessin
g p
hase
in
the
E
-
Athe
e
r
al
gorithm
2.2. Searc
hing
Ph
as
e
The
sea
rch
i
ng
p
ha
se
te
ch
niqu
e
in
the
E
-
Athe
er
al
gorithm
dep
en
ds
on
the
s
earchi
ng
phase
te
chn
iq
ues
of
t
he
Athee
r
a
nd
Be
rr
y
-
Ra
vi
ndra
n
al
gorith
m
s
and
on
s
ome
of
the
m
odulati
on
s
ob
ta
i
ned
durin
g
t
he
m
a
tc
hi
ng
op
e
rati
on.
I
n
the
first
ste
p,
th
e
hash
val
ues
of
the
th
ree
ch
aracte
rs
in
patt
ern
(
Fh)
are
co
m
par
ed
with
th
e
hash
values
of
t
he
three
c
har
act
e
rs
(F
hw
)
in
the
t
ext
wi
nd
ow.
I
f
a
m
at
ch
is
ob
ta
ined
betwee
n
the
has
hing
va
lues,
the
rem
ai
nin
g
three
cha
racter
s
in
the
te
xt
window
a
nd
the
rem
a
ining
th
r
ee
char
act
ers
i
n
the
patte
r
n
will
be
com
par
ed.
I
f
a
m
at
ch
is
ob
ta
ined
bet
ween
t
hese
cha
racter
s,
the
seco
nd
s
te
p
will
be
con
duct
e
d
as
s
hown
in
Line
11
;
Fig
ure
2
.
Howe
ve
r,
if
a
m
is
m
a
t
ch
is
obta
ined
in
the
ha
sh
i
ng
c
om
par
ison
or
i
n
t
he
c
harac
te
r
com
par
ison, th
e n
e
w
s
hift
of
t
his alg
or
it
hm
w
il
l dep
e
nd
on the m
axi
m
u
m
value o
f
m
f
rom
the b
m
Bc
tab
le
a
nd
the
(m
+1
and
m
+
2)
values
f
ro
m
the
br
Bc
ta
b
le
as
sh
own
in
Line
24;
Fig
ure
2
.
T
he
m
ref
ers
to
the
la
st
char
ac
te
r
in
th
e
te
xt
window,
the
m
+
1
is
t
he
first
cha
rac
te
r
after
the
te
xt
window,
a
nd
m
+2
is
the
second
char
act
e
r
after
the
te
xt
window.
T
he
re
hash
functi
on
is
then
us
e
d
to
cal
cu
la
te
the
three
char
act
er
s
of
th
e
n
e
w
te
xt w
in
dow
af
te
r
the s
hift
a
s
sh
ow
n
in
Line
26
;
Fig
ure
2
.
1.
Al
gorit
hm
E
-
At
he
er
(X
[0 …..
m−
1
]
2.
//
I
npu
t
: P
at
t
er
n X
3.
//
Out
put
: Sh
i
f
t
ta
bl
es of (bm
Bc),
(br
B
c)
an
d c
omput
e th
e hush val
ues
.
4.
//
p
r
ebr
Bc
(pr
epr
o
ce
ssi
ng
B
e
r
r
y
-
Ravi
ndran
ba
d
-
charact
er
fu
ncti
on)
5.
brB
c[A
SI
ZE][ASI
ZE]
/
/
2
D a
r
r
ay t
o
k
e
e
p shi
f
t
val
ues
6.
Fo
r
k
←
0
to ASI
ZE
Do
7.
Fo
r
j
←
0
t
o A
SI
ZE
Do
8.
brB
c[
k
][j
]
←
m+2
9.
En
d
For
10.
En
d
For
11.
Fo
r
k
←
0
to ASI
ZE
Do
12.
brB
c
[
k
][x
[0]]
←
m +
1
13.
En
d
For
14.
Fo
r
i
←
0
to m−
2
Do
15.
brB
c[x[
i
]][
x
[i
+1
]]
←
m
− i
16.
En
d
For
17.
Fo
r
k
←
0
to ASI
ZE
Do
18.
brB
c[x[
m
−1
]][
k
]
←
1
19.
En
d
For
20.
//
pre
b
m
Bc
(
pre
proc
essing
Bo
y
er
-
Moore
ba
d
-
charac
t
er
f
unct
i
on)
21.
For
j
←
0
to size o
f
alph
ab
et
Do
22.
b
m
Bc[
j]
←
m
23.
End
F
o
r
24.
For
j
←
0
to
m
−2
Do
25.
b
m
Bc
[
X[
j]]
←
m
− j
−
1
26.
End Fo
r
27.
//
Comput
e
th
e h
ush va
l
ues h =
d^S
−1
mod q
28.
Fo
r
i
←
w to S
−1
Do
29.
hy
←
(hy
<<
1
)+
y[i]
30.
End
Fo
r
31.
f
i
r
st
Ch
←
x
[0],
sec
ond
Ch
←
x
+1
,
midd
l
eCh
←
x
[m/
2
],
la
st
Ch
←
[m−
1
]
32.
//
Hash
val
ues of
al
l
steps i
n pa
t
t
er
n a
nd t
he fi
r
st
th
r
ee
characte
r
s in
tex
t
win
dow
33.
f
hx
←
(fhx
<<
1
)
+ fi
r
st
Ch, fh
x
←
(fhx
<<
1
)
+
midd
l
eCh, fhx
←
(fhx
<<
1
) +
la
st
Ch
34.
f
hy
←
(fhy<
<1) +
y[0
],
fh
y
←
(fhy<<
1
)
+ y
[
m/2], fhy
←
(fhy<
<1) + y
[
m−
1
]
Evaluation Warning : The document was created with Spire.PDF for Python.
IS
S
N
:
20
88
-
8708
In
t J
Elec
&
C
om
p
En
g,
V
ol.
8
, N
o.
6
,
Dece
m
ber
2
01
8
:
4321
-
4333
4324
Fig
ure
2. Searc
hing
ph
a
se in
the E
-
At
heer al
gorithm
In
t
he
sec
ond
ste
p,
w
hen
a
m
a
tc
h
is
obt
ai
ned
i
n
the
fi
rst
ste
p,
the
ha
sh
in
g
from
t
he
sec
ond
to
m
idd
le
-
1
cha
racters
in
t
he
te
xt
wind
ow
(
de
no
te
d
by
S
hw
)
is
cal
culat
ed
and
is
c
om
par
ed
with
t
he
ha
sh
in
g
char
act
e
rs
(
Sh)
in
t
he
patte
r
n.
If
a
m
at
ch
i
s
obta
ined
,
the
com
par
iso
n
of
cha
racters
be
tween
will
co
ntinue
(Lines
12
a
nd
14,
as
show
n
in
Fig
ure
2
.
If
ano
t
her
m
at
ch
is
ob
ta
ine
d
bet
ween
the
c
harac
te
rs,
the
thir
d
ste
p
will
com
m
enc
e.
I
f
a
m
is
m
a
tc
h
is
ob
ta
in
e
d
betwee
n
t
he
cha
racters,
th
e
sh
ift
will
de
pend
on
t
he
sam
e
te
chn
iq
ue
m
entioned
i
n
the
previ
ou
s
ste
p
as
sh
ow
n
in
Line
24
;
Fig
ure
2
a
nd
t
he
re
has
h
f
un
ct
io
n
will
be
us
ed
.
In
the
t
hir
d
st
ep,
w
hen
a
m
at
ch
is
ob
ta
ine
d
in
th
e
sec
ond
ste
p,
the
has
hi
ng
from
the
m
idd
le
+
1
to
l
ast
−
1
char
act
e
rs
in
t
he
te
xt
wi
ndow
(
denote
d
by
(Thw
))
is
cal
c
ulate
d
an
d
is
c
om
par
ed
with
the
has
hing
c
ha
racters
(Th)
in
the
pa
tt
ern
.
I
f
a
m
atch
is
ob
ta
ine
d,
the
com
par
ison
of
the
c
harac
te
rs
betwee
n
the
char
a
ct
ers
will
con
ti
nue
as
s
how
n
in
Li
nes
15
an
d
17
;
Fi
g
ure
2
.
If
a
m
at
ch
or
a
m
is
m
a
tc
h
is
ob
ta
ine
d
be
tween
t
he
cha
ra
ct
ers,
the
s
hift
will
de
pend
on
the
s
a
m
e
te
chn
iq
ue
m
entioned
on
the
pr
e
vious
s
te
ps
as
sho
wn
in
Line
24;
Fi
g
ure
2
and the last
ste
p (i.e.,
the
reha
sh
funct
io
n) w
i
ll
co
m
m
ence.
3.
PROP
OSE
D ALGO
RITH
M AN
ALY
SI
S
Pr
e
processin
g
of
t
he
pr
opose
d
al
gorithm
has
th
ree
f
unc
ti
on
s
wh
ic
h
a
re
brBc
,
bm
B
c
an
d
has
h
functi
on.
T
he
tim
e
com
plexit
y
of
brB
c
f
unc
ti
on
is
de
no
te
d
as
O
(m
+
σ
2
),
the
bm
B
c
func
ti
on
is
de
no
te
d
as
O
(m
+
σ)
and
h
as
h
f
unct
ion
is
de
no
te
d
as
O
(
m
).
The
sp
ace
and
tim
e
com
p
le
xity
of
the
pr
eprocessi
ng
ph
ase
of
the
pro
po
s
ed
a
lgorit
hm
is
den
ote
d
as
O
(m
+
σ
2
).
Th
e
tim
e
com
plexity
of
the
searchi
ng
ph
a
se
ex
plains
in
the
fo
ll
owin
g
sect
ion.
Lem
m
a
.3
.
1
The
ti
m
e
co
m
plexit
y
of
the
searchi
ng
phase
is
O
(n
/(m
+
2))
in
the
bes
t
case.
Pr
oo
f.
I
n
each
at
tem
pt,
if
any
char
act
er
do
e
s
no
t
ha
ppen
in
the
patte
rn
dur
ing
the
m
at
chi
ng
process
,
the
n
the
sh
ifti
ng
proces
s
will
dep
en
d
on
m
axi
m
u
m
value
bet
ween
m
fr
om
b
m
Bc
and
(m
+
1
and
m
+2)
fro
m
br
Bc
functi
ons
that
cal
culat
ed
in
the
prep
r
ocessi
ng
phase.
T
he
best
case
occ
urs
w
he
n
al
l
ch
aracte
rs
in
t
he
patte
rn
total
ly
diff
erent
than
tho
se
in
the
te
xt
window,
the
n
the
sh
ifti
ng
will
dep
e
nd
on
m
+
2
and
the
tim
e
co
m
p
le
xity
will
b
e
O (n
/(
m
+
2)).
Fo
r
e
xam
ple:
T
ext: y
yy
yy
yy
yy
yyyy
yyy
Patt
ern
:
xxxx
Lem
m
a.3
.2
T
he
ti
m
e co
m
plexity
o
f
the s
ea
rch
i
ng phase is
O
(n × m
)
in t
he worst ca
se
.
Pr
oo
f.
I
n
the
m
at
ching
proc
ess
ever
y
cha
r
act
er
in
the
te
xt
do
es
not
occ
ur
m
or
e
than
m
t
i
m
es
and
al
l
the
cha
racter
c
om
par
isons
f
or
n
c
har
act
er
s
of
the
te
xt
can
not
be
great
er
th
an
n
×
m
.
The
worst
case ha
ppen
s
if
al
l
the
char
act
ers
in
the
patte
rn
are
sam
e
with
tho
se
in
the
te
xt
window
i
n
eve
ry
at
tem
p
t.
Then
the
s
hiftin
g
1.
Al
gorit
hm
E
-
At
hee
r
(X [
0
….
.
m−
1
], Y [
0
……
.
n−1
]
)
2.
//
I
npu
t
: P
at
t
er
n X,
Text
Y
3.
//
Out
put
: n
umb
er
of a
t
t
em
pt
s an
d nu
mber of
chara
cte
r
co
mparisons
of p
at
t
er
n wi
t
h t
ext
4.
I
f
(
m%
2
=
= 0)
Th
e
n
5.
pa
r
←
1
6.
End
I
f
7.
j
←
0
8.
W
h
il
e
j <= n
− m
Do
9.
c
←
y[j + m
− 1
]
10.
//
C
ompa
r
i
ng
th
e Fh and
Fhw
11.
If
(fhx
=
= fhy&
&l
ast
Ch == c
&&
f
i
r
st
Ch == y
[j]
&&
mid
dl
eCh ==
y[j
+ m
/2
]
)
The
n
12.
shf
y
←
get
hy(j +
1
,
j +
m/2,
y
)
//cal
cula
t
e the
ha
sh of
(
Sh
w)
13.
/
/ Co
mpari
ng
th
e Sh and
Shw
14.
If
(shfx
=
= s
hf
y&& mat
ch(x +
1
, m
/
2
−
1
,
y,
j
+ 1, &
t
em
p)
==
1
)
T
h
en
15.
shl
y
←
get
hy(j+
m/2+
1
, j +
m−
1
,
y)
/
/ c
a
l
cula
t
e the ha
sh of
(
Th
w)
16.
// Com
p
arin
g t
he Th a
nd T
hw
17.
If
(
shl
x
==
shl
y&& mat
ch(x +
m/2 +
1
,
m/2−
1
-
pa
r
, y
, j +
m/2+
1
, &t
em
p)
==
)
T
h
en
18.
Count
//
Th
e
fi
r
st
oc
cu
r
r
e
nce o
f
th
e pat
t
er
n
i
n t
he tex
t
19.
EndIf
20.
EndIf
21.
EndIf
22.
Out
put
th
e fi
r
s
t
at
t
em
pt
an
d char
act
er
c
omparis
ons
23.
/
/shif
t
i
ng
//
24.
j +
=m
ax
(b
r
B
c[y
[j + m]][y[j
+
m+
1
]],
bmB
c[
y[j +
m−
1
]])
25.
/
/ Reha
sh op
era
t
i
on f
or
th
e text
win
dow
26.
f
hy
←
0
,fhy
←
(fhy<
<1)
+ y
[j]
,fhy
←
(fhy<
<1)
+ y
[j+
m/2],fh
y
←
(fhy
<<
1
) +
y[j+
m−
1
]
27.
End
W
h
il
e
Evaluation Warning : The document was created with Spire.PDF for Python.
In
t J
Elec
& C
om
p
Eng
IS
S
N: 20
88
-
8708
Th
e
Impr
ove
d Hyb
ri
d
Al
go
rit
hm for
t
he
At
he
er an
d
..
.
(
Ath
eer Akr
am A
.
R
.
)
4325
will
b
e
on
e
a
nd the tim
e co
m
plexity
w
il
l be
O
(
n
×
m
).
Fo
r
e
xam
ple:
Text:
yy
yy
yyyyyy
yyyyyy
Patt
ern
: y
yy
y
In
this
al
gorithm
cann
ot
accuratel
y
dete
rm
ine
the
aver
age
ti
m
e
c
om
plexity
becau
se
of
it
s
dep
e
ndence
on
the
al
ph
a
bet
siz
e
of
c
ha
ra
ct
ers
an
d
the
po
s
sibil
it
y
of
the
ap
pear
a
nc
e
of
ea
ch
c
ha
racter
ind
ivi
du
al
ly
in
the text.
4.
E
X
PERI
MEN
TAL DES
IG
N
AND E
V
AL
UA
TI
ON OF
STUDY
OUT
COME
S
The
pro
posed
al
gorithm
design
de
pe
nd
e
d
on
sel
ect
ing
the
good
featu
res
of
ori
gin
al
al
gorithm
s,
wh
ic
h
a
re
t
he
has
h
a
nd
bm
Bc
f
un
ct
io
ns
f
r
om
Atheer
al
gorithm
and
brBc
functi
on
from
Be
rr
y
-
Ra
vi
ndran
al
gorithm
.
The
pro
po
se
d
al
gorithm
us
ed
al
l
ty
pes
of
be
nc
hm
ark
sta
nda
rd
databa
ses
a
nd
t
he
re
su
lt
s
of
E
-
Athee
r
c
om
par
ed
to
r
es
ults
of the
or
i
gin
al
a
nd
recent a
nd st
and
a
r
d
al
gorit
hm
s.
4.1.
D
atabases
This
stu
dy
inve
sti
gates
the
di
ff
ere
nces
in
t
he
pe
rfo
rm
ance
and
pro
per
ti
es
of
se
ve
ral
exact
strin
g
m
at
ching
al
gor
it
h
m
s
wh
e
n
dif
fer
e
nt
ty
pes
of
databases
are
use
d
(20
0
MB
da
ta
siz
e).
T
he
ben
c
hm
ark
sta
nd
a
r
d
of
database
s
de
al
s
with
com
m
on
ty
pes
of
data,
s
uch
a
s
DNA,
Prote
in,
XML
Pit
ch,
E
ng
li
s
h
te
xt,
a
nd
S
ource
cod
e
.
These
dataset
s
wer
e
dow
nlo
ade
d
f
ro
m
the
Pizz
a
&
Chil
i
Corpus
Web
sit
e
(h
tt
p://
pizzac
hi
li
.d
cc.uc
hile.cl
/
(P
iz
za
Chil
i
Corpus).
T
w
o
patte
rn
le
ng
t
hs
wer
e
us
ed
i
n
this
stud
y:
the
sh
ort
patte
rn
le
ng
t
h,
w
hic
h
ra
nged
f
ro
m
4
cha
racters
to
28
cha
rac
te
rs,
an
d
t
he
lo
ng
patte
r
n
le
ng
th
(len
gth
pow
er
of
two),
w
hich
ra
ng
e
d
f
ro
m
2
^5
char
act
e
rs
to
2
^
10
char
act
ers
[
1
3]
,
[
1
4
].
The
DNA
data
seq
uen
ce
is
com
po
se
d
of
four
nu
cl
e
otide
s,
nam
el
y,
Ad
enine,
Gu
a
nin
e
,
Cy
tosine,
an
d
Thiam
in,
and
these
ty
pes
are
encode
d
as
A,
G,
C,
and T
res
pecti
vely
.
The
G
uten
berg
pro
j
ect
is
include
d
in
this
database
[
15]
,
[1
6].
T
he
Pro
te
ins
are
nece
ssary
to
the
structu
re
a
nd
f
un
ct
io
n
of
the
cel
ls
of
a
n
org
anism
.
The
P
r
otein
data
se
quence
is
c
om
pos
ed
of
20
am
ino
aci
ds
arr
a
ng
e
d
in
a
li
near
series
an
d
enc
od
e
d
usi
ng
uppe
rcase
le
tt
ers
[1
7]
,
[
1
8
]
.
The
databas
es
wer
e
obta
ine
d
fr
om
the
Sw
iss
-
P
ro
t
database
.
The
XML
str
ucture
database
c
on
ta
ins
the
bi
bliogra
ph
ic
in
for
m
at
ion
of
c
ompu
te
r
sci
ences.
T
he
Pit
ch
(Mi
di
Pit
ch
values
)
dat
abase
co
ntains
tun
in
g
data
use
d
in
di
gital
m
us
ic
[
19
]
,
[20].
The
En
glish
te
xt
da
ta
base
us
es
al
l
the
cha
racter
s
in
the
E
ng
li
s
h
al
pha
bet.
T
he
G
uten
berg
pro
j
ect
has
est
a
blishe
d
this
database
[
1
5
].
T
he
Sourc
e
pr
og
ram
cod
e
databas
e
is
com
po
s
ed
of
al
l
the
char
act
ers
us
e
d
in
the
C
an
d
Java la
ngua
ges
[
21
].
4.2.
I
mple
men
tation a
nd
En
vironm
ent
This experim
ent w
as c
onduct
ed
us
i
ng
t
he
Bi
runi
cl
us
te
r
in t
he
Sc
hool of Co
m
pu
te
r
Scie
nc
es at USM
(b
ir
uni.cs.
us
m
.m
y).
The
oper
at
ing
syst
em
us
ed
was
U
bunt
u
Linux
10.04
an
d
the
c
ompil
er
us
e
d
wa
s
GCC
v4.4.3.
This
st
ud
y
s
howe
d
th
at
the
hybr
i
d
al
gorithm
was
“best
in
pe
rform
ance”
as
the
resu
lt
of
t
he
hybri
d
al
gorithm
was
bette
r
tha
n
th
ose
of
t
he
or
igi
na
l
al
go
rithm
s.
The
ta
bles
of
e
valuati
on
f
or
e
ach
hy
br
i
d
al
gorithm
wer
e
ar
range
d
base
d
on
t
he
best
res
ult
an
d
fo
ll
owe
d
by
the
oth
e
r
al
gorithm
s.
The
al
gorithm
s
wer
e
the
n
ranke
d
as
“firs
t,”
“seco
nd,”
or
“t
hir
d,
”
I
n
th
e
evaluati
on
th
e
perform
ance
of
the
hybri
d
a
lgorit
hm
in
var
io
us
ty
pes
of
datab
ases,
the
res
ults
are
re
garded
as
“best”
w
hen
the
hybri
d
al
gorithm
per
f
orm
ed
bette
r
in
s
pecifi
c
databases
com
par
e
d
with
the
oth
e
r
al
gorith
m
s,
wh
ereas
“
worst”
re
fer
s
to
the
poo
rest
perform
ance
of
the
hybri
d
al
go
rith
m
fo
r
that
data
base.
When
th
e
hybri
d
or
ot
he
r
al
gorithm
s
obta
ined
the
bes
t
perform
ance
in
al
l
ty
pes
of
databa
ses, the
n
“al
l
da
ta
bases” is
us
e
d,
w
her
eas
wh
en
the
h
y
br
id
or o
t
her al
gorith
m
s o
btained
th
e b
est
perform
ance
in
alm
os
t
al
l
databases,
“m
os
t
databases”
is
use
d.
T
o
cl
ari
fy
the
res
ults
in
t
he
fi
gures
in
num
ber
of
at
te
m
pts,
th
e
pr
op
os
e
d
hybr
i
d
al
gorithm
s
sh
ow
only
(1
00
00)
dis
play
un
it
s
com
par
ed
with
the
or
iginal
al
gorithm
.
Com
par
ed
with
t
he
rece
nt
an
d
st
and
a
r
d
al
gor
it
hm
s,
the
propos
ed
al
go
rithm
has
a
lo
gar
it
hm
i
c
scal
e
and
base
of
(
10),
dis
play
unit
s
of
(
1000
0),
a
nd
m
ini
m
u
m
nu
m
ber
of
(10
0000).
I
n
nu
m
ber
of
ch
aracte
r
com
par
isons,
t
he
propose
d
hy
br
id
al
gorith
m
has
a
log
ari
thm
ic
scal
e
and
ba
se
of
(10
)
an
d
dis
play
unit
s
o
f
(10
000) com
par
ed
w
it
h t
he ori
gin
al
, st
an
dard,
and
recent al
gorithm
s.
5.
RESU
LT
S
A
ND D
I
SCUSI
ON
The
res
ults
of
E
-
At
heer
al
gor
it
h
m
co
m
par
ed
with
tho
se
of
the
or
i
gin
al
al
gorithm
s
in
first
ste
p
and
then
with
th
ose
of
t
he
recent
an
d
sta
ndar
d
al
gorithm
s
in
seco
n
d
ste
p.
N
um
ber
of
at
te
m
pts
and
num
ber
of
char
act
e
r
com
par
is
ons
co
ns
i
der
e
d
the
m
ai
n
facto
rs
that
use
d
in
t
his
stu
dy
.
The
databas
es
us
e
d
in
t
his
stud
y
are
D
N
A,
Prot
ei
n,
XML,
Pit
ch,
E
ngli
sh
,
an
d
Sour
ce
.
T
he
s
iz
e
of
data
is
200
MB
.
T
wo
pa
tt
ern
le
ngths
wer
e
us
e
d
, whic
h
a
r
e shor
t
(
4
to
28
)
a
nd lo
ng that
dep
e
nds
on the
leng
t
h powe
r of t
w
o
(
2
^5
t
o 2
^
10
).
Evaluation Warning : The document was created with Spire.PDF for Python.
IS
S
N
:
20
88
-
8708
In
t J
Elec
&
C
om
p
En
g,
V
ol.
8
, N
o.
6
,
Dece
m
ber
2
01
8
:
4321
-
4333
4326
5.1.
E
valua
tio
n and
Resul
ts An
aly
sis
of th
e E
-
A
th
eer
A
l
go
ri
th
m
and
t
he Origin
al
The
E
-
Atheer
al
gorithm
ob
ta
ins
the
best
re
su
lt
s
com
par
e
d
with
the
Be
rr
y
-
Ra
vindra
n
and
At
heer
al
gorithm
s
in
bo
t
h
s
hort
an
d
long
patte
r
ns.
The
Pit
ch
dat
abase
s
hows
t
he
be
st
res
ults
in
num
ber
of
at
tem
p
t
s
wh
e
n
usi
ng
sh
ort
an
d
long
patte
rn
s,
whereas
the
D
N
A
d
at
aba
se
sh
ow
t
he
w
ors
t
resu
lt
s
as
sh
ow
n
in
Figures
3
an
d
4
.
T
he
E
-
Athee
r
al
gorit
hm
ob
ta
ins
the
be
st
res
ults
com
par
ed
with
the
Athe
er
an
d
Be
rr
y
-
r
avi
ndra
n
al
go
rithm
s
i
n
bo
t
h
s
hort
a
nd
lo
ng
patte
r
ns
.
The
S
ourc
e
databa
se
s
hows
the
be
st
re
su
lt
s
in
nu
m
ber
of
c
ha
racter
c
om
pa
risons
wh
e
n
us
i
ng
s
hort
an
d
long
patte
r
ns
,
wh
e
reas
t
he
D
NA
data
base
s
hows
the
worst re
su
lt
s
a
s sho
wn in
Fig
ur
es
5 an
d 6
.
Figure
3. N
umber
of att
em
pts
for
E
-
Athee
r
c
om
par
ed wit
h
t
he ori
gin
a
l al
gorithm
s w
he
n usin
g
a
sho
rt
pa
tt
ern
and a
200
MB
data
siz
e
Figure
4. N
umber
of att
em
pts
for
E
-
Athee
r
c
om
par
ed wit
h
t
he ori
gin
al
alg
or
it
hm
s w
he
n usin
g
a
lo
ng p
a
tt
ern
and a
200
MB
data siz
e
0
2000
4000
6000
8000
4
8
12
16
20
24
28
Number
of
at
t
empt
s
x 1
00
00
Pat
t
er
n
le
ngth
DNA
BR
AT
E-A
T
0
1000
2000
3000
4000
5000
4
8
12
16
20
24
28
Number
of
at
t
empt
s
x 1
00
00
Pat
t
er
n
le
ngth
P
rotei
n
BR
AT
E-A
T
0
1000
2000
3000
4000
5000
4
8
12
16
20
24
28
Number
of
at
t
empt
s
x 1
00
00
Pat
t
er
n
le
ngth
XM
L
BR
AT
E-A
T
0
1000
2000
3000
4000
5000
4
8
12
16
20
24
28
Number
of
at
t
empt
s
x 1
00
00
Pat
t
er
n
le
ngth
P
i
tc
h
BR
AT
E-A
T
0
1000
2000
3000
4000
5000
4
8
12
16
20
24
28
Number
of
at
t
empt
s
x 1
00
00
Pat
t
er
n
le
ngth
En
gl
i
sh
BR
AT
E-A
T
0
1000
2000
3000
4000
5000
4
8
12
16
20
24
28
Number
of
at
t
empt
s
x 1
00
00
Pat
t
er
n
le
ngth
So
urc
e
BR
AT
E-A
T
0
2000
4000
6000
8000
32
64
128
256
512
10
2
4
Number
of
at
t
empt
s
x 1
00
00
Pat
t
er
n
le
ngth
DNA
BR
AT
E-A
T
0
500
1000
1500
32
64
128
256
512
10
2
4
Number
of
at
t
empt
s
x 1
00
00
Pat
t
er
n
le
ngth
P
rotei
n
BR
AT
E-A
T
0
200
400
600
800
1000
32
64
128
256
512
10
2
4
Number
of
at
t
empt
s
x 1
00
00
Pat
t
er
n
le
ngth
XM
L
BR
AT
E-A
T
0
200
400
600
800
1000
32
64
128
256
512
10
2
4
Number
of
at
t
empt
s
x 1
00
00
Pat
t
er
n
le
ngth
P
i
tc
h
BR
AT
E-A
T
0
200
400
600
800
1000
32
64
128
256
512
1024
Number
of
a
t
t
e
mpts
x
10
00
0
Pat
t
er
n
le
ngth
E
ng
lish
BR
AT
E-A
T
0
200
400
600
800
1000
32
64
128
256
512
10
2
4
Number
of
at
t
empt
s
x 1
00
00
Pat
t
er
n
le
ngth
So
urc
e
BR
AT
E-A
T
Evaluation Warning : The document was created with Spire.PDF for Python.
In
t J
Elec
& C
om
p
Eng
IS
S
N: 20
88
-
8708
Th
e
Impr
ove
d Hyb
ri
d
Al
go
rit
hm for
t
he
At
he
er an
d
..
.
(
Ath
eer Akr
am A
.
R
.
)
4327
Figure
5. N
umber
of c
har
act
e
r
c
om
par
ison
s
for
E
-
At
heer c
om
par
ed wit
h
t
he ori
gin
al
alg
or
it
hm
s w
he
n usi
ng
a shor
t
patte
r
n and a
200
MB
data siz
e
Fig
ure
6. N
umber
of c
har
act
e
r
c
om
par
ison
s
for
E
-
At
heer c
om
par
ed wit
h
t
he ori
gin
al
alg
or
it
hm
s w
he
n usin
g
a long
patte
r
n and a
200
MB
data siz
e
5.2.
E
valua
tio
n and
Resul
ts An
aly
sis
of th
e E
-
A
th
eer
Al
go
ri
th
m
and
Recen
t and
Standard
A
l
go
ri
th
ms
The
E
-
At
heer
al
gorithm
con
sidere
d
the
best
al
gorithm
in
al
l
ty
pes
of
da
ta
bases
w
he
n
us
in
g
s
hor
t
patte
rn
e
xce
pt w
he
n usin
g D
NA
it
w
a
s the
seco
nd
best. T
he
Ma
xim
u
m
sh
ift al
gorithm
i
s the b
est
al
gor
it
h
m
in
al
l
da
ta
bases
wh
e
n
us
i
ng
lo
ng
patte
r
n
a
nd
fo
ll
ow
e
d
by
t
he
E
-
At
heer
a
lgorit
hm
excep
t
w
he
n
us
in
g
Pit
ch
database
it
is
t
he
best
with
E
-
Athee
r.
The
T
wo
-
way
al
gorit
hm
is
the
w
ors
t
al
gorithm
in
s
hort
a
nd
lo
ng
pa
tt
ern
le
ng
th
s
as
sho
wn in
Fig
ur
es
7
a
nd 8
.
The
E
-
At
heer
a
lgorit
hm
con
s
idere
d
the
best
al
gorithm
in
al
l
databases
wh
e
n
us
in
g
s
hort
patte
rn,
wh
il
e
AK
R
A
M
is
the
best
al
gorithm
in
al
l
databases
an
d
E
-
At
heer
is
the
sec
ond
best
al
gorithm
us
ing
lo
ng
patte
rn
le
ngth
in
al
l
data
base
s
exce
pt
XML
the
E
-
At
heer
and
A
KR
AM
are
the
be
st.
T
he
T
w
o
-
way
is
the
worst alg
ori
th
m
in shor
t a
nd
long
patte
r
n l
eng
t
hs
as
s
hown in F
i
gures 9 a
nd 10
.
0.00
01
0.01
1
100
10000
4
8
12
16
20
24
28
Nu
m
b
e
r
of
ch
a
r
a
ct
e
r
co
m
p
a
r
i
so
n
s
x
10
00
0
Pat
t
er
n
le
ngth
DNA
BR
AT
E-A
T
0.00
01
0.01
1
100
10000
4
8
12
16
20
24
28
Nu
m
b
e
r
of
ch
a
r
a
ct
e
r
co
m
p
a
r
i
so
n
s
x
10
00
0
Pat
t
er
n
le
ngth
P
rotei
n
BR
AT
E-A
T
0.00
01
0.01
1
100
10000
4
8
12
16
20
24
28
Nu
m
b
e
r
of
ch
a
r
a
ct
e
r
co
m
p
a
r
i
so
n
s
x
10
00
0
Pat
t
er
n
le
ngth
XM
L
BR
AT
E-A
T
0.00
01
0.01
1
100
10000
4
8
12
16
20
24
28
Nu
m
b
e
r
of
ch
a
r
a
ct
e
r
co
m
p
a
r
i
so
n
s
x
10
00
0
Pat
t
er
n
le
ngth
P
i
tc
h
BR
AT
E-A
T
0.00
01
0.01
1
100
10000
4
8
12
16
20
24
28
Nu
m
b
e
r
of
ch
a
r
a
ct
e
r
co
m
p
a
r
i
so
n
s
x
10
00
0
Pat
t
er
n
le
ngth
En
gl
i
sh
BR
AT
E-A
T
0.00
01
0.01
1
100
10000
4
8
12
16
20
24
28
Nu
m
b
e
r
of
ch
a
r
a
ct
e
r
co
m
p
a
r
i
so
n
s
x
10
00
0
Pat
t
er
n
le
ngth
So
urc
e
BR
AT
E-A
T
0.00
01
0.01
1
100
10000
32
64
128
256
512
10
2
4
Nu
m
b
e
r
of
ch
a
r
a
ct
e
r
co
m
p
a
r
i
so
n
s
x
10
00
0
Pat
t
er
n
le
ngth
DNA
BR
AT
E-A
T
0.00
01
0.01
1
100
32
64
128
256
512
10
2
4
Nu
m
b
e
r
of
ch
a
r
a
ct
e
r
co
m
p
a
r
i
so
n
s
x
10
00
0
pa
t
t
er
n
le
ngth
P
rotei
n
BR
AT
E-A
T
0.00
01
0.01
1
100
32
64
128
256
512
10
2
4
Nu
m
b
e
r
of
ch
a
r
a
ct
e
r
co
m
p
a
r
i
so
n
s
x
10
00
0
Pat
t
er
n
le
ngth
XM
L
BR
AT
E-A
T
0.00
01
0.01
1
100
32
64
128
256
512
10
2
4
Nu
m
b
e
r
of
ch
a
r
a
ct
e
r
co
m
p
a
r
i
so
n
s
x
10
00
0
Pat
t
er
n
le
ngth
P
i
tc
h
BR
AT
E-A
T
0.00
01
0.01
1
100
32
64
128
256
512
10
2
4
Nu
m
b
e
r
of
ch
a
r
a
ct
e
r
co
m
p
a
r
i
so
n
s
x
10
00
0
Pat
t
er
n
le
ngth
En
gl
i
sh
BR
AT
E-A
T
0.00
01
0.01
1
100
32
64
128
256
512
10
2
4
Nu
m
b
e
r
of
ch
a
r
a
ct
e
r
co
m
p
a
r
i
so
n
s
x
10
00
0
Pat
t
er
n
le
ngth
So
urc
e
BR
AT
E-A
T
Evaluation Warning : The document was created with Spire.PDF for Python.
IS
S
N
:
20
88
-
8708
In
t J
Elec
&
C
om
p
En
g,
V
ol.
8
, N
o.
6
,
Dece
m
ber
2
01
8
:
4321
-
4333
4328
Figure
7. N
umber
of att
em
pts
for
E
-
Athee
r
c
om
par
ed wit
h r
ecent an
d st
an
dard al
gorithm
s when u
sin
g
a shor
t
patte
r
n and a
200
MB
data siz
e
Figure
8. N
umber
of att
em
pts
for
E
-
Athee
r
c
om
par
ed wit
h r
ecent an
d st
an
dard al
gorithm
s when u
sin
g
a long
patte
r
n and a
200
MB
data siz
e
10
100
1000
10000
100000
4
8
12
16
20
24
28
Number
of
at
t
empt
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
DNA
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
10
100
1000
10000
100000
4
8
12
16
20
24
28
Number
of
at
t
empt
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
P
rotei
n
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
10
100
1000
10000
100000
4
8
12
16
20
24
28
Number
of
at
t
empt
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
XM
L
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
10
100
1000
10000
100000
4
8
12
16
20
24
28
Number
of
at
t
empt
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
P
i
tc
h
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
10
100
1000
10000
100000
4
8
12
16
20
24
28
Number
of
at
t
empt
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
En
gl
i
sh
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
10
100
1000
10000
100000
4
8
12
16
20
24
28
Number
of
at
t
empt
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
So
urc
e
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
10
100
1000
10000
100000
32
64
128
256
512
10
2
4
Number
of
at
t
empt
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
DNA
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
10
100
1000
10000
100000
32
64
128
256
512
10
2
4
Number
of
at
t
empt
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
P
rotei
n
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
10
100
1000
10000
100000
32
64
128
256
512
10
2
4
Number
of
at
t
empt
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
XM
L
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
10
100
1000
10000
100000
32
64
128
256
512
10
2
4
Number
of
at
t
empt
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
P
i
tc
h
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
10
100
1000
10000
100000
32
64
128
256
512
10
2
4
Number
of
at
t
empt
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
En
gl
i
sh
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
10
100
1000
10000
100000
32
64
128
256
512
10
2
4
Number
of
at
t
empt
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
So
urc
e
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
Evaluation Warning : The document was created with Spire.PDF for Python.
In
t J
Elec
& C
om
p
Eng
IS
S
N: 20
88
-
8708
Th
e
Impr
ove
d Hyb
ri
d
Al
go
rit
hm for
t
he
At
he
er an
d
..
.
(
Ath
eer Akr
am A
.
R
.
)
4329
Figure
9. N
umber
of c
har
act
e
r
c
om
par
ison
s
for
E
-
At
heer
a
gainst
recent
a
nd stan
dard al
gorithm
s w
he
n usin
g
a shor
t
patte
r
n and a
200
MB
data siz
e
Figure
10. N
um
ber
o
f
c
ha
rac
te
r
com
par
iso
ns f
or
E
-
Athee
r a
gainst
rece
nt
and stan
dard al
gorithm
s w
he
n usi
ng
a long
patte
r
n and a
200
MB
data siz
e
0.00
01
0.01
1
100
10000
4
8
12
16
20
24
28
Number
of
cha
r
act
er
compa
r
is
on
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
DNA
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
0.00
01
0.01
1
100
10000
4
8
12
16
20
24
28
Number
of
cha
r
act
er
compa
r
is
on
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
P
rotei
n
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
0.00
01
0.01
1
100
10000
4
8
12
16
20
24
28
Number
of
cha
r
act
er
compa
r
is
on
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
XM
L
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
0.00
01
0.01
1
100
10000
4
8
12
16
20
24
28
Number
of
cha
r
act
er
compa
r
is
on
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
P
i
tc
h
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
0.00
01
0.01
1
100
10000
4
8
12
16
20
24
28
Number
of
cha
r
act
er
compa
r
is
on
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
En
gl
i
sh
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
0.00
01
0.01
1
100
10000
4
8
12
16
20
24
28
Number
of
cha
r
act
er
compa
r
is
on
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
So
urc
e
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
0.00
01
0.01
1
100
10000
32
64
128
256
512
10
2
4
Number
of
cha
r
act
er
compa
r
is
on
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
DNA
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
0.00
01
0.01
1
100
10000
32
64
128
256
512
10
2
4
Number
of
cha
r
act
er
compa
r
is
on
s
x 1
00
00
p
a
tt
e
r
n
l
e
n
g
th
P
rotei
n
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
0.00
01
0.01
1
100
10000
32
64
128
256
512
10
2
4
Number
of
cha
r
act
er
compa
r
is
on
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
XM
L
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
0.00
01
0.01
1
100
10000
32
64
128
256
512
10
2
4
Number
of
cha
r
act
er
compa
r
is
on
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
P
i
tc
h
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
0.00
01
0.01
1
100
10000
32
64
128
256
512
10
2
4
Number
of
cha
r
act
er
compa
r
is
on
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
En
gl
i
sh
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
0.00
01
0.01
1
100
10000
32
64
128
256
512
10
2
4
Number
of
cha
r
act
er
compa
r
is
on
s
x 1
00
00
Pa
tt
e
r
n
l
e
n
g
th
So
urc
e
H
QS
T-W
FS
SS
TV
AK
MS
E-A
T
Evaluation Warning : The document was created with Spire.PDF for Python.
IS
S
N
:
20
88
-
8708
In
t J
Elec
&
C
om
p
En
g,
V
ol.
8
, N
o.
6
,
Dece
m
ber
2
01
8
:
4321
-
4333
4330
5.
3
.
E
valua
tio
n of the E
-
A
the
er Alg
orit
hm
Comp
ared
w
ith
th
e
Origin
al
A
lg
orithm
s
The
perform
a
nce
re
su
lt
s
of
the
E
-
Athee
r
al
gorithm
and
the
or
igi
nal
a
lgorit
hm
s
are
com
par
ed
i
n
te
rm
s
of
the
num
ber
of
at
te
m
pts
and
t
he
nu
m
ber
of
c
ha
racter
c
om
par
isons
in
bo
t
h
s
hort
a
nd
lo
ng
patte
rns
with
dif
fer
e
nt
data
ty
pes
and
siz
es.
Table
1
show
s
c
om
par
iso
n
of
the
r
esults
of
the
e
-
at
heer
a
nd
or
i
gin
al
al
gorithm
s
.
Table
1.
C
om
par
iso
n of t
he
R
esults
of t
he
e
-
at
heer
a
nd
Or
i
gin
al
Algorith
m
s
Alg
o
rith
m
s
Data size
(
2
0
0
MB)
Sh
o
rt
Lon
g
Bes
t perf
o
r
m
an
ce
in
Nu
m
b
er
o
f
atte
m
p
ts
Berr
y
-
Rav
in
d
ran
Seco
n
d
Seco
n
d
Ath
eer
Third
Third
E
-
Ath
ee
r
First
First
Bes
t perf
o
r
m
an
ce
in
Nu
m
b
er
o
f
char
acter
co
m
p
ariso
n
s
Berr
y
-
Rav
in
d
ran
Third
Third
Ath
eer
Seco
n
d
Seco
n
d
E
-
Ath
ee
r
First
First
The
E
-
At
heer
al
gorithm
ob
ta
ins
the
best
res
ults
in
te
rm
s
of
the
num
ber
of
at
te
m
pts
m
ad
e
beca
us
e
it
dep
e
nds
on
t
he
best
sh
i
fting
f
un
ct
io
n
from
t
he
m
axi
m
u
m
value
of
brBc
a
nd
bm
Bc
.
The
E
-
At
heer
al
gorith
m
ob
ta
in
s
the
lowest
nu
m
ber
of
char
act
e
r
co
m
par
ison
s
bec
ause
it
reli
es
on
the
has
h
fun
ct
ion
,
th
us
sim
plifyi
ng
the
cha
racter
c
om
par
ison
bet
ween
patte
r
ns
and
te
xts
[
22
]
.
The
best
s
hifting
functi
ons
al
so
hel
p
re
du
ce
the
n
um
ber
of
c
ha
racter
c
om
par
isons
as
s
how
n
in
Table
1
.
Ta
ble
2
s
how
s
p
e
rfor
m
ance
of
the
e
-
at
heer
al
gorithm
in d
i
ff
e
ren
t
dat
abase ty
pes
.
Table
2.
Per
for
m
ance of th
e
e
-
at
heer
Al
gorithm
in
Diff
e
re
nt
D
at
abase
Typ
es
Perf
o
r
m
an
ce
Databas
es
Data size
2
0
0
M
B
Pattern len
g
th
Sh
o
rt
Lon
g
Atte
m
p
ts
Bes
t
Pitch
Pitch
W
o
rst
DNA
DNA
Ch
arac
ter
co
m
p
ari
so
n
s
Bes
t
So
u
rce
So
u
rce
W
o
rst
DNA
DNA
The
E
-
At
heer
al
gorithm
ob
ta
ins
the
fe
west
nu
m
ber
of
at
te
m
pts
in
the
Pi
tc
h
databa
se
be
cause
thi
s
al
gorithm
depends
on
t
wo
good
f
unct
io
ns,
nam
el
y,
has
h
a
nd
bm
Bc
,
in
the
Athee
r
al
gorithm
[
3
]
;
these
functi
ons
are
consi
der
e
d
e
ffi
ci
ent
wh
e
n
e
m
plo
ye
d
in
th
e
Pit
ch
data
ba
se
[2
3
]
.
Pit
ch
data
co
ntain
a
high
per
ce
ntage
of
nu
m
ber
s
beca
us
e
the
data
a
re
enc
oded
as
MID
I
pitc
h
nu
m
ber
s
i
n
co
m
pu
te
r
app
li
c
at
ion
s
[2
4]
,
[
2
5
]
.
T
he
hash
f
un
ct
io
n
al
so
us
es
i
nteg
er
num
ber
s
an
d
the
bm
Bc
fu
nction,
w
hich
i
s
consi
dered
a
good
sh
ifti
ng
functi
on that
help
s red
uce th
e
num
ber
of
att
em
pts.
The
E
-
Athee
r
al
gorithm
ob
ta
ins
the
lo
west
nu
m
ber
of
ch
aracte
r
com
pari
so
ns
in
t
he
S
ource
co
de
because
it
reli
es
on
the
ef
fici
ency
of
the
Athee
r
te
ch
nique.
T
he
Sou
rc
e
databa
se
al
s
o
ben
e
fits
f
rom
this
te
chn
iq
ue.
T
he
two
al
gorithm
s
us
e
the
hash
functi
on
in
databases
with
la
r
ge
al
ph
a
bet
siz
es
to
pr
od
uce
la
rge
has
h
val
ues,
t
hu
s
re
du
ci
ng
the
pr
ob
a
bili
ty
of
c
ha
racter
com
par
ison.
T
he
E
-
Athe
er
a
nd
Athee
r
al
gorithm
s
sh
ow
t
he hig
he
st n
um
ber
of at
tem
pts an
d cha
racter c
om
par
isons i
n t
he D
N
A data
base
as s
how
n
in
Ta
ble
2
.
5.4.
E
valua
tio
n of the E
-
A
the
er Alg
orit
hm
Comp
ared
W
ith Rec
ent
an
d S
t
andar
d
Algo
ri
th
ms
The
perf
or
m
ance
res
ults
of
t
he
E
-
At
heer
a
lgorit
hm
and
t
he
recent
a
nd
sta
nd
a
rd
al
gori
thm
s
wer
e
com
par
ed
in
te
rm
s
of
the
num
ber
of
at
tempts
an
d
cha
racter
com
par
iso
ns
us
in
g
short
an
d
long
patte
r
ns,
wit
h
diff
e
re
nt
data
ty
pes
an
d
siz
e
s.
T
he
sta
nd
a
r
d
a
nd
rece
nt
a
lgorit
hm
s
e
m
p
loye
d
in
this
s
tud
y
a
re
H
ors
pool
,
Qu
ic
k
-
sea
rc
h,
Tw
o
-
way,
Fas
t
search
,
SS
A
BS,
T
VS
BS,
AK
R
AM,
a
nd
Ma
xim
u
m
sh
ift.
Table
3
s
how
s
c
om
par
ison o
f t
he
res
ults
between t
he
e
-
at
he
er alg
or
it
hm
a
nd r
ece
nt a
nd s
ta
nd
a
rd alg
or
it
hm
s
.
The
E
-
Athee
r
al
gorithm
ob
ta
ins
the
fe
west
nu
m
ber
of
at
te
m
pts
wh
e
n
us
i
ng
s
hort
patte
r
ns
beca
use
this
al
gorithm
de
pends
on
t
he
e
ff
ic
ie
nt
s
hi
fting
functi
on
s
(
bm
Bc
and
br
Bc
)
of
t
he
Athee
r
al
gorit
hm
.
The
Ma
xim
u
m
sh
ift
al
gorithm
sh
ows
t
he
f
ewest
nu
m
ber
of
at
te
m
pts
becau
se
t
his
al
gorit
hm
reli
es
on
the
ef
f
ic
ie
nt
functi
ons
of
(
z
tB
c)
and
(
qs
B
c)
in
long
patte
rn
s
[
26]
.
The
E
-
Athee
r
al
gorithm
ob
ta
ins
the
lowest
num
ber
of
char
act
e
r
com
par
is
ons
w
hen
us
in
g
short
pa
tt
ern
s
beca
us
e
this
al
go
rit
hm
dep
e
nds
on
th
e
us
ef
ul
te
chn
i
qu
e
of
the
Athee
r
al
gorithm
in
co
m
par
i
ng
c
har
act
ers.
I
f
a
m
is
m
a
tc
h
is
ob
ta
ine
d
in
the
second
ste
p,
the
loss
will
on
ly
Evaluation Warning : The document was created with Spire.PDF for Python.