IAES
International
Journal
of
Articial
Intellig
ence
(IJ-
AI)
V
ol.
14,
N
o.
3,
June
2025,
pp.
2537
∼
2546
ISSN:
2252-8938,
DOI:
10.11591/i
jai.v14.i3.pp2537-2546
❒
2537
Ev
aluating
sear
c
h
k
e
y
distribution
im
pact
on
sear
c
hing
per
f
ormance
in
larg
e
dat
a
str
eams
Bo
w
onsak
Srisungsittisunti
1
,
Jira
w
at
Duangkae
w
1
,
N
akarin
Chaikae
w
2
1
Depar
tment
of
Computer
Engineer
ing,
Sc
hool
of
Inf
or
mation
and
Communication
T
ec
hnology
,
U
niv
ersity
of
Pha
y
ao,
Pha
y
ao,
Thailand
2
Depar
tment
of
Geog
raphic
Inf
or
mation
Science,
Sc
hool
of
Inf
or
mation
and
Communication
T
ec
hnology
,
U
niv
ersity
of
Pha
y
ao,
Pha
y
ao,
Thailand
Article
Inf
o
Ar
ticle
histor
y
:
R
eceiv
ed
Oct
9,
2024
R
e
vised
Jan
29,
2025
A
ccepted
Mar
15,
2025
K
eyw
or
ds:
Binar
y
searc
h
tree
Dense
inde
xing
Inde
xing
eciency
Lar
g
e-scale
dataset
Sparse
inde
xing
ABS
TRA
CT
The
dis
tr
ibution
patter
n
of
searc
h
k
e
y
s
is
assessed
in
this
s
tudy
b
y
contras
ting
f
our
methods
of
inde
x
searc
hing
on
lar
g
e-scale
JSON
les
with
data
s
treams.
The
A
delson-
V
elskii
and
Landis
(A
VL)
tree,
binar
y
searc
h
tree
(BS
T),
linear
searc
h
(LS),
and
binar
y
searc
h
(BS)
are
among
the
searc
h
s
trategies.
W
e
look
at
the
nor
mal
dis
tr
ibution,
left-sk
e
w
ed
dis
tr
ibution,
and
r
ight-sk
e
w
ed
dis
tr
ibution
of
searc
h-k
e
y
dis
tr
ibutions.
A
ccording
to
the
results,
LS
per
f
or
ms
the
slo
w
es
t,
a
v
eraging
653.166
milliseconds,
whereas
A
VL
tree
per
f
or
ms
better
than
the
others
in
dense
inde
x,
with
an
a
v
erag
e
searc
h
time
of
0.005
milliseconds.
W
ith
0.011
milliseconds
per
k
e
yw
ord
f
or
sparse
inde
x,
BS
outper
f
or
ms
LS,
whic
h
a
v
erag
es
1007.848
milliseconds.
F
or
dense
inde
xing,
an
A
VL
tree
w
orks
bes
t;
f
or
sparse
inde
xing,
BS
is
recommended.
This
is
an
open
access
ar
ticle
under
t
he
CC
B
Y
-SA
license.
Cor
r
esponding
A
uthor
:
Jira
w
at
Duangkae
w
Depar
tment
of
Computer
Engineer
ing,
Sc
hool
of
Inf
or
mation
and
Communication
T
ec
hnology
U
niv
ersity
of
Pha
y
ao
19
Area
2
Maeka,
Pha
y
ao,
Thailand
Email:
jira
w
at.du@outlook.com
1.
INTR
ODUCTION
In
our
pre
vious
researc
h
[1],
w
e
introduced
a
method
to
enhance
data
retr
ie
v
al
eciency
in
lar
g
e-
scale
Ja
v
aScr
ipt
Object
N
otation
(JSON)
datasets
b
y
using
inde
xing
tec
hniq
ues.
Our
approac
h
applies
dense
inde
xing
tec
hniq
ues
f
or
one-to-one
dataset
relationships
and
sparse
inde
xing
f
or
one-to-man
y
relationships,
in
compar
ison
with
non-inde
x
ed
cases.
The
researc
h
w
as
conducted
in
tw
o
main
segments.
The
rs
t
in
v
ol
v
ed
e
xper
imenting
with
accessing
positions
within
the
inde
x,
and
the
second
in
v
ol
v
ed
data
retr
ie
v
al
within
segments
using
both
inde
xing
types,
sho
w
casing
the
ability
to
skip
segments
to
access
lar
g
e-scale
Bigdata.json
le
sets.
In
ter
ms
of
time
eciency
f
or
position
access
with
a
searc
h
k
e
y
,
dense
inde
xing
a
v
erag
ed
59.175
milliseconds,
whereas
non-dense
inde
xing
a
v
erag
ed
15,635.232
milliseconds.
Sparse
inde
xing
a
v
erag
ed
36.387
milliseconds,
while
non-sparse
inde
xing
took
an
a
v
erag
e
of
15,635.232
milliseconds.
A
ccessing
data
positions
b
y
searc
h
k
e
y
from
pre
vious
researc
h
[1]
s
till
f
ollo
w
s
a
linear
searc
h
(LS)
patter
n,
whic
h
is
ecient
onl
y
when
there
is
an
inde
x
order
.
F
or
this
researc
h,
e
v
aluating
the
impact
of
searc
h
k
e
y
dis
tr
ibution
on
searc
hing
tec
hniq
ue
per
f
or
mance
in
lar
g
e-scale
JSON
les
that
contain
data
s
treams
is
proposed.
This
e
v
aluation
compares
the
time
eciency
of
accessing
data
locations
b
y
searc
h
k
e
y
:
dense
inde
x
f
or
one-to-one
data
and
sparse
inde
x
f
or
one-to-man
y
data.
Section
2.2
e
xplains
the
approac
hes
to
appl
ying
LS,
binar
y
searc
h
(BS),
binar
y
searc
h
tree
(BS
T),
and
A
delson-
V
elskii
Landis
tree
(A
VL)
tec
hniq
ues.
Exper
iments
are
conducted
on
the
c
haracter
is
tics
Journal
homepag
e:
http://i
jai.iaescor
e.com
Evaluation Warning : The document was created with Spire.PDF for Python.
2538
❒
ISSN:
2252-8938
of
searc
h
k
e
y
s
in
dense
and
sparse
inde
x
es,
whic
h
ha
v
e
three
dierent
dis
tr
ibution
patter
ns
as
descr
ibed
in
section
2.1,
consis
ting
of
searc
h
k
e
y
s
f
or
nor
mal
dis
tr
ibution,
left-sk
e
w
dis
tr
ibution,
and
r
ight-sk
e
w
dis
tr
ibution.
The
aim
is
to
s
tudy
whic
h
tec
hniq
ue
has
the
highes
t
and
lo
w
es
t
time
eciency
in
accessing
data
locations
in
the
lar
g
e-scale
JSON
le
set
using
searc
h
k
e
y
s
from
dense
and
sparse
inde
x
es
that
ha
v
e
dierent
searc
h
k
e
y
dis
tr
ibution
patter
ns.
In
addition,
w
e
sur
v
e
y
ed
to
oer
a
comprehensiv
e
unders
tanding
of
cur
rent
researc
h
in
this
eld.
Pre
vious
researc
h
in
v
ol
v
ed
impro
ving
inde
x
eciency
,
suc
h
as
through
parallel
inde
xing,
h
ybr
id
methodologies,
and
adaptiv
e
inde
xing,
whic
h
ha
v
e
the
potential
to
signicantl
y
enhance
inde
xing
per
f
or
mance.
The
y
can
reduce
searc
h
durations,
decrease
s
torag
e
o
v
erhead,
and
manag
e
specialized
data
types
or
intr
icate
data
s
tr
uctures
[2]–[8].
The
deplo
yment
of
these
tec
hniq
ues
is
par
ticular
l
y
benecial
f
or
databases
containing
uncer
tain
data,
g
raph
databases,
lar
g
e-scale
g
eospatial
data
der
iv
ed
from
the
inter
net
of
things
(Io
T),
and
e
xtensiv
e
time
ser
ies
datasets.
The
researc
h
section
w
as
related
to
databases
using
LS
tec
hniq
ues.
Suc
h
inno
v
ativ
e
methods
as
path-
based
inde
x
s
tr
uctures,
online
similar
ity
searc
hes,
in
v
er
ted
inde
x
optimizations,
dynamic
compressed
lear
ned
inde
x
es,
small-space
algor
ithmic
anal
y
sis,
lear
ned
s
tr
uctures
f
or
spatial
data,
and
dis
tance-computation-free
searc
h
sc
hemes
are
emer
ging
as
inno
v
ativ
e
methods
to
enhance
database
per
f
or
mance.
These
tec
hniq
ues
can
s
treamline
q
uer
y
processing,
optimize
s
torag
e
utilization,
and
procientl
y
manag
e
specialized
data
f
or
mats
or
comple
x
data
arc
hitectures
[9]–[16].
The
application
of
these
methodologies
can
be
especiall
y
adv
antag
eous
f
or
e
xtensible
mark
up
languag
e
(XML)
databases,
uncer
tain
time
ser
ies
data,
relational
database
manag
ement
sy
s
tems,
spatial
data,
and
binar
y
code
databases.
Mean
while,
researc
h
related
to
BS,
suc
h
as
ne
xt-g
eneration
database
inde
xing,
decision
tree
algor
ithms
tailored
f
or
spatial
data,
nature-inspired
optimization
methods,
dynamic
approac
hes
in
BS
spaces,
and
random
BS
methodologies,
oers
considerable
promise
to
adv
ance
database
per
f
or
mance.
These
adv
ancements
can
s
treamline
searc
h
operations,
optimize
s
torag
e
capabilities,
and
ecientl
y
handle
specialized
data
f
or
mats
or
comple
x
data
infras
tr
uctures
[17]–[21].
The
application
of
these
s
trategies
pro
v
es
especiall
y
eectiv
e
f
or
prog
ressiv
e
database
sy
s
tems,
spatial
data
anal
y
sis,
association
r
ule
mining,
BS
optimizations,
and
neares
t
neighbor
searc
hes
within
binar
y
domains.
On
the
other
hand,
researc
h
related
to
BS
T
,
suc
h
as
deter
minis
tic
nite
tree
automata
minimizatio
n,
space-ecient
inde
xing
tailored
f
or
cyber
-ph
y
sical
sy
s
tems,
data-dr
iv
en
cac
he
optimization,
and
B+-
T
ree-based
searc
h
methods,
presents
signicant
adv
ancements
in
database
per
f
or
mance
and
secur
ity
.
These
methodologies
ha
v
e
the
potential
to
optimize
searc
h
operations,
ensure
ecient
space
utilization,
and
deliv
er
secure
and
eectiv
e
data
retr
ie
v
al
in
clo
ud
en
vironments
[22]–[26].
Appl
ying
these
tec
hniq
ues
is
especiall
y
adv
antag
eous
f
or
sy
s
tems
f
ocused
on
minimizing
automata,
optimizing
cyber
-ph
y
sical
databases,
rening
cac
he
per
f
or
mance
in
B+-
T
rees,
ensur
ing
encr
ypted
cloud
data
secur
ity
,
and
ecientl
y
retr
ie
ving
product
inf
or
mation
on
cloud
platf
or
ms.
F
inall
y
,
w
e
s
tudied
researc
h
related
to
the
A
VL
tree.
Suc
h
researc
h
includes
secure
document
retr
ie
v
al
in
encr
ypted
cloud
sy
s
tems,
h
ybr
id
sw
ar
m
optimization
f
or
netw
ork
clus
ter
ing,
ener
gy
-ecient
clus
ter
head
selection,
and
ultra-scalable
bloc
kc
hain
platf
or
ms
f
or
asset
tok
enization,
sho
w
casing
promising
adv
ancements
in
their
respectiv
e
elds.
These
methods
ensure
data
secur
ity
in
cloud
en
vironments,
optimize
netw
ork
clus
ter
ing
and
ener
gy
utilization
in
wireless
sensor
netw
orks,
and
pa
v
e
the
w
a
y
f
or
lar
g
e-scale
asset
manag
ement
on
bloc
kc
hain
platf
or
ms
[27]–[30].
The
amalg
amation
of
these
s
trategies
oers
subs
tantial
potential
f
or
sy
s
tems
that
underscore
encr
ypted
cloud
data
protection,
s
treamlined
clus
ter
ing
in
wireless
infras
tr
uctures,
and
e
xtensible
bloc
kc
hain
mec
hanisms
f
or
global
assets.
2.
METHOD
In
pre
vious
researc
h
[1],
w
e
proposed
dense
inde
xing
f
or
one-to-one
data
access,
as
illus
trated
in
F
igure
1.
The
le
set
is
called
Dense
inde
x
position.json,
with
a
size
of
33
MB
and
containing
1,000,000
transactions.
The
le
set
consis
ts
of
searc
h
k
e
y
s
and
specic
positions
within
the
Bigdata.json
dataset.
While
the
sparse
inde
x
f
or
accessing
one-to-man
y
data
is
illus
trated
in
F
igure
2,
the
set
of
les
is
called
Sparse
inde
x
position.json,
with
a
le
size
of
8
MB
and
con
taining
5,146
transactions.
This
le
set
consis
ts
of
searc
h
k
e
y
s
and
multiple
positions
within
the
Bigdata.json
dataset.
In
searc
hing
to
access
data
positions
using
the
dense
and
sparse
inde
x
es,
no
issues
ar
ise
when
searc
hing
f
or
a
single
searc
h
k
e
y
.
Ho
w
e
v
er
,
when
searc
hing
f
or
multiple
searc
h
k
e
y
s,
the
searc
h
k
e
y
dis
tr
ibution
mus
t
be
considered.
Theref
ore,
this
researc
h
proposes
an
approac
h
to
compare
the
eciency
of
access
times
to
data
positions
using
dense
and
sparse
inde
x
es.
The
researc
her
selected
main
searc
h
k
e
y
s
from
both
inde
x
datasets
Int
J
Ar
tif
Intell,
V
ol.
14,
N
o.
3,
June
2025:
2537–2546
Evaluation Warning : The document was created with Spire.PDF for Python.
Int
J
Ar
tif
Intell
ISSN:
2252-8938
❒
2539
as
descr
ibed
in
section
2.1,
whic
h
ha
v
e
three
dierent
searc
h
k
e
y
dis
tr
ibutions:
nor
mal
dis
tr
ibution,
left-sk
e
w
dis
tr
ibution,
and
r
ight-sk
e
w
dis
tr
ibution.
The
objectiv
e
is
to
compare
the
eciency
of
accessing
positions
using
the
main
searc
h
k
e
y
s
within
both
inde
x
les.
The
researc
her
applied
f
our
tec
hniq
ues
as
e
xplained
in
section
2.2,
including
LS,
BS,
BS
T
,
and
A
VL
tree.
The
time
eciency
of
eac
h
tec
hniq
ue
w
as
considered
to
deter
mine
whic
h
pro
vided
the
f
as
tes
t
and
slo
w
es
t
results
f
or
accessing
positions
within
both
inde
x
datasets.
"Phillip.Headlon@smith.com":
0,
"Jonnie@pacificare.com":
1,
...
"Annalee@pinnaclewest.com":
1000000
F
igure
1.
Ex
amples
of
transactions
in
the
Dense
inde
x
position.json
with
a
size
of
33
MB
and
1,000,000
transactions
"Maryalice":
[0,
547,
836317,
1040693,
1041786],
"Luciana":
[1,
3639,
9282,
1041132,
1044174],
...
"Eulah":
[2,
1216,
630028,
630264,
1042683]
F
igure
2.
Ex
amples
of
transactions
in
the
Sparse
inde
x
position.json
with
a
size
of
8
MB
and
5,146
transactions
2.1.
The
selection
of
sear
c
h
k
e
y
s
f
or
dense
and
sparse
distribution
This
section
e
xplains
the
selection
of
524
searc
h
k
e
y
s
to
e
v
aluate
the
eectiv
eness
of
the
researc
h.
W
e
obtained
these
searc
h
k
e
y
s
from
the
dense
inde
x
position
le,
whic
h
is
33
MB
and
has
1,000,000
transactions,
and
the
sparse
inde
x
position
le,
whic
h
is
8
MB
and
has
5,146
transactions.
F
or
the
dense
inde
x,
w
e
created
v
e
email
in
de
x
les,
eac
h
containing
524
non-repeating
searc
h
k
e
y
s,
and
divided
these
inde
x
les
into
three
main
types
according
to
position
rang
e.
The
rs
t
type
is
the
dense
inde
x
f
or
the
nor
mal
dis
tr
ibution
of
the
inde
x,
whic
h
has
2,070
searc
h
k
e
y
s
in
the
position
rang
e
of
300,000
to
800,000,
with
550
searc
h
k
e
y
s
in
the
left-sk
e
w
ed
dis
tr
ibution
and
r
ight-sk
e
w
ed
dis
tr
ibution,
as
illus
trated
in
F
igure
3.
F
igure
3.
Dense
inde
x
f
or
nor
mal
dis
tr
ibution
The
second
type
of
inde
x
is
a
dense
inde
x
f
or
the
left-sk
e
w
ed
dis
tr
ibution.
As
illus
trated
in
F
igure
4,
the
inde
x
is
dis
tr
ibuted
within
the
rang
e
of
100,000
to
500,000
with
1,806
searc
h
k
e
y
s.
Moreo
v
er
,
the
dis
tr
ibution
on
the
r
ight
side
holds
814
searc
h
k
e
y
s.
Ev
aluating
sear
c
h
key
dis
tribution
impact
on
sear
c
hing
per
f
or
mance
in
...
(Bo
w
onsak
Srisungsittisunti)
Evaluation Warning : The document was created with Spire.PDF for Python.
2540
❒
ISSN:
2252-8938
F
igure
4.
Dense
inde
x
f
or
a
left
sk
e
w
ed
dis
tr
ibution
The
third
type
of
inde
x
is
a
dense
inde
x
f
or
a
r
ight-sk
e
w
ed
dis
tr
ibution.
As
illus
trated
in
F
igure
5,
the
inde
x
is
dis
tr
ibuted
within
the
rang
e
of
500,000
to
1,000,000,
encompassing
1,999
searc
h
k
e
y
s.
The
dis
tr
ibution
on
the
left
side
contains
621
searc
h
k
e
y
s.
F
or
the
sparse
inde
x,
w
e
created
v
e
email
inde
x
les,
eac
h
containing
524
non-repeating
searc
h
k
e
y
s.
These
inde
x
les
w
ere
divided
into
three
main
types
according
to
position
rang
e.
The
rs
t
type
is
the
sparse
inde
x
f
or
nor
mal
dis
tr
ibution
of
the
inde
x,
whic
h
has
2,600
searc
h
k
e
y
s
in
the
2,000
to
4,000
position
rang
e,
with
20
searc
h
k
e
y
s
in
the
left-
and
r
ight-sk
e
w
ed
dis
tr
ibution,
as
illus
trated
in
F
igure
6.
F
igure
5.
Dense
inde
x
f
or
a
r
ight
sk
e
w
ed
dis
tr
ibution
F
igure
6.
Sparse
inde
x
f
or
nor
mal
dis
tr
ibution
The
second
type
of
inde
x
is
a
sparse
inde
x
f
or
the
left-sk
e
w
ed
dis
tr
ibution.
As
illus
trated
in
F
igure
7,
the
inde
x
is
dis
tr
ibuted
within
the
rang
e
of
1,000
to
3,000
with
1,885
searc
h
k
e
y
s.
Moreo
v
er
,
the
dis
tr
ibution
on
the
r
ight
side
holds
735
searc
h
k
e
y
s.
Int
J
Ar
tif
Intell,
V
ol.
14,
N
o.
3,
June
2025:
2537–2546
Evaluation Warning : The document was created with Spire.PDF for Python.
Int
J
Ar
tif
Intell
ISSN:
2252-8938
❒
2541
The
third
type
of
inde
x
is
a
sparse
inde
x
f
or
a
r
ight-sk
e
w
ed
dis
tr
ibution.
As
illus
trated
in
F
igure
8,
the
inde
x
is
dis
tr
ibuted
within
the
rang
e
of
3,000
to
5,000,
encompassing
1,916
searc
h
k
e
y
s.
Moreo
v
er
,
the
dis
tr
ibution
on
the
left
side
contains
704
searc
h
k
e
y
s.
F
igure
7.
Sparse
inde
x
f
or
left
sk
e
w
ed
dis
tr
ibution
F
igure
8.
Sparse
inde
x
f
or
r
ight
sk
e
w
ed
dis
tr
ibution
2.2.
Algorithms
f
or
com
paring
the
eciency
of
dense
and
sparse
indices
This
section
e
xplains
the
methods
f
or
searc
hing
and
accessing
dense
and
sparse
inde
x
position
les,
whic
h
compr
ise
LS,
BS,
BS
T
,
and
A
VL
tree.
The
details
of
these
algor
ithms
are
pro
vided
as
f
ollo
w
s.
2.2.1.
Dense
and
sparse
inde
x
algorithms
using
linear
sear
c
h
F
or
the
method
of
searc
hing
the
position
of
searc
h
k
e
y
ter
ms
within
dense
and
sparse
inde
x
les
using
the
LS
algor
ithm,
three
dierent
dis
tr
ibutions
of
tes
t
searc
h
k
e
y
ter
ms
w
ere
used,
as
detailed
in
section
2.1.
The
searc
h
s
tar
ts
b
y
c
hec
king
the
searc
h
k
e
y
ter
m
in
eac
h
transaction
entr
y
within
the
dense
inde
x
le.
If
the
searc
hed
searc
h
k
e
y
ter
m
matc
hes
the
searc
h
k
e
y
ter
m
appear
ing
in
the
le,
the
position
of
that
searc
h
k
e
y
ter
m
can
be
retr
ie
v
ed.
If
not,
the
searc
h
continues
until
the
desired
k
e
y
is
f
ound
or
until
all
entr
ies
are
e
xhaus
ted.
Algorithm
1
Dense
and
sparse
inde
x
algor
ithms
using
LS
R
eq
uir
e:
Dense
and
Sparse
Inde
x
position
F
ile,
searc
h
k
e
y
s
tes
t
with
dis
tr
ibution
as
descr
ibed
in
section
2.1
Ensur
e:
P
osition
in
Bigdata.json
1:
function
FindSear
chKey
(inde
x.positions,
desired.searc
h.k
e
y
s)
2:
result
←
{}
3:
f
or
searc
h.k
e
y
∈
desired.searc
h.k
e
y
s
do
4:
position.f
ound
←
F
alse
5:
f
or
inde
x.entr
y
,
inde
x.entr
y
∈
Enumerate(inde
x.positions)
do
6:
if
inde
x.entr
y[’
searc
h.k
e
y’]
==
searc
h.k
e
y
then
7:
result[searc
h.k
e
y]
←
inde
x.entr
y[’position
’]
8:
position.f
ound
←
T
r
ue
br
eak
9:
end
if
10:
end
f
or
11:
if
not
position.f
ound
then
12:
result[searc
h.k
e
y]
←
”Searc
h
k
e
y
not
f
ound.”
13:
end
if
14:
end
f
or
15:
r
e
turn
result
16:
end
function
F
or
the
utilization
of
dense
and
sparse
inde
x
algor
ithms
using
LS
to
access
the
position
of
searc
h
k
e
y
s
within
dense
and
sparse
inde
x
position
les,
impor
t
tes
t
searc
h
k
e
y
s
with
all
dis
tr
ibution
patter
ns
as
descr
ibed
in
section
2.1.
Impor
t
one
searc
h
k
e
y
at
a
time
into
the
searc
h
v
er
ication
process.
This
process
uses
the
LS
tec
hniq
ue
as
e
xplained
in
section
2.2.1.
Ev
aluating
sear
c
h
key
dis
tribution
impact
on
sear
c
hing
per
f
or
mance
in
...
(Bo
w
onsak
Srisungsittisunti)
Evaluation Warning : The document was created with Spire.PDF for Python.
2542
❒
ISSN:
2252-8938
2.2.2.
Dense
and
sparse
inde
x
algorithms
using
binary
sear
c
h
When
searc
hing
f
or
the
position
of
searc
h
k
e
y
ter
ms
in
dense
and
sparse
inde
x
les
using
the
BS
tec
hniq
ue,
three
searc
h
k
e
y
ter
m
dis
tr
ibution
patter
ns
w
ere
tes
ted,
as
descr
ibed
in
section
2.1.
The
searc
h
process
begins
b
y
sor
ting
the
searc
h
k
e
y
ter
ms
in
the
dense
and
sparse
inde
x
les.
T
es
t
searc
h
k
e
y
ter
ms
with
dierent
dis
tr
ibutions
are
then
used
to
nd
the
midpoint
of
the
searc
h
k
e
y
ter
m
seq
uence.
The
midpoint
is
compared
to
the
searc
h
v
alue.
If
the
searc
h
k
e
y
ter
m
is
f
ound,
the
position
of
the
searc
h
k
e
y
ter
m
is
retur
ned.
If
the
searc
h
k
e
y
ter
m
is
not
f
ound,
the
data
seq
uence
is
divided
into
tw
o
par
ts,
and
the
searc
h
continues
in
the
par
t
with
searc
h
k
e
y
ter
m
v
alues
closes
t
to
the
desired
searc
h
k
e
y
ter
m.
Algorithm
2
Dense
and
sparse
inde
x
algor
ithms
using
binar
y
searc
h
1:
In
put
:
Dense
and
Sparse
Inde
x
position
F
ile,
searc
h
k
e
y
s
tes
t
with
dis
tr
ibution
as
descr
ibed
in
section
2.1
2:
Output
:
P
osition
in
Bigdata.json
3:
function
bin
ar
y
sear
ch
(
,
ℎ
)
4:
←
0
5:
ℎ
ℎ
←
len
(
)
−
1
6:
while
≤
ℎ
ℎ
do
7:
←
(
ℎ
ℎ
+
)
//
2
8:
ℎ
←
[
]
[
0
]
9:
if
ℎ
==
ℎ
then
10:
r
e
turn
[
]
[
1
]
11:
else
if
ℎ
<
ℎ
then
12:
←
+
1
13:
else
14:
ℎ
ℎ
←
−
1
15:
end
if
16:
end
while
17:
r
e
turn
”Searc
h
k
e
y
not
f
ound.”
18:
end
function
F
or
the
utilization
of
dense
and
sparse
inde
x
algor
ithms
using
BS
to
access
the
position
of
searc
h
k
e
y
s
within
dense
and
sparse
inde
x
position
les,
impor
t
tes
t
searc
h
k
e
y
s
with
six
dierent
dis
tr
ibution
patter
ns
as
descr
ibed
in
section
2.1.
Eac
h
searc
h
k
e
y
is
processed
through
the
searc
h
v
er
ication
process
using
the
BS
tec
hniq
ue,
as
e
xplained
in
section
2.2.2.
The
eciency
of
accessing
the
searc
h
k
e
y
position
based
on
the
a
v
erag
e
or
w
ors
t-case
data
rang
e
is
O
(log
n),
but
the
bes
t-case
scenar
io
is
O(1)
w
hen
the
tes
t
searc
h
k
e
y
matc
hes
the
inde
x
position
les,
retur
ning
the
position
of
Bigdata.json.
2.2.3.
Dense
and
sparse
inde
x
algorithms
using
BS
T
and
A
VL
T
r
ee
This
section
e
xplains
the
method
of
searc
hing
and
accessing
positions
within
dense
and
sparse
inde
x
les
using
both
the
BS
T
and
the
A
VL
tree
tec
hniq
ues.
The
algor
ithms
f
or
these
methods
are
illus
trated
in
Algor
ithm
3,
whic
h
integ
rates
both
BS
T
and
A
VL
tree
operations
f
or
eciency
.
Six
dierent
tes
t
searc
h
k
e
y
dis
tr
ibution
patter
ns
are
used,
as
descr
ibed
in
section
2.1.
The
searc
h
process
begins
b
y
ar
ranging
the
main
w
ords
in
the
inde
x
le.
T
es
t
searc
h
k
e
y
s
are
then
used
to
searc
h
at
the
root
node
of
either
the
BS
T
or
the
A
VL
tree,
depending
on
the
desired
eciency
.
If
the
root
node
is
empty
,
the
process
retur
ns
none.
F
or
both
tree
s
tr
uctures,
the
searc
h
proceeds
b
y
compar
ing
the
desired
searc
h
k
e
y
with
the
cur
rent
node
’
s
searc
h
k
e
y
s
in
the
dense
inde
x
position
le.
If
the
searc
h
k
e
y
matc
hes
the
cur
rent
node,
the
position
of
the
main
w
ord
is
retur
ned,
and
the
searc
h
f
or
the
ne
xt
searc
h
k
e
y
continues.
Other
wise,
the
direction
of
tra
v
ersal
(left
or
r
ight)
is
deter
mined
based
on
whether
the
searc
h
k
e
y
is
less
or
g
reater
than
the
cur
rent
node
’
s
searc
h
k
e
y
.
F
or
the
utilization
of
dense
and
sparse
inde
x
algor
ithms
using
BS
T
or
A
VL
trees
to
access
the
position
of
searc
h
k
e
y
s
within
dense
and
sparse
inde
x
position
les,
the
tes
t
searc
h
k
e
y
s
are
processed
with
both
tec
hniq
ues,
depending
on
eciency
req
uirements.
The
BS
T
tec
hniq
ue,
as
descr
ibed
in
section
2.2.3,
allo
w
s
searc
hing
with
a
v
erag
e
or
w
ors
t
time
comple
xity
eq
ual
to
(
)
.
On
the
other
hand,
the
A
VL
tree
tec
hniq
ue,
as
descr
ibed
in
section
2.2.4,
oers
an
impro
v
ed
a
v
erag
e
and
w
ors
t-case
time
comple
xity
of
(
log
)
.
The
bes
t
case
f
or
both
is
(
1
)
when
the
searc
h
k
e
y
directl
y
matc
hes
an
inde
x
in
the
le,
retur
ning
the
position
in
Bigdata.json.
Int
J
Ar
tif
Intell,
V
ol.
14,
N
o.
3,
June
2025:
2537–2546
Evaluation Warning : The document was created with Spire.PDF for Python.
Int
J
Ar
tif
Intell
ISSN:
2252-8938
❒
2543
Algorithm
3
Dense
and
sparse
inde
x
algor
ithms
using
BS
T
and
A
VL
T
ree
1:
In
put
:
Dense
and
Sparse
Inde
x
position
F
ile,
searc
h
k
e
y
s
tes
t
with
dis
tr
ibution
as
descr
ibed
in
section
2.1
2:
Output
:
P
osition
in
Bigdata.json
3:
function
Index
Sear
ch
(inde
x
le,
searc
h
k
e
y
s,
tree
type)
4:
if
==
“BS
T”
then
5:
←
ne
w
Binar
ySearc
hT
ree()
6:
else
if
==
“
A
VL
”
then
7:
←
ne
w
A
VL
T
ree()
8:
else
9:
raise
err
or
:
U
nsuppor
ted
tree
type.
10:
end
if
11:
f
or
(
ℎ
,
)
∈
do
12:
.
(
ℎ
,
)
13:
end
f
or
14:
←
{
}
15:
f
or
ℎ
∈
ℎ
do
16:
←
.
ℎ
(
ℎ
)
17:
if
≠
NULL
then
18:
[
ℎ
]
←
19:
else
20:
[
ℎ
]
←
”Searc
h
k
e
y
not
f
ound.”
21:
end
if
22:
end
f
or
23:
r
e
turn
24:
end
function
3.
RESUL
TS
This
section
presents
the
e
v
aluation
of
searc
h
time
eciency
f
or
accessing
data
positions
in
tw
o
datasets:
dense
inde
x
position.json
and
sparse
inde
x
position.json.
The
searc
h
algor
ithms
emplo
y
ed
f
or
the
anal
y
sis
include
LS,
BS,
BS
T
,
and
A
VL
tree.
The
e
xper
iments
w
ere
conducted
with
three
dis
tinct
searc
h
k
e
y
dis
tr
ibution
patter
ns:
nor
mal,
left-sk
e
w
ed,
and
r
ight-sk
e
w
ed
dis
tr
ibutions,
as
detailed
in
section
2.1.
The
comparativ
e
e
xper
iments
f
ocused
on
tw
o
types
of
inde
x
es:
dense
inde
x
as
sho
wn
in
T
able
1
and
sparse
inde
x
as
sho
wn
in
T
able
2.
F
or
both
inde
x
types,
the
e
xper
iments
measured
tw
o
pr
imar
y
c
haracter
is
tics:
‘
A
v
erag
e
time
of
15
rounds,
’
whic
h
assessed
the
total
time
req
uired
to
access
positions
f
or
524
searc
h
k
e
y
s
o
v
er
fteen
rounds,
and
‘
A
v
erag
e
time
per
searc
h
k
e
y
,
’
whic
h
e
v
aluated
the
searc
h
time
req
uired
to
access
eac
h
searc
h
k
e
y
individuall
y
.
The
results
sho
w
ed
that
sor
ting
and
inde
xing
are
necessar
y
f
or
BS,
BS
T
,
and
A
VL
T
ree
to
enhance
per
f
or
mance.
F
or
LS,
there
w
as
no
need
to
sor
t
the
inde
x
bef
orehand,
as
LS
seq
uentiall
y
e
v
aluates
all
transaction
items,
reg
ardless
of
searc
h
k
e
y
dis
tr
ibution.
The
results
f
or
the
dense
inde
x
(T
able
1)
demons
trate
that
BS
had
an
a
v
erag
e
inde
xing
time
of
1,795.708
milliseconds,
while
BS
T
completed
inde
xing
in
773.149
milliseconds,
and
A
VL
T
ree
ac
hie
v
ed
the
f
as
tes
t
time
of
635.262
milliseconds.
T
able
1.
Compar
ison
of
dense
inde
x
searc
hing
eciency
with
searc
h
k
e
y
dis
tr
ibution
T
ec
hniq
ues
R
etr
ie
v
e
time
b
y
dense
inde
x
(ms)
A
v
g.
time
of
15
rounds
A
v
g.
per
searc
h
k
e
y
Inde
x
le
sor
ting
time
Linear
searc
h
N
or
mal
dis
tr
ibution
342259.155
653.166
N
o
inde
x
le
sor
ting
Left-sk
e
w
ed
dis
tr
ibution
337220.625
643.551
Right-sk
e
w
ed
dis
tr
ibution
349457.130
666.903
Binar
y
searc
h
N
or
mal
dis
tr
ibution
9.811
0.019
1795.708
Left-sk
e
w
ed
dis
tr
ibution
10.199
0.019
Right-sk
e
w
ed
dis
tr
ibution
11.572
0.022
Binar
y
searc
h
tree
N
or
mal
dis
tr
ibution
179.970
0.343
773.149
Left-sk
e
w
ed
dis
tr
ibution
173.583
0.331
Right-sk
e
w
ed
dis
tr
ibution
155.317
0.296
A
VL
T
ree
N
or
mal
dis
tr
ibution
2.703
0.005
635.262
Left-sk
e
w
ed
dis
tr
ibution
2.346
0.004
Right-sk
e
w
ed
dis
tr
ibution
2.433
0.005
The
rs
t
e
xper
iment
e
v
aluated
the
eciency
of
accessing
data
positions
in
a
dense
inde
x.
Eac
h
tes
t
round
included
524
searc
h
k
e
y
s,
and
the
results
w
ere
a
v
erag
ed
o
v
er
15
rounds.
W
hen
tes
ted
with
searc
h
k
e
y
s
f
ollo
wing
a
nor
mal
dis
tr
ibution,
the
A
VL
tree
w
as
the
f
as
tes
t,
with
an
a
v
erag
e
access
time
of
2.703
milliseconds.
Ev
aluating
sear
c
h
key
dis
tribution
impact
on
sear
c
hing
per
f
or
mance
in
...
(Bo
w
onsak
Srisungsittisunti)
Evaluation Warning : The document was created with Spire.PDF for Python.
2544
❒
ISSN:
2252-8938
On
the
other
hand,
LS
w
as
the
slo
w
es
t,
taking
342,259.155
milliseconds.
F
or
searc
h
k
e
y
s
with
a
left-sk
e
w
ed
dis
tr
ibution,
the
A
VL
tree
per
f
or
med
the
q
uic
k
es
t,
with
an
a
v
erag
e
time
of
2.346
milliseconds,
while
LS
w
as
ag
ain
the
slo
w
es
t,
taking
337,220.625
milliseconds.
In
r
ight-sk
e
w
ed
searc
h
k
e
y
dis
tr
ibutions,
the
A
VL
tree
w
as
the
f
as
tes
t,
with
an
a
v
erag
e
time
of
2.433
milliseconds,
whereas
LS
remained
the
slo
w
es
t
at
349,457.13
milliseconds.
The
second
e
xper
iment
assessed
searc
h
eciency
in
a
sparse
inde
x.
Eac
h
round
also
consis
ted
of
524
searc
h
k
e
y
s,
and
the
results
w
ere
a
v
erag
ed
o
v
er
15
rounds.
W
ith
a
nor
mal
searc
h
k
e
y
dis
tr
ibution,
BS
w
as
the
f
as
tes
t,
w
ith
an
a
v
erag
e
time
of
5.647
milliseconds,
while
LS
w
as
the
slo
w
es
t,
taking
528,112.194
milliseconds.
F
or
left-sk
e
w
ed
searc
h
k
e
y
dis
tr
ibutions,
BS
w
as
the
q
uic
k
es
t,
a
v
eraging
3.868
milliseconds,
while
LS
w
as
the
slo
w
es
t,
taking
528,112.194
milliseconds.
W
ith
r
ight-sk
e
w
ed
searc
h
k
e
y
dis
tr
ibutions,
BS
w
as
the
f
as
tes
t,
a
v
eraging
5.592
milliseconds,
while
LS
remained
the
slo
w
es
t
at
528,179.767
milliseconds.
T
able
2.
Compar
ison
of
sparse
inde
x
searc
hing
eciency
with
searc
h
k
e
y
dis
tr
ibution
T
ec
hniq
ues
R
etr
ie
v
e
time
b
y
sparse
inde
x
(ms)
A
v
g.
time
of
15
rounds
A
v
g.
per
searc
h
k
e
y
Inde
x
le
sor
ting
time
Linear
searc
h
N
or
mal
dis
tr
ibution
528112.194
1007.848
N
o
inde
x
le
sor
ting
Left-sk
e
w
ed
dis
tr
ibution
528253.244
1008.117
Right-sk
e
w
ed
dis
tr
ibution
528179.767
1007.977
Binar
y
searc
h
N
or
mal
dis
tr
ibution
5.647
0.011
136.711
Left-sk
e
w
ed
dis
tr
ibution
3.868
0.007
Right-sk
e
w
ed
dis
tr
ibution
5.592
0.011
Binar
y
searc
h
tree
N
or
mal
dis
tr
ibution
161.007
0.307
N
o
inde
x
le
sor
ting
Left-sk
e
w
ed
dis
tr
ibution
166.118
0.317
Right-sk
e
w
ed
dis
tr
ibution
158.378
0.302
A
VL
tree
N
or
mal
dis
tr
ibution
30.903
0.059
113.538
Left-sk
e
w
ed
dis
tr
ibution
35.200
0.067
Right-sk
e
w
ed
dis
tr
ibution
35.223
0.067
4.
C
ON
CL
USION
This
researc
h
co
mpares
the
time
eciency
of
accessing
positions
using
both
dense
and
sparse
inde
x
es.
The
s
tudy
applied
f
our
searc
h
tec
hniq
ues:
LS,
BS,
BS
T
,
and
A
VL
tree,
across
three
searc
h
k
e
y
dis
tr
ibution
patter
ns:
nor
mal,
left-sk
e
w
ed,
and
r
ight-sk
e
w
ed.
The
e
xper
imental
ndings
indicate
that
f
or
dense
inde
x
es
with
a
nor
mal
dis
tr
ibution
of
searc
h
k
e
y
s,
the
A
VL
T
ree
w
as
the
f
as
tes
t,
a
v
eraging
0.005
milliseconds
per
searc
h
k
e
y
.
Con
v
ersel
y
,
LS
w
as
the
slo
w
es
t,
with
an
a
v
erag
e
of
653.166
milliseconds
per
searc
h
k
e
y
.
F
or
left-sk
e
w
ed
dis
tr
ibutions,
the
A
VL
T
ree
w
as
ag
ain
the
q
uic
k
es
t,
a
v
eraging
0.004
milliseconds
per
searc
h
k
e
y
,
while
LS
w
as
the
slo
w
es
t
at
643.551
milliseconds.
The
patter
n
w
as
consis
tent
f
or
r
ight-sk
e
w
ed
dis
tr
ibutions,
with
the
A
VL
T
ree
being
the
f
as
tes
t
at
0.005
milliseconds
and
LS
being
the
slo
w
es
t
at
666.903
milliseconds.
F
or
sparse
inde
x
es,
the
results
w
ere
slightl
y
dierent.
When
searc
h
k
e
y
s
w
ere
nor
mall
y
dis
tr
ibuted,
BS
w
as
the
mos
t
ecient,
a
v
eraging
0.011
milliseconds
per
searc
h
k
e
y
,
while
LS
w
as
the
slo
w
es
t,
taking
1007.848
milliseconds.
F
or
left-sk
e
w
ed
dis
tr
ibutions,
BS
remained
the
f
as
tes
t,
a
v
eraging
0.007
milliseconds
per
searc
h
k
e
y
,
with
LS
being
the
slo
w
es
t
at
1008.117
milliseconds.
Similar
l
y
,
f
or
r
ight-sk
e
w
ed
dis
tr
ibutions,
BS
w
as
the
q
uic
k
es
t
at
0.011
milliseconds
per
searc
h
k
e
y
,
and
LS
w
as
the
slo
w
es
t,
a
v
eraging
1007.977
milliseconds.
In
conclusion,
the
A
VL
T
ree
is
highl
y
ecient
f
or
dense
inde
x
es,
par
ticular
l
y
when
handling
lar
g
e
v
olumes
of
searc
h
k
e
y
s,
suc
h
as
1,000,000
transactions.
On
the
other
hand,
BS
is
more
suitable
f
or
sparse
inde
x
es
with
a
smaller
number
of
searc
h
k
e
y
s,
suc
h
as
5,000
transactions.
LS
consis
tentl
y
sho
w
ed
the
leas
t
eciency
in
both
dense
and
sparse
inde
x
scenar
ios.
F
uture
researc
h
can
e
xplore
secondar
y
inde
xing
tec
hniq
ues
to
enhance
scalability
and
eciency
f
or
more
comple
x
data
types
and
datasets.
A
CKN
O
WLEDGMENTS
The
authors
thank
R
esearc
h
U
nit
f
or
De
v
elopment
of
Intellig
ent
Sy
s
tems
and
A
utonomous
R
obots
(IS
AR)
and
Sc
hool
of
Inf
or
mation
T
ec
hnology
and
Communication,
U
niv
ersity
of
Pha
y
ao,
f
or
f
acility
suppor
t.
FUNDIN
G
INFORMA
TION
This
w
ork
w
as
suppor
ted
b
y
the
Super
KPI
project
f
or
inter
national
publications,
under
the
Sc
hool
of
Inf
or
mation
and
Communication
T
ec
hnology
,
U
niv
ersity
of
Pha
y
ao.
Int
J
Ar
tif
Intell,
V
ol.
14,
N
o.
3,
June
2025:
2537–2546
Evaluation Warning : The document was created with Spire.PDF for Python.
Int
J
Ar
tif
Intell
ISSN:
2252-8938
❒
2545
A
UTHOR
C
ONTRIBUTIONS
S
T
A
TEMENT
This
jour
nal
uses
the
Contr
ibutor
R
oles
T
ax
onom
y
(CR
edi
T)
to
recognize
individual
author
contr
ibu-
tions,
reduce
authorship
disputes,
and
f
acilitate
collaboration.
N
ame
of
A
uthor
C
M
So
V
a
F
o
I
R
D
O
E
V
i
Su
P
Fu
Bo
w
onsak
Sr
isungsittisunti
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
Jira
w
at
Duangkae
w
✓
✓
✓
✓
✓
✓
✓
✓
✓
N
akar
in
Chaikae
w
✓
✓
✓
✓
✓
✓
✓
C
:
C
onceptualization
I
:
I
n
v
es
tig
ation
V
i
:
V
i
sualization
M
:
M
ethodology
R
:
R
esources
Su
:
Su
per
vision
So
:
So
ftw
are
D
:
D
ata
Curation
P
:
P
roject
A
dminis
tration
V
a
:
V
a
lidation
O
:
W
r
iting
-
O
r
iginal
Draft
F
u
:
Fu
nding
A
cq
uisition
F
o
:
F
o
r
mal
Anal
y
sis
E
:
W
r
iting
-
R
e
vie
w
&
E
diting
C
ONFLICT
OF
INTERES
T
S
T
A
TEMENT
The
authors
s
tate
no
conict
of
interes
t.
D
A
T
A
A
V
AILABILIT
Y
Der
iv
ed
data
suppor
ting
the
ndings
of
this
s
tudy
are
a
v
ailable
from
the
cor
responding
author
,
[JD],
on
req
ues
t.
REFEREN
CES
[1]
B.
Sr
isungsittisunti,
J.
Duangkae
w
,
S.
Mekr
uksa
v
anic
h,
N
.
Chaikae
w
,
and
P
.
R
ojana
v
asu,
“Enhancing
data
retr
ie
v
al
eciency
in
lar
g
e-scale
Ja
v
aScr
ipt
object
notation
datasets
b
y
using
inde
xing
tec
hniq
ues,
”
IAES
Int
er
national
Jour
nal
of
Ar
ticial
Int
ellig
ence
(IJ-AI)
,
v
ol.
13,
no.
2,
pp.
2342–2353,
2024,
doi:10.11591/i
jai.v13.i2.pp2342-2353.
[2]
R.
Qiu,
Y
.
Ming,
Y
.
Hong,
H.
Li,
and
T
.
Y
ang,
“W
ind-bell
inde
x:
to
w
ards
ultra-f
as
t
edg
e
q
uer
y
f
or
g
raph
databases,
”
2023
IEEE
39t
h
Int
er
national
Conf
er
ence
on
Data
Engineering
(ICDE)
,
Apr
.
2023,
pp.
2090–2098,
doi:
10.1109/icde55515.2023.00162.
[3]
H.
W
u
e
t
al.
,
“
A
per
f
or
mance-impro
v
ed
and
s
torag
e-ecient
secondar
y
inde
x
f
or
big
data
processing,
”
2017
IEEE
Int
er
national
Conf
er
ence
on
Smar
t
Cloud
(Smar
tCloud)
,
N
o
v
.
2017,
pp.
161–167,
doi:
10.1109/smar
tcloud.2017.32.
[4]
R.
Ma,
X.
Zhu,
and
L.
Y
an,
“
A
h
ybr
id
approac
h
f
or
clus
ter
ing
uncer
tain
time
ser
ies,
”
Jour
nal
of
Computing
and
Inf
or
mation
T
ec
hnology
,
v
ol.
28,
no.
4,
pp.
255–267,
Oct.
2021,
doi:
10.20532/cit.2020.1004802.
[5]
S.
V
.
Limkar
and
R.
K.
Jha,
“
A
no
v
el
method
f
or
parallel
inde
xing
of
real
time
g
eospatial
big
data
g
enerated
b
y
Io
T
de
vices,
”
F
utur
e
Gener
ation
Comput
er
Sy
s
t
ems
,
v
ol.
97,
pp.
433–452,
A
ug.
2019,
doi:
10.1016/j.future.2018.09.061.
[6]
M.
A.
Qader
,
S.
Cheng,
and
V
.
Hr
is
tidis,
“
A
comparativ
e
s
tudy
of
secondar
y
inde
xing
tec
hniq
ues
in
LSM-based
N
oSQL
databases,
”
Pr
oceedings
of
t
he
2018
Int
er
national
Conf
er
ence
on
Manag
ement
of
Data
,
Ma
y
2018,
pp.
pp.
551–566,
doi:
10.1145/3183713.3196900.
[7]
Z.
W
ang,
Q.
W
ang,
P
.
W
ang,
T
.
P
alpanas,
and
W
.
W
ang,
“Dump
y
:
A
compact
and
adaptiv
e
inde
x
f
or
lar
g
e
data
ser
ies
collections,
”
Pr
oceedings
of
t
he
A
CM
on
Manag
ement
of
Data
,
v
ol.
1,
no.
1,
pp.
1–27,
Ma
y
2023,
doi:
10.1145/3588965.
[8]
M.
M.
La
w
al,
H.
Ibrahim,
N
.
F
.
M.
Sani,
and
R.
Y
aak
ob,
“
Anal
y
ses
of
inde
xing
tec
hniq
ues
on
uncer
tain
data
with
high
dimensionality
,
”
IEEE
A
ccess
,
v
ol.
8,
pp.
74101–74117,
2020,
doi:
10.1109/access.2020.2988487.
[9]
D.
Gopinathan
and
K.
Asa
w
a,
“N
e
w
path
based
inde
x
s
tr
ucture
f
or
processing
C
AS
q
uer
ies
o
v
er
XML
database,
”
Jour
nal
of
Computing
and
Inf
or
mation
T
ec
hnology
,
v
ol.
25,
no.
3,
pp.
211–225,
Oct.
2017,
doi:
10.20532/cit.2017.1003557.
[10]
R.
Ma,
D.
Zheng,
and
L.
Y
an,
“F
as
t
online
similar
ity
searc
h
f
or
uncer
tain
time
ser
ies,
”
Jour
nal
of
Computing
and
Inf
or
mation
T
ec
hnology
,
v
ol.
28,
no.
1,
pp.
1–17,
Jul.
2020,
doi:
10.20532/cit.2020.1004574.
[11]
Y
.
Shin,
J.
Ahn,
and
D.-H.
Im,
“Join
optimization
f
or
in
v
er
ted
inde
x
tec
hniq
ue
on
relational
database
manag
ement
sy
s
tems,
”
Exper
t
Sy
s
t
ems
wit
h
Applications
,
v
ol.
198,
Jul.
2022,
doi:
10.1016/j.esw
a.2022.116956.
[12]
P
.
F
er
ragina
and
G.
V
inciguer
ra,
“
The
PGM-inde
x:
a
full
y
-dynamic
compressed
lear
ned
inde
x
with
pro
v
able
w
ors
t-case
bounds,
”
Pr
oceedings
of
t
he
VLDB
Endo
wment
,
v
ol.
13,
no.
8,
pp.
1162–1175,
2020,
doi:
10.14778/3389133.3389135.
[13]
D.
Belazzougui
e
t
al.
,
“Linear
-time
s
tr
ing
inde
xing
and
anal
y
sis
in
small
space,
”
A
CM
T
r
ansactions
on
Algorit
hms
(T
AL
G)
,
v
ol.
16,
no.
2,
pp.
1–54,
2020,
doi:
10.1145/3381417.
[14]
P
.
Li,
H.
Lu,
Q.
Zheng,
L.
Y
ang,
and
G.
P
an,
“LIS
A:
A
lear
ned
inde
x
s
tr
ucture
f
or
spatial
data,
”
Pr
oceedings
of
t
he
2020
A
CM
SIGMOD
Int
er
national
Conf
er
ence
on
Manag
ement
of
Data
,
Ma
y
2020,
pp.
2119–2133,
doi:
10.1145/3318464.3389703.
[15]
T
.
Kraska,
A.
Beutel,
E.
H.
Chi,
J.
Dean,
and
N
.
P
ol
yzotis,
“
The
case
f
or
lear
ned
inde
x
s
tr
uctures,
”
Pr
oceedings
of
t
he
2018
Int
er
national
Conf
er
ence
on
Manag
ement
of
Data
,
Ma
y
2018,
pp.
489–504,
doi:
10.1145/3183713.3196909.
[16]
J.
Song,
H.
T
.
Shen,
J.
W
ang,
Z.
Huang,
N
.
Sebe,
and
J.
W
ang,
“
A
dis
tance-computation-free
searc
h
sc
heme
f
or
binar
y
code
databases,
”
IEEE
T
r
ansactions
on
Multimedia
,
v
ol.
18,
no.
3,
pp.
484–495,
Mar
.
2016,
doi:
10.1109/tmm.2016.2515990.
[17]
J.
Dittr
ic
h,
J.
Nix,
and
C.
Sc
h
¨
on,
“
The
ne
xt
50
y
ears
in
database
inde
xing
or:
the
case
f
or
automaticall
y
g
enerated
inde
x
s
tr
uctures,
”
Pr
oceedings
of
t
he
VLDB
Endo
wment
,
v
ol.
15,
no.
3,
pp.
527–540,
N
o
v
.
2021,
doi:
10.14778/3494124.3494136.
Ev
aluating
sear
c
h
key
dis
tribution
impact
on
sear
c
hing
per
f
or
mance
in
...
(Bo
w
onsak
Srisungsittisunti)
Evaluation Warning : The document was created with Spire.PDF for Python.
2546
❒
ISSN:
2252-8938
[18]
S.
Oujdi,
H.
Belbac
hir
,
and
F
.
Bouf
ares,
“C4.5
decision
tree
algor
ithm
f
or
spatial
data,
alter
nativ
es
and
per
f
or
mances,
”
Jour
nal
of
Computing
and
Inf
or
mation
T
ec
hnology
,
v
ol.
27,
no.
3,
pp.
29–43,
Ma
y
2020,
doi:
10.20532/cit.2019.1004651.
[19]
Y
.
Gheraibia,
A.
Moussaoui,
S.
Kabir
,
and
P
.
Y
.
Y
in,
“P
enguins
searc
h
optimisation
algor
ithm
f
or
association
r
ules
mining,
”
Jour
nal
of
Computing
and
Inf
or
mation
T
ec
hnology
,
v
ol.
24,
no.
2,
pp.
165–179,
Jun.
2016,
doi:
10.20532/cit.2016.1002745.
[20]
A.
Ba
ykaso
˘
glu
and
F
.
B.
Ozso
ydan,
“Dynamic
optimization
in
binar
y
searc
h
spaces
via
w
eighted
super
position
attraction
algor
ithm,
”
Exper
t
Sy
s
t
ems
wit
h
Applications
,
v
ol.
96,
pp.
157–174,
Apr
.
2018,
doi:
10.1016/j.esw
a.2017.11.048.
[21]
M.
K
omoro
w
ski
and
T
.
T
rzci
´
nski,
“Random
binar
y
searc
h
trees
f
or
appro
ximate
neares
t
neighbour
searc
h
in
binar
y
spaces,
”
Applied
Sof
t
Computing
,
v
ol.
79,
pp.
87–93,
Jun.
2019,
doi:
10.1016/j.asoc.2019.03.031.
[22]
Y
.
Guellouma
and
H.
Cher
roun,
“Ecient
implementation
f
or
deter
minis
tic
nite
tree
automata
minimization,
”
Jour
nal
of
Computing
and
Inf
or
mation
T
ec
hnology
,
v
ol.
24,
no.
4,
pp.
311–322,
Dec.
2016,
doi:
10.20532/cit.2016.1002867.
[23]
Y
.-H.
K
uan,
Y
.-H.
Chang,
T
.-
Y
.
Chen,
P
.-C.
Huang,
and
K.-
Y
.
Lam,
“Space-ecient
inde
x
sc
heme
f
or
PCM-based
multiv
ersion
databases
in
cyber
-ph
y
sical
sy
s
tems,
”
A
CM
T
r
ansactions
on
Embedded
Computing
Sy
s
t
ems
,
v
ol.
16,
no.
1,
pp.
1–26,
Oct.
2016,
doi:
10.1145/2950060.
[24]
R.
K
¨
uhn,
D.
Bieber
t,
C.
Hak
er
t,
J.-J.
Chen,
and
J.
T
eubner
,
“
T
o
w
ards
data-based
cac
he
optimization
of
B+-trees,
”
Pr
oceedings
of
t
he
19t
h
Int
er
national
W
or
kshop
on
Data
Manag
ement
on
N
ew
Har
dw
ar
e
,
Jun.
2023,
pp.
63–69,
doi:
10.1145/3592980.3595316.
[25]
H.
Shen,
L.
X
ue,
H.
W
ang,
L.
Zhang,
and
J.
Zhang,
“B+-tree
based
multi-k
e
yw
ord
rank
ed
similar
ity
searc
h
sc
heme
o
v
er
encr
ypted
cloud
data,
”
IEEE
A
ccess
,
v
ol.
9,
pp.
150865–150877,
2021,
doi:
10.1109/access.2021.3125729.
[26]
Y
.-S.
Zhao
and
Q.-
A.
Zeng,
“Secure
and
ecient
product
inf
or
mation
retr
ie
v
al
in
cloud
computing,
”
IEEE
A
ccess
,
v
ol.
6,
pp.
14747–14754,
2018,
doi:
10.1109/access.2018.2816919.
[27]
J.
F
u,
N
.
W
ang,
B.
Cui,
and
B.
K.
Bhar
g
a
v
a,
“
A
practical
frame
w
ork
f
or
secure
document
retr
ie
v
al
in
encr
ypted
cloud
le
sy
s
tems,
”
IEEE
T
r
ansactions
on
P
ar
allel
and
Dis
tribut
ed
Sy
s
t
ems
,
v
ol.
33,
no.
5,
pp.
1246–1261,
Ma
y
2022,
doi:
10.1109/tpds.2021.3107752.
[28]
R.
M.
Alamelu
and
K.
Prabu,
“Hybr
idization
of
Pig
eon
inspired
with
glo
ww
or
m’
sw
ar
m
optimization
based
clus
ter
ing
tec
hniq
ue
in
wireless
sensor
netw
orks,
”
Micr
opr
ocessor
s
and
Micr
osy
s
t
ems
,
v
ol.
91,
Jun.
2022,
doi:
10.1016/j.micpro.2022.104528.
[29]
R.
R.
Pr
iy
adarshini
and
N
.
Siv
ak
umar
,
“Clus
ter
head
selection
based
on
minimum
connected
dominating
set
and
Bi-par
tite
inspired
methodology
f
or
ener
gy
conser
v
ation
in
W
SNs,
”
Jour
nal
of
King
Saud
U
niv
er
sity
-
Comput
er
and
Inf
or
mation
Sciences
,
v
ol.
33,
no.
9,
pp.
1132–1144,
N
o
v
.
2021,
doi:
10.1016/j.jksuci.2018.08.009.
[30]
A.
Buldas
e
t
al.
,
“
An
ultra-scalable
bloc
kc
hain
platf
or
m
f
or
univ
ersal
asset
tok
enization:
design
and
implementation,
”
IEEE
A
ccess
,
v
ol.
10,
pp.
77284–77322,
2022,
doi:
10.1109/access.2022.3192837.
BIOGRAPHIES
OF
A
UTHORS
Dr
.
Bo
w
onsak
Srisungsittisunti
is
Assis
tant
Prof
essor
at
Computer
Engineer
ing,
Sc
hool
of
Inf
or
mation
and
Communication
tec
hnology
,
U
niv
ersity
of
Pha
y
ao,
Thailand.
He
holds
a
Ph.D.
deg
ree
in
computer
engineer
ing
with
specialization
in
data
processing.
His
researc
h
areas
are
data
processing,
data
anal
ytic,
data
mining,
and
database
sy
s
tem.
He
can
be
contacted
at
email:
bo
w
onsak.sr@up.ac.th.
Jira
w
at
Duangkae
w
receiv
ed
a
Bac
helor
of
Science
deg
ree
in
computer
science
from
Rambhai
Bar
ni
Ra
jabhat
U
niv
ersity
,
Thailand,
in
2020,
and
completed
his
Mas
ter
of
Engineer
ing
in
computer
engineer
ing
at
the
U
niv
ersity
of
Pha
y
ao,
Thailand,
in
2024.
He
is
cur
rentl
y
pursuing
his
Ph.D.
in
computer
engineer
ing
at
the
U
niv
ersity
of
Pha
y
ao,
Thailand.
His
researc
h
interes
ts
include
inde
xing
tec
hniq
ues,
non-relational
databases,
lar
g
e
databases,
and
incremental
databases.
He
can
be
contacted
at
email:
jira
w
at.du@outlook.com.
N
akarin
Chaikae
w
is
Ph.D.
in
remote
sensing
and
g
eog
raphic
inf
or
mation
sy
s
tems,
Asian
Ins
titute
of
T
ec
hnology
(AIT).
He
is
Assis
tant
Prof
essor
in
g
eog
raphic
inf
or
mation
science,
U
niv
ersity
of
Pha
y
ao,
Thailand.
He
can
be
contacted
at
email:
nakar
in.c
h@up.ac.th.
Int
J
Ar
tif
Intell,
V
ol.
14,
N
o.
3,
June
2025:
2537–2546
Evaluation Warning : The document was created with Spire.PDF for Python.