International Journal of Electrical and Computer Engineering (IJECE)
Vol. 9, No. 2, April 2019, pp. 1359–1373
ISSN: 2088-8708, DOI: 10.11591/ijece.v9i2.pp1359-1373
Improved optimization of numerical association rule mining using hybrid particle swarm optimization and Cauchy distribution
Imam Tahyudin¹ and Hidetaka Nambo²
¹,²Artificial Intelligence Laboratory, Graduate School of Natural Science and Technology, Division of Electrical Engineering and Computer Science, Kanazawa University, Japan
¹Department of Information System, STMIK AMIKOM Purwokerto, Indonesia
Article Info

Article history:
Received Sep 7, 2017
Revised Sep 10, 2018
Accepted Sep 16, 2018

Keywords:
Numerical data
ARM
PSO
Cauchy distribution
Multi-objective functions
PARCD
ABSTRACT

Particle Swarm Optimization (PSO) has been applied to solve optimization problems in various fields, such as Association Rule Mining (ARM) of numerical problems. However, PSO often becomes trapped in local optima. Consequently, the results do not represent the overall optimum solutions. To address this limitation, this study aims to combine PSO with the Cauchy distribution (PARCD), which is expected to increase the global optimal value of the expanded search space. Furthermore, this study uses multiple objective functions, i.e., support, confidence, comprehensibility, interestingness and amplitude. In addition, the proposed method was evaluated using benchmark datasets, such as the Quake, Basketball, Body fat, Pollution, and Bolt datasets. Evaluation results were compared to the results obtained by previous studies. The results indicate that the overall values of the objective functions obtained using the proposed PARCD approach are satisfactory.

Copyright © 2019 Institute of Advanced Engineering and Science. All rights reserved.
Corresponding Author:
Imam Tahyudin,
Artificial Intelligence Laboratory, Graduate School of Natural Science and Technology,
Electrical Engineering and Computer Science, Kanazawa University,
Kakumamachi, Kanazawa, Ishikawa, Japan.
Tel.: +81-76-234-4835
Fax: +81-76-234-4900
Email: imam@blitz.ec.t.kanazawa-u.ac.jp
1. INTRODUCTION
The ARM or association analysis method is used to find associations or relationships between variables that often arise simultaneously in a dataset [1]. In other words, association analysis builds a rule for several variables in a dataset that can be distinguished as an antecedent or a consequent. The Apriori and Frequent Pattern (FP) growth methods are widely employed in association analysis. These methods are suitable for categorical or binary data, such as gender data, i.e., males can be represented by 0 and females by 1 [2]. Furthermore, if the data are numeric, such as age, weight or length, these methods process the data by transforming the numerical data into categorical data (i.e., a discretization process). This transformation process requires more time and can miss a significant amount of important information because data transformation does not maintain the main meaning of the original data [3], [4], [5]. For example, if an age datum representing a 35-year-old person is transformed to 1, this obscures the original meaning of the age information. In addition, both methods require manual intervention to determine the minimum support (attribute coverage) and confidence (accuracy) values. Note that this step is subjective in some cases; thus, the results will not be optimal [6], [7].

Journal Homepage: http://iaescore.com/journals/index.php/IJECE
To resolve this problem, some researchers have proposed solutions that employ optimization approaches, e.g., particle swarm optimization (PSO) [4], fuzzy logic [8], and the genetic algorithm (GA) [3], [7]. The PSO approach uses multiple objective functions to solve the association analysis of numerical data without a discretization process. This research produced better results than previous optimization methods: it finds the optimum value automatically, without determining the minimum support and minimum confidence. However, this method can also become trapped in local optima. When iterations are complete and the number of iterations tends toward infinity, the velocity value of a particle approaches 0 (the weight value of the velocity function is between 0 and 1). Therefore, the search is terminated because the PSO method cannot find the optimal value when the velocity value is 0. Thus, PSO often fails to find the overall optimal value [4], [9], [10].

We propose a method that addresses premature search termination as well as the limitation of traditional methods, in that it does not use a discretization process. In other words, the original data are processed directly using the concept of the Michigan or Pittsburgh approaches. Furthermore, support and confidence threshold values are determined automatically using the Pareto optimality concept. One solution to this problem is to combine PSO with the Cauchy distribution. This combination increases the size of the search space and is expected to produce a better optimal value. Yao et al. (1999) reported that combining a function with the Cauchy distribution results in a wider coverage area; thus, when the Cauchy distribution is combined with the function of the PSO method, the optimal value will increase [10].

Therefore, the purpose of this study is to find the optimal value of numerical data in association analysis problems by combining PSO with the Cauchy distribution (PARCD). Furthermore, we determine the values of several objective functions, i.e., support, confidence, comprehensibility, interestingness, and amplitude, as parameters to evaluate the performance of the proposed method.

Problem solving in numerical data association analysis is generally performed using several approaches, including discretization, distribution and optimization. Discretization is performed using partitioning and combining, clustering [11], [12] and fuzzy [8] methods, while the optimization approach is addressed using the optimized association rule [13], differential evolution [14], GA [3], [7] and PSO [4], [15], as shown in Figure 1.

Figure 1. Numeric association analysis rule mining
We focus on solving the problem of association analysis of numerical data by optimization. Previous research using an optimization approach is known as the GAR method, which attempted to find the optimal item set with the best support value without using a discretization process [13]. Subsequently, the differential evolution optimization approach included the generation of the initial population, as well as mutation, crossover and selection operations; its multi-objective functions are optimized using Pareto optimality theory. This method is known as MODENAR [14]. Furthermore, a study of numerical association rule mining used the genetic algorithm approach (ARM GA). It successfully solved association analysis of numerical data problems without determining the values of minimum support or minimum confidence manually. In addition, this method can extract the best rule, i.e., the rule with the best relationship between the support and confidence values [7]. Another GA-based study used the MOGAR method. It showed that the MOGAR method was faster than conventional methods, such as the Apriori and FP-growth algorithms, because the time complexity of the MOGAR method tends to be simpler and follows a quadratic distribution. On the other hand, the Apriori
algorithm follows an exponential distribution, which requires more time for computation [3]. Next, the PSO method has been used to solve numerical ARM problems. For example, one study used PSO-based ARM to investigate the association of frequent and repeated dysfunction in the production process; employing PSO resulted in a faster and more effective optimization process than the other optimization methods [16].
In addition, the PSO approach was used to improve the computational efficiency of ARM problems such that appropriate support and confidence values could be determined automatically [17]. In 2012, the development of PSO for ARM problems was performed by weighting the item set. This weighting is very important for very large data because such data often contain important information that appears infrequently. For example, in medical data, the rule {stiff neck, fever, aversion to light} → {meningitis} rarely appears, but it is very important because in fact this condition often happens [18].
In 2013, Sarath and Ravi introduced binary PSO (BPSO) to generate association rules in a transaction database. This method is similar to the Apriori and FP-growth algorithms; however, BPSO can determine optimum rules without specifying the minimum support and confidence values [19]. In 2014, Beiranvand et al. studied numerical data association analysis using the PSO method. They stated that the employed method could effectively analyze numerical data association analysis problems without using a discretization process. This research employs four objective functions, i.e., support, confidence, comprehensibility and interestingness. This method is referred to as MOPAR [4].
In 2014, Indira and Kanmani conducted research using a PSO approach; however, they attempted to improve results and analysis time using an adaptive parameter determination process to determine various parameters, such as the constant and weight values in the velocity equation. They developed the Apriori algorithm using a PSO approach (APSO), and the results demonstrated that this approach was faster and better than using only an Apriori method [15].
In addition, the combination of PSO and GSA has been applied to solve the optimal reactive power dispatch problem in power systems. The problem was successfully accomplished on the basis of an efficient and reliable technique, and the results were satisfactory to a large extent compared with those reported earlier [20].
Verma and Lakhwani examined ARM problems by combining PSO and a GA. The results showed better accuracy and consistency compared to an individual PSO or GA method [21].
There are many developments of the PSO method, e.g., the papers "the implementation of PSO in distributed generation sizing" [22] and "improved canny edges using cellular based PSO technique in digital images" [23], as well as hybrid methods. One of the hybrid methods is the hybrid of PSO with the Cauchy distribution [24]. This method provides better results compared to using only PSO.
In 2011, this combined method was retested for SVM parameter selection [25-27]. The combined approach was also used to improve performance weaknesses in a process to identify a watermark image based on the discrete cosine transform (DCT). The results demonstrated that combining PSO with the Cauchy distribution outperforms the compared method [28]. In 2014, an empirical study demonstrated that combining PSO with the Cauchy distribution provided improved results; the results show that the performance of PSO with the Cauchy distribution is higher than that of PSO alone [29].
To the best of our knowledge, combining PSO with the Cauchy distribution has not been applied to ARM problems that involve numerical data. This research makes an important contribution to the optimization approach for numerical ARM problems.
The remainder of this paper is organized as follows. The research method is discussed in Section 2. This section describes the design of the multiple objective functions and the development of the proposed PARCD method. Section 3 presents the experimental results and a discussion of the proposed method, which was tested using benchmark datasets. This section also provides a comparison of the results obtained by the proposed PARCD method and existing methods. Conclusions and suggestions for future work are provided in Section 4.
2. RESEARCH METHOD
2.1. Objective Design
This study uses multiple objective functions, i.e., support, confidence, comprehensibility, interestingness and amplitude. First, the support criterion determines the ratio of transactions containing item X to the total transactions (D), i.e., support(X) = |X| / |D|. Then, if A is the antecedent of the transaction dataset as a precondition, C is the consequence as the conclusion of a transaction dataset. The support value of if A then C (A → C) is computed as follows:

\mathrm{Support}(A \cup C) = \frac{|A \cup C|}{|D|} \quad (1)
where |A ∪ C| is the number of transactions which contain A and C. The minimum support value is closely linked to the number of items covered to determine the referenced rule. If the threshold value is low, the support covers many items, and vice versa. The support measurement is used to determine the confidence measurement criteria, i.e., the criteria used to measure the quality or accuracy of a rule derived from the total transactions. Such rules are often developed for each transaction to better demonstrate quality or accuracy [4]. Confidence can be expressed as follows:

\mathrm{Confidence}(A \cup C) = \frac{\mathrm{Support}(A \cup C)}{\mathrm{Support}(A)} \quad (2)
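As a concrete illustration of Eqs. (1) and (2), the sketch below computes support and confidence for an interval-based rule on a toy dataset. The dataset, attribute names and helper functions are our own illustrative assumptions, not the paper's MATLAB implementation.

```python
# Illustrative sketch (not the authors' code): support and confidence of a
# rule "if A then C" (Eqs. 1-2) on a toy numerical dataset. A rule part is a
# set of (attribute, lower, upper) intervals; a record matches a part when
# every listed attribute falls inside its interval.

def matches(record, intervals):
    """True if the record satisfies every (attr, lb, ub) interval."""
    return all(lb <= record[attr] <= ub for attr, lb, ub in intervals)

def support(dataset, intervals):
    """|X| / |D|: fraction of records covered by the intervals."""
    return sum(matches(r, intervals) for r in dataset) / len(dataset)

def confidence(dataset, antecedent, consequent):
    """Support(A u C) / Support(A), per Eq. (2)."""
    return support(dataset, antecedent + consequent) / support(dataset, antecedent)

# Toy data: each record has two numerical attributes.
data = [{"age": 25, "weight": 60}, {"age": 35, "weight": 80},
        {"age": 40, "weight": 82}, {"age": 55, "weight": 90}]
A = [("age", 30, 50)]          # antecedent interval
C = [("weight", 75, 85)]       # consequent interval

print(support(data, A + C))    # |A u C| / |D| = 2/4 = 0.5
print(confidence(data, A, C))  # 0.5 / 0.5 = 1.0
```

With a real dataset only the interval lists change; the two ratios are computed the same way.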
However, these criteria are not guaranteed to produce appropriate rules. Thus, for a given rule to be considered reliable and to provide overall coverage, the result must also satisfy the comprehensibility and interestingness criteria. Ghosh and Nath (2004) stated that a smaller number of attributes in the antecedent component of a rule shows that the rule is comprehensible [30]. The comprehensibility measurement criteria can be expressed as follows:

\mathrm{Comprehensibility}(A \cup C) = \frac{\log(1 + |C|)}{\log(1 + |A \cup C|)} \quad (3)

where |C| is the number of consequence items and |A ∪ C| is the number of items in the rule if A then C (A → C).
Next, the interestingness criteria are used to generate hidden information by extracting some interesting or unique rules. This criterion is based on the support value and is expressed as follows:

\mathrm{Interestingness}(A \cup C) = \frac{\mathrm{Supp}(A \cup C)}{\mathrm{Supp}(A)} \times \frac{\mathrm{Supp}(A \cup C)}{\mathrm{Supp}(C)} \times \left(1 - \frac{\mathrm{Supp}(A \cup C)}{|D|}\right) \quad (4)
The right side of Eq. (4) consists of three components. The first component shows the generation probability of the rule based on the antecedent attributes. The second is based on the consequence attributes, and the third is based on the total dataset. There is a negative correlation between interestingness and support. When the support value is high, the interestingness value is low because the number of frequent items covered is small [4].
The last criterion is the amplitude interval. The amplitude interval, which is a measure of a minimization function, differs from the support, confidence and comprehensibility measures, which are maximization functions. The amplitude interval is expressed as follows:

\mathrm{Amplitude}(A \cup C) = 1 - \frac{1}{m} \sum_{i=1}^{m} \frac{u_i - l_i}{\max(A_i) - \min(A_i)} \quad (5)

Here, m is the number of attributes in the item set (|A ∪ C|), and u_i and l_i are the upper and lower bounds encoded in the item sets corresponding to attribute i. max(A_i) and min(A_i) are the allowable limits of the intervals corresponding to attribute i. Thus, rules with smaller intervals are intended to be generated [14].
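The remaining objectives, Eqs. (3)-(5), can be sketched in the same spirit. The function names and toy inputs below are ours; in practice the counts, supports and interval bounds come from the rule and the dataset.

```python
import math

# Illustrative sketch of Eqs. (3)-(5); the toy numbers are ours, not the paper's.

def comprehensibility(n_consequent, n_rule):
    """log(1+|C|) / log(1+|A u C|), Eq. (3): fewer antecedent attributes -> closer to 1."""
    return math.log(1 + n_consequent) / math.log(1 + n_rule)

def interestingness(supp_ac, supp_a, supp_c, n_records):
    """Eq. (4): product of three probability-ratio components (supports as counts)."""
    return (supp_ac / supp_a) * (supp_ac / supp_c) * (1 - supp_ac / n_records)

def amplitude(bounds, limits):
    """Eq. (5): 1 - (1/m) * sum((u_i - l_i) / (max(A_i) - min(A_i))).
    bounds: per-attribute (l_i, u_i) encoded in the itemset;
    limits: per-attribute (min(A_i), max(A_i)) allowed in the data.
    Tighter intervals give values closer to 1."""
    m = len(bounds)
    return 1 - sum((u - l) / (hi - lo)
                   for (l, u), (lo, hi) in zip(bounds, limits)) / m

print(comprehensibility(2, 5))                                  # log(3)/log(6)
print(interestingness(10, 20, 25, 100))                         # 0.5 * 0.4 * 0.9 = 0.18
print(amplitude([(30, 50), (75, 85)], [(0, 100), (40, 140)]))   # 1 - (0.2+0.1)/2 = 0.85
```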
2.2. PSO
PSO, first introduced by Kennedy and Eberhart (1995), is an evolutionary method inspired by animal behavior, e.g., flocks of birds, schools of fish, or swarms of bees [31]. PSO begins with a set of random particles. Then, a search process attempts to find the optimal value by performing an update generation process. During each iteration, each particle is updated by following two best values. The first is the best solution (fitness) achieved to this point; this value is called pBest. The other best value tracked by the particle swarm optimizer is the best value obtained by any particle in the population; this value is called gBest. After finding pBest and gBest, each particle's velocity and corresponding position are updated [15]. Each particle p in some iteration t has a position x(t) and displacement speed v(t). The finest particle (pBest) and the best global position (gBest) are stored in memory. The speed and position are updated using Eqs. (6) and (7), respectively [15].

V_{i,\mathrm{new}} = \omega V_{i,\mathrm{old}} + C_1\,\mathrm{rand}()(pBest - X_i) + C_2\,\mathrm{rand}()(gBest - X_i) \quad (6)
X_{i,\mathrm{new}} = X_{i,\mathrm{old}} + V_{i,\mathrm{new}} \quad (7)

Here, ω is the inertia weight; V_{i,old} is the velocity of the i-th particle before updating; V_{i,new} is the velocity of the i-th particle after updating; X_i is the i-th (current) particle; i is the number of particles; rand() is a random number in the range (0, 1); C_1 is the cognitive component; C_2 is the social component; pBest is the particle best, or local optimum, over the iterations of every run; and gBest is the global best, or global optimum, over the iterations of every run. Particle velocities in each dimension are restricted to a maximum velocity V_max [32].
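A minimal sketch of the update in Eqs. (6) and (7); the parameter values mirror Table 2 (ω = 0.63, C1 = C2 = 2, velocity limit 3.83), while the single-particle toy inputs are our own.

```python
import random

# Generic PSO velocity/position update following Eqs. (6)-(7); not the
# authors' MATLAB code, just an illustration of one particle's update.

OMEGA, C1, C2, V_MAX = 0.63, 2.0, 2.0, 3.83

def update_particle(x, v, p_best, g_best):
    """Return the new (velocity, position) of one particle (Eqs. 6-7)."""
    new_v = []
    for xi, vi, pb, gb in zip(x, v, p_best, g_best):
        vi_new = (OMEGA * vi
                  + C1 * random.random() * (pb - xi)
                  + C2 * random.random() * (gb - xi))
        # velocities in each dimension are restricted to V_max
        vi_new = max(-V_MAX, min(V_MAX, vi_new))
        new_v.append(vi_new)
    new_x = [xi + vi for xi, vi in zip(x, new_v)]
    return new_v, new_x

random.seed(0)
v, x = update_particle([0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0])
print(x)  # the particle moves toward pBest/gBest
assert all(abs(vi) <= V_MAX for vi in v)
```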
2.3. Cauchy Distribution
Yao et al. (1999) used a Cauchy distribution to implement a wider mutation scale [10]. A general formula for the probability density function is expressed as follows:

f(x) = \frac{1}{\pi s \left(1 + ((x - t)/s)^2\right)} \quad (8)

A Cauchy random variable is calculated as follows. For any random variable X with distribution function F, the random variable Y = F(X) has a uniform distribution in the range [0,1). Consequently, if F is inverted, a uniform density can be used to simulate the random variable X because X = F^{-1}(Y). Therefore, the cumulative distribution function of the Cauchy distribution is expressed as follows:

F(x) = \frac{1}{\pi}\arctan(x) + 0.5 \quad (9)

Therefore, if

y = \frac{1}{\pi}\arctan(x) + 0.5 \quad (10)

then, by inverting this function, the Cauchy random variable can be expressed as follows:

x = \tan(\pi(y - 0.5)) \quad (11)

This function can be expressed by Eq. (12) because y has a uniform distribution in the range (0,1]. Thus, we obtain the following:

x = \tan\left(\frac{\pi}{2}\,\mathrm{rand}[0,1)\right) \quad (12)
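The inverse-CDF construction in Eqs. (9)-(11) can be verified numerically. This sketch (ours, not the paper's code) draws a Cauchy variate from a uniform one and checks the round trip through the CDF.

```python
import math
import random

# Sampling a standard Cauchy variate by inverting the CDF in Eq. (9):
# F(x) = (1/pi) * arctan(x) + 0.5, so x = tan(pi * (y - 0.5)) for uniform y
# (Eq. 11). The heavy tails (infinite variance) are what widen the search.

def cauchy_from_uniform(y):
    """Inverse-CDF transform of Eq. (11); y must lie in (0, 1)."""
    return math.tan(math.pi * (y - 0.5))

def cauchy_cdf(x):
    """Eq. (9): cumulative distribution function of the standard Cauchy."""
    return math.atan(x) / math.pi + 0.5

random.seed(1)
y = random.random()
x = cauchy_from_uniform(y)
assert abs(cauchy_cdf(x) - y) < 1e-9  # applying the CDF recovers the uniform input

# The sample median stays near 0 even though the variance is infinite.
samples = sorted(cauchy_from_uniform(random.random()) for _ in range(10001))
print(samples[5000])
```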
2.4. PSO for Numerical Association Rule Mining with Cauchy Distribution
PARCD is an extension of the MOPAR method that combines PSO and the Cauchy distribution to solve problems that occur in the association analysis of numerical data [33]. The goal is to find the optimal value automatically and avoid being trapped in local optima. Essentially, this method uses the concept of PSO but modifies the velocity equation by including the Cauchy distribution. The velocity function is expressed as follows:

V_i(t+1) = \omega(t) V_i(t) + C_1\,\mathrm{rand}()(pBest - X_i(t)) + C_2\,\mathrm{rand}()(gBest - X_i(t)) \quad (13)

The next step is normalization using the V_i(t+1) value (Eq. 13), which makes the vector length 1. The variance of the Cauchy distribution is infinite, and the objective function scales are 1 [10].

U_i(t+1) = \frac{V_i(t+1)}{\sqrt{V_{i1}(t+1)^2 + V_{i2}(t+1)^2 + \cdots + V_{iK}(t+1)^2}} \quad (14)

The result of the normalization process is multiplied by the Cauchy random variable as follows:

S_i(t+1) = U_i(t+1)\,\tan\left(\frac{\pi}{2}\,\mathrm{rand}[0,1)\right) \quad (15)

Then, the result of Eq. (15), which is a combination of the velocity value and the Cauchy distribution, is used to determine the new position of a particle:

X_i(t+1) = X_i(t) + S_i(t+1) \quad (16)
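One complete PARCD move, Eqs. (13)-(16), can then be sketched as follows; the function and variable names and the toy two-dimensional inputs are ours, not the paper's implementation.

```python
import math
import random

# Sketch of one PARCD move: PSO velocity (Eq. 13) is normalized to unit
# length (Eq. 14), scaled by a Cauchy random variate (Eq. 15), and added
# to the position (Eq. 16). Parameter values follow Table 2.

OMEGA, C1, C2 = 0.63, 2.0, 2.0

def parcd_step(x, v, p_best, g_best):
    # Eq. (13): standard PSO velocity update
    v_new = [OMEGA * vi
             + C1 * random.random() * (pb - xi)
             + C2 * random.random() * (gb - xi)
             for xi, vi, pb, gb in zip(x, v, p_best, g_best)]
    # Eq. (14): normalize the velocity vector to length 1
    norm = math.sqrt(sum(vi * vi for vi in v_new)) or 1.0
    u = [vi / norm for vi in v_new]
    # Eq. (15): scale the unit vector by a Cauchy random variate
    cauchy = math.tan(math.pi / 2 * random.random())
    s = [ui * cauchy for ui in u]
    # Eq. (16): move the particle
    return [xi + si for xi, si in zip(x, s)]

random.seed(2)
x_new = parcd_step([0.0, 0.0], [0.5, -0.5], [1.0, 1.0], [2.0, 2.0])
print(x_new)
```

The Cauchy scale factor occasionally takes very large values, which is exactly the long-jump behavior the paper relies on to escape local optima.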
2.5. PARCD Pseudocode and Flowchart
The PARCD pseudocode shown in Figure 2 and the flowchart shown in Figure 3 show that the algorithm begins by initializing the velocity vector and position randomly. The algorithm calculates the multi-objective functions as the current fitness. Then, it executes looping iterations to seek pBest until it finds the gBest value as the optimal solution.

Figure 2. PARCD pseudocode

Figure 3. PSO flowchart
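As a rough stand-in for the pseudocode in Figure 2, the skeleton below shows the loop structure just described (random initialization, fitness evaluation, pBest/gBest bookkeeping). The single scalar fitness is only a placeholder for the paper's multi-objective evaluation and Pareto-based selection, and all names are ours.

```python
import random

# Skeleton of the iterate-until-gBest loop; the scalar fitness is a
# stand-in (maximize -(x0^2 + x1^2)), not one of the paper's objectives.

def fitness(x):
    return -sum(xi * xi for xi in x)

def pso_loop(n_particles=10, dims=2, iters=50,
             omega=0.63, c1=2.0, c2=2.0, v_max=3.83):
    random.seed(3)
    xs = [[random.uniform(-5, 5) for _ in range(dims)] for _ in range(n_particles)]
    vs = [[0.0] * dims for _ in range(n_particles)]
    p_best = [list(x) for x in xs]                # best position per particle
    g_best = list(max(p_best, key=fitness))       # best position overall
    for _ in range(iters):
        for i in range(n_particles):
            x, v = xs[i], vs[i]
            for d in range(dims):
                v[d] = (omega * v[d]
                        + c1 * random.random() * (p_best[i][d] - x[d])
                        + c2 * random.random() * (g_best[d] - x[d]))
                v[d] = max(-v_max, min(v_max, v[d]))  # clamp to V_max
                x[d] += v[d]
            if fitness(x) > fitness(p_best[i]):   # update pBest
                p_best[i] = list(x)
        g_best = list(max(p_best, key=fitness))   # update gBest
    return g_best

best = pso_loop()
print(best, fitness(best))
```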
3. RESULT AND DISCUSSION
3.1. Experimental Setup
We conducted an experiment using the Quake, Basketball, Body fat, Pollution, and Bolt benchmark datasets in Table 1, obtained from the Bilkent University Function Approximation Repository. The experiment was performed on a computer with an Intel Core i5 processor and 8 GB main memory running Windows 7. The algorithms were implemented using MATLAB. For the proposed algorithm, we set the population size, external repository size, number of iterations, C1 and C2, ω, velocity limit and xRank parameters in Table 2 to 40, 100, 2000, 2, 0.63, 3.83, and 13.33, respectively.
Table 1. Dataset Properties

Dataset      No. of Records   No. of Attributes
Quake        2178             4
Basketball   96               5
Body fat     252              15
Pollution    60               16
Bolt         40               8
Table 2. Parameters

Parameter   Size   External Repository Size   Number of iterations   C1, C2   ω      Velocity Limit   xRank
Average     40     100                        2000                   2        0.63   3.83             13.33
3.2. Experiments
Association rule analysis comprises two steps. The first step is to determine the frequent itemset that includes the antecedents or consequences of each attribute. The second step is to implement the proposed algorithm.
3.2.1. Output Rules of the PARCD Results
This experiment shows the 20th run, where each run contains 2000 rules. We present the output rules for three datasets, i.e., the Body fat, Bolt, and Pollution datasets. Table 3 shows the results obtained with the Body fat dataset. For Rule 1, there are eight antecedent attributes and three consequent attributes. For Rule 2, the numbers of antecedent and consequent attributes are the same as for Rule 1. For the last rule, the numbers of antecedent and consequent attributes are six and two, respectively. The antecedent attributes of Rule 1 are case number, percent body fat (Siri's equation), density, age, adiposity index, chest circumference, abdomen circumference, and thigh circumference. The consequent attributes are percent body fat (Brozek's equation), height, and hip circumference. For Rule 2, the antecedent and consequent attributes are the same as for Rule 1. Thus, Rules 1 and 2 can be expressed as follows: if (att1, att3, att4, att5, att8, att11, att12, att14) then (att2, att7, att13). For Rule 2000, the antecedent attributes are percent body fat using Brozek's equation, percent body fat using Siri's equation, density, height, neck circumference and knee circumference, and the consequent attributes are case number and weight. Therefore, Rule 2000 is if (att2, att3, att4, att7, att10, att15) then (att1, att6).

Table 4 shows the results obtained with the Bolt dataset, which has eight attributes (run, speed1, total, speed2, number2, sens, time and T20Bolt). As can be seen, the first two rules have the same results for both antecedent and consequent attributes. The antecedent attributes are total and time, and the consequent attributes are run and speed1. Therefore, the rule is if (total, time) then (run, speed1). Rule 2000 shows that the antecedent
attributes are run and speed2. However, the consequent attribute is unknown. Thus, this rule cannot be declared clearly because it does not have a conclusion.
Table 5 shows the rule results for the Pollution dataset obtained using the proposed PARCD particle representation method. The results for the first and second rules are the same. Here, the antecedent attributes are JANT, EDUC, NONW, and WWDRK, and the consequent attributes are PREC, JULT, OVR65, DENS and HUMID. Thus, the rule is if (JANT, EDUC, NONW, WWDRK) then (PREC, JULT, OVR65, DENS, HUMID). Rule 2000 has an ACN result that differs from the first and second rules. The antecedent attributes of Rule 2000 are JANT, OVR65, HOUS, POOR, HC and HUMID, and its consequent attributes are POPN, EDUC, DENS, NOX, and SO2. Thus, the final rule is if (JANT, OVR65, HOUS, POOR, HC) then (POPN, EDUC, DENS, NOX, SO2).
Table 3. ACN Rules (the Body fat dataset)

Rules       ACN           LB < Attribute < UB
Rule 1      Antecedent    1.096724 < Att1 < 1.108900
                          57.988435 < Att3 < 69.574945
                          309.987803 < Att4 < 314.218245
                          55.294719 < Att5 < 66.896106
                          136.234441 < Att8 < 138.744999
                          40.927433 < Att11 < 41.562953
                          20.266071 < Att12 < 20.586850
                          22.220988 < Att14 < 23.180185
            Consequence   35.426088 < Att2 < 42.169776
                          113.825926 < Att7 < 122.261793
                          32.375620 < Att13 < 33.596051
Rule 2      Antecedent    1.096724 < Att1 < 1.108900
                          57.988435 < Att3 < 69.574945
                          309.987803 < Att4 < 314.218245
                          55.294719 < Att5 < 66.896106
                          136.234441 < Att8 < 138.744999
                          40.927433 < Att11 < 41.562953
                          20.266071 < Att12 < 20.586850
                          22.220988 < Att14 < 23.180185
            Consequence   35.426088 < Att2 < 42.169776
                          113.825926 < Att7 < 122.261793
                          32.375620 < Att13 < 33.596051
.....
Rule 2000   Antecedent    12.402089 < Att2 < 18.144187
                          56.221481 < Att3 < 65.667791
                          139.024098 < Att4 < 289.982951
                          94.156397 < Att7 < 136.200000
                          57.669974 < Att10 < 87.300000
                          18.798957 < Att15 < 19.060978
            Consequence   1.054478 < Att1 < 1.108900
                          31.100000 < Att15 < 40.883823

Note: Att1: Case Number; Att2: Percentage using Brozek's equation; Att3: Percentage using Siri's equation; Att4: Density; Att5: Age (years); Att6: Weight (lbs); Att7: Height (inches) (target); Att8: Adiposity index; Att9: Fat Free Weight; Att10: Neck circumference (cm); Att11: Chest circumference (cm); Att12: Abdomen circumference (cm); Att13: Hip circumference (cm); Att14: Thigh circumference (cm); Att15: Knee circumference (cm); Att16: Ankle circumference (cm); Att17: Extended biceps circumference (cm); Att18: Forearm circumference (cm); Att19: Wrist circumference (cm)
Table 4. ACN Rules (the Bolt dataset)

Rules       ACN           LB < Attribute < UB
Rule 1      Antecedent    11.911616 < Att3 < 16.259242
                          62.782669 < Att7 < 65.562550
            Consequence   23.688468 < Att1 < 31.295955
                          5.928943 < Att2 < 6.000000
Rule 2      Antecedent    11.911616 < Att3 < 16.259242
                          62.782669 < Att7 < 65.562550
            Consequence   23.688468 < Att1 < 31.295955
                          5.928943 < Att2 < 6.000000
.....
Rule 2000   Antecedent    13.621221 < Att1 < 29.817232
                          1.761097 < Att4 < 2.325029
            Consequence   None

Note: Att1: RUN; Att2: SPEED1; Att3: TOTAL; Att4: SPEED2; Att5: NUMBER2; Att6: SENS; Att7: TIME; Att8: T20BOLT
Table 5. ACN Rules (the Pollution dataset)

Rules       ACN           LB < Attribute < UB
Rule 1      Antecedent    42.431841 < Att2 < 46.441110
                          9.675301 < Att6 < 10.303791
                          24.171326 < Att9 < 27.345700
                          42.882070 < Att10 < 44.054696
            Consequence   21.695266 < Att1 < 22.757671
                          77.760994 < Att3 < 80.221960
                          6.698662 < Att4 < 7.071898
                          7436.549761 < Att8 < 7801.004046
                          58.816363 < Att15 < 63.240005
Rule 2      Antecedent    42.431841 < Att2 < 46.441110
                          9.675301 < Att6 < 10.303791
                          24.171326 < Att9 < 27.345700
                          42.882070 < Att10 < 44.054696
            Consequence   21.695266 < Att1 < 22.757671
                          77.760994 < Att3 < 80.221960
                          6.698662 < Att4 < 7.071898
                          7436.549761 < Att8 < 7801.004046
                          58.816363 < Att15 < 63.240005
.....
Rule 2000   Antecedent    39.363260 < Att2 < 46.455909
                          8.721294 < Att4 < 9.206407
                          89.212389 < Att7 < 90.700000
                          21.796671 < Att11 < 23.231486
                          606.938956 < Att12 < 648.000000
                          67.768113 < Att15 < 73.000000
            Consequence   2.956662 < Att5 < 3.005372
                          9.450171 < Att6 < 10.068287
                          9345.537477 < Att8 < 9699.000000
                          225.061313 < Att13 < 288.274133
                          242.720468 < Att14 < 250.733264

Note: Att1: PREC, average annual precipitation in inches; Att2: JANT, average January temperature in degrees F; Att3: JULT, average July temperature in degrees F; Att4: OVR65, % of SMSA population aged 65 or older; Att5: POPN, average household size; Att6: EDUC, median school years completed by those over 22; Att7: HOUS, % of housing units which are sound and with all facilities; Att8: DENS, population per sq. mile in urbanized areas, 1960; Att9: NONW, % non-white population in urbanized areas, 1960; Att10: WWDRK, % employed in white collar occupations; Att11: POOR, % of families with income < USD 3000; Att12: HC, relative hydrocarbon pollution potential; Att13: NOX, same as nitric oxides; Att14: SO2, same as sulphur dioxide; Att15: HUMID, annual average relative humidity at 1 pm; Att16: MORT, total age-adjusted mortality rate per 100,000
3.2.2. Output of the multi-objective functions and correlations of the PARCD method
The basic concept of association analysis comprises two steps: the first step is the determination of the rules, where every rule contains an antecedent and a consequent, and the second step is the implementation of the algorithm (i.e., the proposed method). This method begins with the initialization process, i.e., the start of the algorithm: it determines the multi-objective function values and calculates the velocity and position of each particle i. Then, an iterative process is performed to search for pBest and gBest as the optimal solution.
Table 6 shows the results of the multi-objective functions of the PARCD method. Here, there are four parameters, i.e., support, confidence, comprehensibility and interestingness. The method is examined using five datasets, i.e., Quake, Basketball, Body fat, Bolt, and Pollution. Generally, the Bolt dataset is the dominant dataset and has the highest value for each parameter (except comprehensibility). Conversely, the least dominant dataset is Quake (with the exception of the confidence parameter).
Table 6. The Output of the PARCD Method

Dataset      Support (%)   Confidence (%)   Comprehensibility   Interestingness (%)
Quake        22.97         86.73 (25.88)    785.2 (37.72)       2.34 (9.30)
Basketball   61.04         92.69 (17.87)    545.80 (167.74)     6.56 (21.16)
Body fat     73.94         81.26 (30.67)    333.49 (218.95)     10.61 (21.03)
Pollution    250.84        96.88 (9.49)     231.08 (168.35)     43.43 (39.68)
Bolt         60.45         34.96 (43.91)    110.63 (165.76)     9.51 (18.61)

(Values in parentheses are deviations.)
The first parameter, i.e., support, showed the highest value with the Bolt dataset (250.84%) and the lowest with the Quake dataset (22.97%). The average was approximately 90%. The highest confidence value was similar to the support value: the highest confidence value was obtained with the Bolt dataset (96.88%), with a deviation of approximately 10. The lowest confidence value was obtained with the Pollution dataset (34.96%), with a very high deviation of just under 45. The average confidence value was approximately 80%. The highest comprehensibility value was obtained with the Quake dataset (approximately 785). The lowest comprehensibility value was obtained with the Pollution dataset (approximately 110, with a deviation well over 165). The average comprehensibility value was approximately 400. The final parameter, i.e., interestingness, obtained the highest value with the Bolt dataset (approximately 43%, with a deviation of just under 40). The lowest interestingness value was obtained with the Quake dataset (2.34%, with a deviation of just under 10). The average interestingness value was approximately 15%. This demonstrates that the support and confidence values, i.e., 90% and 80% respectively, were satisfactory. Moreover, the comprehensibility value was four times better; however, the interestingness value was not satisfactory (approximately 15%).
The correlation values between each objective function are shown in Table 7 and Figure 4. The results show that each objective function has a significant association with the others, either positive or negative. The correlation value of all objective functions to amplitude was always close to zero. In other words, the correlation to the amplitude function was low. This supports the opinion given by Alatas et al. (2008), i.e., the amplitude function differs from the other functions because it attempts to minimize its value while the other functions attempt to maximize their values.
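Each entry of Table 7 is a correlation coefficient between two objective-function value lists across rules. The sketch below, with made-up numbers rather than the paper's data, also shows why a (near-)constant amplitude column correlates at zero with every other objective.

```python
# Pearson correlation between two objective-function value lists (a sketch
# with invented numbers, not the paper's data).

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    # a constant list has zero spread, so its correlation is reported as 0
    return cov / (sa * sb) if sa and sb else 0.0

support_vals   = [0.2, 0.4, 0.6, 0.8]
interest_vals  = [0.1, 0.3, 0.5, 0.9]   # rises with support -> strong positive
amplitude_vals = [0.7, 0.7, 0.7, 0.7]   # constant -> zero correlation

print(pearson(support_vals, interest_vals))
print(pearson(support_vals, amplitude_vals))  # 0.0
```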
Table 7. Correlation of the Multi-Objective Functions

Quake               Support   Confidence   Comprehensibility   Interestingness   Amplitude
Support             1         0.8076       0.2112              0.9999            0.0000
Confidence          0.8076    1            0.3971              0.8077            0.0000
Comprehensibility   0.2112    0.3971       1                   0.2113            0.0000
Interestingness     0.9999    0.8077       0.2113              1                 0.0000
Amplitude           0.0000    0.0000       0.0000              0.0000            1

Basketball          Support   Confidence   Comprehensibility   Interestingness   Amplitude
Support             1         0.4360       -0.7437             0.9750            0.0000
Confidence          0.4360    1            0.1646              0.5716            0.0000
Comprehensibility   -0.7437   0.1646       1                   -0.6350           0.0000
Interestingness     0.9750    0.5716       -0.6350             1                 0.0000
Amplitude           0.0000    0.0000       0.0000              0.0000            1

Body fat            Support   Confidence   Comprehensibility   Interestingness   Amplitude
Support             1         0.8137       -0.8340             0.8555            0.0000
Confidence          0.8137    1            0.9917              0.9469            0.0000
Comprehensibility   0.8340    0.9917       1                   0.9575            0.0000
Interestingness     0.8555    0.9469       0.9575              1                 0.0000
Amplitude           0.0000    0.0000       0.0000              0.0000            1