TELKOM
NIKA
, Vol. 11, No. 7, July 201
3, pp. 3604 ~ 3610
e-ISSN: 2087
-278X
3604
Re
cei
v
ed
Jan
uary 16, 201
3
;
Revi
sed Ap
ril 5, 2013; Accepte
d
April 1
5
, 2013
Ming Non-redundant Associations From the Frequent
Concept Sets on FP-tre
e
Wang Hui
Information Se
curit
y
En
gi
neer
ing D
epartme
n
t, People’s P
u
b
lic
Securit
y
U
n
i
v
ersit
y
of C
h
in
a, Beiji
ng, Chi
n
a,
100
03
8
e-mail: w
a
ng
hu
i03
30@
gmai
l.com
A
b
st
r
a
ct
T
he class
i
cal
alg
o
rith
m for
mi
nin
g
ass
o
ci
ation r
u
les
is l
o
w
efficiency.
Genera
lly th
er
e is h
i
g
h
redu
nda
ncy
be
tw
een ga
in
ed
rules. T
o
solv
e
these
pr
o
b
l
e
ms, a
n
e
w
al
g
o
rith
m
of find
i
ng
non-r
e
d
und
ant
associ
ation
rul
e
s b
a
sed
o
n
fr
equ
ent c
once
p
t
sets w
a
s pro
pose
d
. T
he
Ha
sse gr
aph
of th
ese c
once
p
ts
w
a
s
gen
erate
d
on t
he bas
is of
the
F
P
-tree. Because
of the rest
riction of the s
upp
ort m
o
st Hasse graphs have
lose l
a
ttice structure. Duri
ng
buil
d
i
ng proc
es
s of t
he Hasse
graph, al
l nod
es w
e
re format
ted accord
ing t
o
the i
ndex
of
ite
m
s w
h
ic
h w
e
re
foun
d i
n
th
e fr
equ
ent-i
te
m
he
ad ta
ble. At t
h
e sa
me ti
me
th
ese
nod
es w
e
r
e
selecte
d
by co
mp
arin
g sup
p
o
r
ts. In
the Hasse grap
h, the intentio
n of no
de
is frequ
ent items
e
t and th
e
extensi
on of n
ode is su
pport
count of
this item set. And th
e non-r
edu
nd
a
n
t associati
on
rules w
e
re ga
i
ned
by scann
in
g the leaf no
des of
the
graph. The
simu
lati
on sho
w
s the feas
ibilit
y of the algorit
hm pr
op
ose
d
.
Ke
y
w
ords
: dat
a mi
ni
ng, no
n-redu
nd
ant, associat
i
on rul
e
s, frequ
ent conc
e
p
t, F
P
-tree
Copy
right
©
2013 Un
ive
r
sita
s Ah
mad
Dah
l
an
. All rig
h
t
s r
ese
rved
.
1. Introduc
tion
Data minin
g
is one of the i
m
porta
nt means fo
r d
a
ta analysi
s
in m
any fields [1-3]. Now
many algo
rith
ms have
appl
ied to ge
nera
t
e better
resu
lts su
cce
ssful
ly. Among these
algo
rithm
s
mining a
s
so
ciation rul
e
s i
s
most po
pula
r
in bu
sine
ss field.The cl
a
ssi
cal al
go
rithms fo
r mini
ng
asso
ciation rules a
r
e Ap
ri
or an
d FP-g
rowth [4-6
]. There
are two
step
s to gain
high co
nfide
n
ce
asso
ciation rules for th
e two alg
o
rithm
s
. The firs
t
step is findin
g
freque
nt itemsets. Thi
s
is
the
key me
asu
r
e
and influ
e
n
c
es the
efficie
n
cy of the
wh
ole alg
o
rithm.
It takes m
a
n
y
times to scan
databa
se in
this process
usu
a
lly. Exploring
app
rop
r
iate kno
w
led
ge mod
e
l an
d red
u
ci
ng the
numbe
r of scannin
g
datab
ase a
r
e the
main ways
to improve
al
gorithm
s. Th
e se
con
d
ste
p
is
extracting
asso
ciation rule
s on the fre
q
uent itemsets
.
Compa
r
ed with
the
first step,
this ste
p
is
simple
r. But the associ
ation rule
s ge
n
e
rated
from the frequ
ent itemset
s
are h
i
gh red
und
an
t.
Espe
cially
wh
en the
supp
ort and
co
nfide
n
ce
are lo
w t
he n
u
mbe
r
of rule
s gen
erated
will in
crea
se
expone
ntially. This re
du
se
s the p
r
edi
ctive ability of th
e rul
e
s gain
e
d
. So it is
essential
to
stu
d
y
comp
act represe
n
tation a
nd find non-redun
dant rul
e
s in the mining pro
c
e
s
s.
The con
c
ept
lattice de
scribes th
e rel
a
tionship bet
ween tra
n
sacti
on an
d attrib
ute and
reflect
s
the g
eneralization
and spe
c
iali
zation betwee
n
con
c
e
p
ts [7, 8]. At
the same time th
e
Ha
sse gra
ph
of the con
c
ep
t lattice is si
mple and
ea
sy to realize. It is more intu
itive to discov
e
r
asso
ciation rules ba
sed
o
n
the Ha
sse grap
h.
But
the effic
i
enc
y
of c
o
ns
truc
ting c
o
nc
ept lattic
e
directly determines the applicability of
mining asso
ci
ation rules. F
u
rtherm
ore, compare with t
h
e
Aprior al
go
rithm, the FP-g
rowth al
go
rith
m onl
y need
scan data
b
a
s
e twice for b
u
ilding a FP-tree
.
And all the freque
nt itemsets a
r
e foun
d
on the tr
ee.
The
candi
dat
e itemset
s
a
r
e also avoid
e
d
.
Ho
wever, th
e
r
e a
r
e exp
o
n
ential g
r
owth
of the a
s
soci
ation rules wi
th the the freque
nt itemsets
increa
sing.
T
he la
rge
am
ount of
re
du
ndan
cy exi
s
ts
bet
ween
rules,
espetial
l
y with the
small
support
and l
o
w
confidence. So the rul
e
s
are diffi
cul
t
to understand and
utilize. To
solve these
probl
em
s and
obtain relia
bl
e asso
ciation
rule
s fo
r the
databa
se, a n
e
w alg
o
rithm
of finding non
-
redu
nda
nt asso
ciation
rul
e
s ba
se
d on
freque
nt
co
nce
p
t set is prop
osed. The alg
o
rith
m is
comp
osed o
f
two sub
-
al
gorithm
s. Th
e one i
s
DFCSA (Discover Frequ
e
n
t Con
c
ept
Set
Algorithm
) to build Hasse
grap
h of the frequ
ent it
emsets on the ba
sis of the FP-tree. Anothe
r is
NAREA
(No
n
-redu
ndant
Asso
ciation
Rul
e
s Extrac
tion
Algo
ri
thm) to
gai
n no
n-red
u
n
dant
asso
ciation rules fro
m
Hasse graph.
In the bu
ildi
ng pro
c
e
s
s
of the grap
h
,
the prunin
g
is
Evaluation Warning : The document was created with Spire.PDF for Python.
TELKOM
NIKA
e-ISSN:
2087
-278X
Ming No
n-red
unda
nt Asso
ciations F
r
om
the Fre
que
nt Con
c
e
p
t Sets on FP-tree
(Wan
g Hui
)
3605
compl
e
ted
si
multaneo
usly
. All the information a
bout
gene
ratin
g
a
s
soci
ation
rul
e
s
ca
n be
fo
und
in the Ha
sse grap
h of the frequ
ent itemsets.
Here, it is cl
ear that the
advantag
e of
the
cla
s
sical
FP-growth
a
l
gorithm i
s
in
tegrated
into the con
s
tructin
g
pro
c
e
ss of freq
uent
con
c
ept set
s
for the new a
l
gorithm. The
non-re
dund
a
n
t
asso
ciation rules gain
ed on
t
he
Ha
sse graph
are
ensured by
rule scre
eni
ng.
After
no
n-
redu
nda
nt extraction th
e rules
will be
more u
nde
rstanda
ble. The
simulatio
n
s
sho
w
that th
e
algorith
m
is i
n
tuitive and e
fficient.
2. Relate
d Concep
ts
The related
con
c
e
p
ts an
d definition
s
of
associ
ation rul
e
s
an
d co
ncept la
ttice are
descri
bed a
s
follows.
Defini
tion 1
[7] Giving a
backg
rou
nd
as
(,
,
)
TD
I
R
, it is the group
with three eleme
n
ts.
Whe
r
e,
D
is
the trans
ac
tion
s
e
ts
.
I
is the attribute sets.
R
is a relation an
d
R
DI
. If there is
only one p
a
rtial orde
r rel
a
tion to gen
erate the l
a
ttice
stru
cture, the ba
ckg
ro
und i
s
call
ed
as
c
o
nc
ept lattic
e
.
Defini
tion 2
[7] The
nod
e of lattice
L
is a
ordered
pairs
and
e
x
presse
d a
s
,
XY
.
Whe
r
e,
X
is
a
colle
ction
of t
r
an
sa
ction
s
a
nd
calle
d the
extensi
on.
Y
is the
comm
on
attribute
of
all inst
an
ce
s i
n
X
and called t
he co
nnotatio
n. Each pai
r is co
mplete, e
x
presse
d as
()
,
()
,
XY
x
D
y
Y
x
R
y
YX
y
I
x
X
x
R
y
(1)
Defini
tion 3
[9] Let item s
e
t
1
I
I
,
and th
e sup
p
o
r
t of
1
I
in the tran
sa
ction sets
D
is
expre
s
sed a
s
11
()
/
Suppor
t
I
t
D
I
t
D
(2)
whe
r
e, the su
pport is the p
e
rcentag
e of affairs in
D
, which contain
1
I
.
Defini
tion 4
[9] The
con
f
idence of a
s
soci
ation
rul
e
12
()
I
I
whi
c
h i
s
d
e
fined in th
e
attribute set
I
and tran
sa
ctio
n sets
D
, is expresse
d as
12
1
2
1
()
(
)
/
(
)
C
onf
i
d
e
n
c
e
I
I
Su
ppor
t
I
I
S
u
ppor
t
I
(3)
whe
r
e,
12
,
I
II
.
12
II
. The confid
en
ce
is the ratio
of affairs n
u
mbe
r
whi
c
h
is
respe
c
tively inclu
ded in
12
I
I
an
d
2
I
.
Defini
tion 5
[9] The St
rong A
s
soci
ation
Rule
(S
AR) i
s
th
e a
s
soci
ation
rul
e
which
sat
i
sf
ie
s wit
h
M
in
s
u
p
(Min
-sup
port) an
d
M
i
n
c
onf
(Min
-co
n
fiden
ce
)
in
D
and
I
. When SAR is
a
nonem
pty set
,
it is
calle
d f
r
equ
ent item
sets. If
any e
l
ement
of SAR d
o
e
s
n’t
co
ntained
by th
e
others, it is called the max
i
mum frequ
en
t item sets.
3. Mining Non-Redund
an
t Ass
o
ciatio
n Rule
In the mi
ning
pro
c
e
s
s for a
s
soci
ation
rul
e
s, the
rule
s
must
be
sati
sfied with
the
minimum
sup
port th
re
shold. Compa
r
ed with
the
specifi
c
tra
n
sa
ction
s
contai
ned by th
e freque
nt itemset,
the su
ppo
rt calcul
ation o
n
l
y
con
c
e
r
ne
d
about the
qu
antity of these tran
sa
ction
s
. So the
sp
e
c
ific
informatio
n
containe
d by
the exten
s
io
n
X
of the
co
nce
p
t
,
XY
is ignored. Here,
X
is re
placed
by the cardi
n
al numbe
r
X
.
X
mean
s the nu
mber of the tr
ansactio
n
s in
volved by ite
m
set
Y
. The
c
o
nc
ep
t
,
X
Y
be
come
s
con
c
ept qua
ntified. This
con
c
ept is mo
re
con
c
i
s
e an
d ea
sier to
cal
c
ulate
sup
port for mini
n
g
asso
ciation
rule.
More
o
v
er, becau
se
of supp
ort thre
shol
d’s li
mit,
most of
Ha
sse gra
p
h
s
lo
se
the
structu
r
e
of lattice. Because all
sub
s
ets of the m
a
ximum frequ
e
n
t
are still fre
q
u
ent itemset
s
. So the Hasse
graph of
fre
q
uent itemset
s
contain all freque
nt con
c
e
p
t.
Non
-
re
dun
d
ant asso
ciati
on rule
s
can
be found o
n
this g
r
aph.Buil
ding Hasse g
r
aph i
s
the m
o
st
importa
nt ste
p
at beginin
g
.
Evaluation Warning : The document was created with Spire.PDF for Python.
e-ISSN:
2087
-27
8
X
TELKOM
NIKA
Vol. 11, No
. 7, July 2013
: 3604 – 361
0
3606
3.1. Discov
e
r Frequen
t
Concep
t Sets
Algorithm (DFCSA)
Acco
rdi
ng to
Definition1
and Definition 2, DFCSA
algorithm u
s
es con
c
ept n
ode an
d
build
s Hasse
grap
h. But the Hsa
ae g
r
aph i
s
ge
nerated on th
e
basi
s
of the
FP-tree
whi
c
h is
con
s
tru
c
ted b
y
the classica
l FP-gro
wth a
l
gorithm.
Con
s
ide
r
ing the
Ha
sse gra
ph
with the value
o
f
the 1 freque
n
t
itemsets, the layers
(2
)
i
Li
are g
enerated by indexing
Ht
abl
e (He
ade
r-ta
b
le) of the
FP-growth
al
gorithm.
Oth
e
r n
ode
s
are sele
ct
ed
b
y
comp
ari
n
g
with th
e mi
nimum
su
pp
ort
threshold.
S
o
ea
ch
no
de i
s
frequ
ent. And the
c
onn
otation of
ea
ch
node
is fre
q
u
ent item
set. T
h
e
maximum fre
quent item se
ts are compo
s
ed of the
co
nnotation
s
of leaf node
s. At the same time,
th
e
Su
b-
Ha
sse
w
i
th fr
eq
u
e
n
t
n
o
d
e
va
lue 1
is
no
t
c
r
oss
ea
ch
o
t
he
r
.
T
h
e co
ns
tr
uc
tin
g
p
r
oc
ess is
descri
bed b
e
l
o
w.
Input: transaction databa
se
D
, minimum su
pport threshol
d
M
in
s
u
p
-
c
o
u
n
t
.
Output: the Hasse dia
g
ra
m
of frequent
concept set which i
s
co
rrespondi
ng to
D
.
Step 1: th
rou
gh
scanni
ng t
he d
a
taba
se
D
o
n
ce, th
e
1 fre
quent ite
m
sets a
r
e
ge
nerated.
The supp
ort
cou
n
t numb
e
r is
re
co
rd
ed. And the
n
, 1 frequ
en
t items list
F
T
is obtai
ned b
y
desce
nding t
he count nu
mber
of t
he frequ
ent item
s. Let the nu
mber
of freq
uent item set with
value 1 is
N
.
Step 2: Con
s
tru
c
ting the
Htable a
nd t
he FP-tree o
f
F
T
. Each nod
e of the FP-tree i
s
con
s
i
s
ted of node n
a
me, n
ode count nu
mber, no
de-li
nk an
d pointe
r
of pare
n
t no
de [6].
Step 3: Th
e
root
of
0
L
la
yer no
de i
s
cre
a
ted
directly, which
is marke
d
a
s
,
D
.
A
cco
rdi
ng t
o
F
T
, the
1
L
layer i
s
created. It
s
node
is expre
s
sed
as
,{
}
ii
A
A
. Wh
ere,
()
i
Ai
N
wa
s
freque
nt items in
F
T
,
i
A
is the su
pport count n
u
mbe
r
of
i
A
.
Step 4:
1
i
. Based on
the
Htable’
s item o
r
der, e
a
ch no
des
i
A
of FP-tre
e is
re
spe
c
tively
execute
d
by depth-fi
rst se
arching.
Step 5: If
N0
, turn to Step 6 else Step 8.
Step 6: If
.
i
An
o
d
e
l
i
n
k
, th
en gen
erate
Sub-h
a
sse g
r
aph of the no
de
i
A
,else Step
7.
Step 7:
1
ii
,
1
NN
,turn to Step5. // Layer
(1
)
j
Lj
of Hasse diag
ram i
s
gen
erat
ed
in above ste
p
s
.
Step 8: Joi
n
t
he no
de
s wit
h
cove
r
relati
ons
and
outp
u
t the Hasse
grap
h
corre
s
pondi
ng
to
D
and sup
p
o
r
t threshold.
Acco
rdi
ng to
the Ha
sse
di
agra
m
of the
frequ
ent con
c
ept con
s
tructed
with
th
e method
descri
bed a
b
o
ve, the freq
uent itemsets with val
ue 1
were co
nsi
d
ered
duri
ng the co
nst
r
u
c
ting
pro
c
e
ss of F
P
-tree. Fo
r al
l transa
c
tion
s cont
ain
ed by
each freque
nt con
c
ept h
a
ve been
sorted
comp
ari
ng
wi
th the supp
ort numbe
r, the
relate
d no
de
s of the
Ha
sse-diag
ram’
s
1
L
layer a
ppea
r
orde
rly. And t
he in
ner no
d
e
s
above
are
no-rep
eat,
th
e Sub
-
Hasse
diagram
of freque
nt item
sets
with value 1 can be ge
nera
t
ed indep
end
ently.
In order to v
e
rify the
effe
ctivene
ss of
DFCSA, th
e
sampl
e
d
a
tab
a
se
an
d the
minimum
sup
port
co
un
t numb
e
r i
s
same
as the
referen
c
e
[10]
. The
datab
a
s
e i
s
sh
own i
n
table
1
b
e
lo
w.
The minim
u
m
sup
port
cou
n
t
numbe
r is 2.
At firs
t the Htable an
d FP-t
ree of the
sa
mple data
b
a
s
e
are getted b
y
FP-gro
wth algorith
m
an
d sho
w
n
in
Figure 1. The Ha
sse dia
g
ram of freq
uent
con
c
e
p
t corre
s
po
ndin
g
with
FP-tree is
sh
own in Fig
u
re
2.
Seen from th
e Figure 2, the Ha
sse gra
p
h
abov
e isn’t
lattice stru
ctu
r
e already. But to
abtain the n
o
n
-redu
ndant
asso
ciation
rules
will be
e
a
sie
r
with thi
s
stru
cture. And the compa
r
i
s
on
of node
s’ co
nnotation
with dire
ct and
indire
ct rela
tionshi
p will be executed
by bottom-up
prin
ciple.
Table 1. Sample datab
ase
TID
Transaction
item
sets
1 ABC
2 ACD
4 ABCD
5 A
6 BCD
Evaluation Warning : The document was created with Spire.PDF for Python.
TELKOM
NIKA
e-ISSN:
2087
-278X
Ming No
n-red
unda
nt Asso
ciations F
r
om
the Fre
que
nt Con
c
e
p
t Sets on FP-tree
(Wan
g Hui
)
3607
Figure 1. A freque
nt tree o
f
the sample.
Figure 2. Fre
quent con
c
ep
t Hasse dia
g
ram
.
3.2. Non-r
e
d
undan
t
As
so
ciation Ru
le
s Extrac
tion
Algorithm (NARE
A)
After gen
erati
ng the
Hasse
diag
ram
of f
r
equ
ent
con
c
ept set, extra
c
ting
rule
s
be
come
s
easi
e
r.
Con
s
iderin
g the
Ha
sse dia
g
ram an
d letti
ng the
Mini
mum reliabili
ty as
M
i
n
c
onf
, in
addition
to
ro
ot nod
e the
content of
oth
e
r
nod
es
is freque
nt item
sets. And
the
extensio
n of
the
node i
s
the suppo
rt co
unt numbe
r in tra
n
sa
ction d
a
ta
base
D
. In the
Ha
sse graph,
the leaf node
s
contai
n all th
e maximum f
r
equ
ent item
sets for a
su
p
port threshol
d given. The
asso
ciation
rules
can b
e
obtain
ed throu
gh scannin
g
the cross-laye
rs
except the root node. But the
rules
extract
e
d
can
not be
a
v
oided redu
n
dan
cy with th
e cla
s
sica
l al
gorithm
s. Be
cau
s
e th
e no
n-redu
ndant
rules
will provid
e more inform
ation unde
r the minimum
confid
en
ce threshold give
n
.
The definitions
about no
n-re
dund
ant asso
ciation rule
s are a
s
follows.
Defini
tion 6
Accordi
ng t
o
the rul
e
A
B
and
CD
, if
CD
will
be gai
ned f
r
om
A
B
by some infe
rence rule
s th
at
CD
is a redu
n
dant rule
relat
i
ve to
A
B
.
Defini
tion 7
Let
:
RX
Y
com
e
s f
r
o
m
it
emset
I
.
XY
I
. If there isn’t rul
e
''
'
:
R
XY
in
I
that the rul
e
:
RX
Y
is
calle
d t
he smalle
st
non-re
dund
a
n
t rule
s in i
t
emset
I
. Where,
''
X
YI
,
'
XX
,
'
YY
.
Defini
tion 8
Let
:
RX
Y
come
s f
r
om it
emset
j
I
.
j
X
YI
.
The rule
''
'
:
R
XY
co
mes
from item
set
k
I
(
kj
I
I
).
''
k
X
YI
.
If
'
XX
that rule
''
'
:
R
XY
is call
e
d
stri
ct no
n-redun
dant
rules
relative to
:
RX
Y
.
The d
e
finition
6 sho
w
s th
a
t
there i
s
n
o
cro
s
s b
e
twe
e
n
non
-redu
nd
ant rul
e
s
abo
ut the
transactio
n
d
a
taba
se. Th
e
definition
7
sho
w
s
that minimum no
n-redu
ndant asso
ciation rules
have the
ch
a
r
acte
ri
stics with minimum
antecede
nt a
nd maximu
m
con
s
e
que
nt. So in o
r
de
r
to
avoid
redu
nd
ancy th
e
rule
s that
contain
fewe
r
proj
ect
s
of
ante
c
ed
e
n
t in th
e
sam
e
item
set n
e
e
d
to be cal
c
ul
ated at first. Th
e definition 8
sho
w
s t
hat the rule
s g
ene
rated amo
ng t
he the fre
que
nt
itemset
s
and
its sub
s
et are redu
nda
nt. To
avoid redun
dan
cy
the rules g
enerated by
the
maximum fre
quent itemsets mu
st be cal
c
ulate
d
firstly.
In the
Ha
sse
grap
h, the i
n
tensi
on
of the
l
eaf no
de i
s
t
he maxim
u
m
freque
nt item
set. So
the rule
pY
p
gene
rqted by any leaf node
()
,
(
)
Cs
u
p
Y
Y
p
Y
is the smalle
st non-re
dund
a
n
t
Evaluation Warning : The document was created with Spire.PDF for Python.
e-ISSN:
2087
-27
8
X
TELKOM
NIKA
Vol. 11, No
. 7, July 2013
: 3604 – 361
0
3608
rule. If rule
pY
p
is ture then all
the other
rule
s involving
p
in antecedent
can be d
e
rive
d by
this rule. If th
e rul
e
d
o
e
s
n
o
t meet the
confiden
ce
the
n
the
rule
inv
o
lving
p
in ant
ece
dent
ca
n
be
found by the followin
g
meth
od.
Firstly, the
n
on-red
unda
nt
rul
e
s involvi
ng
p
in ante
c
edent will be
gen
erated b
y
th
e
prop
er su
bse
t
of
Y
. These
rules
are
gain
ed by the up
per la
ye
r of t
he leaf no
de
s of the
Ha
sse
diagram.
Secon
d
ly, the no
n-singl
e
item rul
e
inv
o
lving
p
in ant
ece
dent
will
be g
ene
rated
by
Y
.
For
example,
If rule
A
BC
D
is
not ture then rule
A
BC
D
may be
ture
. The
s
e
rule
s are g
a
ine
d
by the lower l
a
yer of the no
de
({
}
)
,
{
}
Cs
u
p
A
A
.
Here, the rul
e
s ge
nerated
by the descri
bed metho
d
above are no
cro
s
s. So the non-
redu
nda
nt rul
e
s of the fre
quent con
c
e
p
t set quant
if
ied ca
n be completed int
e
ra
ctively by two
pro
c
e
s
ses
a
bove. The al
gorithm of n
on-red
unda
nt
asso
ciation
rule
s extra
c
ti
on is d
e
scrib
ed
belo
w
.
Input: The
Ha
sse g
r
a
ph
of frequ
ent
con
c
ept
set (Let t
he n
u
mbe
r
of leaf n
ode
s is
L
) and
minimum con
f
idence
m
i
nc
of
.
Output: The n
on-red
unda
nt
strong
asso
ciation rule
set
N
SA
R
Step 1: All th
e leaf node
s
()
,
ii
i
Cs
u
p
Y
Y
enter the qu
e
ue
1
Q
.
1,
2
,
,
iL
.
Step 2:
1
i
.
NS
A
R
.
Step 3: If the queu
e
1
Q
is e
m
pty then turn to Step 1
1
. Else
()
,
ii
i
Cs
u
p
Y
Y
1
()
O
u
t
que
ue
Q
,
1
j
,
12
{,
,
,
}
im
Yp
p
p
.
Step 4: If
1
i
Y
then turn to Step 10.
Step 5: If
j
m
then turn to Step 10.
Step 6:
()
/
(
{
}
)
ij
co
n
f
i
d
en
ce
s
u
p
Y
s
u
p
p
.
Step 7: If
c
onf
i
d
e
n
c
e
m
i
nc
of
then
{}
ji
j
N
S
AR
N
S
AR
p
Y
p
,
1
j
j
and turn to Step5.
Else Step 8.
Step 8: Call
Ge
n
-
R
u
l
e
s
-
S
u
b
s
e
t
s
(
,
)
ij
Cp
.
Step 9: Call
Ge
n
-
R
u
l
e
s
(
,
)
ij
Cp
.
Step 10:
1
ii
. If
iL
then Step3.
Step 11: Output
N
SA
R
.
Procedu
re
G
e
n-
R
u
l
e
s
-
S
ubs
e
t
s
(
,
)
ij
Cp
2
(,
(
)
)
i
E
n
Q
u
e
u
e
Q
pa
r
e
nt
C
if
2
()
Q
u
eu
eE
m
p
t
y
Q
then
return
else
2
()
N
Q
ue
u
e
l
e
ng
t
h
Q
for (
1,
,
kk
N
k
)
2
()
i
C
O
ut
que
u
e
Q
i
f
{}
ij
Yp
then
if
ij
Yp
then
()
/
(
{
}
)
ij
co
n
f
i
d
en
ce
s
u
p
Y
s
u
p
p
i
f
(
c
onf
i
d
e
n
c
e
m
i
nc
of
) then
{}
ji
j
NS
A
R
NS
A
R
p
Y
p
els
e
c
a
ll
Ge
n
-
Ru
l
e
s
-
S
u
b
s
e
t
s
(
,
)
ij
Cp
endif
endif
endif
endfor
Evaluation Warning : The document was created with Spire.PDF for Python.
TELKOM
NIKA
e-ISSN:
2087
-278X
Ming No
n-red
unda
nt Asso
ciations F
r
om
the Fre
que
nt Con
c
e
p
t Sets on FP-tree
(Wan
g Hui
)
3609
endif
Procedu
re
Ge
n
-
R
u
l
e
s
(
,
)
ij
Cp
2
(
,
(
s
up
{
}
,
{
}
)
)
jj
En
Queue
Q
C
h
il
d
p
p
if
2
()
Q
u
eu
eE
m
p
t
y
Q
then
return
else
2
()
N
Q
u
e
u
el
en
g
t
h
Q
for (
1,
,
kk
N
k
)
2
()
CO
u
t
q
u
e
u
e
Q
if
i
YY
then
()
/
(
)
i
co
n
f
i
d
en
ce
s
u
p
Y
s
u
p
Y
if (
c
onf
i
d
e
n
c
e
m
i
nc
of
) the
n
{}
i
NS
A
R
NS
A
R
Y
Y
Y
els
e
c
a
ll
Ge
n
-
Ru
l
e
s
(
,
)
i
CY
c
a
ll
G
e
n-
R
u
l
e
s
-
S
u
bs
e
t
s(
,
)
i
CY
endif
endif
endfor
endif
All the non
-re
dund
ant a
s
so
ciation
rul
e
s
c
an be gen
erated
by
the a
l
gorithm abov
e.
The
pro
c
ed
ure
G
e
n-
R
u
l
e
s
-
S
u
bs
e
t
s
ca
lculate
s
the
rule
s with a
singl
e set an
teced
ent of the frequ
ent
itemset
s
. The
procedu
re
Ge
n
-
R
u
l
e
s
produ
ce
s the n
on-singl
e item rule
in
a
n
tece
dent.
Acco
rdi
ng to
figure 2, th
ere
ar
e th
re
e leaf
nod
es su
ch
a
s
,,
A
C
B
A
CD
CB
D
. When
50%
m
i
nc
of
,
the re
sult is
best for n
o
n
-
redun
dant rul
e
s extra
c
tion.
There
are
ni
ne sm
alle
st non-
redu
nda
nt rul
e
s
su
ch
a
s
A
CB
,
A
B
C
,
BA
C
and
so
on.
T
he oth
e
r rul
e
s a
r
e
re
dun
d
ant. If no
n-
redundant extraction doe
sn’t exist, there will be
6
2
3
=
192
rules o
f
this graph.
4. Simulation
To further illu
strate the a
c
cura
cy and effe
ctivene
ss of the NAR
EA on the Ha
sse
graph,
the matlab
si
mulation
of NAREA, FP-G
rowth
and A
p
rior a
r
e
don
e
respe
c
tively. The m
u
shro
o
m
data set of machi
ne learning database UCI (http
://archive.i
cs.uci.edu/ml/)
is chosen as the
simulatio
n
d
a
t
a. There a
r
e
812
4 tra
n
sa
ction
s
a
nd
2
3
prope
rtie
s.
The
data
se
t is p
r
ovide
d
b
y
American
Uni
v
ersity of Cali
fornia. Becau
s
e the mu
sh
room set is de
nse d
a
tasets
the asso
ciatio
n
rule
s of this
datasets a
r
e
redu
nda
ncy
greatly.
The
test of the proce
s
se
s for
gene
rating
n
on-
redu
nda
nt asso
ciation rule
s is do
ne by usin
g th
ree a
l
gorithm
s ab
o
v
e. The simu
lation re
sult is
s
h
ow
n
in
F
i
gu
r
e
3
.
Figure 3. Rul
e
s Extractio
n
Algorithm Co
mp
a
r
e
s
Evaluation Warning : The document was created with Spire.PDF for Python.
e-ISSN:
2087
-27
8
X
TELKOM
NIKA
Vol. 11, No
. 7, July 2013
: 3604 – 361
0
3610
Figure 3
sh
o
w
s that the
n
u
mbe
r
of a
ssociat
io
n rule
s gaine
d by
Apriori
an
d F
P
-gro
wth
are
same. So
two cu
rves a
l
most re
peat i
n
the Fi
gure 3. In contra
st
, under the
same co
nfiden
ce
NAREA al
gorithm rem
o
ves red
und
an
cy and
red
u
ces t
he nu
mbe
r
of
mining
rul
e
s.
This imp
r
ove
s
the re
ada
bility and
com
p
rehen
sibility o
f
the rul
e
s. T
herefo
r
e,
th
e
NAREA alg
o
rithm ha
s
hi
gh
operating efficien
cy of mining.
5. Conclusio
n
In this pap
er,
the DF
CSA and NA
REA
algorith
m
s b
a
se
d on FP
-tree
were p
r
o
posed.
Both algo
rith
ms
are
u
s
ed
to extra
c
t no
n-redu
ndant
asso
ciation
rules.
Th
e p
r
oce
s
s fo
r p
r
u
n
ing
bran
ch i
s
ex
ecute
d
syn
c
hron
ou
sly du
ring
con
s
tru
c
ting Ha
sse
grap
h of fre
quent con
c
e
p
ts.
Con
s
id
erin
g the sub
-
tree
correspon
ding
to 1 freq
uent
item, the algo
rithm redu
ce
s the compa
r
in
g
cou
n
t bet
we
en the
seq
u
ence n
ode
s.
At the
sa
me time, th
e con
c
ept
o
f
non
-re
dun
d
ant
asso
ciation rules have b
e
en put forwa
r
d throug
h
different form
s. And the Hasse graph of the
freque
nt co
n
c
ept
contai
ns all inform
ation for
ex
tracting non
-redu
ndant a
s
so
ci
ation rul
e
s. It
is
sho
w
n that the DF
CSA a
nd NAREA a
l
gorithm
s ar
e
effective and efficient by simulating a
n
d
comp
ari
ng wi
th other algo
ri
thm.
Ackn
o
w
l
e
dg
ements
The pa
pe
r is supp
orted
by the tea
c
hin
g
pr
oje
c
t of the Peo
p
le’s Pu
blic
Secu
rity
University of China named as the feasi
b
ility research for network real name.
Referen
ces
[1]
Liu
Bin,
Qiu
H
u
a
y
o
ng, S
hen
Yizhe
n
. Re
al
iz
ation
a
nd A
p
p
l
i
c
ation
of C
u
sto
m
er Attrition
E
a
rl
y Warn
in
g
Mode
l in Sec
u
r
i
t
y
Com
p
a
n
y
.
T
E
LKOMNIKA Indo
nesi
an Jo
u
r
nal
of El
ectrical Eng
i
n
eeri
n
g
.
2012; 1
0
(5):
110
6-11
10.
[2]
Liju
an
Z
hou,
H
u
i W
a
n
g
, W
e
n
bo W
a
ng. Par
a
llel
Implem
enta
t
ion
of Cl
assifi
cation
Alg
o
rith
ms Base
d o
n
Clou
d
C
o
mp
uti
ng Env
i
ro
nme
n
t.
T
E
LKOMNIKA Indo
nesi
a
n
Journ
a
l
of El
ectrical E
ngi
ne
erin
g
. 20
12;
10(5): 10
87-
10
92.
[3]
A
w
a
d
Ali, M
o
a
w
i
a
E
l
faki,
Da
yang
Nor
h
a
y
at
i. Usin
g
Naïve B
a
yes
and
Ba
ye
sian
Net
w
o
r
k f
o
r Pre
d
ictio
n
of Potentia
l
Probl
ematic
Cases i
n
T
ubercu
losis.
In
ternatio
nal J
o
urna
l of Info
rmatics
and
Co
mmun
icati
o
n T
e
chno
lo
gy (IJ-ICT
)
. 2012; 1(2): 63-7
1
.
[4]
Agra
w
a
l R, I
m
ieli
ń
ski T
,
Sw
ami A.
Mi
ni
ng Assoc
i
ati
o
n Ru
les
betw
een S
e
ts of I
t
ems
in
Larg
e
Databases
. Procee
din
g
s of the 19
93 ACM
SIGMOD
In
ternatio
nal C
onf
erenc
e on Ma
nag
ement of
Data. Ne
w
Yor
k
. 1993: 20
7-2
16.
[5]
Agra
w
a
l R, Sh
fer JC. Parall
e
l
Mini
ng of As
sociati
on R
u
le
s. I
EEE Transactions
on
Knowledge and
Data Engineering
. 199
6; 8(6): 962-
969.
[6]
Han J, Pei J, Y
i
n Y.
Mini
ng F
r
equ
ent Pattern
s W
i
thout Can
d
id
ate Gener
at
ion
. Proc
eed
in
g of the 20
0
0
ACM SIGMOD
Internation
a
l C
onfere
n
ce o
n
Mana
geme
n
t o
f
Data, Ne
w
Y
o
rk. 2000; 29: 1-
12.
[7]
Ganter B, W
ill
e R. F
o
rma
l
Conc
ept An
al
ysis: Ma
themati
c
al F
o
u
n
d
a
tio
n
s. Berli
n
: Spr
i
ng
er-Verl
a
g
.
199
9.
[8]
Wille
R.
Restru
cturing L
a
ttice T
heory: an Ap
proac
h Based
on Hier
a
rch
i
es
of Conce
p
ts.
Procee
din
g
s of
the 7th Internat
ion
a
l Co
nfere
n
c
e on F
o
rmal
Conc
ept Ana
l
ysis, Berlin. 20
0
9
: 314-3
39.
[9]
Mao G, Dua
n
L, W
ang S. Princip
l
es
and A
l
gorit
hm
s of D
a
ta Mini
ng. Be
iJing: T
s
ingh
u
a
Univ
ersi
t
y
Press. 2005.
[10]
Che
n
X, W
u
Y. Minin
g
Ass
o
ciati
ons B
a
se
d
on S
i
mpl
i
fie
d
Co
ncept
Lat
tice b
y
Improv
ed Al
gorithm.
Appl
icatio
n Re
search of Co
mputers
. 201
1; 2
8
(4): 129
3-1
2
9
5
.
Evaluation Warning : The document was created with Spire.PDF for Python.