Indonesian Journal of Electrical Engineering and Computer Science
Vol. 13, No. 2, February 2019, pp. 521~526
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v13.i2.pp521-526

Journal homepage: http://iaescore.com/journals/index.php/ijeecs

A dynamic K-means clustering for data mining

Md. Zakir Hossain¹, Md. Nasim Akhtar², R.B. Ahmad³, Mostafijur Rahman⁴
¹,²Department of Computer Science and Engineering, Dhaka University of Engineering and Technology, Bangladesh
³Faculty of Informatics and Computing, Universiti Sultan Zainal Abidin (UniSZA), Malaysia
⁴Department of Software Engineering, Daffodil International University (DIU), Bangladesh

Article Info
Article history:
Received Sep 25, 2018
Revised Nov 24, 2018
Accepted Dec 8, 2018

ABSTRACT
Data mining is the process of finding structure in large data sets. With this process, decision makers can make particular decisions for the further development of real-world problems. Several data clustering techniques are used in data mining to find specific patterns in data. The K-means method is one of the familiar clustering techniques for clustering large data sets. The K-means clustering method partitions the data set based on the assumption that the number of clusters is fixed. The main problem of this method is that if the number of clusters is chosen to be small, then there is a higher probability of adding dissimilar items into the same group. On the other hand, if the number of clusters is chosen to be high, then there is a higher chance of adding similar items into different groups. In this paper, we address this issue by proposing a new K-Means clustering algorithm. The proposed method performs data clustering dynamically. It initially calculates a threshold value as a centroid of K-Means, and the clusters are formed based on this value. At each iteration of K-Means, if the Euclidean distance between two points is less than or equal to the threshold value, then these two data points will be in the same group. Otherwise, the proposed method will create a new cluster with the dissimilar data point. The results show that the proposed method outperforms the original K-Means method.

Keywords: Centroid, Clustering, Data mining, Euclidean distance, K-Means, Threshold value

Copyright © 2019 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:
Md. Zakir Hossain,
Department of Computer Science and Engineering,
Dhaka University of Engineering and Technology,
Gazipur, Bangladesh.
Email: zakircse11.duet@gmail.com

1. INTRODUCTION
Data mining is a new interdisciplinary field of computer science. It is the process of automatically finding patterns in large databases [1]. The need for data mining has grown steadily over the past ten to fifteen years. In today's highly competitive marketplace, efficient and rapid access to information plays an important role in planning decisions and is of great value to industry and society alike. In the real world, a large amount of data is available, from which it is difficult to retrieve useful information. Because of this practical importance, the structure of the data must be retrieved within a given time budget. Data mining provides a way of eliminating unnecessary noise from data. It helps to extract the necessary information from a large data set and present it in a proper form when it is needed for a specific task. It is very helpful for analyzing market trends, searching for new technology, controlling production based on customer demand, and so on. In a word, data mining is the harvesting of knowledge from a large amount of data. We can predict the type or behavior of any pattern using data mining.
Cluster evaluation is an important task in knowledge discovery and data mining. Cluster formation is the process of creating groups of data from a large data set based on the similarities in the data. The clustering process is carried out in a supervised, semi-supervised, or unsupervised manner [2].
Clustering algorithms are powerful meta-learning tools for analyzing the data produced by modern applications. The purpose of clustering is to classify the data into groups according to the similarities, traits, and behavior of the data [3]. Many clustering algorithms have been proposed for the classification of data. Most of these algorithms are based on the assumption that the number of clusters in a large data set is fixed. The problem with this assumption is that if the assumed number of clusters is small, then there is a higher chance of adding dissimilar items into the same group. On the other hand, if the number of clusters is large, then there is a higher chance of placing similar data into different groups [4]. In addition, in a real situation it is difficult to know the number of clusters in advance.
In this paper, we develop a dynamic K-Means clustering algorithm. This algorithm first calculates a threshold value based on the data set and then groups the data set without fixing the number of clusters (K). The proposed algorithm analyzes the data set based on the threshold value, and finally the data set is clustered. The threshold value is the key to the proposed method: it determines whether a data point joins an existing group or creates a new group.

2. THE K-MEANS CLUSTERING ALGORITHM
In this section, we first describe the K-Means algorithm; the details of the proposed algorithm are provided in Section 4. The K-Means clustering algorithm is a popular algorithm that works for various types of data, such as medical images, text, and so on. The performance of the clustering algorithm depends on the initial centroids of K-Means. If the centroids are selected poorly, the clustering result is unstable and the number of iterations increases. Therefore, both the time and space complexity increase proportionally [5].
The K-Means algorithm is a simple and widely used clustering technique in data mining. It is an unsupervised learning algorithm that is used to solve the well-known clustering problem [6]. Partition-based clustering is a way to cluster large data sets in which a number of objects are given first, and these objects are then partitioned into a number of groups such that each group contains similar data points [7]. The K-means algorithm classifies the data into K different clusters through an iterative, converging process. The clusters generated by K-Means are independent. The K-Means clustering algorithm works in two parts. First, it selects a value of K, where K is the number of clusters. The second part assigns each data point to the nearest center [8]. After completing the first step, the Euclidean distance between each data point and the K centroids is calculated. All the data points are then used to form the groups. This process continues until a minimum is reached.
The K-Means algorithm is given below. Here, K is the number of clusters and D is the data set, which contains n data objects.
Step-1: Select K data objects from D as the initial cluster centers.
Step-2: Repeat Step-3 and Step-4 until the cluster centers remain unchanged.
Step-3: Calculate the distance between each data object d_i, where i = 0, 1, 2, …, n-1, and all K cluster centers c_j, where j = 0, 1, 2, …, K-1. Assign data object d_i to the nearest cluster.
Step-4: For each cluster j, recalculate the cluster center.
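
The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `kmeans`, the `max_iter` and `seed` parameters, the random choice of initial centers, and the handling of empty clusters are all assumptions made here for the example.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-Means on a list of equal-length numeric tuples (illustrative)."""
    rng = random.Random(seed)
    # Step-1: pick k data objects as the initial cluster centers
    # (random selection is an assumption of this sketch).
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Step-3: assign every data object to its nearest center
        # using the Euclidean distance.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Step-4: recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                # Assumption: an empty cluster keeps its previous center.
                new_centers.append(centers[j])
        # Step-2: stop once the centers remain unchanged.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels
```

Calling `kmeans(points, 2)` on two well-separated groups of 2-D tuples returns two centers and one label per point. The user still has to supply `k`, which is the limitation the paper sets out to address.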
In the K-Means clustering result, the data points within each group are close to one another. In the K-Means algorithm, the data groups are created by calculating the distance between each centroid and each data point, and this process is repeated a number of times until every data point is cleanly grouped [9]. The time complexity of the K-Means clustering algorithm is therefore O(mkt), where 'm' is the number of data points, 'k' is the number of initial centroids, and 't' is the number of iterations [10].

3. RELATED WORK
In this section, we give a brief discussion of the existing K-means algorithms.
In [2], a modified K-Means algorithm is proposed that selects the initial cluster centers so as to reduce the sensitivity to initialization. This algorithm divides the whole space into segments and calculates the frequency of data points in each segment. The segment with the maximum frequency of data points determines the centroid. In this method, the number K is defined by the user, as in the traditional K-means algorithm. For this algorithm, the number of divisions is k*k, that is, 'k' vertically as well as 'k' horizontally.
In [10], an improved k-means algorithm is proposed. In this algorithm, information about the data structure is stored in each iteration and used in the next iteration. The method thus avoids repeatedly calculating the distance between each data point and the cluster centers, which saves running time.
In [11], an optimized k-means clustering method named k*-means is proposed, based on three optimization principles. First, a hierarchical optimization principle initializes k* cluster centers (k* > k) to
reduce the risk of random seed selection. Second, a cluster pruning strategy is proposed to improve the efficiency of k-means. Finally, it implements an optimized update rule to speed up the k-means iteration updates.

4. PROPOSED METHOD
Our proposed method dynamically clusters all data from a large data set without specifying the value of K, where K is the number of clusters. K-Means first selects the value of K and then starts clustering based on that value. However, selecting this value in advance is a difficult task, and for this reason the quality of the K-means clustering result can become poor. Our proposed method clusters a large data set based on a threshold value, and the quality of the clustering result is improved.

$$Th = \frac{1}{N \times N} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \operatorname{dist}(x_i, x_j) \tag{1}$$

$$\min_{i=0,\, j=0}^{N-1} \operatorname{dist}(x_i, x_j) \tag{2}$$

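
As a rough illustration of the threshold in (1), the following sketch averages the Euclidean distance over all N × N ordered pairs of points, with the zero distances on the diagonal included. The function name `pairwise_threshold` is hypothetical, and the N × N denominator is an assumption about how (1) is normalized.

```python
import math


def pairwise_threshold(points):
    """Mean Euclidean distance over all N x N ordered pairs of points.

    Assumption: the sum in (1) is divided by N * N, so the N zero
    self-distances are included in the average.
    """
    n = len(points)
    total = sum(math.dist(p, q) for p in points for q in points)
    return total / (n * n)
```

For the three collinear points `(0, 0)`, `(3, 4)`, and `(6, 8)`, the pairwise distances are 5, 10, and 5, so the function returns 40/9, roughly 4.44.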
Our proposed algorithm is given below. Here, 'D' (d_1, d_2, …, d_n) is the data set, 'n' is the number of data points, 'K' is the number of clusters, 'X' (x_1, x_2, x_3, …, x_n) are the data points, 'Th' is the threshold, and 'c' is a cluster center.
Step-1: Calculate the distance matrix dist(x_i, x_j), where i = 0, 1, 2, …, N-1 and j = 0, 1, 2, …, N-1.
Step-2: Calculate the threshold value Th using (1).
Step-3: Find the minimum mean distance from x_i to x_j using (2).
Step-4: Find the index i of the minimum mean value. Select the i-th data point x_i as the first centroid.
Step-5: Repeat Step-6 and Step-7 while data points change groups; otherwise go to Step-8.
Step-6: Calculate the distance d_i between each data point x_i and all K cluster centers c_j.
    if (Th >= d_i)
        Assign data point x_i to the nearest cluster.
    else
        K = K + 1; (create a new cluster with x_i)
Step-7: Recalculate each cluster center.
Step-8: End.
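
The steps above leave some details open, so the following Python sketch is one possible reading rather than the authors' C++ or MapReduce implementation. It assumes that Step-3 and Step-4 pick the point with the smallest mean distance to all other points, that d_i in Step-6 is the distance from x_i to its nearest center, and that a point farther than Th from every center becomes a new center. The function name `dynamic_kmeans` and the `max_iter` cap are hypothetical.

```python
import math


def dynamic_kmeans(points, max_iter=100):
    """Threshold-driven clustering; K is not supplied by the caller."""
    n = len(points)
    # Step-1: full N x N Euclidean distance matrix.
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Step-2: threshold as the mean of all matrix entries (assumed form of (1)).
    th = sum(map(sum, dist)) / (n * n)
    # Step-3/4 (assumed reading): the point with the smallest mean
    # distance to all points becomes the first centroid.
    mean_dist = [sum(row) / n for row in dist]
    first = min(range(n), key=mean_dist.__getitem__)
    centers = [tuple(points[first])]
    labels = [-1] * n
    for _ in range(max_iter):  # Step-5
        new_labels = []
        for p in points:  # Step-6
            j = min(range(len(centers)),
                    key=lambda c: math.dist(p, centers[c]))
            if math.dist(p, centers[j]) > th:
                # Too far from every center: open a new cluster (K = K + 1).
                centers.append(tuple(p))
                j = len(centers) - 1
            new_labels.append(j)
        # Step-7: recompute each non-empty cluster center as the member mean.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
        # Stop when no data point changed its group.
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels, th
```

On two well-separated groups of points this sketch ends with two centers without being told K. A cluster that loses all of its members keeps its last center in this sketch, which is a simplification.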

5. EXPERIMENTAL RESULT AND ANALYSIS
5.1. Experimental Setup
We simulated our proposed method using MATLAB, Java, MapReduce, and C++ on a personal computer with 4 GB of RAM and a 2.4 GHz Core i5 processor. The proposed method was first developed in C++ and then converted into Java MapReduce. The output of the proposed method was loaded into MATLAB to visualize the clusters and the positions of the data points. We then implemented the general K-Means algorithm to compare it with our proposed method.
5.2. Result Analysis
In the result analysis, we compare the proposed method with general K-Means clustering based on several parameters: inter-cluster distance, intra-cluster distance, and sum of square error (SSE). If the SSE and the intra-cluster distance are at a minimum, then the quality of the clusters is good. Likewise, if the inter-cluster distance is at a maximum, then the quality of the clusters is good.
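
The paper does not spell out how these three quantities are computed, so the sketch below uses common definitions as an assumption: SSE as the sum of squared distances from each point to its assigned center, intra-cluster distance as the sum of those distances unsquared, and inter-cluster distance as the sum of distances over all pairs of centers. All three function names are hypothetical.

```python
import math
from itertools import combinations


def sum_squared_error(points, labels, centers):
    """Sum of squared Euclidean distances from each point to its center."""
    return sum(math.dist(p, centers[lab]) ** 2
               for p, lab in zip(points, labels))


def intra_cluster_distance(points, labels, centers):
    """Sum of Euclidean distances from each point to its own center."""
    return sum(math.dist(p, centers[lab])
               for p, lab in zip(points, labels))


def inter_cluster_distance(centers):
    """Sum of Euclidean distances over all unordered pairs of centers."""
    return sum(math.dist(a, b) for a, b in combinations(centers, 2))
```

Under these definitions, lower SSE and intra-cluster distance mean tighter clusters, and a higher inter-cluster distance means the clusters are farther apart.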
For the result analysis, we generated several data sets using Java. The values in these data sets range from 0 to 100, and the numbers of data points are 100, 200, 300, 400, 500, and 1000. We also use the iris data set. Table 1 shows the results for the iris data set and compares the proposed method with K-Means clustering based on the sum of inter-cluster distance and the sum of square error. In the iris data set, Iris setosa and Iris versicolour each have 50 instances.

Table 1. Result for Iris Data Set
| Data point group | Proposed method: # of clusters (K) | Proposed method: sum of inter-cluster distance | Proposed method: sum of square error | General K-Means: # of clusters (K) | General K-Means: sum of inter-cluster distance | General K-Means: sum of square error |
|---|---|---|---|---|---|---|
| Iris setosa (petal length) (petal width) | 6 | 5.97 | 4.71 | 6 | 3.25 | 7.20 |
| Iris versicolour (sepal length) (sepal width) | 3 | 3.02 | 14.74 | 3 | 2.48 | 18.27 |
| Iris versicolour (petal length) (petal width) | 3 | 2.41 | 10.74 | 3 | 2.25 | 12.28 |

Figure 1(a) shows the comparison between the K-Means algorithm and the proposed algorithm based on the sum of inter-cluster distance. When the proposed algorithm is applied to Iris setosa, it dynamically creates six data groups based on similarity, so the sum of inter-cluster distance is higher. With the K-Means algorithm, the sum of inter-cluster distance is lower, as shown in Figure 1(a). Figure 1(b) shows the comparison between the K-Means algorithm and the proposed algorithm based on the sum of square error. When the proposed algorithm is applied to the iris data sets, the sum of square error is lower. With the K-Means algorithm, the sum of square error is higher, as shown in Figure 1(b). The results for our generated data sets are given in Table 2.

Figure 1. (a) Sum of inter-cluster distance for the iris data set, (b) Sum of square error for the iris data set

Table 2. Result for Generated Data Set
| Number of data points | Proposed method: # of clusters (K) | Proposed method: sum of inter-cluster distance | Proposed method: sum of square error | General K-Means: # of clusters (K) | General K-Means: sum of inter-cluster distance | General K-Means: sum of square error |
|---|---|---|---|---|---|---|
| 100 | 4 | 2.02 | 1.003 | 3 | 0.84 | 1.066 |
| 200 | 3 | 0.924 | 32.01 | 3 | 0.867 | 32.312 |
| 300 | 6 | 0.985 | 61.534 | 6 | 0.897 | 51.987 |
| 400 | 4 | 1.717 | 73.964 | 4 | 0.881 | 34.716 |
| 500 | 4 | 1.686 | 4.784 | 4 | 0.808 | 45.858 |
| 1000 | 8 | 2.78 | 3.38 | 6 | 1.48 | 5.1 |

Table 2 shows the results for our generated data sets and compares the proposed method with K-Means clustering based on the sum of inter-cluster distance and the sum of square error. The values in the generated data sets range from 0 to 100, and the data sets contain 100, 200, 300, 400, 500, and 1000 instances, respectively.
Figure 2(a) shows the comparison between the K-Means algorithm and the proposed algorithm based on the sum of inter-cluster distance using our generated data sets. Figure 2(a) shows that as the number of data points increases, the sum of inter-cluster distance increases for our proposed method, so the data points are grouped
efficiently. With the K-Means algorithm, the sum of inter-cluster distance decreases. Figure 2(b) shows the comparison between the K-Means algorithm and the proposed algorithm based on the sum of square error. Figure 2(b) shows that the sum of square error decreases for our proposed method. With K-means, the sum of square error increases, so its cluster quality is poor.

Figure 2. (a) Sum of inter-cluster distance for our generated data set, (b) Sum of square error for our generated data set

6. CONCLUSION
In this paper, we propose a new K-Means algorithm to remove the difficulties of the existing K-Means algorithm. The proposed method dynamically forms the clusters for a given data set. We compare our proposed method with the existing K-Means algorithm. The results show that the proposed method outperforms the existing method for the well-known iris data set.

REFERENCES
[1] S. Sharma, J. Agrawal, S. Agarwal, S. Sharma, "Machine Learning Techniques for Data Mining: A Survey", in IEEE International Conference on Computational Intelligence and Computing Research, 2013.
[2] R. V. Singh, M.P.S. Bhatia, "Data Clustering with Modified K-means Algorithm", in IEEE-International Conference on Recent Trends in Information Technology (ICRTIT 2011), June 2011.
[3] V. W. Ajin, L. D. Kumar, "Big data and clustering algorithms", in International Conference on Research Advances in Integrated Navigation Systems (RAINS), May 2016.
[4] A. Shafeeq. B. M, Hareesha. K. S, "Dynamic Clustering of Data with Modified K-Means Algorithm", in International Conference on Information and Computer Networks (ICICN 2012), IPCSIT vol. 27, IACSIT Press, Singapore, 2012.
[5] L. Guoli, W. Tingting, Y. Limei, "The improved research on k-means clustering algorithm in initial values", in International Conference on Mechatronic Sciences, Electric Engineering and Computer (MEC), Shengyang, China, 2013.
[6] S. Jigui, L. Jie, Z. Lianyu, "Clustering algorithms Research", in Journal of Software, 2008; 19(1): 48-61.
[7] D. Neha, B. M. Vidyavathi, "A Survey on Applications of Data Mining using Clustering Techniques", in International Journal of Computer Applications (0975 – 8887) Volume 126 – No.2, 2015.
[8] M. Fahim, A. M. Salem, F. A. Torkey, "An efficient enhanced k-means clustering algorithm", in Journal of Zhejiang University Science A, 2006; 10: 1626-1633.
[9] K. A. Abdul Nazeer, M. P. Sebastian, "Improving the Accuracy and Efficiency of the k-means Clustering Algorithm", in Proceeding of the World Congress on Engineering, vol 1, London, July 2009.
[10] L. ShiNa, G. Xumin, "Research on k-means Clustering Algorithm: An Improved k-means Clustering Algorithm", in Third International Symposium on Intelligent Information Technology and Security Informatics.
[11] J. Qi, Y. Yu, L. Wang, J. Liu, "K*-Means: An Effective and Efficient K-means Clustering Algorithm", in IEEE International Conferences on Big Data and Cloud Computing (BDCloud), Social Computing and Networking (SocialCom), Sustainable Computing and Communications (SustainCom), 2016.

BIOGRAPHIES OF AUTHORS

Md. Zakir Hossain received the B.Sc. Engineering degree from the Department of Computer Science and Engineering, Dhaka University of Engineering and Technology (DUET), Gazipur, Bangladesh, in 2015, and he is currently pursuing the M.Sc. Engineering degree in the same department. His research interests include Data Mining, Big Data, AI, Machine Learning, Cloud Computing, Software Engineering, Computer Networks, and IoT. He has presented papers at conferences both at home and abroad.

Md. Nasim Akhtar received the M.Eng. and Ph.D. degrees from the National Technical University of Ukraine, Kiev, Ukraine, and the Moscow State Academy of Fine Chemical Technology, Russia, in 1998 and 2010, respectively. Currently, he is a Professor in the Department of Computer Science and Engineering, Dhaka University of Engineering and Technology (DUET), Gazipur, Bangladesh. His research interests include Distributed Data Warehouse Systems on Large Clusters, Digital Image Processing and Watermarking, Peer-to-Peer Networking, Cloud Computing, and Operating Systems. He has presented papers at conferences both at home and abroad, and has published articles and papers in various journals.

R. Badlishah Ahmad obtained a Bachelor of Engineering with Honors (B.Eng. (Hons)) in Electrical & Electronic Engineering from Glasgow University, UK, in 1994. He continued with a Master of Science (M.Sc.) in Optical Electronic Engineering at the University of Strathclyde, UK, graduating in 1995, and completed his PhD in 2000. His research interests are in Computer and Telecommunication Network Modelling, including WSN and Optical Networks using discrete event simulators (OMNeT++), Optical Networking, and Embedded Systems based on GNU/Linux.

Mostafijur Rahman completed his BSc in Computer Science at the National University of Bangladesh (2003). He pursued his MSc (2009) and PhD (2017) in Computer Engineering at UNIMAP, Malaysia. He worked as a Lecturer from 2009 to September 2017 in the School of Computer and Communication Engineering at UNIMAP. Currently he is serving as an Assistant Professor in the Department of Software Engineering at Daffodil International University (DIU), Bangladesh. His research interests are Software Testing, Multimedia and Creativity in Medical Science, Computer Security, Cloud Computing, Algorithm Optimization, Parallel and Distributed Systems, and Device Drivers for GNU/Linux-based embedded OS.