TELKOMNIKA, Vol. 14, No. 4, December 2016, pp. 1545~1551
ISSN: 1693-6930, accredited A by DIKTI, Decree No: 58/DIKTI/Kep/2013
DOI: 10.12928/TELKOMNIKA.v14i4.3586
Received June 2, 2016; Revised November 6, 2016; Accepted November 21, 2016
The Optimal High Performance Computing Infrastructure for Solving High Complexity Problem

Yuliant Sibaroni*1, Fitriyani2, Fhira Nhita3
1,2,3School of Computing, Telkom University, Bandung, Indonesia, (022) 7564108
*Corresponding author, e-mail: ysibaroni@gmail.com1, fitriyani@telkomuniversity.ac.id2, fhiranhita@telkomuniversity.ac.id3
Abstract
The high complexity of today's problems requires increasingly powerful hardware. In line with economic laws, the more reliable the hardware performance, the higher the price. For high-performance computing (HPC), there are three hardware architectures that can be used, i.e. the Computer Cluster, the Graphical Processing Unit (GPU), and the Super Computer. The goal of this research is to determine the most optimal HPC infrastructure for solving high complexity problems. For this reason, we chose the Travelling Salesman Problem (TSP) as a case study and the Genetic Algorithm as the method to solve TSP. The Travelling Salesman Problem occurs often in real life and has a high computational complexity, while the Genetic Algorithm (GA) is a reliable algorithm for solving complex cases but has the disadvantage of a very high time complexity. In previous research comparing HPC infrastructures, the performance of a multi-core CPU on a single node for data computation has not been evaluated, even though the current development trend leads to PCs with ever higher specifications of this kind. Based on the experimental results, we conclude that the use of GA is very effective for solving TSP. The use of the multi-core single-node in parallel for solving high complexity problems is still better than the two other infrastructures, although only slightly ahead of the multi-core single-node used serially, while the GPU delivers the worst performance compared to the other infrastructures. The utilization of a super computer PC for data computation is still quite promising considering the ease of implementation, while GPU utilization for data computing is profitable only if we use the GPU to support the CPU in data computing.

Keywords: TSP, Genetic Algorithm, cluster, GPU, HPC, multi-core CPU

Copyright © 2016 Universitas Ahmad Dahlan. All rights reserved.
1. Introduction
The requirement for High-Performance Computing (HPC) in various fields is increasingly urgent. In weather, biology, aero, and nuclear reactor power system modeling and simulation, high-performance computing support is needed to obtain results more quickly and precisely [1, 2]. Weather modeling systems that use weather data from entire regions in various parts of the world also require very high computing support. For example, modelling in Coupled AGCMs at 50 km atmosphere and 1 deg ocean has computational complexity at the Terascale, Earth System Models at 10 km atmosphere and 1/10 deg ocean have computational complexity at the Petascale, and, for the most complex problem, full earth system models with a carbon feedback cycle at 1 km atmosphere and 1/100 deg ocean have computational complexity at the Exascale [1].

The current infrastructures that can be used to support high-performance computing are, among others, the Computer Cluster, the Graphical Processing Unit (GPU), and the Super Computer PC. In a computer cluster, a set of computers with a uniform platform is connected to a LAN network so that they can work together to perform parallel computing. The GPU is usually used for heavy graphics processing, but the present trend is to utilize the GPU for data computing because of its advantage of having more than 2000 cores [3]. A super computer PC is basically a computer created by a PC manufacturer with a very high specification. At this time, a super computer PC can have 192 processor cores and deliver 96 GFLOPS [4].

Comparison between these three infrastructures is very interesting. The supercomputer PC has advantages in terms of ease of use, but it has drawbacks in terms of its high price and a limited number of processor cores.
Computer clusters have the flexibility that the number of processor cores can be scaled dynamically according to user needs, while their drawback is the complexity of the cluster development process. The advantage of the GPU is its price compared to the two other infrastructures, while its weaknesses are a more complicated programming implementation and a GPU clock speed that is lower than the CPU clock speed.
Comparison of computing time and other factors across several HPC infrastructures has been studied by several groups of researchers who aim to identify the most optimal HPC infrastructure. Baker and Buyya [5] conclude that the use of computer clusters has advantages over the use of a dedicated parallel supercomputer, associated with a lower price and an incremental growth factor. A comparison between CPU and GPU clusters shows that the performance of the GPU in Anomaly Detection in Hyperspectral Images is superior to that of CPU clusters [6]. This is consistent with the utility of the GPU, which is dedicated to image processing.
Whereas in the case of cryptography, GPU processing time in data computing is superior to (faster than) single-core CPU processing time, but worse than the performance of a multi-core CPU processing the same data [7]. The performance of a multi-core CPU cluster (multi-core multi-node) is also significantly superior to a green-based cluster environment (cluster nodes with low power consumption: Raspberry Pi) for data computing [8]. In the cluster infrastructure, single-core CPU computation is more effective than multi-core CPU computation for low complexity problems, while for high complexity problems, multi-core CPU computation is more effective than single-core computation, scaling linearly with the number of nodes in the cluster [9].
By analyzing several studies related to HPC infrastructure comparison, it can be concluded that comparisons of several HPC infrastructures are undertaken to determine the most optimal infrastructure, where the type of data used (image / non-image) also affects the performance obtained and determines the appropriate infrastructure to be used. Overall, the programming used in these studies is parallel programming.
In general, the performance of multi-core (multi-node) CPUs in a cluster infrastructure is still superior, especially for the computation of high complexity problems. However, in all of these studies, the evaluation of the performance of high complexity problem computation using a multi-core CPU on a single node has not been done. The performance of a multi-core CPU in a single node is very interesting for further analysis, considering that the use of PCs and notebooks with multi-core CPU specifications has also increased at present. A super computer is basically also a PC with a multi-core CPU of very high specification. A multi-core single-node PC normally runs programs serially, and most users are already quite familiar with this serial programming. In serial programming, the user does not need to divide the work into parallel processes manually; this is handled automatically by the system.
This study aims to complement previous studies, especially those related to the performance of the multi-core single-node, which represents a simple form of a super computer. Computation of high complexity data will be done using a multi-core CPU single-node serially and will be compared with single-core multi-nodes in parallel and a multi-core GPU in parallel. The performance comparison of the multi-core single-node serial infrastructure, the multi-core single-node parallel infrastructure, and the multi-core GPU parallel infrastructure can also be considered a small-scale comparison between the three HPC infrastructures, i.e. supercomputers, CPU clusters, and GPUs. The comparison of the three infrastructures is done on a high complexity problem, where the Traveling Salesman Problem (TSP) route solved using genetic algorithms is taken as the case in this study. The main contribution of this research is to give a recommendation for the optimal High-Performance Computing infrastructure to solve high computation problems.
2. Related Work
The research involved in this study covers many areas associated with TSP, genetic algorithms, and HPC. The field of HPC is the most important concern in this study.
2.1. Research in Computer Cluster
In the era of the 80s, a computer cluster was a very fancy item for a college or laboratory and belonged to big companies only. Since the 2000s, when the price of a PC became quite affordable and high-performance computing was increasingly needed, many laboratories have tried to build their own computer clusters. Nowadays, there is much research in the high-performance computing field from academia.
Research in the field of HPC using cluster computers has been conducted by a number of researchers in a wide area. Barrett, et al., examine simple and more complex matching of MPI cores to perform operations [10]. In their research, Barrett et al. analyze the message rates delivered by current low-power general-purpose processors and compare them to current high single-thread performance processors. Utilization of computer clusters is also applied in the networking field. Zounmevo et al. use MPI as a network transport at large scale [11]. They implemented MPI in three programming tools, i.e. Open MPI, MPICH and MVAPICH. The handling of complex data types is also examined by Traf in datatype normalization [12].
2.2. Research in GPU
GPU utilization in computation has also been carried out by a number of researchers today. Lefohn et al. develop a library for accessing generic and efficient GPU data structures [13]. Mendez-Lojo, et al., utilize the GPU to run irregular algorithms that operate on pointer-based data structures such as graphs [14]. The result is an average speedup of 7x compared to a sequential CPU implementation, outperforming a parallel implementation of the same algorithm running on 16 CPU cores.
2.3. Research in Comparison between CPU, GPU, and Cluster
A comparison of GPU and CPU clusters was made by Abel Paz and Antonio Plaza in 2010 [6] for the process of Anomaly Detection in Hyperspectral Images. In accordance with the utility of the GPU, which is dedicated to graphics processing, the results obtained show that the performance of the GPU cluster in Anomaly Detection in Hyperspectral Images was superior to the CPU cluster up to 32 nodes. CPU and GPU utilization in the cluster infrastructure was also investigated by Marks, et al., [7] for large-scale computation in the case of cryptography. The experimental results show that the GPU processing time in data computing is superior to (faster than) single-core CPU processing, while the best performance is obtained from the use of a multi-core CPU to process the same data. Each HPC infrastructure can be combined with another infrastructure in its implementation to obtain a more optimal result. Wang, et al., examine efficient coordination mechanisms to handle parallel requests from multiple hosts of control to a GPU under hybrid programming [15]. In another study, Aji et al. show that the utilization of GPU-integrated MPI solutions in an epidemiology simulation can improve performance by up to 61.6% and can also improve the performance of a seismology modeling application by up to 44%, when compared with traditional hybrid MPI+GPU implementations [16]. In another similar study, Choi et al. optimized a CPU-GPU hybrid implementation and a GPU performance model for the kernel-independent Fast Multipole Method (FMM). In the best case, they achieve a speedup of 1.5× compared to a GPU-only implementation [17].
2.4. Research in TSP
The Traveling Salesman Problem (TSP) is among the most famous NP-hard optimization problems. Research on TSP cases is generally concerned with its mathematical aspects. Bartal et al. show that the algorithmic tractability of metric TSP depends on the dimensionality of the space and not on its specific geometry [18]. In another study, Fekete et al. solve the Fermat-Weber-Problem (FWP) with high accuracy in order to find a good heuristic solution for the Maximum Weighted Matching Problem (MWMP) [19]. Bjorklund et al. show that the traveling salesman problem in bounded-degree graphs can be solved in time O((2 − ε)^n), where ε > 0 depends only on the degree bound and not on the number of cities, n [20].
3. Methods
The method that we use to solve the TSP problem is the Genetic Algorithm (GA). This genetic algorithm is implemented on three HPC infrastructures, i.e. the supercomputer, the GPU, and the computer cluster. We use three HPC infrastructures with a similar price because the final goal of this research is to obtain the most economically optimal HPC infrastructure. Parallel computing is implemented on the GPU and the computer cluster, while serial computing is implemented on the supercomputer.
3.1. Hardware Infrastructure
In this research, we use a CPU with the specification Intel Core i5 @ 1.7 GHz, 4 cores, and 4 GB of RAM to implement the multi-core single-node serially and the multi-core single-node in parallel. To implement the multi-core GPU, we use an NVIDIA GTX 670 GPU with 1344 cores and a 980 MHz processor clock.
3.2. TSP
TSP is a problem with high complexity; it is NP-Complete. The idea of the TSP is to find the shortest route through a collection of cities, visiting each city exactly once and returning to the city of origin. TSP is divided into two types, standard (symmetric) and asymmetric TSP. For example, in symmetric TSP, given a graph with vertices, nodes, and inter-node weights, the result is a possible route permutation over the number of cities. When the number of cities is very large, the time complexity of finding the cost of each route becomes very large. The graph type addressed in this study is a complete graph, in which all cities (nodes) are connected to each other.
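As an illustration of this computational burden (an illustrative Python sketch, not the authors' code; the random coordinates and the brute-force search are assumptions), evaluating a single route is cheap, but enumerating all (n-1)! permutations quickly becomes infeasible, which is why a heuristic such as GA is needed for the 101-city instance used later:

```python
import itertools
import numpy as np

def route_length(route, dist):
    """Total length of a closed tour: visit cities in order and return to the start."""
    total = 0.0
    for a, b in zip(route, route[1:]):
        total += dist[a, b]
    return total + dist[route[-1], route[0]]  # return to the city of origin

# Tiny example with a random symmetric distance matrix (assumed data).
rng = np.random.default_rng(0)
n = 8
xy = rng.random((n, 2))
dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)

# Brute force is only feasible for very small n: (n-1)! routes to evaluate.
best = min(itertools.permutations(range(1, n)),
           key=lambda p: route_length((0,) + p, dist))
print(route_length((0,) + best, dist))
```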
3.3. Genetic Algorithms
Genetic Algorithms are inspired by the theory of evolution and genetics. Solutions or models produced by a GA take the form of individuals containing chromosomes or genes. The evolution process of the GA begins by determining the individual representation, then decoding each chromosome. The next process is the initialization of the population. Each chromosome is evaluated and selected as a parent based on the fitness value it owns. A pair of parents produces a child (new individual) through crossover. The new individual undergoes mutations in some of its genes and acquires new traits that are really different from the genes of its parents. The resulting offspring are then selected to replace the parental chromosomes in the process of forming the next generation.
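A minimal sketch of this evolutionary loop (the structure and helper names such as fitness, crossover, and mutate are assumptions for illustration, not the authors' code):

```python
import random

def genetic_algorithm(pop_size, n_genes, fitness, crossover, mutate,
                      p_crossover, p_mutation, generations):
    # Initialization: each individual is a random permutation of the cities.
    population = [random.sample(range(n_genes), n_genes) for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]          # evaluation
        def select():                                          # fitness-proportional selection
            return random.choices(population, weights=scores, k=1)[0]
        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            child = crossover(p1, p2) if random.random() < p_crossover else p1[:]
            if random.random() < p_mutation:
                child = mutate(child)
            children.append(child)
        population = children                                  # form the next generation
    return max(population, key=fitness)
```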
3.3.1. Individual Representation and Crossover
In Genetic Algorithms (GA), an individual or chromosome can be represented as binary, real, or integer values. The representation used for TSP is the permutation representation, where the value of each gene is different from the others because each gene describes a city that has been visited. We use the order crossover method in this study. The purpose of this order crossover is to prevent the same city from being passed more than once.
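The sketch below shows one standard form of order crossover (OX) on the permutation representation; the exact variant used by the authors is not specified, so details such as the slice choice and fill order are assumptions.

```python
import random

def order_crossover(parent1, parent2):
    """Order crossover (OX): copy a slice from parent1, then fill the remaining
    positions with the cities of parent2 in their original order, so that every
    city appears exactly once in the child."""
    n = len(parent1)
    a, b = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[a:b] = parent1[a:b]                        # inherited slice
    remaining = [c for c in parent2 if c not in child]
    fill = iter(remaining)
    for i in range(n):
        if child[i] is None:
            child[i] = next(fill)
    return child

# Example: crossover of two tours over 8 cities.
print(order_crossover(list(range(8)), [7, 3, 0, 4, 1, 6, 2, 5]))
```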
3.3.2. Fitness Function
The fitness function measures the degree of effectiveness of an individual as a solution of the system. Individuals with a poor (small) fitness value are eliminated in the next process, while individuals with good (large) fitness values are likely to become the system solution. The fitness value used in this study can be seen in equation (1).
fitness(I) = 1 / (B + sum_{i=1..n-1} dist(i))    (1)

where:
I       : the i-th individual
B       : initial weight (a small number)
dist(i) : the distance between city i and city i+1 on individual I, i = 1, ..., n-1, with n the number of cities
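A minimal sketch of equation (1) in code (illustrative only and not the authors' implementation; the distance matrix dist, the route encoding, and the default value of B are assumptions):

```python
def fitness(route, dist, B=1e-6):
    """Fitness of an individual as in equation (1): shorter routes score higher."""
    # Sum of distances between consecutive cities on the individual.
    total = sum(dist[route[i], route[i + 1]] for i in range(len(route) - 1))
    # B is a small initial weight; it also guards against division by zero.
    return 1.0 / (B + total)
```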
3.4. Parallel GA
Several processes are carried out in a GA, such as population initialization, individual evaluation, crossover, mutation, and the formation of a new generation. In this study, the individual evaluation process is conducted as a parallel process. This process is implemented in parallel on the GPU and on the computer cluster.
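The paper does not show its parallel code. As a hedged sketch of how the individual evaluation step could be distributed (assuming mpi4py and the hypothetical fitness helper from the previous sketch; this is not the authors' implementation), each MPI rank can evaluate a slice of the population while rank 0 runs the rest of the GA serially:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

def parallel_evaluate(population, dist):
    """Scatter chunks of the population, evaluate fitness locally, gather results.
    dist is assumed to be available on every rank (e.g. loaded at start-up)."""
    if rank == 0:
        chunks = [population[i::size] for i in range(size)]   # round-robin split
    else:
        chunks = None
    local = comm.scatter(chunks, root=0)
    local_scores = [fitness(ind, dist) for ind in local]      # fitness() from the Section 3.3.2 sketch
    gathered = comm.gather(local_scores, root=0)
    if rank == 0:
        # Undo the round-robin split so scores line up with the population order.
        scores = [None] * len(population)
        for r, part in enumerate(gathered):
            for j, s in enumerate(part):
                scores[r + j * size] = s
        return scores
    return None
```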
4. Results and Analysis
The implementation of the genetic algorithm for TSP is conducted using three HPC infrastructures, i.e. the GPU, the supercomputer, and the computer cluster. Experiments were performed with several GA parameters. The GA parameter settings can be seen in Table 1. The TSP data used in this study is obtained from the 101-city problem by Christofides-Eilon.
Table 1. Parameters of Genetic Algorithm
Parameters           Values
Size of Population   100, 150
Prob. of Crossover   0.5, 0.7, 0.9
Prob. of Mutation    0.1, 0.3, 0.5
4.1. Experiment Results Using Multi-Core GPU
Using the same genetic algorithm parameter settings as in Table 1, the results obtained by the parallel genetic algorithm implementation on the GPU are shown in Figure 1. The experimental results show that the time to solve TSP using the genetic algorithm on the GPU infrastructure ranges from 0.5 seconds to 1.3 seconds.
Figure 1. Running Process of Genetic Algorithm using Multi-Core GPU
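The paper does not state which GPU programming framework was used on the GTX 670. Purely as an illustration of how the per-individual route evaluation can be mapped to one GPU thread each (a sketch assuming Python with Numba's CUDA support and random placeholder data; it requires a CUDA-capable GPU and is not the authors' implementation):

```python
import numpy as np
from numba import cuda

@cuda.jit
def route_costs(pop, dist, out):
    """One GPU thread computes the route length of one individual."""
    i = cuda.grid(1)
    if i < pop.shape[0]:
        total = 0.0
        for k in range(pop.shape[1] - 1):
            total += dist[pop[i, k], pop[i, k + 1]]
        out[i] = total

# Hypothetical data: a population of 100 random tours over 101 cities.
rng = np.random.default_rng(0)
pop = np.array([rng.permutation(101) for _ in range(100)], dtype=np.int32)
dist = rng.random((101, 101)).astype(np.float32)
out = np.zeros(len(pop), dtype=np.float32)

threads = 128
blocks = (len(pop) + threads - 1) // threads
route_costs[blocks, threads](pop, dist, out)   # Numba transfers the arrays to the device automatically
```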
4.2. Experiment Results Using Multi-Core Single-Node Serially
The results of the GA implementation for TSP using the GPU were better than those of an implementation using a regular PC in our previous research. In this experiment, the GA implementation for TSP is conducted using the multi-core single-node infrastructure serially. Using the same configuration parameters as for the GPU, detailed results of the experiment can be seen in Figure 2. The experimental results using the multi-core single-node infrastructure serially show that the processing times obtained range from 0.06 to 0.27 seconds.
Figure 2. Running Process of Genetic Algorithm using Multi-Core Single-Node Serially
4.3. Experiment Results Using Multi-Core Single-Node in Parallel
The last HPC infrastructure used in our research is the multi-core single-node in parallel with the MPI mechanism. The Genetic Algorithm for TSP must be adjusted to run in parallel on this infrastructure. Using the same parameters as the previous configurations, detailed results of the experiment can be seen in Figure 3. The experimental results for TSP-GA using the multi-core single-node in parallel show processing times of about 0.06 to 0.25 seconds, not far from the multi-core single-node serially.
Figure 3. Running Process of Genetic Algorithm Using Multi-Core Single-Node in Parallel
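If the parallel evaluation were organized with mpi4py as in the sketch of Section 3.4 (an assumption about tooling, not a description of the authors' setup), running it on the four cores of this single node would amount to launching one MPI rank per core, e.g. `mpirun -n 4 python tsp_ga_mpi.py`, where tsp_ga_mpi.py is a hypothetical script name.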
4.4. Comparison of the 3 HPC Infrastructures and Other Research
To see a detailed comparison between the three HPC infrastructures, all results are displayed in one graph, as shown in Figure 4. Referring to Figure 4, the best times for the TSP-GA implementation are obtained using the multi-core single-node in parallel, although the difference from the multi-core single-node serially is not significant. The worst processing time is obtained by the multi-core GPU.
Figure 4. Running Process of Genetic Algorithm using the 3 HPC Infrastructures
Compared to the results obtained by other researchers, the experimental results concerning the advantage of the multi-core CPU single-node in parallel over the multi-core GPU are still in line with the results obtained by Marks, et al., [7], in which the best performance was obtained by the multi-core CPU in the cluster infrastructure. The results for the multi-core CPU single-node serially certainly complement the results obtained by Marks, et al., since similar experiments had not been done by them. The experimental results concerning the comparison between the multi-core CPU single-node in parallel and the multi-core CPU single-node serially are also similar to the results obtained by Rahman [9] for high complexity problem computation. The specific results that differentiate this research from the others are related to the three comparisons made in this experiment. Although the performance of the multi-core single-node serially is not the best, the use of the multi-core single-node serially is still very worthwhile for the computation of data with high complexity. In general, it can be concluded that the utilization of a high specification PC (super computer PC) for serial data computation is still quite feasible and more efficient than building clusters specifically for computing data. Another advantage is in the programming aspect, since the work can quite well be done serially, which means the programming languages that can be used are also more varied.
5. Conclusion
It can be concluded that the use of GA is very effective for solving TSP compared to the brute force method used in other research on solving TSP. Using HPC infrastructures with similar specifications, the use of the multi-core single-node in parallel for solving high complexity problems is better than the two other infrastructures. The processing time using the multi-core single-node serially is only slightly behind that of the multi-core single-node in parallel, while the GPU delivers the worst performance compared to the other infrastructures. In general, it can be concluded that the utilization of a super computer PC for data computing is still quite promising considering the ease of implementation, while GPU utilization for data computing is only promising when the GPU is used to support the CPU in data computing; developing a special HPC infrastructure based on the GPU is not profitable.
Acknowledgements
This work is supported by the Ministry of Research, Technology and Higher Education of the Republic of Indonesia and Telkom University under Grant 105/SP2H/PPM/DRPM/II/2016.
References
[1] P Nyberg. HPC Trends and Directions in the Earth Sciences. In ECMWF HPC workshop. 2014.
[2] PJ Turinsky. Advances In Multi-Physics And High Performance Computing In Support Of Nuclear Reactor Power Systems Modeling And Simulation. Nucl. Eng. Technol. 2012; 44(2): 103-112.
[3] GEFORCE GTX 1080. 2016.
[4] S Mcintosh-smith. A Next-Generation Many-Core Processor With Reliability, Fault Tolerance And Adaptive Power Management Features Optimized For Embedded And High Performance Computing Applications. 2016.
[5] M Baker, R Buyya. Cluster Computing: The Commodity Supercomputer. 1999; 29: 551-576.
[6] A Paz, A Plaza. Clusters versus GPUs for Parallel Target and Anomaly Detection in Hyperspectral Images. EURASIP J. Adv. Signal Process. 2010; 2010.
[7] M Marks, J Jantura, E Niewiadomska-szynkiewicz, P Strzelczyk, K Góźdź. Heterogeneous GPU & CPU Cluster For High Performance Computing. Comput. Sci. 2012; 13(2): 63-79.
[8] A Ashari, M Riasetiawan. High Performance Computing on Cluster and Multicore Architecture. TELKOMNIKA Telecommunication Computing Electronics and Control. 2015; 13(4): 1408-1413.
[9] A Rahman. High Performance Computing Clusters Design and Analysis Using Red Hat Enterprise Linux. TELKOMNIKA Indonesian Journal of Electrical Engineering. 2015; 14(3): 534-542.
[10] BW Barrett, SD Hammond, R Brightwell, KS Hemmert. The Impact of Hybrid-Core Processors on MPI Message Rate. EuroMPI/ASIA'13. 2013: 67-71.
[11] JA Zounmevo, A Afsahi, R Ross. Using MPI in High-Performance Computing Services. EuroMPI'13. 2013: 43-48.
[12] JL Traf. Optimal MPI Datatype Normalization for Vector and Index-block Types. EuroMPI/ASIA'14. 2014: 33-38.
[13] AE Lefohn, S Sengupta, JOE Kniss, R Strzodka, JD Owens. Glift: Generic, Efficient, Random-Access GPU Data Structures. ACM Trans. Graph. 2006; 25(1): 60-99.
[14] M Mendez-Lojo, M Burtscher, K Pingali. A GPU Implementation of Inclusion-based Points-to Analysis. PPoPP'12. 2012: 107-116.
[15] L Wang, M Huang, T El-Ghazawi. Towards Efficient GPU Sharing on Multicore Processors. PMBS'11. 2011: 23-24.
[16] AM Aji, LS Panwar, F Ji, M Chabbi, K Murthy. On the Efficacy of GPU-Integrated MPI for Scientific Applications. HPDC'13. 2013: 191-202.
[17] J Choi, A Chandramowlishwaran, K Madduri, R Vuduc. A CPU-GPU Hybrid Implementation and Model-Driven Scheduling of the Fast Multipole Method. In ACM.org. 2014.
[18] Y Bartal, LA Gottlieb, R Krauthgamer. The Traveling Salesman Problem: Low-dimensionality. STOC'12. 2012: 663-672.
[19] P Fekete, H Meijer, I Science, D Mathematik, W Tietze. Solving a 'Hard' Problem to Approximate an 'Easy' One: Heuristics for Maximum Matchings and Maximum Traveling Salesman Problems. ACM. 2001; 1(212).
[20] ABJ Bjorklund, T Husfeldt, P Kaski, M Koivisto. The Traveling Salesman Problem in Bounded Degree Graphs. ACM Trans. Algorithms. 2012; 8(2): 1-13.