TELK
OMNIKA
,
V
ol.
15,
No
.
4,
December
2017,
pp
.
1884
1893
ISSN:
1693-6930,
accredited
A
b
y
DIKTI,
Decree
No:
58/DIKTI/K
ep/2013
DOI:
10.12928/telk
omnika.v15.i4.6049
1884
A
Load-Balanced
P
arallelization
of
AKS
Algorithm
Ar
dhi
Wiratama
Baskara
Y
udha
and
Reza
Pulungan
z
Depar
tment
of
Computer
Science
and
Electronics
,
F
aculty
of
Mathematics
and
Na
tur
al
Sciences
,
Univ
ersitas
Gadjah
Mada,
Y
ogy
akar
ta,
Indonesia
z
Corresponding
author
,
e-mail:
pulungan@ugm.ac.id
Abstract
The
best
kno
wn
deter
ministic
polynomial-time
algor
ithm
f
or
pr
imality
testing
r
ight
no
w
is
due
to
Ag
r
a
w
al,
Ka
y
al,
and
Sax
ena.
This
algor
ithm
has
a
time
comple
xity
O
(log
15
=
2
(
n
))
.
Although
this
algor
ithm
is
polynomial,
its
reliance
on
the
cong
r
uence
of
large
polynomials
results
in
en
or
mous
computational
require-
ment.
In
this
paper
,
w
e
propose
a
par
allelization
technique
f
or
this
algor
ithm
based
on
message-passing
par
allelism
together
with
f
our
w
or
kload-distr
ib
ution
str
ategies
.
W
e
perf
or
m
a
ser
ies
of
e
xper
iments
on
an
implementation
of
this
algor
ithm
in
a
high-perf
or
mance
computing
system
consisting
of
15
nodes
,
each
with
4
CPU
cores
.
The
e
xper
iments
indicate
that
our
proposed
par
allelization
technique
introduces
a
significant
speedup
on
e
xisting
implementa
tions
.
Fur
ther
more
,
the
dynamic
w
or
kload-distr
ib
ution
str
at
egy
perf
or
ms
better
than
the
others
.
Ov
er
all,
th
e
e
xper
iments
sho
w
that
the
par
allelization
obtains
up
to
36
times
speedup
.
K
e
yw
or
d:
Pr
imality
testing,
AKS
algor
ithm,
par
allelization,
load
balancing,
high-perf
or
man
ce
computing.
Cop
yright
c
2013
Univer
sitas
Ahmad
Dahlan.
All
rights
reser
ved.
1.
Intr
oduction
Pr
ime
n
umbers
are
the
cor
nerstone
of
n
umber
theor
y
.
Mathematicians
and
n
umber
the-
or
ist,
since
ancient
times
,
ha
v
e
been
f
ascinated
b
y
man
y
prob
lems
concer
ning
pr
ime
n
umbers
.
In
moder
n
time
,
man
y
of
the
most
impor
tant
cr
yptog
r
aph
ic
algor
ithms
rely
on
big
pr
ime
n
umbers
to
perf
or
m
encr
yption
and
decr
yption.
One
of
them
is
Riv
est-Shamir-Adleman
(RSA)
algor
ithm
[1],
which
is
no
w
widely
used
in
stor
age
encr
yption
[2],
digital
cer
tificate
[3],
and
w
eb
secur
ity
[4],
including
in
banking
tr
ansaction
secur
ity
.
RSA
algor
ithm
depends
on
the
f
act
that
it
is
difficult
to
find
the
pr
ime
f
actors
of
a
b
ig
integer
.
Electronic
F
rontier
F
oundation
(EFF)
off
ers
$250,000
as
a
re
w
ard
to
the
first
individual
or
g
roup
who
disco
v
ers
a
pr
ime
n
umber
with
at
least
1,000,000,000
decimal
digits
[5].
Searching
f
or
a
pr
ime
n
umber
is
usua
lly
based
on
an
efficient
algor
ithm
that
deter
mines
whether
a
giv
en
n
umber
is
pr
ime
or
composite
.
Such
algor
ithms
are
called
pr
imality
testing
algor
ithms
.
Most
of
pr
imality
testing
algor
ithms
are
probab
ilistic
,
namely
the
y
cannot
ascer
tain
the
pr
imality
of
a
giv
en
n
umber
,
b
ut
only
pro
vide
a
probability
that
the
giv
en
n
umber
is
pr
ime
.
Miller-
Rabin
pr
imality
test
[6,
7],
f
or
instance
,
has
an
error
r
ate
belo
w
25%,
which
means
that
if
the
giv
en
n
umber
pa
ss
e
s
this
test
n
times
,
then
the
probability
that
the
n
umber
is
pr
ime
is
1
0
:
25
n
[8].
Solo
v
a
y-Str
assen
[9]
pr
imalit
y
test,
on
the
other
hand,
has
an
error
r
ate
belo
w
50%.
Probabilis-
tic
pr
imality
testing
algor
ithms
are
relativ
ely
f
ast,
of
lo
w
comple
xity
,
b
ut
with
tunab
le
accur
acy
.
Ho
w
e
v
er
,
there
are
cases
that
require
cer
tainty
that
a
giv
en
n
umber
is
pr
ime
or
not;
and
thus
,
probabilistic
algor
ithms
cannot
be
used.
In
2002,
three
Indian
computer
scientists
Ag
r
a
w
al,
Ka
y
al,
and
Sax
ena
[10]
proposed
a
deter
ministic—
i.e
.
,
non-probabilistic—pr
imality
testing
algor
ithm
that
r
uns
in
polynomia
l
time;
w
e
will
ref
er
to
this
algor
ithm
as
AKS
algor
ithm
.
This
is
the
first
deter
ministic
polynomial-time
al-
gor
ithm
f
or
pr
imality
testing.
Since
this
seminal
paper
,
the
pr
imality
testing
prob
lem
no
longer
resides
in
the
comple
xity
classes
of
NP-Hard,
NP
,
or
ZPP
[11].
AKS
algor
ithm,
interestingly
,
is
relativ
ely
simple
and
str
aightf
orw
ard,
while
pre
vious
w
or
k
b
y
other
researchers
attempted
to
sho
w
that
pr
imality
testing
is
of
polynomial
time
comple
xity
b
y
making
comple
x
modifications
on
e
xisting
pr
imality
testing
algor
ithms
[12].
Receiv
ed
March
20,
2017;
Re
vised
October
26,
2017;
Accepted
No
v
ember
14,
2017
Evaluation Warning : The document was created with Spire.PDF for Python.
TELK
OMNIKA
ISSN:
1693-6930
1885
Since
this
theoretical
breakthrough,
man
y
researchers
ha
v
e
proposed
theoretical
and
pr
actical
impro
v
ements
to
this
algor
ithm
soon
after
it
w
as
released
in
pub
lic.
Notab
le
among
them
are
Lenstr
a
[13]
and
Ber
nstein
[14].
Ber
nstein
[14]
proposed
tw
o
pr
actical
possibilities
f
or
accel-
er
ating
AKS
algor
ithm
with
lo
w-le
v
el
speedup
b
y
impr
o
ving
the
integer
squar
ing
method
and
high-
le
v
el
speedup
b
y
reducing
the
n
umber
of
f
or
loop
iter
ations
in
the
last
step
of
the
algor
ithm.
This
included
all
state-of-the-ar
t
impro
v
ements
on
reducing
the
last
f
or
loop
iter
ations
and
produced
speedup
of
man
y
orders
of
magnitudes
.
These
ha
v
e
been
incor
por
ated
in
the
latest
v
ersion
of
AKS
algor
ithm.
Lenstr
a
and
P
omer
ance
[15,
16],
on
the
other
hand,
proposed
theoretical
impro
v
ements
to
AKS
algor
ithm
and
obtained
a
ne
w
technique
with
time
comple
xity
O
(log
6
(
n
))
.
The
y
modi-
fied
the
or
iginal
AKS
algor
ithm
b
y
decreasing
the
n
u
mber
of
iter
ations
in
the
f
or
loop
.
This
is
done
b
y
replacing
the
use
of
the
cyclotomic
polynomials
in
AKS
algor
ithm
b
y
a
monic
polynomial
f
(
x
)
of
deg
ree
r
with
integer
coefficients
such
that
the
r
ing
Z
[
x
]
=
(
f
(
x
)
;
n
)
is
a
pseudofield.
Ber
n-
stein
in
[17]
proposed
a
fur
ther
theoretical
impro
v
ement
to
AKS
algor
ithm
with
time
comple
xity
O
(log
4
(
n
))
.
The
proposal
also
attempted
to
decrease
the
n
umber
of
iter
ations
in
the
f
or
loop
b
y
replacing
the
use
of
the
cyclotomic
polynomials
b
y
r
andom
K
ummer
e
xtensions
of
Z
[
x
]
=n
.
Cr
andall
and
P
apadopoulos
[18]
implemented
a
v
ar
iant
of
AKS
algor
ithm
b
y
Lenstr
a
[13]
and
f
ound
that
empir
ically
the
time
comple
xity
of
the
v
ar
iant
is
c
log
6
(
n
)
,
where
c
is
around
1,000
cloc
k
cycles
.
Li
[19]
also
implemented
the
Lenstr
a
v
ar
iant
of
AKS
algor
ithm
using
C++
and
NTL
libr
ar
y
to
handle
the
polynomial
data
str
ucture
.
In
this
implementation,
a
15-decimal-digit
pr
ime
n
umber
required
around
3,000
seconds
to
compute
in
a
single-processor
computer
.
Menon
in
[20]
implemented
AKS
algor
ithm
in
SA
GE
(Softw
are
f
or
Algebr
a
and
Geometr
y
Exper
imentation),
and
produced
an
implementation,
in
which
a
25-decimal-digit
pr
ime
n
umber
required
more
than
4,000
seconds
to
compute
in
a
single-processor
computer
.
Cao
[21]
analyz
ed
the
stor
age
space
re-
quirement
f
or
AK
S
algor
ithm
and
sho
w
ed
that
the
required
stor
age
space
f
or
testing
a
n
umber
with
length
1,024
bits
is
about
1,000,000,000
Gigab
yte
,
which
is
p
r
actically
inf
easib
le
.
This
is
due
to
the
need
to
store
e
xtremely
large
polynomials
dur
ing
the
computation.
Man
y
scientists
ha
v
e
made
impro
v
ements
on
the
or
iginal
AKS
algor
ithm,
b
ut
sequential
implementations
of
the
algor
ithm
are
still
impr
actical
to
use
due
to
the
e
xpensiv
e
computation
in
v
olv
ed
in
each
step
and
its
stor
age
requirements
.
Future
direction
seems
to
be
to
w
ards
par
allel
implementations
.
This
paper
repor
ts
on
our
eff
or
t
to
de
v
elop
a
par
allelization
technique
f
or
AKS
algor
ithm
based
on
message-passing
par
allelism
(using
MPI)
and
to
find
out
the
best
w
or
kload-
distr
ib
ution
str
ategy
f
or
the
par
allelization
technique
.
The
rest
of
the
paper
is
organiz
ed
as
f
ollo
ws:
Section
2
presents
the
basis
of
AKS
algo-
r
ithm.
Section
3
descr
ibes
the
proposed
par
allelization
technique
,
together
with
f
our
accompan
y-
ing
w
or
kload-distr
ib
ution
str
ategies
.
In
Section
4,
w
e
present
the
result
of
our
e
xper
iments
with
the
proposed
par
allelization
technique
and
the
f
our
w
or
kload-distr
ib
ution
str
ategies
and
pro
vide
analysis
.
Section
5
concludes
the
paper
.
2.
Preliminaries
Let
Z
be
the
set
of
integers
and
let
a
and
b
be
tw
o
positiv
e
integers
.
Let
g
cd
(
a;
b
)
be
the
g
reatest
common
divisor
of
a
and
b
.
The
tw
o
integers
a
and
b
are
relativ
ely
pr
ime
if
and
only
if
g
cd
(
a;
b
)
=
1
.
Let
(
a
)
be
the
Euler’
s
totient
function,
namely
the
n
umber
of
positiv
e
integers
smaller
than
a
that
are
relativ
ely
pr
ime
to
a
.
F
or
relativ
ely
pr
ime
a
and
r
,
let
o
r
(
a
)
be
the
order
of
a
modulo
r
,
namely
the
smallest
integer
k
such
that
a
k
1
(mo
d
r
)
.
Let
a
rem
b
be
the
remainder
of
integer
division
betw
een
a
and
b
.
In
the
ear
liest
v
ersion
of
their
pub
lication,
Ag
r
a
w
al,
Ka
y
al,
an
d
Sax
ena
obtained
an
algo-
r
ithm
with
the
w
orst-case
time
comple
xity
of
O
(log
12
(
n
))
,
where
n
is
the
giv
en
n
umber
.
In
this
paper
,
w
e
are
ref
err
ing
to
the
latest
v
ersion
(v
ersion
6)
of
their
pub
lication
[10]
,
in
which
the
latest
AKS
algor
ithm
w
as
presented.
The
latest
v
ersion
has
incor
por
ated
man
y
impro
v
ements
proposed
b
y
man
y
researchers
and
the
resulting
algor
ithm
r
uns
in
polynomial
time
with
the
w
orst-case
com-
ple
xity
o
f
O
(log
15
=
2
(
n
))
.
Pr
ior
to
the
pub
lication
of
this
algor
ithm,
there
w
ere
oth
er
pr
imality
pro
ving
algor
ithms
that
seemed
to
r
un
in
polynomial
time
,
b
ut
AKS
algor
ithm
is
the
first
one
that
is
de-
A
Load-Balanced
P
ar
allelization
of
AKS
Algor
ithm
(Y
udha
and
Pulungan)
Evaluation Warning : The document was created with Spire.PDF for Python.
1886
ISSN:
1693-6930
ter
ministic
as
w
ell
as
of
polynomial
time
[18].
The
main
idea
of
AKS
algor
ithm
is
descr
ibed
in
Lemma
1,
which
is
a
gener
alization
of
F
er
mat’
s
little
theorem.
Lemma
1
([10])
Let
a
2
Z
be
relativ
ely
pr
ime
to
n
2
Z
and
n
2
.
Then
n
is
pr
ime
if
and
only
if:
(
x
+
a
)
n
x
n
+
a
(mo
d
n
)
:
(1)
T
o
reduce
the
n
umber
of
oper
ations
perf
or
med,
both
sides
of
(1)
can
be
simplified
b
y
taking
their
respectiv
e
remainders
modulo
a
polynomial
x
r
1
,
f
or
some
small
positiv
e
r
2
Z
,
namely:
(
x
+
a
)
n
x
n
+
a
(mo
d
x
r
1
;
n
)
:
(2)
Ho
w
e
v
er
,
r
ight
no
w
the
bi-implication
in
Lemma
1
no
longer
applies
,
since
non-pr
ime
n
ma
y
also
satisfy
(2)
f
or
some
a
and
r
.
Theorem
1—as
ref
or
m
ulated
b
y
Gr
an
ville
in
[12]—f
or
ms
the
cor
nerstone
of
AKS
algo
r
ithm.
The
theorem
basically
asser
ts
that
f
or
appropr
iately
selected
r
’
s
,
if
(2)
is
satisfied
b
y
some
a
,
then
n
m
ust
be
a
pr
ime
.
Theref
ore
,
r
m
ust
be
selected
accordingly
.
Theorem
1
([10,
12])
Giv
en
n
2
Z
and
n
2
,
let
r
<
n
be
a
posit
iv
e
integer
satisfying
o
r
(
n
)
>
log
2
(
n
)
.
Then
n
is
pr
ime
if
and
only
if:
(1)
n
is
not
a
perf
ect
po
w
er
,
(2)
n
does
not
ha
v
e
an
y
pr
ime
f
actor
r
,
and
(3)
(
x
+
a
)
n
x
n
+
a
(mo
d
x
r
1
;
n
)
f
or
an
y
integer
a
,
where
1
a
p
(
r
)
log
(
n
)
.
A
str
aightf
orw
ard
implementation
of
Theorem
1
is
giv
en
in
Algor
ithm
1,
where
condition
(1)
corresponds
to
the
first
if;
and
conditions
(2)
and
(3)
correspond
to
the
first
and
the
last
f
or
,
respectiv
ely
.
Algorithm
1:
AKS
algor
ithm
Input:
n
2
Z
;
n
2
Output:
A
str
ing
Pr
ime
or
Composite
1
begin
2
if
n
=
a
b
,
where
a;
b
2
Z
and
a;
b
>
1
then
3
return
Composite
4
end
5
Find
the
smallest
r
that
satisfies
o
r
(
n
)
>
b
log
2
(
n
)
c
6
f
or
2
to
r
do
7
if
g
cd
(
a;
n
)
>
1
then
8
return
Composite
9
end
10
end
11
f
or
a
1
to
b
p
(
r
)
log
(
n
)
c
do
12
if
(
x
+
a
)
n
6
x
n
+
a
(mo
d
x
r
1
;
n
)
then
13
return
Composite
14
end
15
end
16
return
Pr
ime
17
end
TELK
OMNIKA
V
ol.
15,
No
.
4,
December
2017
:
1884
1893
Evaluation Warning : The document was created with Spire.PDF for Python.
TELK
OMNIKA
ISSN:
1693-6930
1887
3.
Pr
oposed
Method
3.1.
P
arallel
AKS
Algorithm
Scr
utinizing
Algor
ithm
1,
w
e
can
see
that
the
algor
ithm
basically
compr
ises
4
st
eps:
de-
ter
mining
whether
n
is
a
perf
ect
po
w
er
(lines
2–4);
deter
mining
r
(line
5);
deter
mining
whether
n
has
pr
ime
f
actors
r
(lines
6–10);
and
deter
mining
the
cong
r
uence
of
polynomials
(
x
+
a
)
n
and
x
n
+
a
modulo
(
x
r
1
;
n
)
(lines
11–15)
f
or
some
v
alues
of
a
.
Of
these
f
our
steps
,
the
last
tak
es
most
of
the
computation
times
of
the
algor
ithm,
since
w
e
are
dealin
g
with
an
enor
mous
n
.
Fur-
ther
more
,
when
r
aising
polynomial
(
x
+
a
)
to
th
e
po
w
er
n
—albeit
modulo
(
x
r
1
;
n
)
—inter
mediate
results
might
be
enor
mous
polynomials
requir
ing
large
stor
age
and
hea
vy
computation.
Our
par-
allelization
eff
or
t
will
be
f
ocused
on
computing
this
last
step
.
P
ar
allelizing
the
other
steps
will
incur
comm
unication
o
v
erhead
that,
with
the
current
state
of
netw
or
king
technology
,
renders
the
sa
ving
achie
v
ed
b
y
the
par
allelization
w
or
thless
e
v
en
f
or
hundreds-decimal-digit
n
.
As
has
been
noticed
b
y
Cr
andall
and
P
apadopoulos
in
[18],
AKS
algor
ithm
is
an
em-
barr
assingly
par
allel
algor
ithm.
It
can
easily
be
par
alleliz
ed
using
master-sla
v
e
technique
,
b
y
distr
ib
uting
the
w
or
k
of
deter
mining
the
cong
r
uence
of
polynomials
(
x
+
a
)
n
and
x
n
+
a
modulo
(
x
r
1
;
n
)
f
or
diff
erent
v
alues
of
a
to
diff
erent
computer
nodes
in
a
message-passing
par
allel
system.
Figure
1
illustr
ates
this
master-sla
v
e
technique
.
0
2
1
u
u
-1
1
2
v
1
2
v
1
2
v
1
2
v
…
…
…
…
…
Master node
Slave nodes
i
j
: node i
: cor
e
j
: communication
: owner
Figure
1.
The
design
of
the
par
allelization
technique
In
the
beginning,
the
master
node
perf
or
ms
the
computation
of
the
first
three
steps
of
AKS
algor
ithm
sequentially
.
O
nce
the
master
node
obtains
the
v
alue
of
r
,
it
broadcasts
the
v
alues
of
n
and
r
together
with
other
necessar
y
inf
or
mation
about
distr
ib
ution
of
w
or
k
(namely
the
distr
ib
ution
of
the
v
alues
of
a
)
to
all
sla
v
e
nodes
.
Each
sla
v
e
node
then
proceeds
with
the
computation
of
deter
mining
the
cong
r
uence
of
polynomials
(
x
+
a
)
n
and
x
n
+
a
modulo
(
x
r
1
;
n
)
f
or
se
v
er
al
v
alues
of
a
.
A
sla
v
e
node
comm
unicates
only
with
the
master
node
and
only
in
tw
o
cases:
(1)
when
f
or
some
v
alue
of
a
the
polynomials
are
not
cong
r
uent,
and
(2)
when
f
or
all
v
alues
of
a
assigned
b
y
the
master
node
,
both
polynomials
are
alw
a
ys
cong
r
uent,
and
thus
signaling
that
the
w
or
k
as-
signed
to
the
sla
v
e
node
has
been
completed.
Upon
receiving
a
comm
unication
of
type
(1)
from
a
sla
v
e
node
,
the
master
node
immediately
dismisses
the
last
f
or
loop
and
thereb
y
announces
that
n
is
composite;
and
proceeds
to
command
the
rest
of
the
sla
v
e
nodes
to
abor
t
their
computation.
Receiving
comm
unication
type
(2)
from
all
sla
v
e
nodes
indicates
that
all
sla
v
e
nodes
ha
v
e
com-
pleted
their
w
or
k
and
all
of
them
find
that
the
tw
o
polynomials
are
cong
r
uent
f
or
all
v
alues
of
a
;
the
master
node
then
proceeds
to
announce
that
n
is
pr
ime
.
A
Load-Balanced
P
ar
allelization
of
AKS
Algor
ithm
(Y
udha
and
Pulungan)
Evaluation Warning : The document was created with Spire.PDF for Python.
1888
ISSN:
1693-6930
A
moder
n
computer
system
usually
has
m
ulti-core
CPUs
.
A
par
allelizatio
n
technique
where
each
of
these
CPU
cores
in
a
sla
v
e
computer
node
is
treated
as
a
sla
v
e
node
as
w
ell
is
ref
erred
to
as
single-le
v
el
par
allelization
.
In
this
technique
,
each
core
is
assigned
b
y
the
master
node
se
v
er
al
v
alues
of
a
to
compute
separ
ately
from
other
cores
in
the
same
computer
node
.
Comm
unications
from
all
cores
in
a
sla
v
e
node
to
the
master
node
m
ust
pass
through
the
same
channel
of
comm
unication
and
this
ma
y
result
in
contention.
Ho
w
e
v
er
,
compared
to
the
computa-
tion
time
spent
f
or
each
v
alue
of
a
,
the
o
v
erhead
produced
b
y
this
contention
is
negligib
le
.
3.2.
W
orkload-Distrib
ution
Strategies
The
single-le
v
el
par
allelization
technique
requires
the
distr
ib
ution
of
w
or
kload
from
the
master
node
to
all
sla
v
e
nodes
.
This
basically
entails
distr
ib
uting
the
v
alues
of
a
f
or
sla
v
e
nodes
to
w
or
k
on.
Recall
from
Algor
ithm
1
that
t
he
cong
r
uence
of
polyno
mials
(
x
+
a
)
n
and
x
n
+
a
modulo
x
r
1
m
ust
be
deter
mined
f
or
1
a
b
p
(
r
)
log
(
n
)
c
.
Let
q
=
b
p
(
r
)
log
(
n
)
c
and
u
be
the
n
umber
of
sla
v
e
nodes
.
Fur
ther
,
let
%
stand
f
or
the
integer
division
oper
ator
.
In
the
f
ollo
wing,
w
e
present
f
our
w
or
kload-distr
ib
ution
str
ategies
that
will
be
e
xper
imented
on
in
this
study
.
Strategy
1
Figure
2
illustr
ates
the
first
w
or
kload-distr
ib
ution
str
ategy
.
A
rectangle
in
the
figure
represents
a
single
v
alue
of
a
,
while
the
circle
r
ight
belo
w
the
rectangl
e
represents
the
sla
v
e
node
responsib
le
f
or
computing
that
v
alue
.
This
str
ategy
is
the
simplest
of
the
three
str
ategies
,
where
sla
v
e
node
i
is
responsib
le
to
deter
mine
the
cong
r
uence
of
polynomials
(
x
+
a
)
n
and
x
n
+
a
modulo
x
r
1
,
f
or
(
i
1)(
q
%
u
)
+
1
a
i
(
q
%
u
)
.
Hence
sla
v
e
node
#1
w
or
ks
on
the
first
q
%
u
v
alues
of
a
,
sla
v
e
node
#2
w
or
ks
on
the
second
q
%
u
v
alues
of
a
,
and
so
on,
while
sla
v
e
node
#
u
w
or
ks
on
the
remaining
v
alues
of
a
.
Th
is
last
sla
v
e
node
ma
y
only
w
or
k
on
f
e
w
er
than
q
%
u
v
alues
of
a
,
if
q
is
not
divisib
le
b
y
u
.
1
2
…
q
%
u
q
%
u
+1
q
%
u
+2
…
2
(q
%
u
)
2
(q
%
u
)
+1
2
(q
%
u
)
+2
…
3
(q
%
u
)
…
…
…
(u
-1
)
(q
%
u
)
(u
-1
)
(q
%
u
)+1
…
q
1
1
1
2
2
2
3
…
…
V
alues of a
Node
3
3
u
u
…
…
u
…
…
…
V
alues of a
Node
Figure
2.
W
or
kload-distr
ib
ution
str
ategy
1
Strategy
2
One
of
the
main
concer
ns
with
the
first
str
ategy
is
that
one
sla
v
e
node
is
assigned
only
with
v
alues
of
a
that
are
consistently
smaller
or
larger
than
those
assigned
to
other
sla
v
e
nodes
.
All
v
alues
of
a
assigned
to
sla
v
e
node
#1,
f
or
instance
,
are
smaller
than
those
assigned
to
sla
v
e
node
#2.
A
larger
v
alue
of
a
ma
y
result
in
a
longer
computation
time
,
since
the
resulting
inter
mediate
polynomials
will
ha
v
e
larger
coefficients
,
which
in
tur
n
tak
e
longer
to
m
ultiply
and
require
more
stor
age
.
The
second
and
third
str
ategies
tr
y
to
address
this
.
Figure
3
illustr
ates
the
second
w
or
kload-distr
ib
ution
str
ategy
.
The
first
sla
v
e
node
will
get
a
=
1
,
the
second
sla
v
e
node
will
get
a
=
2
,
and
so
on,
until
the
last
sla
v
e
node
will
get
a
=
u
.
This
is
then
repeated
until
all
v
alues
of
a
are
e
xhausted.
Theref
ore
sla
v
e
node
i
will
be
assigned
the
v
alues
of
a
of
i;
i
+
u;
i
+
2
u;
:
:
:
;
i
+
j
u
,
where
j
is
the
largest
int
eger
that
still
satisfies
i
+
j
u
q
.
This
str
ategy
manages
to
a
v
oid
assigning
one
sla
v
e
node
v
alues
of
a
that
are
consistently
smaller
or
larger
than
those
assigned
to
other
sla
v
e
nodes
.
Ho
w
e
v
er
,
each
v
alue
of
a
assigned
to
a
sla
v
e
node
is
alw
a
ys
relativ
ely
smaller
or
larger
than
that
assigned
to
other
sla
v
e
nodes
.
F
or
e
v
er
y
v
alue
i
assigned
to
sla
v
e
node
#1,
f
or
instance
,
the
v
alue
i
+
1
is
assigned
to
sla
v
e
nod
e
#2.
Hence
,
if
larger
v
alue
of
a
alw
a
ys
results
in
longer
computation
time
,
sla
v
e
node
#1
will
complete
its
w
or
kload
ear
lier
than
sla
v
e
node
#2.
This
prob
lem
will
be
addressed
b
y
the
third
str
ategy
.
TELK
OMNIKA
V
ol.
15,
No
.
4,
December
2017
:
1884
1893
Evaluation Warning : The document was created with Spire.PDF for Python.
TELK
OMNIKA
ISSN:
1693-6930
1889
1
2
…
u
u
+1
u
+2
…
2
u
2
u
+1
2
u
+2
…
3
u
3
u
+1
3
u
+2
…
4
u
4
u
+1
…
q
1
2
u
1
2
u
1
…
…
2
u
1
2
u
1
…
…
…
q
re
m
u
V
alues of a
Node
V
alues of a
Node
Figure
3.
W
or
kload-distr
ib
ution
str
ategy
2
Strategy
3
The
third
str
ategy
addresses
the
prob
lem
encountered
in
the
second
str
ategy
b
y
ensur
ing
that
if
a
sla
v
e
node
is
assigned
a
small
v
alue
of
a
,
it
will
be
compensated
b
y
another
assignment
with
large
v
alue
of
a
.
Figure
4
illustr
ates
the
third
w
or
kload-distr
ib
ution
str
ategy
.
1
2
…
u
u
+1
u
+2
…
2
u
2
u
+1
…
q
-2
u
-1
q
-2
u
…
q
-u
-2
q
-u
-1
q
-u
…
q
-1
q
1
1
2
2
u
u
1
2
1
2
u
u
1
1
…
…
…
…
…
V
alues of a
Node
V
alues of a
Node
Figure
4.
W
or
kload-distr
ib
ution
str
ategy
3
Hence
,
since
sla
v
e
node
#1
is
assigned
the
v
alue
a
=
1
(the
smallest),
then
it
will
also
be
assigned
the
v
alue
a
=
q
(the
largest).
Similar
ly
,
since
sla
v
e
node
#2
is
assigne
d
the
v
alue
a
=
2
(the
second
smallest),
then
it
will
also
be
assigned
the
v
alue
a
=
q
1
(the
second
largest).
This
is
carr
ied
out
subsequently
until
all
v
alues
of
a
are
assigned
to
all
sla
v
e
nodes
in
similar
f
ashion:
if
the
v
alue
a
=
i
is
assigned
to
sla
v
e
node
j
,
then
the
v
alue
a
=
q
i
is
also
assigned
to
sla
v
e
node
j
,
f
or
i
q
%2
.
Strategy
4
All
pre
vious
str
ategies
are
static
,
in
the
sense
that
w
or
kload
distr
ib
utions
are
actually
predefined
e
v
en
bef
ore
e
x
ecution;
a
specific
node
alw
a
ys
gets
the
same
set
of
a
’
s
when
the
input
n
is
the
same
.
In
this
f
our
th
str
ategy
,
w
e
propose
a
dynamic
str
ategy
where
the
set
of
a
’
s
assigned
to
a
sla
v
e
node
cannot
be
predicted
bef
ore
it
is
r
un.
The
idea
is
,
firstly
,
a
sla
v
e
node
is
assigned
an
a
according
to
its
id,
f
or
e
xample
sla
v
e
node
#1
gets
a
=
1
,
sla
v
e
node
#2
gets
a
=
2
and
so
on
until
sla
v
e
node
#
u
gets
a
=
u
.
After
completing
a
w
or
k,
a
sla
v
e
node
requests
to
the
master
node
f
or
another
remaining
a
or
inf
or
ms
the
master
node
if
the
polynomial
cong
r
uence
chec
k
produces
f
alse
result.
The
master
node
then
sends
a
remaining
a
to
the
requesting
node
or
the
master
node
simply
ter
minates
all
nodes
and
output
composite
result
f
or
the
other
condition.
When
no
remaining
a
e
xists
,
the
master
node
ter
minates
all
sla
v
e
nodes
then
outputs
pr
ime
result.
4.
Result
and
Anal
ysis
4.1.
Implementation
Since
w
e
are
pr
imar
ily
concer
ned
with
big
n
umbers
,
w
e
use
GNU
Multiple
Precision
(GMP)
ar
ithmetic
libr
ar
y
v
ersion
6.10
to
handle
integer
of
arbitr
ar
y
length.
In
the
first
step
of
the
algor
ithm,
GMP
function
mpz
p
erfect
p
o
w
er
p()
is
used
to
chec
k
f
or
perf
ect
po
w
ers
.
T
o
com-
pute
the
v
al
ue
of
r
in
the
second
step
,
w
e
use
function
P
o
w
erMo
d()
of
NTL
libr
ar
y
v
ersion
9.10.0,
which
basically
perf
or
ms
integer
modular
e
xpon
entiations
.
F
or
chec
king
the
e
xistence
of
f
actors
of
the
input
n
umber
that
are
no
more
than
r
in
the
third
step
,
NTL
function
GCD()
is
used.
A
Load-Balanced
P
ar
allelization
of
AKS
Algor
ithm
(Y
udha
and
Pulungan)
Evaluation Warning : The document was created with Spire.PDF for Python.
1890
ISSN:
1693-6930
Comm
unications
betw
een
master
and
sla
v
e
nodes
is
perf
or
med
using
MPICH
libr
ar
y
v
er-
sion
3.2.
T
o
broadcast
the
input
n
umber
n
and
the
v
alue
of
r
,
the
master
and
sla
v
e
nodes
use
function
MPI
Bcast()
.
Since
the
MPI
does
not
suppor
t
data
type
mpz
t
defined
b
y
GMP
as
w
ell
as
data
type
ZZ
defined
b
y
NTL,
n
and
r
are
first
con
v
er
ted
to
arr
a
ys
of
b
ytes
bef
ore
the
y
are
broad-
cast.
Once
arr
iv
ed,
the
y
will
be
con
v
er
ted
bac
k
to
type
mpz
t
using
function
mpz
init
set
str()
.
After
all
the
v
alues
required
to
compute
the
last
step
are
obtained
b
y
a
sla
v
e
node
,
it
t
hen
com-
putes
the
left
and
r
ight
side
of
the
cong
r
uence
using
function
P
o
w
erMo
d()
.
4.2.
Experimental
Setup
All
e
xper
iments
in
this
study
are
conducted
using
High-P
erf
or
mance
Computing
(HPC)
system
pro
vided
b
y
Director
ate
of
Inf
or
mation
System
and
Resources
(DSSDI)
of
Univ
ersitas
Gadjah
Mada.
The
HPC
system
has
15
sla
v
e
nodes
,
each
with
2
CPU
Dual
Core
AMD
Opteron
TM
Processor
280
(hence
,
4
CPU
cores),
4
GB
DDR3
RAM,
Op
enSUSE
11.2
64
bits
oper
ating
system,
and
GCC
compiler
v
ersion
6.1.
W
e
e
xper
iment
o
n
pr
ime
n
umbers
r
anging
from
5
digits
to
35
digits
in
length
as
sho
wn
in
T
ab
le
1.
The
se
v
en
pr
ime
n
umbers
selected
are
the
largest
pr
ime
n
umbers
f
or
the
corresponding
n
umbers
of
digits
according
to
[22].
T
ab
le
1.
Pr
ime
n
umbers
used
in
e
xper
iments
Digits
Pr
ime
Number
5
99,929
10
9,999,999,929
15
999,998,727,899,999
20
99,999,999,999,999,999,989
25
9,989,999,899,883,889,989,999,899
30
909,090,909,090,909,090,909,090,909,091
35
68,476,562,763,327,854,359,085,599,065,855,383
4.3.
Result
Comparing
the
w
orkload-distrib
ution
strategies
T
ab
le
2
sho
ws
the
r
unning
times
of
the
se-
quential
as
w
ell
as
the
par
allel
implementations
of
AKS
algor
ithm
f
o
r
the
se
v
en
pr
ime
n
umbers
.
The
par
allel
implementations
are
r
u
n
on
a
60-processor
message-passing
system,
while
the
se-
quential
one
is
r
un
on
one
of
the
processors
.
It
is
e
vident
that
the
dynamic
w
or
kload
distr
ib
ution
(str
ategy
4)
perf
or
ms
consistently
and
significantly
better
than
other
str
a
tegies
in
all
e
xper
iments
,
which
means
that
this
str
ategy
is
the
most
load
balanced
among
the
proposed
str
ategies
.
This
also
indicates
that
the
o
v
erheads
associated
with
comm
unication
times
bet
w
een
the
master
node
and
sla
v
e
nodes
are
insignificant
compared
to
the
computation
times
f
or
diff
erent
v
alues
of
a
.
F
rom
T
ab
le
2,
it
is
clear
that
patter
ns
from
the
r
unning
times
of
the
first
three
w
or
kload-
distr
ib
ution
str
at
egies
are
not
easy
to
discer
n.
This
result
is
contr
ar
y
to
the
authors’
or
iginal
e
xpectation,
as
descr
ibed
in
Section
3.2.
The
result
indicates
that
the
computation
times
required
f
or
th
e
v
alues
of
a
are
not
propor
tional
to
those
v
alues:
a
larger
v
alue
of
a
ma
y
require
less
computation
time
than
that
of
smaller
one
.
Speedups
f
or
v
arious
n
umber
of
pr
ocessor
s
The
pre
vious
result
sho
ws
that
w
or
kload
dis-
tr
ib
ution
str
ategy
4
produces
the
b
est
par
allel
implementation
f
or
AKS
algor
ithm.
In
this
par
t,
w
e
f
ocus
on
this
str
ategy
and
find
out
the
speedups
that
are
achie
v
ab
le
f
or
v
ar
ious
n
umber
of
processors
.
The
result
is
presented
in
Figure
5.
Figure
5
sho
ws
that
f
or
almost
all
n
umbers
of
digits
,
speedup
mostly
g
ro
ws
linear
ly
as
the
n
umber
of
processors
used
in
the
computation
increases
.
The
apparent
e
xception
to
this
is
f
or
TELK
OMNIKA
V
ol.
15,
No
.
4,
December
2017
:
1884
1893
Evaluation Warning : The document was created with Spire.PDF for Python.
TELK
OMNIKA
ISSN:
1693-6930
1891
T
ab
le
2.
Running
times
(in
seconds)
of
the
diff
erent
w
or
kload-distr
ib
ution
str
ategies
Digits
Sequential
P
ar
allel
Str
ategy
1
Str
ategy
2
Str
ategy
3
Str
ategy
4
5
1.57168
0.41278
0.19912
0.20151
0.11407
10
105.7
5.3
2.1
2.2
1.7
15
712.2
35.3
22.6
24.0
19.6
20
3,236.0
128.2
130.4
128.8
112.3
25
8,848.8
446.6
371.5
374.1
325.4
30
32,343.2
1,421.6
1,317.3
1,972.4
1,179.6
35
70,901.7
4,121.8
4,177.6
4,152.3
2,457.4
0
5
10
15
20
25
30
35
40
0
10
20
30
40
50
60
Sp
e
e
d
u
p
Nu
m
b
e
r
o
f
p
r
o
c
e
s
s
o
r
s
5
d
i
g
i
t
s
10
d
i
g
i
t
s
15
d
i
g
i
t
s
20
d
i
g
i
t
s
25
d
i
g
i
t
s
30
d
i
g
i
t
s
35
d
i
g
i
t
s
Figure
5.
Speedups
obtained
b
y
v
ar
ying
the
n
umber
of
processors
when
the
n
umber
of
digits
is
5.
F
or
this
case
,
there
are
only
275
diff
erent
v
alues
of
a
to
chec
k
and
each
of
them
requires
a
relativ
ely
shor
t
computation
time
.
When
the
n
umber
of
processors
e
x-
ceeds
30,
comm
unication
o
v
erheads
becomes
large
enough
to
offset
the
sa
vings
of
computation
times
b
y
the
par
allelism.
Ov
er
all,
the
largest
speedup
is
obtained
when
the
n
umber
of
digits
is
10
and
it
is
a
bit
more
than
36
times
the
computation
time
of
the
sequential
implementation.
The
eff
ects
of
m
ulti-core
pr
ocessor
s
The
high-perf
or
mance
computing
system
used
in
the
e
xper
iments
consists
of
computers
,
each
with
m
ulti-core
processors
.
When
each
of
this
core
is
treated
as
a
node
,
contentions
ma
y
occur
wh
en
there
is
more
than
one
core
comm
unicating
sim
ultaneously
with
the
master
node
.
In
this
par
t,
f
or
each
w
or
kload-distr
ib
ution
str
ategy
,
w
e
v
ar
y
the
n
umber
of
cores
per
node
used
in
the
par
allel
computation
to
estab
lish
their
eff
ects
on
the
o
v
er
all
computation
time
.
F
or
this
pur
pose
,
three
scenar
ios
are
created,
namely
1
core
per
node
,
2
cores
per
node
,
and
4
cores
per
node
.
In
all
of
these
scenar
ios
,
the
o
v
er
all
n
umber
of
cores
is
maintained
at
8
in
order
to
set
a
baseline
.
Figure
6
depicts
the
result
of
the
e
xper
iments
.
F
rom
Figure
6,
w
e
can
conclude
that
w
or
kload
distr
ib
ution
str
ategies
1,
2
and
3
ar
e
al-
most
not
aff
ected
b
y
the
n
umber
of
cores
per
node
used
in
the
par
allel
computation.
This
is
understandab
le
since
,
in
these
str
ategies
,
a
sla
v
e
node
r
arely
comm
unicates
with
the
master
node
.
Comm
unications
betw
een
a
sla
v
e
node
and
the
master
node
occurs
only
dur
ing
ter
mina-
tion,
namely
when
the
sla
v
e
node
finds
that
the
polynomials
are
not
cong
r
uent
f
or
a
specific
v
alue
of
a
or
when
it
finds
that
the
polynomials
are
cong
r
uent
f
or
all
assigned
v
alues
of
a
.
The
eff
ect
f
or
w
or
kload-distr
ib
ution
str
ategy
4,
ho
w
e
v
er
,
is
star
k
and
the
larger
the
pr
ime
n
umber
the
more
pronounced
the
eff
ect.
Ha
ving
more
cores
per
node
results
in
longer
computa-
tion
time
.
This
is
in
line
with
our
e
xpectation,
since
,
ha
ving
more
cores
per
node
results
in
hea
vier
use
of
the
comm
unication
line
betw
een
the
master
and
the
sla
v
e
nodes
.
What
w
e
do
not
e
xpect
A
Load-Balanced
P
ar
allelization
of
AKS
Algor
ithm
(Y
udha
and
Pulungan)
Evaluation Warning : The document was created with Spire.PDF for Python.
1892
ISSN:
1693-6930
0
5000
10000
15000
20000
25000
30000
0
5
10
15
20
25
30
Ru
n
n
i
n
g
t
i
m
e
(
s
e
c
o
n
d
s
)
Di
g
i
t
s
o
f
t
h
e
p
r
i
m
e
n
u
m
b
e
r
Wo
r
k
l
o
a
d
-
di
s
t
r
i
but
i
o
n
s
t
r
a
t
e
g
y
2
2
n
o
d
e
s
(
8
c
o
r
e
s
)
4
n
o
d
e
s
(
8
c
o
r
e
s
)
8
n
o
d
e
s
(
8
c
o
r
e
s
)
0
5000
10000
15000
20000
25000
30000
0
5
10
15
20
25
30
Ru
n
n
i
n
g
t
i
m
e
(
s
e
c
o
n
d
s
)
Di
g
i
t
s
o
f
t
h
e
p
r
i
m
e
n
u
m
b
e
r
Wo
r
k
l
o
a
d
-
di
s
t
r
i
but
i
o
n
s
t
r
a
t
e
g
y
1
2
n
o
d
e
s
(
8
c
o
r
e
s
)
4
n
o
d
e
s
(
8
c
o
r
e
s
)
8
n
o
d
e
s
(
8
c
o
r
e
s
)
0
5000
10000
15000
20000
25000
30000
0
5
10
15
20
25
30
Ru
n
n
i
n
g
t
i
m
e
(
s
e
c
o
n
d
s
)
Di
g
i
t
s
o
f
t
h
e
p
r
i
m
e
n
u
m
b
e
r
Wo
r
k
l
o
a
d
-
di
s
t
r
i
but
i
o
n
s
t
r
a
t
e
g
y
3
2
n
o
d
e
s
(
8
c
o
r
e
s
)
4
n
o
d
e
s
(
8
c
o
r
e
s
)
8
n
o
d
e
s
(
8
c
o
r
e
s
)
0
5000
10000
15000
20000
25000
30000
0
5
10
15
20
25
30
Ru
n
n
i
n
g
t
i
m
e
(
s
e
c
o
n
d
s
)
Di
g
i
t
s
o
f
t
h
e
p
r
i
m
e
n
u
m
b
e
r
Wo
r
k
l
o
a
d
-
di
s
t
r
i
but
i
o
n
s
t
r
a
t
e
g
y
4
2
n
o
d
e
s
(
8
c
o
r
e
s
)
4
n
o
d
e
s
(
8
c
o
r
e
s
)
8
n
o
d
e
s
(
8
c
o
r
e
s
)
Figure
6.
Running
times
f
or
v
ar
ious
n
umbers
of
cores
per
node
is
f
or
the
eff
ect
to
be
so
strong
(1
core
per
node
is
f
aster
more
than
twice
compared
to
4
cores
per
node).
This
means
that
an
HPC
with
single-core
nodes
will
produces
e
v
en
better
results
.
5.
Conc
lusion
In
this
paper
,
w
e
proposed
a
par
allelization
technique
based
on
message
passing
par
al-
lelism
f
o
r
AKS
algor
ithm.
W
e
also
de
v
eloped
f
our
w
or
kload-distr
ib
utio
n
str
ategies
f
or
this
par-
allelization
technique
.
F
rom
the
e
xper
iments
w
e
ha
v
e
conducted,
w
e
conclude
that
dynamic
w
or
kload-distr
ib
ution
str
ategy
is
the
most
load-balanced
one
.
Fur
ther
more
,
the
diff
erence
be-
tw
een
the
dynamic
str
ategy
and
static
str
ategies
is
so
significant
that
it
is
difficult
to
en
vision
cir-
cumstances
when
one
wishes
to
use
the
static
ones
.
Ov
er
all,
the
dynamic
str
ategy
can
achie
v
e
a
speedup
of
up
to
36
times
the
sequential
computation.
Ne
v
er
theless
,
the
dynamic
str
ategy
has
one
ob
vious
dr
a
wbac
k,
namely
the
bottlenec
k
in
the
comm
unication
line
to
w
ards
the
master
node
.
The
more
nodes
in
v
olv
ed
in
the
par
allelism,
the
b
usier
the
master
node
and
the
hea
vier
the
comm
unication
line
to
w
ards
the
master
node
.
W
e
did
not
manage
to
demonstr
ate
this
due
to
the
limited
siz
e
of
t
he
HPC
a
v
ailab
le
to
us
.
W
e
also
sho
w
ed
that
the
n
umber
of
cores
per
node
has
a
strong
eff
ect
f
or
the
dynamic
w
or
kload-distr
ib
ution
str
ategy
.
Ac
kno
wledg
ement
The
authors
w
ould
lik
e
to
thank
Director
ate
of
Inf
or
mation
System
and
Resources
(DSSDI)
of
Univ
ersitas
Gadjah
Mada
f
or
pro
viding
the
high-perf
or
mance
computing
ser
vice
used
in
this
re-
search.
TELK
OMNIKA
V
ol.
15,
No
.
4,
December
2017
:
1884
1893
Evaluation Warning : The document was created with Spire.PDF for Python.
TELK
OMNIKA
ISSN:
1693-6930
1893
Ref
erences
[1]
R.
L.
Riv
est,
A.
Shamir
,
and
L.
Adleman,
“A
method
f
or
obtaini
ng
digital
signatures
and
pub
lic-k
e
y
cr
yptosystems
,
”
Comm
unications
of
the
A
CM
,
v
ol.
21,
no
.
2,
pp
.
120–126,
F
eb
.
1978.
[2]
H.
Y
ang,
“Research
on
design
method
based
on
hardw
are
encr
yption
and
tw
o-w
a
y
id
au-
thentication
f
or
secur
ity
mobile
hard
disk,
”
TELK
OMNIKA
,
v
ol.
12,
no
.
4,
pp
.
300
1–3009,
2014.
[3]
Z.
Qi,
“Secure
digital
cer
tificate
design
based
on
the
pub
lic
k
e
y
cr
yptog
r
aph
y
algor
ithm,
”
TELK
OMNIKA
,
v
ol.
11,
no
.
12,
pp
.
7366–7372,
2013.
[4]
A.
Muzakir
and
A.
Ashar
i,
“Rancang
bangun
k
eamanan
w
eb
ser
vice
dengan
metode
ws-
secur
ity
,
”
IJCCS
(Indonesian
Jour
nal
of
Computing
and
Cyber
netics
Systems)
,
v
ol.
6,
no
.
1,
pp
.
1–10,
2012.
[5]
EFF
,
“EFF
off
ers
cooper
ativ
e
computing
p
r
iz
es
,
”
2009,
last
accessed:
2017-03-19.
[Online].
A
v
ailab
le:
https://www
.eff
.org/a
w
ards/coop
[6]
G.
L.
Miller
,
“Riemann’
s
h
ypothesis
and
tests
f
or
pr
imality
,
”
Jour
nal
of
Computer
and
System
Sciences
,
v
ol.
13,
no
.
3,
pp
.
300–317,
1976.
[7]
M.
O
.
Rabin,
“Probabilistic
algor
ithm
f
or
testing
pr
imality
,
”
Jour
nal
of
Number
Theor
y
,
v
ol.
12,
no
.
1,
pp
.
128–138,
1980.
[8]
C
.
Dong,
“Math
in
netw
or
k
secur
ity:
A
cr
ash
course
,
”
2016,
last
accessed:
2017-03-19.
[Online].
A
v
ailab
le:
http://www
.doc.ic.ac.uk/
mrh/330tutor/
[9]
R.
Solo
v
a
y
and
V
.
Str
assen,
“A
f
ast
Monte-Car
lo
test
f
or
pr
imality
,
”
SIAM
Jour
nal
on
Comput-
ing
,
v
ol.
6,
no
.
1,
pp
.
84–85,
1977.
[10]
M.
Ag
r
a
w
al,
N.
Ka
y
al,
and
N.
Sax
ena,
“PRIMES
is
in
P,
”
Annals
of
Mathematics
,
v
ol.
2,
pp
.
781–793,
2002.
[11]
L.
K.
Nemana
and
V
.
C
.
V
enkaiah,
“An
empir
ical
study
to
w
ards
refining
the
AKS
pr
imality
testing
algor
ithm,
”
IA
CR
Cr
yptology
ePr
int
Archiv
e
,
v
ol.
2016,
pp
.
362–387,
2016.
[12]
A.
Gr
an
ville
,
“It
is
easy
to
deter
mine
whether
a
giv
en
integer
is
pr
ime
,
”
Bulletin
of
the
Amer
i-
can
Mathematical
Society
,
v
ol.
42,
no
.
1,
pp
.
3–38,
2005.
[13]
H.
W
.
Lenstr
a,
“Pr
imality
testing
with
cyclotomic
r
ings
,
”
Mathematic
Institute
,
Univ
ersity
of
Leiden,
T
ech.
Rep
.,
2002.
[14]
D
.
Ber
nstein,
“Pro
ving
pr
imality
after
Ag
r
a
w
al-Ka
y
al-Sax
ena,
”
Depar
tment
of
Mathematics
,
Statistics
,
and
Computer
Science
,
Univ
ersity
of
Illinois
,
T
ech.
Rep
.,
2003.
[15]
H.
W
.
Lenstr
a,
“Pr
imality
testing
with
Gaussian
per
iods
,
”
in
FST
TCS
2002:
F
oundations
of
Softw
are
T
echnology
and
Theoretical
Computer
Science
,
22nd
Conf
erence
Kanpur
,
India,
December
12-14,
2002,
Proceedings
,
ser
.
Lecture
Notes
in
Computer
Science
,
M.
Ag
r
a
w
al
and
A.
Seth,
Eds
.,
v
ol.
2556.
Spr
inger
,
2002,
pp
.
1–45.
[16]
H.
W
.
Lenstr
a
and
C
.
P
omer
ance
,
“Pr
imality
testing
with
Gaussian
per
iods
,
”
Depar
tment
of
Mathematics
,
Univ
ersity
of
Dar
tmouth,
T
ech.
Rep
.,
2011.
[17]
D
.
Ber
nstein,
“Pro
ving
pr
imality
in
essentially
quar
tic
r
andom
time
,
”
Mathematics
of
compu-
tation
,
v
ol.
76,
no
.
257,
pp
.
389–403,
2007.
[18]
R.
E.
Cr
andall
and
J
.
S
.
P
apadopoulos
,
“On
the
implementation
of
AKS-class
pr
imality
tests
,
”
Univ
ersity
of
Mar
yland
College
P
ar
k,
T
ech.
Rep
.,
2003.
[19]
H.
Li,
“The
analysis
and
implementation
of
the
AKS
algor
ithm
and
its
impro
v
ement
algo-
r
ithms
,
”
Master’
s
thesis
,
Depar
tment
of
Computer
Science
,
Univ
ersity
of
Bath,
2007.
[20]
V
.
Menon,
“Deter
ministic
pr
imality
testing
-
understanding
the
AKS
algor
ithm,
”
CoRR
,
v
ol.
abs/1311.3785,
2013.
[21]
Z.
Cao
,
“A
note
on
the
stor
age
requirement
f
or
AKS
pr
imality
testing
algor
ithm,
”
IA
CR
Cr
yp-
tology
ePr
int
Archiv
e
,
v
ol.
2013,
pp
.
449–452,
2013.
[22]
C
.
K.
Caldw
ell,
“Th
e
pr
ime
pages:
pr
ime
n
umber
research,
records
,
and
resources
,
”
2017,
last
accessed:
2017-03-19.
[Online].
A
v
ailab
le:
https://pr
imes
.utm.edu
A
Load-Balanced
P
ar
allelization
of
AKS
Algor
ithm
(Y
udha
and
Pulungan)
Evaluation Warning : The document was created with Spire.PDF for Python.