Indonesian Journal of Electrical Engineering and Computer Science
Vol. 19, No. 2, August 2020, pp. 1036–1047
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v19i2.pp1036-1047

Countermeasures against darknet localisation attacks with packet sampling

Masaki Narita, Keisuke Kamada, Kanayo Ogura, Bhed Bahadur Bista, Toyoo Takata
Graduate School of Software and Information Science, Iwate Prefectural University, Japan

Article Info
Article history: Received Jan 2, 2020; Revised Mar 3, 2020; Accepted Mar 17, 2020
Keywords: Darknet monitoring, Localisation attack, Packet sampling, Security

ABSTRACT
A darknet monitoring system consists of network sensors widely deployed on the Internet to capture incoming unsolicited packets. A goal of this system is to analyse captured malicious packets and provide effective information to protect regular, non-malicious Internet users from malicious activities. To provide effective and reliable information, the location of sensors must be concealed. However, attackers launch localisation attacks to detect sensors in order to evade them. If the actual location of sensors is revealed, it is almost impossible to identify the latest tactics used by attackers. Thus, in a previous study, we proposed a packet sampling method, which samples incoming packets based on an attribute of the packet sender, to increase tolerance to a localisation attack and maintain the quality of information publicised by the system. We were successful in countering localisation attacks, which generate spikes on the publicised graph to detect a sensor. However, in some cases, with the previously proposed sampling method, spikes were clearly evident on the graph. Therefore, in this paper, we propose advanced sampling methods such that incoming packets are sampled based on multiple attributes of the packet sender. We present our improved methods and show promising evaluation results obtained via a simulation.
Copyright © 2020 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:
Masaki Narita,
Graduate School of Software and Information Science,
Iwate Prefectural University,
152-52 Sugo, Takizawa, Iwate 020-0693, Japan.
Email: narita m@iwate-pu.ac.jp

1. INTRODUCTION
The Internet has become an indispensable resource for most people. However, various malware is deployed to cause cyberattacks that seriously threaten the safe and reliable use of the Internet, for example, stealing confidential personal data or launching Denial-of-Service (DoS) attacks on specific corporate enterprises to hinder their ability to provide services [1]. A Symantec report [2] shows that 4,818 unique websites were compromised every month in 2018. With data from a single credit card being sold for up to $45 on underground markets, just 10 credit cards stolen from each compromised website could result in a yield of up to $2.2 million for cybercriminals each month. Cyberattacks primarily exploit software vulnerabilities [3, 4]. Thus, it is essential to find and manage such vulnerabilities at an early stage to prevent various types of damage, e.g., the disclosure of confidential information or financial harm to both corporate enterprises and general users.
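
The monthly yield quoted from the Symantec figures can be checked with a back-of-the-envelope product. This is only a sketch of the arithmetic, under the assumption (suggested by the wording but not spelled out in the source) that the 10 stolen cards are counted per compromised website:

$$4{,}818 \ \text{websites} \times 10 \ \text{cards} \times \$45 = \$2{,}168{,}100 \approx \$2.2\ \text{million per month}$$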

Darknet monitoring systems are developed to find software vulnerabilities and identify the latest methods used by attackers. An overview of a darknet monitoring system is shown in Figure 1. A darknet monitoring system comprises multiple network devices, i.e., sensors, that are deployed in unused IP address space on the Internet. Note that these sensors are configured to capture only incoming packets; they do not provide any outgoing services. Such sensors may receive unsolicited packets when they are directly connected to the Internet.

Journal homepage: http://ijeecs.iaescore.com

Sensors distributed in darknet space capture incoming unsolicited packets. Typically, legitimate packets do not arrive in unused address space, i.e., a darknet. Thus, we can infer that packets arriving in these unused address spaces, which are valid and routable, are likely malicious. Organisations that operate darknet monitoring systems collect and analyse these malicious packets. Usually, the analysis results, which can provide effective information to protect regular, non-malicious Internet users from malicious activities, are made publicly available.

Figure 1. Overview of darknet monitoring system (diagram labels: Users, Data Center, Sensors, Malicious Packets, Monitored Results)

To provide effective and reliable information, the location of darknet sensors must be concealed because attackers launch localisation attacks to detect the sensors' IP addresses in order to evade them. If the IP addresses are revealed to attackers, it is almost impossible to identify and analyse new malicious attack methods [5]. In addition, the sensors themselves may be targeted by DoS attacks. When attackers initiate a localisation attack, they masquerade as general Internet users. To detect sensors, they misuse information provided by organisations operating a darknet monitoring system. Thus, if organisations release time-series graphical information without taking countermeasures to protect the information, attackers can use the publicised graph to detect a sensor.

In a previous study, we proposed a packet sampling method, which samples captured packets based on an attribute of the packet sender, to increase localisation attack tolerance and to ensure the quality of information provided by a darknet monitoring system [6, 7]. We were relatively successful in countering localisation attacks, which generate spikes on the publicised graph to detect a sensor. However, in some cases, unexpected spikes that could indicate the location of a sensor appeared in the graph. Therefore, here, we propose advanced sampling methods whereby incoming packets are sampled based on multiple attributes of the packet sender. Note that the proposed methods exploit knowledge gained from our previous work. A performance evaluation was conducted by simulating attackers' tactics and applying the proposed methods. We used actual captured packets provided by the NICTER [8, 9] darknet project operated by the Japanese National Institute of Information and Communications Technology (NICT). In addition, we discuss the degradation of publicised information caused by sampling captured packets compared to the no-sampling case. In this paper, we present our improved methods and show promising evaluation results. This paper is a revised and expanded version of a paper entitled "A Study of Packet Sampling Methods for Protecting Sensors Deployed on Darknet" presented at the 19th International Conference on Network-Based Information Systems (NBiS 2016) [7].

2. RELATED WORK
A wide variety of darknet monitoring systems are operated all over the world to identify and understand the latest attack trends on the Internet [10, 11]. The NICTER project operated by NICT has many user interfaces. For example, Cube draws a cubical object in the centre of a window and maps captured packets on the object based on the source and destination of the arriving packets. Atlas maps captured packets on a world map and indicates in real time the countries that send malicious packets to Japan. In addition, NICTER counts captured packets by country of origin and classifies packets as TCP or UDP. This information is available on the NICTER Web [12]. As another example, the Japanese National Police Agency operates @police [13], a darknet monitoring system that collects firewall and intrusion detection system log data at the network entry gateway of institutions affiliated with the police. @police provides security-related information in the form of time-series graphs and tables on its website. DShield [14], a community-based firewall log correlation system, recruits volunteers from across the world. The volunteers provide firewall logs that are used to analyse attack trends. DShield is the world's largest darknet community.
The Cooperative Association for Internet Data Analysis (CAIDA) [15, 16] manages the University of California, San Diego's Network Telescope Internet traffic monitoring system. This system occupies an entire globally routed /8 network that comprises approximately 1/256 of all IPv4 Internet addresses. The system captures all incoming packets to that address space.

However, attackers launch localisation attacks against darknet monitoring systems to detect and bypass sensors. Shinoda et al. [17] and Bethencourt et al. [18] reported that, to detect sensors, an attacker can first send a large number of probing packets over a short period to a suspicious network that may include a sensor. Subsequently, attackers confirm the presence of sensors if sharp spikes appear on the time-series graph publicised by the targeted system. Yu et al. [19–21] and our work [22] introduced a localisation attack inspired by spread spectrum technology. This method increases attack stealth and can detect sensors with few probing packets by sending probing packets synchronised with the value of a pseudo-noise (PN) code sequence.

If the IP addresses of deployed sensors are revealed by a localisation attack, attackers can bypass the sensors intentionally and launch malicious attacks on the Internet. Consequently, identifying the latest attack trends becomes difficult. In addition, the sensors themselves may be exposed to persistent DoS attacks. Basically, attackers infer sensor IP addresses by investigating the publicly available data provided by agencies that operate darknet monitoring systems. In short, attackers put a mark on upcoming publicised data and verify the presence of the mark to detect a sensor. If countermeasures are not implemented, attackers could exploit publicised data and easily detect sensors. Thus, establishing countermeasures against localisation attacks is imperative.

Viecco indicated and emphasised the usefulness of packet sampling techniques for captured packets as a countermeasure to localisation attacks [23]. However, that work considers only a theoretical framework; it does not propose a functional sampling method and does not include a numerical validation. Thus, we propose a packet sampling method wherein captured packets are sampled based on an attribute of the packet sender. Sampling is essentially a process that selects a representative subset of individuals from an entire population. We ensure that packet sampling alleviates the influence of probing packets sent by attackers on openly publicised data and provides tolerance to a localisation attack.

3. LOCALISATION ATTACK BY GENERATING SPIKES
In this section, we describe the concept of a localisation attack that generates spikes on the publicised graph. This is based on the work by Shinoda et al. [17] and Bethencourt et al. [18]. Shinoda et al. [17] referred to the direct and intentional activity of detecting sensors as marking. They also defined the spikes used for marking as markers. In this paper, we use these terms in the same manner. An overview of the localisation attack is shown in Figure 2, and we explain the procedures in the following.

Figure 2. Overview of localisation attack (diagram labels: Attacker, Address List, Sensors, Data Center, Publicized Monitored Result; steps: 1) Preliminary Survey, 2) Send Marker, 3) Identify Marker, 4) Update Address List)

(a) Attackers first narrow down lists of suspicious target networks that may include a sensor by analysing materials on the web and obtaining handouts distributed in workshops by the organisation operating the target darknet monitoring system. The attackers keep the results of this investigation as lists of IP addresses suspected of acting as sensors on the Internet. Then, the attackers select a range of IP addresses from the lists and initiate a localisation attack by sending markers.
(b) Attackers send markers over a short period to suspicious networks that may include sensors, i.e., marking. If a sensor receives these packets in a short period, a distinctive spike appears in the time-series graph publicised by the target darknet monitoring system.
(c) Attackers masquerading as general, benign Internet users access a website operated by the organisation running the target darknet monitoring system. They verify the presence of a sensor in a target network by identifying the marker trace in the system's publicised time-series graph.
(d) Following the result of procedure (c), the attackers update the lists of suspicious IP addresses. They precisely identify a sensor's IP address by repeating procedures (b) through (d).

The above procedures summarise localisation attacks, which produce spikes on a publicised graph. Since markers are sent as standard packets generated by a port scan, it is difficult to distinguish between markers and other packets. As a result, a darknet monitoring system unintentionally provides instructive feedback to attackers by publicising monitored results based on all captured packets (including markers).
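
The marking steps above can be illustrated with a toy model of a publicised hourly graph. The sketch below is not the simulation used in the evaluation; the function name `inject_markers`, the flat baseline traffic, the marker hours, and the choice of 3,000 packets per marker are all assumptions made for illustration.

```python
def inject_markers(hourly_counts, marker_hours, packets_per_marker):
    """Return a copy of the hourly packet counts with marker bursts added."""
    marked = list(hourly_counts)
    for hour in marker_hours:
        marked[hour] += packets_per_marker
    return marked


# Hypothetical flat background traffic for one week (168 hourly bins).
baseline = [200] * 168
# Four markers spaced 20 hours apart; 3,000 packets per marker is an assumption.
marker_hours = [10, 30, 50, 70]
published = inject_markers(baseline, marker_hours, 3000)

# The attacker compares the published graph with the known sending times:
# a bin that stands far above the background suggests a sensor saw the burst.
visible = [h for h in marker_hours if published[h] > 2 * baseline[h]]
print(visible)
```

Because every captured packet is counted, each burst shows up unchanged in the published series, which is the feedback channel the sampling methods aim to weaken.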

4. PROPOSED PACKET SAMPLING METHODS
We previously proposed a packet sampling method that samples incoming packets based on an attribute of the packet sender to increase tolerance to localisation attacks [6]. Here, building on that work, we propose two advanced sampling methods that sample incoming packets based on multiple attributes of the packet sender.

4.1. Method 1: Sampling captured packets by focusing on arrival time and source IP address
Initially, we determine the timing of packet capture using random numbers. This is similar to our previous work, which uses packet arrival time. Using random numbers makes it difficult for attackers to infer when sensors capture packets. In addition, we keep captured packets in chronological order. This contributes to tolerating localisation attacks. However, successfully preventing a localisation attack is a matter of probability. In the worst case, a sensor may capture all packets sent by attackers because they fall within the time selected by random numbers. Thus, as a second step, we also sample captured packets based on source IP address.

Generally, global IP addresses are administered by a registration system. Users of a global IP address must register a certain amount of identity information with WHOIS [24]. WHOIS information identifies which IP addresses are assigned to which country; thus, it is easy to access information about where captured packets originated. As shown in Figure 3, provided by the Japanese National Police Agency [25], the ratio of source countries of captured packets is greatly biased toward several countries. We assume that attackers manipulate multiple hosts in a botnet to send markers when they attempt a localisation attack. If the botnet comprises exploited hosts whose source IP addresses are biased toward several countries, sampling captured packets uniformly in terms of source countries reduces a spike produced by attackers on a publicised graph, even when the time selected by random numbers contains markers.

Figure 3. Ratio of source country of captured packets (chart legend: CN, US, TW, RU, DE, JP, HK, TR, KR, NL, Others)

The procedure for sampling method 1 is described as follows. We use GeoIP [26] to map the source IP address of captured packets to a country.

Procedure 1: Determine Capture Time:
(a) Configure parameters $m$ and $n$ such that $0 \le m < n \le 60$. Hereafter, $(m, n)$ denotes the range of random numbers.
(b) Divide captured packets into separate data per hour. The data are denoted $\ldots, D_{i-1}, D_i, D_{i+1}, \ldots$. Generate $l$ random numbers for each $D_i$. Note that $l \in [m, n]$.
(c) Generate $l$ distinct random numbers $r_j$ under the following conditions: $1 \le j \le l$ and $0 \le r_j < 60$.
(d) Sample packets that arrived at minute $k$ of each $D_i$, where $k \in \{r_1, r_2, \ldots, r_l\}$.
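
Procedure 1 can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, packets are modelled as `(minute_of_hour, payload)` tuples, and drawing $l$ uniformly from $[m, n]$ is an assumption, since the text only states that $l$ lies in that range.

```python
import random


def choose_capture_minutes(m, n, rng=random):
    """Pick the minutes of one hourly data set D_i during which packets are kept."""
    if not (0 <= m < n <= 60):
        raise ValueError("require 0 <= m < n <= 60")
    # Assumption: the count l is drawn uniformly from [m, n].
    l = rng.randint(m, n)
    # l distinct minute indices r_j with 0 <= r_j < 60.
    return sorted(rng.sample(range(60), l))


def sample_hour(packets, capture_minutes):
    """Keep packets whose arrival minute was selected; arrival order is preserved."""
    chosen = set(capture_minutes)
    return [pkt for pkt in packets if pkt[0] in chosen]
```

A fresh set of minutes would be drawn for every hourly data set, so an attacker cannot predict which minutes of the next hour are recorded.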

Procedure 2: Sample Packets whose Countries are Distributed Equally:
(a) Identify the originating country by referencing the source IP address of a captured packet.
(b) Sample or discard packets based on the result of (a).
  a. Sample a packet if no packet has yet been captured from the identified country.
  b. If a packet has already been captured from the identified country, sample it as long as the packets sent from that country are below the predefined threshold 1; otherwise, discard the packet.
(c) Reset the list of sampled countries every hour.
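
The country-based step can be sketched as a per-hour cap on each country's contribution. The sketch below rests on two assumptions that the text does not fix: the threshold is interpreted as a fraction of the hour's candidate packets, and `country_of` is a hypothetical stand-in for a GeoIP lookup (here a plain dictionary over documentation-range addresses).

```python
from collections import Counter


def sample_by_country(hour_packets, country_of, threshold):
    """Keep packets in arrival order, capping each source country's share.

    Assumption: `threshold` is a fraction of the packets in this hourly set.
    Calling the function once per hour resets the per-country list.
    """
    cap = threshold * len(hour_packets)
    per_country = Counter()
    kept = []
    for pkt in hour_packets:
        country = country_of(pkt)
        # First packet from a country is always kept; later ones only below the cap.
        if per_country[country] == 0 or per_country[country] < cap:
            kept.append(pkt)
            per_country[country] += 1
    return kept


# Hypothetical lookup table standing in for GeoIP.
lookup = {"198.51.100.1": "CN", "203.0.113.7": "US"}
hour = ["198.51.100.1"] * 6 + ["203.0.113.7"] * 2
kept = sample_by_country(hour, lookup.get, 0.25)
```

With eight packets and a threshold of 0.25, each country may contribute at most two packets, so a burst from a single country is flattened while sparsely represented countries pass through untouched.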

4.2. Method 2: Sampling captured packets according to arrival time and time-to-live
In method 2, we initially determine packet capture timing using random numbers (similar to method 1). We then sample captured packets based on Time-to-Live (TTL). TTL is a value set in the IP header to prevent packets from looping endlessly because of a misconfiguration in the network. The TTL value is decremented each time a packet passes through a router. The packet is discarded when the TTL value reaches zero. Generally, the initial TTL value is defined by the operating system, as shown in Table 1.

Table 1. Initial TTL value of major operating systems

Operating System    Initial TTL Value
UNIX                255
Windows             128
Linux               64

Thus, TTL can be used to infer the number of routers that a packet passes through along a communication path, and we can use it as an index of distance in a network. Similar to method 1, if the source distance of captured packets is greatly biased, we believe that sampling captured packets uniformly in terms of the number of network hops reduces a spike produced by attackers on a publicised graph. The procedure of sampling method 2 is described as follows.

Procedure 1: Determine Capture Time:
(a) Configure parameters $m$ and $n$ such that $0 \le m < n \le 60$ ($(m, n)$ denotes the range of random numbers).
(b) Divide captured packets into separate data per hour. The data are denoted $\ldots, D_{i-1}, D_i, D_{i+1}, \ldots$. Generate $l$ random numbers for each $D_i$. Note that $l \in [m, n]$.
(c) Generate $l$ distinct random numbers $r_j$ under the following conditions: $1 \le j \le l$ and $0 \le r_j < 60$.
(d) Sample packets that arrived at minute $k$ of each $D_i$, where $k \in \{r_1, r_2, \ldots, r_l\}$.

Procedure 2: Sample Packets Uniformly in Terms of the Number of Network Hops:
(a) Infer the number of network hops according to the TTL value. If the captured TTL value is $t_1$, we obtain $t_2$ as the smallest initial TTL value listed in Table 1 that is greater than $t_1$. We define the number of network hops as $t_2 - t_1$.
(b) Sample or discard packets based on the result of (a).
  a. If no packet with that number of network hops has yet been captured, sample it.
  b. If a packet with that number of network hops has already been captured, sample it as long as the packets with that number of network hops are below the predefined threshold 2; otherwise, discard it.
(c) Reset the list of the number of network hops every hour.
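
The hop inference and hop-based sampling can be sketched as follows. This is an illustrative sketch under stated assumptions: the initial TTL values are those of Table 1, an observed TTL equal to an initial value is treated as zero hops (a choice made here for robustness, not taken from the text), and the threshold is interpreted as a fraction of the hour's candidate packets. The function names are hypothetical.

```python
from collections import Counter

# Initial TTL values from Table 1 (Linux, Windows, UNIX).
INITIAL_TTLS = (64, 128, 255)


def infer_hops(observed_ttl):
    """Estimate the hop count as t2 - t1, with t2 the nearest initial TTL above t1."""
    if not 0 <= observed_ttl <= 255:
        raise ValueError("TTL must be in 0..255")
    # Assumption: equality with an initial TTL counts as zero hops.
    initial = min(t for t in INITIAL_TTLS if t >= observed_ttl)
    return initial - observed_ttl


def sample_by_hops(hour_ttls, threshold):
    """Keep packets (represented by their TTLs) while capping each hop count's share."""
    cap = threshold * len(hour_ttls)
    per_hops = Counter()
    kept = []
    for ttl in hour_ttls:
        hops = infer_hops(ttl)
        if per_hops[hops] == 0 or per_hops[hops] < cap:
            kept.append(ttl)
            per_hops[hops] += 1
    return kept
```

For instance, observed TTLs of 46, 113 and 243 map to 18, 15 and 12 hops, which are the hop counts listed for botnet A in the evaluation.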

5. PERFORMANCE EVALUATION
5.1. Evaluation of tolerance to localisation attack
1) Assumption: We simulated a localisation attack (Section 3) and evaluated the effectiveness of the proposed methods in counteracting the attack. Simulations were performed using a dataset [27] that includes packets captured in November 2015. The dataset was provided by NICTER (a darknet monitoring system operated in Japan). Assumptions about the target darknet monitoring system and the localisation attack are described in the following.
(a) Assumption about target darknet monitoring system: In this simulation, the target darknet monitoring system captures packets at each port. Then, the system publicises and updates the monitored results at the same time interval. As shown in Figure 4, the transition of the number of packets captured by the system is plotted per hour in a time-series graph spanning one week. Here, the Y-axis is the number of packets, and the X-axis is elapsed time. Note that any general Internet user can access the publicised result; thus, malicious users can access the same information.

Figure 4. Graph publicised by target darknet monitoring system

(b) Assumption about localisation attack: We assume that an attacker sends markers to TCP port 445 as the destination. In this simulation, the number of packets arriving at TCP port 445 draws a relatively smooth graph compared to that of other ports, as shown in Figure 4; thus, we consider that attackers can easily identify their spikes on the publicised graph. In other words, we believe that this port is suitable for a localisation attack. We also assume two groups of botnets, i.e., botnets A and B. These botnets are described in Tables 2 to 5. Botnet A comprises exploited hosts whose source IP addresses and network hops are greatly biased, and botnet B is the group assumed in our previous work, included for comparison. The number of sent packets, the number of markers, the duration of packet sending and the time interval are shown in Table 6.

Table 2. Breakdown of botnet A in terms of source countries

Country (Code)    Number of Hosts
CN                2
US                1
RU                1

Table 3. Breakdown of botnet A in terms of network hops

Number of Network Hops    Number of Hosts
18                        2
15                        1
12                        1

Table 4. Breakdown of botnet B in terms of source countries

Country (Code)    Number of Hosts
CN                7
US                3
TW                2
NL                2
IN                1
KR                1
TR                1
RU                1
FR                1
MX                1

Table 5. Breakdown of botnet B in terms of network hops

Number of Network Hops    Number of Hosts
18                        4
19                        4
15                        2
17                        2
20                        1
21                        1
16                        1
14                        1
22                        1
12                        1
28                        1
27                        1

Table 6. Marking parameters

Destination Port                 445/tcp
Number of Sending Packets        3,000
Number of Markers                4
Time Interval among Markings     20 hours
Duration of Marking              within 1 minute

2) How to Determine Markers by Attackers: After attackers attempt marking, they masquerade as benign Internet users and access a darknet monitoring system to identify their markers on a publicised graph to determine the presence of a sensor. Here, the attackers must determine whether identified markers, i.e., spikes, on the graph were generated by themselves. Note that we assume the attackers determine their markers using an outlier detection method. If the spikes are identified as statistical outliers, the attackers can identify the spikes as their own markers.
In our evaluation, we adopted the most general statistical method, i.e., the $\chi^2$ outlier test.
Figure 5 shows a simulated graph where attackers insert four markers. If organisations operating a darknet monitoring system provide the graph as is without implementing countermeasures, attackers can acquire advantageous feedback, and a sensor's IP address can be discovered easily.

Figure 5. Markers generated by attackers
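
The attacker's decision step can be sketched with a simple outlier rule. Note that the rule below is a plain standard-score threshold chosen for illustration; it is a swapped-in technique and not necessarily the statistical test used in the evaluation. The function name, the cut-off of 3.0, and the synthetic traffic are assumptions.

```python
import statistics


def find_spikes(hourly_counts, z_threshold=3.0):
    """Return the indices of hourly bins that lie far above the mean.

    Assumption: a bin is an outlier if its standard score exceeds z_threshold.
    """
    mean = statistics.fmean(hourly_counts)
    stdev = statistics.pstdev(hourly_counts)
    if stdev == 0:
        return []  # a perfectly flat series has no outliers
    return [hour for hour, count in enumerate(hourly_counts)
            if (count - mean) / stdev > z_threshold]


# Hypothetical week of flat traffic with four injected bursts.
counts = [200] * 168
for hour in (10, 30, 50, 70):
    counts[hour] += 3000
spikes = find_spikes(counts)
```

If the hours flagged this way coincide with the times at which the attacker sent markers, the attacker concludes that the probed network contains a sensor.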

5.2. Information quality evaluation
Sampling is essentially a process that discards a part of the entire, complete information. Thus, it is necessary to consider the degradation of the quality of information provided by a darknet monitoring system when a sampling method is used. Therefore, we evaluated the proposed methods in terms of information quality in the sampling and no-sampling cases. We compared the quality obtained before and after sampling in well-known publicised formats (time-series graph and table).

1) Rate of Concordance of Most Accessed Port: Darknet monitoring systems generally publicise the top 10 most accessed port numbers on their websites in table form. The top 10 most accessed port numbers are
destination ports that attackers intend to exploit. Here, we assumed two sets including the top 10 port numbers: set $X$ represents results obtained without sampling, and set $Y$ represents results with sampling. We then evaluated the rate of concordance of these two sets. To evaluate the concordance rate of the two sets, we computed Simpson's coefficient as follows:

$$\mathrm{Sim} = \frac{|X \cap Y|}{\min(|X|, |Y|)}.$$
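
Simpson's coefficient is straightforward to compute from two port sets. The following is a minimal sketch; the function name and the example port lists are hypothetical and are not taken from the dataset.

```python
def simpson_coefficient(x, y):
    """Overlap of two sets divided by the size of the smaller set."""
    x, y = set(x), set(y)
    if not x or not y:
        raise ValueError("both sets must be non-empty")
    return len(x & y) / min(len(x), len(y))


# Hypothetical top-10 destination ports without and with sampling.
top10_without = [445, 23, 22, 80, 1433, 3389, 8080, 53, 21, 25]
top10_with = [445, 23, 22, 80, 1433, 3389, 8080, 53, 5900, 443]
sim = simpson_coefficient(top10_without, top10_with)
```

Here eight of the ten ports coincide, so the coefficient is 0.8; a value of 1 means the sampled ranking reports exactly the same ports as the unsampled one.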

2) Similarity of Time-series Graph: Darknet monitoring systems publicise the number of packets arriving at each port as a time-series graph on their website. We compared the time-series graphs obtained without sampling, $\{X_u\}$, and with sampling, $\{Y_u\}$. We adopted the Bhattacharyya coefficient $L$ as an index to compute the similarity of these two graphs.

$$L = \sum_{u=1}^{m} \sqrt{X_u Y_u}, \qquad \sum_{u=1}^{m} X_u = \sum_{u=1}^{m} Y_u = 1, \qquad (0 \le L \le 1).$$

The Bhattacharyya coefficient is essentially an index used to compute the similarity of histograms. We define the value of $1 - L$ as the discrepancy of graphs, where two graphs are increasingly similar as the value approaches zero.
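
The discrepancy $1 - L$ can be computed directly from two series of hourly counts. This is a small sketch with hypothetical function names; it normalises each series so that it sums to one, as the definition of $L$ requires.

```python
import math


def bhattacharyya_coefficient(x_counts, y_counts):
    """Bhattacharyya coefficient L of two equally long, non-negative series."""
    if len(x_counts) != len(y_counts):
        raise ValueError("series must have the same length")
    total_x, total_y = sum(x_counts), sum(y_counts)
    if total_x <= 0 or total_y <= 0:
        raise ValueError("each series must have a positive sum")
    # Normalise each bin so that both series sum to 1 before comparing.
    return sum(math.sqrt((x / total_x) * (y / total_y))
               for x, y in zip(x_counts, y_counts))


def discrepancy(x_counts, y_counts):
    """Graph discrepancy 1 - L; values near zero mean similar shapes."""
    return 1.0 - bhattacharyya_coefficient(x_counts, y_counts)
```

Because of the normalisation, a sampled graph that keeps the shape of the original but contains fewer packets still has a discrepancy close to zero.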

6. RESULTS AND DISCUSSION
The evaluation results are shown in Tables 7 to 11. The number of markers in each table represents the number of spikes that attackers successfully identified in the publicised graphs. Table 7 shows the results obtained by our previous method. The other tables show the results obtained by the proposed methods. Compared to the previous method, the proposed sampling methods increased the tolerance to localisation attacks in terms of the number of markers that appear. However, in terms of graph discrepancy and the concordance rate of the most accessed ports, procedure 2, i.e., sampling source countries or network hops uniformly, reduced the information quality. In terms of the concordance rate of the most accessed ports, method 2 obtained better results than method 1. We conclude that the proposed methods can achieve higher tolerance to localisation attacks by reducing the number of sampled packets as the range of random numbers defined in procedure 1 and the threshold defined in procedure 2 become small. However, degradation of information quality occurred.

Table 7. Sampling results of previous work

(m, n)      Number of Markers    Discrepancy (1 − L)    Sim
(10, 20)    1.02                 0.0694                 0.833
(25, 35)    2.08                 0.0352                 0.910
(40, 50)    3.05                 0.0140                 0.949

Table 8. Sampling results of method 1 based on arrival time and source countries against botnet A

(m, n)      Threshold 1    Number of Markers    Discrepancy (1 − L)    Sim
(10, 20)    10%            0.18                 0.0621                 0.742
(10, 20)    15%            0.62                 0.0556                 0.753
(25, 35)    10%            0.10                 0.0509                 0.744
(25, 35)    15%            0.76                 0.0417                 0.771
(40, 50)    10%            0.20                 0.0466                 0.743
(40, 50)    15%            1.82                 0.0312                 0.803

Table 9. Sampling results of method 1 based on arrival time and source countries against botnet B

(m, n)      Threshold 1    Number of Markers    Discrepancy (1 − L)    Sim
(10, 20)    10%            1.00                 0.0621                 0.743
(10, 20)    15%            1.00                 0.0617                 0.750
(25, 35)    10%            1.92                 0.0419                 0.743
(25, 35)    15%            2.24                 0.0372                 0.773
(40, 50)    10%            3.06                 0.0320                 0.743
(40, 50)    15%            3.08                 0.0226                 0.796

Table 10. Sampling results of method 2 based on arrival time and TTL against botnet A

(m, n)      Threshold 2    Number of Markers    Discrepancy (1 − L)    Sim
(10, 20)    10%            0.34                 0.0601                 0.786
(10, 20)    15%            0.60                 0.0556                 0.822
(25, 35)    10%            0.32                 0.0482                 0.839
(25, 35)    15%            1.18                 0.0412                 0.874
(40, 50)    10%            0.38                 0.0434                 0.883
(40, 50)    15%            1.86                 0.0327                 0.912

Table 11. Sampling results of method 2 based on arrival time and TTL against botnet B

(m, n)      Threshold 2    Number of Markers    Discrepancy (1 − L)    Sim
(10, 20)    10%            0.76                 0.0656                 0.789
(10, 20)    15%            1.16                 0.0637                 0.820
(25, 35)    10%            2.00                 0.0422                 0.835
(25, 35)    15%            2.08                 0.0392                 0.873
(40, 50)    10%            3.00                 0.0302                 0.885
(40, 50)    15%            3.04                 0.0245                 0.914

In our previous work, preventing a localisation attack depended on probability. As shown in Figure 6, spikes are further emphasised if a sensor captures all packets sent by attackers because they fall within the time selected by random numbers. This is a major problem that must be addressed. Methods 1 and 2 alleviate this problem: even if sensors capture all packets sent by attackers in procedure 1, procedure 2 samples source countries or network hops uniformly. As shown in Figures 7 and 8, the proposed methods curb markers if the number of hosts managed by attackers is small and their attributes are greatly biased. Figure 9 shows that our method can prevent markers when the threshold defined in procedure 2 is low. If the number of hosts managed by attackers is large and their attributes are unbiased, the proposed methods could alleviate the impact of markers.

Figure 6. Publicised graph produced by our previous method under the conditions of (m, n) = (25, 35)

Figure 7. Publicised graph produced by method 1 to counteract botnet A under the conditions of (m, n) = (25, 35), threshold 1 = 15%

Figure 8. Publicised graph produced by method 2 to counteract botnet A under the conditions of (m, n) = (25, 35), threshold 2 = 13%

Figure 9. Publicised graph produced by method 1 to counteract botnet A under the conditions of (m, n) = (25, 35), threshold 1 = 10%

7. CONSIDERATIONS ON THRESHOLD FLUCTUATIONS AND VERIFICATION OF TRADE-OFF
As demonstrated by the experimental results, as the sampling parameters and the number of packets to be acquired decrease, the resistance to localisation attacks increases; however, the information quality is reduced. Conversely, if the number of packets to be acquired increases, information that is similar to the original information can be obtained. The two are in a trade-off relationship. To verify this trade-off relationship, we performed additional experiments with varying parameters for our proposed methods. In general, similar results were obtained with methods 1 and 2, i.e., resistance to localisation attacks improved as the threshold value decreased. Focusing on the destination port match rate, Table 12 (with a common value for threshold 2) shows that the match rate of the destination port fluctuates. In contrast, there is no change in the match rate of the destination port in Table 13 (with a common value for threshold 1). Comparing the results in Table 14 and Table 15, the destination port match rate also decreases as threshold 1 decreases sharply. From this, we consider that the ratio of source countries has a large influence on the match rate of the destination port. Note that if method 1 is adopted, we need to configure a small random number width (m, n) and control the value of threshold 1 to realise resistance to localisation attacks and maintain high information quality.

Table 12. Proposed method 2: Experimental results (botnet B) when only the random number width (m, n) is varied, for the sampling method focusing on arrival time and TTL

(m, n)      Threshold 2    Number of Markers    Discrepancy (1 − L)    Sim
(10, 20)    10%            0.94                 0.0617                 0.791
(20, 30)    10%            1.64                 0.0467                 0.821
(25, 35)    10%            1.86                 0.0411                 0.840
(40, 50)    10%            2.84                 0.0309                 0.883
(50, 60)    10%            3.64                 0.0245                 0.905

Table 13. Proposed method 1: Experimental results (botnet B) when only the random number width (m, n) is varied, for the sampling method focusing on arrival time and source country

(m, n)      Threshold 1    Number of Markers    Discrepancy (1 − L)    Sim
(10, 20)    10%            1.12                 0.0613                 0.743
(20, 30)    10%            1.66                 0.0469                 0.743
(25, 35)    10%            1.92                 0.0419                 0.743
(40, 50)    10%            2.68                 0.0337                 0.743
(50, 60)    10%            3.66                 0.0275                 0.743