Indonesian Journal of Electrical Engineering and Computer Science
Vol. 21, No. 3, March 2021, pp. 1622-1633
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v21i3.pp1622-1633

What network simulator questions do users ask? A large-scale study of Stack Overflow posts

Syful Islam (1), Yusuf Sulistyo Nugroho (2), Md. Javed Hossain (3)
(1) Nara Institute of Science and Technology, Japan
(2) Universitas Muhammadiyah Surakarta, Indonesia
(1, 3) Noakhali Science and Technology University, Bangladesh

Article Info
Article history: Received Oct 2, 2020; Revised Dec 2, 2020; Accepted Dec 23, 2020
Keywords: Network simulators; Discussion topics; Stack Overflow

ABSTRACT
The use of network simulators as a modern tool for analyzing and predicting the behaviour of computer networks has grown because they reduce the complexity of accuracy measurement. This attracts researchers and practitioners to share problems and discuss ways to improve simulator features. To communicate related issues, users turn to online question-answering platforms. Although recent studies have shown the popularity and benefits of adopting network simulation tools, the challenges users face in using network simulators remain unknown. In this paper, we examine 2,322 network simulator related Stack Overflow question posts to gain insights into the topics and challenges that users have discussed. We adopt the latent Dirichlet allocation (LDA) model to understand the topics discussed on Stack Overflow. We then investigate the popularity and difficulty of each topic. The results show that users use Stack Overflow as an implementation guideline for network simulation models. We identify 8 discussion topics, which are merged into 5 major categories. Simulation model configuration is the most useful topic for users. We also observe that target network protocol modification and network simulator installation are the most popular topics, and that these same two topics are the most challenging for users. The findings also point to future research on ways to help the network simulator community overcome, at an early stage, the popular and difficult topics faced when using network simulation tools.

This is an open access article under the CC BY-SA license.

Corresponding Author:
Syful Islam
Laboratory of Software Engineering
Nara Institute of Science and Technology, Japan
Email: islam.syful.il4@is.naist.jp
Journal homepage: http://ijeecs.iaescore.com

1. INTRODUCTION
Since the growing complexity of communication networks makes it difficult to measure system behavior accurately with traditional analytical techniques [1], network simulators (NS) are used as a modern technique to analyze and predict the behavior of computer networks. In practice, an NS allows users to design, modify, and test networking protocols in a simulation mode. The simulated network is modeled with devices, links, and applications, and the simulator reports the performance of the tested network. Today's application of simulation techniques has attracted researchers and practitioners to discuss and find ways to improve simulator features. To communicate NS-related issues and updates, users turn to online question-answering platforms, such as Stack Overflow (SO), to get help and advice from the community about the technical problems they face. Stack Overflow is an online question-answering platform that accommodates the needs of various users and covers a wide range of topics and programming languages.

A number of studies on the quality and content of online question-answering platforms, such as Stack Overflow, have indicated their significance. Zagalsky et al. [2] reported that SO questions tagged with R have become a communication channel related to the topics discussed in the R software development community forum. Squire [3] showed that measures of quality, user participation, and response time in the SO forum have been the key factors that caused developers to move away from mailing lists. Calefato et al. [4] proposed guidelines for writing a successful question on SO, while Wang et al. [5] analyzed the important factors that affect the time to receive an accepted answer on four Stack Exchange websites. SO discussion topics have also been investigated in several studies. Beyer et al. [6] performed a large-scale empirical study on SO to analyze the topics and current trends that developers discuss, and then automatically classified the posts into seven question categories. Another study, on the topics discussed by mobile developers in SO, was conducted by Rosen and Shihab [7].

Extending prior works that analyze the quality of SO questions and their topics, in this paper we conduct a large-scale empirical study on the topics covered by NS questions posted on SO. We also investigate the types of issues faced by NS users by categorizing the keywords used in the questions. We further study which NS questions are most difficult to answer by calculating the PD score [8]. Based on the analysis of 2,322 NS-related questions from SO, we find that users seeking help in NS-related threads use the SO platform to discuss simulation model configuration, target network protocol modification, NS installation, simulation model performance measurement, and NS build errors.
Although simulation model configuration is the most useful topic amongst NS users, it is not the most popular. NS installation is among the most viewed NS-related discussion topics. Most discussions on Stack Overflow are initially triggered by a how-to type of question. This indicates that NS users usually ask for instructions, guidance, or tutorials to solve their problems. However, although the number of responses is high, NS installation and target network protocol modification are the types of questions for which appropriate answers are most difficult to get.

Our findings show that NS users need a specific discussion forum that can reach the maturity level of similar project-specific discussion platforms, such as the Eclipse forum. Such a forum would allow the most common issues, such as plug-ins and documentation, to be identified, and would suggest to new users how they can address the issues that prevent their entry to the discussion forums [9]. In addition, our insights can guide future research on NS-related discussions in other channels and artefacts in the complex software development environment.

Paper organization: The rest of this paper is organized as follows. Section 2 presents the research methodology. In detail, we describe the research questions, our data collection procedure, and our online appendix. The study results are presented in Section 3 to answer the formulated research questions. The implications of the study are then explained in Section 4. Sections 5 and 6 describe the threats to validity and present the related work, respectively. Finally, we conclude the paper in Section 7.

2. RESEARCH METHOD
This section presents our research questions, the data collection process, and the replication package provided in an online appendix.

2.1. Research questions
This study aims to extract insights into the characteristics of NS-related discussions on the SO platform. To achieve this goal, we formulate three research questions to guide our research. We present each question with its motivation.

RQ1: What kinds of topics are presented in network-simulator related discussions?
Network simulators (NS) are in high demand among network engineers and researchers [10]. Different types of users have various NS-related problems that require different areas of expertise. For example, some users require specific expertise in Tcl scripting, while others have problems with network protocols or design features. Thus, the difficulties faced by users are likely to differ. Since users benefit from question-answering platforms to communicate issues, the objective of this research question is to understand the most useful and popular NS topics that are frequently faced by the NS community. In addition, identifying widely discussed NS topics is the first step in highlighting issues that are gaining more attention.

RQ2: What types of questions do users face?
Taking the results from RQ1, we then set out to empirically study the types of questions asked by users. A previous study [7] shows that users ask questions of different types (i.e., how, what, why). Similar to the approach of that study [7], this analysis is performed to identify the nature of the difficulties encountered while using NS.

RQ3: What topics are the most difficult to answer?
The key motivation of this RQ is to investigate the topics that are difficult to answer. Finding the topics that are hard to answer will help users get more attention from the NS community. Furthermore, it highlights the topics that require better support (tools, frameworks, or official documentation) for addressing difficulties related to NS usage.

2.2. Data Collection
Figure 1 outlines the methodology for collecting the data, which is described as follows. We initially downloaded the latest SO data dump (July 2008 to December 2019), which is publicly available on the SO Torrent [11]. The SO data contains all the questions and answers with their metadata (creation date, favourite count, views, and score). The initially collected dataset contains 46,947,633 posts, of which 39.83% (18,699,426) are question posts and 60.17% (28,248,207) are answer posts.

Step 1: Filter using the #simulator tag. SO posts are typically labeled with relevant tags to improve their visibility. Following an approach similar to that of a previous study [7], we collect the initial NS dataset by filtering posts that contain simulator as a tag word. The output of this step is 1,407 NS posts.

Step 2: Discover relevant tags. In this step, we extract the tags that co-occur with simulator in the 1,407 posts to discover relevant tags. One major risk of discovering relevant tags is the possibility of introducing noise into the main dataset. For example, "JNS java network simulator (JNS) for implementing ns2" is a relevant post that contains the tag word Java along with simulator and ns2, so treating every co-occurring tag (such as Java) as relevant would pull in unrelated posts. Therefore, to mitigate this problem, we group and aggregate the target tags through a semi-automatic process. In detail, the process includes a string search followed by manual verification of newly explored tags. In addition, we validate the tags by applying the tag relevance threshold (TRT) and the tag significance threshold (TST) as metrics:

$$TRT_{tag} = \frac{\#\,\text{tag posts}}{\text{Sum}(\#\,\text{tag posts})} \quad (1)$$

$$TST_{tag} = \frac{\#\,\text{tag posts}}{\#\,\text{popular tag posts}} \quad (2)$$

where #tag posts is the number of NS posts for the tag, Sum(#tag posts) is the total number of posts for the tag, and #popular tag posts is the number of NS posts for the most popular tag. For instance, omnet++ is a tag word that occurs only 11 times as a co-occurring tag in simulator-tagged posts, while the total number of question posts with this tag on SO is 1,406. Therefore, we also included such tags in the final tag set. Thus, the output of Step 2 is 4 manually validated tags (see Table 1) that are representative of NS discussions.
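
As a worked example of Eqs. (1) and (2), the following minimal sketch recomputes TRT and TST from the counts listed in Table 1. It assumes that "#Initial posts" is the number of simulator-tagged posts carrying the tag and that "#Final posts" is the total number of posts for the tag; the function name `tag_thresholds` is illustrative and not part of the replication package.

```python
# Counts copied from Table 1: (posts co-occurring with "simulator", total posts for the tag).
# Reading the two columns this way is an assumption of this sketch.
TAG_COUNTS = {
    "simulator": (1407, 1407),
    "ns2": (15, 599),
    "ns-3": (11, 316),
    "omnet++": (11, 1406),
}


def tag_thresholds(counts, popular_tag="simulator"):
    """Return {tag: (TRT %, TST %)} following Eqs. (1) and (2)."""
    popular_posts = counts[popular_tag][0]
    result = {}
    for tag, (ns_posts, total_posts) in counts.items():
        trt = 100.0 * ns_posts / total_posts    # Eq. (1), as a percentage
        tst = 100.0 * ns_posts / popular_posts  # Eq. (2), as a percentage
        result[tag] = (round(trt, 2), round(tst, 2))
    return result


if __name__ == "__main__":
    for tag, (trt, tst) in tag_thresholds(TAG_COUNTS).items():
        print(f"{tag:10s} TRT={trt:6.2f}%  TST={tst:6.2f}%")
```

Small differences in the last digit relative to Table 1 can arise from rounding versus truncation.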

Step 3: Collect tagged posts. After obtaining the NS-related tag set, we use those tags to identify and extract posts to create the final NS post dataset. The output of Step 3 is 2,322 posts from SO, which are used as the final dataset in the subsequent sections.
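
The tag-based filtering of Steps 1 and 3 can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes each post is a dictionary whose `Tags` field uses the `<tag1><tag2>` form and whose `PostTypeId` is 1 for questions, and the sample records are hypothetical.

```python
import re

NS_TAGS = {"simulator", "ns2", "ns-3", "omnet++"}  # final tag set from Table 1


def parse_tags(raw):
    """Split a tag string such as '<ns2><tcl>' into ['ns2', 'tcl']."""
    return re.findall(r"<([^<>]+)>", raw)


def collect_ns_questions(posts, ns_tags=NS_TAGS):
    """Keep question posts (PostTypeId == 1) that carry at least one NS tag."""
    selected = []
    for post in posts:
        if post["PostTypeId"] != 1:
            continue  # skip answers and other post types
        if ns_tags.intersection(parse_tags(post["Tags"])):
            selected.append(post)
    return selected


# Hypothetical records for illustration only; not taken from the dataset.
SAMPLE_POSTS = [
    {"Id": 1, "PostTypeId": 1, "Tags": "<ns2><tcl>", "Title": "How to trace packets in ns2"},
    {"Id": 2, "PostTypeId": 1, "Tags": "<python><pandas>", "Title": "Merge two frames"},
    {"Id": 3, "PostTypeId": 2, "Tags": "", "Title": ""},
    {"Id": 4, "PostTypeId": 1, "Tags": "<omnet++><c++>", "Title": "INET build fails"},
]

if __name__ == "__main__":
    print([p["Id"] for p in collect_ns_questions(SAMPLE_POSTS)])
```

Passing `{"simulator"}` as `ns_tags` reproduces the Step 1 filter, while the full tag set corresponds to Step 3.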

Step 4: Extract post titles and preprocess for LDA. In this step, we apply a filter to remove irrelevant information. For topic modeling, we focus only on the titles of the question posts, since the body of a post can introduce noise into our analysis. Using an approach similar to prior studies [7] to extract the titles of the posts, we preprocess the data. This includes the removal of emails, newline characters, and stop words using regular expressions [12] and Python NLTK [13]. We subsequently build a bigram model using Gensim [14] and lemmatize the words to map them to their base forms. The output of Step 4 is the NS post title corpus, which is used as the input to the latent Dirichlet allocation (LDA) model.
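
The preprocessing in Step 4 can be approximated with the standard library alone. The sketch below is a simplified stand-in, not the authors' pipeline: it uses a tiny hard-coded stop list instead of the NLTK stop words, a frequency-based pair merge instead of Gensim's bigram model, and it omits lemmatization. All names are illustrative.

```python
import re
from collections import Counter

# Tiny illustrative stop list; the study uses NLTK's English stop words instead.
STOP_WORDS = {"a", "an", "the", "in", "on", "to", "of", "for", "is", "how", "with", "and", "my"}

EMAIL_RE = re.compile(r"\S+@\S+")
TOKEN_RE = re.compile(r"[a-z0-9+#-]+")  # keeps tokens such as "omnet++" and "ns-3"


def clean_title(title):
    """Lower-case a title, drop e-mails and newlines, tokenize, remove stop words."""
    text = EMAIL_RE.sub(" ", title.replace("\n", " ")).lower()
    return [tok for tok in TOKEN_RE.findall(text) if tok not in STOP_WORDS]


def merge_frequent_bigrams(docs, min_count=2):
    """Join adjacent token pairs seen at least min_count times into 'a_b' tokens."""
    pair_counts = Counter(pair for doc in docs for pair in zip(doc, doc[1:]))
    merged_docs = []
    for doc in docs:
        merged, i = [], 0
        while i < len(doc):
            if i + 1 < len(doc) and pair_counts[(doc[i], doc[i + 1])] >= min_count:
                merged.append(doc[i] + "_" + doc[i + 1])
                i += 2  # consume both tokens of the merged pair
            else:
                merged.append(doc[i])
                i += 1
        merged_docs.append(merged)
    return merged_docs


if __name__ == "__main__":
    print(clean_title("How to install ns2 on\nUbuntu? mail me@x.org"))
```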

Step 5: LDA topic modeling. As illustrated in Algorithm 1, in this step we identify the NS topics using the SO post titles. To obtain the topics, we use the LDA technique [15], which was also used by previous studies [7, 16-18].

[Figure 1: flowchart. Data collection panel: SO Torrent; (1) Filter using #simulator tag; (2) Discover relevant tags; (3) Collect tagged posts; Final dataset; Extract post title and pre-process for LDA. LDA topic modelling panel: (1) Select optimal topics number; (2) Categorize posts into topics; (3) Label topics based on keyword and post title; Posts categorized into labeled topic.]
Figure 1. Overview of the methodology of NS study

Table 1. The tag list used to identify and extract NS-related posts. The TRT and TST values are presented in percentages

| Filtered tag | #Initial posts | #Final posts | TRT (%) | TST (%) |
|---|---|---|---|---|
| simulator | 1,407 | 1,407 | 100.00 | 100.00 |
| ns2 | 15 | 599 | 2.50 | 1.06 |
| ns-3 | 11 | 316 | 3.48 | 0.78 |
| omnet++ | 11 | 1,406 | 0.78 | 0.78 |

In this study, we apply the Mallet implementation of the LDA technique [19] to group the NS posts in our dataset based on the keywords that exist in the titles of the posts. To obtain the optimal number of topics k, we perform the modeling process in several iterations. First, we run LDA for the range 0-50 with a step size of 3. Second, we choose the sub-optimal range 4-20 based on the coherence score [20]. Third, we again run the model over the sub-optimal range with a step size of 1 and thus arrive at an optimal number of 8 topics. Finally, we run the model with k = 8 and obtain 8 NS topics with their associated keywords (20 keywords per topic).

Algorithm 1: NS topic modeling using LDA
Input: NS_q_posts = Stack Overflow NS question posts obtained in step-3
Result: Suggested NS topics (k) and keywords
Method:
    NS_all_post_title = Extract titles from NS_q_posts;
    NS_post_title = Preprocess the NS_all_post_title to remove noise;
    for run LDA topic modeling on NS_post_title using a custom range (N1-N2) do
        iterate with step size 3;
        compute coherence score for each topic number;
    end
    select sub-optimal topic range (O1-O2) based on coherence score;
    for run LDA topic modeling on NS_post_title using sub-optimal range (O1-O2) do
        iterate with step size 1;
        compute coherence score for each topic number;
    end
    select optimal topic number k based on coherence score;
    run the LDA topic modeling on NS_post_title for the optimal topic number k
    return Suggested NS topics (k) and keywords
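
The coarse-to-fine search for k in Algorithm 1 can be sketched independently of any LDA library. In the minimal example below, `score_fn` stands for "train LDA with k topics and return its coherence score"; the toy scoring function, the lower bound of 2, and the fixed window around the coarse optimum are assumptions of this sketch (the study instead picks the sub-optimal range 4-20 by inspecting the coherence scores).

```python
def best_k(score_fn, candidates):
    """Return the candidate k with the highest coherence score."""
    return max(candidates, key=score_fn)


def coarse_to_fine_k(score_fn, lo, hi, coarse_step=3, window=3):
    """Two-pass search: coarse grid over [lo, hi], then step 1 around the coarse best."""
    coarse = list(range(lo, hi + 1, coarse_step))
    k_coarse = best_k(score_fn, coarse)
    # Refine with step size 1 in a window around the coarse optimum.
    fine_lo = max(lo, k_coarse - window)
    fine_hi = min(hi, k_coarse + window)
    return best_k(score_fn, range(fine_lo, fine_hi + 1))


def toy_coherence(k):
    """Stand-in for a real coherence measure; peaks at k = 8 by construction."""
    return -abs(k - 8)


if __name__ == "__main__":
    print(coarse_to_fine_k(toy_coherence, 2, 50))
```

In a real run, `score_fn` would be expensive, so caching its results per k would avoid retraining models that both passes evaluate.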

2.3. Online Appendix
We publish the replication package. It contains (1) the NS dataset, (2) Python code, and (3) the results of the study. The package is available at https://github.com/syful-is/Network/simulator.

Table 2. Top 5 discussion topics that relate to network simulators

| Topic name | Top 10 keywords | #Posts |
|---|---|---|
| Simulation model configuration | vein, node, packet, network, message, send, omnet, vehicle, create, node | 999 |
| Target network protocol modification | file, implement, find, add, function, route, set, parameter, code, module | 527 |
| NS installation | installation, error, omnet, omnetpp, variable, window, unable, read, problem, ubuntu | 278 |
| Simulation model performance measurement | simulation, run, time, calculate, work, delay, distance, end, throughput, result | 276 |
| NS build error | omnet, inet, make, project, build, link, error, library, fail, command | 242 |
| Sum | | 2,322 |

3. RESULTS AND DISCUSSION
This section describes the analyses of SO posts and topics to answer the research questions. In detail, we present each research question along with the approach and the results.

3.1. RQ1: What kinds of topics are presented in network-simulator related discussions?
Approach: To answer this RQ, we apply LDA topic modeling to identify topics based on the titles of NS posts, as described in Section 2. We label the topics based on the keywords suggested by LDA and on a manual reading of the top 25 question posts for each topic. Manual analysis of the topic keywords and question posts reveals that some topics have similar meanings and ask similar types of questions, such as those with keywords related to simulation models and network model configuration. While LDA reports these as different topics, they all relate to simulation model configuration. Therefore, some topics are merged and grouped under the same topic name. Thus, we obtain 5 final topic names from the 8 topics suggested by LDA topic modeling.

In addition, we look at the most useful and popular NS topics among users. To identify them, we use three metrics (i.e., score, favorite, and views) that were also used in previous studies [7, 8, 21, 22]. We use the SO tour [23] as the reference for the definition of the metrics.

The average score of the NS question posts. According to the SO tour, members are allowed to up-vote posts that they consider useful. These votes are summarized as the score. We use this score as one of the metrics to measure the usefulness of the post topics.

The average number of times posts are marked as favourite by SO users. This metric is also used to measure the usefulness of the post topics.

The average number of views of the posts by both unregistered and registered users. According to the SO tour, if a question post is viewed by many users, the post is considered popular among them. Therefore, this metric shows the popularity of the topic.
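
The three per-topic averages can be computed with a single pass over the labeled posts. The sketch below is illustrative only: the field names (`topic`, `score`, `favorite`, `views`) and the sample records are hypothetical and do not come from the dataset.

```python
from collections import defaultdict

METRICS = ("score", "favorite", "views")


def topic_averages(posts):
    """Return {topic: {'score': avg, 'favorite': avg, 'views': avg}}."""
    sums = defaultdict(lambda: {"score": 0, "favorite": 0, "views": 0, "n": 0})
    for post in posts:
        acc = sums[post["topic"]]
        for key in METRICS:
            acc[key] += post[key]
        acc["n"] += 1
    return {
        topic: {key: acc[key] / acc["n"] for key in METRICS}
        for topic, acc in sums.items()
    }


# Hypothetical records for illustration only.
SAMPLE = [
    {"topic": "NS installation", "score": 0, "favorite": 0, "views": 700},
    {"topic": "NS installation", "score": 1, "favorite": 0, "views": 500},
    {"topic": "NS build error", "score": 2, "favorite": 1, "views": 300},
]

if __name__ == "__main__":
    for topic, avg in topic_averages(SAMPLE).items():
        print(topic, avg)
```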

Results: LDA topic modeling on SO posts suggests that users mainly discuss 5 NS-related topics. Table 2 shows the 5 topic names, the number of posts for each topic, and the top 10 associated keywords obtained from LDA topic modeling. We find that simulation model configuration is the most common topic discussed by users, followed by target network protocol modification. The other three topics discussed by users are NS installation, simulation model performance measurement, and NS build error.

In the second part of the analysis, we examine the usefulness and popularity of NS topics among users. Table 3 shows that simulation model configuration is the most useful topic based on the average score and favorite count of the posts. This indicates that users find the posts related to this topic most useful. In addition, we find that, based on the view count of posts, target network protocol modification and NS installation are the two most popular topics among users. This indicates that posts related to these topics are the ones most commonly viewed by NS users on SO.

Table 3. Usefulness and popularity of the top 5 discussion topics. The score, favorite, and view counts are presented as averages

| Topic name | Score | Favorite | View |
|---|---|---|---|
| Simulation model configuration | 0.37 | 0.28 | 420.52 |
| NS build error | 0.32 | 0.15 | 477.67 |
| Simulation model performance measurement | 0.32 | 0.21 | 488.59 |
| Target network protocol modification | 0.31 | 0.14 | 634.86 |
| NS installation | 0.18 | 0.09 | 602.08 |

RQ1 Summary: LDA topic modeling on SO posts suggests that users mainly discuss 5 NS-related topics. We find that simulation model configuration is the most useful topic to users. In addition, we observe that target network protocol modification and NS installation are the most popular topics among users.

3.2. RQ2: What types of questions do users face?
Approach: To examine the questions faced by users, we apply the same approach as previous studies to identify the types of posts in SO [7, 8, 24]. We use two steps to obtain the results. First, we manually investigate 30 randomly sampled questions using keywords (i.e., how, what, and why). We observed that some questions ask for instructions without using the "how-to" keyword. For example, "Is there anyway to change coverage in wireless node?" is a "how-to" type NS question post. Therefore, after manual investigation, we also include "Is there anyway" as a search string to identify "how-to" questions. In the same way, we extend the keyword list used to classify posts into question types (i.e., how, what, and why). Finally, we apply the keyword list to the post title and body to classify the questions. The categories used in labelling NS question posts are as follows:

How - a question type that asks for instructions to perform a task. For example, "how to draw xgraph in satcom in ns2 simulation". This question asks for instructions on the xgraph feature in ns2 simulation.

What - a question type that asks for information that is more abstract or conceptual in nature, asks for decision help, or asks about non-functional requirements. For example, "What network simulation model should I use for simulating the behavior of an ad-hoc network in OMNET++". This question asks which network simulation model to use for predicting the behavior of an ad-hoc network in OMNeT++.

Why - a question post that asks for a review, reason, or cause of something. For example, "Why is the following tcl script for NS2 gives error for procedure implementation?". This question asks for clarification of why an error has happened.

Others - a question post that cannot be classified by keyword search in the title and body of the post. For example, "#includes in OMNeT++ Unit Tests".
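
A keyword-based classifier of this kind can be sketched in a few lines. The patterns below are illustrative and cover only the keywords mentioned in the text; the study's full keyword list is not reproduced here, and the precedence order (how-to, then what, then why) is an assumption of this sketch.

```python
import re

# Illustrative patterns only; the first matching type wins (assumed precedence).
TYPE_PATTERNS = [
    ("how-to", re.compile(r"\bhow\b|\bis there any ?way\b", re.IGNORECASE)),
    ("what", re.compile(r"\bwhat\b", re.IGNORECASE)),
    ("why", re.compile(r"\bwhy\b", re.IGNORECASE)),
]


def classify_question(title, body=""):
    """Return the first matching type for the title plus body, else 'others'."""
    text = f"{title} {body}"
    for label, pattern in TYPE_PATTERNS:
        if pattern.search(text):
            return label
    return "others"


if __name__ == "__main__":
    examples = [
        "Is there anyway to change coverage in wireless node?",
        "how to draw xgraph in satcom in ns2 simulation",
        "What network simulation model should I use for simulating the behavior of an ad-hoc network in OMNET++",
        "Why is the following tcl script for NS2 gives error for procedure implementation?",
        "#includes in OMNeT++ Unit Tests",
    ]
    for title in examples:
        print(classify_question(title), "<-", title)
```

Because the body is searched as well as the title, a long body that happens to contain "how" would be labeled how-to under this precedence, which is one reason the keyword list needs manual validation.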

Results: Table 4 shows that most NS posts in each topic ask for instructions to perform a specific task. This is indicated by the high percentage of how-to questions, ranging from 55.80% to 78.06%. The NS installation topic has the highest share of how-to questions (78.06%), showing the need for rich guidance resources to install and manage NS tools. Simulation model configuration has the highest share of what-type posts (12.02%). This suggests the need for general information about the supported features for simulation model configuration in NS. Finally, simulation model performance measurement has the highest share of why-type posts (2.17%). This suggests the need for discussion forums and improved documentation on simulation model performance measurement issues.

RQ2 Summary: Results show that users mainly ask how-to questions, followed by what and why questions, respectively. In addition, we find that NS installation is the topic most dominated by requests for instructions (i.e., how-to). This indicates the need for guidance on reliably installing NS tools.

Table 4. Different issues of the top 5 topics faced by developers. The values are presented in percentages

| Topic name | how-to | what | why | others |
|---|---|---|---|---|
| NS installation | 78.06 | 4.68 | 1.43 | 15.83 |
| NS build error | 70.25 | 7.85 | 0.00 | 21.90 |
| Target network protocol modification | 64.71 | 8.92 | 1.33 | 25.05 |
| Simulation model configuration | 56.66 | 12.02 | 0.90 | 30.43 |
| Simulation model performance measurement | 55.80 | 9.05 | 2.17 | 32.97 |

[Figure 2: plot. x-axis: Year (2008 to 2018); y-axis: # Question post (0 to 200). Series: Simulation model configuration, Target network protocol modification, NS installation, Simulation model performance measure, NS build error.]
Figure 2. NS topic categories evolution over time

3.3. RQ3: What topics are the most difficult to answer?
Approach: To answer this RQ, we investigate the difficulty of each NS topic using four metrics that were also used in previous studies [7, 8]. For the first three metrics, we collect the metadata of each topic, namely the answer count, accepted answer count, and comment count, and compute their average values. For the fourth metric, the Probability of Difficulty (PD) score, we extract the answer count and view count of each topic. We then calculate the averages (i.e., avg. answer count and avg. view count) to obtain the PD score, formulated as:

$$PD_{score} = \frac{\text{Average Answer Count}}{\text{Average View Count}} \times 100\% \quad (3)$$

In general, a high number of views on a topic but a small number of answers indicates that only a small number of people can answer the topic's questions. Therefore, we adopt the PD score to measure the difficulty of a topic. The lower the PD score, the harder it is to answer questions in that NS-related discussion topic.
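
A minimal sketch of Eq. (3) and the resulting difficulty ranking is given below. The function names and the sample averages are hypothetical. The example follows the formula as written; the values in Table 5 may have been computed from unrounded or per-post data, so this sketch is not expected to reproduce them exactly.

```python
def pd_score(avg_answer_count, avg_view_count):
    """PD score in percent, per Eq. (3): average answers over average views."""
    if avg_view_count <= 0:
        raise ValueError("avg_view_count must be positive")
    return 100.0 * avg_answer_count / avg_view_count


def rank_by_difficulty(topic_stats):
    """Sort topics from hardest (lowest PD) to easiest (highest PD)."""
    scored = {topic: pd_score(ans, views) for topic, (ans, views) in topic_stats.items()}
    return sorted(scored, key=scored.get)


# Hypothetical averages for illustration: (avg. answers, avg. views).
SAMPLE_STATS = {
    "topic A": (1.0, 500.0),
    "topic B": (1.0, 250.0),
    "topic C": (0.5, 1000.0),
}

if __name__ == "__main__":
    for topic in rank_by_difficulty(SAMPLE_STATS):
        print(topic, round(pd_score(*SAMPLE_STATS[topic]), 3))
```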

Results: Table 5 shows the difficulty measures of the NS topics. We find that the NS topics in this analysis do not differ significantly in terms of the avg. answer count (0.90-0.99), avg. accepted answer count (0.35-0.38), and avg. comment count (1.19-1.71). Hence, we use the PD score to determine the most difficult topic to answer. As shown in Table 5, NS installation is the most difficult topic faced by users, followed by target network protocol modification. Although NS build error and simulation model performance measurement are not as difficult as the top two topics, they have a similar level of difficulty according to our analysis. Finally, simulation model configuration is the least difficult topic to answer, with a PD score of 0.095%.

Table 5. Difficulty measure of NS topics. The numbers of answers, accepted answers, and comments are presented as averages, while the PD score is presented as a percentage

| Topic name | Answers | Accepted answers | Comments | PD (%) |
|---|---|---|---|---|
| NS installation | 0.99 | 0.35 | 1.71 | 0.057 |
| Target network protocol modification | 0.92 | 0.38 | 1.19 | 0.060 |
| NS build error | 0.93 | 0.35 | 1.27 | 0.073 |
| Simulation model performance measurement | 0.96 | 0.36 | 1.22 | 0.074 |
| Simulation model configuration | 0.90 | 0.35 | 1.26 | 0.095 |

RQ3 Summary: NS installation and target network protocol modification are the two most difficult NS-related topics to answer, with PD scores of 0.057% and 0.060%, respectively.

4. IMPLICATIONS
The results of this study will help NS users better understand and focus on the most pressing NS issues. This section describes how our results can help practitioners, researchers, and software development projects.

4.1. Practitioners
According to Table 5, questions on one of the most popular topics, how to install NS, have a lower probability of receiving an accepted answer. The NS community can use these findings to devise better tutorials (e.g., video tutorials and documentation) to reduce the barrier to NS usage. Our findings can also help the NS community prioritize tasks by considering the areas covered by the difficult NS topics while performing experiments. NS installation, NS build error, and simulation model configuration are the topics with the highest share of questions without accepted answers. NS developers can take these issues into account to improve the user experience. In addition, Figure 2 shows the evolution of the NS topic categories over time. This trend hints at the need for a dedicated online blog for each NS that can help to create a large community and improve the NS usage experience.

4.2. Researchers
Our large-scale empirical research provides an overall view of the NS-related topics being discussed on the SO platform. We found five NS-related topics discussed in SO, namely simulation model configuration, target network protocol modification, NS installation, simulation model performance measurement, and NS build error. We also highlight the most popular and difficult NS topics. Therefore, we encourage researchers to develop techniques to help NS users answer these difficult questions.

4.3. Software development projects
The findings of this study recommend that software development projects consider preparing project-specific discussion forums. As observed in the study, the number of NS-related discussions in SO has increased over time, but the attention from the community is not high. As a result, most of the problems discussed remain unsolved. The results indicate that preparing project-specific discussion forums is important. Sharing project-specific problems in project-related community forums will increase the probability of getting solutions. This will also enable developers to provide community-related information or announcements. Tantisuwankul et al. [25] reported that software projects tend to adopt communication channels for both capturing new knowledge and updating existing knowledge. Since information and announcements that specifically relate to a software project are important to share amongst the community, providing an NS-specific discussion forum is necessary.

5. THREATS TO VALIDITY
There are several threats that may affect the validity of the NS study. This section describes the threats to validity in detail.

5.1. Construct validity
Threats to construct validity may emerge in our experiments. During the SO question extraction phase, we may miss some NS questions due to the tag-based extraction technique. Since the number of such questions is small, the impact of the missing questions is not significant.

5.2. External validity
Threats to external validity may appear in the data preparation phase. We conducted an empirical study of 2,322 NS questions from Stack Overflow, but the results may not generalize to other online question-answering platforms.

5.3. Reliability
We mitigate the threats to reliability by preparing an online appendix containing the dataset and scripts. This online appendix is described in Section 2.3.

6. RELATED WORK
This section describes NS-related work. First, we review prior research on NS and its application in the network research domain. Second, we discuss SO-based case studies and studies on topic modeling.

6.1. Network simulator
There are several studies on NS. Prior works compared NS tools to lower the barrier to selecting a suitable tool that supports the user's objective [10] and examined their usage on wireless networks [26-29]. In another paper, Campanile et al. conducted a case study to demonstrate the effectiveness of network simulators in real applications and modeling studies [27]. Comparative studies on wireless network simulators were also conducted by Lessman et al. [30] and Korkalainen et al. [31] to help users quickly identify which simulator is most suitable for their needs. Some previous research [32-36] also used NS as the tool to perform simulation work for different wireless network scenarios and intrusion detection. As NS implementations increase in academia and industry, we investigate the problems that users are facing. The results of this study provide the research community with insights into areas that require more attention.
6.2. Stack overflow
SO data have also been analyzed in several studies. Rosen and Shihab [7] summarized mobile-related questions from SO to identify specific issues on various mobile platforms. The SO dataset was also used to understand the challenges faced by chatbot developers [22]. Mahajan et al. [37] proposed a recommendation system to fix run-time exceptions by utilizing the SO dataset. Riccardo et al. [38] utilized the SO dataset in the PostFinder system to support software developers with suitable code snippets. Cai et al. [39] proposed an API recommendation method that also depends on the SO dataset. Uddin et al. [40] proposed an automated system to mine API usage; this study also utilizes SO as a primary data source. As far as we know, no research has been conducted on NS-related SO posts. Our study complements previous work on SO by analyzing NS-related posts. We collected and categorized NS topics and investigated the popularity and difficulty of the topics. We believe that our research sheds light on the areas where NS users are facing challenges.
7. CONCLUSION
To understand the characteristics of NS issues discussed by users, we conducted a large-scale empirical study on 2,322 NS questions posted in SO. In our study, we analyze (i) the types of discussion topics and their popularity, (ii) the types of questions frequently asked by the users, and (iii) the difficulty of topics shared in SO. The results of our study show that simulation model configuration is the most commonly discussed and most useful topic amongst the users, while target network protocol modification and NS installation are the most popular NS-related topics in SO. NS users frequently ask for instructions on NS installation by posting how-to questions. This suggests the importance of providing improved NS installation documentation. Furthermore, the findings show that the most difficult NS-related questions posted in SO concern NS installation and target network protocol modification. This study also shows an increase in NS-related discussion on Stack Overflow. Therefore, there are many open issues for future work, such as a comprehensive understanding of the evolution of NS-related discussions and further study of NS topics on other online discussion platforms.
REFERENCES
[1] V. Stocker, G. Smaragdakis, W. Lehr, S. Bauer, “The Growing Complexity of Content Delivery Networks: Challenges and implications for the Internet Ecosystem,” Telecommunications Policy, vol. 41, no. 10, pp. 1003–1016, 2017.
[2] A. Zagalsky, D. M. German, M. A. Storey, C. G. Teshima, G. Poo-Caamano, “How the R community creates and curates knowledge: An extended study of stack overflow and mailing lists,” Empirical Software Engineering, vol. 23, no. 2, pp. 953–986, Apr. 2018.
[3] M. Squire, “Should we move to stack overflow?: Measuring the utility of social media for developer support,” in Proceedings of the 37th International Conference on Software Engineering - Volume 2, ser. ICSE ’15. Piscataway, NJ, USA: IEEE Press, 2015, pp. 219–228.
[4] F. Calefato, F. Lanubile, N. Novielli, “How to ask for technical help? evidence-based guidelines for writing questions on stack overflow,” Information and Software Technology, vol. 94, pp. 186–207, Feb. 2018.
[5] S. Wang, T.-H. Chen, A. E. Hassan, “Understanding the factors for fast answers in technical Q&A websites,” Empirical Software Engineering, vol. 23, no. 3, pp. 1552–1593, Jun. 2018.
[6] S. Beyer, C. Macho, M. Di Penta, M. Pinzger, “What kind of questions do developers ask on stack overflow? a comparison of automated approaches to classify posts into question categories,” Empirical Software Engineering, vol. 25, no. 3, pp. 2258–2301, 2020.
[7] C. Rosen, E. Shihab, “What are mobile developers asking about? a large scale study using stack overflow,” Empirical Software Engineering, vol. 21, no. 3, pp. 1192–1223, 2016.
[8] X. L. Yang, D. Lo, X. Xia, Z.-Y. Wan, J.-L. Sun, “What security questions do developers ask? a large-scale study of stack overflow posts,” Journal of Computer Science and Technology, vol. 31, no. 5, pp. 910–924, 2016.
[9] N. Kahani, M. Bagherzadeh, J. Dingel, J. R. Cordy, “The problems with eclipse modeling tools: A topic analysis of eclipse forums,” in Proceedings of the ACM/IEEE 19th International Conference on Model Driven Engineering Languages and Systems, ser. MODELS ’16, 2016, pp. 227–237.
[10] M. H. Kabir, S. Islam, M. J. Hossain, S. Hossain, “Detail comparison of network simulators,” International Journal of Scientific and Engineering Research, vol. 5, no. 10, pp. 203–218, 2014.
[11] S. Baltes, L. Dumani, C. Treude, S. Diehl, “Sotorrent: reconstructing and analyzing the evolution of stack overflow posts,” in Proceedings of the 15th International Conference on Mining Software Repositories, MSR 2018, Gothenburg, Sweden, May 28-29, 2018, A. Zaidman, Y. Kamei, and E. Hill, Eds. ACM, 2018, pp. 319–330. [Online]. Available: https://doi.org/10.1145/3196398.3196430.
[12] “Regular expression operations,” accessed 02-01-2020. [Online]. Available: https://docs.python.org/3/library/re.html.
[13] “Python NLTK,” accessed 02-01-2020. [Online]. Available: https://www.nltk.org/.
[14] “Gensim model,” accessed 02-01-2020. [Online]. Available: https://radimrehurek.com/gensim/
[15] D. M. Blei, A. Y. Ng, M. I. Jordan, “Latent dirichlet allocation,” Journal of Machine Learning Research, vol. 3, pp. 993–1022, Jan. 2003.
[16] H. Zhang, S. Wang, T.-H. Chen, A. E. Hassan, “Reading answers on stack overflow: Not enough!” IEEE Transactions on Software Engineering, 2019.
[17] S. Liu, R.-Y. Zhang, T. Kishimoto, “Analysis and prospect of clinical psychology based on topic models: hot research topics and scientific trends in the latest decades,” Psychology, Health & Medicine, pp. 1–13, 2020.
[18] S. Choi, J. Seo, “An exploratory study of the research on caregiver depression: Using bibliometrics and lda topic modeling,” Issues in Mental Health Nursing, 2020, pp. 1–10.
[19] A. McCallum, “AK McCallum, S Thrun, T Mitchell,” Machine Learning, vol. 39, no. 2, pp. 103-134, 2020.
[20] S. Boussaadi, H. Aliane, A. Cerist, P. O. Abdeldjalil, “Modeling of scientists profiles based on lda.”
[21] M. Zahedi, R. N. Rajapakse, M. A. Babar, “Mining questions asked about continuous software engineering: A case study of stack overflow,” in Proceedings of the Evaluation and Assessment in Software Engineering, pp. 41–50, 2020.
[22] A. Abdellatif, D. Costa, K. Badran, R. Abdalkareem, E. Shihab, “Challenges in chatbot development: A study of stack overflow posts,” in Proceedings of the 17th International Conference on Mining Software Repositories, pp. 174–185, 2020.
[23] “SO Tour,” accessed 10-02-2020. [Online]. Available: https://stackoverflow.com/tour