Indonesian Journal of Electrical Engineering and Computer Science
Vol. 13, No. 1, January 2019, pp. 420~426
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v13.i1.pp420-426

Journal homepage: http://iaescore.com/journals/index.php/ijeecs

Flexibility of Indonesian text pre-processing library

Dian Sa’adillah Maylawati1, Hilmi Aulawi2, Muhammad Ali Ramdhani3
1 Department of Informatics, Sekolah Tinggi Teknologi Garut, Indonesia
2 Department of Industrial Engineering, Sekolah Tinggi Teknologi Garut, Indonesia
1,3 Department of Informatics, UIN Sunan Gunung Djati Bandung, Indonesia

Article Info

Article history:
Received Jul 27, 2018
Revised Aug 21, 2018
Accepted Nov 18, 2018

ABSTRACT

This study aimed to achieve and measure flexibility, as a software quality factor, in a text pre-processing library for Indonesian text from social media. The library was built after a review of several text mining applications that did not yet provide dedicated pre-processing for Indonesian text. It was designed and built with a modular, object-oriented approach to achieve flexibility. Flexibility was measured with the McCabe Cyclomatic Complexity (CC) metric and tested by integrating the library into text mining applications. The experiments showed that the library was flexible and could be used in text mining applications without much configuration. This was supported by an average CC value of 2.51, which indicated that the library was not too complex, was simple enough, and was flexible to use.

Keywords:
Flexibility
Software library
Software quality
Text pre-processing

Copyright © 2019 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:
Dian Sa’adillah Maylawati,
Department of Informatics,
Sekolah Tinggi Teknologi Garut, Indonesia.
Email: dsaadillah@sttgarut.ac.id

1. INTRODUCTION
Software quality is one of the important goals of software development. The quality factors to be achieved depend on the needs and objectives of the software being built [1-2]. Various software quality models exist, such as McCall, which marked the beginning of software quality models [3], the Boehm software quality model [4], FURPS [5-6], the International Organization for Standardization (ISO) model [7], the Capability Maturity Model (CMM) [8-9], and other models widely used as quality targets [10-12]. Quality models have evolved with the needs of software; one result is the quality model for libraries or component-based software [13-15].

Text mining is a technique for automatically finding previously unknown, important information by extracting it from text data to obtain useful knowledge [16-21]. One of the problems in text mining is converting unstructured text into a structured representation before mining. The process of preparing the data into a structured representation, until the data are ready for mining, is called the pre-processing stage [22-23].

With the rapid growth of technology needs [24], especially for managing text with text mining techniques, a variety of text mining applications have appeared, such as Weka, Kea, Mallet, LingPipe, GATE, Carrot2, Minor Third, Simmetrics, and so forth. A library that packages pre-processing for text mining can streamline and simplify the mining process. A software library contains modules made up of classes and functions [25]: a collection of functions in functional programming, or class definitions in object-oriented programming. Libraries are usually used for code reuse: specific functions in a library are reused so that program code implementing the same algorithm is not repeated. The output of the pre-processing library is a structured text representation that text mining applications can use in their mining processes.

Because the needs of text mining applications vary, the pre-processing library built in this study had to meet the quality factor of flexibility. Flexibility was achieved by providing abstract classes that can be implemented as needed by text mining applications, and it was measured with the McCabe Cyclomatic Complexity (CC) metric. The McCabe CC metric measures software complexity: the lower the CC value, the simpler, easier, and more flexible the software is to use, modify, and maintain [26-27]. In addition, the flexibility of the library was tested by integrating it into text mining applications.

2. RESEARCH METHOD
This research used the Research and Development (R&D) method [28]. In the research phase, we studied the literature on text pre-processing. In the development phase, we analyzed the needs of text pre-processing, the characteristics of the Indonesian language, and the needs of the library, and then built the library. We built the library with the Waterfall Software Development Life Cycle (SDLC). Waterfall was chosen because it is a simple [10] and systematic SDLC and because the requirements of the text pre-processing library were clearly defined, although many other software development methodologies, such as Agile, are in use today [29]. For modeling the analysis and design, we used the Unified Modeling Language (UML), because UML suits object-oriented analysis and design, which supports modularity and in turn flexibility [30-31].

3. RESULTS AND DISCUSSION
3.1. Flexibility of Software Library
Flexibility is a software quality factor that, according to McCall, belongs to the product revision category and is required to support software maintenance activities [3], [10], [13]. Likewise, in the software component model, flexibility is a sub-quality (sub-characteristic) of maintainability [13]: components can be modified during maintenance activities, and flexible components or software are easier to maintain [27]. Flexibility differs from adaptability, which is a sub-quality of portability [13], [15]. Adaptability is the ability of a component to adapt to different platforms or environments without much modification [13]. Both flexibility and adaptability call for ease of use in a different environment. However, flexibility requires components to be easily modified during maintenance, whereas adaptability requires components to be easily installed in different environments.

A library is a form of software reuse [32], so it needs to be flexible to change in order to reduce maintenance cost. Flexibility has sub-qualities that include modularity and simplicity. Flexible software can be achieved by applying the principles of modularity, because modular software reduces complexity. Modularity can be measured by evaluating the cohesion and coupling of the software module design [10], [15], [33], while simplicity can be measured with complexity metrics. One metric for measuring flexibility and maintainability is the McCabe Cyclomatic Complexity metric [27]. Measuring the complexity of the library therefore also indicates whether the library is easy to maintain.

McCabe cyclomatic complexity measures software complexity based on graph theory, using the cyclomatic number $V(g) = e - n + p$ [26], where $V(g)$ is the cyclomatic number of the graph $g$, $e$ is the number of edges, $n$ is the number of nodes, and $p$ is the number of separate connected components. For the control-flow graph of a single module, complexity is calculated as $V(g) = e - n + 2$, where the extra term accounts for an additional edge from the exit node to the entry node of the module. This form holds when the module is a single connected component, that is, when $p = 1$. According to the Software Engineering Institute, a cyclomatic complexity of 1-10 indicates a simple module without much risk, 11-20 a more complex module with moderate risk, 21-50 a complex module with high risk, and more than 50 an untestable program with very high risk.

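The calculation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the edge-list representation of the control-flow graph, and the if/else example graph are all hypothetical and not taken from the library (which appears to be written in Java). The general form $e - n + 2p$ is McCabe's standard formulation for $p$ components; with $p = 1$ it reduces to the $e - n + 2$ used in the text.

```python
from collections import defaultdict


def cyclomatic_complexity(edges, nodes):
    """Return V(g) = e - n + 2p for a control-flow graph.

    edges: list of (source, target) pairs; nodes: iterable of node names.
    p is the number of connected components, found by treating the
    edges as undirected.
    """
    nodes = set(nodes)
    neighbours = defaultdict(set)
    for src, dst in edges:
        nodes.update((src, dst))          # make sure every endpoint is counted
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    seen, components = set(), 0
    for start in nodes:
        if start in seen:
            continue
        components += 1                   # found a new connected component
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(neighbours[node] - seen)

    return len(edges) - len(nodes) + 2 * components


def risk_band(cc):
    """Map a CC value to the risk bands quoted in the text."""
    if cc <= 10:
        return "simple, low risk"
    if cc <= 20:
        return "more complex, moderate risk"
    if cc <= 50:
        return "complex, high risk"
    return "untestable, very high risk"


# Hypothetical control-flow graph of a single if/else statement.
IF_ELSE_NODES = ["entry", "then", "else", "exit"]
IF_ELSE_EDGES = [
    ("entry", "then"),
    ("entry", "else"),
    ("then", "exit"),
    ("else", "exit"),
]
```

For the if/else graph, e = 4, n = 4, and p = 1, so the function returns 4 - 4 + 2 = 2, which falls in the lowest risk band.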
3.2. Indonesian Text Pre-Processing and Feature Extraction for Structured Text Representation
Text mining is highly dependent on language, because each language has its own characteristics, rules, spelling, and grammar. Texts from social media in particular contain everyday language, abbreviations, and even slang, and so need special handling. It is therefore necessary to understand the grammar of the language, in this study Indonesian, so that pre-processing yields a good representation of the text. Based on previous research, text pre-processing produces a structured text representation that is ready for use in the mining process.

In general, there are two types of structured text representation: single words, better known as the bag of words, and multiple words (n-grams). A bag of words represents a text by collecting all the words in the document without regard to the relationships between words [23], whereas a multiple-word representation collects the words in the document while considering the relationships between them, so that the semantic meaning of the text
document is better preserved, because it can capture the relationships between words and phrases, and even between clauses and sentences [34-35].

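The difference between the two representations can be sketched as follows. This is a minimal illustration, not the library's code: the function names and the example sentence are hypothetical, and the tokens are assumed to be already pre-processed.

```python
from collections import Counter


def bag_of_words(tokens):
    """Count each word; word order and word relationships are discarded."""
    return Counter(tokens)


def ngrams(tokens, n):
    """Return all runs of n consecutive words, which keeps local word order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical, already tokenized Indonesian sentence ("I like to eat fried rice").
TOKENS = "saya suka makan nasi goreng".split()

unigram_counts = bag_of_words(TOKENS)   # every word counted once, no ordering
bigrams = ngrams(TOKENS, 2)             # starts with ("saya", "suka")
```

The bag of words cannot tell "nasi goreng" (fried rice) apart from the two unrelated words "nasi" and "goreng", while the bigram list keeps that pair together.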
A sequential pattern is one form of multiple-word representation; compared with single words, it maintains the meaning or knowledge of a text document better [19]. A sequential pattern used on text is called a Sequence of Words (SOW): a sequence of words that considers the order in which the words occur. Three structured text representations in SOW form can be used [33-34]. The first is the Frequent Word Sequence (FWS), which respects the order of occurrence of each word. The second is the Set of Frequent Word Sequences (Set of FWS), a set of FWS that considers not only the order in which words appear but also the order in which sentences occur. The last is the Frequent Word Itemset (FWI), which does not consider the order in which words occur. There is also the FWI association (Set of FWI), a development of SOW that has been shown to maintain the meaning of Indonesian slang well [36-38]. The pre-processing library in this study used the Set of FWI as its structured text representation. The flowchart of the Indonesian text pre-processing library is shown in Figure 1.

Figure 1. Flowchart of Indonesian Text Pre-processing Library

3.3. Module Design of Indonesian Text Pre-Processing Library
Based on the needs of Indonesian text pre-processing described above, we built a text pre-processing library with the architectural design shown in Figure 2. It has three main modules: the Text Preprocessing module, the Feature Extraction module, and the Feature Selection module. Each module has abstract classes that act as prototypes and can be implemented flexibly. The Text Preprocessing module is a collection of classes that perform tokenizing, removal of non-letter characters, regular expression manipulation, case folding, conversion of abbreviations to their original form, stopword removal, and stemming. The Feature Extraction module is a collection of classes that extract features in the form of a structured text representation, while the Feature Selection module contains classes that select among the extracted features, because a feature may have sub-features that can be removed.

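The idea of an abstract class that fixes the pipeline while leaving language-specific steps open can be sketched as below. This is an assumption-laden illustration in Python, not the library's actual design (the library appears to be Java): the class names, the constructor parameters, the order of the steps, and the tiny stopword and abbreviation tables are all hypothetical.

```python
import re
from abc import ABC, abstractmethod


class TextPreprocessor(ABC):
    """Hypothetical abstract base: a fixed pipeline with a language-specific hook."""

    def __init__(self, stopwords, abbreviations):
        self.stopwords = set(stopwords)
        self.abbreviations = dict(abbreviations)

    @abstractmethod
    def stem(self, word):
        """Reduce a word to its root form; each subclass supplies its own rule."""

    def preprocess(self, text):
        text = text.lower()                                       # case folding
        text = re.sub(r"[^a-z\s]", " ", text)                     # remove non-letter characters
        tokens = text.split()                                     # tokenizing
        tokens = [self.abbreviations.get(t, t) for t in tokens]   # expand abbreviations
        tokens = [t for t in tokens if t not in self.stopwords]   # stopword removal
        return [self.stem(t) for t in tokens]                     # stemming


class IdentityStemPreprocessor(TextPreprocessor):
    """Placeholder subclass that leaves every word unchanged."""

    def stem(self, word):
        return word


# Hypothetical resources: "yg" and "gk" are common informal abbreviations.
preprocessor = IdentityStemPreprocessor(
    stopwords=["yang", "di"],
    abbreviations={"yg": "yang", "gk": "tidak"},
)
tokens = preprocessor.preprocess("Gk suka yg mahal!!! 123")
```

A text mining application would subclass the base and override only `stem` (or swap the resource tables), which is the kind of low-effort extension the abstract classes are meant to allow.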
Because the Indonesian text in this research was taken from social media, the Set of FWI is the more appropriate structured text representation, and so a frequent pattern algorithm was used. One such algorithm is FP-Growth, which does not generate candidate features but builds a tree structure, making it more efficient [39-40].

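To make the notion of a frequent word itemset concrete, the sketch below counts unordered word sets that occur in at least a minimum number of sentences. Note that this is a brute-force enumeration chosen for readability, not FP-Growth: it does generate candidates, and it is only practical for small itemset sizes. The function name, the `min_support` and `max_size` parameters, and the example sentences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_word_itemsets(sentences, min_support, max_size=2):
    """Return word itemsets (order ignored) found in at least min_support sentences."""
    counts = Counter()
    for tokens in sentences:
        words = sorted(set(tokens))               # each word counts once per sentence
        for size in range(1, max_size + 1):
            for itemset in combinations(words, size):
                counts[frozenset(itemset)] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Hypothetical, already pre-processed sentences.
SENTENCES = [
    ["harga", "naik", "lagi"],
    ["harga", "bbm", "naik"],
    ["bbm", "langka"],
]

frequent = frequent_word_itemsets(SENTENCES, min_support=2)
```

With a minimum support of 2, the result contains the single words "harga", "naik", and "bbm" and the pair {"harga", "naik"}; because the itemsets are unordered, "naik harga" and "harga naik" would be counted as the same feature.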
Figure 2. Module design of Indonesian text pre-processing library

3.3.1 The Result of Library Measured by McCabe Cyclomatic Complexity Metrics
The pre-processing library was evaluated with the McCabe cyclomatic complexity metric, which calculated the complexity of the modules in the library. The evaluation gave an average cyclomatic complexity of 2.51, as shown in Figure 3. According to the Software Engineering Institute thresholds, this value indicates that the library is not too complex, quite simple, and flexible. However, some classes had a cyclomatic complexity greater than 10, and the PorterAlgoGaul.java class had a value of more than 50, which means it is very complex and inflexible.

Figure 3. Result of Complexity Metrics of Indonesian Text Pre-processing Library

3.3.2 The Result of Testing on LingPipe 4.1.0
LingPipe is a text processing application based on computational linguistics. It can perform tasks such as finding the names of people, organizations, or locations in news, automatically classifying search results into categories, and correcting the spelling of a statement. LingPipe 4.1.0 accepts input data as plain text and XML. In the LingPipe application, the functions invoked from the pre-processing library were readDir() from the FrequentItemsetFromMultipleDir class to generate features, getFwi() from the FrequentItemset class to use those features in the model development process, and PreprocessingTestFileToString() from the TextPreprocessing class to pre-process the test data. Because LingPipe accepts .txt input, the pre-processing library could be used directly without first changing the input type. Examples of classification with the DynamicLM algorithm are available in the LingPipe application and were used to test the library.

We used data from Twitter and blogs in three scenarios. In the first, 136 of 170 test documents were classified correctly: 115 in the "kampanye" class and 21 in the "tahunbaruislam" class. In the second, 87 of 242 test documents were classified correctly: 9 in the "agama" class, 15 in the "bisnis" class, and 63 in the "komputer" class. In the third, only 4 of the 48 test documents from blogs were classified correctly.

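The reported counts translate into accuracy figures as follows. The snippet only restates the numbers given above; the scenario labels and the helper name are ours, and the text identifies only the third scenario as blog data.

```python
def accuracy(correct, total):
    """Percentage of test documents that were classified correctly."""
    return 100.0 * correct / total


# (label, correctly classified, total test documents), taken from the counts in the text.
SCENARIOS = [
    ("scenario 1", 115 + 21, 170),
    ("scenario 2", 9 + 15 + 63, 242),
    ("scenario 3 (blog)", 4, 48),
]

for name, correct, total in SCENARIOS:
    print(f"{name}: {correct}/{total} = {accuracy(correct, total):.1f}%")
```

The first scenario works out to exactly 80%, the second to roughly 36%, and the third to roughly 8%, so the per-class counts are consistent with the totals but accuracy varies widely between scenarios.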
3.3.3 The Result of Testing Kea 5.0
Kea is a text processing application that tokenizes text by extracting phrases, performs stopword removal and stemming, and models the process of training on documents and testing the model. Models are built with the Naïve Bayes algorithm. Kea extracts phrases from a document and uses them as keywords to train on document data and as labels for classification. Kea has no algorithm for clustering. The input for training is a .txt file containing the training text and a .key file with the same name as the .txt file, which contains the keyphrases for that .txt file. To prepare a .key file of phrases, we used readDir() from the FrequentItemsetFromMultipleDir class to generate features that could serve as keyphrases, getFwi() from the FrequentItemset class to use the features in the model development process, and PreprocessingTestFileToString() from the TextPreprocessing class to pre-process the training or test data. The test and training data were thus tokenized, cleaned of stopwords, and stemmed by the library, because Kea does not provide stopword removal and stemming for Indonesian.

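The pairing of a .txt file with a same-named .key file can be sketched as below. This is a hypothetical helper, not part of the library or of Kea: the function name and arguments are ours, and writing one keyphrase per line is an assumption about the .key layout that should be checked against the Kea documentation.

```python
from pathlib import Path


def write_kea_pair(directory, name, text, keyphrases):
    """Write name.txt and a matching name.key into directory; return both paths."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    txt_path = directory / f"{name}.txt"
    key_path = directory / f"{name}.key"
    txt_path.write_text(text, encoding="utf-8")
    # Assumption: one keyphrase per line in the .key file.
    key_path.write_text("\n".join(keyphrases) + "\n", encoding="utf-8")
    return txt_path, key_path
```

In the workflow described above, the text would be the pre-processed document and the keyphrases would be the features produced by the library.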
Kea is very dependent on the language used: if setVocabulary has the value "none", the resulting model cannot be tested. Kea accepts only English, Spanish, and French input. This is why the model produced from Indonesian text could not be tested, although the model itself was built successfully.

3.3.4 The Result of Testing Mallet 2.0.6
Mallet (MAchine Learning for LanguagE Toolkit) is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications for text. Mallet includes tools for document classification: efficient routines for converting text to "features", a variety of algorithms (including Naïve Bayes, Maximum Entropy, and Decision Trees), and code for evaluating classifier performance with several commonly used metrics. As with LingPipe and Kea, the pre-processing library could be added to Mallet. As in the LingPipe test, the functions called from the pre-processing library were readDir() from the FrequentItemsetFromMultipleDir class to generate features, getFwi() from the FrequentItemset class to use those features in the model development process, and PreprocessingTestFileToString() from the TextPreprocessing class to pre-process the test data. Mallet also accepts .txt input, so the pre-processing library could be used directly without first changing the input type. The library was tested with the same training and test data as in the LingPipe test, for classification with the Naïve Bayes algorithm. The classification results had an average accuracy of about 96.5%, a result certainly influenced by the quality of the pre-processed text data.

4. CONCLUSION
Software needs to achieve quality in order to remain in use and to be easy to maintain and develop. Software quality factors serve as a measure of which qualities are to be achieved. Component-based software, including libraries, which are collections of modules, classes, or functions, has quality factors that need to be achieved; one of them is flexibility. The pre-processing stage in text mining is an important stage that can affect the mining results. However, not many text mining applications have a complete pre-processing phase, especially for Indonesian text, because each language has its own needs and characteristics and its text data must be processed differently. This research built a text pre-processing library for Indonesian text, covering the stages from tokenizing to forming structured text representations, and measured its flexibility. The flexibility
measurement with the McCabe cyclomatic complexity metric showed that the text pre-processing library for Indonesian text is flexible enough, simple, and easy to modify. This was confirmed by integrating the library into three text mining applications: LingPipe 4.1.0, Kea 5.0, and Mallet 2.0.6. Each application used the library as the pre-processing stage for document classification. In addition, the library provides abstract modules that can be implemented according to the needs of text mining applications. However, the programming language affects how the library can be used, because the library cannot be used in applications written in a different programming language. Therefore, libraries should also achieve other quality factors, such as portability, adaptability, or interoperability, so that they can be used in different environments (both languages and operating systems).

REFERENCES
[1] H. Aulawi, M. A. Ramdhani, C. Slamet, H. Ainissyifa, and W. Darmalaksana, “Functional Need Analysis of Knowledge Portal Design in Higher Education Institution,” Int. Soft Comput., vol. 12, no. 2, pp. 132–141, 2017.
[2] T. Wahyuningrum and K. Mustofa, “A Systematic Mapping Review of Software Quality Measurement: Research Trends, Model, and Method,” Int. J. Electr. Comput. Eng., vol. 7, no. 5, p. 2847, 2017.
[3] J. A. McCall, P. K. Richards, and G. F. Walters, “Factors in Software Quality - Volume 1 - Concept and Definitions of Software Quality,” Def. Tech. Inf. Cent., 1977.
[4] B. W. Boehm, J. R. Brown, H. Kaspar, M. Lipow, G. McLeod, and M. Merritt, Characteristics of Software Quality. North Holland, 1978.
[5] S. Tripathi, “A Survey on Quality Perspective and Software Quality Models,” IOSR J. Comput. Eng., 2014.
[6] R. Al-Qutaish, “Quality models in software engineering literature: an analytical and comparative study,” J. Am. Sci., 2010.
[7] ISO/IEC, “Systems and software engineering - Systems and software Quality Requirements and Evaluation (SQuaRE) - System and software quality models,” ISO/IEC FDIS 25010:2011, 2011.
[8] M. C. Paulk, B. Curtis, M. B. Chrissis, and C. V. Weber, “Capability maturity model, version 1.1,” IEEE Softw., 1993.
[9] Y. Fang, B. Han, and W. Zhou, “Research and Analysis of CMMI Process Improvement Based on SQCS System,” TELKOMNIKA Indonesian Journal of Electrical Engineering, vol. 10, no. 7, pp. 1849–1854, 2012.
[10] R. S. Pressman, Software Engineering: A Practitioner’s Approach, 7th ed. New York: McGraw-Hill, 2011.
[11] I. Sommerville, Software Engineering. 2010.
[12] M. A. Kabir, M. U. Rehman, and S. I. Majumdar, “An analytical and comparative study of software usability quality factors,” in 7th IEEE International Conference on Software Engineering and Service Science (ICSESS), 2016.
[13] A. Sharma, R. Kumar, and P. S. Grover, “Estimation of Quality for Software Component - an Empirical Approach,” SIGSOFT Softw. Eng. Notes, vol. 33, no. 6, 2008.
[14] Y. Choi et al., “Practical S/W Component Quality Evaluation Model,” in ICACT, 2008.
[15] A. Alvaro, E. Santana de Almeida, and S. Romero de Lemos Meira, “A software component quality framework,” ACM SIGSOFT Softw. Eng. Notes, 2010.
[16] C. J. Torre, M. J. Martin-Bautista, D. Sanchez, and I. Blanco, “Text Knowledge Mining: An Approach To Text Mining,” ESTYLF08, vol. 17–19, 2008.
[17] V. Gupta and G. S. Lehal, “A survey of text mining techniques and applications,” Journal of Emerging Technologies in Web Intelligence, vol. 1, no. 1, pp. 60–76, 2009.
[18] J. Han, M. Kamber, and J. Pei, Data Mining: Concepts and Techniques. 2006.
[19] A. Hoonlor, “Sequential Patterns and Temporal Patterns for Text Mining,” New York, 2011.
[20] M. R. Islam, I. F. Al-Shaikhli, R. B. M. Nor, and V. Varadarajan, “Technical approach in text mining for stock market prediction: A systematic review,” Indones. J. Electr. Eng. Comput. Sci., vol. 10, no. 2, pp. 770–777, 2018.
[21] R. Gunawan and K. Mustofa, “Finding knowledge from Indonesian traditional medicine using semantic web rule language,” Int. J. Electr. Comput. Eng., vol. 7, no. 6, pp. 3674–3682, 2017.
[22] H. Mahgoub, D. Rösner, N. Ismail, and F. Torkey, “A Text Mining Technique Using Association Rules Extraction,” Int. J. Comput. Intell., vol. 4, no. 1, pp. 21–28, 2008.
[23] Y. E. Zohar, “Introduction to Text Mining,” Automated Learning Group, University of Illinois, 2002. [Online]. Available: http://www.docstoc.com/docs/25443990/Introduction-to-TextMining.
[24] M. A. Ramdhani, H. Aulawi, A. Ikhwana, and Y. Mauluddin, “Model of green technology adaptation in small and medium-sized tannery industry,” J. Eng. Appl. Sci., 2017.
[25] C. Szyperski et al., Component Software, Beyond Object-Oriented Programming, 2nd ed. Great Britain: Addison-Wesley, Pearson Education Limited, 2002.
[26] T. J. McCabe, “A Complexity Measure,” IEEE Transactions on Software Engineering, 1976.
[27] F. Deissenboeck, S. Wagner, M. Pizka, S. Teuchert, and J. F. Girard, “An activity-based quality model for maintainability,” in IEEE International Conference on Software Maintenance, ICSM, 2007.
[28] D. Mahdjoubi, “The Linear Model of Technological Innovation: Background and Taxonomy,” The Atlas of Innovation. 1997.
[29] N. Sharma and M. Wadhwa, eXSRUP: Hybrid Software Development Model Integrating Extreme Programing, Scrum & Rational Unified Process. 2015.
[30] M. A. Ramdhani, D. S. Maylawati, A. S. Amin, and H. Aulawi, “Requirements Elicitation in Software Engineering,” Int. J. Eng. Technol., vol. 7, no. 2.29, pp. 772–775, 2018.
[31] A. E. H. Soumiya and B. Mohamed, “Converting UML Class Diagrams into Temporal Object Relational DataBase,” Int. J. Electr. Comput. Eng., vol. 7, no. 5, p. 2823, 2017.
[32] B. Jalender, A. Govardhan, and P. Premchand, “Designing code level reusable software components,” Int. J. Softw. Eng. Appl., 2012.
[33] B. Meyer and P. America, “Object-oriented software construction 2nd Ed.,” Sci. Comput. Program., 1989.
[34] A. Doucet and H. Ahonen-Myka, “Non-contiguous word sequences for information retrieval,” MWE ’04 Proc. Work. Multiword Expressions, vol. 26, no. July, pp. 88–95, 2004.
[35] A. Doucet and H. Ahonen-Myka, “An efficient any language approach for the integration of phrases in document retrieval,” Lang. Resour. Eval., vol. 44, no. 1–2, pp. 159–180, 2010.
[36] D. S. Maylawati, “Pembangunan Library Pre-Processing untuk Text Mining dengan Representasi Himpunan Frequent Word Itemset (HFWI) Studi Kasus: Bahasa Gaul Indonesia,” Bandung, 2015.
[37] D. S. A. Maylawati, M. A. Ramdhani, A. Rahman, and W. Darmalaksana, “Incremental technique with set of frequent word item sets for mining large Indonesian text data,” in 2017 5th International Conference on Cyber and IT Service Management, CITSM 2017, 2017.
[38] D. Sa’Adillah Maylawati and G. A. Putri Saptawati, “Set of Frequent Word Item sets as Feature Representation for Text with Indonesian Slang,” in Journal of Physics: Conference Series, 2017.
[39] J. Han, H. Cheng, D. Xin, and X. Yan, “Frequent pattern mining: Current status and future directions,” Data Min. Knowl. Discov., vol. 15, no. 1, pp. 55–86, 2007.
[40] J. Han, J. Pei, Y. Yin, and R. Mao, “Mining frequent patterns without candidate generation: A frequent-pattern tree approach,” Data Min. Knowl. Discov., vol. 8, no. 1, pp. 53–87, 2004.