TELKOMNIKA, Vol. 11, No. 3, September 2013, pp. 591~596
ISSN: 1693-6930, accredited A by DIKTI, Decree No: 58/DIKTI/Kep/2013
DOI: 10.12928/TELKOMNIKA.v11i3.1095
Received March 31, 2013; Revised July 13, 2013; Accepted July 28, 2013
Comparative Study of Bankruptcy Prediction Models
Isye Arieshanti*, Yudhi Purwananto, Ariestia Ramadhani, Mohamat Ulin Nuha, Nurissaidah Ulinnuha
Department of Informatics Engineering, FTI, Institut Teknologi Sepuluh Nopember
Gedung Teknik Informatika, Kampus ITS Sukolilo
*Corresponding author, e-mail: i.arieshanti@if.its.ac.id
Abstrak
Bankruptcy prediction is an important matter for a company. By knowing its bankruptcy potential, a company will be better prepared and better able to make financial decisions to anticipate the occurrence of bankruptcy. For that anticipation, bankruptcy prediction software can help the company in making decisions. In developing bankruptcy prediction software, an appropriate machine learning method must be selected. A method that suits one case does not necessarily suit another case. For that reason, this study compares several machine learning methods to determine which method is suitable for the bankruptcy prediction case. By knowing the most suitable method, subsequent development can focus on the best method. Based on the comparison of several methods (k-NN, fuzzy k-NN, SVM, Bagging Nearest Neighbour SVM, Multilayer Perceptron (MLP), and a hybrid method of MLP + Multiple Linear Regression), it can be concluded that the fuzzy k-NN method is the most suitable method for the bankruptcy prediction case, with an accuracy of 77.5%. Therefore, further model development can make use of modifications of the fuzzy k-NN method.

Kata kunci: bankruptcy prediction, k-NN, fuzzy k-NN, Bagging Nearest Neighbour SVM, hybrid method MLP + Multiple Linear Regression
Abstract
Early indication of bankruptcy is important for a company. If companies are aware of their bankruptcy potential, they can take preventive action to anticipate the bankruptcy. In order to detect the potential of a bankruptcy, a company can utilize a bankruptcy prediction model. The prediction model can be built using machine learning methods. However, the choice of machine learning method should be made carefully, because the suitability of a model depends on the specific problem. Therefore, in this paper we perform a comparative study of several machine learning methods for bankruptcy prediction. It is expected that the comparison result will provide insight about a robust method for further research. According to the comparative study of the performance of several models based on machine learning methods (k-NN, fuzzy k-NN, SVM, Bagging Nearest Neighbour SVM, Multilayer Perceptron (MLP), and a hybrid of MLP + Multiple Linear Regression), it can be concluded that the fuzzy k-NN method achieves the best performance, with an accuracy of 77.5%. The result suggests that the enhanced development of the bankruptcy prediction model could use an improvement or modification of fuzzy k-NN.

Keywords: bankruptcy prediction, k-NN, fuzzy k-NN, Bagging Nearest Neighbour SVM, hybrid method MLP + Multiple Linear Regression
1. Introduction
In business, a company faces two possibilities: gaining profit or suffering loss. In a highly competitive era, early warning of bankruptcy is important to prevent the worst condition for the company. In order to predict bankruptcy, a company can employ relevant data such as total assets, inventory, profit and financial deficiency. Those data give maximum advantage when their pattern is interpretable. With the objective of discovering the bankruptcy pattern, a machine learning method can be employed. Specifically, the method classifies whether the pattern in the company data supports an indication of bankruptcy or not.
Recently, several machine learning methods have been proposed for bankruptcy prediction. Some of them are k-nearest neighbour, neural networks and support vector machines. Those methods come with their advantages and disadvantages. In several cases, neural networks
and support vector machines are superior to other methods. For example, the support vector machine is exploited in the detection of diabetes mellitus [1] and the neural network is employed in the classification of mobile robot navigation [2]. The superiority is because of their capability in generalization. However, their models are difficult to interpret. On the contrary, a model that uses k-nearest neighbour is easier to interpret and its computation is simple.
For the bankruptcy prediction model, Li et al. [6] proposed a fuzzy k-NN model and Wieslaw [3] proposed a statistical-based model. Still, there is room for improvement in order to obtain a better model. The main contribution of this paper is conducting a comparative study for evaluating the most suitable model for the bankruptcy prediction problem. The comparative result can be used as a consideration for further research on the bankruptcy prediction problem. In this comparative study, the usage of k-nearest neighbour, neural network and support vector machine in a prediction model will be evaluated and compared. In addition, variants of the methods will be evaluated as well. The variant methods are fuzzy k-nearest neighbour, bagging nearest neighbour support vector machine, and a hybrid model of multilayer perceptron and multiple linear regression. By considering the strengths and drawbacks of each method, this study explores which method is suitable for the bankruptcy prediction model.
The organization of the paper is as follows: the next section describes the dataset, followed by the explanation of the machine learning methods in the third section. Subsequently, the results of the comparative study are illustrated in the fourth section. Finally, the last section describes the conclusion and discussion.
2. Methods
This section describes the methods that are compared in this study, followed by the dataset.
2.1 K-Nearest Neighbour
K-Nearest Neighbour (KNN) is a non-parametric classification method. Computationally, it is simpler than other methods such as Support Vector Machine (SVM) and Artificial Neural Network (ANN). In order to classify, KNN requires three parameters: a dataset, a distance metric and k (the number of nearest neighbours) [8]. Similarity between attributes and those of their nearest neighbours can be computed using Euclidean distance. The majority class among the neighbours is transferred as the predicted class. If a record is represented as a vector (x1, x2, ..., xn), then the Euclidean distance between two records is computed as follows [8]:
d(x_i, x_j) = √( Σ_k (x_ik − x_jk)² )    (1)
The value d(x_i, x_j) represents the distance between a record and its neighbours. The computed distances are sorted in ascending order. Next, the k smallest distances are chosen as the k nearest distances. The classes of the records in the k nearest neighbours are then used for class prediction. The majority class in that set is transferred to the predicted data.
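The distance computation and majority vote described above can be sketched as follows; this is a minimal illustration, and the toy records and labels are hypothetical, not taken from the study's dataset:

```python
import math
from collections import Counter

def euclidean(a, b):
    # Equation (1): square root of the summed squared attribute differences
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_predict(dataset, labels, query, k):
    # Sort record indices by ascending distance to the query
    order = sorted(range(len(dataset)), key=lambda i: euclidean(dataset[i], query))
    # The majority class among the k nearest neighbours is transferred to the query
    top_k = [labels[i] for i in order[:k]]
    return Counter(top_k).most_common(1)[0][0]

# Toy two-feature records (hypothetical financial ratios)
data = [(0.2, 0.1), (0.3, 0.2), (0.9, 0.8), (1.0, 0.9)]
labels = ["bankrupt", "bankrupt", "healthy", "healthy"]
print(knn_predict(data, labels, (0.25, 0.15), k=3))  # query sits in the 'bankrupt' cluster
```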
2.2 Fuzzy K-Nearest Neighbour
In 1985, Keller proposed a KNN method with fuzzy logic, later called Fuzzy k-Nearest Neighbour [4]. The fuzzy logic is exploited to define the membership degree of each data point in each category, as described in the next formula [4]:
u_i(x) = Σ_{j=1}^{k} u_ij (1 / ‖x − x_j‖^(2/(m−1))) / Σ_{j=1}^{k} (1 / ‖x − x_j‖^(2/(m−1)))    (2)
The variable i defines the index of classes, j runs over the k neighbours, and m, with a value in (1, ∞), is the fuzzy strength parameter that defines the weight or membership degree of data x. The Euclidean distance between x and the j-th neighbour is symbolized as ‖x − x_j‖. The membership function of x_j to each class is defined as u_ij [4]:
u_ij(x_k) = 0.51 + (n_j / k) × 0.49,  if x_k belongs to class j
u_ij(x_k) = (n_j / k) × 0.49,  otherwise    (3)
In addition, n_j is the number of neighbours with the j-th class. Equation (3) is subject to the next equation [4]:
Σ_{i=1}^{C} u_ij = 1,  j = 1, 2, …, n;  0 < Σ_{j=1}^{n} u_ij < n;  u_ij ∈ [0, 1]    (4)
After a data point is evaluated using those formulas, it is classified into a class according to its membership degree to the corresponding class (in this case, the positive class means bankrupt and the negative class means not bankrupt) [5]:
C(x) = arg max( u_1(x), u_2(x) )    (5)
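Equations (2) and (5) can be sketched as below. This is a minimal illustration that assumes crisp neighbour memberships (u_ij ∈ {0, 1}, each neighbour belonging fully to its own class); the softened initialization of equation (3) is omitted, and the distances and labels are hypothetical:

```python
def fuzzy_knn_predict(neighbour_dists, neighbour_labels, classes, m=2.0):
    # Equation (2) with crisp memberships: each neighbour contributes weight
    # 1 / ||x - x_j||^(2/(m-1)) to its own class only.
    weights = [1.0 / (d ** (2.0 / (m - 1.0))) for d in neighbour_dists]
    total = sum(weights)
    u = {c: sum(w for w, lab in zip(weights, neighbour_labels) if lab == c) / total
         for c in classes}
    # Equation (5): the class with the highest membership degree wins
    return max(u, key=u.get), u

# Hypothetical distances to the k=3 neighbours and their labels
label, u = fuzzy_knn_predict([0.1, 0.2, 0.9],
                             ["positive", "positive", "negative"],
                             ["positive", "negative"])
print(label)  # the two close 'positive' neighbours dominate
```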
2.3 Support Vector Machine
The support vector machine (SVM) is a method that performs classification by finding a hyperplane with the largest margin [8]. A hyperplane separates one class from another. The margin is the distance between the hyperplane and the closest data to the hyperplane. The data from each class that are closest to the hyperplane are defined as support vectors [8].
In order to generate SVM models using training data x ∈ R^n and class labels y ∈ {−1, 1}, SVM finds a hyperplane with the largest margin with this equation [8]:

w · x + b = 0    (6)
To maximize the margin, an SVM should satisfy this equation [8]:

min_{w,b} (1/2)‖w‖²  subject to  y_i (w · x_i + b) ≥ 1,  i = 1, …, n    (7)
x_i is the training data, y_i is the class label, and w and b are parameters to be defined in the training process. Equation (7) is adjusted using slack variables in order to handle misclassification cases. The adjusted formula is then defined as in equation (8) [8]:

min_{w,b,ξ} (1/2)‖w‖² + C Σ_i ξ_i  subject to  y_i (w · x_i + b) ≥ 1 − ξ_i;  i = 1, …, n;  ξ_i ≥ 0    (8)
To solve the optimization process, Lagrange multipliers (α) are introduced as follows:

L(w, b, α) = (1/2)‖w‖² − Σ_{i=1}^{n} α_i [ y_i (w · x_i + b) − 1 ]    (9)
Because the vector w may be of high dimension, equation (9) is transformed into its dual form [8]:

max_α  Σ_{i=1}^{n} α_i − (1/2) Σ_{i=1}^{n} Σ_{j=1}^{n} α_i α_j y_i y_j (x_i · x_j)
subject to  α_i ≥ 0,  i = 1, 2, …, n;  Σ_{i=1}^{n} α_i y_i = 0    (10)
And the decision function is defined as follows [8]:

f(x) = sign( Σ_{i=1}^{n} α_i y_i (x_i · x) + b )    (11)
The value of the b parameter is calculated using this formula [8]:

α_i [ y_i (w · x_i + b) − 1 ] = 0    (12)
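Once the multipliers α_i and the bias b are known, the decision function of equation (11) is a direct sum over the support vectors. The sketch below only evaluates that sum; the support vectors, labels, α values and b are hypothetical stand-ins, not values actually solved from equation (10):

```python
def svm_decision(support_vectors, labels, alphas, b, x):
    # Equation (11): f(x) = sign( sum_i alpha_i * y_i * (x_i . x) + b )
    dot = lambda a, c: sum(ai * ci for ai, ci in zip(a, c))
    s = sum(a * y * dot(sv, x)
            for a, y, sv in zip(alphas, labels, support_vectors)) + b
    return 1 if s >= 0 else -1

# Hypothetical trained values for a two-point toy problem
svs    = [(1.0, 1.0), (-1.0, -1.0)]
ys     = [1, -1]
alphas = [0.5, 0.5]
b      = 0.0
print(svm_decision(svs, ys, alphas, b, (2.0, 2.0)))    # 1
print(svm_decision(svs, ys, alphas, b, (-2.0, -1.0)))  # -1
```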
2.4 Bagging Nearest Neighbour Support Vector Machine (BNNSVM)
In order to create the BNNSVM model, a Nearest Neighbour Support Vector Machine (NNSVM) model is created first. The procedure is as follows [6]:
1. The training data is divided into a train set (trs) and a test set (ts) using a cross validation process.
2. Find the k-nearest neighbours for each record in ts. These k-nearest neighbours are defined as ts_nns_bd.
3. Create a classification model from ts_nns_bd. The model is specified as NNSVM.
4. Perform prediction on the testing data using the NNSVM model.
Subsequently, the bagging algorithm is integrated into the NNSVM model to form BNNSVM. The computation of the BNNSVM model is defined in the next steps [6]:
1. Create 10 new base training sets from the trs data. In order to generate a base training set, perform sampling with replacement.
2. According to the 10 base training sets from step 1, generate 10 NNSVM models.
3. Perform a prediction task using the 10 NNSVM models from step 2.
4. For each record in the test set, vote on the prediction result using the NNSVM models.
5. The final prediction result is the class that is voted in step 4. If the voting result is ‘negative’ then the data is predicted as ‘negative’, and vice versa for a ‘positive’ result.
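The bagging stage above (bootstrap sampling, one base model per sample, then voting) can be sketched as follows. As a simplifying assumption, a trivial 1-NN rule stands in for the paper's NNSVM base learner, and the records and labels are made up for illustration:

```python
import random

def bagging_predict(train_records, train_labels, test_record,
                    base_fit_predict, n_bags=10, seed=0):
    # Steps 1-5: draw bootstrap samples with replacement, train one base
    # model per sample, then take the majority vote of their predictions.
    rng = random.Random(seed)
    n = len(train_records)
    votes = []
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]  # sampling with replacement
        bag_x = [train_records[i] for i in idx]
        bag_y = [train_labels[i] for i in idx]
        votes.append(base_fit_predict(bag_x, bag_y, test_record))
    return max(set(votes), key=votes.count)        # steps 4-5: majority vote

def nearest_label(bag_x, bag_y, x):
    # Stand-in base learner (a 1-NN rule) used here instead of NNSVM
    d = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return bag_y[min(range(len(bag_x)), key=lambda i: d(bag_x[i], x))]

data = [(0.1,), (0.15,), (0.2,), (0.25,), (0.9,), (1.0,)]
labs = ["negative"] * 4 + ["positive"] * 2
print(bagging_predict(data, labs, (0.12,), nearest_label))
```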
2.5 Multilayer Perceptron (MLP)
The Multilayer Perceptron (MLP) method is an ANN method with an architecture of at least 3 layers. Those 3 layers are the input layer, hidden layer and output layer. Similar to other ANN methods, this method aims to calculate the weight vectors. The weight vector is fitted to the training data. To update the weight vector, MLP uses the backpropagation algorithm. The activation function used in this MLP model is the sigmoid function.
In the prediction stage, a company data point x is classified as positive (the company has bankruptcy potential) or negative (the company is in fine condition) according to equation (13). In equation (13), w_i is the weight vector from the training process, w_0 is the bias and n is the feature dimension of the data [9].

y(x) = sign( 1 / (1 + exp(−(Σ_{i=1}^{n} w_i x_i + w_0))) )    (13)
In the training stage, the weight vector is updated in two steps. The first step performs initialization of the weight vectors, both in the input layer and the hidden layer. Afterward, forward propagation is computed to obtain the network output. The computation is staged from the input layer to the hidden layer and then the output layer. When the value (o_k) from the output layer and the value (o_h) from the hidden layer are obtained, the backpropagation procedure is performed to calculate the error (δ_k) in the output layer (equation 14) and the error (δ_h) in the hidden layer (equation 15). In equation (15), w_kh is the weight value of the hidden unit that is connected to the output unit [9]:
δ_k = o_k (1 − o_k)(t_k − o_k)    (14)

δ_h = o_h (1 − o_h) Σ_{k ∈ outputs} w_kh δ_k    (15)
According to the error calculation, the weight vector at the input layer (equation 16) and the weight vector at the hidden layer (equation 17) are updated, where η is the learning rate. The number of iterations is determined based on the epoch [9]:
The num
be
r of iteration i
s
determi
ned b
a
se
d on e
p
o
c
h
[9]
i
x
h
ih
w
ih
w
(16
)
i
h
o
k
kh
w
kh
w
(17
)
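The forward pass, the error terms of equations (14)-(15) and the weight updates of equations (16)-(17) can be sketched for a one-hidden-layer network as below. This is a minimal illustration: the toy samples, the hidden-layer size and the learning rate are assumptions, not the paper's configuration:

```python
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_mlp(samples, targets, n_hidden=3, lr=0.5, epochs=1000, seed=1):
    # One-hidden-layer MLP trained with the backpropagation rules of
    # equations (14)-(17); each weight row carries a trailing bias term (w_0).
    rng = random.Random(seed)
    n_in = len(samples[0])
    w_ih = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_kh = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            xb = list(x) + [1.0]                        # input plus bias
            o_h = [sigmoid(sum(w * xi for w, xi in zip(row, xb))) for row in w_ih]
            hb = o_h + [1.0]                            # hidden output plus bias
            o_k = sigmoid(sum(w * h for w, h in zip(w_kh, hb)))
            d_k = o_k * (1 - o_k) * (t - o_k)           # equation (14)
            d_h = [h * (1 - h) * w_kh[j] * d_k          # equation (15), one output
                   for j, h in enumerate(o_h)]
            w_kh = [w + lr * d_k * h for w, h in zip(w_kh, hb)]   # equation (17)
            for j in range(n_hidden):                              # equation (16)
                w_ih[j] = [w + lr * d_h[j] * xi for w, xi in zip(w_ih[j], xb)]
    def predict(x):
        xb = list(x) + [1.0]
        hb = [sigmoid(sum(w * xi for w, xi in zip(row, xb))) for row in w_ih] + [1.0]
        return sigmoid(sum(w * h for w, h in zip(w_kh, hb)))
    return predict

# Toy separable problem (hypothetical data, not the paper's dataset):
# the target depends on the first feature only.
xs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
ts = [0.0, 0.0, 1.0, 1.0]
clf = train_mlp(xs, ts)
```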
2.6 The Hybrid of MLP with Multiple Linear Regression (MLP+MLR)
This hybrid classification model is generated in two steps. The first step computes the Multiple Linear Regression (MLR) model. The result of the model is used as a new feature for
the classification model [7]. The main objective of the MLR usage is to add a linear component to the classification model. The MLR model is defined as in equation 18 [7]:

L = α_0 + α_1 x_1 + α_2 x_2 + ⋯ + α_n x_n    (18)
where x_i with (i = 0, 1, 2, ..., n) are the features and α_i with (i = 0, 1, 2, ..., n) are the unknown regression coefficients. The coefficients are estimated using least square error. When the regression coefficients are obtained, the L value is calculated based on the coefficients and the feature values. The L value becomes an additional attribute in the input layer of the MLP model. Consequently, the L value is involved in the MLP training process.
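The two steps above (least-squares estimation of equation (18), then appending the L value as an extra attribute) can be sketched as follows. The helper names and the toy data are assumptions for illustration; the normal-equation solver is one standard way to obtain the least-squares coefficients:

```python
def fit_mlr(X, y):
    # Least-squares estimate of equation (18)'s coefficients via the normal
    # equations (A^T A) a = A^T y, solved by Gaussian elimination with pivoting.
    A = [[1.0] + list(row) for row in X]          # prepend x_0 = 1 for alpha_0
    n = len(A[0])
    M = [[sum(A[r][i] * A[r][j] for r in range(len(A))) for j in range(n)]
         for i in range(n)]
    v = [sum(A[r][i] * y[r] for r in range(len(A))) for i in range(n)]
    for col in range(n):                          # forward elimination
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        v[col], v[piv] = v[piv], v[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [a - f * b for a, b in zip(M[r], M[col])]
            v[r] -= f * v[col]
    a = [0.0] * n
    for i in reversed(range(n)):                  # back substitution
        a[i] = (v[i] - sum(M[i][j] * a[j] for j in range(i + 1, n))) / M[i][i]
    return a

def augment_with_L(X, coeffs):
    # The L value of equation (18) becomes an extra input-layer attribute
    return [list(row) + [coeffs[0] + sum(c * xi for c, xi in zip(coeffs[1:], row))]
            for row in X]

# Hypothetical data generated exactly by y = 1*x1 + 2*x2
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
y = [5.0, 4.0, 11.0, 10.0]
coeffs = fit_mlr(X, y)
augmented = augment_with_L(X, coeffs)
```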
2.7 Dataset
The dataset used in this study is the dataset from Wieslaw [3]. The data is the result of an observation over 2 to 5 years of 120 companies. The dataset consists of 240 records (128 records are positive data and the rest are negative data). Positive data means the companies are not bankrupt and the negative ones are the opposite. The features are related to financial ratios. The features are described in Table 1.
Table 1. Dataset features

Symbol  Feature                                      Symbol  Feature
X1      Cash/current liabilities                     X16     Sales/receivables
X2      Cash/total assets                            X17     Sales/total assets
X3      Current assets/current liabilities           X18     Sales/current assets
X4      Current assets/total assets                  X19     (365/receivables)/sales
X5      Working capital/total assets                 X20     Sales/total assets
X6      Working capital/sales                        X21     Liabilities/total income
X7      Sales/inventory                              X22     Current liabilities/total income
X8      Sales/receivables                            X23     Receivables/liabilities
X9      Net profit/total assets                      X24     Net profit/sales
X10     Net profit/current assets                    X25     Liabilities/total assets
X11     Net profit/sales                             X26     Liabilities/equity
X12     Gross profit/sales                           X27     Long term liabilities/equity
X13     Net profit/liabilities                       X28     Current liabilities/equity
X14     Net profit/equity                            X29     EBIT (earnings before interests and taxes)/total assets
X15     Net profit/(equity + long term liabilities)  X30     Current assets/sales
3. Results and Analysis
In this comparative study, the performance of the compared methods is evaluated using k-fold cross validation. The k-fold cross validation technique divides the dataset into training and testing sets. With this technique, each record in the dataset is used as testing data once and as training data k−1 times. The k value represents the fold number of the dataset. In this study, the fold number for the k-NN, fuzzy k-NN, SVM and BNNSVM models is 5, and the fold number for MLR and the hybrid MLP+MLR is 4. The determination of these fold numbers is based on the best performance achieved by the compared models.
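The splitting scheme described above can be sketched as follows; `kfold_indices` is an illustrative helper (here splitting by stride), not code from the study:

```python
def kfold_indices(n_records, k):
    # Each record lands in exactly one test fold, so it is used for testing
    # once and for training k-1 times.
    folds = [list(range(i, n_records, k)) for i in range(k)]
    for test in folds:
        test_set = set(test)
        train = [i for i in range(n_records) if i not in test_set]
        yield train, test

# 5-fold split of a 10-record dataset
splits = list(kfold_indices(10, 5))
```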
The performance results of the compared models are represented as accuracy values. The accuracy metric is used because the numbers of positive and negative data are quite balanced. The accuracy metric is defined in equation 19:

Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100%    (19)
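Equation (19) can be sketched directly; the label vectors below are hypothetical:

```python
def accuracy(y_true, y_pred, positive="positive"):
    # Equation (19): (TP + TN) / (TP + TN + FP + FN) * 100%.
    # Every record falls into exactly one of the four cells, so the
    # denominator equals the total number of records.
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    return (tp + tn) / len(y_true) * 100.0

y_true = ["positive", "positive", "negative", "negative"]
y_pred = ["positive", "negative", "negative", "negative"]
print(accuracy(y_true, y_pred))  # 75.0
```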
where True Positive (TP) is the number of data with positive class predicted as positive and True Negative (TN) is the number of data with negative class predicted as negative. In addition, False Positive (FP) and False Negative (FN) are the number of data with negative class predicted as positive and the number of data with positive class predicted as negative, respectively. The comparison result of the performance for each model is represented in Table 2. Table 2 shows that the highest accuracy is achieved by the fuzzy k-NN model, with an accuracy value of 77.5%, k=2 and m=10. The m parameter determines the distance weight when computing the contribution of the data from each neighbour. The bigger the m value, the more similar the weights for each distance. On the contrary, the smaller the m value (i.e. close to 1), the bigger the weight contribution of the nearest neighbour. From Table 2, it can be illustrated that the distance weight to each neighbour is relatively similar.
Table 2. The comparison of accuracy

Model          Accuracy (%)   Parameters
1. k-NN        75.42          k=2
2. Fuzzy k-NN  77.50          k=2, m=10
3. SVM         70.42          kernel linear, C=1
4. BNNSVM      71.58          kernel linear, C=1, B=10
5. MLP         71             epoch=500
6. MLP+MLR     74.5           epoch=500
The second highest accuracy is achieved by the k-NN model, with accuracy 75.42%. Compared to fuzzy k-NN, this accuracy is lower. This indicates that the class membership degree affects the classification performance. The influence of the class membership function seems to reduce the noise effect which generally occurs in the k-NN model. Therefore, the effect leads the model to predict an appropriate class even though the difference between both class tendencies is small.
The next highest accuracy is 74.5%, which is attained by the MLP+MLR model. The accuracy of MLP+MLR is about 3.5% higher than that of the original MLP model. The improvement of the accuracy shows that the linear characteristic that is calculated by MLR complements the non-linear characteristic that is exploited by MLP. The fusion of linear and non-linear characteristics indicates a positive contribution to the classification model performance.
The last result is reported for the BNNSVM model. The performance of BNNSVM is not different from the performance of the SVM model. The bagging process seems not to provide an advantage to the BNNSVM model. The possible explanation is that BNNSVM is more compatible when the positive dataset and negative dataset are not balanced. Meanwhile, the bankruptcy dataset that is exploited for model building has a balanced proportion between positive and negative data.
4. Conclusion
Based on the comparison of accuracy among the models built from k-NN, SVM and MLP, it can be concluded that the k-NN-based method is the most suitable method, mainly the k-NN method that involves fuzzy logic. The fuzzy effect indicates the reduction of the negative effect of noise. Therefore, for further research on the bankruptcy prediction model with features as listed in Table 1, an improved model can be developed based on the fuzzy k-NN method. Another suggestion is that other advanced k-NN-based methods be considered as models for bankruptcy prediction.
References
[1] Tama BA, Rodiyatul S, Hermansyah H. An Early Detection Method of Type-2 Diabetes Mellitus in Public Hospital. Telkomnika. 2011; 9(2).
[2] Nurmaini S, Tutuko B. A New Classification Technique in Mobile Robot Navigation. Telkomnika. 2011; 9(3).
[3] Wieslaw P. Application of Discrete Predicting Structures in An Early Warning Expert System for Financial Distress. Tourism Management. 2004.
[4] Keller J, Gray M, Givens J. A Fuzzy k-Nearest Neighbours Algorithm. IEEE Transactions on Systems, Man, and Cybernetics. 1985; SMC-15(4).
[5] Chen HL, Yang B, Wang G, Liu J, Xu X, Wang SJ, et al. A Novel Bankruptcy Prediction Model Based on An Adaptive Fuzzy k-Nearest Neighbor Method. Knowledge-Based Systems. 2011; 24(8): 1348-1359.
[6] Li H, Sun J. Forecasting Business Failure: The Use of Nearest-Neighbour Support Vectors and Correcting Imbalanced Samples - Evidence from Chinese Hotel Industry. Tourism Management. 2011; 33(3): 622-634.
[7] Khashei M, Zeinal Hamadani A, Bijari M. A Novel Hybrid Classification Model of ANN and MLR Models. Expert Systems with Applications. 2011; 39: 2606-2620.
[8] Tan PN, Steinbach M, Kumar V. Introduction to Data Mining. 4th ed. Boston: Pearson Addison Wesley. 2006.
[9] Mitchell TM. Machine Learning. Singapore: McGraw-Hill Companies Inc. 1997.