Indonesian Journal of Electrical Engineering and Computer Science
Vol. 23, No. 3, September 2021, pp. 1634~1642
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v23.i3.pp1634-1642
Journal homepage: http://ijeecs.iaescore.com

Fraudulent credit card transaction detection using soft computing techniques

Aishwarya Priyadarshini¹, Sanhita Mishra², Debani Prasad Mishra³, Surender Reddy Salkuti⁴, Ramakanta Mohanty⁵

¹Department of Computer Science and Engineering, IIIT Bhubaneswar, India
²Department of Electrical Engineering, KIIT Deemed to be University, India
³Department of Electrical Engineering, IIIT Bhubaneswar, India
⁴Department of Railroad and Electrical Engineering, Woosong University, Republic of Korea
⁵Department of Computer Science and Engineering, Geethanjali College of Engineering and Technology, India

Article Info

Article history:
Received Dec 8, 2020
Revised Jul 29, 2021
Accepted Aug 4, 2021

ABSTRACT
Nowadays, fraudulent or deceitful activities associated with financial transactions, predominantly those using credit cards, have been increasing at an alarming rate and are among the most prevalent problems in the finance industry, corporate companies, and government organizations. It is therefore essential to incorporate a fraud detection system built mainly on intelligent fraud detection techniques to protect the welfare of consumers and clients alike. Numerous fraud detection procedures, techniques, and systems in the literature have been implemented by employing a myriad of intelligent techniques, including algorithms and frameworks, to detect fraudulent and deceitful transactions. This paper first analyses the data through exploratory data analysis and then proposes various classification models, implemented using intelligent soft computing techniques, to predictively classify fraudulent credit card transactions. Classification algorithms such as K-Nearest neighbor (K-NN), decision tree, random forest (RF), and logistic regression (LR) have been implemented to critically evaluate their performances. The proposed model is computationally efficient and lightweight, and it can be used for credit card fraudulent transaction detection with better accuracy.

Keywords:
Data distribution
Exploratory data analysis
Fraud detection
Outliers

This is an open access article under the CC BY-SA license.

Corresponding Author:
Surender Reddy Salkuti
Department of Railroad and Electrical Engineering
Woosong University
17-2, Jayang-Dong, Dong-Gu, Daejeon-34606, Republic of Korea
Email: surender@wsu.ac.kr

1. INTRODUCTION

For years, fraud and illegal transactions have been significant problems in banking, medicine, and insurance, among other sectors. Because of the increased reliance on the internet, the total number of online credit card transactions has increased dramatically across various payment methods such as PhonePe, Gpay, Paytm, and others. Creating a secure framework for safer online transactions, with advanced authentication services that can help prevent fraud and other fraudulent transaction practices, has become an increasingly difficult job, given that no system is perfect. There is always the likelihood of a flaw. Fraudulent or deceptive credit card purchases may be interpreted as uncertified or unlawful credit card use, which is rare [1]-[8]. Fraud is a complex problem that involves pursuing illegal or illicit financial gain while violating laws, regulations, and policies [9], [10]. Random forest (RF), K-Nearest neighbor (KNN), multilayer perceptron (MLP), extreme learning machine (ELM), and bagging classifier were among the machine learning algorithms used to define and classify credit card transaction data into fraudulent and non-fraudulent categories [11]-[14].

Fraud detection in credit card transactions has piqued researchers' interest. The number of techniques for detecting fraudulent or deceptive activity in the realm of financial transactions, whether online or offline, has grown dramatically. Ghosh [15] suggested a method of fraud detection using neural networks, which relied on a large dataset that included stolen or lost credit cards, phishing, cyber-attacks, non-received issue (NRI) frauds, and so on. Carcillo et al. [16] merged the commonly used supervised and unsupervised learning algorithms to identify fraudulent patterns in credit card transactions. The automated method [17] was used to examine all current transactions and assign each one a valid or fraudulent ranking. These fraud detection systems are based on expert-driven rules that form a complete collection of fraud detection rules and patterns when combined with a reliable data source. The rules are based on patterns found in the data using various intelligent machine learning algorithms, such as logistic regression (LR) and support vector machines (SVM) [18], [19]. Andrea et al. [20] present fraud detection as a significantly challenging task for machine learning algorithms for several reasons, such as seasonality and skewness. Johannes et al. [21] address the problem of fraud detection primarily using long short-term memory (LSTM) networks as the machine learning algorithm to detect fraudulent transactions and tell the genuine and fraudulent ones apart. Sam et al. [22] proposed a model that combines neural network classifiers and Bayesian networks to differentiate a fraudulent credit card transaction from a legitimate one. Similarly, Bolton and Hand [23] conducted research that describes various fraudulent credit card activities and keeps up to date with newer deception techniques that fraudsters can employ. Zareapoor, Seeja, and Alam [24] proposed a comparative study based on the performance of various machine learning techniques, primarily to review the different credit card fraud detection techniques.

This paper aims to improve the predictive accuracy of distinguishing fraudulent financial transaction data from legitimate financial transaction data. It also aims to gain a comprehensive understanding of the latest research papers describing the application of various machine learning and artificial intelligence techniques to identify credit card frauds, and to employ an efficient classification model with higher accuracy and a faster prediction rate for fraud detection in the financial industry and government organizations. Ultimately, these soft computing techniques and the proposed methodology together aid in achieving very high accuracy (up to 100% on the resampled dataset for two of the classifiers), which eliminates the need to use heavier learning algorithms, hence reducing the computational overhead and increasing the predictive performance of the detection system.

2. OVERVIEW OF MACHINE LEARNING TECHNIQUES APPLIED

2.1. K-Nearest neighbour classifier

K-Nearest neighbour is a supervised pattern classification algorithm that determines the class of a new input (test value). The distance between the test value and the training points is computed, and the k nearest neighbors are chosen. The method estimates the conditional distribution of Y given X and assigns a given observation (test value) to the class with the highest estimated probability. After identifying the k points in the training data nearest to the test value, the test value is assigned to the group at the shortest distance [25]. Even though KNN is computationally expensive, it reliably achieves very good output, without any analytical assumptions about the distributions from which the training samples are drawn, among the different techniques that have been used in the literature to identify fraudulent activities in credit card transactions. If the bordering or closest neighbor is recognized as a fraudulent transaction, the test transaction is termed fraudulent [26]. The code snippet for training the KNN classifier has been presented in Figure 1.

Figure 1. Training KNN classifier model
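
The voting rule described in this subsection can be illustrated with a minimal sketch. This is not the code from Figure 1; it uses only the Python standard library, and the function name `knn_predict` and the toy data are hypothetical choices made for illustration.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    # Keep the k closest points and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two features per transaction, label 1 = fraudulent.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, (4.9, 5.0), k=3))
print(knn_predict(train_X, train_y, (0.1, 0.1), k=3))
```

Every prediction scans the whole training set, which is why the text calls KNN computationally expensive; library implementations typically use spatial indexes to reduce this cost.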

2.2. Random forest classification technique

Random forest is also a supervised learning technique that can be efficiently used for solving classification and regression problems. It is handy for computing solutions in situations where the dataset must be classified [14], [27]. A group of decision trees (DT) is built, and the trees are then used for predicting the class. Majority voting over the trees can be used for obtaining the final predictive output. The computational efficiency of random forest classifiers lies in the fact that the construction of each tree does not depend on the others [28]. The code snippet for training the random forest classifier has been presented in Figure 2.

Figure 2. Training random forest classifier model
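
The two ideas in this subsection, trees built independently of each other and a majority vote over their outputs, can be sketched as follows. This is an illustrative simplification rather than the code from Figure 2: each "tree" is a single-split stump, and the names `train_stump`, `train_forest`, and `forest_predict` as well as the toy data are assumptions.

```python
import random
from collections import Counter

def train_stump(X, y):
    """Fit a one-split 'tree': pick the feature/threshold with the fewest errors."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if row[f] <= t else right) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, left, right)
    _, f, t, left, right = best
    return lambda row: left if row[f] <= t else right

def train_forest(X, y, n_trees=15, seed=0):
    """Train each stump on its own bootstrap sample, independently of the rest."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest

def forest_predict(forest, row):
    """Final output is the majority vote over all trees."""
    votes = Counter(tree(row) for tree in forest)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): one feature, label 1 = fraudulent.
X = [(0.1,), (0.2,), (0.3,), (0.4,), (2.1,), (2.2,), (2.3,), (2.4,)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = train_forest(X, y)
print(forest_predict(forest, (0.05,)), forest_predict(forest, (2.5,)))
```

Because no tree depends on another, the loop in `train_forest` could be run in parallel, which is the efficiency argument made in the text.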

2.3. Decision tree

The primary purpose of a decision tree is classification. The decision tree consists of arcs and nodes, which makes it resemble a tree data structure. Each node of the decision tree represents a decision, and the arcs denote the outcomes of that decision [29]. A decision tree is primarily designed using a top-down approach. Feature selection measures can be used to find the best possible splitting point. The attribute values of a record are then compared with the values at the decision tree's nodes. The code snippet for training the decision tree classifier has been presented in Figure 3.

Figure 3. Training decision tree classifier model
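
As a hedged illustration of how a selection measure picks a splitting point, the sketch below scores every candidate threshold of one feature with the Gini impurity. The paper does not state which measure was used, so Gini is an assumption here, as are the function names and the toy values.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best split `value <= threshold`."""
    best_threshold, best_score = None, float("inf")
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Toy feature column (hypothetical) with labels, 1 = fraudulent.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = [0, 0, 0, 1, 1, 1]
print(best_split(values, labels))
```

A top-down tree builder would apply `best_split` to every feature at the root, keep the best one, and then recurse into the two resulting subsets.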

2.4. Logistic regression

Logistic regression (LR) is mainly a predictive analysis model used in regression analysis where the classification output variable is binary. The dependent output variable is estimated using a series of relevant parameters among the independent variables. The code snippet for training the logistic regression classifier has been presented in Figure 4.

Figure 4. Training logistic regression classifier model
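
A minimal sketch of how the parameters of a binary logistic model can be estimated is given below. It is not the code from Figure 4: it assumes plain batch gradient descent on the log-loss, and the learning rate, epoch count, function names, and toy data are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, label in zip(X, y):
            # Prediction error for this sample: p(y=1 | row) - label.
            error = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) - label
            for j, xj in enumerate(row):
                grad_w[j] += error * xj
            grad_b += error
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_proba(w, b, row):
    """Estimated probability that `row` belongs to class 1."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b)

# Toy data (hypothetical): one feature, label 1 = fraudulent.
X = [(-2.0,), (-1.5,), (-1.0,), (1.0,), (1.5,), (2.0,)]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
print(predict_proba(w, b, (2.0,)) > 0.5, predict_proba(w, b, (-2.0,)) > 0.5)
```

A transaction is assigned to class 1 when the estimated probability exceeds a chosen threshold, commonly 0.5.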

3. DATASET PREPARATION

This research work uses a publicly available dataset [14] that contains only 492 fraudulent credit card transactions out of a total of 284,807 transactions made within two days. The dataset is therefore highly skewed or imbalanced, with class 1, the fraudulent credit card transactions, accounting for only 0.172% of the total number of credit card transactions. The dataset features V_1, V_2, V_3, …, V_28 are all numeric values that were obtained by applying a principal component analysis (PCA) transformation to the original records to maintain the confidentiality of the users and credit cardholders. The only features that are not transformed and keep their pre-existing numeric values are "Time" and "Amount". "Time" is the time elapsed in seconds between each transaction and the first transaction in the dataset, and "Amount" is the total transaction amount in $. The dataset details have been presented in Table 1.

Table 1. Description of dataset

| Normal transactions | Fraudulent transactions | Features | Number of transactions |
|---|---|---|---|
| 284,315 | 492 | 30 | 284,807 |
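
The class shares quoted in the text follow directly from the counts in Table 1. The short check below recomputes them; the variable names are illustrative.

```python
# Counts taken from Table 1.
normal, fraud = 284_315, 492
total = normal + fraud

fraud_share = 100 * fraud / total      # percentage of fraudulent transactions
genuine_share = 100 * normal / total   # percentage of genuine transactions

print(total)
print(f"fraudulent: {fraud_share:.4f}%  genuine: {genuine_share:.2f}%")
```

The fraudulent share is roughly 0.1727%, which the text reports truncated as 0.172%.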

Our main objective in carrying out the exploratory data analysis is to analyze and understand the complete distribution of the data and to identify correlation and dependency among the various input features. As explained above, the meaning of only two of the many input features is known, i.e., "Amount" and "Time". These columns are already scaled and are ready to be used in our experiment. Our dataset is highly imbalanced: only 492 (0.172%) out of 284,807 transactions are fraudulent, and the remaining 284,315 (99.83%) are genuine transactions. To compensate for this high imbalance, the adaptive synthetic (ADASYN) oversampling [16] method has been used to resample the dataset. Following this, the machine learning algorithms were applied to the resampled dataset. The code snippet for the implementation of ADASYN oversampling has been presented in Figure 5.

Figure 5. Code snippet for implementation of ADASYN oversampling
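
The idea behind ADASYN can be sketched in a few lines: minority points surrounded by more majority neighbours receive more synthetic samples, and each synthetic sample is an interpolation between a minority point and one of its minority neighbours. The sketch below is a simplified, standard-library illustration and not the code from Figure 5; the function name `adasyn_sketch`, its parameters, and the toy data are assumptions, and a maintained library implementation should be preferred in practice.

```python
import math
import random

def adasyn_sketch(X, y, k=3, beta=1.0, seed=0):
    """Return synthetic minority (label 1) points following the ADASYN idea."""
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == 1]
    n_majority = len(y) - len(minority_idx)
    # Number of synthetic points needed to reach the requested balance level.
    G = int((n_majority - len(minority_idx)) * beta)

    def neighbours(i, candidates):
        # Indices of the k candidates closest to sample i (excluding i itself).
        others = [j for j in candidates if j != i]
        return sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]

    # r_i: share of majority points among the k nearest neighbours of x_i.
    ratios = []
    for i in minority_idx:
        near = neighbours(i, range(len(X)))
        ratios.append(sum(y[j] == 0 for j in near) / len(near))
    total = sum(ratios)
    # Normalise to weights; fall back to uniform weights if no minority
    # point has a majority neighbour.
    if total > 0:
        weights = [r / total for r in ratios]
    else:
        weights = [1 / len(ratios)] * len(ratios)

    synthetic = []
    for i, w in zip(minority_idx, weights):
        partners = neighbours(i, minority_idx)
        for _ in range(round(w * G)):
            j = rng.choice(partners)
            lam = rng.random()
            # Interpolate between x_i and a nearby minority point.
            synthetic.append(tuple(a + lam * (b - a) for a, b in zip(X[i], X[j])))
    return synthetic

# Toy data (hypothetical): 8 genuine points and 3 fraudulent points.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), (0, 2), (1, 2),
     (3, 3), (3, 4), (4, 3)]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
synthetic = adasyn_sketch(X, y, k=3)
print(len(synthetic))
```

Because the per-point counts are rounded, the number of generated samples can differ slightly from `G`. The sketch also assumes at least two minority samples so that an interpolation partner always exists.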

After resampling the genuine and fraudulent class distributions, the features are evaluated for their significance as model inputs. Even though the original features in the data are not revealed due to confidentiality issues, it is appropriate that the PCA-transformed features V_1, V_2, V_3, …, V_28 and the non-PCA features "Time", "Amount", and "Class" be tested for their correlation and linear dependency on the rest of the features, and for how including or excluding a feature can affect the overall outcome. Therefore, exploratory data analysis (EDA) is carried out to analyze and understand the complete distribution of the data and to identify correlation and dependency among the various input features. A 2-D scatter plot was plotted to visualize the credit card transactions in both classes, i.e., class 0 and class 1, representing genuine and fraudulent credit card transactions, respectively. The 2-D scatter plot has been presented in Figure 6. To further verify the transaction distribution, we plot the 3-D scatter plot and pair plot of all the combinations of the features "Time" and "Amount", presented in Figure 7.

Figure 6. 2-D scatter plot representing: (a) distribution of fraudulent transactions vs amount ($), (b) distribution of fraudulent transactions vs time (sec)

Figure 7. 3-D scatter plot and pair plot of the transaction vs time and transaction vs amount, respectively

From the 3-D scatter plot of the credit card transaction distribution versus the feature "Time", one can see that the transactions, whether genuine or fraudulent, are evenly distributed over time. No distinction emerges with respect to time, i.e., there is no significant pattern relevant to fraudulent credit card transactions within the recorded period. Similarly, the 3-D scatter plot of the transaction distribution versus "Amount" shows that all of the fraudulent transactions were carried out on amounts below $2500. On computation, out of a total of 284,807 transactions, 284,357 have a transaction amount less than or equal to $2500, accounting for 99.84 percent of the total transactions. In contrast, no fraudulent transaction has a transaction amount greater than $2500. Another important observation is that there is a heavy overlap of genuine and fraudulent transactions throughout the recorded time, with no clear distinction between them. So, there is no particular pattern of fraudulent behavior specific to any part of the day; it is evenly spread throughout the two days. The histogram, pair plot, and distplot for this observation have been presented in Figure 8.

Figure 8. Histogram, pair plot, and distplot showing a heavy overlap of genuine and fraudulent transactions with no clear distinction

For the distplot, the paper uses a kernel density estimator (KDE). KDE helps in discerning crucial features of the data, such as skewness, bimodality, and central tendency. However, it has its pitfalls. KDE tends to represent the underlying data poorly in certain situations, as it assumes that the underlying distribution is smooth and unbounded. This assumption fails when a variable (here, time) is naturally bounded and there are observations, i.e., transactions, close to the start of the recording period (assumed to be 0 sec). The KDE curve then extends to unrealistic values, so that density is plotted at negative times even though time cannot be negative.
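
The boundary effect described above can be reproduced with a small Gaussian KDE written from scratch. This is a sketch under stated assumptions: the kernel, the bandwidth of 20 seconds, and the sample times are illustrative and not taken from the paper.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density with Gaussian kernels."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Hypothetical elapsed times in seconds; none can be negative.
times = [0.0, 5.0, 12.0, 30.0, 60.0, 90.0]
density = gaussian_kde(times, bandwidth=20.0)

# The estimate is strictly positive at t = -10 s, although no transaction
# can occur before t = 0.
print(density(-10.0) > 0.0)
```

Each Gaussian kernel has unbounded support, so samples near zero leak probability mass into negative times. Common remedies include clipping the plotted range or reflecting the data at the boundary.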

The imbalance in our dataset has now been analyzed. Given the skewness of the dataset, the most common techniques for effectively fitting classifier models and generating efficient classifications are under-sampling and oversampling, both of which even out the imbalance in the dataset. Here, the ADASYN oversampling technique has been considered [30]. If random under-sampling were applied to our dataset, samples from the majority class would be deleted at random, and there would be no way to preserve the essential or information-rich samples in the data. Therefore, as proposed in our methodology, the ADASYN oversampling technique was implemented to resample the dataset and efficiently train the classifier algorithms.

4. METHODOLOGY

Our study used the dataset [14], which is highly imbalanced, and the balancing is carried out using the ADASYN oversampling method. Following this, the dataset is cross-validated using stratified ten-fold cross-validation. The classifier models were evaluated and tested for their performance on the data using various performance metrics and parameters such as accuracy, recall, precision, and F1-score, which were computed from the confusion matrix. The estimated performance parameters are then compared across every model used in this study to classify the data into class 0 and class 1, i.e., genuine and fraudulent credit card transactions, respectively. A confusion matrix, in general, summarizes the performance of the classifier model in assigning inputs to labels. For binary classification, it is a 2×2 table, or 2-dimensional matrix, designed to compute and represent the counts of all four outcomes of a binary classifier, denoted as TN, FN, TP, and FP [12]. The entries of the confusion matrix are briefly explained below:
True positives (TP): fraudulent credit card transactions correctly classified as fraudulent
False positives (FP): genuine credit card transactions erroneously classified as fraudulent
True negatives (TN): non-fraudulent or genuine credit card transactions correctly classified as genuine
False negatives (FN): fraudulent credit card transactions erroneously classified as non-fraudulent or genuine

Accuracy: Accuracy is the performance parameter that measures the extent to which the classification algorithm predicts the classes correctly.

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1} \]

Recall: Recall is a performance parameter that represents how efficient the model or the classifier is in detecting the actual fraudulent credit card transactions.

\[ \text{Recall} = \frac{TP}{TP + FN} \tag{2} \]

Precision: This is the performance parameter that measures the reliability of a model or the classifying algorithm, i.e., how many of the transactions flagged as fraudulent are actually fraudulent.

\[ \text{Precision} = \frac{TP}{TP + FP} \tag{3} \]

Recall/precision trade-off: There is a significant trade-off between precision and recall. As the precision with which the classifier predicts increases, it will detect a smaller number of fraudulent cases. For example, suppose that the classifier model has a precision of 95% while detecting only five cases of fraudulent credit card transactions. If the model is made to detect another five fraudulent transaction cases, its precision may drop to 90%; i.e., at a lower precision, the classifier model is able to detect a larger number of such cases.

F1-score or F1-measure: The F1-score is a performance parameter estimated to find the testing accuracy of the fraudulent credit card detection classifier model or the classifying algorithm. It is calculated as the harmonic mean of two performance metrics, precision and recall, on the test samples [13].

\[ \text{F1-score} = \frac{2 \cdot TP}{2 \cdot TP + FP + FN} \tag{4} \]
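
Equations (1) to (4) translate directly into code. The sketch below evaluates them for a hypothetical confusion matrix (the counts are made up for illustration) and checks that (4) equals the harmonic mean of precision and recall.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, recall, precision, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # equation (1)
    recall = tp / (tp + fn)                      # equation (2)
    precision = tp / (tp + fp)                   # equation (3)
    f1 = 2 * tp / (2 * tp + fp + fn)             # equation (4)
    return accuracy, recall, precision, f1

# Hypothetical counts: 90 frauds caught, 20 missed, 10 false alarms.
accuracy, recall, precision, f1 = classification_metrics(tp=90, fp=10, tn=880, fn=20)
harmonic_mean = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} recall={recall:.3f} "
      f"precision={precision:.3f} f1={f1:.3f}")
print(abs(f1 - harmonic_mean) < 1e-12)
```

The function assumes that at least one transaction is predicted fraudulent and at least one is actually fraudulent; otherwise precision or recall is undefined and the division fails.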

5. RESULTS AND DISCUSSION

The resampled dataset was divided into training and testing sets, with the testing set taking a split ratio of 0.33. To further increase the efficiency and performance of our learning algorithms, we employed stratified k-fold cross-validation with k=10. This helps make the model results less biased and less optimistic with respect to the minority class and any outliers in the data. The correlation matrix was used to identify which features have a high positive or negative correlation with, or dependency on, fraudulent credit card transactions. The correlation matrix has been plotted and presented in Figure 9.

Figure 9. Correlation matrix for resampled data

The correlation matrix for the resampled data can be summarized as follows:
Features with negative correlation: Features 10, 12, 14, and 17 are negatively correlated with the class. This indicates that the lower these values are, the more likely the end outcome is a fraudulent transaction.
Features with positive correlation: Features 2, 11, 4, and 19 are positively correlated with the class, i.e., the higher the values of these features, the more probable it is that the end outcome is a fraudulent transaction.
Furthermore, it can be observed that only a handful of the dataset features are correlated, and the attribute "class" is independent of both the features "amount" and "time". It is also clear from the correlation matrix in Figure 9 that the transaction class, whether genuine (class 0) or fraudulent (class 1), essentially depends on the PCA-transformed attributes.
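
The sign convention used in this reading of the correlation matrix can be illustrated with a small example. The paper does not state which correlation coefficient was used; the sketch assumes the Pearson coefficient, and the feature values are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values: class label (1 = fraudulent) and two feature columns.
labels = [0, 0, 0, 1, 1, 1]
feature_neg = [1.0, 0.8, 1.2, -2.0, -2.5, -1.8]   # low values go with fraud
feature_pos = [-0.5, -0.2, -0.4, 1.5, 2.0, 1.7]   # high values go with fraud

print(pearson(feature_neg, labels) < 0, pearson(feature_pos, labels) > 0)
```

A coefficient near -1 means that low feature values accompany class 1, a coefficient near +1 means that high values do, and a coefficient near 0 indicates no linear relationship with the class.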

The model performance on the original dataset and on the ADASYN resampled dataset is then compared. It can be observed that there has been a significant improvement in the performance of the classifier models after applying the ADASYN oversampling technique. The estimated results of the various classifier models on the original dataset, before applying ADASYN, are presented in Table 2. The estimated performance parameters of the various classifier models as applied to the ADASYN resampled dataset are presented in Table 3.

Table 2. Performance metrics estimated for various classifier models before applying ADASYN oversampling

| Classifier models' performance metrics | Decision tree | Random forest | KNN | Logistic regression |
|---|---|---|---|---|
| Accuracy | 97.08% | 99.98% | 80.33% | 97.10% |
| Precision | 0.981 | 0.998 | 0.823 | 0.983 |
| F1-score | 0.942 | 0.999 | 0.801 | 0.943 |

Table 3. Performance metrics estimated for various classifier models after applying ADASYN oversampling

| Classifier models' performance metrics | Decision tree | Random forest | KNN (k = 9) | Logistic regression |
|---|---|---|---|---|
| Accuracy | 100.0% | 100.0% | 83.33% | 90.10% |
| Precision | 1.0 | 1.0 | 0.99 | 0.91 |
| Recall | 1.0 | 1.0 | 0.833 | 0.89 |
| F1-score | 1.0 | 1.0 | 0.998 | 0.90 |

As shown in Table 3, KNN gives an accuracy score of 83.33% on the data with the best value of k, which was automatically evaluated by our model to be k=9. The second model, the logistic regression classifier, gives a better result than the KNN algorithm, with an overall accuracy score of 90.10%. The other two classifier models, i.e., the random forest classifier and the decision tree classifier, gave an absolute 100.0% accuracy, which is an interesting estimate given that the dataset was highly imbalanced or skewed and was resampled using one of the most prominent resampling techniques, ADASYN. Further, the precision, recall, and F1-score for the decision tree and the random forest are all 1.0. The precision, recall, and F1-score are 0.99, 0.833, and 0.998 for KNN and 0.91, 0.89, and 0.90 for logistic regression, respectively.

Table 4. Performance of different classifiers for Prusti and Rath [13]

| Performance metrics | ELM | MLP | KNN | Random forest | Ensemble method |
|---|---|---|---|---|---|
| Accuracy | 78.25 | 80.38 | 81.43 | 81.92 | 83.83 |
| Precision | 86.00 | 87.68 | 88.52 | 89.12 | 94.50 |
| F1-score | 86.63 | 88.03 | 88.84 | 89.22 | 90.31 |
We
com
par
e
ou
r
sim
ulate
d
r
esults
with
P
r
us
ti
an
d
Ra
th
[13],
we
f
ound
that
our
sim
ulate
d
res
ults
ou
t
perform
ed
al
l
cl
assifi
ers
and
e
ns
em
bles
us
ed
by
the
m
pr
esented
a
bove
in
Table
4.
Subsequently, the authors in the literature applied various standalone soft computing techniques to the pre-processed data to obtain accurate and precise classifications of the data into the said categories, namely the genuine class of credit card transactions (class 0) and the fraudulent or deceptive class of credit card transactions (class 1), as described in the literature. A confusion matrix was used to estimate accuracy and other performance parameters, showing how the models would classify the data into class 0 or class 1.
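As a minimal sketch of how accuracy, precision, recall, and F1-score follow from a binary confusion matrix, the snippet below counts the four cells and derives the metrics from them. The function names `confusion_counts` and `classification_metrics` are hypothetical and are not taken from the paper; class 1 (fraudulent) is treated as the positive class.

```python
def confusion_counts(y_true, y_pred):
    """Count (tn, fp, fn, tp) with class 1 (fraud) as the positive class."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def classification_metrics(y_true, y_pred):
    """Derive accuracy, precision, recall, and F1 from the confusion matrix."""
    tn, fp, fn, tp = confusion_counts(y_true, y_pred)
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

On a skewed dataset, a model that labels every transaction as genuine still scores a high accuracy while its recall on class 1 is zero, which is why precision, recall, and F1-score are reported alongside accuracy.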
The objective of this study has been entirely dedicated to improving the model performance by varying and fine-tuning the model parameters that can help us render the best results. Given the data, if the standalone models can already classify the transactions into genuine or fraudulent classes with such high accuracy, there is no specific need to employ heavier learning algorithms.
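One way to picture the parameter tuning described here is a search over candidate values of k for KNN, scored on a held-out validation split. The sketch below is an assumption about how such a search could look, not the procedure used in this study; `knn_predict`, `validation_accuracy`, and `select_k` are hypothetical names, and the candidate values are illustrative.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, point, k):
    """Majority vote among the k training points closest to `point`."""
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(point, train_X[i]))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def validation_accuracy(train_X, train_y, val_X, val_y, k):
    """Fraction of validation points that KNN labels correctly."""
    hits = sum(knn_predict(train_X, train_y, x, k) == t
               for x, t in zip(val_X, val_y))
    return hits / len(val_X)


def select_k(train_X, train_y, val_X, val_y, candidates=(1, 3, 5, 7, 9)):
    """Return (best_k, best_accuracy); ties go to the smaller k."""
    best_k, best_acc = None, -1.0
    for k in candidates:
        if k > len(train_X):
            continue  # cannot use more neighbours than training points
        acc = validation_accuracy(train_X, train_y, val_X, val_y, k)
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k, best_acc
```

Odd candidate values avoid tied votes in a two-class problem, and scoring on data that was not used for fitting keeps the chosen k from simply memorising the training set.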
6. RESULTS AND DISCUSSIONS
This study analyzed numerous multifaceted and critical tasks that included detecting deceitful and fraudulent credit card activities in a highly skewed (imbalanced) environment. We applied standalone soft computing and intelligent techniques such as the random forest, decision tree, KNN, and logistic regression classifiers, estimated their performance metrics to efficiently detect fraud and effectively tell fraudulent transactions apart from the genuine credit card transactions, and achieved significant improvements in the performance metrics of the classifier models. In the future, an open direction of the work would include working with data that represents the real-world scenario, i.e., the dataset is properly balanced out in terms of the minority class, the fraudulent credit card transactions. Also, improving the model’s efficiency by enhancing the model’s fitting parameters would be the primary objective. Even though our work has been limited to a dataset of PCA-transformed numerical values, considering a more generic scenario, it would be stimulating to extend our work to datasets having text values and sentimental statements.
ACKNOWLEDGEMENTS
This research work was funded by “Woosong University’s Academic Research Funding - 2021”.
REFERENCES
[1] A. Srivastava, A. Kundu, S. Sural, and A. Majumdar, “Credit card fraud detection using hidden Markov model,” IEEE Transactions on Dependable and Secure Computing, vol. 5, no. 1, pp. 37-48, 2008, doi: 10.1109/TDSC.2007.70228.
[2] S. S. Dhok and G. R. Bamnote, “Credit card fraud detection using hidden Markov model,” International Journal of Soft Computing and Engineering (IJSCE), vol. 2, no. 1, pp. 231-237, 2012.
[3] A. Prakash and C. Chandrasekar, “An optimized multiple semi-hidden Markov model for credit card fraud detection,” Indian Journal of Science and Technology, vol. 8, no. 2, pp. 176-182, Jan. 2015, doi: 10.17485/ijst/2015/v8i2/58081.
[4] V. Bhusari and S. Patil, “Application of hidden Markov model in credit card fraud detection,” International Journal of Distributed and Parallel Systems, vol. 2, no. 6, pp. 203-211, Nov. 2011, doi: 10.5121/ijdps.2011.2618.
[5] N. B. Khandare, “Credit card fraud detection using hidden Markov model,” International Journal of Advance Scientific Research and Engineering Trends, vol. 1, no. 4, Jul. 2016.
[6] A. Singh and D. Narayan, “A survey on hidden Markov model for credit card fraud detection,” International Journal of Engineering and Advanced Technology (IJEAT), vol. 1, no. 3, pp. 49-52, 2012.
[7] S. Subudhi and S. Panigrahi, “Use of fuzzy clustering and support vector machine for detecting fraud in mobile telecommunication networks,” International Journal of Security and Networks, vol. 11, no. 1-2, pp. 3-11, 2016, doi: 10.1504/IJSN.2016.075069.
[8] B. Mehdi, C. Hasna, and O. Tayeb, “Intelligent credit scoring system using knowledge management,” IAES International Journal of Artificial Intelligence (IJAI), vol. 8, no. 4, pp. 391-398, Dec. 2019, doi: 10.11591/ijai.v8.i4.pp391-398.
[9] A. Guha, “Prediction of Bankruptcy using Big Data Analytics based on Fuzzy c-means Algorithm,” IAES International Journal of Artificial Intelligence (IJAI), vol. 8, no. 2, pp. 168-174, Jun. 2019, doi: 10.11591/ijai.v8.i2.pp168-174.
[10] M. A. Febriantono, S. H. Pramono, R. Rahmadwati, and G. Naghdy, “Classification of multiclass imbalanced data using cost-sensitive decision tree C5.0,” IAES International Journal of Artificial Intelligence (IJAI), vol. 9, no. 1, pp. 65-72, Mar. 2020, doi: 10.11591/ijai.v9.i1.pp65-72.
[11] H. E. Mostafa and F. Benabbou, “A deep learning based technique for plagiarism detection: a comparative study,” IAES International Journal of Artificial Intelligence (IJAI), vol. 9, no. 1, pp. 81-90, Mar. 2020, doi: 10.11591/ijai.v9.i1.pp81-90.
[12] E. W. Ngai, Y. Hu, Y. H. Wong, Y. Chen, and X. Sun, “The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature,” Decision Support Systems, vol. 50, no. 3, pp. 559-569, 2011, doi: 10.1016/j.dss.2010.08.006.
[13] D. Prusti and S. K. Rath, “Fraudulent Transaction Detection in Credit Card by Applying Ensemble Machine Learning techniques,” In 10th International Conference on Computing, Communication and Networking Technologies, 2019, pp. 1-6, doi: 10.1109/ICCCNT45670.2019.8944867.
[14] Machine Learning Group – ULB, Credit card Fraud Detection, Kaggle, 2018. [Online]. Available: https://www.kaggle.com/mlg-ulb/creditcardfraud.
[15] S. Ghosh and D. L. Reilly, “Credit card fraud detection with a neural-network,” In Proceedings of the Twenty-Seventh Hawaii International Conference on System Sciences, 1994, pp. 621-630, doi: 10.1109/HICSS.1994.323314.
[16] F. Carcillo, Y. A. Le Borgne, O. Caelen, Y. Kessaci, F. Oblé, and G. Bontempi, “Combining unsupervised and supervised learning in credit card fraud detection,” Information Sciences, vol. 557, pp. 317-331, 2019, doi: 10.1016/j.ins.2019.05.042.
[17] F. Carcillo, Y. A. Le Borgne, O. Caelen, and G. Bontempi, “Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization,” International Journal of Data Science and Analytics, vol. 5, no. 4, pp. 285-300, 2018, doi: 10.1007/s41060-018-0116-z.
[18] A. C. Bahnsen, D. Aouada, A. Stojanovic, and B. Ottersten, “Feature engineering strategies for credit card fraud detection,” Expert Systems with Applications, vol. 51, pp. 134-142, 2016, doi: 10.1016/j.eswa.2015.12.030.
[19] S. Bhattacharyya, S. Jha, K. Tharakunnel, and J. C. Westland, “Data mining for credit card fraud: A comparative study,” Decision Support Systems, vol. 50, no. 3, pp. 602-613, 2011, doi: 10.1016/j.dss.2010.08.008.
[20] A. Dal Pozzolo, G. Boracchi, O. Caelen, C. Alippi, and G. Bontempi, “Credit card fraud detection: a realistic modeling and a novel learning strategy,” IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 8, pp. 3784-3797, 2017, doi: 10.1109/TNNLS.2017.2736643.
[21] J. Jurgovsky, M. Granitzer, K. Ziegler, S. Calabretto, P. E. Portier, L. He-Guelton, and O. Caelen, “Sequence classification for credit-card fraud detection,” Expert Systems with Applications, vol. 100, pp. 234-245, 2018, doi: 10.1016/j.eswa.2018.01.037.
[22] S. Maes, K. Tuyls, B. Vanschoenwinkel, and B. Manderick, “Credit card fraud detection using Bayesian and neural networks,” In Proceedings of the 1st International NAISO Congress on Neuro Fuzzy Technologies, 2002, pp. 261-270.
[23] R. J. Bolton and D. J. Hand, “Unsupervised profiling methods for fraud detection,” Credit Scoring and Credit Control VII, pp. 235-255, 2001.
[24] M. Zareapoor, K. R. Seeja, and M. A. Alam, “Analysis on credit card fraud detection techniques: based on certain design criteria,” International Journal of Computer Applications, vol. 52, no. 3, pp. 35-42, 2012, doi: 10.5120/8184-1538.
[25] A. A. Akinyelu and A. O. Adewumi, “Classification of phishing email using random forest machine learning technique,” Journal of Applied Mathematics, 2014, doi: 10.1155/2014/425731.
[26] S. Kiran, J. Guru, R. Kumar, N. Kumar, D. Katariya, and M. Sharma, “Credit card fraud detection using Naïve Bayes model based and KNN classifier,” International Journal of Advance Research, Ideas and Innovations in Technology, vol. 4, no. 3, pp. 44-47, 2018.
[27] E. N. Osegi and E. F. Jumbo, “Comparative analysis of credit card fraud detection in simulated annealing trained artificial neural network and hierarchical temporal memory,” Machine Learning with Applications, vol. 6, 2021, doi: 10.1016/j.mlwa.2021.100080.
[28] C. Nuno, G. Figueira, and M. Costa, “A data mining-based system for credit-card fraud detection in e-tail,” Decision Support Systems, vol. 95, pp. 91-101, 2017, doi: 10.1016/j.dss.2017.01.002.
[29] A. Colin, “Building Decision Trees with the ID3 Algorithm,” Dr. Dobbs Journal, 1996.
[30] H. He, Y. Bai, E. A. Garcia, and S. Li, “ADASYN: Adaptive synthetic sampling approach for imbalanced learning,” In 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), 2008, pp. 1322-1328, doi: 10.1109/IJCNN.2008.4633969.