International Journal of Informatics and Communication Technology (IJ-ICT)
Vol. 8, No. 3, December 2019, pp. 122~127
ISSN: 2252-8776, DOI: 10.11591/ijict.v8i3.pp122-127

Journal homepage: http://iaescore.com/journals/index.php/IJICT

Classifiers ensemble and synthetic minority oversampling techniques for academic performance prediction

Abdulazeez Yusuf1, Ayuba John2
1Department of Computer Science, Federal University Dutse, Nigeria
2Department of Cyber Security, Federal University Dutse, Nigeria

Article Info

Article history:
Received Sep 18, 2019
Revised Nov 3, 2019
Accepted Nov 23, 2019

ABSTRACT
The increasing need for data-driven decision making has recently resulted in the application of data mining in various fields, including the educational sector, where it is referred to as educational data mining. The need to improve the performance of data mining models has also been identified as a gap for future research. In Nigeria, higher educational institutions collect various students’ data, but these data are rarely used in decision or policy making to improve the academic performance of students. This research work attempts to improve the performance of data mining models for predicting students’ academic performance using a stacking classifiers ensemble and the synthetic minority over-sampling technique. The research was conducted by adopting and evaluating the performance of the J48, IBK and SMO classifiers. The individual classifier models, a standard stacking classifiers ensemble model and the proposed modified stacking classifiers ensemble model were trained and tested on a data set of 206 students from the Faculty of Science, Federal University Dutse. Students’ previous academic performance records in the Unified Tertiary Matriculation Examination and the Senior Secondary Certificate Examination, together with their first-year Cumulative Grade Point Average, were used as inputs to the WEKA 3.9.1 data mining tool to predict students’ graduation classes of degree at undergraduate level. The results show that applying the synthetic minority over-sampling technique for class balancing improves the performance of all the models, with the proposed modified stacking classifiers ensemble model outperforming the other classifier models in both accuracy and RMSE, making it the best model.

Keywords:
Academic performance prediction
Educational data mining (EDM)
Stacking classifiers ensemble
Synthetic minority over-sampling technique (SMOTE)

Copyright © 2019 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:
Ayuba John
Department of Cyber Security,
Federal University Dutse, Jigawa State,
Dutse, Jigawa, Nigeria.
Email: ayuba.john@fud.edu.ng

1. INTRODUCTION
Decision making has gradually become data driven, owing to the large amount of data made available by advances in information and communication technology (ICT). Data mining has been applied in various fields such as medicine, marketing, machine learning, artificial intelligence and customer relations. Recently, data mining has been widely used on educational data sets, an application referred to as educational data mining (EDM), which has become a very useful research area [1]. This emerging field [2] is concerned with developing methods that discover knowledge in data originating from educational environments. To do this, it uses different data mining techniques and machine learning algorithms. The study in [3] indicates that some of the problems related to students’ success in a course are hard to solve simply because the usual statistical methods are not deep enough to discover the hidden patterns and knowledge useful for planning and organizing educational processes. Therefore, there is a need to adopt data mining
techniques for solving problems related to students’ success using data originating from educational environments.
Various data mining techniques have been implemented in research studies on educational data mining. The research work in [4] grouped the methods used in educational data mining into the following categories: classification, clustering, relationship mining, discovery with models and distillation of data for human judgment. These data mining methods have been applied in many research works and were reported to perform better than other methods. A cross validation test result [5] indicates that data mining techniques predicted significantly better than their statistical counterparts. Therefore, [6] suggested that, with the increasing need for data mining and analysis, there is a need to improve the performance of data mining models and machine learning algorithms.

2. RELATED STUDIES
In recent times, various research studies have been conducted on predicting students’ academic performance using data mining techniques and machine learning algorithms. The study in [7] adopted only two classifier algorithms to predict the dropout features of students. The research used three different data sets containing student attributes such as nationality, sex, city of living, high school grades, program enrolled, number of credits earned in the first year of study and average grade in the first year of study. Its results indicate that the J48 decision tree algorithm is more accurate than the Naïve Bayes classifier, with an accuracy of 81.1679%. The research considered only two classifier algorithms. Meanwhile, [8] used more classifier algorithms, namely Neural Network (NN), Decision Tree, Support Vector Machine (SVM), K-nearest neighbor (KNN), Naïve Bayes and Rule Based, to predict learners’ progression in tertiary education. The findings indicate that SVM has the highest accuracy of 73.33%, while the lowest performance was recorded by Logistic Regression with 60.05% accuracy. Only psychometric factors related to students were considered in that research. The academic performance of a student [2] is not the result of a single deciding factor; it hinges heavily on various factors such as personal, socio-economic, psychological and other environmental variables.

The work of [9] made use of three classifiers: Naïve Bayes, Decision Tree and Neural Network. In the study, continuous attributes were discretized using optimal equal width binning, and the Synthetic Minority Over-Sampling Technique (SMOTE) was used to increase the volume of data because the data acquired during preprocessing contained a limited number of instances. Neural Network and Naïve Bayes were reported to be more accurate than Decision Tree for classification when optimal equal width binning and SMOTE were applied to the data. Both classifiers achieved an accuracy of 71.6%; however, the Neural Network algorithm was found to be slower than the Naïve Bayes algorithm, which makes the Naïve Bayes model preferable to the neural network model. The study in [2] focused on identifying slow learners among students using a predictive data mining model built with classification-based algorithms (Multilayer Perceptron, Naïve Bayes, SMO, J48 and REPTree). The work shows that Multilayer Perceptron has the highest accuracy of 75% and REPTree the lowest with 67.76%.

A comparative analysis of three selected classification algorithms, Decision Tree (DT), Naïve Bayes (NB) and Rule Based (RB), was conducted by [10] to predict students’ academic performance. The analysis was done to discover the best technique for developing a predictive model of first-semester academic performance for first-year Bachelor of Computer Science students at Universiti Sultan Zainal Abidin. The Rule Based classifier was found to be the best model among the classifiers, achieving the highest accuracy of 71.3%. The model in that study does not provide detailed information about the students’ performance: it predicts only first-year performance, not graduation performance, and classifies students’ performance into poor, average and good. The research on early prediction of students’ Grade Point Average (GPA) by [11] also showed that the support vector machine (SVM) classifier is more accurate than the extreme learning machine method and the neural network, with an accuracy of 93.06% when the second-year GPA of students is considered and 97.98% when the third-year GPA is considered.

The performance of three supervised machine learning algorithms was evaluated on students’ assessment data by [14] to predict success in a course (either passed or failed). The results indicate that, based on prediction accuracy, ease of learning and user-friendly characteristics, the Naïve Bayes classifier outperforms the decision tree and neural network classifiers. The research conducted by [13] to predict students’ performance shows that Random Forest is a more accurate and faster algorithm than decision tree, K-Nearest Neighbour (IBK) and Multi-layer Perceptron, with an accuracy of 89.23%. Predicting the academic performance of students is challenging [14], since it depends on diverse factors such as personal, socio-economic, psychological and other environmental variables. The study also identifies ensemble methods as the most influential development in data mining and machine learning in the past decade. An approach for predicting students’ academic performance using an ensemble model was presented in the study of [15]. Stacking
ensemble technique was used in [16] for predicting the academic achievement of students. The performance of three classifier algorithms was evaluated, and the stacking ensemble technique was used to combine the three classifiers. A better root mean square error (RMSE) value of 0.1291 was obtained, compared to 0.1898 for the back propagation neural network, 0.1314 for M5P and 0.1343 for the support vector machine.

Stacking is one of the ensemble techniques used by researchers with the aim of improving a model’s performance. According to [17], the stacking ensemble technique is capable of combining heterogeneous base classifiers, with a meta classifier trained for the final prediction. The prediction outputs of the base classifiers are fed directly as input data into the meta classifier for training and final prediction. Therefore, this research work considers improving the performance of the stacking classifiers ensemble model so that only instances that are correctly predicted by the base classifiers are fed to the meta classifier.
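
The stacking idea described above can be illustrated with a small, self-contained sketch. It is not the WEKA implementation used in this study: the two base learners (`nearest_centroid`, `one_nn`), the table-based meta classifier (`fit_meta`) and the toy data are hypothetical stand-ins chosen so the example runs with the Python standard library only. In particular, the filtering rule for the modified variant (keep a training instance when at least one base classifier predicts it correctly) is an assumption, since the text does not state the exact rule.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two numeric feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest_centroid(train_X, train_y):
    """Return a predictor that assigns the class with the closest mean vector."""
    sums, counts = {}, Counter(train_y)
    for x, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
    return lambda x: min(centroids, key=lambda c: sq_dist(x, centroids[c]))


def one_nn(train_X, train_y):
    """Return a 1-nearest-neighbour predictor (an instance-based learner)."""
    def predict(x):
        best = min(range(len(train_X)), key=lambda i: sq_dist(x, train_X[i]))
        return train_y[best]
    return predict


def out_of_sample_predictions(learners, X, y):
    """Leave-one-out base predictions: no model predicts a row it was trained on."""
    rows = []
    for i in range(len(X)):
        rest_X, rest_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        rows.append(tuple(learn(rest_X, rest_y)(X[i]) for learn in learners))
    return rows


def fit_meta(meta_rows, labels):
    """Table-based meta classifier: most frequent true label per tuple of base outputs."""
    votes = {}
    for row, label in zip(meta_rows, labels):
        votes.setdefault(row, Counter())[label] += 1
    table = {row: c.most_common(1)[0][0] for row, c in votes.items()}
    # Unseen combinations fall back to a plain majority vote of the base outputs.
    return lambda row: table.get(row, Counter(row).most_common(1)[0][0])


def fit_stacking(learners, X, y, modified=False):
    """Train base learners and a meta classifier on their out-of-sample outputs."""
    pairs = list(zip(out_of_sample_predictions(learners, X, y), y))
    if modified:
        # ASSUMPTION: keep a row when at least one base prediction is correct.
        pairs = [(row, true) for row, true in pairs if true in row]
    meta = fit_meta([row for row, _ in pairs], [true for _, true in pairs])
    bases = [learn(X, y) for learn in learners]
    return lambda x: meta(tuple(base(x) for base in bases))


# Toy, well-separated data (hypothetical; not the students' data set).
toy_X = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [3.0, 3.2], [3.1, 2.9], [2.8, 3.0]]
toy_y = ["A", "A", "A", "B", "B", "B"]
standard = fit_stacking([nearest_centroid, one_nn], toy_X, toy_y)
modified = fit_stacking([nearest_centroid, one_nn], toy_X, toy_y, modified=True)
print(standard([1.0, 0.9]), modified([3.0, 3.0]))
```

On such clean toy data the standard and modified variants behave identically; the filter only changes the meta training set when some instances are misclassified by every base learner.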

3. RESEARCH METHOD
The methodology of educational data mining is not yet clearly defined, and there are no clear standards about which data mining methods or algorithms are preferable in this context. Various data mining methods have been used by different researchers to estimate the preferable algorithms [7]. In general, however, it was stated in [18] that data mining processes follow a set of steps that must be executed regardless of the algorithms or methodology to be implemented. In this study, the Cross Industry Standard Process for Data Mining (CRISP-DM) was adopted.

3.1. Data collection
Data on a total of 206 students from the Faculty of Science, Federal University Dutse were collected. The data set was divided into two subsets for model training and testing respectively. The models were trained using the data of 164 students, which represents 80% of the data set, while the data of the remaining 42 students, representing 20% of the data set, were used for model testing.
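
The 164/42 split above follows from taking 80% of 206 records and rounding down. The following minimal sketch reproduces that arithmetic with a seeded shuffle; the function name `train_test_split`, the seed and the integer stand-ins for student records are illustrative assumptions, and the text does not say whether the actual split was random or stratified.

```python
import random


def train_test_split(records, train_fraction=0.8, seed=42):
    """Shuffle a copy of the records and cut it at floor(train_fraction * n)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


students = list(range(206))  # hypothetical stand-ins for the 206 student records
train, test = train_test_split(students)
print(len(train), len(test))  # 164 42
```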

3.2. Data preparation and cleaning
The data preparation phase covers all activities required to construct, from the initial raw data, the final data set that was fed into the WEKA 3.9.1 data mining tool. It is a known fact that real-world data tend to be incomplete, inconsistent and noisy. Therefore, for real-world data to be utilized by the data mining tool, they have to be pre-processed further. The attribute filter in WEKA 3.9.1 was used to remove noisy and incomplete data. The final summary of the attributes used for conducting the experiments after data cleaning is presented in Table 1.

Table 1. Summary of selected students’ attributes
S/N  Attributes                     Data Type
1    English Score                  Float
2    Subject 2 Score                Float
3    Subject 3 Score                Float
4    Subject 4 Score                Float
5    English Grade                  String
6    Subject 2 Grade                String
7    Subject 3 Grade                String
8    Subject 4 Grade                String
9    First Year CGPA                Float
10   Predicted Class of Graduation  Nominal

3.3. Modeling
The model this research work attempts to develop is a stacking classifiers ensemble model. Because the data set for the study is small and imbalanced, machine learning algorithms that previous studies suggest are likely to perform well for this type of model were adopted. The data set before class balancing is shown in Figure 1. SMOTE was used to balance the classes in the data set, increasing the volume of the training data set from 164 instances to 312 instances so that each of the four classes has an equal number of 78 instances, as illustrated in Figure 2. The various models were trained and tested using 10-fold cross validation to avoid over-fitting. The proposed model framework is shown in Figure 3.
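
The class-balancing step above (164 instances grown to 4 × 78 = 312, i.e. 148 synthetic instances) can be sketched as follows. This is a simplified SMOTE-style interpolation for numeric features only, written with the standard library; it is not WEKA's SMOTE filter. The function name `smote_balance`, the parameter `k`, the seeds and the per-class sizes are assumptions, because the original class distribution is not stated in the text.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two numeric feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def smote_balance(X, y, k=5, seed=0):
    """Add synthetic rows until every class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    out_X, out_y = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            base = rng.choice(members)
            # k nearest same-class neighbours of the chosen row, excluding itself.
            others = [m for m in members if m is not base]
            neighbours = sorted(others, key=lambda m: sq_dist(base, m))[:k]
            partner = rng.choice(neighbours) if neighbours else base
            gap = rng.random()  # position along the segment base -> partner
            out_X.append([b + gap * (p - b) for b, p in zip(base, partner)])
            out_y.append(label)
    return out_X, out_y


# Hypothetical class sizes summing to 164; the real distribution is not given.
sizes = {"class_1": 78, "class_2": 50, "class_3": 26, "class_4": 10}
rng = random.Random(7)
X, y = [], []
for offset, (label, n) in enumerate(sizes.items()):
    for _ in range(n):
        X.append([rng.gauss(offset, 0.5), rng.gauss(offset, 0.5)])
        y.append(label)

bal_X, bal_y = smote_balance(X, y)
print(len(X), "->", len(bal_X))  # 164 -> 312
```

Each synthetic row lies on the line segment between a real minority-class row and one of its nearest same-class neighbours, which is why the original rows are all preserved in the balanced set.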

Figure 1. Dataset before class balancing

Figure 2. Dataset after class balancing with SMOTE on WEKA explorer

Figure 3. Proposed model framework
[Block diagram; recoverable labels: Dataset; Data Processing; Training Data; SMOTE; Testing Data; Stacking Ensemble (IBK, J48, SMO, shown twice); Meta Classifier; Evaluation (Accuracy and RMSE); Predicted outcome; Modelling]

3.4. Model training and testing
In this research, a series of training and testing runs was carried out on the various models by dividing the data set into two subsets for model training and testing: 80% of the data set was used for training and the remaining 20% was used for testing. 10-fold cross validation was used throughout model training and testing to avoid over-fitting the models. Since the data set is small and imbalanced, the SMOTE technique was used to balance the classes and increase the volume of the training data set. The WEKA 3.9.1 data mining tool provides a training and testing option to train and test on the same data set.
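
As an illustration of the 10-fold cross validation mentioned above, the sketch below partitions instance indices into ten folds so that every instance is used for testing exactly once. It is a plain (unstratified) partition written for clarity; the function name `k_fold_indices` and the seed are assumptions, and the fold assignment produced by WEKA may differ.

```python
import random


def k_fold_indices(n, k=10, seed=1):
    """Yield (train_idx, test_idx) pairs; each index is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal slices
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# 312 is the size of the balanced training set reported in Section 3.3.
splits = list(k_fold_indices(312, k=10))
print(len(splits), sorted({len(test) for _, test in splits}))  # 10 [31, 32]
```

In each round a model would be fitted on `train_idx` and scored on `test_idx`, and the ten scores averaged.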

3.5. Model evaluation
To evaluate the various models, accuracy and root mean square error (RMSE) were used as performance indicators; the results are presented in tabular form.
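
The two indicators can be computed as in the sketch below. Accuracy is the fraction of correctly classified instances. For RMSE on a nominal class, the sketch assumes one common convention: the squared difference between the predicted class probabilities and a one-hot encoding of the true class, averaged over instances and classes. Whether this matches the exact formula behind the reported values is an assumption; the function names and the toy inputs are illustrative.

```python
import math


def accuracy(true_labels, predicted_labels):
    """Fraction of instances whose predicted label equals the true label."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def rmse(true_labels, predicted_probs, classes):
    """RMSE between predicted class probabilities and one-hot true labels.

    ASSUMPTION: errors are averaged over instances and over classes.
    """
    total = 0.0
    for true, probs in zip(true_labels, predicted_probs):
        for c in classes:
            target = 1.0 if c == true else 0.0
            total += (probs.get(c, 0.0) - target) ** 2
    return math.sqrt(total / (len(true_labels) * len(classes)))


# Toy example with two classes (hypothetical values).
truth = ["A", "B"]
probs = [{"A": 1.0, "B": 0.0}, {"A": 0.5, "B": 0.5}]
print(accuracy(["A", "B", "B", "A"], ["A", "B", "A", "A"]))  # 0.75
print(round(rmse(truth, probs, ["A", "B"]), 4))  # 0.3536
```

As a consistency check on the accuracy definition, an accuracy of 93.9024% on 164 training instances is consistent with 154 correctly classified instances (154 / 164 ≈ 0.939024).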

4. RESULTS AND DISCUSSION
The results obtained on training the various models using the training data set before class balancing are presented in Table 2, while the performance results of the models after class balancing with SMOTE are presented in Table 3. The results in Table 3 indicate that class balancing using SMOTE improves the performance of all the models. Although all the models recorded an improvement in their performance, the proposed modified stacking classifiers ensemble model outperformed the other models in both accuracy and RMSE, which makes it better than the other classifier models.

Table 2. Performance of various models on training dataset before class balancing
S/N  Classifiers        Accuracy   RMSE    TPR    FPR    Precision
1    J48                87.1951%   0.2322  0.872  0.082  0.881
2    SMO                86.5854%   0.3301  0.866  0.096  0.870
3    IBK                82.9268%   0.2097  0.829  0.134  0.840
4    Standard Stacking  87.1951%   0.2404  0.872  0.082  0.881
5    Modified Stacking  93.9024%   0.1719  0.939  0.048  0.941

Table 3. Performance results of various models on training data set after class balancing
S/N  Classifiers        Accuracy   RMSE    TPR    FPR    Precision
1    J48                95.1923%   0.1455  0.952  0.016  0.953
2    SMO                90.7051%   0.3248  0.907  0.031  0.908
3    IBK                90.7051%   0.1591  0.907  0.031  0.919
4    Standard Stacking  96.7949%   0.1098  0.968  0.011  0.969
5    Modified Stacking  97.7564%   0.1060  0.978  0.007  0.978

The accuracy results obtained on testing the various classifier models indicate that the modified stacking ensemble model outperformed the other models, with an accuracy of 97.7564% and an RMSE of 0.1060.

5. CONCLUSION
Data mining can be applied to the students’ data available to higher educational institutions to develop models that predict students’ graduation classes of degree early, using students’ first-year CGPA, UTME subject scores and their corresponding grades in SSCE. Resolving the class imbalance problem in the data sets used for developing data mining models to predict students’ academic performance, using the synthetic minority over-sampling technique (SMOTE), improves model performance.

REFERENCES
[1] B. Ryan & Y. Kalina, "The State of Educational Data Mining in 2009: A Review and Future Visions," Journal of Educational Data Mining, Article 1, vol. 1, no. 1, 2009.
[2] K. Parneet, M. Singh & G. S. Josan, "Classification and Prediction Based Data Mining Algorithms to Predict Slow Learners in Education Sector," 3rd International Conference on Recent Trends in Computing 2015 (ICRTC-2015), Procedia Computer Science 57, pp. 500-508, 2015.
[3] N. Srečko & Z. Moti, "Student Data Mining Solution-Knowledge Management System Related to Higher Education Institutions," Expert Systems with Applications, at ScienceDirect, vol. 41, pp. 6400-6407, 2014.
[4] P. Nithya, B. Umamaheswari, A. Umadevi, "A Survey on Educational Data Mining in Field of Education," International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), vol. 5, no. 1, pp. 69-78, 2016.
[5] S. A. Mohamed, W. Husain, N. A. Rashid, "A Review on Predicting Student’s Performance using Data Mining Techniques," ScienceDirect Procedia Computer Science, the Third Information Systems International Conference, no. 72, pp. 414-422, 2015.
[6] M. S. Tabra & A. Lawan, "A Comparative Analysis of the Performance of Three Machine Learning Algorithms for Tweets on Nigerian Dataset," The International Journal of E-learning and Educational Technologies in the Digital Media (IJEETDM), vol. 3, no. 1, pp. 23-30, 2017.
[7] N. Vlatko, S. Riste et al., "Educational Data Mining: Case Study for Predicting Student Dropout in Higher Education," https://www.researchgate.net/publication/282333827 Conference Paper, April, 2015.
[8] G. Geraldine, C. McGuinness, P. Owende, "An Application of Classification Models to Predict Learner Progression in Tertiary Education," Conference Paper, February 2014, DOI: 10.1109/IAdCC.2014.6779384, 2014.
[9] J. S. Tanveer, R. I. Rashu et al., "Improving Accuracy of Students’ Final Grade Prediction Model Using Optimal Equal Width Binning and Synthetic Minority Over-Sampling Technique," Decision Analytics (2015) 2:1, A Springer Open Journal, 2015.
[10] A. Fadhilah, N. H. Ismail, A. Abdul Aziz, "The Prediction of Students’ Academic Performance Using Classification Data Mining Techniques," Applied Mathematical Sciences, vol. 9, no. 129, pp. 6415-6426, 2015. http://dx.doi.org/10.12988/ams.2015.53289
[11] Teressa T. & Chikohora, "A Study of the Factors Considered when Choosing an Appropriate Data Mining Algorithm," International Journal of Soft Computing and Engineering (IJSCE), vol. 4, no. 3, pp. 42-45, July 2014. ISSN: 2231-2307
[12] O. Edin & M. Suljić, "Data Mining Approach for Predicting Student Performance," Economic Review – Journal of Economics and Business, vol. x, no. 1, pp. 4-12, May 2012.
[13] M. S. Mythili & A. R. M. Shanavas, "An Analysis of students’ Performance Using Classification Algorithms," IOSR Journal of Computer Engineering (IOSR-JCE), e-ISSN: 2278-0661, p-ISSN: 2278-8727, vol. 6, no. 1, pp. 63-69, Ver. III, Jan. 2014. www.iosrjournals.org
[14] S. Sajadin, M. Zarlis, D. Hartama, S. Ramliana, E. Wani, "Prediction of Student Academic Performance by an Application of Data Mining Techniques," 2011 International Conference on Management and Artificial Intelligence, IPEDR, vol. 6, pp. 110-114, 2011.
[15] R. Sikora & O. H. Al-laymoun, "A Modified Stacking Ensemble Machine Learning Algorithm Using Genetic Algorithms," Journal of International Technology and Information Management, vol. 23, no. 1, pp. 1-12, 2014.
[16] N. Chanamarn, K. Tamee, P. Sittidech, "Stacking Technique for Academic Achievement Prediction," 2016 International Workshop on Smart Info-Media Systems in Asia (SISA 2016), pp. 14-17, 2016.
[17] D. Saso & B. Zenko, "Is Combining Classifiers with Stacking Better than Selecting the Best One?," Springer Journal of Machine Learning, vol. 54, pp. 255-273, 2004.
[18] T. Ahmet, "Early Prediction of Students’ Grade Point Averages at Graduation: A Data Mining Approach," Eurasian Journal of Educational Research, no. 54, pp. 207-226, 2014.

BIOGRAPHIES OF AUTHORS

Yusuf Abdulazeez received his BSc. degree in Computer Science from Adamawa State University Mubi, Nigeria, in 2011 and is presently a post-graduate student at the Department of Computer Science, Bayero University Kano, Nigeria. His research interests include Data Mining, Artificial Intelligence and HCI.

Ayuba John received his B.Eng. degree in Computer Engineering from the University of Maiduguri, Nigeria, in 2010 and his M.Eng. degree in Computer Engineering from the University of Benin, Nigeria, in 2017. He worked as a transmission engineer at the National Control Centre (NCC), Transmission Company of Nigeria (TCN), in 2014. He is a member of the Nigerian Society of Engineers (NSE) and is currently a lecturer at Federal University Dutse, Nigeria. His research interests are in the areas of Microelectronics, Intelligent Security Systems and Wireless Sensor Networks.