International Journal of Electrical and Computer Engineering (IJECE)
Vol. 9, No. 5, October 2019, pp. 4321~4325
ISSN: 2088-8708, DOI: 10.11591/ijece.v9i5.pp4321-4325
Journal homepage: http://iaescore.com/journals/index.php/IJECE

Comparing machine learning and ensemble learning in the field of football
Shuaib Khan, Kirubanand V. B
Department of Computer Science, CHRIST (Deemed to be University), India

Article Info
Article history:
Received Dec 7, 2018
Revised Apr 13, 2019
Accepted Apr 25, 2019

ABSTRACT
Football has been one of the most popular and best-loved sports since its birth on November 6th, 1869. A main reason for this is that it is highly unpredictable in nature. Predicting the results of football matches seems like a perfect problem for machine learning models, but there are caveats, such as picking the right features from an enormous number of available ones. Many models have been applied to various football-related datasets. This paper compares Support Vector Machines, a machine learning model, with XGBoost, an ensemble learning model, and shows how ensemble learning can greatly improve the accuracy of the predictions.

Keywords:
Decision trees
Ensemble learning
Precision
Support vector machines
XGBoost

Copyright © 2019 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:
Shuaib Khan,
Department of Computer Science,
CHRIST (Deemed to be University),
Hosur Main Road, Bhavani Nagar, S.G. Palya, Bengaluru, Karnataka 560029, India.
Email: shuaib.khan@cs.christuniversity.in

1. INTRODUCTION
Football is a very unpredictable sport. Upsets in which a weaker team beats a relatively stronger one are common, which may be why the sport is loved by so many people all across the world. A whole industry has grown around the question of who is going to win a football match: pre-match analysis by football pundits and experts, and post-game analysis by former players or professionals. Entire channels such as ESPN and Sky Sports are dedicated to analysing which team is going to win, and even during half-time commentators try to predict the winner based on half-time statistics. Betting companies thrive on the unpredictability of football matches. Various betting companies have their own methods or models to predict the results of these matches, and they adjust their odds according to the predictions of these models. Many papers and models have been published to predict matches, most of which have achieved a reasonable amount of accuracy. The objective of this paper is to show the difference between a machine learning model and an ensemble learning model.
Machine learning can be applied to many aspects of real life, but every application is different because of the vast variety of data generated in the modern day. For example, a machine learning model used to predict the value of bitcoin might not be very accurate at classifying pictures of dogs. Choosing the right machine learning model is one part of the problem. The other part is preparing the data for the model. Often we do not get an ideal dataset: there may be missing values, duplicate records, etc. Pre-processing the dataset therefore forms the second part of the problem.
The performance of machine learning models and classifiers is usually ranked on some form of accuracy, that is, a comparison between the actual results and the predicted results. Ensemble learning aims to improve the accuracy of learners (classifiers) by assembling them together. The errors produced by a machine learning classifier can be broken down into bias, variance and irreducible error. Ensemble learners help to strike the right balance between bias and variance errors. This balance is also known as the bias-variance trade-off. In this paper, we look at one type of ensemble learning model called boosting. Boosting is iterative in nature and adds weight to an observation based on the previous results of classification.
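For reference, the breakdown of error into bias, variance and irreducible error is usually written as follows for squared-error loss. The notation is an assumption made here and does not come from the paper: $f$ is the true function, $\hat{f}$ the fitted model, and $y = f(x) + \varepsilon$ with noise variance $\sigma^2$. The classification setting studied in this paper uses a different loss, so this is only the textbook regression form.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```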

2. RESEARCH METHOD
2.1. Related work
Joseph and Fenton [1]: This paper uses Bayesian nets to predict the results of soccer matches and compares the results with other models such as K-nearest neighbours, MC4, etc. It uses expert opinion for feature selection instead of mathematical models, and the analysis is done on the matches played by Tottenham Hotspur. The paper shows that the Bayesian nets outperform the other classifiers when the dataset is disjoint.
Dobravec [2]: This paper recognizes the difficulty of the machine learning approach in the field of football. It uses a matrix factorization model which forecasts the number of goals scored by a team against a certain opponent. The AUC score obtained by the model is 0.677.
Munđar and Šimić [3]: This paper uses statistical techniques such as the Poisson distribution to predict football matches. The model fits a Poisson distribution to the first half of the season and then uses the results to simulate the second half of the season. The model can be used for a priori impact analysis by running simulations of different management strategies based on their expected effects on match results.
Haghighat et al. [4]: This paper identifies two main problems for data mining in the field of football. The first is the relatively low accuracy of classifiers, implying that more accurate models need to be found. The second is the lack of good-quality free datasets in terms of statistics: most of the datasets contain data collected from websites rather than actually relevant statistics.
Buraimo, Forrest and Simmons [5]: This paper goes over quantitative factors affecting the beautiful game. It tries to establish a relationship between the home team's supporters and their influence on the referee, and establishes that the home team, in general, receives fewer yellow cards.
Legaz-Arrese et al. [6]: This paper discusses the home advantage phenomenon that teams enjoy while playing at home and tries to find an explanation for it. It concludes that it can be a combination of factors such as the behaviour of the crowd, the psychological effect on the players, familiarity with the stadium, etc.

2.2. Methodology
2.2.1. Data cleaning and pre-processing
The selected dataset contained features such as the number of goals scored by the home team, the number of goals scored by the away team, shots taken by the home team, shots taken by the away team, home team points, away team points, a variety of betting odds, and finally the full-time result. The data covers the years 2000 to 2013.
Selecting the right features is a very important part of machine learning. This can be done using statistical tests such as Pearson's correlation, linear discriminant analysis, ANOVA, chi-square tests, etc. In this model, however, we computed additional features from the dataset itself, such as Home Team Win Streak, Home Team Loss Streak, Away Team Win Streak, Away Team Loss Streak, Difference in Points between the Home Team and the Away Team, and Difference in Last Year's Predictions. All these features were the independent variables. Full-Time Result was taken as the dependent variable.
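As an illustration of how streak features of this kind could be derived from raw match records, here is a minimal sketch. The record layout (keys `home`, `away`, `result` with values `'H'`, `'D'`, `'A'`), the function name and the rule that a draw resets every streak are assumptions made for this example; the paper does not state how its streaks were defined.

```python
from collections import defaultdict


def add_streak_features(matches):
    """Return a copy of each match with pre-match win/loss streaks attached.

    `matches` must be in chronological order. Each match is a dict with the
    hypothetical keys 'home', 'away' and 'result' ('H', 'D' or 'A').
    """
    win_streak = defaultdict(int)
    loss_streak = defaultdict(int)
    enriched = []
    for match in matches:
        home, away = match["home"], match["away"]
        row = dict(match)
        # Streaks are read before the match is played, so the outcome of
        # the current match never leaks into its own features.
        row["home_win_streak"] = win_streak[home]
        row["home_loss_streak"] = loss_streak[home]
        row["away_win_streak"] = win_streak[away]
        row["away_loss_streak"] = loss_streak[away]
        enriched.append(row)

        # Update the running streaks with this match's outcome.
        if match["result"] == "H":
            winner, loser = home, away
        elif match["result"] == "A":
            winner, loser = away, home
        else:
            # Assumed rule: a draw ends every streak for both sides.
            for team in (home, away):
                win_streak[team] = 0
                loss_streak[team] = 0
            continue
        win_streak[winner] += 1
        loss_streak[winner] = 0
        loss_streak[loser] += 1
        win_streak[loser] = 0
    return enriched


# Toy fixture list with made-up teams A, B and C.
matches = [
    {"home": "A", "away": "B", "result": "H"},
    {"home": "A", "away": "C", "result": "H"},
    {"home": "B", "away": "A", "result": "D"},
    {"home": "A", "away": "B", "result": "A"},
]
rows = add_streak_features(matches)
```

Reading the streak before updating it keeps the feature usable as a pre-match predictor; a points-difference feature could be accumulated in the same loop.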
2.2.2. Support vector machines
Support Vector Machines (SVM) have long been considered one of the go-to algorithms for data scientists. The main reason is the kernel trick [7]. SVM is an efficient data analysis model which can be used for classification as well as regression. It uses a hyperplane to separate the data into classes. This hyperplane (or line) must be selected in such a way that it maximizes the distance to the closest data point of each class. Finding an optimal hyperplane is crucial: it classifies the training data correctly and gives higher accuracy on unseen (test) data. To find the optimal hyperplane we need a margin. The margin is the distance between the hyperplane and the closest data point. The margin is called a no man's land because there should not be any data points between the hyperplane and the margin, as shown in Figure 1.
Figure 1. Depiction of SVM
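The margin described above can be computed directly once a hyperplane is given. The sketch below assumes a hyperplane written as w·x + b = 0 and measures the perpendicular distance |w·x + b| / ||w|| from each point; the function names, the weights and the sample points are invented for illustration and are not taken from the paper.

```python
import math


def distance_to_hyperplane(w, b, x):
    """Perpendicular distance from point x to the hyperplane w.x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(dot + b) / norm


def margin(w, b, points):
    """Smallest distance from any point to the hyperplane."""
    return min(distance_to_hyperplane(w, b, p) for p in points)


# Hypothetical separating line x1 + x2 = 3 and four made-up points.
w, b = (1.0, 1.0), -3.0
points = [(1.0, 1.0), (0.0, 1.0), (3.0, 2.0), (4.0, 4.0)]
print(margin(w, b, points))
```

An SVM searches over w and b for the hyperplane that makes this smallest distance as large as possible while keeping the classes on opposite sides.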
The kernel trick is used when the data is not linearly separable. It converts the data (usually to a higher dimension) in such a way that we can build an optimal hyperplane. In other words, it transforms the data into a form that may be unrecognizable but can be used by the SVM. This helps to draw an accurate margin between classes. A kernel function is responsible for transforming the data. There are many kernel functions, for almost all types of data. The kernel function used here is the RBF (Radial Basis Function), as shown in Figure 2.
Figure 2. RBF formula
Here ||x - x'||^2 is the squared Euclidean distance between two data points and σ is a parameter. The RBF kernel depends on the distance from the origin or from any other point, called a center. It is a real-valued function which is used to approximate functions [8].
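The formula in Figure 2 did not survive in this copy of the text. The usual Gaussian form of the RBF kernel, which matches the symbols described above, is K(x, x') = exp(-||x - x'||^2 / (2σ^2)); treating this as the formula in the figure is an assumption. A minimal sketch of that form, with a hypothetical function name:

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel exp(-||x - y||^2 / (2 * sigma^2)).

    x and y are equal-length sequences of numbers; sigma controls how
    quickly similarity decays with distance.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


same = rbf_kernel((1.0, 2.0), (1.0, 2.0))            # identical points
near = rbf_kernel((0.0, 0.0), (1.0, 0.0))            # distance 1
far = rbf_kernel((0.0, 0.0), (3.0, 4.0))             # distance 5
wide = rbf_kernel((0.0, 0.0), (3.0, 4.0), sigma=5.0) # same pair, larger sigma
```

The kernel equals 1 for identical points and decays towards 0 as the points move apart; a larger σ makes the decay slower. Some libraries express the same kernel through a parameter gamma = 1 / (2σ^2).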
2.2.3. XGBoost
The ensemble learner used in this model was a boosting algorithm known as XGBoost. Boosting is a type of ensemble learning that trains the model on a randomized sample of the data and includes the data points which were not predicted correctly in the next sample of randomly selected data. In other words, it adds weight to the unsuccessful predictions and trains the classifier again. Boosting trains the classifiers in sequence in such a way that each new classifier concentrates on the cases which were classified incorrectly. The results of the sequence of classifiers are compared, and a vote determines the output, as shown in Figure 3.
Figure 3. Depiction of boosting
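The reweight-and-vote loop described above can be sketched with AdaBoost-style boosting over one-dimensional decision stumps. This is a classic boosting scheme chosen here for illustration, not XGBoost's gradient-based procedure and not necessarily what Figure 3 depicts; all function names and the toy data are assumptions.

```python
import math


def fit_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= threshold else -polarity for x in xs]
            error = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds=3):
    """Train `rounds` stumps in sequence; labels must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = fit_stump(xs, ys, weights)
        # Clamp the error so the logarithm below stays finite.
        error = min(max(error, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Misclassified points gain weight, correctly classified ones lose it.
        preds = [polarity if x >= threshold else -polarity for x in xs]
        weights = [w * math.exp(-alpha * y * p)
                   for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(alpha * (pol if x >= thr else -pol)
                for alpha, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single threshold can separate.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(xs, ys, rounds=3)
predictions = [predict(ensemble, x) for x in xs]
```

On this toy data a single stump misclassifies at least two points, while the weighted vote of three stumps, each trained with more weight on the previous mistakes, labels all six correctly.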
XGBoost stands for extreme gradient boosting. Gradient boosting builds on a type of decision tree known as Classification and Regression Trees, or CART. Trees are good at handling huge datasets, can handle qualitative as well as quantitative data, and can ignore redundant variables. One major drawback is that the prediction performance of a single tree is very poor because of its large variance. XGBoost addresses this problem by growing a specified number of trees, each trained on a weighted version of the training data. This form of weighting decorrelates the trees: it removes the correlation between trees by focusing on the regions missed by the past trees. The final classifier is the weighted average of the classifiers. Gradient boosting retains the good features of trees, such as variable selection and mixed predictors, and improves on their weak features, such as prediction performance and scalability [9].

3. RESULTS AND ANALYSIS
As mentioned before, the goal of this paper is to compare an ensemble learning model and a machine learning model to show how ensemble learning can greatly improve accuracy. First, the data is cleaned, and features are computed from the existing dataset and added to it. The selected features are then fed into the SVM model with the RBF kernel. The same data is also fed into the XGBoost model, and the results from both models are compared, as shown in Figure 4.
Figure 4. Model used for comparison
Both accuracy and F1 score are used to measure the performance of the models. The F1 score combines precision and recall: it balances the precision of the output against the recall of the output.
Let fp = false positives; tp = true positives; tn = true negatives; fn = false negatives.
Precision = tp / (tp + fp)
Recall = tp / (tp + fn)
F1_score = 2 * (Precision * Recall) / (Precision + Recall)
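The definitions above translate directly into code. The sketch below is illustrative only: the function names and the confusion-matrix counts are made up, the accuracy function uses the standard definition (correct predictions over all predictions), which the paper does not spell out, and the zero-division guards are an added convention.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * (precision * recall) / (precision + recall)
    return precision, recall, f1


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Made-up counts: 8 true positives, 2 false positives,
# 4 false negatives, 6 true negatives.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
acc = accuracy(tp=8, tn=6, fp=2, fn=4)
```

Unlike accuracy, the F1 score ignores true negatives, which is why the two numbers can differ noticeably when the classes are imbalanced.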
The observations obtained are shown in Table 1 and Table 2.

Table 1. SVM results
Dataset        F1 score   Accuracy
Training Set   0.715      0.756
Test Set       0.654      0.660

Table 2. XGBoost results
Dataset        F1 score   Accuracy
Training Set   0.855      0.869
Test Set       0.801      0.824

4. CONCLUSION
The goal of this paper was to show the superiority of ensemble learning over machine learning models in the field of football. As Tables 1 and 2 show, XGBoost performs considerably better than its machine learning counterpart, SVM, with a test-set accuracy of 0.824 and F1 score of 0.801 against 0.660 and 0.654 for SVM. It should be noted that these classifications were made using in-game statistics which are not available before the match takes place. This is because the aim of this paper is not to predict football matches or to come up with an algorithm or model to do so. Many different ensemble models and machine learning models can be implemented in this area to predict football matches. The results of this paper show that ensemble learning can be a good choice when trying to predict results in this field.

ACKNOWLEDGEMENTS
The authors would like to thank the facilitators of the university, who have been most helpful allies in structuring the data and understanding the problem domain. The paper would not have been complete without the massive contributions of the faculty and resources of Christ (Deemed to be University), for both research and review.

REFERENCES
[1] A. Joseph, N. E. Fenton, "Predicting Football results using Bayesian Nets and other Machine Learning Techniques," Knowledge-Based Systems, vol. 19, no. 7, pp. 544-553, Nov 2006.
[2] Stefan Dobravec, "Forecasting Football World Cup Results using a Matrix Factorization Technique," Elektrotehniski Vestnik/Electrotechnical Review, vol. 82, no. 1, pp. 61-65, Jan 2015.
[3] Munđar Dušan and Šimić Diana, "Croatian First Football League: Teams' performance in the championship," Croatian Review of Economic, Business and Social Statistics, vol. 2, no. 1, pp. 15-23, 2016.
[4] Maral Haghighat, Hamid Rastegari, Nasim Nourafza, "A Review of Data Mining techniques for Result Prediction in Sports," ACSIJ, vol. 2, no. 5, Nov 2013.
[5] Buraimo B., Forrest D. and Simmons R., "The 12th man?: refereeing bias in English and German soccer," vol. 173, pp. 431-449, Mar 2010.
[6] Legaz-Arrese, Alejandro, Moliner-Urdiales, Diego, Munguía-Izquierdo, Diego, "Home Advantage and Sports Performance: Evidence, Causes and Psychological Implications," Universitas Psychologica, vol. 12, no. 3, pp. 933-943, 2013.
[7] Ben Ulmer, Mathew Fernandez Mpeterson, "Predicting Soccer Match Results in the English Premier League," 2013, [Online]. Available: http://cs229.stanford.edu/proj2014/Ben%20Ulmer,%20Matt%20Fernandez,%20Predicting%20Soccer%20Results%20in%20the%20English%20Premier%20League.pdf.
[8] Hui Cao, Takashi Naito, Yoshiki Ninomiya, "Approximate RBF Kernel SVM and Its Applications in Pedestrian Classification," MLVMA'08, Marseille, France, Oct 2008.
[9] Tianqi Chen, Carlos Guestrin, "XGBoost: A Scalable Tree Boosting System," Proceedings of the 22nd ACM SIGKDD, Aug 2016, pp. 785-794.
[10] "2000-2013 Premier League Dataset," Football-data.co.uk, (2018), [Online]. Available: http://www.football-data.co.uk/englandm.php

BIOGRAPHIES OF AUTHORS
Shuaib Khan, Corresponding Author, (Student) Department of Computer Science, Christ (Deemed to be University), Bangalore, India, Email: skpalardin@gmail.com.
Kirubanand V.B (Associate Professor) Department of Computer Science, Christ (Deemed to be University), Bangalore, India.