Indonesian Journal of Electrical Engineering and Computer Science
Vol. 14, No. 1, April 2019, pp. 333~339
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v14.i1.pp333-339

Journal homepage: http://iaescore.com/journals/index.php/ijeecs

Comparing bags of features, conventional convolutional neural network and alexnet for fruit recognition

Nik Noor Akmal Abdul Hamid, Rabiatul Adawiya Razali, Zaidah Ibrahim
Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah Alam, Selangor, Malaysia

Article Info

Article history:
Received Jun 17, 2018
Revised Sep 30, 2018
Accepted Dec 6, 2018

ABSTRACT
This paper presents a comparative study of Bag of Features (BoF), a conventional Convolutional Neural Network (CNN) and AlexNet for fruit recognition. Automatic fruit recognition can minimize human intervention in fruit harvesting operations, as well as operation time and harvesting cost. However, the task is very challenging because of the similarities in shape, colour and texture among various types of fruit. Thus, a robust technique that can produce good results is necessary. Given the outstanding performance of deep learning techniques such as CNN, and of pre-trained models such as AlexNet, in image recognition, this paper investigates the accuracy of a conventional CNN and AlexNet in recognizing thirty different types of fruit from a publicly available dataset. The recognition performance of BoF is also examined, since it is one of the machine learning techniques that achieves good results in object recognition. The experimental results indicate that all three techniques produce excellent recognition accuracy. Furthermore, the conventional CNN achieves the fastest recognition result compared to BoF and AlexNet.

Keywords:
Alexnet
Bag of features
CNN
Fruit recognition

Copyright © 2019 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author:
Nik Noor Akmal Abdul Hamid,
Faculty of Computer and Mathematical Sciences,
Universiti Teknologi MARA, Shah Alam, Selangor, Malaysia.
Email: niknoorakmal1994@gmail.com

1. INTRODUCTION
Fruit recognition is useful for automatic fruit harvesting, where it can reduce or minimize human intervention in harvesting operations as well as the operation time and harvesting cost. A fruit recognition system plays an important role in automatically detecting and inspecting the fruits for harvesting within fruit images. The implementation of a fruit recognition application adds value to the products delivered to consumers [1]. A fruit recognition application is also useful for fruit disease detection and recognition. Manual detection and identification of fruit relies on the human naked eye, which is time consuming and costly. In addition, automatic fruit recognition can facilitate the control of fruit diseases, as the disease can be avoided by appropriate sprinkling of pesticides.
The performance of fruit recognition [1], speech recognition [2], visual object recognition [3], celebrity face recognition [3] and many other domains such as genomics and drug discovery [3] has dramatically improved with the use of deep learning. Deep learning is a class of machine learning algorithms that uses multiple layers containing nonlinear processing units. One of the techniques under deep learning is the Convolutional Neural Network (CNN) [4]. CNN provides successful results in image recognition and classification. The input is an image used for recognition, and during the convolution process the image is transformed into activation maps. A convolutional layer acts as a filter on the input, configured in terms of filter size and padding, for feature extraction. A pooling layer reduces the size of the feature maps. At the end, a fully connected output layer performs the object classification.
AlexNet, a pre-trained CNN model, has produced very good results over the past few years [5]. AlexNet is the winner of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012. It was designed by the SuperVision group, which consists of Geoffrey Hinton, Alex Krizhevsky and Ilya Sutskever [6]. This model has had a big impact on image recognition and classification tasks because of its outstanding performance. AlexNet reduced the top-5 error from 26% to 15.3% in ILSVRC [6]. The network has more filters per layer, with stacked convolutional layers consisting of 11x11, 5x5 and 3x3 convolutions, max pooling, dropout, data augmentation, ReLU activations, and SGD with momentum, as used for face recognition in [6][7]. A ReLU activation is attached after every convolutional layer. AlexNet was trained for six days simultaneously on two Nvidia GeForce GTX 580 GPUs, which is the reason why the network is split into two pipelines [8].
Bag of Words (BoW) [9] has been used for document classification. Bag of Features (BoF), inspired by this original text representation model, was first introduced by [10] for video retrieval, followed by [11] for image categorization. An image is represented as an unordered collection of visual words. BoF gives an extremely compact description of images, as they are represented as histograms of local descriptors. The main idea is to obtain visual words (features) by quantizing the local descriptors of the images in the dataset based on a visual vocabulary. The clustering algorithm takes the descriptors of the training data as input and gives a set of clusters as output. Each cluster is represented by one visual word. The image is then represented as a bag of visual words, and a histogram can be built with a dimension equal to the visual vocabulary size, where each bin contains the frequency of the corresponding visual word in the image [12].
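
The quantization and histogram steps described above can be sketched as follows. This is a minimal illustration rather than the authors' Matlab implementation: the function names `nearest_word` and `bof_histogram`, the toy 2-D descriptors and the three-word vocabulary are all hypothetical, and normalizing the counts to frequencies is an assumption (real SURF descriptors are 64-dimensional and the vocabulary would come from clustering the training descriptors).

```python
import math
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Index of the visual word (cluster centre) closest to a descriptor."""
    distances = [math.dist(descriptor, word) for word in vocabulary]
    return distances.index(min(distances))


def bof_histogram(descriptors, vocabulary):
    """Visual-word frequency histogram with one bin per vocabulary entry."""
    if not descriptors:
        return [0.0] * len(vocabulary)
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    total = len(descriptors)
    return [counts[i] / total for i in range(len(vocabulary))]


# Toy data (assumption): 2-D descriptors and a 3-word vocabulary.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.2), (0.9, 1.1), (1.2, 0.8), (4.8, 5.1)]

histogram = bof_histogram(descriptors, vocabulary)
print(histogram)  # [0.25, 0.5, 0.25]
```

The histogram length equals the vocabulary size, so every image is described by a vector of the same dimension regardless of how many local descriptors it contains.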
The architecture of a pre-trained CNN model like AlexNet is fixed, whereas we can design our own architecture for a conventional CNN model. When a conventional CNN model goes deeper in its convolutional architecture, it can reach a lower identification error rate than the human eye. A conventional CNN is able to extract a hierarchical representation of the input data that remains unchanged under transformation and scaling [13]. The conventional CNN model produces great results for object recognition applications, thus it is suitable for examining the fruit classification problem.
However, in computer vision, the fruit classification task is challenging because of the similar shapes, colors and textures among the various fruits. Thus, the main objective of this research is to investigate the recognition accuracy of BoF compared to a conventional CNN and a pre-trained CNN model, AlexNet, in recognizing fruit based on color images and grayscale images.

2. RESEARCH METHOD
The experiments for this research were conducted in Matlab 2018a on a MacBook Pro with a 3.1 GHz Intel Core i5 processor, 8 GB of RAM and 512 GB of storage. The Fruit dataset was obtained from ResearchGate [14]. This dataset contains 30 classes of fruit: Apple Braeburn, Apple Golden 1, Apple Golden 2, Apple Golden 3, Apple Red 1, Apple Red 2, Apple Red 3, Apple Red Delicious, Apple Red Yellow, Apple Granny Smith, Apricot, Avocado, Avocado Ripe, Banana, Banana Red, Cactus Fruit, Cantaloupe 1, Cantaloupe 2, Carambola, Cherry 1, Cherry 2, Cherry Rainier, Clementine, Cocos, Dates, Granadilla, Grape Pink, Grape White, Grape White 2 and Grapefruit Pink. The dataset consists of 960 training images and 240 validation images, where each class has exactly 40 images. The images show each class from various views. The original size of the images is 100 by 100 pixels, but all of the images were resized to 224x224 for this experiment. Figure 1 shows sample images from the Fruit dataset for the thirty classes.

Figure 1. Sample Pictures from Fruit Dataset [14]
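
The dataset figures quoted above are consistent with an 80/20 split. The sketch below checks the arithmetic and shows one way such a per-class split could be made; the even split of 32 training and 8 validation images in every class is an assumption (the paper only gives the totals), and `split_class` is a hypothetical helper.

```python
CLASSES = 30
IMAGES_PER_CLASS = 40
TRAIN_TOTAL, VALIDATION_TOTAL = 960, 240


def split_class(image_ids, train_fraction):
    """Split one class's image ids into a training part and a validation part."""
    cut = round(len(image_ids) * train_fraction)
    return image_ids[:cut], image_ids[cut:]


# 960 / (960 + 240) = 0.8, so 80% of the images are used for training.
train_fraction = TRAIN_TOTAL / (TRAIN_TOTAL + VALIDATION_TOTAL)

# Assumption: the same fraction is applied inside every class.
train_ids, validation_ids = split_class(list(range(IMAGES_PER_CLASS)), train_fraction)

print(CLASSES * IMAGES_PER_CLASS)           # 1200
print(len(train_ids), len(validation_ids))  # 32 8
```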

2.1. Bag of Features (BoF)
Inspired by the original text representation model, BoF was introduced for image categorization, where an image is represented as an unordered collection of visual words [15]. As images are represented as histograms of local descriptors, BoF gives an extremely compact description of them. A local descriptor is used in image categorization and object recognition tasks and also to match similar object instances. Many methods for feature description can be employed. In this work, we evaluate the result based on the accuracy obtained with visual words. Tasks such as identifying objects in images, transcribing speech into text, matching news items, posts or products with users' interests, and selecting relevant search results can be performed using machine learning techniques such as BoF [3].
In order to obtain a BoF descriptor, we need to extract features from the image. The feature used is Speeded Up Robust Features (SURF). The SURF descriptor is invariant to common image transformations, namely image rotation, scale changes, illumination changes and small changes in viewpoint. In addition, SURF is able to compute distinctive descriptors quickly [16]. The classifier used to classify the SURF features is the Support Vector Machine (SVM). SVM can be used for many kinds of recognition tasks such as fruit recognition, brain wave recognition and image classification in remote sensing [17]. Since SVM was originally developed for binary classification, a multiclass SVM was used to accommodate the multiclass problem [17]. Figure 2 shows an illustration of the bag of features process.

Figure 2. Illustration Process of Bag of Features [18]
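
Because SVM is a binary classifier, a multiclass problem has to be decomposed into several binary ones. The paper does not say which decomposition was used, so the sketch below assumes a one-vs-one scheme with majority voting purely for illustration; `toy_decision` is a hypothetical stand-in for trained binary SVMs, not part of any library.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_pairs(num_classes):
    """All unordered class pairs; one binary classifier is trained per pair."""
    return list(combinations(range(num_classes), 2))


def predict_by_voting(pairs, binary_decision):
    """Ask every pairwise classifier for a winner and return the most voted class."""
    votes = Counter(binary_decision(a, b) for a, b in pairs)
    return votes.most_common(1)[0][0]


pairs = one_vs_one_pairs(30)
print(len(pairs))  # 435 pairwise classifiers for 30 fruit classes

# Hypothetical stand-in for trained binary SVMs: the true class wins any pair
# it takes part in, and all other pairs fall back to the first class of the pair.
TRUE_CLASS = 7


def toy_decision(a, b):
    return TRUE_CLASS if TRUE_CLASS in (a, b) else a


predicted = predict_by_voting(pairs, toy_decision)
print(predicted)  # 7
```

With 30 classes the one-vs-one scheme needs 30 x 29 / 2 = 435 binary classifiers, whereas a one-vs-rest scheme would need 30.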

2.2. Conventional Convolutional Neural Network (CNN)
The architecture of a conventional CNN consists of three types of layers, namely the convolve layer, the pooling layer and the Rectified Linear Unit (ReLU) layer, arranged as a structured series of layers [1]. The conventional CNN passes data forward in a similar way to the conventional feed-forward neural network. Each image is passed through the layers until a loss function is evaluated at the top layer [5]. The extraction of features from an image is performed in the convolve layer by filters that stride over patches of the input image. The ReLU layer, on the other hand, replaces all negative pixel values in the feature map with zero. In order to reduce the dimensionality, a pooling layer is applied, which down-samples the feature map. A max pooling layer computes the local maximum of the feature map. Neighboring pooling units take input from regions of the feature map that are shifted by more than one row or column. Figure 3 shows the layers of a conventional CNN [19].

Figure 3. The layer of a conventional CNN [19].
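
The ReLU and max pooling operations described in this section can be sketched in a few lines. This is an illustrative example on a hand-made 4x4 feature map, not the layers used in the experiments; the 2x2 window and stride of 2 are assumptions.

```python
def relu(feature_map):
    """Replace every negative value in the feature map with zero."""
    return [[max(0.0, value) for value in row] for row in feature_map]


def max_pool(feature_map, size=2, stride=2):
    """Down-sample by keeping the maximum of each size x size window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hand-made 4x4 feature map (assumption, for illustration only).
feature_map = [
    [-1.0, 2.0, 0.5, -3.0],
    [4.0, -2.0, 1.0, 0.0],
    [-0.5, -1.5, -2.0, -4.0],
    [3.0, 1.0, -1.0, 6.0],
]

activated = relu(feature_map)
pooled = max_pool(activated)
print(pooled)  # [[4.0, 1.0], [3.0, 6.0]]
```

The 4x4 map is reduced to 2x2, which is the dimensionality reduction that the pooling layer provides.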

2.3. Alexnet
The common pre-trained CNN model investigated in this paper is AlexNet, the winner of the ILSVRC in 2012 [1][5]. AlexNet has a deeper architecture than the conventional CNN, with more filters per layer and stacked convolution layers. For this research, the fully-connected layers are fine-tuned to classify 30 different categories, since the dataset consists of 30 different fruits. An illustration of the layers of AlexNet is shown in Figure 4.

Figure 4. The layers of Alexnet [2]

3. RESULTS AND ANALYSIS
3.1. Bag of Features
The size of the image in the input layer is 224x224x3 pixels for color images and 224x224x1 pixels for grayscale images. Speeded Up Robust Features (SURF) are extracted in BoF, where the detector finds the best scale invariance. For this model, RGB images and grayscale images were tested. The results show that the total processing time for grayscale images is shorter than for RGB images, which is due to the smaller number of values per pixel, but an accuracy of 1 is obtained only with RGB images and not with grayscale images. This is because the conversion from a colour image to a grayscale image eliminates some data that may be useful in object recognition. Table 1 shows the results produced by BoF for color and grayscale images.

Table 1. Accuracy performance of BoF
Input Image        Input Size    Accuracy    Total Time
RGB Image          224x224x3     1           9 min 3s
Grayscale Image    224x224x1     0.98        5 min 47s
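
The trade-off between the two input types can be illustrated with a small sketch. It assumes the common ITU-R BT.601 luma weights for the colour-to-grayscale conversion; the paper does not state which conversion was used, and the two example pixels are made up. The point is that grayscale input carries one third of the values, and that visibly different colours can collapse to the same gray level.

```python
def rgb_to_gray(pixel):
    """Weighted sum of the three channels (ITU-R BT.601 luma weights, assumed)."""
    red, green, blue = pixel
    return 0.299 * red + 0.587 * green + 0.114 * blue


def values_per_image(height, width, channels):
    """Number of intensity values stored for one image."""
    return height * width * channels


print(values_per_image(224, 224, 3))  # 150528
print(values_per_image(224, 224, 1))  # 50176

# Two clearly different colours (hypothetical pixels) ...
pure_red = (255, 0, 0)
dark_green = (0, 130, 0)

# ... map to the same 8-bit gray level after rounding.
print(round(rgb_to_gray(pure_red)), round(rgb_to_gray(dark_green)))  # 76 76
```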

3.2. Conventional Convolutional Neural Network (CNN)
For the conventional CNN, the dataset is tested with an image size of 224x224x3. In a CNN, there are two main layers that play important roles in analysing the dataset, namely the convolve layer and the max pooling layer. Experiments with different values for both of these layers are performed to determine the best accuracy, and the results are shown in Table 2. As shown in Table 2, the Fruit dataset is tested twelve times to see the recognition accuracy for different convolve layers and learning rates. For RGB images, a single convolve layer with (3,16) and a learning rate of 0.001 achieves an accuracy of 1 with a total processing time of 3 minutes and 10 seconds. For double convolve layers with (5,20) and (3,20), the accuracy is 0.9967 with a total processing time of 6 minutes and 5 seconds. Meanwhile, for grayscale images, a single convolve layer with (5,20) and a learning rate of 0.001 gives an accuracy of 0.9933 with a total processing time of 2 minutes and 18 seconds. For double convolve layers with (5,20) and (3,20) and a learning rate of 0.0001 on grayscale images, the total processing time is 5 minutes and 58 seconds.

Table 2. Accuracy performance of conventional CNN
Input             No. of layers   Convolve Layer   Stride   Epoch, Learning Rate   Accuracy   Total Time
RGB Image         Single Layer    3,16             3        5, 0.0001              1          4 min 27s
                                  5,20             2        5, 0.0001              1          6 min 08s
                                  3,16             3        5, 0.001               1          3 min 10s
                                  5,20             2        5, 0.001               1          5 min 35s
                  Double Layer    5,20; 3,20       3; 3     5, 0.0001              0.9967     6 min 5s
                                  9,40; 3,20       3; 3     5, 0.0001              1          13 min 6s
Grayscale Image   Single Layer    3,16             3        5, 0.0001              1          4 min 27s
                                  5,20             2        5, 0.0001              1          3 min 05s
                                  3,16             3        5, 0.001               0.9.33     2 min 47s
                                  5,20             2        5, 0.001               0.9933     2 min 18s
                  Double Layer    5,20; 3,20       3; 2     5, 0.0001              1          5 min 58s
                                  9,40; 3,20       3; 2     5, 0.0001              1          11 min 28s
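
To see how the filter size and stride in Table 2 affect the size of the feature map, the standard output-size formula floor((W - F + 2P) / S) + 1 can be applied to the 224-pixel input. The sketch reads a pair such as (3,16) as a 3x3 filter with 16 filters and assumes zero padding; both are assumptions, since the paper does not spell them out.

```python
def conv_output_size(input_size, filter_size, stride, padding=0):
    """Spatial size of a convolution output: floor((W - F + 2P) / S) + 1."""
    return (input_size + 2 * padding - filter_size) // stride + 1


def activation_count(input_size, filter_size, num_filters, stride):
    """Number of values in the output volume of one convolve layer."""
    side = conv_output_size(input_size, filter_size, stride)
    return side * side * num_filters


# Single-layer settings from Table 2, read as (filter size, number of filters).
print(conv_output_size(224, 3, 3))      # 74
print(conv_output_size(224, 5, 2))      # 110
print(activation_count(224, 3, 16, 3))  # 87616
print(activation_count(224, 5, 20, 2))  # 242000
```

Under these assumptions the (5,20) setting with stride 2 produces a noticeably larger output volume than (3,16) with stride 3, which is in line with its longer processing time for RGB images in Table 2.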

3.3. Alexnet
In this experiment, the images are resized to 224x224x3 pixels. For the experiment with AlexNet, only color images are tested. As shown in Table 3, the Fruit dataset was tested three times to investigate the effect of different learning rates on the accuracy. An accuracy of 1 is obtained with a learning rate of 0.0001, and the total processing time to complete the experiment is 22 minutes and 2 seconds.

Table 3. Accuracy performance of Alexnet
No. of Test    Image Input Size    Learning Rate    Accuracy    Total Time
1              224x224x3           0.0001           1           22 min 2s
2              224x224x3           0.001            0.5708      22 min 3s
3              224x224x3           0.0005           0.9542      23 min 02s
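
Accuracy here is the fraction of correctly recognized images. As a rough consistency check, and assuming the accuracies in Table 3 were measured on the 240 validation images (the table itself does not say so), the reported values can be turned back into counts of correct predictions:

```python
VALIDATION_IMAGES = 240  # assumption: Table 3 is evaluated on the validation set


def accuracy(correct, total):
    """Fraction of correct predictions, rounded to four decimals as in Table 3."""
    return round(correct / total, 4)


def implied_correct(reported_accuracy, total):
    """Nearest whole number of correct predictions for a reported accuracy."""
    return round(reported_accuracy * total)


for learning_rate, reported in [(0.0001, 1.0), (0.001, 0.5708), (0.0005, 0.9542)]:
    correct = implied_correct(reported, VALIDATION_IMAGES)
    print(learning_rate, correct, accuracy(correct, VALIDATION_IMAGES))
```

Under that assumption the three learning rates correspond to 240, 137 and 229 correctly recognized images out of 240.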

Table 4 shows the summary of fruit recognition performance using BoF, the conventional CNN and AlexNet. Table 4 shows that all three models produce an accuracy of 1, except for the conventional CNN with grayscale images, which achieves 0.99. The total time for AlexNet is the longest of the three models because it has more layers than the conventional CNN.

Table 4. Fruit recognition performance of BoF, Conventional CNN and Alexnet
Model         Machine Learning             Deep Learning
              Bag of Features              Conventional CNN             Alexnet
              RGB          Grayscale       RGB          Grayscale       RGB
Input Size    224x224x3    224x224x1       224x224x3    224x224x1       224x224x3
Accuracy      1            1               1            0.99            1
Total Time    9 min 3s     5 min 47s       3 min 10s    2 min 18s       22 min 2s
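
To compare the processing times in Table 4 directly, they can be converted to seconds and ranked. The sketch below is only a convenience calculation on the reported values; the dictionary keys and the helper `to_seconds` are illustrative names.

```python
import re


def to_seconds(duration):
    """Convert a string such as '9 min 3s' into a number of seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*min\s*(\d+)\s*s\s*", duration)
    if match is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    minutes, seconds = map(int, match.groups())
    return minutes * 60 + seconds


# Total times as reported in Table 4.
total_time = {
    "BoF (RGB)": "9 min 3s",
    "BoF (Grayscale)": "5 min 47s",
    "Conventional CNN (RGB)": "3 min 10s",
    "Conventional CNN (Grayscale)": "2 min 18s",
    "Alexnet (RGB)": "22 min 2s",
}

ranking = sorted(total_time, key=lambda name: to_seconds(total_time[name]))
for name in ranking:
    print(f"{name}: {to_seconds(total_time[name])} s")
```

In seconds the reported times are 138 and 190 for the conventional CNN, 347 and 543 for BoF, and 1322 for AlexNet, so AlexNet takes roughly seven times as long as the conventional CNN on RGB images.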

4. CONCLUSION
This paper has presented the accuracy performance of BoF, a conventional CNN and AlexNet for fruit recognition on the Fruit dataset. We analyzed the performance of fruit recognition on colour and grayscale images. Based on the results of the experiments, we can see that BoF with SURF and SVM still produces results as excellent as those of CNN. Even though the overall training and testing time of BoF is longer than that of the conventional CNN, it is still faster than AlexNet. This shows the robustness of BoF in recognizing fruits with different shapes, colours and textures. For future work, we will conduct more experiments on other datasets with other machine learning and deep learning techniques.

ACKNOWLEDGEMENTS
The authors would like to thank the Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah Alam, Selangor, for sponsoring this research.

REFERENCES
[1] N. A. Muhammad, A. A. Nasir, Z. Ibrahim, and N. Sabri, “Evaluation of CNN, Alexnet and GoogleNet for Fruit Recognition,” IJEECS, vol. 12, no. 2, pp. 468–475, 2018.
[2] R. D. Safiyah, Z. A. Rahim, S. Syafiq, Z. Ibrahim, and N. Sabri, “Performance Evaluation for Vision-Based Vehicle Classification Using Convolutional Neural Network,” IJET, vol. 7, pp. 86–90, 2018.
[3] Y. Lecun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
[4] J. Ubbens, M. Cieslak, P. Prusinkiewicz, and I. Stavness, “The use of plant models in deep learning: An application to leaf counting in rosette plants,” Plant Methods, vol. 14, no. 1, pp. 1–10, 2018.
[5] N. Sabri, Z. Abdul Aziz, Z. Ibrahim, M. a.R. Akmal Rasydan, and A. H. Abd Ghani, “Comparing Convolution Neural Network Models for Leaf Recognition,” International Journal of Engineering and Technology (IJET), vol. 7, pp. 141–144, 2018.
[6] A. Kortylewski, B. Egger, A. Schneider, T. Gerig, A. Morel-Forster, and T. Vetter, “Empirically Analyzing the Effect of Dataset Biases on Deep Face Recognition Systems,” 2017.
[7] N. Ateqah, B. Mat, N. Hidayah, B. Abd, and Z. Ibrahim, “Celebrity Face Recognition using Deep Learning,” IJEECS, vol. 12, no. 2, pp. 476–481, 2018.
[8] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” in Advances in Neural Information Processing Systems, pp. 1097–1105, 2012.
[9] K. S. George and S. Joseph, “Text Classification by Augmenting Bag of Words (BoW) Representation with Co-Occurrence Feature,” IOSR Journal of Computer Engineering (IOSR-JCE), vol. 16, no. 1, pp. 34–38, 2014.
[10] J. Sivic and A. Zisserman, “Video Google: A text retrieval approach to object matching in videos,” Towar. Categ. Object Recognit., no. Iccv, pp. 1470–1477, 2003.
[11] C. S. Venegas-Barrera and J. Manjarrez, “Visual Categorization with Bags of Keypoints,” Rev. Mex. Biodivers., vol. 82, no. 1, pp. 179–191, 2011.
[12] C. Hiba, Z. Hamid, and A. Omar, “Bag of Features Model Using the New Approaches: A Comprehensive Study,” International Journal Advances Computer Science and Applications, vol. 1, no. 7, pp. 226–234, 2016.
[13] Z. Ibrahim, N. Sabri, and D. Isa, “Palm Oil Fresh Fruit Bunch Ripeness Grading Recognition Using Convolutional Neural Network,” Journal of Telecommunication, Electronic & Computer Engineering, vol. 9, no. 3-2, pp. 109–113, 2018.
[14] H. Mureșan and M. Oltean, “Fruit recognition from images using deep learning,” Acta Universitatis Sapientiae, Informatica, vol. 10, pp. 26–42, doi: 10.2478/ausi-2018-0002, 2018.
[15] E. Okafor, P. Pawara, F. Karaaba, and O. Surinta, “Comparative Study Between Deep Learning and Bag of Visual Words for Wild-Animal Recognition,” in IEEE Symposium Series for Computational Intelligence (SSCI), pp. 1–8, 2016.
[16] S. Hwang, “Bag-of-visual-words approach based on SURF features to polyp detection in wireless capsule endoscopy videos,” Proc. 2011 Int. Conf. Image Process. Comput. Vision, Pattern Recognition, IPCV 2011, vol. 2, no. i, pp. 941–944, 2011.
[17] Z. Ibrahim, N. Sabri, and N. N. A. Mangshor, “Leaf Recognition using Texture Features for Herbal Plant Identification,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 9, no. 1, pp. 152–156, 2018.
[18] S. O’Hara and B. A. Draper, “Introduction to the bag of features paradigm for image classification and retrieval,” arXiv preprint arXiv:1101.3354, 2011.
[19] C. Zhang, P. Wang, K. Chen, and J. K. Kämäräinen, “Identity-aware convolutional neural networks for facial expression recognition,” Journal of Systems Engineering and Electronics, vol. 28, no. 4, pp. 784–792, 2017.

BIOGRAPHIES OF AUTHORS
Nik Noor Akmal Abdul Hamid is a Master’s student at Universiti Teknologi MARA, Shah Alam, Selangor, Malaysia, where she is continuing her education in the field of Computer Science in Web Technology. Her areas of interest are image processing, data science and big data, and electronic commerce.

Rabiatul Adawiya Razali is a Master’s student at Universiti Teknologi MARA, Shah Alam, Selangor, in the field of Computer Science in Web Technology. Her areas of interest are image processing, database and knowledge-base.

Zaidah Ibrahim is an Associate Professor at the Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Shah Alam, Selangor, Malaysia. She has been teaching courses related to Artificial Intelligence for over ten years. She is actively involved in research and publication under the Digital Image, Audio and Speech Technology (DIAST) research interest group, whose work includes text and object recognition.