Indonesian Journal of Electrical Engineering and Computer Science
Vol. 37, No. 3, March 2025, pp. 1772∼1784
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v37.i3.pp1772-1784 ❒ 1772
Tomato leaf disease detection using Taguchi-based Pareto optimized lightweight CNN

Bappaditya Das, C. S. Raghuvanshi
Department of Computer Science and Engineering, Faculty of Engineering and Technology, Rama University, Kanpur, India
Article Info

Article history:
Received Mar 12, 2024
Revised Sep 30, 2024
Accepted Oct 7, 2024

Keywords:
Deep learning
Hyperparameter tuning
Leaf disease detection
Multiobjective
Taguchi method
ABSTRACT

Food security risks becoming a global concern by 2050 due to the exponential growth of the world population. An increase in production is indispensable to satisfy the escalating demand for food. Considering the scarcity of arable land, safeguarding crops against disease is the best alternative to maximize agricultural output. The conventional method of visually detecting agricultural diseases by skilled farmers is time-consuming and vulnerable to inaccuracies. Technology-driven agriculture is an integral strategy for effectively addressing this matter. However, orthodox lightweight convolutional neural network (CNN) models for early crop disease detection require fine-tuning to enhance the precision and robustness of the models. Discovering the optimal combination of several hyperparameters might be an exhaustive process. Most researchers use trial and error to set hyperparameters in deep learning (DL) networks. This study introduces a new systematic approach for developing a less sensitive CNN for crop leaf disease detection by hyperparameter tuning in DL networks. Hyperparameter tuning using a Taguchi-based orthogonal array (OA) emphasizes the S/N ratio as a performance metric primarily dependent on the model's accuracy. The multi-objective Pareto optimization technique accomplished the selection of a robust model. The experimental results demonstrated that the suggested approach achieved a high level of accuracy of 99.846% for tomato leaf disease detection. This approach can generate a set of optimal CNN model configurations to classify leaf disease accurately with limited resources.

This is an open access article under the CC BY-SA license.
Corresponding Author:

C. S. Raghuvanshi
Department of Computer Science and Engineering, Faculty of Engineering and Technology, Rama University
Kanpur, 209217 Uttar Pradesh, India
Email: drcsraghuvanshi.fet@ramauniversity.ac.in
1. INTRODUCTION

Plants are essential for human civilization as they generate food and provide protection against harmful radiation. Tomato is a popular, nutrient-dense vegetable with pharmacological properties [1]. The extensive use of tomatoes escalates demand, resulting in an annual consumption of about 160 million tons [2]. Tomatoes are a highly profitable crop for agricultural households and can have a significant impact on reducing poverty [3]. As per FAO, plant diseases alone accounted for 14% of agricultural production losses, leading to an annual trade deficit of $200 billion [2]. Timely identification of plant diseases can minimize the use of pesticides, thereby safeguarding consumer health and the environment.

Journal homepage: http://ijeecs.iaescore.com

Traditional visual diagnosis of pests and pathogens is more time-consuming and complex.
Farmers face the formidable challenge of frequently monitoring their plants to prevent the spread of disease. Therefore, developing an automated, rapid, and accurate leaf disease detection system is imperative for the early identification of diseases and holds immense importance.
Convolutional neural networks (CNNs) have emerged as a powerful tool for automated leaf disease detection [4]. Their success in accurately detecting diseases has fuelled a surge in research, focusing on developing novel CNN architectures or applying existing models to various crops [4]–[9]. The extensive cultivation of tomatoes and the availability of large, publicly accessible datasets containing diverse disease categories have made tomato leaf disease detection a popular area of deep learning (DL) research [10].
Both the restructured deep residual dense network (RRDN) model [11] and the improved faster region convolutional neural network (Faster RCNN) [12] employed deep residual networks for feature extraction. Researchers have developed several efficient, lightweight CNN models using DL to classify tomato leaf diseases. ToLeD [13], a CNN with a small parameter count of 0.2 M, achieved a maximum testing accuracy of 91%, where validation accuracy was improved through hyperparameter tuning of the epoch, batch size, learning rate, dropout rate, number of convolution layers, and pooling layers. The INAR-SSD model [14], [15], combining rainbow concatenation with the SSD algorithm and the Inception module, achieved 98.49% and 78.80% accuracy for classifying five common leaf diseases of tomato and apple, respectively. Bhujel et al. [16] developed a 20-layered lightweight CNN model (lw_resnet20_cbam) by incorporating the convolutional block attention module (CBAM), spatial attention (SA), squeeze and excitation (SE), and dual attention (DA) modules into the ResNet-20 architecture to classify tomato leaf diseases. The model attained a Top-1 accuracy of 99.51% with a validation loss of 0.0155. A customized CNN was developed using DenseNet201 as the base architecture, followed by adding three convolutional layers and a flattening layer [17]. This model achieved the highest validation accuracy of 98.26% in diagnosing tomato leaf diseases on the PlantVillage dataset. Hyperparameter tuning was utilized for EfficientNet-based transfer learning to achieve 89% accuracy and 0.235 loss in identifying five classes of cassava leaf diseases [18]. An experimental approach optimized hyperparameters such as batch size, epochs, learning rate, optimizer, and loss function. Integrating channel, spatial, and pixel attention using ResNet50, a multi-feature fusion network (MFFN), and the adaptive attention mechanism achieved 99.8% validation accuracy for tomato leaf disease classification [19]. Trials were conducted for 100 epochs using various combinations of the channel attention module (CAM), position attention module (PAM), and cross-position attention module (CPAM) with a batch size of 4 and a fixed learning rate of 0.0003. Optimal batch size and learning rate values can significantly decrease the training time of the model [20], whereas adjusting the ratios of the dataset for training, testing, and validation improves the model's accuracy. Since ResNet50 outperforms visual geometry group (VGG)16 and VGG19 in detecting leaf diseases, an online application [21] for recommending initial treatment by utilizing ResNet50 achieved the highest accuracy of 98.98%. Datasets of varying sizes were used to assess the validation accuracy and loss of the model. The principal component analysis (PCA) technique using VGG16 [22] and the two-stage transfer learning approach employing VGGNet [23] achieve high accuracy in detecting tomato leaf diseases. This research uses semantic segmentation to distinguish precisely between disease-affected and healthy regions. Cutting-edge approaches, such as the proposed U-Net design with skip connections and dilated convolutions, ensure accurate separation. Researchers employed the Taguchi methodology to optimize hyperparameters within a CNN model for accurate breast histopathology image classification [24]. A similar approach was applied to determine the optimal architectural configuration for a DL network for malware detection [25]. Six of the nine control variables were assigned two levels, while the number of filters per convolutional operation was assigned three levels. The authors utilized ANOVA to evaluate model performance and identify significant parameters based on larger-is-better criteria. A generalized Taguchi method was proposed for optimizing hyperparameters in multi-objective CNN models [26]. The method involved defining a performance functional vector, employing extended orthogonal arrays (OAs), and computing a performance index to identify optimal parameter settings.
Optimizing hyperparameters is crucial yet challenging in developing DL models. The traditional trial-and-error method for determining optimal hyperparameter configurations for CNNs is exhaustive and time-consuming. Despite their established impact on DL model performance, optimizing CNN hyperparameters for detecting tomato leaf diseases requires further exploration. This paper proposes a framework that integrates Taguchi-based hyperparameter fine-tuning and multi-objective Pareto optimization to develop a lightweight CNN model for accurately detecting tomato leaf disease.
The significant contributions of this research are as follows: i) To preprocess the image, we perform various augmentations, such as rotation, scaling, flipping, brightness adjustment, normalization, color enhancement, and noise reduction.
ii) We designed a lightweight CNN model with less than three million parameters, achieving superior accuracy in tomato leaf disease detection compared to previous classical CNN models. This model's memory efficiency makes it suitable for deployment in resource-constrained environments, such as mobile or embedded devices.
iii) We employed a systematic approach to optimize hyperparameters rather than relying on trial and error. In our performance review, we consider validation accuracy and loss factors rather than depending exclusively on the standard S/N ratio based solely on accuracy. We expand the population to a fixed size by adjusting the developed model's hyperparameters.
iv) We developed the most robust and least vulnerable model by making a trade-off between multiple objectives using Pareto front optimization.
The rest of the paper is structured as follows: section 2 outlines the proposed methodology. Section 3 provides detailed explanations and discussions of the research findings. Finally, section 4 concludes with different application areas of our research.
2. METHOD

This comprehensive approach ensures accurate crop disease identification through a streamlined process, as depicted in Figure 1.

Figure 1. Process flow diagram of proposed model
2.1. Dataset preparation

A total of 18,160 tomato leaf images with laboratory backgrounds were used from 10 classes collected from the publicly available Kaggle dataset. These images were divided into training (13,083), validation (3,265), and testing (1,812) sets. Representative images from each class are shown in Figure 2. To ensure compatibility with our proposed model, all images were resized to 224 × 224 pixels.
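As a quick consistency check, the quoted split sizes add up to the stated total and correspond to roughly a 72/18/10 train/validation/test split (an illustrative check, not code from the paper):

```python
# Dataset split figures quoted above: 18,160 images total.
total = 18_160
train, val, test = 13_083, 3_265, 1_812

assert train + val + test == total  # the three subsets partition the dataset

# Approximate split proportions, rounded to three decimals.
ratios = [round(n / total, 3) for n in (train, val, test)]
```

Here `ratios` comes out as approximately [0.72, 0.18, 0.1], i.e. a 72/18/10 split.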
2.2. Data preprocessing

2.2.1. Data augmentation

In preprocessing, data augmentation is a valuable machine learning (ML) technique that combats overfitting by generating diverse modified versions of existing data. This process, often applied to images, involves flipping, cropping, rotation, resizing, and other transformations. Random rotation is a common approach that repositions objects in the frame by applying arbitrary clockwise or anticlockwise rotations. Scaling refers to resizing a digital image while maintaining its aspect ratio to ensure it does not appear distorted. Flipping or mirroring pixels horizontally or vertically creates a mirror effect and can increase dataset diversity. Enhancing image brightness through pixel intensity adjustments during preprocessing diversifies data.
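The flip- and rotation-style augmentations described above can be sketched with plain NumPy array operations. This is a minimal illustration only; the paper's actual pipeline and augmentation parameters are not specified here:

```python
import numpy as np

def augment(image):
    """Return simple augmented variants of an (H, W, C) image array:
    horizontal/vertical mirrors and 90-degree rotations. Real pipelines
    also add scaling, brightness shifts, and arbitrary-angle rotation."""
    return [
        np.fliplr(image),      # horizontal mirror
        np.flipud(image),      # vertical mirror
        np.rot90(image, k=1),  # 90 degrees anticlockwise
        np.rot90(image, k=3),  # 90 degrees clockwise
    ]

img = np.arange(2 * 2 * 1).reshape(2, 2, 1)  # tiny 2x2 single-channel image
out = augment(img)
```

Each variant preserves the input shape for square images, so the augmented copies can be fed to the same model input layer.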
Figure 2. Representative tomato leaf image from each class
2.2.2. Normalization

Normalization in preprocessing adjusts pixel intensity ranges, which is beneficial for improving glare-damaged images by contrast or histogram stretching. It enhances ML algorithm efficiency and accuracy. The normalization process can be mathematically represented as:

x_{norm} = \frac{x - x_{min}}{x_{max} - x_{min}} \quad (1)
where x is the original pixel value, x_{min} and x_{max} are the minimum and maximum pixel values in the image, respectively, and x_{norm} is the normalized pixel value.
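Equation (1) is a standard min-max rescaling; a minimal NumPy sketch (illustrative, not the paper's code):

```python
import numpy as np

def min_max_normalize(x):
    """Rescale pixel values to [0, 1] per equation (1)."""
    x = x.astype(np.float64)
    return (x - x.min()) / (x.max() - x.min())

pixels = np.array([[50, 100],
                   [150, 250]])
norm = min_max_normalize(pixels)
```

For this example, the minimum pixel (50) maps to 0, the maximum (250) maps to 1, and 100 maps to (100 − 50)/(250 − 50) = 0.25.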
2.2.3. Color enhancement

Color enhancement reduces illumination and camera-related variations by enhancing color consistency. The specialized algorithms correct color discrepancies to improve data quality, aiding in accurate disease detection on agricultural crop images.
2.2.4.
Noise
r
eduction
Noise
reduction,
a
typical
digital
image
processing
task,
remo
v
es
unw
anted
pix
el
v
alue
uctua
tions,
enhancing
clarity
and
aiding
visual
analysis,
often
using
lters
and
Gaussian
blur
.
A
Gaussian
blur
,
achie
v
ed
through
a
Gaussian
function,
is
a
standard
graphics
ef
fect
that
reduces
visual
noise
and
detail.
It
dif
fers
signi-
cantly
from
bok
eh
or
shado
ws,
crea
ting
a
smooth,
translucent
screen-lik
e
appearance.
A
Gaussian
blur
applies
a
weight
to
nearby
pix
els
based
on
the
tw
o-dimensional
Gaussian
function
gi
v
en
by
(2).
g
(
X
,
Y
)
=
1
√
2
π
σ
e
−
X
2
+
Y
2
2
σ
2
(2)
where X represents the horizontal axis, Y the vertical axis, and σ the standard deviation of a Gaussian distribution. The Gaussian function peaks at (0,0), and its magnitude diminishes with increasing X or Y.
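Equation (2) can be turned into a discrete blur kernel by sampling g(X, Y) on an integer grid and normalizing the weights to sum to 1 so overall brightness is preserved. This sketch keeps the text's 1/(√(2π)σ) constant, which cancels after normalization anyway:

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Build a (size x size) Gaussian blur kernel from equation (2),
    normalized so its weights sum to 1."""
    half = size // 2
    X, Y = np.meshgrid(np.arange(-half, half + 1),
                       np.arange(-half, half + 1))
    g = np.exp(-(X**2 + Y**2) / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return g / g.sum()

k = gaussian_kernel(5, 1.0)
```

As the text notes, the weight peaks at the center (X = Y = 0) and falls off symmetrically with distance, which is why the kernel is symmetric under transposition.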
2.3.
Pr
oposed
lightweight
CNN
Our
proposed
lightweight
CNN
architecture
le
v
erages
ef
cient
b
uilding
blocks
to
achie
v
e
accurat
e
tomato
leaf
disease
classication.
The
architecture
incorporates
v
e
distinct
block
types:
Con
vBlock,
Incep-
tionBlock,
FireBlock,
GhostBlock,
and
AttentionBl
ock
as
sho
wn
in
Figure
3.
A
Con
vBlock
is
a
fundamental
block
consisting
of
three
stack
ed
Con
vModules.
Each
Con
vModule
utilizes
a
tw
o-dimensional
(2D)
con
v
olu-
tional
layer
follo
wed
by
a
2D
max-pooling
layer
.
The
con
v
olutional
layer
e
xtracts
features
by
applying
trainable
lters
to
the
input
image,
generating
unique
feature
maps
for
dif
ferent
image
locations.
The
subsequent
max-
pooling
layer
do
wnsamples
the
feature
maps
while
retaining
the
most
signicant
information
by
selecting
the
maximum
v
alue
within
non-o
v
erlapping
re
gions.
This
do
wnsampling
reduces
model
parameters,
impro
v
es
translation
in
v
ariance,
and
promotes
spatial
re
gularization
to
mitig
ate
o
v
ertting.
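The 2×2 non-overlapping max-pooling step of a ConvModule can be illustrated with a small NumPy sketch (the array values and shape here are illustrative only, not taken from the model):

```python
import numpy as np

def max_pool_2x2(fmap):
    """2x2 non-overlapping max pooling over a 2D feature map: each output
    cell keeps the maximum of one 2x2 input block, halving both spatial
    dimensions."""
    h, w = fmap.shape
    trimmed = fmap[:h - h % 2, :w - w % 2]          # drop odd edge rows/cols
    blocks = trimmed.reshape(h // 2, 2, w // 2, 2)  # group into 2x2 blocks
    return blocks.max(axis=(1, 3))                  # max within each block

x = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 1, 2, 3],
              [4, 5, 6, 7]])
pooled = max_pool_2x2(x)
```

The 4×4 map shrinks to 2×2 while each surviving value is the strongest activation of its region, which is the parameter- and computation-saving behavior described above.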
Figure 3. Proposed lightweight CNN architecture detailing inception, fire, and ghost modules
The InceptionBlock receives the output from the ConvBlock and comprises two consecutive inception modules. While structurally similar to the modules used in GoogLeNet, our implementation utilizes varying kernel sizes and filter quantities in the first inception module to enable feature learning across multiple scales. This approach enhances model accuracy by mitigating the vanishing gradient problem, a common issue in deep neural networks. Additionally, a 1 × 1 convolutional filter within the inception module allows the network to learn patterns across the entire image depth, reducing feature map dimensionality.
The FireBlock comprises three fire modules in series, receiving the output from the InceptionBlock as input. The FireBlock expands the channel depth by 12 on input: each of the initial two fire modules expands its input channel depth by a factor of 4, while the third does so by a factor of 3. The fire module primarily focuses on optimizing computation and extracting features in an efficient way. The fire module increases the number of channels of the input feature maps while preserving height and width. The first fire module in our architecture transforms feature maps from (27, 27, 32) to (27, 27, 128). The MaxPooling2D of a ConvModule reduces feature map dimensions by downscaling and preserving essential features while reducing the computation and memory requirements. The Convolution2D operation in a ConvModule generates multiple feature maps that capture different patterns without altering the spatial dimensions. As a result, the ConvModule reduces the dimensions of the input feature maps from (27, 27, 128) to (14, 14, 96). The input of the next fire module will be feature maps with dimensions (14, 14, 96). Thus, a ConvModule acts as a bridge between two sequentially connected fire modules. Our modified approach has allowed us to reduce the input dimension by 50%, leading to remarkable efficiency in learning high- and low-level input features.
The GhostBlock, composed of two sequentially connected ghost modules [27], efficiently generates additional feature maps through linear operations. Each ghost module first employs standard convolutions followed by linear transformations to produce additional feature maps.
The attention module enables the network to focus on the most critical features and generates a feature map. Global average pooling is a process that reduces the spatial dimension of the feature map generated by the AttentionBlock and converts it into a fixed-length feature vector. Each element of the vector is assigned a channel-wise singular value. A dense layer with the rectified linear unit (ReLU) activation function is added following the global average pooling operation. The ReLU activation function helps to speed up the training process by reducing the likelihood of vanishing gradients. The dropout layer follows the dense layer. The dropout layer randomly drops neurons after each iteration to prevent over-reliance on specific features.
2.4. Hyperparameter optimization

2.4.1. Control factors and level selection

This study examined the influence of six key hyperparameters on a CNN model's performance: epochs, learning rate, batch size, optimizer, number of neurons in the final dense layer, and dropout rate. We assigned a discrete set of levels to each hyperparameter to enable a systematic evaluation. All hyperparameters, except the number of neurons in the dense layer, were assigned equal levels to ensure a balanced exploration of the hyperparameter space. Utilizing the data presented in Table 1 and the formulas in section 2.4.2, we constructed an effective OA, as shown in Table 2. This array allows us to assess every possible combination of hyperparameter values systematically.
Table 1. Levels of hyperparameters and corresponding values

Level    Neurons at dense layer  Epoch  Optimizer  Learning rate  Dropout rate  Batch size
Level 1  90                      10     RMSprop    0.0001         0.10          16
Level 2  120                     25     AdamW      0.0002         0.20          32
Level 3  -                       50     Nadam      0.0003         0.30          64
Level 4  -                       75     Adam       0.0005         0.40          128
2.4.2. Design of orthogonal array

The minimum number of experiments for N control factors in the Taguchi method is defined as [18]:

(DOE)_{min} = \sum_{j=1}^{N} (DOF)_j + 1 \quad (3)

and the DOF of a control factor with L levels is defined as:

(DOF)_L = L - 1 \quad (4)
In this study, we investigated six variables, comprising five control variables with four levels each and one binary variable. This experimental setup resulted in a total of 16 degrees of freedom. An L16(4^5) Taguchi OA was employed to explore the design space for the initial five variables efficiently. Subsequently, to accommodate the binary variable and expand the experimental scope, we extended the array to 32 experimental runs. This expansion was achieved by incorporating the two levels of the sixth variable while ensuring a uniform distribution of levels across all parameters. This approach not only satisfied but exceeded the minimum requirements for experimental size, thereby enhancing the statistical robustness of the analysis.
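Equations (3)-(4) applied to this study's factors give the minimum run count. The sketch below assumes the five 4-level factors and one 2-level (binary) factor described above, and contrasts the result with the full-factorial count:

```python
def min_experiments(levels):
    """Minimum number of Taguchi runs per equations (3)-(4):
    the sum of (L_j - 1) over all control factors, plus 1."""
    return sum(L - 1 for L in levels) + 1

# Assumed from Table 1: five 4-level factors plus one binary factor
# (neurons at the dense layer).
levels = [4, 4, 4, 4, 4, 2]

doe_min = min_experiments(levels)  # 16 degrees of freedom + 1 = 17 runs
full_factorial = 1
for L in levels:
    full_factorial *= L            # 4**5 * 2 = 2048 exhaustive runs
```

The extended 32-run array used in this study exceeds the 17-run minimum while remaining far below the 2,048 runs a full factorial design would require.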
2.4.3.
T
aguchi
method
The
T
aguchi
method,
a
rob
ust
optimization
frame
w
ork
de
v
eloped
by
Genichi
T
aguchi,
has
signi-
cantly
enhanced
product
and
process
quality
across
v
arious
industries
[28].
Unlik
e
the
e
xhausti
v
e
full
f
actorial
approach,
T
aguchi’
s
methodology
reduces
the
e
xperim
ental
b
urden
while
ef
fecti
v
ely
identifying
optimal
pa-
rameter
combinations.
V
alidation
accurac
y
and
loss
are
critical
metrics
for
assessing
DL
model
performance.
Hyperparameters,
such
as
epoch,
learning
rate,
optimizer
,
batch
size,
and
dropout,
substantially
impact
these
metrics.
T
raditionally
,
researchers
ha
v
e
relied
on
time-consuming
trial-and-error
methods
to
optimize
these
h
yperparameters.
The
T
aguchi
method
of
fers
a
more
ef
cient
alternati
v
e
using
O
A
to
e
xplore
the
design
space
systematically
.
O
As
enable
the
in
v
estig
ation
of
multiple
f
actors
and
their
interacti
ons
with
a
minimal
number
of
e
xperiments.
Although
the
selection
of
O
As
is
inuenced
by
the
number
of
control
parameters
and
their
le
v
els
[29],
researchers
can
customize
the
array
size
to
meet
specic
e
xperimental
requirements
[30].
The
T
aguchi
method
is
notably
more
ef
cient
than
the
full
f
actorial
approach,
requiring
signicantly
fe
wer
e
xperi-
ments
(
L
×
(
P
−
1)
compared
to
L
P
),
where
L
represents
the
number
of
le
v
els,
and
P
denotes
the
number
of
parameters.
This
ef
cienc
y
is
particularly
adv
antageous
when
computational
resources
or
time
constraints
are
limiting
f
actors.
The
T
aguchi
method
uses
the
signal-to-noise
ratio
(S/N)
as
an
optimization
criteri
on,
which
is
dened
by
(5).
S/N
=
Desired
signal
strength
Unw
anted
noise
po
wer
(5)
The S/N value is determined based on the problem type and evaluated using one of three performance criteria: larger-is-better, smaller-is-better, or nominal-is-better. For the larger-is-better criterion, the S/N ratio is given by:

\eta_l = (S/N)_l = -10 \log \left( \frac{1}{n} \sum_{i=1}^{n} \frac{1}{y_i^2} \right) \quad (6)

for the smaller-is-better criterion, the S/N ratio is:

\eta_s = (S/N)_s = -10 \log \left( \frac{1}{n} \sum_{i=1}^{n} y_i^2 \right) \quad (7)

for the nominal-is-better criterion, the S/N ratio is:

\eta_a = (S/N)_a = 10 \log \left( \frac{\bar{y}^2}{s_y^2} \right) \quad (8)
here, y_i represents the outcome of the i-th run of a collection of n observations, \bar{y}^2 denotes the mean squared response, and s_y^2 is the variance.
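Equations (6) and (7) can be computed directly. The sketch below assumes base-10 logarithms, as is conventional in Taguchi analysis, and uses illustrative accuracy/loss values rather than the paper's trial data:

```python
import math

def sn_larger_is_better(ys):
    """Equation (6): S/N for larger-is-better responses (e.g. accuracy),
    assuming a base-10 logarithm."""
    n = len(ys)
    return -10 * math.log10(sum(1 / y**2 for y in ys) / n)

def sn_smaller_is_better(ys):
    """Equation (7): S/N for smaller-is-better responses (e.g. loss)."""
    n = len(ys)
    return -10 * math.log10(sum(y**2 for y in ys) / n)

acc_sn = sn_larger_is_better([0.994, 0.998])   # accuracies just below 1
loss_sn = sn_smaller_is_better([0.02, 0.01])   # small validation losses
```

Small losses yield a large positive S/N under the smaller-is-better criterion, while accuracies below 1 yield a slightly negative S/N under larger-is-better; in both cases, a higher S/N means a better, more robust setting.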
2.4.4.
P
ar
eto
optimization
P
areto
optimization,
also
kno
wn
as
P
areto
front
optimization,
is
a
technique
for
multi-objecti
v
e
op-
timization
in
v
arious
elds,
i
ncluding
engineering
and
mathematics.
P
areto
dominance
is
the
k
e
y
concept
in
P
areto
front
optimization.
The
domination
between
tw
o
solutions
is
dened
as
[31],
[32]:
A
solution
P1
is
considered
to
dominate
another
solution
P2
if
and
only
if
both
of
the
follo
wing
conditions
are
satised:
a)
The
solution
P1
is
superior
or
equal
to
P2
in
all
objecti
v
es.
b)
The
solution
P1
is
superior
to
P2
in
at
le
ast
one
objecti
v
e.
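The dominance test a)-b) and the first non-dominated front can be sketched as follows, here maximizing validation accuracy and minimizing validation loss. The three sample points are the (accuracy, loss) pairs of trials 15, 32, and 12 from Table 2:

```python
def dominates(p1, p2):
    """p1 dominates p2 (maximize accuracy, minimize loss): at least as good
    in both objectives and strictly better in at least one."""
    acc1, loss1 = p1
    acc2, loss2 = p2
    return (acc1 >= acc2 and loss1 <= loss2) and (acc1 > acc2 or loss1 < loss2)

def pareto_front(points):
    """Keep solutions not dominated by any other point
    (the first non-dominated front)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

trials = [(99.846, 0.00703),  # trial 15: best accuracy
          (99.794, 0.00653),  # trial 32: best loss
          (99.264, 0.02160)]  # trial 12: dominated by both of the above
front = pareto_front(trials)
```

Trials 15 and 32 do not dominate each other (each wins on one objective), so both sit on the front, while trial 12 is dominated and excluded.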
The non-dominated points are represented as a non-domination front. In Figure 4, the curve passing through P3, P5, and P6 is labeled as the "non-dominated front" of a graph with two conflicting objectives, f1 and f2.
Figure 4. A set of points along with the first non-dominated front
2.4.5. Proposed Taguchi-based Pareto front optimization algorithm

The flowchart of our proposed algorithm is shown in Figure 5. In our Algorithm 1, the size of the OA (R) depends solely on the number of control factors (P) and the levels for each control factor (levelFactor). The proposed Taguchi-based Pareto front optimization (TPFO) takes a parameter named totalTrials equal to R.
The value of R is derived using the formula and technique explained in section 2.4.2. Let levelFactor = [\ell_{jk} \mid j \in \{1, ..., P\}, k \in \{1, ..., L_j\}] be a set of arrays, where \ell_{jk} refers to the k-th level of the j-th factor. The function CREATE_OA generates an OA of size R (L_R) with P control factors and levelFactor. The literature review established the function arguments. A vector with P components represents each input for the P hyperparameters and is mathematically represented as H = [\ell_{h1}(\tau), \ell_{h2}(\tau), ..., \ell_{hP}(\tau)], where \ell_{hi}(\tau) represents the level of the i-th hyperparameter at the \tau-th trial. Each SETFACTOR operation uses a unique combination of parameter levels from the Taguchi table. CONDUCT_EXP runs the experiment using that setting as input and records the outcomes in T. T is a 2D table with R (total number of trials/runs) rows and two columns for storing validation accuracy and loss values. For instance, T[\tau][1] and T[\tau][2] contain the validation accuracy and loss for the \tau-th trial. The SORT and FILTER functions arrange T's validation accuracy records in descending order and discard those records below a user-defined threshold. A scatter plot of the filtered records is created, where each point P_\tau represents a feasible solution in the objective space defined by objective 1 and objective 2 for the \tau-th entry of the OA. FIND_SOLUTIONS compares each solution with every other solution to determine whether any other solution dominates it. A solution is added to the BestResponse set only if no other solution dominates it. H* is the set of hyperparameter combinations considered the best possible settings for achieving optimal results, i.e., H* ⊆ H. The Pareto front is highlighted by sketching the non-dominated solutions.
Figure 5. Flowchart of Taguchi-based Pareto optimized CNN
Algorithm 1. TPFO for hyperparameter tuning
1:  Procedure TPFO(totalTrials)
2:    Declare τ, T[1..R, 1..2], P, LP, Threshold
3:    Input: H = {[ℓ_h1(τ), ℓ_h2(τ), ..., ℓ_hP(τ)], 1 ≤ ℓ_hi(τ) ≤ L_i and 1 ≤ τ ≤ R, following Taguchi orthogonal array}
4:    Initialize totalTrials ← R, τ ← 0, H* ← ∅, BestResponse ← {[−∞, +∞]}
5:    Set numFactors ← P, levelFactor ← [L_1, L_2, L_3, ..., L_P]
6:    // Create a Taguchi orthogonal array with R number of rows //
7:    LR ← CREATE_OA(P, levelFactor)
8:    // Perform the experiments for each trial //
9:    while τ < R do
10:     SET_FACTORS(H_τ)
11:     [ValAccuracy, ValLoss] ← CONDUCT_EXP(τ)
12:     T[τ, 1] ← ValAccuracy
13:     T[τ, 2] ← ValLoss
14:     τ ← τ + 1
15:   end while
16:   SORT_DESCEND(T.ValAccuracy)  {Sort the table by ValAccuracy in descending order}
17:   FILTER(T) based on (T.ValAccuracy ≥ Threshold)
18:   PLOT(Filtered(T))
19:   BestResponse ← BestResponse ∪ {FIND_SOL(Filtered(T))}
20:   H* ← H* ∪ {H associated with BestResponse}
21:   Draw Pareto front.
3.
RESUL
TS
AND
DISCUSSION
W
e
performed
our
in
v
estig
ations
on
a
laptop
equipped
with
an
AMD
Ryzen
5
5600H
processor
,
an
NVIDIA
GeF
orce
GTX
1650
GPU,
and
a
64-bit
W
indo
ws
11
operating
system.
T
ensorFlo
w
with
K
eras
in
Python
3.9.12
w
as
the
DL
frame
w
ork,
utilizing
CUD
A
11.6
for
GPU
acceleration.
Additionally
,
we
beneted
from
the
high-performance
GPU
P100
setup
of
fered
by
Kaggle
accelerators
for
computationally
intensi
v
e
tasks.
An experimental design based on a Taguchi OA was employed to investigate the influence of hyperparameters on model performance. The results, presented in Table 2, stem from multiple trials using two distinct models with varying hyperparameters. Each trial involved different settings for the number of neurons in the dense layer, epochs, optimizer, learning rate, dropout rate, and batch size. The primary objective of these trials was to identify the most effective combinations of these hyperparameters.
We conducted paired and unpaired t-tests to assess the impact of the number of neurons on model performance. The results, summarized in Table 3, indicate no statistically significant differences between model 1 and model 2 regarding validation accuracy and loss.
We have used the Pearson correlation coefficient to analyze the linear relationship between numeric hyperparameters and model performance. The optimizer, being a categorical variable, was excluded from this analysis. The results, visualized in Figure 6, revealed a strong positive correlation between epochs and learning rate with validation accuracy (r = 0.7476 and r = 0.7370, respectively). These r values indicate that increases in these hyperparameters are associated with improved model performance. Conversely, epochs and learning rate exhibited the strongest negative correlation with validation loss (r = -0.7334 and r = -0.7289, respectively), suggesting that increasing these hyperparameters leads to a decline in model error. Both hyperparameters exhibited statistically significant (p-value < 0.05) positive correlations with validation accuracy and negative correlations with validation loss. Dropout showed negligible correlations with validation accuracy (r = -0.2518, p = 0.1795) and validation loss (r = 0.2748, p = 0.1416), suggesting that dropout may not have been a critical factor in improving model performance within the explored parameter space. Similarly, batch size had minimal impact on validation accuracy (r = -0.2161, p = 0.2514) and validation loss (r = 0.2118, p = 0.2612), indicating that the range of batch sizes tested had a negligible effect on model generalization.
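The Pearson coefficient used above can be computed without external libraries. The sketch below uses made-up hyperparameter/accuracy pairs, not the study's trial data:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient: covariance of the two series
    divided by the product of their standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative values only: accuracy rising with epoch count gives r near +1.
epochs = [10, 25, 50, 75]
val_acc = [0.91, 0.95, 0.98, 0.995]
r = pearson_r(epochs, val_acc)
```

A perfectly linear pair of series yields r = 1, while the roughly linear illustrative data above yields an r just above 0.9, the kind of strong positive association reported for epochs and learning rate.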
The results indicate that epochs and learning rate are the most critical hyperparameters influencing model performance. Careful tuning of these parameters can lead to significant improvements in validation accuracy and reductions in validation loss. Dropout and batch size, within the ranges explored, have minimal impact on the model's performance.
We accomplished a comparative analysis of four optimizers (Adam, Nadam, AdamW, and RMSprop) to evaluate their impact on model performance. Adam demonstrated superior performance across all metrics, achieving the highest validation accuracy (99.794%) with the lowest validation loss (0.00653). Nevertheless, it displayed a broader range of performance, indicating a possible sensitivity to hyperparameter settings or dataset
characteristics. Nadam exhibited exceptional performance, achieving the highest accuracy of 99.846% and the lowest loss of 0.00703. While its performance was consistent, it did not consistently surpass Adam. AdamW and RMSprop generally underperformed compared to Adam and Nadam. The performance of RMSprop was characterized by the broadest range of outcomes, suggesting potential instability. Table 2 illustrates that over 93% of the trials achieved an accuracy exceeding 90%, with 30% of cases surpassing 99% accuracy. We have selectively presented data points with validation accuracy greater than 99% to visualize top-performing models. Trials 15 and 32 demonstrated exceptional performance, achieving validation accuracies of 99.846% and 99.794%, respectively, with corresponding losses of 0.00703 and 0.00653. Both trials belong to the first non-dominated Pareto front, as shown in Figure 7, providing options for selecting optimal models. The choice of optimizer significantly impacts model performance, with Adam demonstrating superior overall results. However, its sensitivity suggests that it may not be universally optimal for all scenarios. Nadam emerged as a reliable alternative, balancing performance and stability. RMSprop and AdamW generally underperformed compared to Adam and Nadam. Future research should focus on expanding the dataset and exploring a broader range of hyperparameter values. Additionally, investigating adaptive optimization techniques that integrate the strengths of various optimizers could be a promising direction for future work.
Table 2. Performance evaluation of hyperparameter combinations using OA with multiple objectives (hyperparameter entries are level codes from Table 1)

Trial  Model    Neurons  Epoch  Optimizer  Learning rate  Dropout  Batch size  Validation accuracy  Validation loss
1      Model 1  1        1      1          1              1        1           94.425%              0.18000
2      Model 1  1        1      2          2              2        2           93.016%              0.19360
3      Model 1  1        1      3          3              3        3           93.690%              0.19640
4      Model 1  1        1      4          4              4        4           88.730%              0.32100
5      Model 1  1        2      1          2              3        4           96.202%              0.11780
6      Model 1  1        2      2          1              4        3           95.160%              0.13580
7      Model 1  1        2      3          4              1        2           96.018%              0.11360
8      Model 1  1        2      4          3              2        1           97.672%              0.07540
9      Model 1  1        3      1          3              4        2           97.978%              0.06000
10     Model 1  1        3      2          4              3        1           93.720%              0.18100
11     Model 1  1        3      3          1              2        4           98.890%              0.03845
12     Model 1  1        3      4          2              3        1           99.264%              0.02160
13     Model 1  1        4      1          4              2        3           98.500%              0.04890
14     Model 1  1        4      2          3              1        4           99.660%              0.01350
15     Model 1  1        4      3          2              4        1           99.846%              0.00703
16     Model 1  1        4      4          1              3        2           99.755%              0.00700
17     Model 2  2        1      1          1              1        1           93.660%              0.17460
18     Model 2  2        1      2          2              2        2           95.069%              0.14500
19     Model 2  2        1      3          3              3        3           95.038%              0.13590
20     Model 2  2        1      4          4              4        4           87.351%              0.34750
21     Model 2  2        2      1          2              3        4           92.890%              0.19380
22     Model 2  2        2      2          1              4        3           97.703%              0.06658
23     Model 2  2        2      3          4              1        2           96.477%              0.09570
24     Model 2  2        2      4          3              2        1           97.152%              0.07570
25     Model 2  2        3      1          3              4        2           97.700%              0.07362
26     Model 2  2        3      2          4              3        1           95.957%              0.17560
27     Model 2  2        3      3          1              2        4           99.173%              0.02460
28     Model 2  2        3      4          2              1        3           99.387%              0.01880
29     Model 2  2        4      1          4              2        3           98.652%              0.04483
30     Model 2  2        4      2          3              1        4           99.690%              0.01010
31     Model 2  2        4      3          2              4        1           99.720%              0.01160
32     Model 2  2        4      4          1              3        2           99.794%              0.00653
Table 4 presents a comparative analysis of various CNN models for tomato leaf disease classification based on the number of trainable parameters and achieved accuracy. The proposed TPFO CNN outperforms all other models with a validation accuracy of 99.84% while maintaining a reduced number of trainable parameters (< 3 M). Integrating an attention mechanism effectively captures relevant features, contributing to enhanced performance. In contrast, LMBRNet [9] achieves a high accuracy of 99.70% but with a larger parameter count.