International Journal of Electrical and Computer Engineering (IJECE)
Vol. 6, No. 6, December 2016, pp. 3238–3246
ISSN: 2088-8708
Image Retrieval with Relevance Feedback using SVM Active Learning
Truong-Giang Ngo (1), Quoc-Tao Ngo (2), and Duc-Dung Nguyen (3)
(1) Department of Information Technology, HaiPhong Private University
(2, 3) Institute of Information Technology, Vietnamese Academy of Sciences and Technology
Article Info
Article history:
Received Jun 26, 2016
Revised Aug 25, 2016
Accepted Sep 8, 2016
Keywords:
Interactive image retrieval
Content-based image retrieval
Relevance feedback
SVM
Active learning
Batch mode active learning
ABSTRACT
In content-based image retrieval, relevance feedback has been studied extensively as a way to narrow the gap between low-level image features and high-level semantic concepts. In general, relevance feedback aims to improve retrieval performance by learning from the user's judgements on the retrieval results. Despite widespread interest, relevance feedback techniques still face several limitations. One of the most obvious is that the user often has to repeat a number of feedback steps before obtaining improved search results, which makes the search inefficient and tedious for online applications. In this paper, an effective relevance feedback scheme for content-based image retrieval is proposed. First, a decision boundary is learned via a Support Vector Machine to filter the images in the database. Then, a ranking function for selecting the most informative samples is computed by defining a novel criterion that considers both the scores of the Support Vector Machine function and a similarity metric between the "ideal query" and the images in the database. Experimental results on standard datasets show the effectiveness of the proposed method.
Copyright © 2016 Institute of Advanced Engineering and Science. All rights reserved.
Corresponding Author:
Ngo Truong Giang
Department of Information Technology, HaiPhong Private University
No. 36 Dan Lap Road, Hai Phong, Vietnam
Phone: +84904051206
Email: giangnt@hpu.edu.vn
Journal Homepage: http://iaesjournal.com/online/index.php/IJECE, DOI: 10.11591/ijece.v6i6.11631

1. INTRODUCTION
The rapid development of digital devices and the dominance of social networks have led to a great demand for sharing, browsing and searching images. To satisfy such requirements, image retrieval systems have become an urgent necessity. Basically, there are two main frameworks for building image retrieval systems: text-based and content-based [1]. In text-based image retrieval systems, the users' queries are composed of keywords that describe the image content. The system retrieves images based on image labels, which are annotated manually. However, the difficulty of annotating a massive number of images and of avoiding subjective labelling makes this framework impractical. To overcome such hindrances, Content-Based Image Retrieval (CBIR) is regarded as a better-suited approach, which aims to bring image content closer to human understanding. In CBIR, low-level visual features such as colors, textures, patterns, and shapes are used to describe image contents. These low-level features are extracted automatically to represent the images in the database without manual intervention. The advantage of CBIR over keyword-based image retrieval lies in the fact that feature extraction can be performed automatically and that an image's own content is always consistent. However, the most challenging problem in CBIR systems is the semantic gap [2], [3], i.e., images of dissimilar semantic content may share some common low-level features, while images of similar semantic content may be scattered in the feature space. Despite the great deal of research dedicated to finding an ideal descriptor for image content [4], [5], [6], [7], [8], [9], retrieval performance is still far from satisfactory because of the fundamental difference between human understanding (high-level concepts) and machine understanding (low-level features).
To narrow the semantic gap, one possible solution is to integrate human interaction into the system, which is popularly known as Relevance Feedback (RF) [10], [11]. In general, RF aims to improve retrieval performance by learning from the user's judgments on the retrieval results. To do so, the system runs through several iterations. In each iteration, the CBIR system first returns a short list of top-ranked images with respect to the user's query, using a regular retrieval approach based on the Euclidean distance measure; some of these images are then presented to the user, who labels them as relevant or irrelevant (positive or negative examples). Using these labeled images as seeds, machine learning techniques are used to build a model that classifies the database images into two classes: images that are expected to satisfy the user, and irrelevant images.
A typical scenario for a CBIR system with RF using machine learning [2] (shown in Figure 1) is as follows:
1. The user chooses the query image, and the low-level features of the query image are extracted.
2. Result images are returned. There are two cases:
   Initial phase: the ranking depends on the similarity between the low-level features of the query image and those of the database images, since no training examples are available yet to train a classifier.
   RF loops: the decision function of the classifier is used as the ranking function.
3. The user judges the result images as to whether, and to what degree, they are relevant (positive examples) or irrelevant (negative examples) to the query. After judging, these images are labeled.
4. A machine learning algorithm is applied to learn from the user feedback, using the labeled examples obtained from the first to the current iteration. Then go back to Step 2.
Note that Steps 2, 3 and 4 are repeated until the user is satisfied with the results.
Figure 1. The CBIR system with Relevance Feedback [2]
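As a rough illustration of the scenario above, the following Python sketch runs Steps 2 to 4 in a loop. It is only a sketch under stated assumptions: the names `relevance_feedback`, `ask_user`, `train` and `centroid_trainer` are hypothetical, and the nearest-centroid scorer is a simple stand-in for the learned classifier, not the SVM used in this paper.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
Scorer = Callable[[Vector], float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def centroid_trainer(xs: List[Vector], ys: List[int]) -> Scorer:
    """Hypothetical stand-in learner: a higher score means closer to the
    relevant centroid and farther from the irrelevant one."""
    def centroid(label: int):
        members = [x for x, y in zip(xs, ys) if y == label]
        return [sum(c) / len(members) for c in zip(*members)] if members else None

    pos, neg = centroid(1), centroid(-1)

    def score(x: Vector) -> float:
        s = 0.0
        if pos is not None:
            s -= euclidean(x, pos)
        if neg is not None:
            s += euclidean(x, neg)
        return s

    return score


def relevance_feedback(query: Vector,
                       database: List[Vector],
                       ask_user: Callable[[int], bool],
                       train: Callable[[List[Vector], List[int]], Scorer],
                       top_n: int = 3,
                       rounds: int = 2) -> List[int]:
    """Return database indices ranked after `rounds` feedback iterations."""
    labels: Dict[int, int] = {}
    # Step 2, initial phase: rank by low-level feature distance to the query.
    ranking = sorted(range(len(database)),
                     key=lambda i: euclidean(query, database[i]))
    for _ in range(rounds):
        # Step 3: the user labels the top-ranked images (+1 relevant, -1 irrelevant).
        for i in ranking[:top_n]:
            labels[i] = 1 if ask_user(i) else -1
        # Step 4: learn from all labels collected so far.
        idx = list(labels)
        score = train([database[i] for i in idx], [labels[i] for i in idx])
        # Step 2, RF loop: the learned scoring function becomes the ranking function.
        ranking = sorted(range(len(database)), key=lambda i: -score(database[i]))
    return ranking
```

Passing the learner in as `train` keeps the loop independent of the classifier, so an SVM-based scorer could be substituted without changing the loop itself.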
From a general machine learning view, RF is essentially a binary classification problem in which sample images provided by the user are employed to train a classifier, which is then used to classify the database into images that are relevant to the query and those that are not [1], [2]. However, RF differs from the traditional classification problem because the feedback provided by the user is often limited in real-world image retrieval systems. Therefore, small-sample learning methods are the most promising for RF. Support Vector Machine (SVM) is one of the popular small-sample learning methods widely used in recent years, and it performs very well on pattern classification problems [12], [13], [14], [15], [16]. Compared with other learning algorithms, SVM appears to be a good candidate for several reasons: its generalization ability without restrictive assumptions regarding the data; fast learning and evaluation for relevance feedback; and flexibility, e.g., prior knowledge can easily be used to tune its kernels. However, for SVM-based relevance feedback, retrieval performance degrades when the number of labeled positive feedback samples is small. SVM active learning actively selects samples close to the boundary as the most informative samples for the user to label in each round of RF [17], [18], [19].
Although relevance feedback based on SVM active learning can work better than conventional SVM-based relevance feedback, it has two major drawbacks. First, the performance of SVM is usually limited by the number of labeled examples. Second, since the batch of examples is selected all at once, the previously labeled examples have no influence on the selection of the remaining examples in the batch. To solve this problem, Hoi et al. [20] proposed Semi-Supervised SVM Batch Mode Active Learning. This method first constructs a kernel function that is learned from a mixture of labelled and unlabelled examples. The kernel is then used to effectively identify informative and diverse examples for active learning via a min-max framework. Zhang et al. [21] proposed a dynamic batch mode SVM active learning scheme, which dynamically selects a batch of examples one by one, using the label of the previously selected example to guide the selection of the next one. The selection of feedback examples is determined by both the existing classification boundary and the previously labelled examples.
In the solutions presented above, the selection of examples for the user to label in each round of RF is solely determined by the existing SVM decision boundary. However, in early iterations, the SVM decision boundary might not be accurate because of the lack of training examples. In this case, the samples selected by those methods are not the ones that should be selected, which makes the subsequent learning inefficient. Consequently, retrieval performance remains poor even after several rounds of learning.
To address the above problems, we propose a novel batch mode for SVM active learning. In the proposed method, a decision boundary is first learned via SVM to filter the images in the database. Then, a ranking function is constructed by defining a novel criterion that considers both the scores of the SVM function and a similarity measure between the query and the images in the database. This can effectively reduce the adverse effect of an inaccurate decision boundary. By using the priority coefficient in the ranking function, we can select a batch of feedback examples that may be informative enough to improve the retrieval accuracy significantly. Experimental results on standard datasets show the effectiveness of the proposed method, especially when the number of labelled samples is small in early iterations.
The rest of this paper is organized as follows. Section 2 presents the basic theory of SVM-based RF. Section 3 presents the problem formulation and our solution. The retrieval performance of the proposed method is presented in Section 4. Finally, Section 5 gives the conclusions and discusses future research directions.
2. SVM-BASED RELEVANCE FEEDBACK
SVM was first introduced by Vapnik et al. [22] and remains an active area of machine learning research around the world. With strong theoretical foundations, it is used in many applications and is a popular small-sample learning method that performs very well on pattern classification problems.
The key idea of SVM is as follows: given a set of $n$ labelled examples $L = \{(x_1, y_1), \ldots, (x_n, y_n)\}$, where $x_i \in \mathbb{R}^d$ represents an image by a $d$-dimensional vector and $y_i \in \{-1, 1\}$ is its label, find a hyperplane

$$f(x) = (w \cdot x) + b \tag{1}$$

that achieves the best separation of the two classes, such that the empirical risk is minimized and the margin is maximized for the training vectors that are correctly classified. This is a quadratic programming problem. It is solved by finding $w$ and $b$ that minimize

$$\frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{s.t.} \quad y_i (w \cdot x_i + b) \geq 1 - \xi_i, \quad \xi_i \geq 0, \quad i = 1, \ldots, n. \tag{2}$$

The corresponding dual form is as follows: find the parameters $\alpha_i$, $i = 1, \ldots, n$, that maximize

$$L(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \tag{3}$$

$$\text{s.t.} \quad \sum_{i=1}^{n} y_i \alpha_i = 0, \quad 0 \leq \alpha_i \leq C, \quad i = 1, \ldots, n,$$
where $K(x_i, x_j)$ is a kernel function. There are many kernel functions for nonlinear mapping. We use the Gaussian radial basis function as the kernel function in our experiments:

$$K(x, y) = \exp\left(-\frac{\|x - y\|^2}{2\sigma^2}\right), \tag{4}$$

where the parameter $\sigma$ is the width of the Gaussian function. For a given kernel function, the SVM classifier is given by

$$f(x) = \operatorname{sign}\left(\sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b\right) \tag{5}$$

and the decision boundary is $\sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b = 0$.
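To make the kernel and the classifier concrete, the following Python sketch evaluates the Gaussian kernel and the kernel expansion inside the sign function of the classifier. It is a minimal sketch, assuming the multipliers, labels, support vectors and bias have already been obtained from some SVM solver; the function names `rbf_kernel`, `decision_value` and `classify` are illustrative and do not come from the paper.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def rbf_kernel(x: Vector, y: Vector, sigma: float) -> float:
    """Gaussian RBF kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def decision_value(x: Vector,
                   support_vectors: Sequence[Vector],
                   alphas: Sequence[float],
                   labels: Sequence[int],
                   b: float,
                   sigma: float) -> float:
    """Kernel expansion sum_i alpha_i * y_i * K(x_i, x) + b (before taking the sign)."""
    return sum(a * y * rbf_kernel(sv, x, sigma)
               for sv, a, y in zip(support_vectors, alphas, labels)) + b


def classify(x: Vector,
             support_vectors: Sequence[Vector],
             alphas: Sequence[float],
             labels: Sequence[int],
             b: float,
             sigma: float) -> int:
    """Class label in {-1, +1}; a value of exactly 0 is mapped to +1 here by assumption."""
    return 1 if decision_value(x, support_vectors, alphas, labels, b, sigma) >= 0 else -1
```

The unsigned value returned by `decision_value` is the quantity whose magnitude is later used as a confidence score, so it is kept separate from `classify`.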
In SVM-based CBIR relevance feedback, the SVM decision function is used to measure the relevance between a given pattern and the query image. In general, the larger the absolute value of the SVM function for an example, the higher the corresponding prediction confidence. In a traditional relevance feedback method, users judge the top-ranked image examples, i.e., those with the largest values of the SVM function $f(x)$. This strategy is called passive feedback. It tends to choose the most relevant examples, but these might not be the most informative examples for training the SVM. Active learning has been proposed to deal with this problem. Active learning, here in the form of pool-based active learning, is a subfield of machine learning and is one of the most promising methods currently available. It tends to choose the most uncertain examples, which are those close to the decision boundary of the SVM.
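The difference between the two strategies can be shown with a short Python sketch. It assumes the SVM function values for the unlabeled pool are already available as a list of floats; `passive_selection` and `active_selection` are hypothetical helper names.

```python
from typing import List, Sequence


def passive_selection(scores: Sequence[float], k: int) -> List[int]:
    """Passive feedback: indices of the k examples with the largest f(x)."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]


def active_selection(scores: Sequence[float], k: int) -> List[int]:
    """Active learning: indices of the k examples with the smallest |f(x)|,
    i.e. those closest to the decision boundary."""
    return sorted(range(len(scores)), key=lambda i: abs(scores[i]))[:k]


scores = [2.1, -0.1, 0.4, -1.7, 0.05]
print(passive_selection(scores, 2))  # [0, 2]
print(active_selection(scores, 2))   # [4, 1]
```

On the same scores the two strategies pick disjoint examples: passive feedback returns the most confidently relevant images, while active learning returns the ones the classifier is least sure about.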
3. BATCH MODE FOR SVM ACTIVE LEARNING
In a CBIR system, RF can be formulated as an active learning problem in which the most informative unlabeled examples are selected to improve the classification performance. Let $L = \{(x_1, y_1), \ldots, (x_l, y_l)\}$ denote the labeled image examples that are solicited through RF, and $U = \{x_{l+1}, \ldots, x_{l+u}\}$ the unlabeled image examples, where $x_i \in \mathbb{R}^d$ represents an image by a $d$-dimensional vector. Let $S$ be a set of $k$ unlabeled image examples to be selected in RF, and let $risk(f, S, L, U)$ be a risk function that depends on the classifier $f$. In [20], selecting the most informative unlabeled examples for RF is defined as finding the assignment vector $S^*$ that minimizes the risk function:

$$S^* = \arg\min_{S \subseteq U \,\wedge\, |S| = k} risk(f, S, L, U) \tag{6}$$

The SVM-based active learning method selects the unlabeled example that is closest to the decision boundary. This can be expressed by the following optimization problem:

$$x^* = \arg\min_{x \in U} |f(x)| \tag{7}$$
For a query, after the boundary is learned from the user's feedback, the images in the database are filtered by the decision boundary. However, in early iterations, the SVM decision boundary might not be accurate because of the lack of training examples, and retrieval performance suffers as a result. In this case, a similarity measure on low-level features may be more reliable and can be used to mitigate the problem. Therefore, we propose a method that combines the SVM score and the similarity measure into a single ranking function.
Let $DS(x_i)$ denote the distance of image $i$ from the decision boundary given by SVM active learning:

$$DS(x_i) = |f(x_i)| = |w \cdot x_i + b| \tag{8}$$

where $w$ and $b$ denote the normal vector and the bias of the separating hyperplane, respectively, and $x_i$ is the feature vector representing image $i$. Let $DE(x_i)$ denote the Euclidean distance between image $i$ and the "ideal query" image $c$:

$$DE(x_i) = \begin{cases} \|x_i - x_c\| & \text{if } f(x_i) \geq 0 \\ 1 & \text{otherwise} \end{cases} \tag{9}$$
where $x_c = \arg\max_{x_j \in U} DS(x_j)$. The ranking function of our method for the $i$-th image is defined as

$$DSE(x_i) = \frac{N_{rel}}{N_{rel} + N_{nonrel}} \, DS(x_i) + \left(1 - \frac{N_{rel}}{N_{rel} + N_{nonrel}}\right) DE(x_i) \tag{10}$$

where $N_{rel}$ is the total number of relevant images and $N_{nonrel}$ is the total number of non-relevant images in each loop. The coefficient $N_{rel} / (N_{rel} + N_{nonrel})$ thus shifts weight toward the SVM score as the share of relevant labeled images grows. We choose the unlabeled examples with the smallest values of the ranking function $DSE$ for the user to label:

$$x^* = \arg\min_{x \in U} DSE(x) \tag{11}$$
The overall algorithm of the batch mode for SVM active learning is briefly described in Algorithm 1.

Algorithm 1: Batch Mode for SVM Active Learning
Input:
  $L$, $U$  /* labeled and unlabeled data */
  $k$, $K$  /* batch size and an input kernel, e.g. an RBF kernel */
Output:
  $S$  /* a batch of unlabeled examples selected for labeling */
Procedure:
 1: Train an SVM classifier: $f = SVMTrain(L, K)$;  /* call a standard SVM solver */
 2: Compute $DS = (|f(x_{l+1})|, \ldots, |f(x_n)|)^T$;
 3: Compute $DE = (DE(x_{l+1}), \ldots, DE(x_n))$ by Eq. 9;
 4: $S = \emptyset$;
 5: while $|S| < k$ do
 6:   for each $x_j \in U$ do
 7:     $DSE(x_j) = \frac{N_{rel}}{N_{rel} + N_{nonrel}} DS(x_j) + \left(1 - \frac{N_{rel}}{N_{rel} + N_{nonrel}}\right) DE(x_j)$
 8:   end for
 9:   $x_j^* = \arg\min_{x_j \in U} DSE(x_j)$;
10:   $S \leftarrow S \cup \{x_j^*\}$;
11:   $U \leftarrow U \setminus \{x_j^*\}$;
12: end while
13: return $S$.
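The selection procedure of Algorithm 1 can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation: the trained decision function is assumed to be supplied as a callable `f` (for example, the unsigned output of any SVM solver), the condition in the definition of DE is taken as f(x) >= 0, the "otherwise" value is taken literally as 1, and the name `select_batch` is hypothetical.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def select_batch(unlabeled: List[Vector],
                 f: Callable[[Vector], float],
                 k: int,
                 n_rel: int,
                 n_nonrel: int) -> List[int]:
    """Return the indices (into `unlabeled`) of the k examples with the
    smallest combined score DSE."""
    if n_rel + n_nonrel == 0:
        raise ValueError("at least one labeled example is needed")
    # Step 2: distance of every unlabeled image to the decision boundary, DS = |f(x)|.
    ds = [abs(f(x)) for x in unlabeled]
    # "Ideal query": the unlabeled image with the largest DS.
    x_c = unlabeled[max(range(len(unlabeled)), key=lambda j: ds[j])]
    # Step 3: distance to the ideal query, or 1 for images on the negative side.
    de = [euclidean(x, x_c) if f(x) >= 0 else 1.0 for x in unlabeled]
    # Priority coefficient of the ranking function.
    w = n_rel / (n_rel + n_nonrel)

    remaining = list(range(len(unlabeled)))
    batch: List[int] = []
    while len(batch) < k and remaining:
        # Steps 6-8: combined score for every image still in the pool.
        dse = {j: w * ds[j] + (1.0 - w) * de[j] for j in remaining}
        # Step 9: pick the minimum (ties broken by index for determinism).
        best = min(remaining, key=lambda j: (dse[j], j))
        # Steps 10-11: move it from the pool to the batch.
        batch.append(best)
        remaining.remove(best)
    return batch
```

Because the scores do not depend on the examples already placed in the batch, the loop is equivalent to taking the k smallest DSE values; it is kept in loop form only to mirror the steps of the listing.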
4. RESULTS AND ANALYSIS
To evaluate the performance of the proposed algorithm, we conduct an extensive set of CBIR experiments comparing the proposed algorithm with several SVM feedback methods that have been used in image retrieval. The image database is a selected subset of the Corel Gallery, which contains 10800 images from about 80 different categories, such as autumn, aviation, bonsai, castle, cloud, dog, elephant, iceberg, primates, ship, and tiger. Each category consists of about 100 images, and all the images are category-homogeneous. For feature representation, we extract three types of features (color, texture and shape), as used in [20].
For color, we use color moments. First, we convert the color space from RGB to HSV. Then, we extract three moments (mean, variance and skewness) from each color channel. Thus, a 9-dimensional color moment vector is used.
For texture, a pyramidal wavelet transform (PWT) is performed on the gray images. Each wavelet decomposition of a gray 2D image results in four scaled-down subimages. In total, a 3-level decomposition is conducted, and features are extracted from 9 of the subimages by computing entropy. Thus, a 9-dimensional wavelet vector is used.
For shape, the edge direction histogram (EDH) is used. The edge information contained in the images is generated and processed using the Canny edge detection algorithm. The edge direction histogram is quantized into 18 bins of 20 degrees each; thus, a total of 18 edge features are extracted.
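The color-moment step described above can be sketched with the Python standard library alone. The sketch makes several assumptions that the text does not specify: pixels are given as a flat list of 8-bit RGB tuples, variance is the population variance, and skewness is taken as the signed cube root of the third central moment (a common convention for color moments). The function name `color_moments` is hypothetical.

```python
import colorsys
import math
from typing import List, Sequence, Tuple


def color_moments(rgb_pixels: Sequence[Tuple[int, int, int]]) -> List[float]:
    """Return 9 values: (mean, variance, skewness) for each of the H, S, V channels."""
    # Convert every pixel from 8-bit RGB to HSV with components in [0, 1].
    hsv = [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
           for r, g, b in rgb_pixels]
    features: List[float] = []
    for channel in zip(*hsv):  # iterates over the H, S and V channels in turn
        n = len(channel)
        mean = sum(channel) / n
        variance = sum((v - mean) ** 2 for v in channel) / n
        third = sum((v - mean) ** 3 for v in channel) / n
        # Signed cube root of the third central moment (assumed definition).
        skewness = math.copysign(abs(third) ** (1.0 / 3.0), third)
        features.extend([mean, variance, skewness])
    return features
```

For an image of a single uniform color, the variance and skewness entries are all zero and only the three means carry information.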
All of these features are combined into a single feature vector with 36 values (9 color, 9 texture and 18 shape features). We then normalize each feature to a normal distribution to eliminate the effect of different scales. The distance between pairs of images is computed as the Euclidean distance.
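A small Python sketch of this normalization and distance computation is given below. It assumes that "normalizing to a normal distribution" means standardizing each feature dimension to zero mean and unit variance over the database (z-scores); that interpretation, and the helper name `zscore_columns`, are assumptions rather than details stated in the paper.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence


def zscore_columns(features: Sequence[Sequence[float]]) -> List[List[float]]:
    """Standardize every feature dimension (column) to zero mean and unit variance."""
    columns = list(zip(*features))
    means = [mean(col) for col in columns]
    # A constant column has zero deviation; divide by 1.0 instead to avoid 0/0.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in features]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


raw = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]]
normalized = zscore_columns(raw)
print(euclidean(normalized[0], normalized[1]))
```

Without this step, the second feature in the toy data would dominate the distance simply because its numeric range is about a hundred times larger.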
4.1. Comparative performance evaluation
We performed a series of experiments to show the effectiveness of the proposed method and to compare its performance with three state-of-the-art SVM feedback methods: SVM Active Learning [17], SVM Batch Mode Active Learning [20] and Dynamic Batch Sampling Mode [21]. To reflect the actual situation of online users, randomly selected images from the database are used as queries (20 per category, giving 1600 query sessions in total). In the first step of each query session, the images in the database are ranked according to their Euclidean distances to the query. The user's relevance judgments are simulated automatically in each loop, and the top 15 images are labeled as relevant or irrelevant. Images in the same class as the query are considered relevant and the rest are considered irrelevant. All images labeled in the feedback loops are used for training the system.
The retrieval results of the proposed algorithm without relevance feedback are shown in Fig. 2(a). The image in the top left-hand corner is the query image; the images framed in red are relevant to the query image, and the rest are not relevant. It is easy to see that the number of images relevant to the query is very limited: many images are very close to the query image in distance but very different in semantics, and vice versa. However, after four feedback loops, the number of relevant images returned by the proposed method has improved significantly, as shown in Fig. 2(b).
Figure 2. The retrieval results using the proposed algorithm: (a) the result without relevance feedback, (b) the result after four feedback iterations
We use Average Precision (AP), as defined by NIST TREC Video (TRECVID), as the evaluation measure. The AP value at each iteration is defined as the average of the precision values obtained after each relevant picture is retrieved. The precision value is the ratio between the number of relevant pictures retrieved and the number of pictures retrieved so far. The result for a single query is not reliable, so to evaluate the performance of CBIR we compute the retrieval results for many query images and average them. Moreover, by varying $N$, the number of returned images, we can plot Mean Average Precision as a function of $N$, with the number of result images fixed to 20, 40, 60, 80 and 100. This experiment evaluates the performance of all four methods for each level of user requirement.
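The evaluation measure described above can be written as a short Python sketch. It follows the wording of this section, averaging the precision values at the ranks where a relevant image appears within the top N results; the official TRECVID definition may normalize differently, so this should be read as an approximation. The function names are illustrative.

```python
from typing import Optional, Sequence


def average_precision(ranked_relevance: Sequence[bool], n: Optional[int] = None) -> float:
    """AP over the top-n results; ranked_relevance[i] is True if the image
    at rank i + 1 is relevant to the query."""
    items = ranked_relevance if n is None else ranked_relevance[:n]
    hits = 0
    precisions = []
    for rank, is_relevant in enumerate(items, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank: relevant retrieved / retrieved so far.
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(sessions: Sequence[Sequence[bool]],
                           n: Optional[int] = None) -> float:
    """Mean of the AP values over all query sessions."""
    return sum(average_precision(s, n) for s in sessions) / len(sessions)


print(average_precision([True, False, True, False]))  # (1/1 + 2/3) / 2
```

Calling `mean_average_precision` with n set to 20, 40, 60, 80 and 100 in turn yields the points of a curve of the kind plotted against the number of returned images.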
Several observations can be drawn from the results in Fig. 3 and Fig. 4. First, the retrieval performance of all the methods improves after a number of rounds. This result indicates the importance of the RF technique in CBIR systems. Second, our proposed method tends to be more effective than the others in early iterations. This is expected, because SVM performance is low when the number of training examples is small, and ranking images mainly by the similarity measure on low-level features works better. However, as the number of feedback iterations increases, the number of training examples becomes large enough to learn a good SVM, so the similarity measure is no longer necessary. These results again show the effectiveness of the proposed method for selecting a batch of informative unlabeled examples for relevance feedback in CBIR.
Figure 3. Relationship between average AP and number of returned images: (a) the first feedback iteration, (b) the second feedback iteration, (c) the third feedback iteration, and (d) the fourth feedback iteration.
5. CONCLUSION
In this paper, we have proposed a novel batch mode SVM active learning scheme for relevance feedback in CBIR. We choose a batch of feedback examples for the user to label using a combined ranking function instead of the SVM decision function used in traditional methods. Concretely, we combine the SVM score and a similarity measure into a single ranking function. With the help of the combined ranking function, not only can the adverse effect of an inaccurate decision boundary caused by a lack of initially labelled samples be effectively reduced, but the retrieval performance can also be further enhanced when there is a sufficient number of initially labelled samples. The experimental results on a subset of COREL demonstrate the improvement of the proposed scheme over the traditional schemes, especially when the number of initially labelled samples is small. As future work, we plan to extend the experiments to other datasets.
ACKNOWLEDGEMENT
This paper was supported in part by the Vietnam National Foundation for Science and Technology Development under NAFOSTED Grant 102.02.16.09 and the Institute of Information Technology, VAST under Grant CS'14.3.
REFERENCES
[1] M. S. Lew, N. Sebe, C. Djeraba, and R. Jain, "Content-based multimedia information retrieval: State of the art and challenges," ACM Trans. Multimedia Comput. Commun. Appl., vol. 2, no. 1, pp. 1–19, Feb. 2006.
[2] Y. Liu, D. Zhang, G. Lu, and W.-Y. Ma, "A survey of content-based image retrieval with high-level semantics," Pattern Recognition, vol. 40, no. 1, pp. 262–282.
[3] R. Datta, D. Joshi, J. Li, and J. Z. Wang, "Image retrieval: Ideas, influences, and trends of the new age," ACM Computing Surveys, vol. 40, no. 2, pp. 1–60, May 2008.
[4] H. Bay, A. Ess, T. Tuytelaars, and L. V. Gool, "Speeded-up robust features (SURF)," Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346–359, 2008, Similarity Matching in Computer Vision and Multimedia.
[5] J. Wu and J. M. Rehg, "Centrist: A visual descriptor for scene categorization," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 8, pp. 1489–1501, Aug. 2011.
Figure 4. Relationship between average AP and number of iterations: (a) the top 20 returned images, (b) the top 40 returned images, (c) the top 60 returned images, and (d) the top 80 returned images
[6] L. Wu and S. C. H. Hoi, "Enhancing bag-of-words models with semantics-preserving metric learning," IEEE MultiMedia, vol. 18, no. 1, pp. 24–37, Jan. 2011.
[7] N. T. Giang, N. Q. Tao, N. D. Dung, and N. T. The, "Skeleton based shape matching using reweighted random walks," in The proceeding of the IEEE on 9th International Conference on Information, Communications and Signal Processing (ICICS), December 2013, pp. 1–5.
[8] Y. K. J. K. Zukuan WEI, Hongyeon KIM, "An efficient content based image retrieval scheme," TELKOMNIKA Indonesian Journal of Electrical Engineering, vol. 11, no. 11, pp. 6986–6991, November 2013.
[9] O. M. A. B. Chawki Youness, El Asnaoui Khalid, "New method of content based image retrieval based on 2-D ESPRIT method and the Gabor filters," TELKOMNIKA Indonesian Journal of Electrical Engineering, vol. 15, no. 2, pp. 313–320, August 2015.
[10] M. O. Y. Rui, T. S. Huang and S. Mehrotra, "Relevance feedback: A powerful tool for interactive content-based image retrieval," IEEE Transactions on Circuits and Systems for Video Technology, vol. 8, pp. 644–655, 1998.
[11] B. Thomee and M. Lew, "Interactive search in image retrieval: a survey," International Journal of Multimedia Information Retrieval, vol. 1, no. 2, pp. 71–86, 2012.
[12] M. M. Rahman, P. Bhattacharya, and B. C. Desai, "A framework for medical image retrieval using machine learning and statistical similarity matching techniques with relevance feedback," IEEE Transactions on Information Technology in Biomedicine, vol. 11, no. 1, pp. 58–69, Jan. 2007.
[13] R. Min and H. Cheng, "Effective image retrieval using dominant color descriptor and fuzzy support vector machine," Pattern Recognition, vol. 42, no. 1, pp. 147–157, 2009.
[14] R.-S. Wu and W.-H. Chung, "Ensemble one-class support vector machines for content-based image retrieval," Expert Systems with Applications, vol. 36, no. 3, Part 1, pp. 4451–4459, 2009.
[15] X.-Y. Wang, J.-W. Chen, and H.-Y. Yang, "A new integrated SVM classifiers for relevance feedback content-based image retrieval using EM parameter estimation," Applied Soft Computing, vol. 11, no. 2, pp. 2787–2804, 2011.
[16] G. Li, "Improving relevance feedback in image retrieval by incorporating unlabelled images," TELKOMNIKA Indonesian Journal of Electrical Engineering, vol. 11, no. 7, pp. 3634–3640, 2013.
[17] S. Tong and E. Chang, "Support vector machine active learning for image retrieval," in Proceedings of the 10th ACM International Conference on Multimedia, 2001, pp. 107–118.
[18] S. C. H. Hoi and M. R. Lyu, "A semi-supervised active learning framework for image retrieval," in Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005, pp. 302–309.
[19] R. Liu, Y. Wang, T. Baba, D. Masumoto, and S. Nagata, "SVM-based active feedback in image retrieval using clustering and unlabeled data," Pattern Recognition, vol. 41, no. 8, pp. 2645–2655, 2008.
[20] S. C. H. Hoi, R. Jin, J. Zhu, and M. R. Lyu, "Semisupervised SVM batch mode active learning with applications to image retrieval," ACM Transactions on Information Systems, vol. 27, no. 3, pp. 16:1–16:29, May 2009.
[21] X. Zhang, J. Cheng, C. Xu, H. Lu, and S. Ma, "A dynamic batch sampling mode for SVM active learning in image retrieval," in Recent Advances in Computer Science and Information Engineering, ser. Lecture Notes in Electrical Engineering, 2012, vol. 128, pp. 399–406.
[22] V. N. Vapnik, The Nature of Statistical Learning Theory. New York, NY, USA: Springer-Verlag New York, Inc., 1995.