International Journal of Electrical and Computer Engineering (IJECE)
Vol. 8, No. 2, April 2018, pp. 1230-1237
ISSN: 2088-8708, DOI: 10.11591/ijece.v8i2.pp1230-1237
Tool Use Learning for a Real Robot

Handy Wicaksono 1,2 and Claude Sammut 1

1 School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
2 Electrical Engineering Department, Petra Christian University, Surabaya, Indonesia
Article Info

Article history:
Received July 24, 2017
Revised December 16, 2017
Accepted December 27, 2017

Keywords:
tool use by a robot
tool use learning
action learning
inductive logic programming
robot software architecture
ABSTRACT

A robot may need to use a tool to solve a complex problem. Currently, tool use must be pre-programmed by a human. However, this is a difficult task and can be helped if the robot is able to learn how to use a tool by itself. Most of the work in tool use learning by a robot is done using a feature-based representation. Despite many successful results, this representation is limited in the types of tools and tasks that can be handled. Furthermore, the complex relationship between a tool and other world objects cannot be captured easily. Relational learning methods have been proposed to overcome these weaknesses [1, 2]. However, they have only been evaluated in a sensor-less simulation to avoid the complexities and uncertainties of the real world. We present a real world implementation of a relational tool use learning system for a robot. In our experiment, a robot requires around ten examples to learn to use a hook-like tool to pull a cube from a narrow tube.
Copyright © 2018 Institute of Advanced Engineering and Science. All rights reserved.
Corresponding Author:
Handy Wicaksono
School of Computer Science and Engineering, University of New South Wales
handyw@cse.unsw.edu.au
1. INTRODUCTION

Humans use tools to help them complete everyday tasks. The ability to use tools is a feature of human intelligence [3]. Like humans, a robot also needs to be able to use a tool to solve a complex task. As an example, in the RoboCup@Home competition, a robot is asked to demonstrate several tool use abilities, such as opening a bottle by using a bottle opener or watering a plant with a watering can [4]. The robot is given complete knowledge of the tools and how to use them. When such knowledge is not available, a robot must learn it.

Most work in tool use learning has used a feature-based representation that is dependent on an object's primitive features. Thus, it is not flexible enough to be applied to different tools and environments. Learning is also limited to tool selection only. Brown [1] proposed a relational approach which can overcome these limitations. Furthermore, Wicaksono and Sammut [5] have suggested that this representation has the potential to solve more complex problems, such as tool creation.

We define a tool as an object that is deliberately employed by an agent to help it achieve a goal which would be too difficult to achieve without the tool. We want to learn a tool action model that explains changes in the properties of one or more objects affected by the tool, given that certain preconditions are met. Following Brown [1], learning is performed by trial and error in an Inductive Logic Programming (ILP) setting [6].

Brown [1] carried out tool use learning in a sensor-less simulation. This means that the robot has perfect knowledge of the world and that uncertainties in the sensors' readings are eliminated. In this paper, we present a complete robotic system for tool use learning that follows a relational learning approach. This includes three components:

1. Developing a robot software architecture that consists of primitive and abstract layers and facilitates communication between them.
2. Creating a mechanism to detect objects and to generate primitive behaviors for tool use.
3. Extending relational learning methods and conducting tool use learning experiments using a real robot.

In the following sections, we describe relevant previous work, the knowledge representation formalism used, the software architecture, and the tool use learning mechanism. Finally, we perform several real world experiments and draw conclusions.
Journal Homepage: http://iaesjournal.com/online/index.php/IJECE
2. RELATED WORK

An intelligent robot is identified by its ability to make decisions and learn autonomously in an unstructured environment. Machine learning techniques can be useful as they compensate for the imperfect model of a robot in the real world. For example, control of a robot manipulator can be improved by equipping a PD torque controller with an Adaptive Neuro-Fuzzy Inference System [7] or by combining sliding mode control with a genetic algorithm [8]. In a behavior-based robotics approach, reinforcement learning can be used to learn the robot's behaviors [9]. Here we describe previous work in tool use learning by a robot.

Wood et al. [10] use a neural network to learn an appropriate posture of a Sony Aibo robot so it can reach an object by using a tool placed on its back. Stoytchev [11] learns to select a correct tool via its affordances, which are grounded in robot behaviors. However, the result cannot be generalized to new tools. More recent work by Tikhanoff et al. [12] combines exploratory behaviors and geometrical feature extraction to learn affordances and tool use. Mar et al. [13] extend this work by learning a grasp configuration which influences the outcome of a tool use action.

The limitations of feature-based representations were mentioned above. Only recently have such representations been extended so that a robot can learn to grasp a tool [13]. An ILP system can overcome these limitations, as it represents the tools and other objects in a relational representation [1, 2]. However, this work has been done in sensor-less simulation only, which means that the complexities of acquiring perceptions, generating precise movements, and dealing with world uncertainties are avoided. We aim to develop a complete robotic system that facilitates this learning in the real world.
3. REPRESENTING STATES AND ACTIONS

We maintain two representations of states and actions, in primitive and abstract form. Primitive states are the positions of all objects in the world, which are captured by vision sensors using the mechanism described in Section 4.2. As we only use simple objects, they can be detected by their two-dimensional appearance only.

Abstract states are represented as expressions in first-order logic, more specifically as Horn clauses. Primitive states are translated to an abstract state by calling the relevant Prolog programs with the primitive states as their bound values.
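As a minimal illustration (an expository sketch only; in our system this translation is performed by Prolog programs invoked from the translator described in Section 4.1, and the containment test and example values below are assumptions), an abstract predicate such as in_tube(Cube, Tube) can be derived from primitive bounding-box corners as follows:

    # Sketch: deriving the abstract state in_tube(Cube, Tube) from primitive states
    # given as (x_min, y_min, x_max, y_max) corner points. Illustrative only.
    def in_tube(cube, tube):
        cx1, cy1, cx2, cy2 = cube
        tx1, ty1, tx2, ty2 = tube
        # the cube's bounding box must lie inside the tube's bounding box
        return tx1 <= cx1 and ty1 <= cy1 and cx2 <= tx2 and cy2 <= ty2

    print(in_tube(cube=(12, 6, 15, 9), tube=(0, 0, 15, 20)))  # prints True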
To be classified as a tool, an object must possess particular structural properties (e.g. a particular side where a hook is attached to the handle) and spatial properties (e.g. a hook is touching the back side of a cube). We collect these properties in a hypothesis, namely tool_pose, which is shown below in simplified form.
    tool_pose(Handle, Hook, Box, Tube) :-
        attached_end(Handle, Hook, back),    % a structural property
        attached_side(Handle, Hook, Side),   % a structural property
        touching(Hook, Box, back).           % a spatial property
We use an extended STRIPS action model [14] to describe an abstract action and how it affects the world. A primitive goal that has to be achieved by the robot is added at the end of the model.

NAME: the action name
PRE: states that must be true so that an action can be performed
ADD: conditions that become true as a result of the action
DELETE: conditions that are no longer true following the action
CONSTRAINTS: the physical limits of actuators that constrain the action
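For concreteness, the find_tool action used later in Fig. 1b could be written in this form roughly as follows. This is a simplified sketch: the PRE and ADD entries come from Fig. 1b, while the DELETE, CONSTRAINTS and primitive goal shown here are illustrative assumptions rather than the exact model used in our experiments.

    NAME: find_tool
    PRE: in_tube(Cube, Tube)
    ADD: tool_found(Tool)
    DELETE: (none)
    CONSTRAINTS: the tool must lie within the reachable workspace of the arm
    GOAL: a target position above the tool area for the wrist camera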
This action model is also a planning operator. Thus, action learning does not happen in isolation, as it is part of a problem solving scenario. Every abstract action is linked to a set of primitive behaviors. This will be discussed later in Section 4.3.
4. ROBOTIC SYSTEM

In this section, we explain our software architecture, object detection method, and behavior generation mechanism.
4.1. Software Architecture

We use a relational representation in the high-level layer to take advantage of the expressiveness of first-order logic (FOL) clauses, especially Horn clauses, and to equip a robot with useful symbolic techniques. We implement this layer in SWI Prolog. Most modern Prolog implementations, including SWI, incorporate a constraint solver which provides advanced numerical reasoning capabilities.
In the low-level layer, a robot operates in a world of continuous, noisy and uncertain measurements. Its sensors return readings in numeric ranges, and its actuators operate based on numerical set points. As we aim to use the Robot Operating System (ROS, http://www.ros.org) as our middleware, we implement this layer in Python. The Baxter robot, from Rethink Robotics, has its own ROS-Python API, a set of classes that provides wrappers around the ROS communication.
Another layer, namely the translator, is needed to map primitive to abstract states and to link an abstract action to the corresponding primitive behaviors. It is written in C++, which has an interface to SWI Prolog as the language for the abstract layer. Communication with the primitive layer is done via ROS using a simple publish and subscribe mechanism. Our robot software architecture is shown in Fig. 1a.
We give a simple illustration of the execution of an action, namely find_tool, which involves communication between layers. In the primitive layer, a cube and a tube are detected, and their corners are acquired and published together by a ROS node, namely /translator, as a ROS topic, /pri_states. The translator also has a ROS node, /translator, that subscribes to that topic. It then evaluates the status of the abstract state, in_tube(Cube, Tube), by calling the relevant Prolog predicate. In the abstract layer, in_tube(Cube, Tube) is called. If it is true, as it is the only precondition of the find_tool action, then the translator will publish it as the ROS topic /abs_action, which represents the currently active action. The primitive layer subscribes to this topic. If it recognizes find_tool as an active action, it triggers the corresponding primitive behaviors. See Fig. 1b for details.
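The following is a minimal sketch of this publish/subscribe link on the primitive side, written with rospy. It is illustrative only: the message type, publishing rate and callback body are assumptions, and the real translator node is written in C++ rather than Python.

    # Sketch of the primitive layer's ROS interface: publish primitive states,
    # subscribe to the currently active abstract action.
    import rospy
    from std_msgs.msg import String

    def on_abstract_action(msg):
        # React when the abstract layer activates an action.
        if msg.data == "find_tool":
            pass  # trigger the corresponding primitive behaviors (e.g. visual servoing)

    rospy.init_node("primitive_layer")
    pri_states_pub = rospy.Publisher("/pri_states", String, queue_size=10)
    rospy.Subscriber("/abs_action", String, on_abstract_action)

    rate = rospy.Rate(10)  # publish detected corner points at 10 Hz
    while not rospy.is_shutdown():
        pri_states_pub.publish("cube(12,6,15,9) tube(0,0,15,20)")  # example primitive states
        rate.sleep()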
Figure 1. Robot software architecture and its example: (a) Software architecture; (b) Simplified example.
4.2. Object Detection

We detect all objects (i.e. different tools, a cube, and a tube) by their two-dimensional appearances only. We use the Baxter camera to detect objects locally, while an external web camera provides the global world state. We combine local and global images from both cameras to achieve our goal.

The OpenCV library is used for object detection. In pre-processing, Gaussian smoothing is applied. Later, the edges of objects in an image are detected with a Canny edge detector, and their contours are found. Each contour is tested to check whether it has the properties of a particular object.
The contour properties include the number of edges, the area, the angle formed by two adjacent parts, the convexity, and the perimeter. Tools are treated specially as they have more than one shape. After being detected as a tool, the object's type is determined, and its handle and hook(s) are acquired. All objects are represented by their minimum and maximum corner points in Cartesian coordinates (X_min, Y_min, X_max, Y_max). The complete mechanism is shown in Algorithm 1.
Algorithm 1 Object detection
1: Capture an image
2: Perform Gaussian smoothing
3: Detect the edges of objects using the Canny detector
4: Find the contours of the objects
5: for each contour do
6:     Classify the contour
7:     if the contour is either a cube, a tube, or a tool then
8:         Find its minimum & maximum corner points
9: return all minimum & maximum corner points
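A compact sketch of Algorithm 1 using the OpenCV Python bindings is shown below. The thresholds and the classification test are placeholders; the real system also checks the number of edges, corner angles, convexity and perimeter, and treats tools as multi-part shapes.

    # Sketch of Algorithm 1 with OpenCV (cv2, 4.x findContours signature).
    import cv2

    def detect_objects(image_bgr):
        gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        blurred = cv2.GaussianBlur(gray, (5, 5), 0)              # step 2: Gaussian smoothing
        edges = cv2.Canny(blurred, 50, 150)                       # step 3: Canny edge detection
        contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)   # step 4: find contours
        corner_points = []
        for contour in contours:                                  # steps 5-6
            area = cv2.contourArea(contour)
            poly = cv2.approxPolyDP(contour, 0.02 * cv2.arcLength(contour, True), True)
            if area > 500 and len(poly) >= 4:                      # step 7: crude classification
                x, y, w, h = cv2.boundingRect(contour)
                corner_points.append((x, y, x + w, y + h))         # step 8: min & max corners
        return corner_points                                       # step 9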
4.3. Behavior Generation

In our tool use application, a sequence of actions, or a plan, must be performed to achieve the goal: pulling a cube from a narrow tube. Each action uses different strategies (inverse kinematics solving, visual servoing, gripper activation, constraint solving) depending on whether the primitive goal is known or not, and whether visual information is needed or not. We show an example of the plan in Table 1.
Table 1. The action sequence for pulling a cube from a narrow tube

No | Action             | Technique                                        | Goal    | Visual
1  | Find the tool      | Inverse Kinematics (IK) solving, visual servoing | Known   | Yes
2  | Grip the tool      | Close the gripper                                | Known   | No
3  | Position the tool  | Constraint solving, IK solving                   | Unknown | Yes
4  | Pull the cube      | IK solving                                       | Known   | No
5  | Move the tool away | IK solving                                       | Known   | No
6  | Ungrip the tool    | Open the gripper                                 | Known   | No
An abstract action is directly linked to corresponding primitive actions. For simplicity, we only consider movement in two-dimensional Cartesian coordinates. A primitive goal is a target location in that coordinate system. When a goal is known, we use an inverse kinematics (IK) solver to compute the angles of all joints and move the arm towards that goal. However, if it is unknown, we use a constraint solver to create a numerical goal from the abstract states which are the preconditions of an abstract action model (see Fig. 2a).
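The control flow just described can be sketched as follows. The helpers solve_ik and move_to_joint_angles are hypothetical stand-ins for the Baxter ROS-Python API calls, and constraint_solver.solve is a placeholder for the Prolog constraint solver; only the structure mirrors our implementation.

    # Sketch: execute a primitive goal, deriving it from the abstract preconditions
    # when it is not known in advance (cf. Fig. 2a).
    def execute_primitive_goal(goal_xy, abstract_preconditions, constraint_solver, limb):
        if goal_xy is None:
            # Unknown goal: obtain a numerical target from the spatial literals in the
            # preconditions of the abstract action model.
            goal_xy = constraint_solver.solve(abstract_preconditions)
        joint_angles = solve_ik(limb, x=goal_xy[0], y=goal_xy[1])  # inverse kinematics
        if joint_angles is None:
            return False          # the IK solver found no valid arm configuration
        move_to_joint_angles(limb, joint_angles)                   # move the arm towards the goal
        return True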
To perform an accurate tool picking action, we use a simple version of image-based visual servoing [15], where an error signal is determined directly in terms of image feature parameters. We position the robot arm so that the chosen tool is located at the center of the image captured by Baxter's wrist camera, so the arm can move straight down to pick up the tool. See Fig. 2b for details.
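A minimal sketch of this visual servoing loop is given below. The helpers capture_wrist_image, detect_tool_center (built on Algorithm 1) and move_arm_relative are hypothetical, and the gain and tolerance values are illustrative.

    # Sketch of the image-based visual servoing loop of Fig. 2b.
    def center_tool_then_pick(tolerance_px=10, gain=0.001):
        while True:
            image = capture_wrist_image()
            u, v = detect_tool_center(image)            # tool centroid in pixel coordinates
            err_u = u - image.shape[1] / 2.0            # horizontal offset from image center
            err_v = v - image.shape[0] / 2.0            # vertical offset from image center
            if abs(err_u) < tolerance_px and abs(err_v) < tolerance_px:
                break                                    # tool is centered under the wrist camera
            # The error is defined directly in image feature space [15] and mapped to a
            # small horizontal correction of the arm.
            move_arm_relative(dx=-gain * err_u, dy=-gain * err_v, dz=0.0)
        move_arm_relative(dx=0.0, dy=0.0, dz=-0.05)      # move straight down to pick the tool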
Figure 2. Behavior generation: (a) Constraint solving to get a primitive goal; (b) A visual servoing process.
5. TOOL USE LEARNING

In this section, we describe relational tool use learning [1, 2]. The robot must learn a tool action model that describes the properties of a tool and of other objects in the world that enable the robot to use the tool successfully. Specifically, the robot learns a tool_pose predicate, which describes the structural and spatial conditions that must be met for the tool to be used. We maintain a version space of hypotheses, following Mitchell [16]. A version space is bounded by the most general hypothesis (h_g) and the most specific hypothesis (h_s). As we explain below, the version space is used to determine what experiments the robot should perform to generalize or specialize the tool action model.
Initially, a robot may not have a complete action model for a tool, so it cannot construct a plan. After a tutor gives a tool use example, the robot segments the observed behavior into discrete actions, matches them to actions already in its background knowledge, and constructs a new action if there is no match for a particular behavior. This construction may not be sufficient to describe the tool action in a general enough form that it can be applied in different situations. Therefore, trial and error learning is performed to refine the initial action model. Trial and error involves performing experiments in which a different tool is selected or its pose is changed to test the current hypothesis for the tool action.
In tool selection, an object that has properties most similar to a previously useful tool is chosen. More specifically, we test whether an object satisfies all of the primary structural properties, stored in h_g, and most of the secondary ones, stored in h_s, given its primitive states. Having a correct tool is useless if it is located in an incorrect position before it tries to pull the target object. We select the pose by solving the constraints of the spatial literals in the tool_pose predicate. We check whether the spatial constraints of an object can be solved or not. The unsolved constraints are ignored, and the final result is used as the numerical goal for the robot.
After a tool use learning experiment is performed, its result, whether successful or not, is passed to the ILP learner so that the hypothesis can be refined. Our learning is derived from Golem [17], an ILP algorithm. Refinement of h_s is done by finding the constrained Least General Generalization (LGG) of a pair of positive examples, and h_g is refined by performing negative-based reduction. The learning algorithm is a modification of Haber [2] and is shown in Algorithm 2.
Algorithm 2 Tool use learning
1: input: new action model M, h_g = true, h_s = preconditions of M, N trials, K consecutive successes
2: while cons_success < K and index < N do
3:     e = generate_experiment(h_s, h_g)
4:     tool = select_tool(h_s, h_g)    // select a tool with the highest rank
5:     for each e_i in e do
6:         pose = select_pose(e_i)    // constraint solving on the preconditions of the relevant action model
7:         if pose = null then
8:             break
9:         success = execute_exp(tool, pose)    // perform a tool use experiment
10:        if success then
11:            label pose positive
12:            increment cons_success
13:            h_s = generalise h_s    // via Least General Generalisation
14:        else
15:            label pose negative
16:            cons_success = 0
17:            h_g = specialise h_g    // via negative-based reduction
18:        add e_i to training data
19:        increment index
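The same procedure can be rendered as the Python sketch below. All helpers (generate_experiment, select_tool, select_pose, execute_exp, lgg, negative_reduction) are hypothetical placeholders for the ILP machinery derived from Golem [17]; only the control flow follows Algorithm 2.

    # Sketch of the trial-and-error learning loop of Algorithm 2.
    def learn_tool_action(model, n_trials, k_required):
        h_g, h_s = True, model.preconditions            # version space bounds
        training_data, cons_success, index = [], 0, 0
        while cons_success < k_required and index < n_trials:
            experiments = generate_experiment(h_s, h_g)
            tool = select_tool(h_s, h_g)                # choose the highest-ranked candidate tool
            for e_i in experiments:
                pose = select_pose(e_i)                 # constraint solving on the preconditions
                if pose is None:
                    break
                if execute_exp(tool, pose):             # one tool use experiment on the robot
                    cons_success += 1
                    h_s = lgg(h_s, pose)                # generalise via Least General Generalisation
                else:
                    cons_success = 0
                    h_g = negative_reduction(h_g, pose) # specialise via negative-based reduction
                training_data.append(e_i)
                index += 1
        return h_s, h_g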
6. RESULTS AND DISCUSSIONS

The Baxter research robot is used in this experiment (see Fig. 3a). The objects in the scene include a cube, a tube, and five objects of different shapes that are potential tools. We also have another set of tool candidates whose hooks are narrower than their handles (see Fig. 3b). We use Baxter's wrist camera for visual servoing. We add an external web camera facing the tube to provide the global state.

We divide the experiments into two stages. Initially, we evaluate the performance of our object detection algorithm and action execution. Later, we conduct tool use learning experiments.
Figure 3. Experimental scene: (a) Scene details (web camera, Baxter hand's camera, 5 tools, cube, tube); (b) The full set of tools.
6.1. Object Detection and Execution Experiments

The contours of all objects (cube, tube, and tools) are detected using Baxter's wrist camera, and their minimum and maximum points are acquired. These are marked with colored dots at the object corners in Fig. 4. Even when the tool is held by the gripper, our algorithm is still able to detect the object. However, the vision system may fail in a low-light environment. It also assumes that all objects are aligned perpendicular to each other and to the camera.
Figure 4. Object detection in images captured by the Baxter hand camera and the web camera.
We also evaluate the capability of the Baxter to move accurately enough to accomplish a tool use task based on the action sequence in Table 1. In Fig. 5 we illustrate these actions: the find_tool action, which uses visual servoing; the grip_tool action, which activates the robot's gripper; and the position_tool action, which tries to satisfy a primitive goal given by the constraint solver. From these trials, it was determined that the Baxter can move smoothly to complete the task. We observe that most errors are caused by the perception system, not by the action mechanism.
Figure 5. Samples of the robot's actions: find_tool, grip_tool, and position_tool.
6.2. Learning Experiments

In this experiment, we use five different candidate tools with narrow and wide hooks. The cube, as the target object, is located at a particular position inside the tube, which is changed after the Baxter withdraws it from the tube successfully. We adopt an initial action model which was created by Brown [1]. Learning by trial and error is then performed to refine the tool_pose hypothesis. We stop learning when the robot accomplishes the task three times consecutively.

The complete learning sequence is shown in Fig. 6. We need at least ten examples to learn to perform this task. However, more experiments may be needed if any errors occur (e.g. a change in lighting or an invalid move produced by the IK solver). The first positive example is given by a teacher. In the second example, the robot attempts to imitate the tool and location of the first one, but it fails as the cube location has changed. In the third example, the robot finds that the attachment side of the hook should be the same as the location of the cube inside the tube. Later on, the robot still makes mistakes, as it tends to choose a tool with a narrow hook. This narrower property is eliminated from h_s in the generalization process in the eighth example. Finally, learning is completed in the tenth example, as the robot performs three consecutive successful experiments. The final hypothesis is shorter (h_s is reduced from 14 literals to 11 literals) and more general than the initial hypothesis.

Figure 6. Tool use learning experiments in the real world (examples 1, 3, 6, 8, 9 and 10 are positive; examples 2, 4, 5 and 7 are negative).
6.3. Discussions

Our result is similar to that of the previous relational learning approach [1], which learns tool use in 12 experiments, as we use essentially the same mechanism. However, we perform our experiment on a real robot, while the former did it only in a sensor-less simulation. This includes bridging the gap between the low-level layer (in the ROS environment) and the high-level layer (in SWI Prolog). We also build an object detection system that has to deal with a noisy environment. Previous work did not need any detection, as it acquired perfect data from a simulator.

Compared to the line of work using feature-based representations, such as the work of Tikhanoff et al. [12], our approach can learn faster (theirs needs at least 120 trials in various learning stages), as we can easily incorporate human expertise in the background knowledge. Our experiment scenario is also more complicated, as we locate the target object inside a tube. We exploit the ability of the relational representation to represent a complex relationship between objects compactly. We also learn the tool pose, i.e. where the tool should be located to be able to pull the cube successfully, while it is predefined in their work.

There are limitations in our work, especially in the perception system, which can only handle 2D images and is not robust in changing environments. In this area, previous work [12, 13] is better than ours. Despite these limitations, the experiments demonstrate that the learning system is capable of relational tool use learning in the real world. In other work, we also use a physics simulator to assist a real robot in performing tool use learning [18].
7. CONCLUSIONS AND FUTURE WORK

We have developed a complete robot system for relational tool use learning in the real world. The primitive, translator and abstract layers have been built, along with their interfaces. We have also implemented a system to detect objects and to generate primitive behaviors using inverse kinematics, a constraint solver and a visual servoing mechanism. Finally, we have extended a tool use learning system, which has been tested in experiments in the real world. For a relatively simple tool use scenario, the robot needs at least ten examples to learn the tool's properties.

In the future, we want our robot to perform tool creation when the available tools cannot be used to solve the current problem. Those tools can then be tested in a simulator, to save time and materials, and manufactured by using a 3D printer. The development of a more robust perception system and a more accurate movement system will improve our system's performance. The system will be tested on a wider variety of tool use scenarios, with different tools and different tasks.
ACKNOWLEDGEMENT

Handy Wicaksono is supported by the Directorate General of Resources for Science, Technology and Higher Education (DG-RSTHE), Ministry of Research, Technology, and Higher Education of the Republic of Indonesia.
REFERENCES

[1] S. Brown and C. Sammut, "A relational approach to tool-use learning in robots," in Inductive Logic Programming. Springer, 2012, pp. 1-15.
[2] A. Haber, "A system architecture for learning robots," Ph.D. dissertation, School of Computer Science and Engineering, UNSW Australia, 2015.
[3] S. H. Johnson-Frey, "What's so special about human tool use?" Neuron, vol. 39, no. 2, pp. 201-204, 2003. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0896627303004240
[4] J. Stückler and S. Behnke, "Adaptive tool-use strategies for anthropomorphic service robots," in 2014 IEEE-RAS International Conference on Humanoid Robots, Nov 2014, pp. 755-760.
[5] H. Wicaksono and C. Sammut, "A learning framework for tool creation by a robot," in Proceedings of ACRA, 2015.
[6] S. Muggleton, "Inductive logic programming," New Generation Computing, vol. 8, no. 4, pp. 295-318, 1991.
[7] O. Bachir and A.-f. Zoubir, "Adaptive neuro-fuzzy inference system based control of puma 600 robot manipulator," International Journal of Electrical and Computer Engineering, vol. 2, no. 1, p. 90, 2012.
[8] A. R. Firdaus and A. S. Rahman, "Genetic algorithm of sliding mode control design for manipulator robot," TELKOMNIKA (Telecommunication Computing Electronics and Control), vol. 10, no. 4, pp. 645-654, 2012.
[9] H. Wicaksono, H. Khoswanto, and S. Kuswadi, "Behaviors coordination and learning on autonomous navigation of physical robot," TELKOMNIKA (Telecommunication Computing Electronics and Control), vol. 9, no. 3, pp. 473-482, 2013.
[10] A. Wood, T. Horton, and R. Amant, "Effective tool use in a habile agent," in Systems and Information Engineering Design Symposium, 2005 IEEE, April 2005, pp. 75-81.
[11] A. Stoytchev, "Behavior-grounded representation of tool affordances," in Proceedings of the 2005 IEEE International Conference on Robotics and Automation, April 2005, pp. 3060-3065.
[12] V. Tikhanoff, U. Pattacini, L. Natale, and G. Metta, "Exploring affordances and tool use on the icub," in 2013 13th IEEE-RAS International Conference on Humanoid Robots (Humanoids), Oct 2013, pp. 130-137.
[13] T. Mar, V. Tikhanoff, G. Metta, and L. Natale, "Self-supervised learning of grasp dependent tool affordances on the icub humanoid robot," in 2015 IEEE International Conference on Robotics and Automation (ICRA), May 2015, pp. 3200-3206.
[14] R. Fikes and N. J. Nilsson, "STRIPS: A new approach to the application of theorem proving to problem solving," Artif. Intell., vol. 2, no. 3/4, pp. 189-208, 1971.
[15] S. Hutchinson, G. D. Hager, and P. I. Corke, "A tutorial on visual servo control," IEEE Transactions on Robotics and Automation, vol. 12, no. 5, pp. 651-670, 1996.
[16] T. M. Mitchell, "Version spaces: A candidate elimination approach to rule learning," in Proceedings of the 5th International Joint Conference on Artificial Intelligence - Volume 1. Morgan Kaufmann Publishers Inc., 1977, pp. 305-310.
[17] S. Muggleton and C. Feng, "Efficient induction in logic programs," in Inductive Logic Programming, S. Muggleton, Ed. Academic Press, 1992, pp. 281-298.
[18] H. Wicaksono and C. Sammut, "Relational tool use learning by a robot in a real and simulated world," in Proceedings of ACRA, 2016.
BIOGRAPHY OF AUTHORS

Handy Wicaksono is a Ph.D. student in the Artificial Intelligence Group, School of Computer Science and Engineering, UNSW Australia, with bachelor's and master's degrees in Electrical Engineering from Institut Teknologi Sepuluh Nopember, Indonesia. He is also a lecturer in the Electrical Engineering Department, Petra Christian University, Indonesia. His research is in the areas of artificial intelligence, intelligent robots, and industrial automation.

Claude Sammut is a Professor of Computer Science and Engineering at the University of New South Wales, Head of the Artificial Intelligence Research Group, and Deputy Director of the iCinema Centre for Interactive Cinema Research. Previously, he was a program manager for the Smart Internet Technology Cooperative Research Centre, the UNSW node Director of the ARC Centre of Excellence for Autonomous Systems, and a member of the joint ARC/NH&MRC project on Thinking Systems. His early work on relational learning helped to lay the foundations for the field of Inductive Logic Programming (ILP). With Donald Michie, he also did pioneering work in Behavioral Cloning. His current interests include Conversational Agents and Robotics.