International Journal of Power Electronics and Drive System (IJPEDS)
Vol. 11, No. 3, September 2020, pp. 1123~1131
ISSN: 2088-8694, DOI: 10.11591/ijpeds.v11.i3.pp1123-1131
Journal homepage: http://ijpeds.iaescore.com
Adaptive dynamic programming based optimal control for a robot manipulator
Dao Phuong Nam1, Nguyen Hong Quang2, Tran Phuong Nam3, Tran Thi Hai Yen4
1,3 Hanoi University of Science and Technology, Vietnam
2,4 Thai Nguyen University of Technology, Vietnam
Article Info
Article history: Received Sep 2, 2019; Revised Nov 9, 2019; Accepted Feb 4, 2020
ABSTRACT
In this paper, the optimal control problem of a nonlinear robot manipulator in the absence of holonomic constraint forces is presented from the point of view of adaptive dynamic programming (ADP). First, exact linearization is applied to the manipulator. Then a framework combining ADP and the Robust Integral of the Sign of the Error (RISE) is developed. The ADP algorithm employs neural networks and tunes the actor and critic networks simultaneously to approximate the control policy and the cost function, respectively. The convergence of the weights and the position tracking control problem are addressed by theoretical analysis. Finally, a numerical example illustrates the effectiveness of the proposed control design.
Keywords: Adaptive dynamic programming (ADP); Input constraint; Neural network; Robot manipulator; Robust integral of the sign of the error (RISE)
This is an open access article under the CC BY-SA license.
Corresponding Author:
Nguyen Hong Quang,
Thai Nguyen University of Technology, Vietnam,
666, Street 3/2, Tich Luong Ward, Thai Nguyen City - Thai Nguyen Province, Vietnam.
Email: quang.nguyenhong@tnut.edu.vn
1. INTRODUCTION
In recent years, control methodologies for robotic systems have been widely developed, not only in practical applications [1, 2] but also in theoretical analysis [3-6]. The main challenges of the control design have been considered, such as the robust adaptive control problem, motion/force control, input saturation and full-state constraints [7, 8], and the path planning problem [9]. Several control techniques have been employed for manipulators to tackle input saturation by adding extra terms to a control input that was designed without the input constraint [4, 5, 10-13]. In [4], the authors proposed a new reference for the control system to account for input saturation. The additional term is computed from the derivative of the previous Lyapunov candidate function along the state trajectory under control input saturation [4]. Furthermore, the authors in [5] give a new approach that addresses the input constraints while also handling disturbances. The proposed sliding surface employs the Sat function of the joint variables. To deal with the state constraints of manipulators, the authors in [7, 8] proposed a framework based on the barrier Lyapunov function, the Moore-Penrose inverse and the fuzzy-neural network technique. The equivalent sliding mode control algorithm was designed, and then the bound of the control input was estimated. The advantage of this approach is that the input bound can be adjusted by selecting several parameters. The work in [10-13] presents a technique to implement the input constraint using a modified Lyapunov candidate function. Because of actuator saturation, a quadratic term in the difference between the control input computed by the controller and the actual signal applied to the plant is added to the Lyapunov function. The control design was obtained after considering the Lyapunov function derivative along
the system trajectory. However, these traditional nonlinear techniques have several drawbacks, such as the difficulty of finding an equivalent Lyapunov function and the dynamics of the additional terms [7, 8, 10-13].
Optimization techniques using the genetic algorithm (GA) and particle swarm optimization (PSO) were applied to solve the path planning problem [9]. The model predictive control (MPC) solution, which is a special case of optimal control design, has been investigated for linear motors, not only with the online min-max technique in [14, 15] but also with an offline algorithm in [16]. For robot manipulators, an optimal control algorithm yields a control design that can handle input and state constraints by solving the optimization problem in the presence of constraints. An asymptotic optimal control design was presented in [3] by directly solving the Riccati equation in linear systems. However, it is difficult to find the explicit solution of the Riccati equation, as well as of the partial differential Hamilton-Jacobi-Bellman (HJB) equation, in the general case.
Approximate/adaptive dynamic programming (ADP) has received much attention for the optimal control problem in recent years, because it is necessary to solve not only the Riccati equation for linear systems but also the HJB equation for nonlinear systems. Using the Kronecker product technique, the authors in [17] proposed an online solution for linear systems without knowledge of the system matrix, based on the least-squares solution obtained from a sufficient number of data points. In [18], Zhong-Ping Jiang et al. extended this online solution to completely unknown dynamics, in the sense that it depends on neither matrix A nor matrix B of the linear system. The Riccati equation was considered in more detail with respect to the computation problem as well as data acquisition. Moreover, exploration noise over the time interval was included in the proposed algorithm [18]. Instead of employing the Kronecker product as in the linear case, a neural network approximation of the cost function was used to implement an online adaptive algorithm on the actor/critic structure for continuous-time nonlinear systems [19]. However, the proposed algorithm required knowledge of the input-to-state dynamics to update the control policy, and the persistence of excitation condition was not considered [19]. The weights of the neural network were tuned to minimize the objective in the least-squares sense [19].
The theoretical analysis of the convergence of the cost function and the control input in adaptive/approximate dynamic programming (ADP) extends the work in [20]. Building on the theoretical analysis of neural network approximation, the authors in [21] presented a novel online ADP algorithm that tunes the actor and critic neural networks simultaneously. The weights of the critic neural network (NN) were trained by a modified Levenberg-Marquardt algorithm to minimize the squared residual error. Moreover, the tuning laws of the actor and critic NN weights depend on each other in order to obtain weight convergence. It is worth noting that the persistence of excitation (PE) condition needs to be satisfied, and Lyapunov stability theory was employed to analyze the convergence problem [21]. Extending the work in [21] and based on the analysis of the approximate Bellman error, the algorithm proposed in [22] can be implemented online and simultaneously without knowledge of the drift term. In [23], an identifier with an adaptation law is described using a neural network to approximate the dynamic uncertainties of the nonlinear model. An extension using a special cost function was proposed in [24, 25] to handle input constraints. A framework combining the ADP technique and classical sliding mode control was presented to design the optimal control for an inverted pendulum [26]. However, the effectiveness of ADP has not yet been considered for a robot manipulator in the aforementioned studies.
This work proposes a control algorithm combining exact linearization, the Robust Integral of the Sign of the Error (RISE) [3] and the ADP technique for manipulators in the absence of holonomic constraints. The ADP technique is implemented using the simultaneous tuning method to satisfy weight convergence and stability.
2. DYNAMIC MODEL OF A ROBOT MANIPULATOR AND CONTROL OBJECTIVE
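Consider the following robot manipulator without constraints:
$$M(q)\ddot{q} + C(q,\dot{q})\dot{q} + G(q) + F(\dot{q}) + \tau_d(t) = \tau(t) \tag{1}$$
where $M(q)$ is the inertia matrix, $C(q,\dot{q})$ is the Coriolis matrix, $G(q)$ is the gravity vector, $F(\dot{q})$ is the friction term, $\tau_d(t)$ is a disturbance and $\tau(t)$ is the control torque. Several appropriate assumptions [3] are made in order to develop the control design in the following sections.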
Assumption 1. The inertia matrix $M(q)$ is symmetric, positive definite, and satisfies the following inequality for all $\xi(t) \in \mathbb{R}^n$:
$$m_1\|\xi\|^2 \le \xi^T M(q)\,\xi \le \bar{m}(q)\|\xi\|^2 \tag{2}$$
where $m_1 \in \mathbb{R}$ is a known positive constant, $\bar{m}(q) \in \mathbb{R}$ is a known positive function, and $\|\cdot\|$ denotes the standard Euclidean norm.
Assumption 2. The inertia matrix $M(q)$ and the Coriolis matrix $C(q,\dot{q})$ satisfy the following relationship:
$$\xi^T\left(\dot{M}(q) - 2C(q,\dot{q})\right)\xi = 0, \quad \forall \xi \in \mathbb{R}^n. \tag{3}$$
Note that the manipulator is considered in the absence of holonomic constraint forces. The control objective is to find a control algorithm, within the framework of exact linearization, RISE and the ADP technique, that achieves position tracking for the manipulator control system shown in Figure 1. The ADP algorithm is employed to implement the optimal control design, as described in the next section.
Figure 1. Control structure
3. ADAPTIVE DYNAMIC PROGRAMMING APPROACH FOR A ROBOT MANIPULATOR
3.1. ADP algorithm
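Following [3], applying the control input (4) to the manipulator (1), with the nonlinear function (5) built from the error signals (6)-(8), leads to the nonlinear model (9):
$$\tau = -u + h + \tau_d \tag{4}$$
$$h = M\left(\alpha_1\dot{e}_1\right) + C\left(\alpha_1 e_1\right) + G(q) + F(\dot{q}) \tag{5}$$
$$e_1 = q_d - q \tag{6}$$
$$e_2 = \dot{e}_1 + \alpha_1 e_1 \tag{7}$$
$$r = \dot{e}_2 + \alpha_2 e_2 \tag{8}$$
$$\dot{x} = f(x) + g(x)u \tag{9}$$
where
$$x = \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}, \quad f(x) = \begin{bmatrix} -\alpha_1 & I_{n\times n} \\ 0_{n\times n} & -M^{-1}C \end{bmatrix}\begin{bmatrix} e_1 \\ e_2 \end{bmatrix}, \quad g(x) = \begin{bmatrix} 0_{n\times n} \\ M^{-1} \end{bmatrix}.$$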
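The structure of model (9) can be illustrated with a short numerical sketch. It assumes the block form $f(x) = [-\alpha_1 e_1 + e_2;\ -M^{-1}C e_2]$ and $g(x) = [0;\ M^{-1}]$, which is the sign convention obtained with $e_1 = q_d - q$; the function name `error_dynamics` and its arguments are illustrative and do not come from the paper.

```python
import numpy as np

def error_dynamics(e1, e2, M, C, alpha1):
    """Return f(x) and g(x) of model (9) for the state x = [e1; e2].

    M, C   : (n, n) inertia and Coriolis matrices at the current (q, q_dot)
    alpha1 : (n, n) gain matrix appearing in (7)
    Assumed sign convention: g(x) = [0; M^{-1}].
    """
    n = e1.size
    M_inv = np.linalg.inv(M)
    # f(x) = [[-alpha1, I], [0, -M^{-1} C]] @ [e1; e2]
    f = np.concatenate([-alpha1 @ e1 + e2, -M_inv @ C @ e2])
    # g(x) = [0; M^{-1}]
    g = np.vstack([np.zeros((n, n)), M_inv])
    return f, g

# Small usage example with constant matrices (hypothetical values)
f, g = error_dynamics(np.array([1.0, 0.0]), np.array([0.0, 2.0]),
                      M=2.0 * np.eye(2), C=np.eye(2), alpha1=np.eye(2))
```

With these inputs the drift evaluates to $f = [-1,\ 2,\ 0,\ -1]^T$, and the lower block of $g$ is $0.5\,I$.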
Now, the control objective is to design a control law $u$ that not only stabilizes (9) but also minimizes the following infinite-horizon quadratic cost function:
$$V(x_0) = \int_0^\infty r(x,u)\,dt \tag{10}$$
$$r(x,u) = Q(x) + u^T R u \tag{11}$$
where $Q(x)$ is a positive definite function of $x$ and $R$ is a symmetric positive definite matrix. This work presents an approximate solution, called adaptive dynamic programming (ADP), for the optimal control design. As in [21, 22], consider the following affine system:
$$\dot{x} = f(x) + g(x)u \tag{12}$$
where $x \in \mathbb{R}^n$ and $u \in U \subset \mathbb{R}^m$. The functions $f(x)$ and $g(x)$ satisfy the Lipschitz condition and $f(0) = 0$. The cost function is defined as in (10). The following definition, given in [17, 18], restricts the optimal control solution to the set of admissible controls.
Definition 1: A control policy $\mu(x)$ is called admissible if $\mu(x)$ stabilizes system (12) and the corresponding value function $V(x)$ is finite. The set of admissible control policies is denoted by $\Psi(\Omega)$.
For any admissible policy $\mu(x)$, the nonlinear Lyapunov equation (NLE) can be formulated as
$$r\left(x,\mu(x)\right) + V_x^T\left(f(x) + g(x)\mu(x)\right) = 0 \tag{13}$$
where $V_x = \partial V/\partial x$. Define the Hamiltonian function and the optimal cost function as follows:
$$H\left(x,\mu,V_x\right) = r(x,\mu) + V_x^T\left(f(x) + g(x)\mu\right) \tag{14}$$
$$V^*(x) = \min_{\mu \in \Psi(\Omega)} \int_t^\infty r(x,\mu)\,d\tau$$
This leads to the following HJB equation:
$$0 = \min_{\mu \in \Psi(\Omega)} H\left(x,\mu,V_x^*\right) = H\left(x,\mu^*,V_x^*\right) \tag{15}$$
Here $\mu^*$ is the optimal policy corresponding to the optimal cost function, and $H(x,\mu,V_x) = 0$ for any admissible policy is the NLE. The optimal control policy is obtained by setting the derivative of the Hamiltonian with respect to the policy to zero:
$$\mu^*(x) = -\frac{1}{2}R^{-1}g^T(x)\,V_x^* \tag{16}$$
This work presents the policy iteration (PI) algorithm for a robot manipulator, which consists of two steps, as follows:
Initialize an admissible control policy $\mu^0(x)$.
Repeat
Step 1: Policy evaluation
Solve the NLE for $V^i(x)$ corresponding to the given control policy $\mu^i$:
$$r\left(x,\mu^i(x)\right) + \left(V_x^i\right)^T\left(f(x) + g(x)\mu^i(x)\right) = 0 \tag{17}$$
Step 2: Policy improvement
Update the policy according to
$$\mu^{i+1}(x) = -\frac{1}{2}R^{-1}g^T(x)\,V_x^i \tag{18}$$
Until $n = n_{max}$ or $\|V^{i+1} - V^i\| \le \varepsilon_v$,
where $n_{max}$ is the maximum number of iterations and $\varepsilon_v$ is an arbitrarily small positive number.
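To make the two steps concrete, the following sketch runs policy iteration on a linear-quadratic special case, which is not the manipulator model itself. For $\dot{x} = Ax + Bu$, $r = x^TQx + u^TRu$, $V^i = x^TP_ix$ and $\mu^i(x) = -K_ix$, the NLE (17) reduces to the Lyapunov equation $A_c^TP + PA_c + Q + K^TRK = 0$ with $A_c = A - BK$, and (18) reduces to $K = R^{-1}B^TP$. The function names and the test system are assumptions chosen for illustration.

```python
import numpy as np

def solve_lyapunov(Ac, S):
    """Solve Ac^T P + P Ac + S = 0 using the Kronecker product (row-major vec)."""
    n = Ac.shape[0]
    I = np.eye(n)
    L = np.kron(Ac.T, I) + np.kron(I, Ac.T)
    P = np.linalg.solve(L, -S.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

def policy_iteration(A, B, Q, R, K0, n_max=50, eps_v=1e-10):
    """Policy iteration for the linear-quadratic case, policy mu(x) = -K x."""
    K, P_prev = K0, None
    for _ in range(n_max):
        Ac = A - B @ K
        P = solve_lyapunov(Ac, Q + K.T @ R @ K)   # Step 1: policy evaluation
        K = np.linalg.solve(R, B.T @ P)           # Step 2: policy improvement
        if P_prev is not None and np.linalg.norm(P - P_prev) <= eps_v:
            break                                  # ||V^{i+1} - V^i|| <= eps_v
        P_prev = P
    return P, K

# Hypothetical stable test system, so K0 = 0 is an admissible initial policy
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
P, K = policy_iteration(A, B, np.eye(2), np.eye(1), K0=np.zeros((1, 2)))
```

In this special case the iteration converges to the solution of the algebraic Riccati equation, which is why the nonlinear case needs an approximate solver for (17) instead of a closed-form one.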
It is proved in [21] that each policy $\mu^i$ generated by this algorithm is admissible. The cost function $V^i$ decreases at each step until it converges to the optimal cost function, and $\mu^i$ converges to the optimal policy as well. However, the nonlinear Lyapunov equation (17) is hard to solve directly. Therefore, finding an indirect way to solve this equation has attracted much research in recent years [20-25]. In the next steps, two neural networks, called the actor and the critic (AC), are trained simultaneously to solve the HJB equation approximately.
The cost function and its associated policy can be represented by a neural network (NN) as follows:
$$V^*(x) = W^T\phi(x) + \varepsilon_v, \qquad u^*(x) = -\frac{1}{2}R^{-1}g^T(x)\left(\nabla\phi^T W + \varepsilon_a\right) \tag{19}$$
where $\phi(x)$ is the NN activation function, usually selected as polynomial, Gaussian or sigmoid functions, $\varepsilon_v$ and $\varepsilon_a$ are the approximation errors, and $\nabla$ denotes $\partial/\partial x$. The approximated optimal cost function and optimal policy are
$$\hat{V}(x) = \hat{W}_c^T\phi(x), \qquad \hat{u}(x) = -\frac{1}{2}R^{-1}g^T(x)\nabla\phi^T\hat{W}_a \tag{20}$$
Note that approximating the HJB solution requires only the term $\hat{W}_c$. However, to stabilize the closed-loop system, both $\hat{W}_a$ and $\hat{W}_c$ are employed, which gives additional flexibility in handling the stability of the system during the learning process.
Replacing the optimal policy and the optimal cost function in (17) by the actor and critic networks gives the HJB error:
$$\delta_{hjb} = Q(x) + \hat{u}^T R\,\hat{u} + \hat{W}_c^T\nabla\phi\left(f(x) + g(x)\hat{u}\right) \tag{21}$$
$$\delta_{hjb} = Q(x) + \frac{1}{4}\hat{W}_a^T\nabla\phi\,G\,\nabla\phi^T\hat{W}_a + \hat{W}_c^T\nabla\phi\left(f(x) - \frac{1}{2}gR^{-1}g^T\nabla\phi^T\hat{W}_a\right) \tag{22}$$
where $G = gR^{-1}g^T$.
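The equivalence of the two expressions for the HJB error can be checked numerically. The sketch below assumes $\nabla\phi$ is stored as an $N \times n$ array (one row per basis function) and that $R$ is symmetric positive definite; the function names are illustrative only.

```python
import numpy as np

def actor_policy(W_a, grad_phi, g, R):
    """u_hat = -1/2 R^{-1} g^T grad_phi^T W_a, as in (20)."""
    return -0.5 * np.linalg.solve(R, g.T @ grad_phi.T @ W_a)

def hjb_error_21(Qx, W_c, W_a, grad_phi, f, g, R):
    """HJB error written in terms of u_hat, form (21)."""
    u = actor_policy(W_a, grad_phi, g, R)
    return Qx + u @ R @ u + W_c @ grad_phi @ (f + g @ u)

def hjb_error_22(Qx, W_c, W_a, grad_phi, f, g, R):
    """HJB error with u_hat substituted, form (22), using G = g R^{-1} g^T."""
    G = g @ np.linalg.solve(R, g.T)
    quad = 0.25 * W_a @ grad_phi @ G @ grad_phi.T @ W_a
    drift = f - 0.5 * G @ grad_phi.T @ W_a
    return Qx + quad + W_c @ grad_phi @ drift

# Random data with N = 5 basis functions, n = 4 states, m = 2 inputs
rng = np.random.default_rng(0)
grad_phi = rng.normal(size=(5, 4))
f, g = rng.normal(size=4), rng.normal(size=(4, 2))
R = np.array([[0.25, 0.0], [0.0, 0.25]])
W_c, W_a = rng.normal(size=5), rng.normal(size=5)
d21 = hjb_error_21(1.0, W_c, W_a, grad_phi, f, g, R)
d22 = hjb_error_22(1.0, W_c, W_a, grad_phi, f, g, R)
```

Both calls return the same scalar up to round-off, which confirms that (22) is (21) with $\hat{u}$ from (20) substituted.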
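The tuning law for $\hat{W}_c$ is described as follows:
$$\dot{\hat{W}}_c = -\eta_c\,\Gamma\,\frac{\sigma}{1+\nu\sigma^T\Gamma\sigma}\,\delta_{hjb} \tag{23}$$
$$\dot{\Gamma} = -\eta_c\,\Gamma\,\frac{\sigma\sigma^T}{1+\nu\sigma^T\Gamma\sigma}\,\Gamma \tag{24}$$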
with $\Gamma(t_r^+) = \Gamma(0) = \varphi_0 I$, where $t_r^+$ is the resetting time. To avoid slow convergence of $\hat{W}_c$, the matrix $\Gamma$ is reset to its default value $\Gamma(0)$ whenever the minimum eigenvalue of $\Gamma$ reaches a given small positive number. Here $\sigma = \nabla\phi\left(f(x) + g(x)\hat{u}\right)$ and $1 + \nu\sigma^T\Gamma\sigma$ is the normalization factor.
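A discrete-time sketch of the critic update may help to see how the normalization and the resetting interact. It assumes an explicit Euler discretization with step `dt` and a reset threshold `lam_min`; both are choices made here for illustration, as is the function name `critic_step`.

```python
import numpy as np

def critic_step(W_c, Gamma, sigma, delta, eta_c, nu, dt, Gamma0, lam_min=1e-3):
    """One explicit-Euler step of the critic laws (23)-(24) with resetting of Gamma."""
    m = 1.0 + nu * sigma @ Gamma @ sigma              # normalization factor
    W_dot = -eta_c * (Gamma @ sigma) / m * delta       # right-hand side of (23)
    Gamma_dot = -eta_c * (Gamma @ np.outer(sigma, sigma) @ Gamma) / m  # (24)
    W_c = W_c + dt * W_dot
    Gamma = Gamma + dt * Gamma_dot
    # Reset Gamma to Gamma(0) when its smallest eigenvalue becomes too small
    if np.linalg.eigvalsh(Gamma).min() <= lam_min:
        Gamma = Gamma0.copy()
    return W_c, Gamma
```

As a sanity check, when the error is linear in the weight error, $\delta = \sigma^T(\hat{W}_c - W)$, and $\sigma$ is persistently exciting, repeated calls drive $\hat{W}_c$ toward $W$. This is a simplified stand-in for the HJB error, used only to exercise the update.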
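To ensure the convergence of $\hat{W}_c$ under the update law (23)-(24), the normalized regressor $\psi$ must satisfy the persistence of excitation (PE) condition [21]:
$$\gamma_1 I \le \int_{t_0}^{t_0+T}\psi(\tau)\psi^T(\tau)\,d\tau \le \gamma_2 I \tag{25}$$
for some positive numbers $\gamma_1$, $\gamma_2$, $T$, where $\psi(t) = \sigma(t)/\sqrt{1+\nu\sigma^T(t)\Gamma\sigma(t)}$.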
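The PE condition (25) can be checked on sampled data by integrating $\psi\psi^T$ over one window and inspecting its extreme eigenvalues. The sketch assumes the normalized regressor $\psi = \sigma/\sqrt{1+\nu\sigma^T\Gamma\sigma}$, a matrix $\Gamma$ frozen over the window, and a left Riemann sum for the integral; the helper name `pe_bounds` is hypothetical.

```python
import numpy as np

def pe_bounds(sigma_samples, Gamma, nu, dt):
    """Smallest and largest eigenvalue of the windowed integral in (25).

    sigma_samples : (K, N) array, one regressor sample per row
    """
    N = sigma_samples.shape[1]
    S = np.zeros((N, N))
    for s in sigma_samples:
        psi = s / np.sqrt(1.0 + nu * s @ Gamma @ s)   # normalized regressor
        S += np.outer(psi, psi) * dt                   # left Riemann sum
    eig = np.linalg.eigvalsh(S)                        # ascending order
    return eig[0], eig[-1]

# A rotating regressor is exciting in every direction over one period
K = 1000
t = np.arange(K) * (2.0 * np.pi / K)
rotating = np.column_stack([np.sin(t), np.cos(t)])
lo, hi = pe_bounds(rotating, np.eye(2), nu=1.0, dt=2.0 * np.pi / K)
```

A strictly positive lower eigenvalue plays the role of the lower bound in (25). A constant regressor gives a zero lower eigenvalue, which is why a probing signal is usually injected.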
On the other hand, (22) is a nonlinear equation in $\hat{W}_a$. Therefore, the tuning law for $\hat{W}_a$ is formulated based on the gradient descent (GD) algorithm to minimize the squared HJB error $\delta_{hjb}^2(t)$:
$$\dot{\hat{W}}_a = \mathrm{proj}\left\{-\frac{\eta_{a1}}{\sqrt{1+\sigma^T\sigma}}\,\nabla\phi\,G\,\nabla\phi^T\left(\hat{W}_a - \hat{W}_c\right)\delta_{hjb} - \eta_{a2}\left(\hat{W}_a - \hat{W}_c\right)\right\} \tag{26}$$
where $\mathrm{proj}\{\cdot\}$ is a projection operator [22] that ensures the boundedness of the update law.
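The actor law (26) can be sketched in discrete time in the same way as the critic law. The projection operator of [22] is not reproduced here; as a loudly labeled stand-in, the sketch clips the weight vector to a norm ball of radius `radius` after each Euler step. All names and the radius value are assumptions.

```python
import numpy as np

def project_to_ball(W, radius):
    """Stand-in for proj{.}: scale W back onto a norm ball (not the operator of [22])."""
    norm = np.linalg.norm(W)
    return W if norm <= radius else W * (radius / norm)

def actor_step(W_a, W_c, grad_phi, G, sigma, delta,
               eta_a1, eta_a2, dt, radius=1e3):
    """One explicit-Euler step of the actor law (26)."""
    diff = W_a - W_c
    # Gradient-type term driven by the HJB error
    grad_term = (eta_a1 / np.sqrt(1.0 + sigma @ sigma)
                 * (grad_phi @ G @ grad_phi.T @ diff) * delta)
    W_dot = -grad_term - eta_a2 * diff   # second term pulls W_a toward W_c
    return project_to_ball(W_a + dt * W_dot, radius)
```

With $\delta_{hjb} = 0$ only the second term is active, so the actor weights relax exponentially toward the critic weights, which is the coupling between the two networks mentioned above.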
Note that the parameters $\eta_c$, $\eta_{a1}$, $\eta_{a2}$ of the two NN update laws must be selected to satisfy certain conditions [22] that ensure the stability of the closed-loop system. The complete proof of parameter convergence and system stability can also be found in [22].
3.2. RISE feedback control design
In [3], the control term µ(t) is designed based on the RISE framework as follows:
$$\mu(t) = (k_s+1)\,e_2(t) - (k_s+1)\,e_2(0) + \nu(t) \tag{27}$$
where $\nu(t) \in \mathbb{R}^n$ is described by
$$\dot{\nu}(t) = (k_s+1)\,\alpha_2 e_2 + \beta_1\,\mathrm{sgn}(e_2) \tag{28}$$
Here $k_s$ is a positive constant control gain, and $\beta_1$ is a positive control gain selected according to the following sufficient condition:
$$\beta_1 > \zeta_1 + \frac{1}{\alpha_2}\zeta_2 \tag{29}$$
where $\zeta_1$ and $\zeta_2$ are the known bounding constants defined in [3].
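The RISE term (27)-(28) is straightforward to evaluate numerically once $\nu(t)$ is integrated. The sketch below uses an explicit Euler step and treats $\alpha_2$ as a scalar gain; the function names and the numerical values in the example are illustrative, not taken from the paper.

```python
import numpy as np

def rise_step(nu, e2, k_s, alpha2, beta1, dt):
    """Explicit-Euler step of (28): nu_dot = (k_s + 1) alpha2 e2 + beta1 sgn(e2)."""
    return nu + dt * ((k_s + 1.0) * alpha2 * e2 + beta1 * np.sign(e2))

def rise_term(nu, e2, e2_0, k_s):
    """RISE feedback term (27)."""
    return (k_s + 1.0) * (e2 - e2_0) + nu

# Example: constant error e2 = e2(0), so only the integral part nu contributes
e2 = np.array([1.0, -1.0])
nu = np.zeros(2)
for _ in range(10):
    nu = rise_step(nu, e2, k_s=1.0, alpha2=1.0, beta1=0.5, dt=0.1)
mu = rise_term(nu, e2, e2_0=e2, k_s=1.0)
```

The sign term makes the control discontinuous only inside the integral, so the resulting $\mu(t)$ stays continuous.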
Remark 1: Unlike the work in [3], in this work the ADP algorithm is used to find the intermediate optimal control input in the absence of dynamic uncertainty. Furthermore, the ADP technique considered in [20-26] has not yet been applied to a robotic manipulator.
Remark 2: Compared with the work of Dixon [3], which designs the optimal control by solving the Riccati equation, this work requires partial knowledge of the manipulator dynamics, namely the matrices $M$ and $C$. However, using the ADP approach, the optimal control problem is addressed in the general case for any given cost function of the form (10) without constraints.
4. OFFLINE SIMULATION RESULTS
Consider the offline simulation of a two-link manipulator control system using the ADP technique and the RISE algorithm. The dynamics of the two-link manipulator are represented by (1) with
$$M(q) = \begin{bmatrix} 5 + 2\cos(q_2) & 1 + \cos(q_2) \\ 1 + \cos(q_2) & 1 \end{bmatrix}, \quad C(q,\dot{q}) = \begin{bmatrix} -\sin(q_2)\,\dot{q}_2 & -\sin(q_2)\,(\dot{q}_1 + \dot{q}_2) \\ \sin(q_2)\,\dot{q}_1 & 0 \end{bmatrix},$$
$$G(q) = 9.8\begin{bmatrix} 1.2\cos(q_1) + \cos(q_1+q_2) \\ \cos(q_1+q_2) \end{bmatrix}, \quad F(\dot{q}) = -0.1\,\mathrm{sign}(\dot{q}), \quad \tau_d(t) = \begin{bmatrix} 0.1\sin(t) \\ 0.1\cos(t) \end{bmatrix}.$$
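The two-link model can be coded directly, which also allows Assumptions 1 and 2 to be checked numerically. The sketch assumes the matrix entries as read from the model above, with the factor 9.8 multiplying both entries of $G(q)$; the function `inertia_dot` is the time derivative of $M(q)$ computed by hand and is not given in the paper.

```python
import numpy as np

def inertia(q):
    """Inertia matrix M(q) of the two-link model."""
    c2 = np.cos(q[1])
    return np.array([[5.0 + 2.0 * c2, 1.0 + c2],
                     [1.0 + c2, 1.0]])

def coriolis(q, dq):
    """Coriolis matrix C(q, q_dot) of the two-link model."""
    s2 = np.sin(q[1])
    return np.array([[-s2 * dq[1], -s2 * (dq[0] + dq[1])],
                     [s2 * dq[0], 0.0]])

def gravity(q):
    """Gravity vector G(q), assuming 9.8 scales both entries."""
    c12 = np.cos(q[0] + q[1])
    return 9.8 * np.array([1.2 * np.cos(q[0]) + c12, c12])

def inertia_dot(q, dq):
    """Time derivative of M(q) along the motion (hand-derived)."""
    s2 = np.sin(q[1])
    return np.array([[-2.0 * s2 * dq[1], -s2 * dq[1]],
                     [-s2 * dq[1], 0.0]])

# Check Assumptions 1 and 2 at an arbitrary configuration
q, dq = np.array([0.3, -1.1]), np.array([0.7, -0.4])
N = inertia_dot(q, dq) - 2.0 * coriolis(q, dq)
```

At any configuration $M(q)$ is symmetric positive definite (its determinant is $4 - \cos^2 q_2$), and $\dot{M} - 2C$ is skew-symmetric, so $\xi^T(\dot{M} - 2C)\xi = 0$ as required by (3).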
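The value function is (10) with $Q(x) = x^T Q_0 x$, where
$$Q_0 = \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix}, \quad Q_{11} = \begin{bmatrix} 40 & 2 \\ 2 & 40 \end{bmatrix}, \quad Q_{12} = Q_{21} = \begin{bmatrix} -4 & 4 \\ 4 & -6 \end{bmatrix}, \quad Q_{22} = \begin{bmatrix} 4 & 0 \\ 0 & 4 \end{bmatrix},$$
$$R = \begin{bmatrix} 0.25 & 0 \\ 0 & 0.25 \end{bmatrix}, \quad \alpha_1 = \begin{bmatrix} 15.6 & 10.6 \\ 10.6 & 10.4 \end{bmatrix}.$$
Without loss of generality, the set-point is selected as $q_d = [0\ \ 0]^T$ and the initial state is $q_0 = [0.1598\ \ 0.2257]^T$.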
The optimal value function, which is solved directly in [3], is
$$V^*(x) = \frac{1}{2}x^T\begin{bmatrix} -Q_{12} & 0_{n\times n} \\ 0_{n\times n} & M \end{bmatrix}x = 2x_1^2 - 4x_1x_2 + 3x_2^2 + 2.5x_3^2 + x_3^2\cos(x_2) + x_3x_4 + x_3x_4\cos(x_2) + 0.5x_4^2.$$
The update laws of $\hat{W}_c$ and $\hat{W}_a$ are given by (23) and (26), with $\Gamma(0) = 100$ and the remaining tuning constants set to 800, 1, 0.001, 0.01 and 1.
The NN activation function is selected as
$$\phi(x) = \begin{bmatrix} x_1^2 & x_1x_2 & x_2^2 & x_3^2 & x_3^2\cos(x_2) & x_3x_4 & x_3x_4\cos(x_2) & x_4^2 \end{bmatrix}^T.$$
The optimal parameter $W^* = [2\ \ {-4}\ \ 3\ \ 2.5\ \ 1\ \ 1\ \ 1\ \ 0.5]^T$ is obtained by solving the HJB equation directly, as shown in [3].
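The stated optimal weights can be cross-checked against the quadratic form of the optimal value function. The sketch assumes $V^*(x) = \frac{1}{2}x^T\,\mathrm{blockdiag}(-Q_{12}, M)\,x$ with $M$ evaluated at $\cos(x_2)$, which is consistent with the set-point $q_d = 0$; the function names are illustrative.

```python
import numpy as np

W_star = np.array([2.0, -4.0, 3.0, 2.5, 1.0, 1.0, 1.0, 0.5])
Q12 = np.array([[-4.0, 4.0], [4.0, -6.0]])

def phi(x):
    """Activation vector used in the simulation section."""
    x1, x2, x3, x4 = x
    c = np.cos(x2)
    return np.array([x1**2, x1 * x2, x2**2, x3**2,
                     x3**2 * c, x3 * x4, x3 * x4 * c, x4**2])

def value_quadratic(x):
    """1/2 x^T blockdiag(-Q12, M) x, with M evaluated at cos(x2)."""
    c = np.cos(x[1])
    M = np.array([[5.0 + 2.0 * c, 1.0 + c], [1.0 + c, 1.0]])
    Z = np.zeros((2, 2))
    P = np.block([[-Q12, Z], [Z, M]])
    return 0.5 * x @ P @ x

# The NN representation reproduces the quadratic form at an arbitrary state
x = np.array([0.4, -0.7, 1.3, -0.2])
gap = W_star @ phi(x) - value_quadratic(x)
```

The difference `gap` is zero up to round-off for every state, so the chosen basis represents $V^*$ exactly and no approximation error remains at the optimum.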
Figures 2 and 3 show the convergence of $\hat{W}_c$ and $\hat{W}_a$. The value of $\hat{W}_c$ after 110 s is $[2\ \ {-4}\ \ 3\ \ 2.5\ \ 1\ \ 1\ \ 1\ \ 0.5]^T$. To satisfy the PE condition (25), a probing signal is added to the system input. Moreover, the evolution of the system state, that is, of the tracking errors, is shown in Figure 4 and confirms the stability of the control system.
Figure 2. Convergence of the critic's parameters
Figure 3. Convergence of the actor's parameters
Figure 4. State evolution
5. CONCLUSION
This paper addressed the problem of optimal control design for a manipulator in combination with RISE and exact linearization. With the ADP technique, the solution of the HJB equation was found by an iterative algorithm, yielding a controller that achieves not only weight convergence but also position tracking. Offline simulations were carried out to validate the performance and effectiveness of the optimal control for manipulators.
ACKNOWLEDGEMENTS
This research was supported by Research Foundation funded by Thai Nguyen University of Technology.
REFERENCES
[1] Mohammed A. A. Al-Mekhlafi, Herman Wahid, Azian Abd Aziz, "Adaptive Neuro-Fuzzy Control Approach for a Single Inverted Pendulum System", International Journal of Electrical and Computer Engineering (IJECE), vol. 8, no. 5, pp. 3657-3665, 2018.
[2] Dwi Prihanto, Irawan Dwi Wahyono, Suwasono and Andrew Nafalski, "Virtual Laboratory for Line Follower Robot Competition", International Journal of Electrical and Computer Engineering (IJECE), vol. 7, no. 4, pp. 2253-2260, 2017.
[3] Keith Dupree, Parag M. Patre, Zachary D. Wilcox, Warren E. Dixon, "Asymptotic optimal control of uncertain nonlinear Euler–Lagrange systems", Automatica, vol. 47, pp. 99-107, 2011.
[4] Xin Hu, Xinjiang Wei, Huifeng Zhang, Jian Han, Xiuhua Liu, "Robust adaptive tracking control for a class of mechanical systems with unknown disturbances under actuator saturation", Int. J. Robust & Nonlinear Control, vol. 29, no. 6, pp. 1893-1908, 2019.
[5] Yong Guo, Bing Huang, Ai-jun Li, Chang-qing Wang, "Integral sliding mode control for Euler-Lagrange systems with input saturation", Int. J. Robust & Nonlinear Control, vol. 29, no. 4, pp. 1088-1100, 2018.
[6] Changjiang Xi, Jiuxiang Dong, "Adaptive reliable guaranteed performance control of uncertain nonlinear systems by using exponent-dependent barrier Lyapunov function", Int. J. Robust & Nonlinear Control, vol. 29, no. 4, pp. 1051-1062, 2019.
[7] Wei He, Yuhao Chen, Zhao Yin, "Adaptive Neural Network Control of an Uncertain Robot With Full-State Constraints", IEEE Transactions on Cybernetics, vol. 46, no. 3, pp. 620-629, 2016.
[8] Wei He, Yiting Dong, "Adaptive Fuzzy Neural Network Control for a Constrained Robot Using Impedance Learning", IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 6, pp. 1174-1186, 2018.
[9] Panigrahi, Pratap Kumar et al., "Comparison of GSA, SA and PSO Based Intelligent Controllers for Path Planning of Mobile Robot in Unknown Environment", 2015.
[10] Wei He, Yiting Dong, Changyin Sun, "Adaptive Neural Impedance Control of a Robotic Manipulator With Input Saturation", IEEE Transactions on Systems, Man and Cybernetics: Systems, vol. 46, no. 3, pp. 334-344, 2016.
[11] Ziting Chen, Zhijun Li, Philip Chen, "Adaptive Neural Control of Uncertain MIMO Nonlinear Systems With State and Input Constraints", IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 6, pp. 1318-1330, 2017.
[12] Guanyu Lai, Zhi Liu, Yun Zhang, Chun Lung Philip Chen, Shengli Xie, "Asymmetric Actuator Backlash Compensation in Quantized Adaptive Control of Uncertain Networked Nonlinear Systems", IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 2, pp. 294-307, 2017.
[13] Tarek Madani, Boubaker Daachi, and Karim Djouani, "Modular Controller Design Based Fast Terminal Sliding Mode for Articulated Exoskeleton Systems", IEEE Transactions on Control Systems Technology, vol. 25, no. 3, pp. 1133-1140, 2016.
[14] Quang N.H., et al., "Min Max Model Predictive Control for Polysolenoid Linear Motor", International Journal of Power Electronics and Drive System (IJPEDS), vol. 9, no. 4, pp. 1666-1675, 2018.
[15] Quang N.H., et al., "On tracking control problem for polysolenoid motor model predictive approach", International Journal of Electrical and Computer Engineering (IJECE), vol. 10, no. 1, pp. 849-855, 2020.
[16] Quang N.H., et al., "Multi parametric model predictive control based on laguerre model for permanent magnet linear synchronous motors", International Journal of Electrical and Computer Engineering (IJECE), vol. 9, no. 2, pp. 1067-1077, 2019.
[17] Vrabie, D., Pastravanu, O., Abu-Khalaf, M., & Lewis, F. L., "Adaptive optimal control for continuous-time linear systems based on policy iteration", Automatica, vol. 45, no. 2, pp. 477-484, 2009.
[18] Yu Jiang, Zhong-Ping Jiang, "Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics", Automatica, vol. 48, pp. 2699-2704, 2012.
[19] Vrabie, D., & Lewis, F. L., "Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems", Neural Networks, vol. 22, no. 3, pp. 237-246, 2009.
[20] Murad Abu-Khalaf, Frank L. Lewis, "Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach", Automatica, vol. 49, no. 1, pp. 779-791, 2005.
[21] Vamvoudakis, K. G., Lewis, F. L., "Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem", Automatica, vol. 46, no. 5, pp. 878-888, 2010.
[22] Kyriakos G. Vamvoudakis, Draguna Vrabie, Frank L. Lewis, "Online adaptive algorithm for optimal control with integral reinforcement learning", Int. J. Robust & Nonlinear Control, vol. 24, no. 17, pp. 2686-2710, 2014.
[23] S. Bhasin, R. Kamalapurkar, M. Johnson, K. G. Vamvoudakis, F. L. Lewis, W. E. Dixon, "A novel actor–critic–identifier architecture for approximate optimal control of uncertain nonlinear systems", Automatica, vol. 49, no. 1, pp. 82-92, 2013.
[24] Hamidreza Modares, Frank L. Lewis, Mohammad-Bagher Naghibi-Sistani, "Adaptive Optimal Control of Unknown Constrained-Input Systems Using Policy Iteration and Neural Networks", IEEE Transactions on Neural Networks and Learning Systems, vol. 24, no. 10, pp. 1513-1525, 2013.
[25] Hamidreza Modares, Frank L. Lewis, Mohammad-Bagher Naghibi-Sistani, "Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems", Automatica, vol. 50, no. 1, pp. 193-202, 2014.
[26] Nam D.P, et al., "Adaptive Dynamic Programming based Integral Sliding Mode Control Law for Continuous-Time Systems: A Design for Inverted Pendulum Systems", International Journal of Mechanical Engineering and Robotics Research, vol. 8, no. 2, pp. 279-283, March 2019.