TELKOMNIKA Indonesian Journal of Electrical Engineering
Vol. 12, No. 6, June 2014, pp. 4115 ~ 4124
ISSN: 2302-4046
DOI: 10.11591/telkomnika.v12i6.5388
Received December 29, 2013; Revised March 3, 2014; Accepted March 18, 2014
Constructing Cerebellum Model by Researching on its
Contributions to DIVA
Yuanyuan Wu*^1, Shaobai Zhang^2
College of Computer, Nanjing University of Posts and Telecommunications
No.66 Xinmofan Road, Nanjing, Jiangsu
*Corresponding author, e-mail: 525015923@qq.com^1, adzsb@163.com^2
Abstract
DIVA (Directions into Velocities of Articulators) is a mathematical model of the processes behind speech acquisition and production. It is intended to give a functional representation of the brain areas involved in speech production and speech perception. Introducing a cerebellar control mechanism into the model plays a significant role in improving the mechanism of speech acquisition and production based on the DIVA model. This paper studies the learning process of the model and explores the cerebellar contributions to it, namely feedforward learning, sensory prediction, feedback command production and the timing of delays. It then constructs a cerebellum model that is closer to neuroanatomy and applies it to the DIVA model. Simulation results show that the improved DIVA model produces clearer and more distinct speech sounds and is closer to a human-like pronunciation system. The cerebellum model established in this paper can be applied to speech acquisition and production based on the DIVA model.
Keywords: cerebellar roles, cerebellum model, DIVA model, speech acquisition and production, motor learning
Copyright © 2014 Institute of Advanced Engineering and Science. All rights reserved.
1. Introduction
The DIVA model was first proposed by Guenther in 1994 [1] and has been improved considerably since then. The earlier versions of the model had a number of shortcomings [2]. To address them, Guenther and Ghosh proposed a DIVA model [5] that is closer to neuroanatomy. The model takes two sensory modalities, auditory and somatosensory, as its basic structure, and defines components located in the premotor area, the motor area, and the auditory and somatosensory areas of the cerebral and cerebellar cortex, thereby establishing a correspondence between the model components and actual neuroanatomy. Moreover, the model combines a feedforward control subsystem with a feedback control subsystem to control articulator movements, includes realistic neural processing delays, and has been shown in computer simulations to give a detailed account of experiments involving compensation for perturbations of the lip and jaw. Although the DIVA model has been improved in many ways, some other performance factors have not been taken into account, and a cerebellar learning mechanism has not been integrated into the model to account for the timing of delays and sensorimotor learning in neural transmission.
The cerebellum has an anatomical structure that is consistent throughout. Because of this unique internal structure and its widespread connectivity to and from the cerebral cortex, several hypotheses have been proposed that the cerebellum uses a consistent processing scheme for transforming inputs into outputs. Allen and Tsukahara first proposed the theory of cerebrocerebellar interaction in 1974 [8]. In 1998, Miall et al. proposed the concept of the internal model [10]. They argued that the cerebellum contains two kinds of internal model, forward and inverse. The forward model predicts the consequence of a motion or an action, and the inverse model provides the commands needed to accomplish that motion or action. In addition, the cerebellum may act as a delay model that queues sensory predictions so that they match the actual sensory feedback. These theories laid the foundation for integrating cerebellar roles into motor control and learning systems.
Although the DIVA model is meant to represent the function of the brain areas involved in speech production and speech perception, these studies show that the cerebellum is also an indispensable part of the model, and the global cerebrocerebellar circuitry has been well established. Integrating the cerebellum into the DIVA model therefore plays a significant role in refining the mechanism of speech acquisition and production. This raises two questions: how should the cerebellum be integrated into this mechanism, and which roles does it play in the overall processing? These questions are discussed in this paper.
2. DIVA Model and its Learning Process
The DIVA model (see Figure 1) is a mathematical model that describes the processes of speech acquisition and production, and is used to represent the function of the brain areas involved in speech production and speech perception. The model is an adaptive neural network that learns to control the movements of simulated speech articulators in order to produce words, syllables, or phonemes [6]. It consists of integrated feedforward and feedback control subsystems. It takes a speech sound string as input and generates a time sequence of articulator positions that command the movements of the simulated vocal tract. Figure 1 shows the block diagram of the current DIVA model.
Before the DIVA model can produce speech sounds, the mappings between the components of the model must be learned. To investigate the cerebellar contributions to the DIVA model, the first step is to make explicit how these mappings are learned. The whole learning process is divided into two phases, an early babbling phase and an imitation phase. Figure 2 is a simplified block diagram of the DIVA model that indicates the mappings tuned during the two learning phases. During the babbling phase, random movements of the speech articulators provide the somatosensory and auditory feedback signals that are used to learn the mappings between different neural representations. After the babbling phase, the model enters the imitation phase, in which it can quickly learn to produce either new sounds from audio samples provided to it or arbitrary combinations of the sounds it has already learned.
Figure 1. The DIVA Model Block Diagram
Figure 2. Learning in the DIVA model
2.1. Early Babbling Phase
During a process similar to infant babbling, the model first learns the relationship between motor commands and their sensory outcomes. Pseudo-random articulator movements issue motor commands, which in turn produce somatosensory and auditory feedback. The motor commands and their sensory consequences are used to tune the synaptic projections from the sensory error maps to the feedforward control map (red arrows in Figure 2). Once tuned, these projections transform sensory error signals into corrective motor velocity commands.
2.2. Imitation Phase
After the babbling phase, in which the general sensory-to-motor mapping has been learned, the model enters a second learning phase, the imitation phase, in order to produce speech sounds. Two kinds of mapping need to be learned in this phase. The first is the mapping in the feedback control system from the speech sound map to the auditory and somatosensory target maps (blue arrows in Figure 2). The second is the mapping in the feedforward control system from the speech sound map to the articulator velocity and position maps (green arrows in Figure 2). For the first mapping, by analogy with the sounds of an infant's native language, the model is presented with speech sound samples in the form of time-varying acoustic signals spoken by a human speaker. Once a new speech sound is presented, it becomes associated with an unused cell in the speech sound map. The model then learns an auditory target for that speech sound in the form of a time-varying region. In this way a correspondence is established between the cell in the speech sound map and the auditory target in the auditory target map; that is, the weights from the speech sound map to the auditory target map are tuned. In addition, the weights from the speech sound map to the somatosensory target map are tuned during correct self-productions.
Once the auditory targets have been learned, the second kind of mapping is also learned during the imitation phase. Because the projections from the speech sound map to the articulator velocity and position maps are initially poorly tuned, production relies heavily on the feedback control system, and large sensory error signals are produced in the first attempts to produce the speech sound. However, the feedback-based corrective motor command in each production is added to the weights from the speech sound map to the articulator velocity and position maps, which incrementally improves the accuracy of the feedforward motor command. With practice, the feedforward commands become able to produce the speech sound with minimal sensory error. Therefore, unless unexpected sensory feedback is encountered, production of the speech sound relies little on the feedback control system.
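The incremental transfer from feedback to feedforward control described above can be illustrated with a minimal one-dimensional sketch in Python. It is not the DIVA implementation: the scalar "articulator", the identity relation between command and sensory outcome, and the names `fb_gain` and `rate` are assumptions made purely for illustration.

```python
def imitation_trials(target, n_trials=20, fb_gain=0.5, rate=0.5):
    """Toy imitation phase: fold each feedback correction into the feedforward command.

    Assumption: a scalar articulator whose sensory outcome equals the motor command.
    """
    feedforward = 0.0            # poorly tuned at the start of the imitation phase
    errors = []
    for _ in range(n_trials):
        produced = feedforward              # attempt driven by the feedforward command
        error = target - produced           # sensory error against the learned target
        feedback = fb_gain * error          # corrective feedback motor command
        feedforward += rate * feedback      # correction is added to the feedforward weights
        errors.append(abs(error))
    return feedforward, errors


feedforward, errors = imitation_trials(target=1.0)
print(errors[0], errors[-1])   # the error shrinks from trial to trial
```

In this toy setting the sensory error decays geometrically, so later attempts depend less and less on the feedback correction, which mirrors the qualitative behaviour described in the text.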
3. The Proposed Method
Based on the cerebellar activity observed in neuroimaging studies of motor learning, Guenther placed a cerebellum control module in the projections from the speech sound map to the articulator velocity and position maps in the feedforward control system of the DIVA model (the cerebellum module highlighted by a red outline in Figure 1), in order to learn and maintain feedforward motor commands. On the basis of anatomical structure and observed neurophysiology, several functional roles have been hypothesized for the cerebellum, including tonic reinforcement, timing of behavior, command-feedback comparison, combining and coordinating movements, sensory processing and motor learning [8]. The cerebellum control module can therefore be applied not only to the feedforward control system of the DIVA model but also to its feedback control system.
3.1. Adding the Cerebellum Module to the Projection from Feedback Control Map to Articulator Velocity and Position Maps
Kawato and colleagues have proposed a cerebellar feedback-error learning model [10], shown in Figure 3. The controlled object in Figure 3 is a physical entity that needs to be controlled by the central nervous system (CNS), such as the eyes, hands or legs. It can be regarded as a cascade of transformations, first between the motor command and the linkage motion, and then between this linkage motion and the motion of the controlled object. The inverse model is regarded as a neural representation of the transformation from the desired movement trajectory of the controlled object to the corresponding motor commands. The feedback controller converts the trajectory error into a corrective feedback motor command, which is used as a teaching signal to train the inverse model. Because the transfer characteristics of the inverse model are the inverse of those of the controlled object, the cascade of the two systems approximates an identity function. That is, if a desired trajectory is given to the inverse model, the actual trajectory at the end of the cascade will be fairly close to the desired trajectory. An accurate inverse model can thus be used as an ideal feedforward controller, and its output signal is called the feedforward motor command.
Figure 3. The Cerebellar Feedback-error Learning Model
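The feedback-error learning scheme of Figure 3 can be sketched for a deliberately simple case. The following Python example assumes a static, linear controlled object `y = plant_gain * u` and an inverse model consisting of a single weight `w`; these choices, the parameter names and the sample targets are hypothetical and serve only to show how the feedback command acts as the teaching signal.

```python
def train_inverse_model(plant_gain=2.0, fb_gain=0.5, rate=0.2, epochs=50):
    """Feedback-error learning for a static plant y = plant_gain * u.

    The inverse model is a single weight w (u_ff = w * y_desired); the
    feedback command is used as its teaching signal.
    """
    w = 0.0
    targets = [0.5, 1.0, -1.0]           # desired trajectory samples (hypothetical)
    for _ in range(epochs):
        for y_desired in targets:
            u_ff = w * y_desired                     # feedforward command from the inverse model
            y_actual = plant_gain * u_ff             # response of the controlled object
            u_fb = fb_gain * (y_desired - y_actual)  # feedback controller output
            w += rate * u_fb * y_desired             # teaching signal trains the inverse model
    return w


w = train_inverse_model()
print(w)   # approaches 1 / plant_gain, so inverse model and plant cascade to near identity
```

As the weight approaches the reciprocal of the plant gain, the feedback command goes to zero and the feedforward command alone reproduces the desired trajectory, which is the property the text attributes to an accurate inverse model.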
The cerebellar feedback-error learning model described above can also be applied to speech acquisition and production based on the DIVA model (see Figure 4). The desired sensory targets in the auditory and somatosensory target maps are compared with the current auditory and somatosensory states, which yields an error signal. The error signal is then mapped into an appropriate corrective motor command via the feedback control map. This feedback motor command serves two purposes. On the one hand, it is used as a teaching signal, with the desired sensory target trajectory as a contextual signal, to train the inverse model of the cerebellum, so that the model learns to produce the corresponding feedforward motor command. On the other hand, it is integrated and combined with the feedforward motor command in the articulator velocity and position maps to control the muscles of the face and vocal tract and produce the speech sound. The actual auditory and somatosensory states of the current speech sound are then fed into the next cycle. The cerebellar feedback-error learning model can therefore be added to the DIVA model: the cerebellum receives the corrective feedback motor command and uses it as a teaching signal to train the inverse model to produce the feedforward motor command. It follows that the cerebellum is not only involved in learning and maintaining feedforward motor commands but also receives the corrective feedback motor command as a teaching signal.
Figure 4. The Cerebellar Feedback-error Learning Model Based on DIVA Model
In addition, neuroimaging studies of motor learning have noted cerebellar activity associated with the size or frequency of sensory error. It is hypothesized that the cerebellum contributes to the feedback motor command, and that a representation of sensory errors in the cerebellum drives corrective motor commands and contributes to feedback-based motor learning. This functional role of the cerebellum agrees with that of the projection from the feedback control map to the articulator velocity and position maps. The current auditory and somatosensory states, which are available through sensory feedback, are compared with the targets in the higher-order auditory and somatosensory cortices. If the current sensory state falls outside the target region, an error signal arises. These error signals are transmitted to the feedback control map and then mapped into appropriate corrective motor commands via learned projections from the sensory error cells to the motor cortex. This mapping from the desired sensory outcome to the appropriate motor action is an inverse kinematic transformation, which is the functional role of the inverse model of the cerebellum. We therefore add the cerebellum module to the projection from the feedback control map to the articulator velocity and position maps. The added cerebellum module is highlighted by a green outline in Figure 5.
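The rule that an error signal arises only when the sensory state leaves the target region can be written down compactly. The Python sketch below is an illustration, not the DIVA code: it assumes a one-dimensional sensory variable, a target region given by lower and upper bounds at each time step, and hypothetical formant values in Hz.

```python
def region_error(state, lower, upper):
    """Error between a sensory state and a target region [lower, upper].

    Returns 0.0 inside the region, otherwise the signed distance to the
    nearest boundary (positive when the state is too low).
    """
    if state < lower:
        return lower - state
    if state > upper:
        return upper - state
    return 0.0


def trajectory_errors(states, regions):
    """Apply region_error at each time step of a time-varying target region."""
    return [region_error(s, lo, hi) for s, (lo, hi) in zip(states, regions)]


# Hypothetical second-formant values in Hz, for illustration only.
states = [1300.0, 1500.0, 1700.0]
regions = [(1400.0, 1600.0)] * 3
print(trajectory_errors(states, regions))   # [100.0, 0.0, -100.0]
```

Because the error is zero anywhere inside the region, the feedback controller in such a scheme issues no correction for productions that are acceptable variants of the target.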
3.2. Adding the Cerebellum Modules to the Projections from Speech Sound Map to Auditory and Somatosensory Target Maps
The forward model provides the crucial state estimates that predict the outcome of a motor action. For example, in visually guided tracking tasks, the subject tries to control the position of his or her hand using visual information about the target and the hand. This information is delayed by visual processing and does not directly inform the CNS about the changes in muscle forces, or even joint angles, needed to correct a motor error. A forward model, however, can provide the missing feedback information and solve the problem.
In the DIVA model, the projections from the speech sound map to the sensory target maps predict the sound of the speaker's own voice while producing the sound. The prediction is based on auditory examples from other speakers producing the sound, as well as on the speaker's own previous correct productions. The cerebellum uses sensory error to build forward models that generate sensory predictions. The cerebellum therefore contributes to the attenuation of the sensory target representation in sensory cortex.
Figure 5. The DIVA Model that Adds the Cerebellum Control Modules
Moreover, the DIVA model contains not only intrinsic cortico-cortical delays but also learned or otherwise necessary timing delays. For example, the delays between premotor cortex and motor cortex are set so that the learning signals arrive at the motor cortex at the same time as the corresponding corrective feedback command signal, so that the correct portion of the feedforward command is adapted. The delays between premotor cortex and the auditory and somatosensory areas are likewise set so that the auditory and somatosensory expectation signals arrive at the error maps at the same time as the corresponding auditory and somatosensory state signals, so that the error signals are computed correctly. Some studies have proposed that the cerebellum can act as a delay model that queues sensory predictions so that they match the actual sensory feedback. In this view the cerebellum delays signals for an appropriate duration or triggers the appropriate parts of cerebral cortex at the proper times, and may also act as a locus for learning feedback commands. Hence, the cerebellum is a likely contributor to the timing of delays.
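The need to queue a prediction until the matching feedback arrives can be demonstrated with a simple delay line. The following Python sketch is only an illustration under stated assumptions: delays are whole numbers of discrete time steps, the prediction is perfect, and the names `DelayLine` and `max_mismatch` are hypothetical.

```python
import math
from collections import deque


class DelayLine:
    """Fixed delay of `steps` time steps; outputs `fill` until the line is primed."""

    def __init__(self, steps, fill=0.0):
        self.buffer = deque([fill] * steps)

    def step(self, value):
        self.buffer.append(value)
        return self.buffer.popleft()


def max_mismatch(signal, feedback_delay, prediction_delay):
    """Largest gap between the queued prediction and the delayed feedback."""
    feedback = DelayLine(feedback_delay)        # transmission delay of the actual feedback
    prediction = DelayLine(prediction_delay)    # delay applied to the sensory prediction
    return max(abs(prediction.step(s) - feedback.step(s)) for s in signal)


signal = [math.sin(0.3 * t) for t in range(50)]
print(max_mismatch(signal, feedback_delay=5, prediction_delay=5))   # 0.0: signals line up
print(max_mismatch(signal, feedback_delay=5, prediction_delay=0))   # > 0: spurious error
```

When the prediction is not delayed to match the feedback, a nonzero error appears even though the prediction itself is correct, which is why the timing of delays matters for computing error signals.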
On the basis of the cerebellar functional roles described above, we add a cerebellum module to the projection from the speech sound map to the auditory target map, and likewise to the projection to the somatosensory target map. That is, each of these mappings may include a trans-cerebellar contribution in addition to a cortico-cortical contribution. The added cerebellum modules are highlighted by blue outlines in Figure 5.
4. Research Method
4.1. Structure of the Cerebellum Model
The cerebellum model applied to the DIVA model in this paper is configured on the basis of neuroanatomy [13]. Figure 6 shows the structure of the cerebellum model, which consists of 120 granule (Gr) cells, 1 Golgi (Go) cell, 6 basket and stellate (Ba/St) cells, and 1 Purkinje (Pk) cell. The number of cells of each type follows the actual ratio of the cell populations as closely as possible [13]. Mossy fibers (mf) [15] deliver the inputs that carry a desired trajectory to the Go cells and Gr cells. Go cells also receive excitatory input from Gr cells and at the same time inhibit Gr cells, forming a negative feedback loop. The excitatory outputs of Gr cells are also received by Pk cells and Ba/St cells. Ba/St cells in turn inhibit Pk cells, forming a negative feedforward pathway. The Pk cells are the sole output of the cerebellum, and a significant part of this output reaches the cerebral cortex via the thalamus. In addition, a climbing fiber (cf) delivers to each Pk cell another input that carries a control error signal. By adjusting the synaptic efficacies between Gr and Pk cells, the output of a Pk cell can be modified to reduce the error signal. When Gr and cf are both active, the synaptic efficacies decrease, producing long-term depression (LTD), whereas when Gr is active alone the synaptic efficacies increase, producing long-term potentiation (LTP) [16].
The synaptic weights for the connections above were initialized as positive or negative random numbers depending on the type of synapse, that is, excitatory or inhibitory. Only the synapses between the Gr cells and the Pk cell are modifiable.
Figure 6. The Structure of the Cerebellum Model
Each cell type is described as follows, where $Y$ is the output of each cell and $W_{ji}$ is the synaptic weight between a cell $i$ and a cell $j$.
Golgi cell:
$$X_{Go} = \sum_{mf=1}^{6} W_{Go\,mf}\, Y_{mf} + \sum_{Gr=1}^{120} W_{Go\,Gr}\, Y_{Gr} \tag{1}$$
$$Y_{Go} = \frac{2}{1 + e^{-X_{Go}}} - 1 \tag{2}$$
Granule cell:
$$X_{Gr} = \sum_{mf=1}^{6} W_{Gr\,mf}\, Y_{mf} + W_{Gr\,Go}\, Y_{Go} \tag{3}$$
$$Y_{Gr} = \frac{2}{1 + e^{-X_{Gr}}} - 1 \tag{4}$$
Basket cell:
$$X_{Ba} = \sum_{Gr=1}^{20} W_{Ba\,Gr}\, Y_{Gr} \tag{5}$$
$$Y_{Ba} = \frac{2}{1 + e^{-X_{Ba}}} - 1 \tag{6}$$
Purkinje cell:
$$Y_{Pk} = \sum_{Gr=1}^{120} W_{Pk\,Gr}\, Y_{Gr} + \sum_{Ba=1}^{6} W_{Pk\,Ba}\, Y_{Ba} \tag{7}$$
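To make the data flow of Eqs. (1)-(7) concrete, the following Python sketch evaluates one pass through a network with the cell counts given in the text. Several points are assumptions rather than statements of the paper: the squashing function is taken to be the bipolar sigmoid 2/(1+e^{-X}) - 1, inhibition is expressed through negative weights, each Ba/St cell is assumed to read a disjoint block of 20 Gr cells, the weight ranges are arbitrary, and the mutual dependence of Go and Gr is resolved by using the Go output of the previous time step.

```python
import math
import random

N_MF, N_GR, N_BA = 6, 120, 6      # input fibers and cell counts; one Go and one Pk cell
GR_PER_BA = 20                    # granule inputs per basket/stellate cell, as in Eq. (5)


def squash(x):
    """Bipolar sigmoid assumed for Eqs. (2), (4) and (6); output lies in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0


def dot(weights, values):
    return sum(w * v for w, v in zip(weights, values))


def init_weights(rng):
    """Random weights: positive for excitatory synapses, negative for inhibitory ones."""
    exc = lambda: rng.uniform(0.0, 0.1)
    inh = lambda: -rng.uniform(0.0, 0.1)
    return {
        "go_mf": [exc() for _ in range(N_MF)],
        "go_gr": [exc() for _ in range(N_GR)],
        "gr_mf": [[exc() for _ in range(N_MF)] for _ in range(N_GR)],
        "gr_go": [inh() for _ in range(N_GR)],
        "ba_gr": [[exc() for _ in range(GR_PER_BA)] for _ in range(N_BA)],
        "pk_gr": [exc() for _ in range(N_GR)],
        "pk_ba": [inh() for _ in range(N_BA)],
    }


def forward(w, y_mf, y_go_prev=0.0):
    """One pass through the network; the previous Go output breaks the Go-Gr loop."""
    # Eqs. (3)-(4): granule cells driven by mossy fibers and inhibited by the Golgi cell
    y_gr = [squash(dot(w["gr_mf"][g], y_mf) + w["gr_go"][g] * y_go_prev)
            for g in range(N_GR)]
    # Eqs. (1)-(2): Golgi cell driven by mossy fibers and granule cells
    y_go = squash(dot(w["go_mf"], y_mf) + dot(w["go_gr"], y_gr))
    # Eqs. (5)-(6): each basket/stellate cell sums over its block of granule cells
    y_ba = [squash(dot(w["ba_gr"][b], y_gr[b * GR_PER_BA:(b + 1) * GR_PER_BA]))
            for b in range(N_BA)]
    # Eq. (7): Purkinje output, excited by Gr cells and inhibited by Ba/St cells
    y_pk = dot(w["pk_gr"], y_gr) + dot(w["pk_ba"], y_ba)
    return y_gr, y_go, y_ba, y_pk


weights = init_weights(random.Random(0))
y_gr, y_go, y_ba, y_pk = forward(weights, y_mf=[0.5] * N_MF)
print(len(y_gr), len(y_ba), y_pk)
```

Calling `forward` repeatedly and passing the returned `y_go` back in as `y_go_prev` gives one simple way to iterate the negative feedback loop between the Go and Gr cells over time.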
4.2. Learning Algorithm
This paper employs the feedback-error learning scheme [17] as the learning algorithm of the cerebellum model. The synaptic efficacies between the Gr cells and the Pk cell are modified to minimize the output of the feedback controller according to the following equations, where $W_{PkGr}(t)$ is the synaptic weight between a Gr cell and the Pk cell at time $t$, $\varepsilon$ is a learning rate, and $E_{cf}$ is the activity of the cf input.
$$\Delta W_{PkGr}(t) = -\varepsilon\, Y_{Gr}\, E_{cf} \tag{8}$$
$$W_{PkGr}(t+1) = W_{PkGr}(t) + \Delta W_{PkGr}(t) \tag{9}$$
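The update of Eqs. (8)-(9) can be tried out on a toy problem. The Python sketch below is an illustration under explicit assumptions: the weight change is taken to be negative when Gr and cf are active together (consistent with the LTD described in Section 4.1), the climbing-fiber activity is modelled as the signed difference between the Pk output and a desired value, the Pk output is computed from Gr inputs only, and the function names, the learning rate and the three sample Gr activities are hypothetical.

```python
def update_pk_gr(w_pk_gr, y_gr, e_cf, rate=0.1):
    """One step of Eqs. (8)-(9): W(t+1) = W(t) + dW(t), dW(t) = -rate * Y_Gr * E_cf."""
    return [w - rate * y * e_cf for w, y in zip(w_pk_gr, y_gr)]


def train(y_gr, desired, steps=100):
    """Toy loop: E_cf is taken to be the signed error of the Pk output (assumption)."""
    w = [0.0] * len(y_gr)
    errors = []
    for _ in range(steps):
        y_pk = sum(wi * yi for wi, yi in zip(w, y_gr))   # Pk output from Gr input only
        e_cf = y_pk - desired                            # climbing-fiber error signal
        errors.append(abs(e_cf))
        w = update_pk_gr(w, y_gr, e_cf)
    return w, errors


w, errors = train(y_gr=[0.5, -0.2, 0.8], desired=1.0)
print(errors[0], errors[-1])   # the error magnitude decays as the Gr-Pk weights adapt
```

With this sign convention the rule is a gradient step on the squared error, so the climbing-fiber signal shrinks as the modifiable Gr-Pk synapses adapt while all other weights stay fixed.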
5. Results and Analysis
In the simulation experiments, we start from the feedback-based DIVA model and add the corresponding cerebellum modules to the subsystems step by step. At each step we compare the formant frequencies and articulator positions of the DIVA model before and after the modules are added, for production of the utterance /adi/.
Figure 7 shows the formant frequencies and articulator positions of the feedback-based DIVA model without any cerebellar module when producing the utterance /adi/. Figure 7 (a), (b) shows that the formants of the target trajectories fall outside the expected region and that large errors appear in the second and third formant frequencies. In addition, comparing the articulator positions of the motor commands with those of the feedback commands in Figure 7 (c), (d) shows that the two are almost identical. That is, the motor commands that control the articulator positions depend entirely on the feedback commands, and the feedforward commands cannot be learned.
First, we add the cerebellum module to the projection from the speech sound map to the articulator velocity and position maps in the feedforward system. The simulation result is shown in Figure 8. Figure 8 (a), (b) indicates that the formants of the target trajectories fall within the expected region and that the errors of the second and third formant frequencies are reduced substantially. Figure 8 (c), (d), (e) shows that the articulator positions of the motor commands and of the feedforward commands are almost the same. That is, the feedforward commands are integrated and combined with the feedback commands to control the articulators in every attempt to produce the speech sound. With practice, the feedforward commands become able to produce the speech sound with minimal sensory error. Therefore, unless unexpected sensory feedback is encountered, production of the speech sound relies little on the feedback control system. Adding the cerebellar module to the feedforward system thus plays a crucial role in learning the feedforward command. We predict that people with cerebellar damage may have difficulty learning new sounds.
Figure 7. The Formant Frequencies and Articulator Positions of Feedback-based DIVA Model that Contains None of Cerebellum Modules when Producing the Utterance /adi/: (a) target trajectories; (b) target errors; (c) motor commands; (d) feedback commands
Figure 8. The Formant Frequencies and Articulator Positions of DIVA Model that is Added Cerebellum Module in Feedforward System when Producing the Utterance /adi/: (a) target trajectories; (b) target errors; (c) motor commands; (d) feedforward commands; (e) feedback commands
Next, starting from the model above, we add the cerebellum modules to the projections from the speech sound map to the auditory and somatosensory target maps, and to the projection from the feedback control map to the articulator velocity and position maps in the feedback system. Figure 9 shows the experimental result. Figure 9 (a), (b) indicates that the formants of the target trajectories fluctuate within the expected region more smoothly and steadily, with smaller, negligible errors. The articulator positions of the motor commands in Figure 9 (c), (d), (e) are more clearly defined and fluctuate less than those in Figure 8. Moreover, adding the cerebellar modules reduces the number of attempts needed to produce the speech sound and accelerates the learning of the feedforward commands.
Figure 9. The Formant Frequencies and Articulator Positions of DIVA Model that is Added Cerebellum Module in Feedback System when Producing the Utterance /adi/: (a) target trajectories; (b) target errors; (c) motor commands; (d) feedforward commands; (e) feedback commands
6. Conclusion
To improve the mechanisms of speech acquisition and production based on the DIVA model, and to give robots a more human-like pronunciation system by using the improved model, we explored the cerebellar contributions to the DIVA model, namely feedforward learning, sensory prediction, feedback command production and the timing of delays. We then constructed a cerebellum model that is closer to neuroanatomy and added it to the current DIVA model. Simulation results show that the cerebellum model established in this paper can be applied to speech acquisition and production based on the DIVA model. However, several issues concerning the learning process remain to be resolved. First, it remains to be determined how much of the learning of the feedforward command is transferred from the cerebellum to premotor cortex. Second, if the cerebellar circuit is necessary for learning the feedforward command, the model predicts that people with cerebellar damage may have difficulty learning new sounds.
Acknowledgements
The research work was supported by National Natural Science Foundation of China under Grant No. 61073115 and No. 61373065.
References
[1] Guenther FH. A neural network model of speech acquisition and motor equivalent speech production. Biological Cybernetics. 1994; 72: 43-53.
[2] Guenther FH. Speech sound acquisition, coarticulation, and rate effects in a neural network model of speech production. Psychological Review. 1995; 102: 594-621.
[3] Guenther FH. A theoretical framework for speech acquisition and production. In Proceedings of the Second International Conference on Cognitive and Neural Systems. Boston: Boston University Center for Adaptive Systems. 1998.
[4] Guenther FH, Ghosh SS, Nieto-Castanon A. A neural model of speech production. In Proceedings of the 6th International Seminar on Speech Production, Sydney, Australia. 2003; 85-90.
[5] Ghosh SS. Understanding cortical contributions to speech production through modeling and functional imaging. Doctoral dissertation: Boston University Thesis, 2005.
[6] Guenther FH, Ghosh SS, Tourville JA. Neural modeling and imaging of the cortical interactions underlying syllable production. Brain and Language. 2006; 96(3): 280-301.
[7] Tourville JA, Guenther FH. The DIVA model: A neural theory of speech acquisition and production. Language and Cognitive Processes. 2011; 26(7): 952-981.
[8] Allen GI, Tsukahara N. Cerebrocerebellar communication systems. Physiological Reviews. 1974; 54(4): 957-1006.
[9] Bastian AJ, Thach WT. Structure and function of the cerebellum. In Manto M, Pandolfo M, editors, The cerebellum and its disorders. Cambridge University Press. 2001: 49-68.
[10] Wolpert D, Miall C, Kawato M. Internal models in the cerebellum. Trends in Cognitive Sciences. 1998; 2(9): 338-347.
[11] Attwell PJ, Ivarsson M, Millar L, Yeo CH. Cerebellar mechanisms in eyeblink conditioning. Annals of the New York Academy of Sciences. 2002; 978: 79-92.
[12] Bastian AJ, Thach WT. Structure and function of the cerebellum. In Manto M, Pandolfo M, editors, The cerebellum and its disorders. Cambridge University Press. 2001.
[13] Cajal S. New ideas on the structure of the nervous system in man and vertebrates. MIT Press. 1990.
[14] Barlow JS. The cerebellum and adaptive control. Cambridge University Press. 2002.
[15] Shinoda Y, Sugiuchi Y, Futami T, Izawa R. Axon collaterals of mossy fibers from the pontine nucleus in the cerebellar dentate nucleus. Journal of Neurophysiology. 1992; 67(3): 547-560.
[16] Ito M. Cerebellar control of the vestibulo-ocular reflex-around the flocculus hypothesis. Ann. Rev. Neurosci. 1982; 5: 275-296.
[17] Gomi H, Kawato M. Adaptive feedback control models of the vestibulocerebellum and spinocerebellum. Biological Cybernetics. 1992; 68: 105-114.