I
nte
rna
t
io
na
l J
o
urna
l o
f
E
lect
rica
l a
nd
Co
m
pu
t
er
E
ng
ineering
(
I
J
E
CE
)
Vo
l.
1
6
,
No
.
1
,
Feb
r
u
ar
y
20
2
6
,
p
p
.
278
~
287
I
SS
N:
2088
-
8
7
0
8
,
DOI
: 1
0
.
1
1
5
9
1
/ijece.
v
1
6
i
1
.
pp
2
7
8
-
2
8
7
278
J
o
ur
na
l ho
m
ep
a
g
e
:
h
ttp
:
//ij
ec
e.
ia
esco
r
e.
co
m
Para
meter
-
eff
i
cie
nt
fi
ne
-
tuning o
f
s
ma
ll lang
ua
g
e mo
dels for
co
de genera
tion:
a
com
pa
ra
tive stu
dy
of G
e
mm
a
,
Q
wen 2.5
and
Lla
ma
3.2
Va
n
-
Viet
Ng
uy
en
1
,
T
he
-
Vin
h Ng
uy
en
1
,
H
uu
-
K
ha
nh
Ng
u
y
en
2
,
Duc
-
Q
ua
ng
Vu
1
1
F
a
c
u
l
t
y
o
f
I
n
f
o
r
m
a
t
i
o
n
T
e
c
h
n
o
l
o
g
y
,
Th
a
i
N
g
u
y
e
n
U
n
i
v
e
r
s
i
t
y
o
f
I
n
f
o
r
ma
t
i
o
n
a
n
d
C
o
m
mu
n
i
c
a
t
i
o
n
Te
c
h
n
o
l
o
g
y
,
T
h
a
i
N
g
u
y
e
n
,
V
i
e
t
n
am
2
U
n
i
v
e
r
s
i
t
y
o
f
I
n
f
o
r
ma
t
i
o
n
a
n
d
C
o
mm
u
n
i
c
a
t
i
o
n
s
Te
c
h
n
o
l
o
g
y
,
T
h
a
i
N
g
u
y
e
n
U
n
i
v
e
r
si
t
y
,
T
h
a
i
N
g
u
y
e
n
,
V
i
e
t
n
am
Art
icle
I
nfo
AB
S
T
RAC
T
A
r
ticle
his
to
r
y:
R
ec
eiv
ed
Ma
y
1
1
,
2
0
2
5
R
ev
is
ed
Oct
2
,
2
0
2
5
Acc
ep
ted
No
v
2
3
,
2
0
2
5
Larg
e
lan
g
u
a
g
e
m
o
d
e
ls
(LL
M
s)
h
a
v
e
d
e
m
o
n
stra
ted
imp
re
ss
iv
e
c
a
p
a
b
il
it
ies
in
c
o
d
e
g
e
n
e
ra
ti
o
n
;
h
o
we
v
e
r,
th
e
ir
h
ig
h
c
o
m
p
u
tati
o
n
a
l
d
e
m
a
n
d
s
,
p
ri
v
a
c
y
li
m
it
a
ti
o
n
s,
a
n
d
c
h
a
ll
e
n
g
e
s
in
e
d
g
e
d
e
p
l
o
y
m
e
n
t
re
strict
t
h
e
ir
p
ra
c
t
ica
l
u
se
in
d
o
m
a
in
-
sp
e
c
ifi
c
a
p
p
li
c
a
ti
o
n
s.
T
h
is
stu
d
y
e
x
p
lo
re
s
t
h
e
e
ffe
c
ti
v
e
n
e
ss
o
f
p
a
ra
m
e
ter
e
ffici
e
n
t
fin
e
-
tu
n
in
g
fo
r
sm
a
ll
lan
g
u
a
g
e
m
o
d
e
ls
(S
L
M
s)
with
fe
we
r
th
a
n
3
b
il
l
io
n
p
a
ra
m
e
ters
.
We
a
d
o
p
t
a
h
y
b
ri
d
a
p
p
ro
a
c
h
th
a
t
c
o
m
b
in
e
s
lo
w
-
ra
n
k
a
d
a
p
tati
o
n
(L
o
RA)
a
n
d
4
-
b
it
q
u
a
n
ti
z
a
ti
o
n
(QLo
RA)
to
re
d
u
c
e
fi
n
e
-
tu
n
i
n
g
c
o
sts
wh
il
e
p
re
se
rv
in
g
s
e
m
a
n
ti
c
c
o
n
siste
n
c
y
.
E
x
p
e
r
ime
n
ts
o
n
t
h
e
Co
d
e
Al
p
a
c
a
-
2
0
k
d
a
tas
e
t
re
v
e
a
l
t
h
a
t
S
LM
s
fi
n
e
-
t
u
n
e
d
w
it
h
t
h
is
m
e
t
h
o
d
o
u
t
p
e
rf
o
rm
lar
g
e
r
b
a
se
li
n
e
m
o
d
e
ls
,
i
n
c
l
u
d
i
n
g
P
h
i
-
3
M
in
i
4
K
b
a
se
,
i
n
ROU
G
E
-
L.
N
o
ta
b
l
y
,
a
p
p
l
y
i
n
g
o
u
r
a
p
p
r
o
a
c
h
t
o
t
h
e
L
La
M
A
3
3
B
a
n
d
Qw
e
n
2
.
5
3
B
m
o
d
e
ls
y
iel
d
e
d
p
e
r
fo
rm
a
n
c
e
im
p
r
o
v
e
m
e
n
ts
o
f
5
4
%
a
n
d
5
5
%
,
re
s
p
e
c
ti
v
e
l
y
,
o
v
e
r
u
n
t
u
n
e
d
c
o
u
n
t
e
rp
a
r
ts
.
We
e
v
a
l
u
a
te
m
o
d
e
ls
d
e
v
e
l
o
p
e
d
b
y
m
a
j
o
r
a
r
ti
fic
ia
l
i
n
te
ll
i
g
e
n
c
e
(AI
)
p
r
o
v
i
d
e
rs
G
o
o
g
le
(G
e
m
m
a
2
B),
M
e
ta
(L
LaM
A
3
1
B
/3
B)
,
a
n
d
A
li
b
a
b
a
(Q
we
n
2
.
5
1
.
5
B/
3
B
)
a
n
d
sh
o
w
t
h
a
t
p
a
r
a
m
e
t
e
r
-
e
ff
ic
ie
n
t
fi
n
e
-
t
u
n
i
n
g
e
n
a
b
les
t
h
e
m
t
o
se
r
v
e
a
s
c
o
s
t
-
e
ffe
c
t
iv
e
,
h
i
g
h
-
p
e
r
f
o
rm
i
n
g
a
l
ter
n
a
t
iv
e
s
t
o
l
a
r
g
e
r
LL
M
s
.
T
h
e
s
e
fi
n
d
i
n
g
s
h
i
g
h
l
i
g
h
t
t
h
e
p
o
te
n
ti
a
l
o
f
S
L
M
s
a
s sc
a
l
a
b
le
s
o
l
u
t
i
o
n
s
f
o
r
d
o
m
a
i
n
-
s
p
e
c
if
ic
s
o
f
twa
re
e
n
g
i
n
e
e
ri
n
g
ta
s
k
s,
su
p
p
o
r
ti
n
g
b
r
o
a
d
e
r
a
d
o
p
t
io
n
a
n
d
d
e
m
o
c
ra
t
iza
ti
o
n
o
f
n
e
u
ra
l
c
o
d
e
s
y
n
t
h
e
s
is
.
K
ey
w
o
r
d
s
:
Fin
e
-
tu
n
in
g
SLM
C
o
d
e
Sm
all
d
ev
ice
Sm
all
lan
g
u
ag
e
m
o
d
els
So
f
twar
e
en
g
in
ee
r
i
n
g
T
h
is i
s
a
n
o
p
e
n
a
c
c
e
ss
a
rticle
u
n
d
e
r th
e
CC B
Y
-
SA
li
c
e
n
se
.
C
o
r
r
e
s
p
o
nd
ing
A
uth
o
r
:
Van
-
Viet
Ng
u
y
en
Facu
lty
o
f
I
n
f
o
r
m
atio
n
T
ec
h
n
o
lo
g
y
,
T
h
ai
Ng
u
y
en
Un
i
v
er
s
ity
o
f
I
n
f
o
r
m
atio
n
a
n
d
C
o
m
m
u
n
icatio
n
T
ec
h
n
o
l
o
g
y
Z
1
1
5
R
o
ad
,
T
h
ai
Ng
u
y
en
2
5
0
0
0
0
,
Viet
n
am
E
m
ail:
n
v
v
iet@
ictu
.
ed
u
.
v
n
1.
I
NT
RO
D
UCT
I
O
N
I
n
th
e
er
a
o
f
al
g
o
r
ith
m
ic
p
r
o
li
f
er
atio
n
an
d
in
cr
ea
s
in
g
l
y
co
m
p
lex
s
o
f
twar
e
ec
o
s
y
s
tem
s
,
th
e
s
y
n
th
esis
o
f
s
o
u
r
ce
co
d
e
v
ia
n
atu
r
al
lan
g
u
ag
e
in
te
r
f
ac
es
h
as
em
er
g
e
d
as
a
cr
itical
ax
is
o
f
r
esear
ch
a
t
th
e
co
n
f
l
u
en
ce
o
f
f
o
r
m
al
lan
g
u
ag
e
th
e
o
r
y
,
n
eu
r
al
r
ep
r
esen
tatio
n
lear
n
in
g
,
a
n
d
au
to
m
ated
r
ea
s
o
n
in
g
[
1
]
,
[
2
]
.
T
h
is
co
n
v
er
g
en
ce
h
as
r
ev
italized
l
o
n
g
s
tan
d
in
g
q
u
esti
o
n
s
i
n
co
m
p
u
tab
ilit
y
,
ex
p
r
ess
iv
ity
,
an
d
s
y
n
tactic
alig
n
m
en
t
b
etwe
en
h
u
m
an
an
d
m
ac
h
in
e
r
e
p
r
ese
n
tatio
n
s
o
f
in
ten
t
[
3
]
.
T
r
ad
itio
n
al
m
o
d
els
o
f
p
r
o
g
r
a
m
s
y
n
th
esis
,
ce
n
ter
ed
o
n
f
o
r
m
al
g
r
am
m
ar
s
[
4
]
,
s
o
f
twar
e
en
g
in
ee
r
in
g
p
r
in
cip
les
[
5
]
,
d
ed
u
ctiv
e
s
y
n
th
esis
,
o
r
en
u
m
er
ativ
e
s
ea
r
ch
,
h
av
e
p
r
o
v
e
n
in
s
u
f
f
icien
tly
s
ca
lab
l
e
wh
en
c
o
n
f
r
o
n
ted
with
th
e
am
b
ig
u
ity
an
d
h
ig
h
d
im
e
n
s
io
n
ality
o
f
n
atu
r
al
lan
g
u
ag
e
[
6
]
.
T
h
e
ad
v
en
t
o
f
lar
g
e
-
s
ca
le
tr
a
n
s
f
o
r
m
er
-
b
ased
lan
g
u
ag
e
m
o
d
els
(
L
L
Ms)
[
7
]
,
s
u
ch
as
m
o
d
els
in
th
e
GPT
f
am
ily
[
8
]
,
T
5
[
9
]
,
an
d
co
d
e
-
s
p
ec
if
ic
m
o
d
els
lik
e
C
o
d
eT
5
[
1
0
]
an
d
Star
C
o
d
e
r
[
1
1
]
,
h
as
r
ed
ef
in
ed
th
e
p
ar
ad
ig
m
.
B
y
e
m
b
ed
d
in
g
s
y
m
b
o
lic
s
tr
u
ctu
r
es
in
to
co
n
tin
u
o
u
s
v
e
cto
r
s
p
ac
es
am
en
ab
le
to
g
r
ad
ien
t
-
b
ased
Evaluation Warning : The document was created with Spire.PDF for Python.
I
n
t J E
lec
&
C
o
m
p
E
n
g
I
SS
N:
2088
-
8
7
0
8
P
a
r
a
mete
r
-
efficien
t fin
e
-
tu
n
in
g
o
f sma
ll la
n
g
u
a
g
e
m
o
d
els fo
r
…
(
V
a
n
-
V
iet
N
g
u
ye
n
)
279
o
p
tim
izatio
n
,
th
ese
m
o
d
els
h
av
e
ac
h
iev
ed
r
e
m
ar
k
a
b
le
ef
f
i
ca
cy
.
Ho
wev
er
,
th
e
y
ar
e
o
f
te
n
ch
ar
ac
ter
ized
b
y
p
r
o
h
ib
itiv
e
p
ar
am
eter
izatio
n
(
ex
ce
ed
in
g
te
n
s
o
r
h
u
n
d
r
ed
s
o
f
b
illi
o
n
s
o
f
weig
h
ts
)
,
i
n
tr
o
d
u
cin
g
ch
allen
g
es
n
o
t
o
n
ly
in
ter
m
s
o
f
co
m
p
u
tatio
n
al
tr
ac
tab
ilit
y
an
d
ca
r
b
o
n
f
o
o
tp
r
i
n
t
[
1
2
]
b
u
t
also
ep
is
tem
o
lo
g
ically
b
y
o
b
f
u
s
ca
tin
g
t
h
e
in
ter
p
r
etab
ilit
y
an
d
v
er
if
iab
ilit
y
o
f
g
en
e
r
ate
d
co
d
e
a
r
tifa
cts.
T
o
ad
d
r
ess
th
ese
lim
itatio
n
s
,
s
m
all
lan
g
u
ag
e
m
o
d
els
(
SLM
s
)
[
1
3
]
–
[
1
5
]
,
ty
p
ically
co
n
s
tr
ain
ed
to
s
u
b
-
7
B
p
ar
am
eter
r
e
g
im
es,
h
av
e
em
er
g
ed
as
p
r
o
m
is
in
g
alter
n
ativ
es.
W
h
ile
s
m
aller
,
SLM
s
o
f
f
er
p
o
ten
tial
f
o
r
ef
f
icien
t
d
ep
lo
y
m
en
t
o
n
e
d
g
e
d
ev
ices
an
d
with
in
co
n
s
tr
ain
ed
in
f
er
en
ce
en
v
ir
o
n
m
en
ts
.
Ho
wev
er
,
lev
er
a
g
in
g
th
eir
f
u
ll
ca
p
a
b
ilit
y
f
o
r
d
o
m
ai
n
-
s
p
ec
if
ic
task
s
lik
e
co
d
e
g
en
er
atio
n
r
eq
u
ir
es
ef
f
ec
tiv
e
ad
a
p
tatio
n
.
T
h
is
s
tu
d
y
in
v
esti
g
ates
th
e
ef
f
ec
tiv
en
ess
o
f
ap
p
l
y
in
g
p
a
r
am
eter
-
ef
f
icien
t
f
in
e
-
tu
n
i
n
g
(
PEFT
)
tech
n
iq
u
es,
s
p
ec
if
ically
lo
w
-
r
an
k
a
d
ap
tatio
n
(
L
o
R
A)
[
1
6
]
a
n
d
q
u
an
tized
l
o
w
-
r
an
k
ad
ap
tatio
n
(
QL
OR
A)
[
1
7
]
,
to
p
r
o
m
in
e
n
t
ex
is
tin
g
SLM
s
f
o
r
co
d
e
g
en
er
atio
n
.
Ou
r
co
r
e
h
y
p
o
th
esis
is
th
at
w
ith
ef
f
icien
t
f
in
e
-
tu
n
in
g
o
n
a
d
o
m
ain
-
s
p
ec
if
ic
d
ataset,
th
ese
co
m
p
ac
t
m
o
d
els
ca
n
ac
h
iev
e
p
er
f
o
r
m
a
n
ce
co
m
p
a
r
ab
le
to
,
o
r
ev
e
n
s
u
r
p
ass
in
g
,
lar
g
er
b
aselin
e
m
o
d
els,
wh
ile
r
eq
u
ir
in
g
s
ig
n
if
ican
tly
f
ewe
r
co
m
p
u
t
atio
n
al
r
eso
u
r
ce
s
f
o
r
tr
ain
i
n
g
an
d
d
ep
lo
y
m
en
t.
W
e
co
n
d
u
ct
em
p
ir
ical
in
v
esti
g
atio
n
s
o
n
well
-
k
n
o
wn
SLM
s
in
clu
d
in
g
L
L
aM
A
3
(
1
B
an
d
3
B
v
ar
ian
ts
)
[
1
8
]
,
Ge
m
m
a
2
B
[
1
9
]
,
an
d
Qwe
n
2
.
5
(
1
.
5
B
an
d
3
B
v
ar
ian
ts
)
[
2
0
]
.
W
e
f
in
e
-
tu
n
e
th
ese
m
o
d
els
u
s
in
g
L
o
R
A/QL
OR
A
o
n
th
e
C
o
d
eAlp
ac
a
-
2
0
k
d
ataset
[
2
1
]
,
a
s
tr
u
ctu
r
ed
co
r
p
u
s
d
esig
n
ed
f
o
r
in
s
tr
u
ctio
n
-
b
ased
co
d
e
g
e
n
er
at
io
n
.
W
e
ev
alu
ate
p
er
f
o
r
m
an
ce
u
s
in
g
th
e
R
OUGE
-
L
m
etr
ic
an
d
an
aly
ze
t
h
e
ef
f
icien
cy
g
ain
s
in
ter
m
s
o
f
tr
ain
ab
le
p
ar
am
eter
s
.
Ou
r
co
n
tr
ib
u
tio
n
s
ar
e
th
r
ee
f
o
ld
:
i)
an
em
p
ir
ical
d
em
o
n
s
tr
atio
n
o
f
th
e
ef
f
ec
tiv
e
n
ess
o
f
PEFT
(
L
o
R
A/QL
OR
A)
f
o
r
ad
ap
tin
g
ex
is
tin
g
SLM
s
to
co
d
e
g
e
n
er
atio
n
,
ii)
a
c
o
m
p
ar
ativ
e
a
n
aly
s
is
o
f
s
ev
er
al
p
r
o
m
in
e
n
t
SLM
s
u
n
d
er
th
ese
f
in
e
-
tu
n
in
g
r
eg
im
es
o
n
th
e
C
o
d
eAlp
ac
a
-
2
0
k
b
en
ch
m
a
r
k
,
h
ig
h
lig
h
tin
g
th
ei
r
r
elativ
e
s
tr
en
g
th
s
a
n
d
lim
itatio
n
s
,
an
d
iii)
ev
id
en
ce
th
at
ef
f
i
cien
tly
f
in
e
-
t
u
n
ed
s
u
b
-
3
B
m
o
d
els
ca
n
o
u
tp
er
f
o
r
m
lar
g
er
b
aselin
e
m
o
d
els
o
n
th
is
task
,
s
u
g
g
esti
n
g
th
e
im
p
o
r
tan
ce
o
f
ef
f
ec
tiv
e
ad
a
p
tatio
n
o
v
er
b
r
u
te
-
f
o
r
ce
p
ar
am
eter
s
ca
le.
Ultim
ately
,
th
is
wo
r
k
co
n
tr
ib
u
tes
to
f
o
r
m
alizin
g
a
s
ca
lab
le
m
eth
o
d
o
lo
g
y
f
o
r
ef
f
icien
t
n
eu
r
al
co
d
e
g
e
n
er
atio
n
,
s
u
it
ab
le
f
o
r
b
o
th
ac
a
d
em
ic
r
ep
licatio
n
an
d
r
ea
l
-
wo
r
ld
s
o
f
t
war
e
d
ev
elo
p
m
e
n
t
wo
r
k
f
lo
ws.
2.
RE
L
AT
E
D
WO
RK
C
o
d
e
g
en
e
r
atio
n
u
s
in
g
n
atu
r
a
l
lan
g
u
a
g
e
p
r
o
m
p
ts
h
as
e
v
o
lv
ed
s
ig
n
if
ican
tly
with
t
h
e
em
e
r
g
en
ce
o
f
lar
g
e
-
s
ca
le
p
r
etr
ain
ed
m
o
d
els.
E
ar
ly
m
o
d
els
s
u
ch
as
C
o
d
eBER
T
[
2
2
]
an
d
C
o
d
e2
Seq
[
2
3
]
r
elied
h
ea
v
ily
o
n
s
y
n
tactic
f
ea
tu
r
es
an
d
wer
e
li
m
ited
in
th
eir
ab
ilit
y
to
g
en
er
alize
b
ey
o
n
d
p
r
e
d
ef
in
ed
p
atter
n
s
o
r
ab
s
tr
ac
t
s
y
n
tax
tr
ee
s
(
ASTs)
.
T
h
ese
m
o
d
els
o
f
f
er
ed
m
o
d
est
s
u
cc
ess
in
co
d
e
r
etr
iev
al
an
d
class
if
icatio
n
ta
s
k
s
b
u
t
lack
ed
t
h
e
s
em
an
tic
co
m
p
o
s
itio
n
ality
an
d
co
n
tex
tu
al
u
n
d
e
r
s
tan
d
in
g
es
s
en
tial
f
o
r
r
ea
lis
tic
co
d
e
s
y
n
th
esis
.
T
h
e
p
ar
a
d
ig
m
s
h
if
ted
with
th
e
ad
v
en
t
o
f
tr
an
s
f
o
r
m
er
-
b
ased
m
o
d
els
p
r
etr
a
in
ed
o
n
m
ass
iv
e
co
d
e
co
r
p
o
r
a.
Mo
d
els
s
u
ch
as
C
o
d
eT
5
[
1
0
]
,
Po
ly
C
o
d
e
r
,
Ph
i
-
3
,
Ph
i
-
3
Me
ets
L
aw
[
1
5
]
,
[
2
4
]
,
alo
n
g
with
in
s
tr
u
ctio
n
-
t
u
n
ed
v
ar
ian
ts
lik
e
Star
C
o
d
er
[
1
1
]
a
n
d
W
izar
d
C
o
d
er
[
2
5
]
,
in
tr
o
d
u
ce
d
ar
c
h
i
tectu
r
al
an
d
o
b
jectiv
e
-
f
u
n
ctio
n
r
ef
in
em
en
ts
th
at
g
r
ea
tly
en
h
an
ce
d
p
er
f
o
r
m
an
ce
.
Desp
ite
th
eir
im
p
r
o
v
em
en
ts
,
th
ese
m
o
d
els
ty
p
ically
ex
ce
ed
6
b
illi
o
n
p
ar
am
eter
s
,
p
o
s
in
g
s
ig
n
if
ica
n
t
ch
allen
g
es
f
o
r
d
ep
lo
y
m
en
t
o
n
co
n
s
tr
ain
ed
h
a
r
d
wa
r
e
an
d
in
c
r
ea
s
in
g
co
m
p
u
tatio
n
al
c
o
s
ts
.
T
o
ad
d
r
ess
th
ese
lim
itatio
n
s
,
th
e
r
esear
ch
co
m
m
u
n
ity
h
as
p
iv
o
te
d
to
wa
r
d
s
SLM
s
,
wh
i
ch
o
f
f
er
a
m
o
r
e
s
u
s
tain
ab
le
an
d
ac
ce
s
s
ib
le
alter
n
ativ
e
[
1
3
]
,
[
1
4
]
,
[
2
6
]
.
T
h
is
tr
en
d
is
ex
em
p
lifie
d
b
y
m
o
d
els lik
e
Go
o
g
le'
s
Gem
m
a
2
B
[
1
9
]
an
d
Alib
ab
a'
s
Qwe
n
2
.
5
s
er
ies
[
2
0
]
,
wh
ich
lev
er
ag
e
o
p
tim
ized
T
r
an
s
f
o
r
m
er
ar
ch
itectu
r
es
to
d
eliv
er
im
p
r
ess
iv
e
p
er
f
o
r
m
a
n
ce
with
in
a
s
u
b
-
3
B
p
ar
a
m
eter
f
o
o
tp
r
in
t.
T
h
ese
co
m
p
ac
t
m
o
d
els
ar
e
n
o
t
o
n
l
y
m
o
r
e
ef
f
icien
t
b
u
t
also
ex
h
i
b
it
s
tr
o
n
g
ca
p
ab
ilit
ies
in
m
u
ltil
in
g
u
al
an
d
s
p
ec
ialized
task
s
,
m
ak
in
g
th
em
p
r
o
m
is
in
g
ca
n
d
id
ates
f
o
r
d
o
m
ain
-
s
p
ec
if
ic
ap
p
licatio
n
s
lik
e
co
d
e
g
e
n
er
atio
n
.
W
h
ile
th
eir
p
o
ten
tial
is
clea
r
,
th
eir
p
er
f
o
r
m
an
ce
o
n
s
p
ec
ialized
co
d
e
g
en
e
r
atio
n
b
e
n
ch
m
ar
k
s
,
p
ar
ticu
lar
ly
af
ter
tar
g
ete
d
ad
ap
tatio
n
,
r
em
ai
n
s
an
ar
ea
r
e
q
u
ir
in
g
th
o
r
o
u
g
h
in
v
esti
g
atio
n
.
A
k
ey
f
ac
to
r
in
u
n
lo
c
k
in
g
t
h
e
p
o
ten
tial
o
f
th
ese
SLM
s
is
th
e
u
s
e
o
f
s
p
ec
ialized
d
at
asets
an
d
in
s
tr
u
ctio
n
-
tu
n
in
g
.
Data
s
ets
li
k
e
C
o
d
eAlp
ac
a
-
2
0
k
[
2
1
]
,
wh
i
ch
p
r
o
v
id
e
a
co
r
p
u
s
o
f
i
n
s
tr
u
c
tio
n
-
co
d
e
p
air
s
,
ar
e
in
s
tr
u
m
en
tal
in
teac
h
in
g
m
o
d
els
to
m
ap
n
atu
r
al
lan
g
u
a
g
e
in
ten
t
to
s
y
n
tactica
lly
co
r
r
ec
t
an
d
s
em
an
tically
ap
p
r
o
p
r
iate
co
d
e.
B
y
f
in
e
-
tu
n
in
g
o
n
s
u
ch
d
atasets
,
m
o
d
e
ls
lear
n
to
f
o
llo
w
co
m
p
lex
i
n
s
tr
u
ctio
n
s
,
th
er
eb
y
m
o
v
in
g
b
e
y
o
n
d
s
im
p
le
p
atter
n
m
atch
in
g
to
a
m
o
r
e
r
o
b
u
s
t
f
o
r
m
o
f
co
d
e
s
y
n
t
h
esis
.
T
h
i
s
m
eth
o
d
o
l
o
g
y
h
as
b
ec
o
m
e
a
co
r
n
e
r
s
to
n
e
f
o
r
a
d
ap
tin
g
g
e
n
er
al
-
p
u
r
p
o
s
e
lan
g
u
ag
e
m
o
d
els
to
th
e
n
u
an
ce
d
d
o
m
ain
o
f
s
o
f
twar
e
en
g
in
ee
r
in
g
.
Ho
wev
er
,
ev
en
f
o
r
SLM
s
,
f
u
l
l
f
in
e
-
tu
n
in
g
ca
n
b
e
co
m
p
u
tat
io
n
ally
p
r
o
h
ib
itiv
e.
T
h
is
h
as
led
to
th
e
wid
esp
r
ea
d
ad
o
p
tio
n
o
f
p
ar
am
eter
-
ef
f
icien
t
f
in
e
-
t
u
n
in
g
(
PEFT
)
tech
n
iq
u
es.
Me
th
o
d
s
lik
e
lo
w
-
r
an
k
a
d
ap
tatio
n
(
L
o
R
A)
[
2
7
]
,
wh
ich
f
r
ee
ze
s
p
r
etr
ain
ed
m
o
d
el
weig
h
ts
an
d
i
n
jects
tr
ain
ab
le
lo
w
-
r
an
k
m
atr
ices,
an
d
q
u
an
tized
lo
w
-
r
an
k
ad
ap
tatio
n
(
QL
OR
A)
[
1
7
]
,
wh
ic
h
f
u
r
th
er
r
ed
u
ce
s
m
em
o
r
y
u
s
ag
e
b
y
q
u
an
tizin
g
th
e
b
ase
m
o
d
el
to
4
-
b
its
,
h
av
e
b
ec
o
m
e
in
s
tr
u
m
en
tal.
T
h
ese
tech
n
iq
u
es
en
ab
le
th
e
ad
ap
tatio
n
o
f
SLM
s
o
n
co
n
s
u
m
er
-
g
r
a
d
e
Evaluation Warning : The document was created with Spire.PDF for Python.
I
SS
N
:
2
0
8
8
-
8
7
0
8
I
n
t J E
lec
&
C
o
m
p
E
n
g
,
Vo
l.
1
6
,
No
.
1
,
Feb
r
u
ar
y
20
2
6
:
2
7
8
-
287
280
h
ar
d
war
e
w
h
ile
p
r
eser
v
i
n
g
o
r
ev
en
e
n
h
an
ci
n
g
p
er
f
o
r
m
an
ce
o
n
d
o
wn
s
tr
ea
m
task
s
.
B
u
ild
i
n
g
o
n
th
is
lin
e
o
f
r
esear
ch
,
o
u
r
s
tu
d
y
e
v
alu
ates
th
e
ap
p
licatio
n
o
f
L
o
R
A
an
d
QL
OR
A
o
n
v
ar
io
u
s
SLM
s
s
u
ch
as
Gem
m
a
2
B
[
1
9
]
,
Qwe
n
2
.
5
[
2
0
]
,
an
d
s
m
aller
L
L
aM
A
3
v
ar
ia
n
ts
[
1
8
]
,
f
o
cu
s
in
g
o
n
s
tan
d
a
r
d
iz
ed
co
d
e
g
en
er
atio
n
b
en
ch
m
ar
k
s
to
ass
ess
p
er
f
o
r
m
an
ce
an
d
e
f
f
icien
cy
.
W
h
ile
p
r
ev
io
u
s
wo
r
k
s
h
av
e
ex
p
lo
r
ed
f
in
e
-
tu
n
in
g
in
d
i
v
id
u
al
SLM
s
f
o
r
co
d
e
[
1
8
]
,
[
2
0
]
,
a
co
m
p
r
eh
e
n
s
iv
e,
s
id
e
-
by
-
s
id
e
co
m
p
ar
ativ
e
an
aly
s
is
o
f
th
e
lead
in
g
SLM
s
f
r
o
m
d
if
f
er
en
t
m
ajo
r
a
r
tific
ial
in
tellig
en
ce
(
AI
)
p
r
o
v
i
d
er
s
(
Go
o
g
le,
Me
ta,
Alib
ab
a)
u
n
d
e
r
a
u
n
if
ied
PEFT
f
r
am
ewo
r
k
is
s
t
ill
lack
in
g
.
Ou
r
wo
r
k
d
ir
ec
tly
a
d
d
r
ess
es
th
is
g
ap
b
y
s
y
s
tem
atica
lly
ev
al
u
at
in
g
th
e
p
er
f
o
r
m
a
n
ce
o
f
th
ese
m
o
d
els
wh
en
f
in
e
-
tu
n
ed
with
L
o
R
A/QL
OR
A
o
n
th
e
C
o
d
eAlp
ac
a
-
2
0
k
b
en
c
h
m
a
r
k
.
T
h
is
ap
p
r
o
ac
h
allo
ws
f
o
r
a
d
ir
ec
t
co
m
p
ar
is
o
n
o
f
th
eir
in
h
er
en
t
ar
ch
itectu
r
a
l
s
tr
en
g
th
s
an
d
t
h
eir
a
d
ap
tab
ilit
y
to
th
e
co
d
e
g
e
n
er
atio
n
d
o
m
ain
,
p
r
o
v
id
in
g
cr
itical
in
s
ig
h
ts
in
to
th
e
m
o
s
t
ef
f
ec
tiv
e
an
d
ef
f
icien
t p
at
h
way
s
f
o
r
d
e
m
o
cr
atizin
g
n
eu
r
al
c
o
d
e
s
y
n
th
esis
.
3.
M
E
T
H
O
D
I
n
co
n
s
tr
u
ctin
g
a
c
o
m
p
r
e
h
en
s
iv
e
ex
p
er
im
en
tal
f
r
am
ewo
r
k
f
o
r
ev
al
u
atin
g
th
e
e
f
f
icac
y
o
f
s
m
all
-
s
ca
le
tr
an
s
f
o
r
m
er
-
b
ased
ar
c
h
itectu
r
e
s
in
th
e
co
n
tex
t
o
f
s
o
u
r
ce
co
d
e
g
en
er
atio
n
,
o
u
r
m
eth
o
d
o
lo
g
y
is
p
r
ed
icate
d
u
p
o
n
a
m
u
lti
-
tier
ed
ap
p
r
o
ac
h
th
at
in
teg
r
ates:
i)
ar
ch
itectu
r
al
s
elec
tio
n
u
n
d
e
r
p
a
r
am
eter
c
o
n
s
tr
ain
t,
ii)
d
ataset
cu
r
atio
n
a
n
d
task
f
o
r
m
aliza
tio
n
,
iii)
im
p
lem
en
tatio
n
o
f
ad
v
an
ce
d
f
i
n
e
-
tu
n
i
n
g
r
eg
im
es
lev
er
ag
in
g
p
ar
am
eter
-
ef
f
icien
t
ad
a
p
tatio
n
,
a
n
d
iv
)
em
p
ir
ical
v
alid
atio
n
v
ia
s
tan
d
ar
d
ized
lex
ical
s
im
ilar
ity
m
etr
ics.
T
h
is
m
u
ltip
r
o
n
g
ed
s
tr
atag
em
e
n
s
u
r
es
b
o
t
h
m
et
h
o
d
o
lo
g
ical
r
ig
o
r
an
d
r
ep
r
o
d
u
cib
ilit
y
with
in
co
n
s
tr
ain
ed
co
m
p
u
tatio
n
al
t
o
p
o
lo
g
ies.
3
.
1
.
M
o
del selec
t
io
n und
er
p
a
ra
m
e
t
er
ized
co
ns
t
ra
ints
L
et
M
d
en
o
te
th
e
h
y
p
o
th
esis
s
p
ac
e
o
f
au
to
r
e
g
r
ess
iv
e
lan
g
u
a
g
e
m
o
d
els
in
s
tan
tiated
o
v
er
a
p
ar
am
eter
d
o
m
ain
Θ
⊂
R
n
,
w
ith
n
<
3
×
10
9
.
W
e
s
elec
t
f
iv
e
r
ep
r
ese
n
tativ
e
m
o
d
els
M
i
∈
M
,
each
ch
ar
ac
ter
ized
b
y
ar
ch
itectu
r
al
s
p
ar
s
ity
,
m
u
ltil
in
g
u
al
ca
p
ab
ilit
y
,
an
d
d
ec
o
d
er
-
o
n
ly
tr
an
s
f
o
r
m
er
b
a
ck
b
o
n
es:
M
1
:
L
lam
a
-
3
.
2
-
1B
-
I
n
s
tr
u
ct
–
1B
;
M
2
:
L
lam
a
-
3
.
2
-
1B
-
I
n
s
tr
u
ct
–
3
B
;
M
3
:
Gem
m
a
–
2B
;
M
4
:
Qwe
n
2
.
5
–
1
.
5
B
;
M
5
:
Qwe
n
2
.
5
–
3B
.
E
ac
h
M
i
is
in
itialized
with
p
r
etr
ain
ed
weig
h
ts
θ
i
0
an
d
s
u
b
j
ec
ted
to
s
u
b
s
eq
u
en
t
ad
ap
tatio
n
o
n
a
task
s
p
ec
if
ic
d
is
tr
ib
u
tio
n
D
code
.
L
la
m
a
-
3
.
2
-
1B
-
I
n
s
tr
u
c
t
(
M
1
)
i
s
a
1
B
p
ar
am
e
ter
lan
g
u
ag
e
m
o
d
el
f
r
o
m
3
.
2
.
I
t
i
s
d
e
s
ig
n
ed
wi
th
an
o
p
ti
m
i
ze
d
l
ig
h
tw
eig
h
t
ar
ch
i
t
ec
tu
r
e.
T
h
i
s
m
o
d
el
i
s
we
ll
s
u
ite
d
f
o
r
d
ep
lo
y
m
en
t
o
n
p
er
s
o
n
a
l
an
d
m
o
b
il
e
d
ev
i
ce
s
,
en
h
an
cin
g
p
er
f
o
r
m
an
ce
a
cr
o
s
s
v
ar
io
u
s
ap
p
li
ca
tio
n
s
.
L
la
m
a
-
3
.
2
-
3B
-
I
n
s
tr
u
c
t
(
M
2
)
is
a
3
B
-
p
ar
a
m
e
ter
lan
g
u
ag
e
m
o
d
el
f
r
o
m
3
.
2
,
d
es
ig
n
ed
to
s
u
p
p
o
r
t
m
u
lt
ip
le
lan
g
u
ag
es
an
d
o
p
t
im
i
ze
d
f
o
r
ta
s
k
s
s
u
ch
a
s
co
n
v
er
s
at
io
n
,
in
f
o
r
m
a
tio
n
r
etr
iev
al,
an
d
t
ex
t
s
u
m
m
ar
iza
t
io
n
.
T
h
i
s
p
o
w
er
f
u
l
an
d
f
l
ex
ib
l
e
m
o
d
el
i
s
ta
ilo
r
ed
f
o
r
p
er
s
o
n
al
an
d
m
o
b
i
le
d
ev
i
ce
s
,
d
el
iv
e
r
in
g
h
ig
h
e
f
f
i
ci
en
cy
in
m
u
l
t
il
in
g
u
a
l n
a
tu
r
al
lan
g
u
ag
e
p
r
o
ce
s
s
in
g
.
Gem
m
a
-
2
-
2b
-
i
t(
M
3
)
i
s
an
o
p
e
n
-
s
o
u
r
ce
L
L
M
d
ev
e
lo
p
ed
b
y
Go
o
g
l
e
w
i
th
2
b
i
ll
io
n
p
ar
a
m
e
ter
s
.
T
h
i
s
m
o
d
el
i
s
d
e
s
i
g
n
ed
f
o
r
n
at
u
r
al
lan
g
u
ag
e
p
r
o
ce
s
s
in
g
w
h
il
e
b
e
in
g
o
p
t
im
iz
ed
f
o
r
r
eso
u
r
ce
-
co
n
s
tr
a
in
ed
en
v
ir
o
n
m
en
t
s
lik
e
p
er
s
o
n
al
c
o
m
p
u
te
r
s
a
n
d
m
o
b
i
le
d
ev
ic
e
s
.
W
i
th
a
d
e
co
d
er
-
o
n
ly
T
r
an
s
f
o
r
m
er
ar
ch
i
te
ctu
r
e
an
d
e
n
h
an
ce
m
en
t
s
s
u
ch
a
s
s
l
i
d
in
g
wi
n
d
o
w
a
tt
en
t
io
n
an
d
s
o
f
t
ca
p
,
G
em
m
a
-
2
-
2B
-
I
T
o
u
tp
er
f
o
r
m
s
o
th
er
o
p
en
m
o
d
el
s
o
f
s
i
m
i
lar
s
ize
.
Ad
d
it
io
n
a
lly
,
i
t
i
s
b
u
i
lt
f
o
r
s
ea
m
le
s
s
in
teg
r
at
io
n
i
n
to
d
e
v
e
lo
p
er
s
’
an
d
r
e
s
e
ar
ch
er
s
’
wo
r
k
f
lo
w
s
,
s
u
p
p
o
r
t
in
g
p
o
p
u
l
ar
ar
t
if
ic
ia
l
in
t
el
l
ig
en
ce
(
A
I
)
f
r
a
m
e
wo
r
k
s
l
ik
e
Hu
g
g
in
g
Fa
ce
T
r
a
n
s
f
o
r
m
er
s
,
J
A
X,
Py
T
o
r
ch
,
an
d
T
en
s
o
r
Flo
w.
Qw
en
2
.
5
-
1
.
5
B
-
I
n
s
tr
u
c
t
(
M
4
)
i
s
a
lan
g
u
ag
e
m
o
d
el
f
r
o
m
th
e
Qw
en
2
.
5
s
er
ie
s
,
d
ev
elo
p
ed
b
y
th
e
Qw
en
tea
m
.
W
i
th
ap
p
r
o
x
im
ate
ly
1
.
5
4
b
i
l
lio
n
p
a
r
am
et
er
s
,
i
t
i
s
s
p
e
cif
ic
al
ly
o
p
t
im
iz
ed
f
o
r
in
s
t
r
u
c
tio
n
-
f
o
l
lo
win
g
ta
s
k
s
.
T
h
i
s
m
o
d
e
l
s
u
p
p
o
r
ts
m
u
lt
ip
l
e
l
an
g
u
ag
e
s
,
in
clu
d
in
g
Vi
etn
am
e
s
e,
an
d
c
an
p
r
o
c
e
s
s
ex
te
n
d
ed
co
n
te
x
t
s
o
f
u
p
to
1
2
8
,
0
0
0
t
o
k
en
s
,
en
h
an
ci
n
g
it
s
ef
f
ec
t
iv
en
e
s
s
i
n
t
ex
t
g
e
n
er
a
tio
n
,
q
u
e
s
tio
n
an
s
wer
in
g
,
p
r
o
g
r
am
m
in
g
,
an
d
m
ath
em
a
t
ica
l
r
ea
s
o
n
in
g
.
B
u
il
t
o
n
a
T
r
an
s
f
o
r
m
er
ar
ch
i
te
ctu
r
e,
i
t
in
co
r
p
o
r
a
te
s
ad
v
an
c
ed
t
ec
h
n
iq
u
e
s
s
u
ch
a
s
R
o
PE,
Sw
i
GL
U
,
an
d
R
M
SN
o
r
m
to
im
p
r
o
v
e
ef
f
ic
ien
cy
a
n
d
a
cc
u
r
ac
y
.
Qw
en
2
.
5
-
1
.
5
B
-
I
n
s
tr
u
c
t
i
s
r
e
lea
s
ed
u
n
d
er
th
e
Ap
ac
h
e
2
.
0
li
ce
n
s
e,
en
ab
lin
g
f
r
e
e
u
s
ag
e
a
n
d
d
is
tr
i
b
u
tio
n
a
cr
o
s
s
v
ar
io
u
s
ap
p
l
ic
at
io
n
s
.
Qw
en
2
.
5
-
3B
-
I
n
s
tr
u
ct
(
M
5
)
i
s
a
n
ad
v
an
c
ed
lan
g
u
ag
e
m
o
d
el
i
n
th
e
Q
wen
2
.
5
s
e
r
ie
s
,
d
ev
e
lo
p
ed
b
y
th
e
Qw
en
t
ea
m
.
W
ith
ap
p
r
o
x
im
ate
ly
3
.
0
9
b
il
lio
n
p
ar
am
e
ter
s
,
it
i
s
f
in
e
-
tu
n
ed
s
p
e
cif
ic
al
ly
f
o
r
in
s
t
r
u
c
tio
n
-
f
o
llo
w
in
g
ta
s
k
s
.
T
h
is
m
o
d
el
s
u
p
p
o
r
t
s
m
u
l
tip
le
l
an
g
u
ag
e
s
,
in
c
lu
d
in
g
Vi
etn
am
e
s
e
,
an
d
c
an
p
r
o
ce
s
s
co
n
t
ex
t
len
g
th
s
o
f
u
p
to
1
2
8
,
0
0
0
t
o
k
en
s
,
en
h
an
cin
g
it
s
ca
p
ab
i
li
ti
e
s
in
tex
t
g
en
er
at
io
n
,
q
u
es
tio
n
a
n
s
we
r
in
g
,
p
r
o
g
r
am
m
in
g
,
a
n
d
m
at
h
em
at
ica
l
p
r
o
b
lem
s
o
lv
in
g
.
B
u
il
t
o
n
a
T
r
an
s
f
o
r
m
er
ar
ch
i
tec
tu
r
e,
i
t
in
co
r
p
o
r
a
te
s
o
p
ti
m
i
za
tio
n
t
ec
h
n
iq
u
es
s
u
c
h
a
s
R
o
P
E
,
S
wi
GL
U,
an
d
R
MS
No
r
m
to
en
h
an
ce
ef
f
ic
i
en
cy
an
d
a
cc
u
r
ac
y
.
Qw
en
2
.
5
-
3B
-
I
n
s
tr
u
ct
i
s
r
el
ea
s
ed
u
n
d
er
th
e
Qw
en
R
e
s
e
ar
c
h
l
ic
en
s
e,
p
e
r
m
i
t
tin
g
i
t
s
u
s
e
an
d
d
i
s
tr
ib
u
tio
n
in
r
es
ea
r
ch
p
r
o
jec
t
s
.
W
i
th
i
t
s
a
b
il
ity
to
g
en
er
a
te
an
d
u
n
d
er
s
t
an
d
h
i
g
h
-
q
u
a
li
ty
t
ex
t,
th
i
s
m
o
d
el
s
e
r
v
e
s
a
s
a
p
o
wer
f
u
l
to
o
l f
o
r
n
a
tu
r
a
l
lan
g
u
ag
e
p
r
o
ce
s
s
in
g
ap
p
l
ic
at
io
n
s
ac
r
o
s
s
d
iv
er
s
e
l
an
g
u
ag
e
s
an
d
co
n
t
ex
t
s
.
Evaluation Warning : The document was created with Spire.PDF for Python.
I
n
t J E
lec
&
C
o
m
p
E
n
g
I
SS
N:
2088
-
8
7
0
8
P
a
r
a
mete
r
-
efficien
t fin
e
-
tu
n
in
g
o
f sma
ll la
n
g
u
a
g
e
m
o
d
els fo
r
…
(
V
a
n
-
V
iet
N
g
u
ye
n
)
281
3
.
2
.
Da
t
a
s
et
,
f
o
rma
liza
t
io
n a
nd
prepro
ce
s
s
ing
L
ar
g
e
la
n
g
u
a
g
e
m
o
d
els
(
L
L
Ms)
ex
ce
l
at
m
an
y
task
s
th
a
n
k
s
to
h
u
g
e
p
r
etr
ain
e
d
d
atasets
.
Fo
r
o
u
r
ex
p
er
im
en
ts
,
we
u
s
e
th
e
s
ah
il
2
8
0
1
/C
o
d
eAlp
ac
a
-
2
0
k
d
ataset
[
2
1
]
.
Data
s
et
C
o
d
eAlp
ac
a
-
2
0
k
is
a
d
ata
s
et
f
o
r
tr
ain
in
g
an
d
ev
alu
atin
g
n
atu
r
al
lan
g
u
ag
e
p
r
o
ce
s
s
in
g
(
NL
P)
m
o
d
els
in
th
e
p
r
o
g
r
am
m
in
g
f
ield
.
C
o
d
eAlp
ac
a
-
2
0
k
is
a
d
ataset
c
o
n
tain
in
g
a
p
p
r
o
x
im
ately
2
0
,
0
0
0
p
r
o
g
r
am
m
in
g
c
o
d
e
s
am
p
les
an
d
r
elat
ed
co
m
m
en
ts
.
I
t
is
d
esig
n
ed
to
s
u
p
p
o
r
t
r
esear
ch
an
d
d
e
v
elo
p
m
e
n
t
o
f
AI
m
o
d
els
ca
p
ab
le
o
f
u
n
d
er
s
tan
d
i
n
g
an
d
g
en
e
r
atin
g
p
r
o
g
r
a
m
m
in
g
co
d
e.
T
h
e
s
tr
u
ctu
r
e
o
f
th
is
d
ataset
in
clu
d
es
an
in
p
u
t
d
ata
s
et
d
escr
ib
ed
in
n
atu
r
al
lan
g
u
ag
e
ab
o
u
t
th
e
r
eq
u
ir
em
e
n
ts
o
f
th
e
p
r
o
g
r
am
m
in
g
p
r
o
b
lem
;
T
h
e
o
u
t
p
u
t
is
a
p
iece
o
f
p
r
o
g
r
am
m
in
g
co
d
e
th
at
p
er
f
o
r
m
s
th
e
f
u
n
ctio
n
s
d
escr
ib
e
d
.
C
o
d
eAlp
ac
a
-
2
0
k
ca
n
co
n
tain
co
d
e
f
r
o
m
m
an
y
d
if
f
er
e
n
t
p
r
o
g
r
am
m
in
g
lan
g
u
a
g
es
s
u
ch
as
Py
th
o
n
,
J
av
aScr
ip
t,
J
av
a,
C
++
,
an
d
m
a
n
y
o
th
er
s
,
d
e
p
en
d
in
g
o
n
th
e
g
o
al
o
f
u
s
in
g
th
e
d
ataset.
W
e
d
ef
in
e
a
s
u
p
er
v
is
ed
d
ataset
D
c
ode
=
{
(
,
)
}
=
1
,
wh
er
e
∈
Σ
∗
d
en
o
tes
a
n
atu
r
al
la
n
g
u
ag
e
task
in
s
tr
u
ctio
n
an
d
∈
Γ
∗
r
ep
r
esen
ts
th
e
co
r
r
esp
o
n
d
in
g
s
o
u
r
ce
c
o
d
e
s
n
ip
p
et.
T
h
is
d
ataset,
C
o
d
eAlp
a
ca
-
20k
,
is
a
s
tr
u
ctu
r
ed
co
r
p
u
s
en
co
m
p
ass
in
g
m
u
lti
-
lin
g
u
al
co
d
e
r
ep
r
ese
n
tatio
n
s
ac
r
o
s
s
d
iv
er
s
e
p
r
o
g
r
am
m
in
g
p
ar
ad
ig
m
s
P
=
{
Py
th
o
n
,
C
++
,
J
av
a
,
…
}
.
T
o
e
n
f
o
r
ce
s
y
n
tactic
u
n
if
o
r
m
ity
an
d
m
o
d
el
co
m
p
atib
ilit
y
,
we
im
p
lem
en
t
a
tem
p
lated
s
er
ializatio
n
th
er
eb
y
co
n
s
tr
ain
in
g
to
k
e
n
izatio
n
u
n
d
er
co
n
s
is
ten
t p
o
s
itio
n
al
em
b
e
d
d
in
g
s
.
:
(
,
)
↦
⟨
<
|
us
e
r
|
>
<
|
e
n
d
|
>
<
|
a
s
s
ista
n
t
|
>
<
|
e
n
d
|
>
⟩
,
3
.
3
.
Co
ns
t
ra
ints f
ine
-
t
un
ing
pa
ra
dig
m
s
3
.
3
.
1
.
Su
perv
is
ed
f
ine
-
t
un
i
n
g
(
SFT)
L
e
t
L
CE
(
;
,
)
d
e
n
o
t
e
th
e
cr
o
s
s
-
en
t
r
o
p
y
lo
s
s
f
u
n
c
t
io
n
p
a
r
a
m
e
t
er
iz
e
d
b
y
m
o
d
e
l
w
e
ig
h
t
s
,
c
o
m
p
u
t
ed
o
v
e
r
t
h
e
c
o
n
d
i
t
io
n
a
l
l
i
k
e
l
ih
o
o
d
(
∣
)
.
T
h
e
s
u
p
e
r
v
i
s
e
d
f
i
n
e
-
t
u
n
i
n
g
o
b
j
e
c
t
i
v
e
i
s
d
e
f
i
n
e
d
a
s
(
1
)
:
∗
=
∑
L
CE
(
;
,
)
=
1
(
1
)
I
m
p
lem
en
tatio
n
is
r
ea
lized
v
i
a
th
e
API
,
wh
er
e
g
r
ad
ien
t
f
lo
w
is
r
estricte
d
t
o
s
elec
ted
.
3
.
3
.
2
.
L
o
w
-
ra
nk
a
da
pta
t
io
n
(
L
o
RA)
T
o
cir
cu
m
v
en
t
th
e
in
f
ea
s
ib
il
ity
o
f
f
u
ll
weig
h
t
u
p
d
ates,
we
in
v
o
k
e
Lo
RA
,
wh
ic
h
ap
p
r
o
x
im
ates
weig
h
t
p
er
tu
r
b
atio
n
s
v
ia
a
co
n
s
tr
ain
ed
lo
w
-
r
an
k
s
u
b
s
p
ac
e
.
L
et
Δ
≈
,
wh
er
e
∈
R
×
,
∈
R
×
,
an
d
≪
(
,
)
.
T
h
e
ad
a
p
ted
weig
h
ts
ar
e:
=
0
+
⋅
(
2
)
w
i
t
h
b
e
i
n
g
a
s
c
al
i
n
g
h
y
p
e
r
p
a
r
a
m
e
t
e
r
.
T
h
e
f
i
n
e
-
t
u
n
i
n
g
is
r
es
t
r
i
c
t
e
d
t
o
at
t
e
n
ti
o
n
m
a
t
r
i
c
es
(
pr
oj
,
pr
oj
,
pr
oj
,
pr
o
j
)
a
n
d
f
e
e
d
-
f
o
r
w
a
r
d
l
a
y
e
r
s
(
,
,
)
.
3
.
3
.
3
.
Q
ua
ntiz
ed
lo
w
-
ra
nk
a
da
pta
t
io
n (
Q
L
o
RA)
Fo
r
f
u
r
th
er
c
o
m
p
r
ess
io
n
,
we
ad
o
p
t
4
-
b
it
q
u
an
tizatio
n
(
4
)
,
in
teg
r
atin
g
it
with
L
o
R
A
ad
ap
ter
s
.
W
e
d
ef
in
e
th
e
q
u
a
n
tizatio
n
m
ap
p
i
n
g
as (
3
)
:
qua
nt
=
r
o
u
n
d
(
−
m
i
n
(
)
m
ax
(
)
−
m
i
n
(
)
⋅
(
2
−
1
)
)
,
=
4
(
3
)
Fin
e
-
tu
n
in
g
lan
g
u
ag
e
m
o
d
els
f
o
r
s
p
ec
ialized
d
o
m
ain
s
li
k
e
th
e
c
o
d
e
g
en
er
atio
n
task
s
r
eq
u
i
r
es
s
ig
n
if
ican
t
co
m
p
u
tatio
n
al
r
eso
u
r
ce
s
.
T
o
m
itig
ate
th
is
,
we
em
p
lo
y
ed
QL
o
R
A
[
1
7
]
,
a
tec
h
n
iq
u
e
d
esig
n
ed
t
o
ef
f
icien
tly
f
in
e
-
tu
n
e
L
L
Ms
wh
ile
m
in
im
izin
g
m
em
o
r
y
f
o
o
t
p
r
in
t
an
d
tr
ai
n
in
g
tim
e.
QL
o
R
A
ac
h
iev
es
th
is
b
y
q
u
an
tizin
g
th
e
p
r
e
-
tr
ai
n
ed
m
o
d
el
weig
h
ts
to
a
lo
we
r
p
r
ec
is
io
n
(
4
-
b
it)
an
d
th
en
ap
p
ly
in
g
lo
w
-
r
an
k
u
p
d
ates
d
u
r
in
g
th
e
f
in
e
-
tu
n
in
g
p
r
o
ce
s
s
.
T
h
ese
lo
w
-
r
an
k
u
p
d
ates
ar
e
s
to
r
ed
with
h
i
g
h
er
p
r
ec
is
io
n
(
f
p
1
6
)
,
allo
win
g
f
o
r
ef
f
ec
tiv
e
ad
ap
tatio
n
w
h
ile
k
ee
p
in
g
o
v
er
all
m
em
o
r
y
r
eq
u
ir
em
en
ts
lo
w.
Sp
ec
if
ically
,
we
u
tili
ze
d
4
-
b
it
q
u
an
tizatio
n
(
−
−
4
=
)
with
th
e
NF4
q
u
an
tizatio
n
t
y
p
e,
s
tr
ik
in
g
a
b
alan
ce
b
etwe
en
m
o
d
el
co
m
p
r
ess
io
n
an
d
p
er
f
o
r
m
an
ce
.
T
h
is
co
n
f
ig
u
r
atio
n
en
a
b
led
u
s
to
f
u
lly
lev
er
ag
e
th
e
ca
p
ab
il
ities
o
f
o
u
r
ch
o
s
en
m
o
d
els
wh
ile
o
p
er
atin
g
with
in
p
r
ac
tical
r
eso
u
r
ce
co
n
s
tr
ain
ts
.
T
h
is
ap
p
r
o
ac
h
s
ig
n
if
ican
tly
ac
ce
ler
ated
th
e
f
in
e
-
tu
n
in
g
p
r
o
ce
s
s
with
o
u
t
co
m
p
r
o
m
is
in
g
th
e
m
o
d
el'
s
p
er
f
o
r
m
a
n
ce
o
n
t
h
e
c
o
d
e
g
en
er
atio
n
task
s
.
T
h
is
Evaluation Warning : The document was created with Spire.PDF for Python.
I
SS
N
:
2
0
8
8
-
8
7
0
8
I
n
t J E
lec
&
C
o
m
p
E
n
g
,
Vo
l.
1
6
,
No
.
1
,
Feb
r
u
ar
y
20
2
6
:
2
7
8
-
287
282
t
r
a
n
s
f
o
r
m
a
t
i
o
n
d
r
as
t
i
ca
l
l
y
r
e
d
u
c
e
s
m
e
m
o
r
y
o
v
e
r
h
e
a
d
w
h
il
e
m
a
i
n
t
a
i
n
i
n
g
f
u
n
c
ti
o
n
a
l
f
i
d
e
li
t
y
v
i
a
a
d
e
q
u
a
n
t
i
za
t
i
o
n
i
n
v
e
r
s
e
m
a
p
p
i
n
g
a
t
i
n
f
e
r
e
n
ce
.
3
.
4
.
E
v
a
lua
t
i
o
n
m
et
ric
f
o
rma
liza
t
io
n
T
h
e
p
r
im
ar
y
q
u
an
titativ
e
in
s
tr
u
m
en
t
em
p
lo
y
e
d
is
R
OU
GE
-
L
,
a
r
ec
all
-
o
r
ien
ted
m
etr
ic
d
er
iv
ed
f
r
o
m
th
e
lo
n
g
est
co
m
m
o
n
s
u
b
s
eq
u
en
ce
(
L
C
S)
f
r
a
m
ewo
r
k
.
Giv
e
n
a
g
en
er
ated
s
eq
u
e
n
ce
=
{
1
,
…
,
}
an
d
a
r
ef
er
en
ce
s
eq
u
en
ce
=
{
1
,
…
,
}
,
th
e
L
C
S
i
s
d
en
o
ted
as
L
C
S
(
,
)
.
T
h
e
co
r
r
esp
o
n
d
i
n
g
R
OUGE
-
L
p
r
ec
is
io
n
,
r
ec
all
,
an
d
F1
-
s
co
r
e
ar
e
co
m
p
u
ted
as:
=
∣
L
C
S
(
,
)
∣
∣
∣
,
=
∣
L
C
S
(
,
)
∣
∣
∣
,
=
2
+
(
4
)
T
h
is
m
etr
ic
is
p
a
r
ticu
lar
ly
s
u
i
ted
f
o
r
ev
al
u
atin
g
s
o
u
r
ce
co
d
e,
as
it
im
p
licitly
ca
p
tu
r
es
b
o
t
h
lex
ical
c
o
h
er
e
n
ce
an
d
s
tr
u
ctu
r
al
a
d
h
er
e
n
ce
with
o
u
t p
en
alizin
g
m
in
o
r
s
y
n
tactic
p
er
m
u
tatio
n
s
.
3
.
5
.
E
x
perim
ent
s
et
up
All
ex
p
er
im
en
ts
wer
e
co
n
d
u
ct
ed
o
n
a
s
er
v
er
eq
u
ip
p
ed
with
4
×
NVI
DI
A
R
T
X
A5
0
0
0
GPUs
(
2
4
GB
VR
AM
ea
ch
)
,
d
u
al
AM
D
R
y
ze
n
T
h
r
ea
d
r
ip
p
er
PR
O
5
9
6
5
W
X
C
P
Us
(
4
8
lo
g
ical
c
o
r
es),
an
d
2
5
6
GB
R
AM
.
W
e
u
s
ed
Py
T
o
r
c
h
2
.
5
.
1
with
C
UDA
1
2
.
1
,
Hu
g
g
in
g
Face
T
r
an
s
f
o
r
m
e
r
s
4
.
4
6
.
3
,
an
d
PE
FT
0
.
1
3
.
2
.
Mo
d
els
wer
e
f
in
e
-
tu
n
e
d
o
n
th
e
C
o
d
e
Alp
ac
a
-
2
0
k
d
ataset
u
s
in
g
L
o
R
A
an
d
QL
o
R
A.
T
r
ain
in
g
em
p
lo
y
ed
th
e
A
d
am
W
o
p
tim
izer
with
a
lear
n
in
g
r
at
e
o
f
2
×
10
−
4
,
co
s
in
e
s
ch
e
d
u
ler
,
an
d
ea
r
ly
s
to
p
p
i
n
g
b
ased
o
n
v
alid
atio
n
lo
s
s
.
Fo
r
L
o
R
A,
we
s
et
r
an
k
=
8
,
s
ca
l
in
g
=
16
,
an
d
d
r
o
p
o
u
t
=
0
.
05
.
T
ab
le
1
s
u
m
m
ar
izes
th
e
k
ey
tr
ain
in
g
h
y
p
er
p
ar
am
eter
s
ac
r
o
s
s
d
if
f
er
en
t
m
o
d
els.
Fig
u
r
e
1
s
h
o
ws
th
e
co
m
p
u
tatio
n
al
r
eso
u
r
ce
s
an
d
tr
ain
in
g
en
v
ir
o
n
m
en
t
.
T
ab
le
1
.
T
r
ai
n
in
g
h
y
p
e
r
p
ar
am
eter
s
ac
r
o
s
s
m
o
d
els
M
o
d
e
l
s
P
a
r
a
ms
B
a
t
c
h
si
z
e
Ep
o
c
h
s
Tr
a
i
n
a
b
l
e
P
a
r
a
ms
S
t
e
p
s
Ll
a
m
a
-
3
.
2
-
1B
1
.
0
B
1
6
0
50
1
1
.
3
M
7
5
0
Ll
a
m
a
-
3
.
2
-
3B
3
.
2
B
64
50
2
4
.
3
M
1
9
5
0
G
e
mm
a
-
2
-
2B
2
.
0
B
1
2
8
50
2
0
.
8
M
9
5
0
Q
w
e
n
2
.
5
-
1
.
5
B
1
.
5
B
1
2
8
50
1
8
.
5
M
9
5
0
Q
w
e
n
2
.
5
-
3B
3
.
1
B
64
50
2
9
.
9
M
1
9
5
0
Fig
u
r
e
1
.
C
o
m
p
u
tatio
n
al
r
eso
u
r
ce
s
an
d
tr
ain
in
g
en
v
i
r
o
n
m
e
n
t
4.
RE
SU
L
T
S AN
D
D
I
SCU
SS
I
O
N
T
o
r
ig
o
r
o
u
s
ly
ev
alu
ate
th
e
g
en
er
ativ
e
ef
f
icac
y
o
f
p
ar
am
et
er
-
ef
f
icien
tly
f
in
e
-
tu
n
ed
s
m
all
lan
g
u
ag
e
m
o
d
els
with
in
th
e
d
o
m
ain
o
f
s
o
u
r
ce
co
d
e
s
y
n
th
esis
,
we
o
p
er
atio
n
alize
a
m
u
ltifa
ce
ted
e
v
alu
atio
n
p
r
o
to
c
o
l
Evaluation Warning : The document was created with Spire.PDF for Python.
I
n
t J E
lec
&
C
o
m
p
E
n
g
I
SS
N:
2088
-
8
7
0
8
P
a
r
a
mete
r
-
efficien
t fin
e
-
tu
n
in
g
o
f sma
ll la
n
g
u
a
g
e
m
o
d
els fo
r
…
(
V
a
n
-
V
iet
N
g
u
ye
n
)
283
g
r
o
u
n
d
ed
in
estab
lis
h
ed
te
x
tu
al
s
im
ilar
ity
m
etr
ics,
co
m
p
ar
a
tiv
e
b
en
c
h
m
ar
k
in
g
a
g
ain
s
t
r
el
ev
an
t
b
aselin
es
an
d
less
-
tu
n
ed
v
er
s
io
n
s
o
f
th
e
SL
Ms,
an
d
co
n
t
r
o
lled
a
b
latio
n
an
aly
s
is
.
As
s
h
o
wn
in
Fig
u
r
e
2
,
o
u
r
e
v
alu
atio
n
ad
h
er
es
to
th
e
p
r
in
ci
p
le
o
f
m
o
d
el
-
o
u
tp
u
t
co
n
g
r
u
en
ce
,
i
n
wh
i
ch
th
e
s
em
an
tic
f
id
elity
o
f
g
e
n
er
ated
s
o
u
r
ce
co
d
e
is
q
u
an
tifie
d
ag
ain
s
t
h
u
m
an
-
c
u
r
ated
r
ef
er
en
ce
im
p
lem
e
n
tatio
n
s
u
s
in
g
R
OUGE
-
L
.
Ou
r
ex
p
er
im
en
ts
f
o
cu
s
ed
o
n
c
o
m
p
ar
i
n
g
t
h
e
p
er
f
o
r
m
an
c
e
g
ain
s
ac
h
iev
ed
th
r
o
u
g
h
f
in
e
-
tu
n
in
g
th
ese
s
m
aller
m
o
d
els
an
d
a
n
aly
zin
g
th
ei
r
ef
f
ec
tiv
en
ess
r
elativ
e
to
ea
ch
o
th
er
a
n
d
t
o
p
r
io
r
wo
r
k
u
tili
zin
g
lar
g
er
la
n
g
u
a
g
e
m
o
d
els.
T
h
e
f
in
d
in
g
s
o
f
th
ese
ex
p
er
im
en
ts
ar
e
s
u
m
m
a
r
ized
in
T
a
b
le
2
.
Prio
r
to
f
in
e
-
t
u
n
in
g
,
th
e
b
ase
m
o
d
els
d
e
m
o
n
s
tr
ated
lim
ited
p
r
o
f
icien
c
y
in
th
e
d
o
m
ain
.
Fig
u
r
e
2
.
R
OUGE
-
L
p
er
f
o
r
m
a
n
ce
o
f
f
in
e
-
tu
n
e
d
SLM
s
d
u
r
in
g
tr
ain
in
g
s
tep
s
4
.
1
.
Co
m
pa
ra
t
iv
e
perf
o
r
m
a
nce
s
y
nthesi
s
L
et
M
bas
e
d
e
n
o
te
th
e
b
aselin
e
m
o
d
e
l
en
s
em
b
le
{
C
o
d
eBER
T
,
C
o
d
e2
s
eq
,
Ph
i
-
3
Min
i
4K
}
,
an
d
let
M
fi
ne
r
ep
r
esen
t
th
e
s
et
o
f
f
in
e
-
tu
n
e
d
s
m
all
m
o
d
els
s
tu
d
ied
i
n
th
is
wo
r
k
.
T
h
e
em
p
ir
ical
r
esu
l
ts
,
s
u
m
m
ar
ized
in
T
ab
le
2
,
r
ev
ea
l
a
p
r
o
n
o
u
n
ce
d
s
u
p
er
io
r
ity
o
f
o
u
r
m
o
d
els
ac
r
o
s
s
all
ev
alu
ated
in
s
tan
ce
s
.
T
h
e
em
p
ir
ical
r
esu
lts
,
s
u
m
m
ar
ized
i
n
T
a
b
le
2
,
r
e
v
ea
l
a
p
r
o
n
o
u
n
ce
d
s
u
p
er
io
r
ity
o
f
o
u
r
f
in
e
-
tu
n
ed
SLM
s
(
M
fi
ne
)
ac
r
o
s
s
all
ev
alu
ated
in
s
tan
ce
s
co
m
p
ar
ed
to
th
e
M
bas
e
b
aselin
es a
n
d
g
en
e
r
ally
im
p
r
o
v
ed
p
er
f
o
r
m
an
ce
c
o
m
p
ar
e
d
to
M
bas
e
.
T
ab
le
2
.
Mo
d
el
co
m
p
ar
is
o
n
o
n
R
OUGE
-
L
s
co
r
e
M
o
d
e
l
P
a
r
a
ms
Tr
a
i
n
a
b
l
e
P
a
r
a
ms
Tr
a
i
n
i
n
g
S
t
e
p
s
R
O
U
G
E
-
L
C
o
d
e
B
E
R
T
1
1
0
M
1
1
0
M
-
0
.
3
6
C
o
d
e
2
se
q
2
0
0
M
2
0
0
M
-
0
.
3
3
P
h
i
-
3
M
i
n
i
4
K
b
a
s
e
3
.
8
B
3
.
8
B
-
0
.
1
7
Ou
r
r
e
su
l
t
Ll
a
m
a
-
3
.
2
-
1
B
(
b
a
s
e
)
1B
1B
-
0
.
4
5
Ll
a
m
a
-
3
.
2
-
1B
-
I
n
st
r
u
c
t
1B
7M
4K
0
.
4
6
Ll
a
m
a
-
3
.
2
-
3
B
(
b
a
s
e
)
3
.
2
1
B
3
.
2
1
B
-
0
.
4
9
Ll
a
m
a
-
3
.
2
-
3B
-
I
n
st
r
u
c
t
3
.
2
1
B
1
5
M
4K
0
.
5
4
G
e
mm
a
-
2
-
2b
-
i
t
(
b
a
se)
2B
2B
-
0
.
4
6
G
e
mm
a
-
2
-
2b
-
it
-
I
n
st
r
u
c
t
2B
1
0
M
4K
0
.
4
9
Q
w
e
n
2
.
5
-
1
.
5
B
(
b
a
se)
1
.
5
B
1
.
5
B
-
0
.
4
8
Q
w
e
n
2
.
5
-
1
.
5
B
-
I
n
st
r
u
c
t
1
.
5
B
7M
4K
0
.
4
6
Q
w
e
n
2
.
5
-
3
B
(
b
a
se)
3
.
1
B
3
.
1
B
-
0
.
5
1
Q
w
e
n
2
.
5
-
3B
-
I
n
st
r
u
c
t
3
.
1
B
1
5
M
4K
0
.
5
5
4
.
2
.
Abla
t
io
n
a
nd
inte
rpre
t
a
t
io
n
A
cr
it
ic
al
ju
x
tap
o
s
it
io
n
i
s
d
r
a
wn
b
e
tw
ee
n
Ph
i
-
3
M
in
i
4
K
b
as
e
(
3
.
8
B
)
an
d
o
u
r
f
in
e
-
tu
n
ed
3
B
m
o
d
el
s
(
L
L
a
MA
3
3
B
,
Q
wen
2
.
5
3
B
)
,
wh
er
e
in
th
e
l
at
ter
d
em
o
n
s
tr
ab
ly
o
u
tp
er
f
o
r
m
d
e
s
p
it
e
f
ew
e
r
to
ta
l
p
a
r
am
et
er
s
Evaluation Warning : The document was created with Spire.PDF for Python.
I
SS
N
:
2
0
8
8
-
8
7
0
8
I
n
t J E
lec
&
C
o
m
p
E
n
g
,
Vo
l.
1
6
,
No
.
1
,
Feb
r
u
ar
y
20
2
6
:
2
7
8
-
287
284
an
d
s
ig
n
if
ic
an
tly
f
e
wer
tr
ain
a
b
l
e
p
ar
am
et
er
s
d
u
r
in
g
f
in
e
-
tu
n
in
g
(
1
5
M
vs
3
.
8
B
)
.
T
h
is
d
is
cr
e
p
an
c
y
s
u
b
s
t
an
tia
te
s
th
e
h
y
p
o
th
e
s
i
s
t
h
at
ef
f
ec
tiv
e
p
ar
am
e
ter
-
ef
f
ic
i
en
t
f
i
n
e
-
tu
n
in
g
an
d
d
at
a
al
ig
n
m
en
t
c
an
b
e
m
o
r
e
im
p
a
ctf
u
l
th
an
b
r
u
t
e
-
f
o
r
ce
b
a
s
e
m
o
d
el
s
i
ze
o
r
f
u
l
l
m
o
d
e
l
f
i
n
e
-
tu
n
in
g
f
o
r
d
o
m
ain
-
s
p
ec
if
i
c
t
ask
s
.
T
h
e
r
e
s
u
lt
s
s
u
g
g
es
t
th
at
i
t
i
s
n
o
t
m
er
ely
th
e
v
o
lu
m
etr
ic
s
i
ze
o
f
a
m
o
d
el
th
at
g
o
v
e
r
n
s
p
er
f
o
r
m
an
ce
,
b
u
t
r
a
th
e
r
th
e
s
y
n
e
r
g
i
s
ti
c
in
ter
p
lay
b
et
we
en
tr
a
in
in
g
o
b
je
ct
iv
e
,
d
at
a
a
lig
n
m
en
t
,
an
d
ef
f
i
cie
n
t
ad
ap
t
at
i
o
n
tech
n
iq
u
e
s
li
k
e
L
o
R
A
an
d
QL
o
R
A.
Fu
r
th
er
m
o
r
e
,
th
e
co
n
s
is
ten
t
p
er
f
o
r
m
an
ce
ac
r
o
s
s
th
e
L
L
a
MA
-
1
B
an
d
Qwe
n
-
1
.
5
B
v
a
r
ian
ts
b
o
th
y
ield
in
g
R
OUGE
-
L
=
0
.
4
6
im
p
lies
a
p
o
ten
tial
ca
p
ac
ity
ce
ilin
g
wh
en
u
s
in
g
th
ese
f
in
e
-
t
u
n
in
g
tech
n
iq
u
es
at
lo
wer
p
ar
a
m
eter
th
r
esh
o
ld
s
(
1
-
1
.
5
B
)
,
s
u
g
g
esti
n
g
d
im
in
is
h
in
g
m
ar
g
in
al
r
et
u
r
n
s
with
o
u
t
alt
er
n
ativ
e
ad
a
p
tatio
n
s
tr
ateg
ies o
r
p
o
ten
tial a
r
ch
itec
tu
r
al
m
o
d
if
icatio
n
s
to
b
etter
s
u
it th
e
co
d
e
d
o
m
ain
at
th
is
s
ize.
4
.
3
.
P
er
f
o
r
m
a
nce
co
m
pa
riso
n wit
h e
a
rlier
m
o
dels
:
Co
deB
E
R
T
a
nd
Co
de2
s
e
q
C
o
d
eBER
T
an
d
C
o
d
e2
s
eq
we
r
e
k
ey
s
tag
es
in
th
e
d
ev
elo
p
m
en
t
o
f
n
eu
r
al
co
d
e
g
en
e
r
atio
n
,
alth
o
u
g
h
th
ey
m
o
s
tly
u
s
ed
s
y
n
tactic
r
ep
r
esen
tatio
n
s
an
d
h
a
d
tr
o
u
b
le
ca
p
tu
r
in
g
d
ee
p
er
s
em
an
tic
lin
k
ag
es
in
s
o
u
r
ce
co
d
e.
T
h
eir
R
OUGE
-
L
s
co
r
e
s
o
f
0
.
3
6
a
n
d
0
.
3
3
s
h
o
w
h
o
w
lim
ited
th
ey
a
r
e,
esp
ec
iall
y
wh
en
it
co
m
es
to
ac
tiv
ities
th
at
n
ee
d
s
tr
o
n
g
s
em
an
tic
s
y
n
th
esis
an
d
awa
r
e
n
e
s
s
o
f
co
n
tex
t.
I
n
c
o
n
tr
ast,
o
u
r
f
in
ely
tu
n
ed
s
m
all
lan
g
u
ag
e
m
o
d
els
(
SLM
s
)
co
n
s
is
ten
tly
b
ea
t
th
ese
b
aselin
es,
with
R
OUGE
-
L
s
co
r
es
as
h
ig
h
as
0
.
5
5
.
T
h
is
s
h
o
ws
th
at
p
ar
am
eter
e
f
f
icie
n
t
f
in
e
-
tu
n
in
g
m
eth
o
d
s
n
o
t
o
n
ly
m
a
k
e
m
o
d
els
wo
r
k
b
et
ter
b
u
t
also
g
r
ea
tly
im
p
r
o
v
e
t
h
e
q
u
ality
o
f
c
o
d
e
g
e
n
er
atio
n
.
4
.
4
.
E
pis
t
em
o
lo
g
ic
a
l
r
ef
lect
i
o
ns
B
ey
o
n
d
r
aw
m
etr
ics,
o
u
r
f
i
n
d
in
g
s
g
estu
r
e
to
war
d
a
b
r
o
ad
er
e
p
is
tem
o
lo
g
ical
im
p
lic
atio
n
:
th
at
ef
f
ec
tiv
e
ad
a
p
tatio
n
a
n
d
e
f
f
ic
ien
t
ar
ch
itectu
r
es
r
at
h
er
th
a
n
b
r
u
te
-
f
o
r
ce
p
ar
am
eter
ex
p
a
n
s
io
n
m
a
y
d
e
f
in
e
th
e
n
ex
t
f
r
o
n
tier
o
f
n
eu
r
o
-
s
y
m
b
o
l
ic
co
d
e
g
en
er
atio
n
.
T
h
ese
r
esu
lts
ad
v
o
ca
te
f
o
r
a
p
ar
a
d
ig
m
s
h
if
t
to
war
d
task
-
s
p
ec
if
ic
ef
f
icien
t
f
in
e
-
tu
n
in
g
p
r
o
to
co
ls
,
en
ab
lin
g
d
em
o
cr
atize
d
d
ep
lo
y
m
e
n
t
o
f
ca
p
ab
le
SLM
s
with
o
u
t
co
m
p
r
o
m
is
in
g
o
u
t
p
u
t f
id
elity
o
n
d
o
m
ain
task
s
.
4
.
5
.
L
im
it
a
t
io
ns
o
f
t
he
s
t
ud
y
T
h
e
ev
alu
atio
n
o
f
co
d
e
g
en
er
atio
n
m
o
d
els
h
as
s
ev
er
al
lim
itatio
n
s
:
f
ir
s
t,
it
r
elies
s
o
l
ely
o
n
th
e
R
OUGE
-
L
m
etr
ic
f
o
r
lex
ical
s
im
ilar
ity
,
wh
ich
d
o
es
n
o
t
ass
ess
f
u
n
ctio
n
al
c
o
r
r
ec
tn
ess
,
s
u
ch
as
co
m
p
ilab
ilit
y
o
r
o
u
t
p
u
t
ac
cu
r
ac
y
,
n
ec
ess
itat
in
g
f
u
tu
r
e
in
c
o
r
p
o
r
atio
n
o
f
ex
ec
u
tio
n
-
b
ased
b
en
ch
m
ar
k
s
lik
e
Hu
m
an
E
v
al
a
n
d
MBP
P
f
o
r
a
m
o
r
e
co
m
p
r
eh
e
n
s
iv
e
ass
es
s
m
en
t;
s
ec
o
n
d
,
th
e
f
in
e
-
tu
n
in
g
was
lim
ited
to
th
e
C
o
d
eAlp
ac
a
-
2
0
k
d
ataset,
p
o
ten
tially
lead
in
g
to
o
v
er
f
itti
n
g
to
its
s
p
ec
if
ic
in
s
tr
u
ctio
n
s
ty
les
an
d
p
r
o
b
lem
d
is
t
r
ib
u
tio
n
s
,
s
o
test
in
g
o
n
d
i
v
er
s
e
d
atasets
is
ess
en
tial
f
o
r
b
etter
g
en
er
aliza
tio
n
;
th
i
r
d
,
h
y
p
er
p
ar
am
eter
s
f
o
r
PEFT
(
e.
g
.
,
L
o
R
A
r
an
k
r
=8
,
α
=
1
6
,
lea
r
n
in
g
r
ate)
wer
e
s
elec
ted
b
ased
o
n
b
est
p
r
ac
t
ices
with
o
u
t
ex
h
au
s
tiv
e
o
p
tim
izatio
n
,
s
u
g
g
esti
n
g
th
at
m
o
d
el
-
s
p
ec
if
ic
tu
n
in
g
v
ia
g
r
id
s
ea
r
ch
o
r
B
ay
esian
o
p
ti
m
izatio
n
co
u
ld
e
n
h
an
ce
p
er
f
o
r
m
an
ce
;
an
d
f
o
u
r
th
,
th
e
ev
alu
atio
n
u
s
ed
a
s
tatic
d
a
taset,
f
ailin
g
to
r
ef
lect
r
ea
l
-
wo
r
ld
in
ter
ac
tiv
e
s
o
f
twar
e
d
ev
el
o
p
m
en
t
with
m
u
lti
-
tu
r
n
r
e
f
in
em
en
ts
,
th
u
s
r
ec
o
m
m
en
d
in
g
ex
p
lo
r
atio
n
o
f
c
o
n
v
e
r
s
atio
n
al
AI
f
r
am
ewo
r
k
s
f
o
r
h
an
d
lin
g
f
o
llo
w
-
u
p
s
,
co
r
r
ec
tio
n
s
,
an
d
iter
ativ
e
im
p
r
o
v
em
en
ts
.
5.
CO
NCLU
SI
O
N
AND
F
U
T
U
RE
WO
RK
T
h
is
p
ap
er
p
r
esen
ts
a
co
m
p
ar
ativ
e
s
tu
d
y
o
n
p
ar
am
ete
r
-
e
f
f
icien
t
f
in
e
-
tu
n
i
n
g
(
PEFT
)
tech
n
iq
u
es
s
p
ec
if
ically
L
o
R
A
an
d
QL
o
R
A
ap
p
lied
to
s
ev
er
al
s
m
all
la
n
g
u
ag
e
m
o
d
els
(
SLM
s
)
in
clu
d
in
g
L
L
aM
A
3
.
2
,
Qwe
n
2
.
5
,
a
n
d
Gem
m
a.
T
h
e
m
o
d
els
ar
e
f
in
e
-
tu
n
ed
o
n
th
e
C
o
d
eAlp
ac
a
-
2
0
k
d
ataset
f
o
r
th
e
task
o
f
co
d
e
g
en
er
atio
n
,
a
n
d
ev
alu
ate
d
u
s
in
g
R
OUGE
-
L
as
th
e
p
r
im
ar
y
m
etr
ic.
T
h
e
s
tu
d
y
d
em
o
n
s
tr
a
tes
th
at
f
in
e
-
tu
n
ed
SLM
s
ca
n
o
u
tp
er
f
o
r
m
m
u
ch
l
ar
g
er
b
aselin
e
m
o
d
els,
h
ig
h
lig
h
tin
g
th
eir
p
o
ten
tial
f
o
r
lo
w
-
r
e
s
o
u
r
ce
d
ep
lo
y
m
en
t
in
s
o
f
twar
e
e
n
g
in
ee
r
i
n
g
co
n
te
x
ts
.
T
h
e
p
ap
er
is
well
o
r
g
an
i
ze
d
,
m
eth
o
d
o
l
o
g
ically
s
o
u
n
d
,
an
d
p
r
o
v
id
es
clea
r
em
p
ir
ical
ev
id
en
ce
s
u
p
p
o
r
tin
g
th
e
ef
f
ec
tiv
en
ess
o
f
PEFT
in
en
h
an
cin
g
th
e
p
er
f
o
r
m
a
n
ce
o
f
co
m
p
ac
t
m
o
d
els.
Ov
er
all,
it is
a
r
elev
an
t a
n
d
tim
ely
co
n
tr
ib
u
tio
n
to
th
e
f
ield
o
f
ef
f
icien
t
n
eu
r
al
c
o
d
e
g
e
n
er
a
tio
n
.
T
h
is
wo
r
k
s
u
b
s
tan
tiates
th
e
p
o
ten
tial
o
f
s
m
all,
ef
f
icien
tly
tu
n
ed
m
o
d
els
as
v
iab
le,
co
s
t
-
ef
f
e
ctiv
e,
an
d
s
u
s
tain
ab
le
alter
n
ativ
es
to
lar
g
e,
co
m
p
u
tatio
n
ally
d
em
an
d
in
g
m
o
d
els
f
o
r
d
o
m
ain
-
s
p
ec
if
ic
s
o
f
twar
e
en
g
in
ee
r
in
g
p
r
o
b
lem
s
.
T
h
eir
r
eso
u
r
ce
ef
f
icien
cy
m
a
k
es
th
e
m
p
ar
ticu
lar
ly
well
-
s
u
ited
f
o
r
d
ep
lo
y
m
e
n
t
o
n
ed
g
e
an
d
m
o
b
ile
d
ev
ices,
th
u
s
s
u
p
p
o
r
tin
g
th
e
b
r
o
a
d
er
d
e
m
o
cr
atiz
atio
n
o
f
AI
-
ass
is
ted
co
d
in
g
.
Fu
tu
r
e
r
esear
ch
d
ir
ec
tio
n
s
s
tem
d
ir
ec
tly
f
r
o
m
th
ese
p
r
o
m
is
in
g
r
esu
lts
.
W
e
p
lan
to
ex
p
lo
r
e
th
e
co
m
p
ar
ativ
e
ef
f
icac
y
o
f
o
th
e
r
PEFT
m
eth
o
d
s
an
d
r
ef
i
n
e
h
y
p
er
p
ar
am
eter
tu
n
in
g
f
o
r
o
p
tim
al
p
er
f
o
r
m
a
n
ce
.
E
x
p
an
d
i
n
g
th
e
tr
ai
n
in
g
d
ata
with
r
ich
er
,
in
ter
ac
tiv
e
co
d
e
-
r
elate
d
co
n
v
er
s
atio
n
s
c
o
u
ld
e
n
h
an
ce
th
e
m
o
d
els'
ab
ilit
y
to
h
an
d
le
co
m
p
lex
r
eq
u
ests
.
Ap
p
ly
in
g
th
ese
tech
n
iq
u
es
to
SLM
s
f
o
r
d
if
f
er
en
t
p
r
o
g
r
am
m
in
g
d
o
m
ain
s
,
s
u
ch
as
h
ar
d
war
e
d
escr
ip
ti
o
n
lan
g
u
ag
es
o
r
s
m
ar
t
co
n
tr
ac
ts
,
r
ep
r
esen
ts
an
o
th
er
im
p
o
r
tan
t
av
e
n
u
e
.
Evaluation Warning : The document was created with Spire.PDF for Python.
I
n
t J E
lec
&
C
o
m
p
E
n
g
I
SS
N:
2088
-
8
7
0
8
P
a
r
a
mete
r
-
efficien
t fin
e
-
tu
n
in
g
o
f sma
ll la
n
g
u
a
g
e
m
o
d
els fo
r
…
(
V
a
n
-
V
iet
N
g
u
ye
n
)
285
Ad
d
itio
n
ally
,
in
v
esti
g
atin
g
k
n
o
wled
g
e
d
is
till
atio
n
f
r
o
m
lar
g
er
m
o
d
els
an
d
ex
p
l
o
r
in
g
ar
ch
itectu
r
al
m
o
d
if
icatio
n
s
s
p
ec
if
ically
d
esig
n
ed
f
o
r
co
d
e
s
y
n
t
h
esis
with
in
th
e
SLM
p
ar
ad
ig
m
ar
e
c
r
u
cial
s
tep
s
to
war
d
s
d
ev
elo
p
in
g
ev
e
n
m
o
r
e
ca
p
ab
le
an
d
ef
f
icien
t n
eu
r
al
co
d
e
g
en
er
ato
r
s
.
ACK
NO
WL
E
DG
M
E
N
T
S
T
h
is
r
esear
ch
was
s
u
p
p
o
r
ted
b
y
th
e
ĐH2
0
2
5
-
T
N0
7
-
0
7
p
r
o
ject
co
n
d
u
cted
at
th
e
T
h
ai
Ng
u
y
en
Un
iv
er
s
ity
o
f
I
n
f
o
r
m
atio
n
an
d
C
o
m
m
u
n
icatio
n
T
ec
h
n
o
l
o
g
y
,
T
h
ai
Ng
u
y
e
n
,
Vietn
am
,
with
ad
d
itio
n
al
s
u
p
p
o
r
t
f
r
o
m
th
e
AI
in
So
f
twar
e
E
n
g
i
n
ee
r
in
g
L
ab
.
T
h
e
au
th
o
r
s
wo
u
ld
lik
e
t
o
th
a
n
k
th
e
v
alu
ab
le
f
e
ed
b
ac
k
p
r
o
v
id
ed
b
y
th
e
r
ev
iewe
r
s
.
AUTHO
R
CO
NT
RI
B
UT
I
O
NS ST
A
T
E
M
E
N
T
T
h
is
jo
u
r
n
al
u
s
es
th
e
C
o
n
tr
ib
u
to
r
R
o
les
T
ax
o
n
o
m
y
(
C
R
ed
iT)
to
r
ec
o
g
n
ize
in
d
iv
id
u
al
au
th
o
r
co
n
tr
ib
u
tio
n
s
,
r
ed
u
ce
au
th
o
r
s
h
ip
d
is
p
u
tes,
an
d
f
ac
ilit
ate
co
llab
o
r
atio
n
.
Na
m
e
o
f
Aut
ho
r
C
M
So
Va
Fo
I
R
D
O
E
Vi
Su
P
Fu
Van
-
Viet
Ng
u
y
en
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
T
h
e
-
Vin
h
Ng
u
y
en
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
Hu
u
-
Kh
an
h
Ng
u
y
en
✓
✓
✓
✓
✓
✓
Du
c
-
Qu
an
g
V
u
✓
✓
✓
✓
✓
C
:
C
o
n
c
e
p
t
u
a
l
i
z
a
t
i
o
n
M
:
M
e
t
h
o
d
o
l
o
g
y
So
:
So
f
t
w
a
r
e
Va
:
Va
l
i
d
a
t
i
o
n
Fo
:
Fo
r
mal
a
n
a
l
y
s
i
s
I
:
I
n
v
e
s
t
i
g
a
t
i
o
n
R
:
R
e
so
u
r
c
e
s
D
:
D
a
t
a
C
u
r
a
t
i
o
n
O
:
W
r
i
t
i
n
g
-
O
r
i
g
i
n
a
l
D
r
a
f
t
E
:
W
r
i
t
i
n
g
-
R
e
v
i
e
w
&
E
d
i
t
i
n
g
Vi
:
Vi
su
a
l
i
z
a
t
i
o
n
Su
:
Su
p
e
r
v
i
s
i
o
n
P
:
P
r
o
j
e
c
t
a
d
mi
n
i
st
r
a
t
i
o
n
Fu
:
Fu
n
d
i
n
g
a
c
q
u
i
si
t
i
o
n
CO
NF
L
I
C
T
O
F
I
N
T
E
R
E
S
T
ST
A
T
E
M
E
NT
T
h
e
au
th
o
r
s
s
tate
n
o
c
o
n
f
lict
o
f
in
ter
est.
T
h
e
au
t
h
o
r
s
h
a
v
e
n
o
f
in
an
cial,
p
er
s
o
n
al,
o
r
p
r
o
f
ess
io
n
al
r
elatio
n
s
h
ip
s
th
at
co
u
ld
i
n
ap
p
r
o
p
r
iately
in
f
l
u
en
ce
th
e
r
esear
c
h
p
r
esen
ted
i
n
th
is
p
ap
er
.
I
NF
O
RM
E
D
CO
NS
E
N
T
W
e
h
av
e
o
b
tain
ed
in
f
o
r
m
ed
c
o
n
s
en
t f
r
o
m
all
in
d
iv
id
u
als in
c
lu
d
ed
in
t
h
is
s
tu
d
y
.
E
T
H
I
CAL AP
P
RO
V
AL
T
h
is
r
esear
ch
d
o
es
n
o
t
r
e
q
u
ir
e
eth
ical
ap
p
r
o
v
al
as
it
d
o
es
n
o
t
in
v
o
lv
e
h
u
m
an
p
ar
ticip
an
ts
,
an
im
al
s
u
b
jects,
o
r
s
en
s
itiv
e
d
ata.
DATA AV
AI
L
AB
I
L
I
T
Y
T
h
e
d
ata
th
at
s
u
p
p
o
r
t
th
e
f
i
n
d
in
g
s
o
f
th
is
s
tu
d
y
ar
e
av
aila
b
le
f
r
o
m
th
e
c
o
r
r
esp
o
n
d
in
g
a
u
th
o
r
u
p
o
n
r
ea
s
o
n
ab
le
r
eq
u
est.
RE
F
E
R
E
NC
E
S
[
1
]
E.
N
.
C
r
o
t
h
e
r
s,
N
.
Ja
p
k
o
w
i
c
z
,
a
n
d
H
.
L.
V
i
k
t
o
r
,
“
M
a
c
h
i
n
e
-
g
e
n
e
r
a
t
e
d
t
e
x
t
:
A
c
o
mp
r
e
h
e
n
s
i
v
e
su
r
v
e
y
o
f
t
h
r
e
a
t
m
o
d
e
l
s a
n
d
d
e
t
e
c
t
i
o
n
met
h
o
d
s,
”
I
EE
E
A
c
c
e
ss
,
v
o
l
.
1
1
,
p
p
.
7
0
9
7
7
–
7
1
0
0
2
,
2
0
2
3
,
d
o
i
:
1
0
.
1
1
0
9
/
A
C
C
ESS
.
2
0
2
3
.
3
2
9
4
0
9
0
.
[
2
]
Q
.
Zh
a
n
g
e
t
a
l
.
,
“
A
su
r
v
e
y
o
n
l
a
r
g
e
l
a
n
g
u
a
g
e
m
o
d
e
l
s
f
o
r
s
o
f
t
w
a
r
e
e
n
g
i
n
e
e
r
i
n
g
,
”
a
rX
i
v
p
r
e
p
r
i
n
t
a
rX
i
v
:
2
3
1
2
.
1
5
2
2
3
,
2
0
2
3
.
[
3
]
Y
.
Jern
i
t
e
e
t
a
l
.
,
“
D
a
t
a
g
o
v
e
r
n
a
n
c
e
i
n
t
h
e
a
g
e
o
f
l
a
r
g
e
-
sc
a
l
e
d
a
t
a
-
d
r
i
v
e
n
l
a
n
g
u
a
g
e
t
e
c
h
n
o
l
o
g
y
,
”
i
n
A
C
M
I
n
t
e
rn
a
t
i
o
n
a
l
C
o
n
f
e
r
e
n
c
e
Pro
c
e
e
d
i
n
g
S
e
r
i
e
s
,
2
0
2
2
,
p
p
.
2
2
0
6
–
2
2
2
2
,
d
o
i
:
1
0
.
1
1
4
5
/
3
5
3
1
1
4
6
.
3
5
3
4
6
3
7
.
[
4
]
S
.
B
a
r
k
e
,
E
.
A
.
G
o
n
z
a
l
e
z
,
S
.
R
.
K
a
s
i
b
a
t
l
a
,
T.
B
e
r
g
-
K
i
r
k
p
a
t
r
i
c
k
,
a
n
d
N
.
P
o
l
i
k
a
r
p
o
v
a
,
“
H
Y
S
Y
N
TH
:
C
o
n
t
e
x
t
-
f
r
e
e
LL
M
a
p
p
r
o
x
i
m
a
t
i
o
n
f
o
r
g
u
i
d
i
n
g
p
r
o
g
r
a
m
s
y
n
t
h
e
si
s
,
”
i
n
A
d
v
a
n
c
e
s
i
n
N
e
u
ra
l
I
n
f
o
rm
a
t
i
o
n
Pr
o
c
e
s
si
n
g
S
y
s
t
e
m
s
,
2
0
2
4
,
v
o
l
.
3
7
,
p
p
.
1
5
6
1
2
–
1
5
6
4
5
.
[
5
]
N
.
V
a
n
V
i
e
t
a
n
d
N
.
T.
V
i
n
h
,
“
L
a
r
g
e
l
a
n
g
u
a
g
e
mo
d
e
l
s i
n
s
o
f
t
w
a
r
e
e
n
g
i
n
e
e
r
i
n
g
,
”
J
o
u
r
n
a
l
o
f
Ed
u
c
a
t
i
o
n
F
o
r S
u
st
a
i
n
a
b
l
e
I
n
n
o
v
a
t
i
o
n
,
v
o
l
.
2
,
n
o
.
2
,
p
p
.
1
4
6
–
1
5
6
,
D
e
c
.
2
0
2
4
,
d
o
i
:
1
0
.
5
6
9
1
6
/
j
e
s
i
.
v
2
i
2
.
9
6
8
.
[
6
]
R
.
G
.
D
r
o
me
y
,
“
F
o
r
ma
l
i
z
i
n
g
t
h
e
t
r
a
n
s
i
t
i
o
n
f
r
o
m
r
e
q
u
i
r
e
me
n
t
s t
o
d
e
s
i
g
n
,
”
i
n
M
a
t
h
e
m
a
t
i
c
a
l
F
ra
m
e
w
o
rks
F
o
r
C
o
m
p
o
n
e
n
t
S
o
f
t
w
a
re:
Mo
d
e
l
s F
o
r
An
a
l
y
si
s
A
n
d
S
y
n
t
h
e
s
i
s
,
W
o
r
l
d
S
c
i
e
n
t
i
f
i
c
,
2
0
0
6
,
p
p
.
1
7
3
–
2
0
5
.
[
7
]
A
.
V
a
sw
a
n
i
e
t
a
l
.
,
“
A
t
t
e
n
t
i
o
n
i
s
a
l
l
y
o
u
n
e
e
d
,
”
A
d
v
a
n
c
e
s
i
n
n
e
u
r
a
l
I
n
f
o
rm
a
t
i
o
n
Pr
o
c
e
ssi
n
g
S
y
st
e
m
s
,
v
o
l
.
3
0
,
2
0
1
7
.
Evaluation Warning : The document was created with Spire.PDF for Python.
I
SS
N
:
2
0
8
8
-
8
7
0
8
I
n
t J E
lec
&
C
o
m
p
E
n
g
,
Vo
l.
1
6
,
No
.
1
,
Feb
r
u
ar
y
20
2
6
:
2
7
8
-
287
286
[
8
]
A
.
R
a
d
f
o
r
d
,
J.
W
u
,
R
.
C
h
i
l
d
,
D
.
L
u
a
n
,
D
.
A
mo
d
e
i
,
a
n
d
I
.
S
u
t
s
k
e
v
e
r
,
“
L
a
n
g
u
a
g
e
m
o
d
e
l
s
a
r
e
u
n
s
u
p
e
r
v
i
se
d
m
u
l
t
i
t
a
s
k
l
e
a
r
n
e
r
s,”
O
p
e
n
AI
b
l
o
g
,
v
o
l
.
1
,
n
o
.
8
,
2
0
1
9
.
[
9
]
C
.
R
a
f
f
e
l
e
t
a
l
.
,
“
E
x
p
l
o
r
i
n
g
t
h
e
l
i
mi
t
s
o
f
t
r
a
n
sf
e
r
l
e
a
r
n
i
n
g
w
i
t
h
a
u
n
i
f
i
e
d
t
e
x
t
-
to
-
t
e
x
t
t
r
a
n
sf
o
r
m
e
r
,
”
J
o
u
r
n
a
l
o
f
M
a
c
h
i
n
e
L
e
a
rn
i
n
g
Re
se
a
rc
h
,
v
o
l
.
2
1
,
n
o
.
1
,
p
p
.
5
4
8
5
–
5
5
5
1
,
2
0
2
0
.
[
1
0
]
Y
.
W
a
n
g
,
W
.
W
a
n
g
,
S
.
J
o
t
y
,
a
n
d
S
.
C
.
H
.
H
o
i
,
“
C
o
d
e
t
5
:
I
d
e
n
t
i
f
i
e
r
-
a
w
a
r
e
u
n
i
f
i
e
d
p
r
e
-
t
r
a
i
n
e
d
e
n
c
o
d
e
r
-
d
e
c
o
d
e
r
mo
d
e
l
s
f
o
r
c
o
d
e
u
n
d
e
r
s
t
a
n
d
i
n
g
a
n
d
g
e
n
e
r
a
t
i
o
n
,
”
i
n
E
MN
L
P
2
0
2
1
-
2
0
2
1
C
o
n
f
e
re
n
c
e
o
n
Em
p
i
r
i
c
a
l
M
e
t
h
o
d
s
i
n
N
a
t
u
r
a
l
L
a
n
g
u
a
g
e
Pr
o
c
e
ssi
n
g
,
Pro
c
e
e
d
i
n
g
s
,
2
0
2
1
,
p
p
.
8
6
9
6
–
8
7
0
8
,
d
o
i
:
1
0
.
1
8
6
5
3
/
v
1
/
2
0
2
1
.
e
mn
l
p
-
mai
n
.
6
8
5
.
[
1
1
]
R
.
L
i
a
n
d
o
t
h
e
r
s,
“
S
t
a
r
C
o
d
e
r
:
M
a
y
t
h
e
so
u
r
c
e
b
e
w
i
t
h
y
o
u
!
,
”
a
r
Xi
v
p
re
p
ri
n
t
a
rXi
v
:
2
3
0
5
.
0
6
1
6
1
,
2
0
2
3
.
[
1
2
]
D
.
P
a
t
t
e
r
s
o
n
e
t
a
l
.
,
“
C
a
r
b
o
n
e
m
i
ssi
o
n
s
a
n
d
l
a
r
g
e
n
e
u
r
a
l
n
e
t
w
o
r
k
t
r
a
i
n
i
n
g
,
”
a
rXi
v
p
r
e
p
r
i
n
t
a
rX
i
v
:
2
1
0
4
.
1
0
3
5
0
,
2
0
2
1
.
[
1
3
]
P
.
Zh
a
n
g
,
G
.
Ze
n
g
,
T
.
W
a
n
g
,
a
n
d
W
.
Lu
,
“
Ti
n
y
L
l
a
ma
:
A
n
o
p
e
n
-
so
u
r
c
e
sma
l
l
l
a
n
g
u
a
g
e
mo
d
e
l
,
”
a
r
Xi
v
p
r
e
p
r
i
n
t
a
r
Xi
v
:
2
4
0
1
.
0
2
3
8
5
.
2
0
2
4
.
[
1
4
]
H
.
W
e
i
e
t
a
l
.
,
“
S
ma
l
l
l
a
n
g
u
a
g
e
m
o
d
e
l
mee
t
s
w
i
t
h
r
e
i
n
f
o
r
c
e
d
v
i
s
i
o
n
v
o
c
a
b
u
l
a
r
y
,
”
a
r
Xi
v
p
r
e
p
ri
n
t
a
r
Xi
v
:
2
4
0
1
.
1
2
5
0
3
.
2
0
2
4
.
[
1
5
]
M
.
A
b
d
i
n
,
S
.
A
.
J
a
c
o
b
s,
Y
.
Y
a
n
g
,
a
n
d
o
t
h
e
r
s
,
“
P
h
i
-
3
t
e
c
h
n
i
c
a
l
r
e
p
o
r
t
:
A
h
i
g
h
l
y
c
a
p
a
b
l
e
l
a
n
g
u
a
g
e
mo
d
e
l
l
o
c
a
l
l
y
o
n
y
o
u
r
p
h
o
n
e
,
”
a
rXi
v
p
re
p
r
i
n
t
a
rXi
v
:
2
4
1
2
.
0
8
9
0
5
.
2
0
2
4
.
[
1
6
]
E.
H
u
e
t
a
l
.
,
“
Lo
r
a
:
Lo
w
-
r
a
n
k
a
d
a
p
t
a
t
i
o
n
o
f
l
a
r
g
e
l
a
n
g
u
a
g
e
mo
d
e
l
s,
”
I
C
L
R
2
0
2
2
-
1
0
t
h
I
n
t
e
rn
a
t
i
o
n
a
l
C
o
n
f
e
r
e
n
c
e
o
n
L
e
a
r
n
i
n
g
Re
p
r
e
se
n
t
a
t
i
o
n
s
,
v
o
l
.
1
,
n
o
.
2
,
p
.
3
,
2
0
2
2
.
[
1
7
]
T.
D
e
t
t
m
e
r
s,
A
.
P
a
g
n
o
n
i
,
A
.
H
o
l
t
z
ma
n
,
a
n
d
L
.
Z
e
t
t
l
e
mo
y
e
r
,
“
Q
LO
R
A
:
Ef
f
i
c
i
e
n
t
f
i
n
e
t
u
n
i
n
g
o
f
q
u
a
n
t
i
z
e
d
L
LM
s
,
”
i
n
A
d
v
a
n
c
e
s
i
n
N
e
u
ra
l
I
n
f
o
rm
a
t
i
o
n
Pr
o
c
e
ssi
n
g
S
y
s
t
e
m
s
,
2
0
2
3
,
v
o
l
.
3
6
,
p
p
.
1
0
0
8
8
–
1
0
1
1
5
.
[
1
8
]
H
.
To
u
v
r
o
n
a
n
d
o
t
h
e
r
s,
“
LLa
M
A
:
O
p
e
n
a
n
d
e
f
f
i
c
i
e
n
t
f
o
u
n
d
a
t
i
o
n
l
a
n
g
u
a
g
e
mo
d
e
l
s
,
”
a
rX
i
v
p
r
e
p
ri
n
t
a
rX
i
v
:
2
3
0
2
.
1
3
9
7
1
,
2
0
2
3
.
[
1
9
]
T.
Li
e
b
e
r
u
m
e
t
a
l
.
,
“
G
e
mm
a
S
c
o
p
e
:
O
p
e
n
sp
a
r
se
a
u
t
o
e
n
c
o
d
e
r
s
e
v
e
r
y
w
h
e
r
e
a
l
l
a
t
o
n
c
e
o
n
G
e
mm
a
2
,
”
i
n
Bl
a
c
k
b
o
x
N
L
P
2
0
2
4
-
7
t
h
Bl
a
c
k
b
o
x
N
L
P
W
o
r
k
sh
o
p
:
A
n
a
l
y
z
i
n
g
a
n
d
I
n
t
e
r
p
re
t
i
n
g
N
e
u
r
a
l
N
e
t
w
o
r
k
s
f
o
r
N
L
P
-
Pr
o
c
e
e
d
i
n
g
s
o
f
t
h
e
Wo
r
k
sh
o
p
,
2
0
2
4
,
p
p
.
2
7
8
–
3
0
0
,
d
o
i
:
1
0
.
1
8
6
5
3
/
v
1
/
2
0
2
4
.
b
l
a
c
k
b
o
x
n
l
p
-
1
.
1
9
.
[
2
0
]
I
.
A
h
m
e
d
e
t
a
l
.
,
“
Q
w
e
n
2
.
5
:
A
c
o
m
p
r
e
h
e
n
s
i
v
e
r
e
v
i
e
w
o
f
t
h
e
l
e
a
d
i
n
g
r
e
s
o
u
r
c
e
-
e
f
f
i
c
i
e
n
t
L
L
M
w
i
t
h
p
o
t
e
n
t
i
a
l
t
o
s
u
r
p
a
s
s
a
l
l
c
o
m
p
e
t
i
t
o
r
s
,
”
A
u
t
h
o
r
e
a
P
r
e
p
r
i
n
t
s
.
2
0
2
5
,
[
O
n
l
i
n
e
]
.
A
v
a
i
l
a
b
l
e
:
h
t
t
p
s
:
/
/
w
w
w
.
t
e
c
h
r
x
i
v
.
o
r
g
/
d
o
i
/
p
d
f
/
1
0
.
3
6
2
2
7
/
t
e
c
h
r
x
i
v
.
1
7
4
0
6
0
3
0
6
.
6
5
7
3
8
4
0
6
/
v
1
.
[
2
1
]
S
.
C
h
a
u
d
h
a
r
y
,
“
C
o
d
e
a
l
p
a
c
a
:
A
n
i
n
s
t
r
u
c
t
i
o
n
-
f
o
l
l
o
w
i
n
g
l
l
a
ma
mo
d
e
l
f
o
r
c
o
d
e
g
e
n
e
r
a
t
i
o
n
,
”
G
i
t
H
u
b
re
p
o
si
t
o
r
y
.
2
0
2
3
.
[
2
2
]
Z.
F
e
n
g
e
t
a
l
.
,
“
C
o
d
e
B
ER
T:
A
p
r
e
-
t
r
a
i
n
e
d
mo
d
e
l
f
o
r
p
r
o
g
r
a
mm
i
n
g
a
n
d
n
a
t
u
r
a
l
l
a
n
g
u
a
g
e
s
,
”
i
n
Fi
n
d
i
n
g
s
o
f
t
h
e
Ass
o
c
i
a
t
i
o
n
f
o
r
C
o
m
p
u
t
a
t
i
o
n
a
l
L
i
n
g
u
i
s
t
i
c
s F
i
n
d
i
n
g
s
o
f
A
C
L
:
EM
N
L
P
2
0
2
0
,
2
0
2
0
,
p
p
.
1
5
3
6
–
1
5
4
7
,
d
o
i
:
1
0
.
1
8
6
5
3
/
v
1
/
2
0
2
0
.
f
i
n
d
i
n
g
s
-
e
mn
l
p
.
1
3
9
.
[
2
3
]
U
.
A
l
o
n
,
O
.
Le
v
y
,
S
.
B
r
o
d
y
,
a
n
d
E.
Y
a
h
a
v
,
“
C
o
d
e
2
S
e
q
:
G
e
n
e
r
a
t
i
n
g
s
e
q
u
e
n
c
e
s
f
r
o
m
st
r
u
c
t
u
r
e
d
r
e
p
r
e
s
e
n
t
a
t
i
o
n
s
o
f
c
o
d
e
,
”
a
rXi
v
p
re
p
ri
n
t
a
rXi
v
:
1
8
0
8
.
0
1
4
0
0
,
2
0
1
9
.
[
2
4
]
N
.
H
.
K
h
a
n
h
,
V
.
N
.
V
a
n
,
N
.
T.
V
i
n
h
,
a
n
d
N
.
H
.
C
o
n
g
,
“
P
h
i
-
3
m
e
e
t
s
l
a
w
:
F
i
n
e
t
u
n
i
n
g
m
i
n
i
l
a
n
g
u
a
g
e
m
o
d
e
l
s
f
o
r
l
e
g
a
l
d
o
c
u
me
n
t
u
n
d
e
r
s
t
a
n
d
i
n
g
,
”
R
e
se
a
rc
h
,
D
e
v
e
l
o
p
m
e
n
t
a
n
d
A
p
p
l
i
c
a
t
i
o
n
o
n
I
n
f
o
rm
a
t
i
o
n
a
n
d
C
o
m
m
u
n
i
c
a
t
i
o
n
T
e
c
h
n
o
l
o
g
y
|
I
S
S
N
:
1
8
5
9
-
3
5
2
6
,
v
o
l
.
2
0
2
4
,
n
o
.
3
,
p
p
.
1
3
6
–
1
4
2
,
2
0
2
4
.
[
2
5
]
Z.
L
u
o
e
t
a
l
.
,
“
W
i
z
a
r
d
c
o
d
e
r
:
Em
p
o
w
e
r
i
n
g
c
o
d
e
l
a
r
g
e
l
a
n
g
u
a
g
e
m
o
d
e
l
s
w
i
t
h
e
v
o
l
-
i
n
s
t
r
u
c
t
,
”
a
rX
i
v
p
r
e
p
r
i
n
t
a
rXi
v
:
2
3
0
6
.
0
8
5
6
8
,
2
0
2
4
.
[
2
6
]
Y
.
Z
h
u
,
M
.
Zh
u
,
N
.
Li
u
,
Z.
X
u
,
a
n
d
Y
.
P
e
n
g
,
“
Ll
a
v
a
-
p
h
i
:
Ef
f
i
c
i
e
n
t
m
u
l
t
i
-
mo
d
a
l
a
ss
i
st
a
n
t
w
i
t
h
sm
a
l
l
l
a
n
g
u
a
g
e
mo
d
e
l
,
”
i
n
E
MC
L
R
2
0
2
4
-
Pro
c
e
e
d
i
n
g
s
o
f
t
h
e
1
s
t
I
n
t
e
r
n
a
t
i
o
n
a
l
Wo
rks
h
o
p
o
n
E
f
f
i
c
i
e
n
t
Mu
l
t
i
m
e
d
i
a
C
o
m
p
u
t
i
n
g
u
n
d
e
r
L
i
m
i
t
e
d
R
e
so
u
r
c
e
s,
C
o
-
L
o
c
a
t
e
d
w
i
t
h
:
MM
2
0
2
4
,
2
0
2
4
,
p
p
.
1
8
–
2
2
,
d
o
i
:
1
0
.
1
1
4
5
/
3
6
8
8
8
6
3
.
3
6
8
9
5
7
5
.
[
2
7
]
Z.
Li
u
,
J.
L
y
n
,
W
.
Zh
u
,
a
n
d
X
.
T
i
a
n
,
“
A
Lo
R
A
:
A
l
l
o
c
a
t
i
n
g
l
o
w
-
r
a
n
k
a
d
a
p
t
a
t
i
o
n
f
o
r
f
i
n
e
-
t
u
n
i
n
g
l
a
r
g
e
l
a
n
g
u
a
g
e
m
o
d
e
l
s
,
”
i
n
Pro
c
e
e
d
i
n
g
s
o
f
t
h
e
2
0
2
4
C
o
n
f
e
re
n
c
e
o
f
t
h
e
N
o
r
t
h
Am
e
r
i
c
a
n
C
h
a
p
t
e
r
o
f
t
h
e
Ass
o
c
i
a
t
i
o
n
f
o
r
C
o
m
p
u
t
a
t
i
o
n
a
l
L
i
n
g
u
i
st
i
c
s:
H
u
m
a
n
L
a
n
g
u
a
g
e
T
e
c
h
n
o
l
o
g
i
e
s
,
2
0
2
4
,
p
p
.
6
2
2
–
6
4
1
.
B
I
O
G
RAP
H
I
E
S O
F
AUTH
O
RS
Va
n
-
Vie
t
Ng
u
y
e
n
is
a
re
se
a
rc
h
e
r
a
n
d
P
h
.
D.
stu
d
e
n
t
a
t
th
e
Th
a
i
Ng
u
y
e
n
Un
iv
e
rsity
o
f
I
n
fo
rm
a
ti
o
n
a
n
d
Co
m
m
u
n
ica
ti
o
n
Tec
h
n
o
lo
g
y
,
T
h
a
i
Ng
u
y
e
n
,
Vie
t
n
a
m
.
He
re
c
e
iv
e
d
a
b
a
c
h
e
lo
r’s
in
in
f
o
rm
a
ti
o
n
tec
h
n
o
lo
g
y
a
t
T
h
a
i
Ng
u
y
e
n
Un
iv
e
rsity
(ICTU),
Vie
t
n
a
m
in
2
0
0
9
.
He
g
o
t
a
m
a
ste
r’s
d
e
g
re
e
o
n
I
n
fo
rm
a
ti
o
n
Tec
h
n
o
lo
g
y
a
t
M
a
n
u
e
l
S
.
En
v
e
rg
a
Un
iv
e
rsity
,
P
h
il
i
p
p
i
n
e
s
in
2
0
1
2
.
He
re
se
a
rc
h
e
s
in
tere
sts
in
c
lu
d
e
a
r
ti
ficia
l
in
tell
ig
e
n
c
e
,
m
a
c
h
i
n
e
lea
rn
in
g
,
a
n
d
g
e
n
e
ra
ti
v
e
AI.
He
c
a
n
b
e
c
o
n
tac
ted
a
t
e
m
a
il
:
n
v
v
iet@
ictu
.
e
d
u
.
v
n
.
Th
e
-
Vin
h
Ng
u
y
e
n
is
c
u
rre
n
tl
y
a
se
n
io
r
lec
t
u
re
r
a
t
th
e
F
a
c
u
l
ty
o
f
I
n
fo
rm
a
ti
o
n
Tec
h
n
o
l
o
g
y
,
Un
i
v
e
rsity
o
f
In
f
o
r
m
a
ti
o
n
a
n
d
C
o
m
m
u
n
ica
ti
o
n
Tec
h
n
o
l
o
g
y
.
He
g
ra
d
u
a
ted
with
a
m
a
ste
r’s
d
e
g
re
e
in
i
n
fo
rm
a
ti
o
n
s
y
ste
m
s
m
a
n
a
g
e
m
e
n
t
fro
m
Ok
la
h
o
m
a
S
tate
U
n
iv
e
rsit
y
,
USA
(u
n
d
e
r
sc
h
o
lars
h
i
p
3
2
2
)
.
He
c
o
m
p
lete
d
h
is
P
h
.
D
.
p
r
o
g
ra
m
u
n
d
e
r
P
ro
jec
t
9
1
1
i
n
2
0
2
0
a
t
Tex
a
s
Tec
h
Un
i
v
e
rsity
,
USA.
His
m
a
in
re
se
a
rc
h
in
tere
sts
a
re
c
o
m
p
u
ter
v
isi
o
n
,
c
o
m
p
u
ter
v
isu
a
li
z
a
ti
o
n
,
a
n
d
c
o
m
p
u
ter
i
n
h
u
m
a
n
b
e
h
a
v
io
r.
He
h
a
s
a
u
th
o
re
d
o
r
c
o
a
u
t
h
o
re
d
m
o
re
th
a
n
5
0
p
u
b
li
c
a
ti
o
n
s
with
1
6
H
-
i
n
d
e
x
a
n
d
m
o
re
t
h
a
n
8
5
0
c
it
a
ti
o
n
s.
He
c
a
n
b
e
c
o
n
tac
ted
a
t
e
m
a
il
:
v
in
h
n
t@ict
u
.
e
d
u
.
v
n
Evaluation Warning : The document was created with Spire.PDF for Python.
I
n
t J E
lec
&
C
o
m
p
E
n
g
I
SS
N:
2088
-
8
7
0
8
P
a
r
a
mete
r
-
efficien
t fin
e
-
tu
n
in
g
o
f sma
ll la
n
g
u
a
g
e
m
o
d
els fo
r
…
(
V
a
n
-
V
iet
N
g
u
ye
n
)
287
H
u
u
-
K
h
a
n
h
N
g
u
y
e
n
h
a
s
g
ra
d
u
a
ted
wit
h
a
m
a
ste
r’s
d
e
g
re
e
in
c
o
m
p
u
ter
sc
ien
c
e
fro
m
th
e
Un
i
v
e
rsity
o
f
In
f
o
r
m
a
ti
o
n
a
n
d
C
o
m
m
u
n
ica
ti
o
n
s
Tec
h
n
o
l
o
g
y
-
Th
a
i
N
g
u
y
e
n
Un
iv
e
rsity
si
n
c
e
2
0
2
2
a
n
d
is
c
u
rre
n
tl
y
a
P
h
D
stu
d
e
n
t
h
e
re
sin
c
e
2
0
2
3
.
His
m
a
in
re
se
a
rc
h
in
tere
sts
a
re
c
o
m
p
u
ter
sc
ien
c
e
,
n
a
tu
ra
l
lan
g
u
a
g
e
p
ro
c
e
ss
in
g
,
g
e
n
e
ra
ti
v
e
AI
a
n
d
c
o
m
p
u
ter
v
isio
n
.
He
c
a
n
b
e
c
o
n
tac
ted
a
t
e
m
a
il
:
k
h
a
n
h
n
h
@t
n
u
.
e
d
u
.
v
n
Duc
-
Q
u
a
n
g
Vu
wa
s
b
o
rn
in
Na
m
Din
h
,
Vie
tn
a
m
in
1
9
9
1
.
H
e
re
c
e
iv
e
d
a
B.
S
.
d
e
g
re
e
in
e
d
u
c
a
ti
o
n
in
i
n
fo
rm
a
ti
o
n
tec
h
n
o
lo
g
y
fro
m
t
h
e
Th
a
i
Ng
u
y
e
n
Un
iv
e
rsity
o
f
Ed
u
c
a
ti
o
n
,
Vie
tn
a
m
,
in
2
0
1
3
a
n
d
a
n
M
.
S
.
d
e
g
re
e
i
n
i
n
fo
rm
a
ti
o
n
sy
ste
m
s
,
fro
m
t
h
e
Un
i
v
e
rsity
o
f
En
g
i
n
e
e
rin
g
a
n
d
Tec
h
n
o
l
o
g
y
(U
ET
),
Vie
tn
a
m
Na
ti
o
n
a
l
Un
i
v
e
rsit
y
,
Ha
n
o
i
(VN
U)
in
2
0
1
6
.
He
re
c
e
iv
e
d
th
e
P
h
.
D.
d
e
g
re
e
in
th
e
De
p
a
rtme
n
t
o
f
Co
m
p
u
ter
S
c
ien
c
e
a
n
d
I
n
fo
rm
a
ti
o
n
En
g
i
n
e
e
rin
g
,
Na
ti
o
n
a
l
Ce
n
tral
Un
iv
e
rsity
,
Taiwa
n
i
n
2
0
2
2
a
n
d
a
p
o
std
o
c
in
2
0
2
3
.
His
re
se
a
rc
h
in
tere
sts
in
c
lu
d
e
m
a
c
h
in
e
lea
rn
i
n
g
,
d
e
e
p
lea
rn
in
g
,
c
o
m
p
u
ter
v
is
io
n
,
s
p
e
e
c
h
p
r
o
c
e
ss
in
g
,
a
n
d
b
io
i
n
fo
rm
a
ti
c
s.
He
c
a
n
b
e
c
o
n
tac
t
e
d
a
t
e
m
a
il
:
v
d
q
u
a
n
g
@ic
t
u
.
e
d
u
.
v
n
.
Evaluation Warning : The document was created with Spire.PDF for Python.