International Journal of Electrical and Computer Engineering (IJECE)
Vol. 6, No. 5, October 2016, pp. 2150~2157
ISSN: 2088-8708, DOI: 10.11591/ijece.v6i5.10826
Journal homepage: http://iaesjournal.com/online/index.php/IJECE
Adaptive Speech Compression Based on Discrete Wave Atoms Transform

Bousselmi Souha1, Aloui Nouredine2, Cherif Adnane1
1 Department of Physics, Faculty of Sciences of Tunis, Farhat Hached University, Tunisia
2 Centre for Research on Microelectronics & Nanotechnology, Sousse Technology Park, Tunisia
Article Info
Article history: Received Mar 21, 2016; Revised May 26, 2016; Accepted Jun 14, 2016
Keyword: Adaptive thresholding, DWAT, DWT, Speech compression

ABSTRACT
This paper proposes a new adaptive speech compression system based on the discrete wave atoms transform. First, the signal is decomposed on wave atoms; then the wave atom coefficients are truncated using a new adaptive thresholding scheme that depends on an estimate of the SNR. The thresholded coefficients are quantized using a Lloyd-Max scalar quantizer and then encoded using zero run-length encoding followed by Huffman coding. Numerous simulations are performed to assess the robustness of our approach. The results are compared with wavelet-based compression using objective criteria, namely CR, SNR, PSNR and NRMSE. This study shows that the wave atoms transform is more appropriate than the wavelet transform, since it offers a higher compression ratio and better speech quality.

Copyright © 2016 Institute of Advanced Engineering and Science. All rights reserved.
Corresponding Author:
Bousselmi Souha,
Department of Physics, Faculty of Sciences of Tunis, Farhat Hached University,
El Manar, PB 2092, Belvedere, Tunisia.
Email: bousselmi.souha2008@gmail.com
1. INTRODUCTION
Progress in speech compression is mainly driven by the need for fast and efficient techniques for data storage and transmission. The purpose of any compression technique is to represent the speech signal using few bits while preserving the quality of the reconstructed signal.
Speech compression is based on reducing the redundancy between samples; it has become ubiquitous in many applications such as mobile telephony and voice over IP. In the literature, speech compression algorithms are split into two main categories: lossless compression and lossy compression. The first provides an exact reconstruction of the original signal; however, it cannot achieve low data rates. Run-length encoding and Huffman coding are the best-known algorithms in this category. The second cannot provide an exact reconstruction, but it achieves high compression ratios [1]. In general, existing speech compression algorithms combine both in order to increase the compression ratio as much as possible. There are different techniques of audio compression, namely direct speech compression, parameter extraction and transformation methods. Direct compression consists of extracting the significant information in the temporal domain to approximate the original signal [2]. Parameter extraction methods extract the parameters of the signal, as in linear predictive compression [3]. Transform compression techniques (e.g., discrete cosine transforms [4], wavelet transforms [5]) convert the signal from the time domain to another, more parsimonious domain. Among them, the wavelet transform is the most popular, since it has been used in many signal processing applications [6]-[9]. Many comparative studies have shown that the wavelet transform outperforms the DCT (Discrete Cosine Transform), which is used by the MPEG standard, the FFT (Fast Fourier Transform) [4] and LPC (Linear Predictive Coding) [10]. In recent years, a new multi-scale transform called wave atoms has emerged; it has been used in many studies in the field of image processing [11]-[14]. This transform is well suited to representing image data, due to its directionality and sparsity compared with the discrete wavelet transform. Sparsity is the important criterion in speech compression, in contrast to directionality, which is relevant only for 2D or higher-dimensional signals [15].
In this context, we propose a new compression system to explore the use of wave atoms in the field of speech compression; we also propose an adaptive threshold, which depends on the SNR estimation, to preserve the quality of the reconstructed signal.
This article is structured as follows. The next section describes the discrete wave atoms transform. Section 3 gives more detail about the proposed speech compression system. Then, simulation results are presented in Section 4. Finally, we conclude this work in Section 5.
2. DISCRETE WAVE ATOMS TRANSFORM
In [10], a signal is considered to follow an oscillatory model when it can be described by the function below:

$$f(x) = \sin(N g(x))\, h(x) \tag{1}$$

where $x$ is the coordinate, $g$ and $h$ are $C^\infty$ scalar functions, $h$ has compact support in $[0,1]^2$, and $N$ is a large constant.
A Fourier series decomposes a function that has finite duration or is periodic into a sum of oscillating functions, namely sines and cosines. In the Fourier transform, sparsity is lost at discontinuities, an effect known as the Gibbs phenomenon: a large number of coefficients is needed to reconstruct a discontinuity with minimal loss of accuracy. To obtain a sparse representation of the signal $f$, wave atoms were proposed by Demanet and Ying in [16], [17].
Theorem: Let $f$ be of the form (1). Assume $g$ has no critical points. Then $f$ can be represented to accuracy $\epsilon$ in $L^2$ by the largest $C_\epsilon N$ wave atoms coefficients in absolute value, where for all $M > 0$ there exists $C_M > 0$ such that $C_\epsilon \le C_M \epsilon^{-1/M}$.
This theorem means that the wave atoms transform needs $O(N)$ coefficients to represent an oscillatory function. For the same accuracy, we would need $O(N^{3/2})$ curvelet coefficients or $O(N^2)$ wavelet coefficients.
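To give a sense of scale, the following back-of-the-envelope comparison sets the unspecified constants in the $O(\cdot)$ bounds to 1 and picks $N = 100$. Both choices are illustrative assumptions, not values taken from the theorem.

```latex
N = 10^{2}: \qquad
\underbrace{N = 10^{2}}_{\text{wave atoms}}, \qquad
\underbrace{N^{3/2} = 10^{3}}_{\text{curvelets}}, \qquad
\underbrace{N^{2} = 10^{4}}_{\text{wavelets}}
```

Under these assumptions the gap grows with $N$: each tenfold increase in $N$ multiplies the three counts by 10, about 32 and 100 respectively.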
Although this result is stated for signals of two or more dimensions, it is easy to revert to the one-dimensional case, in which the wrapping description and directionality of wave atoms are not considered but the sparsity is preserved. Our study rests on the assumption that the speech signal obeys this model. We therefore expect that the wave atoms transform can represent a speech signal more sparsely and improve the compression factor while preserving quality upon reconstruction.
Wave atoms are a variant of wavelet packets; they have a sharp frequency localization that cannot be achieved using a filter bank based on wavelet packets, curvelets or Gabor atoms. Wave atoms interpolate exactly between Gabor atoms and directional wavelets. The parameter α represents the multi-scale nature of the transform, from 0 (uniform) to 1 (dyadic). The parameter β measures the directional selectivity of the wave packets. Figure 1 shows the various transforms.
Figure 1. Identification of various transforms as (α, β) families of wave packets [16]
The one-dimensional family of wave atoms is denoted $\varphi_\mu(x)$, with subscript $\mu = (j, m, n)$. The indexed point $(x_\mu, \omega_\mu)$ in phase space is defined as follows.
$$x_\mu = 2^{-j} n, \qquad \omega_\mu = \pi 2^{j} m, \qquad C_1 2^{j} \le \max_i |m_i| \le C_2 2^{j} \tag{2}$$
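The elements of the frame $\varphi_\mu(x)$ are called wave atoms when:

$$|\hat{\varphi}_\mu(\omega)| \le C_M 2^{-j} \left(1 + 2^{-j}|\omega - \omega_\mu|\right)^{-M} + C_M 2^{-j} \left(1 + 2^{-j}|\omega + \omega_\mu|\right)^{-M}$$

$$|\varphi_\mu(x)| \le C_M 2^{j} \left(1 + 2^{j}|x - x_\mu|\right)^{-M}, \qquad \text{for all } M > 0 \tag{3}$$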
In practice, wave atoms are constructed from tensor products of a particular family of wavelet packets that satisfies the parabolic scaling of the wavelength; this is achieved using a decomposition architecture similar to that of a complete wavelet packet tree, as shown in Figure 2 [16].
Figure 2. Strategy of wave atoms and corresponding set of subbands [16]
Figure 3 shows the space and frequency domain forms of one-dimensional wave atoms at increasing scales.
Figure 3. One-dimensional wave atoms in the space and frequency domains at increasing scales (panels: j = 3, m = 3, spatial domain; j = 3, m = 3, frequency domain; j = 4, m = 5, spatial domain; j = 4, m = 5, frequency domain)
3. ADAPTIVE SPEECH COMPRESSION USING WAVE ATOMS TRANSFORM
The block diagram of the proposed compression system is shown in Figure 4. The steps of the system are explained in the following paragraphs.
Figure 4. Block diagram of adaptive speech compression using DWAT
3.1. Discrete wave atoms transform
The first step of our approach consists in decomposing the speech signal using the DWAT. This transform converts the temporal representation of a signal into a time-frequency representation. This change of domain reduces redundancy and decorrelates the signal's samples, and thus decreases the transmission bit rate. Wave atoms concentrate the speech information into a few coefficients, as shown in Figure 5 [3]. Therefore, after applying the wave atoms transform to a signal, many coefficients are either zero or have negligible magnitudes.
Figure 5. Normalized wave atoms coefficients of speech signals
3.2. New adaptive thresholding
Thresholding is the most important step in transform-based compression; it consists of discarding the DWAT coefficients whose magnitude is below a given threshold. There are different thresholding methods, of which hard thresholding and soft thresholding are the most commonly used. In this work, we use the hard threshold given by the following equation:
$$C_{Re} = \begin{cases} C_{Re}, & \text{if } |C_{Re}| > T \\ 0, & \text{otherwise} \end{cases} \tag{4}$$
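As a minimal illustration of the hard threshold rule (4), the sketch below applies it to a plain list of coefficients. The function name `hard_threshold` and the sample values are hypothetical, and the strict comparison follows the `> T` test that appears in the flow chart of Figure 6.

```python
def hard_threshold(coeffs, t):
    """Keep coefficients whose magnitude exceeds t; set the others to zero."""
    return [c if abs(c) > t else 0.0 for c in coeffs]


# Illustrative values only (not taken from the paper).
coeffs = [0.9, -0.05, 0.3, 0.01, -0.6]
kept = hard_threshold(coeffs, 0.1)
print(kept)
```

Unlike soft thresholding, the surviving coefficients are left unchanged, so only the discarded ones contribute to the reconstruction error.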
The choice of the threshold T is very delicate; a bad choice can degrade the signal after reconstruction. No single threshold suits all signals, because speech signals are so diverse. Thus, we introduce a new adaptive thresholding process, which adjusts the threshold according to the desired speech quality. The flow chart of the adaptive threshold process is shown in Figure 6.
[Figure 6 flow chart, box labels as extracted: Start; Coefficients of DWAT; Ci = 0; Ci = Ci + 1; Sort(abs(Cbth)); T = Cbth(Ci); Cath = Cbth .* (abs(Cbth) > T); SNR(Cbth, Cath) > 19 (YES / NO); END]
Figure 6. Flow chart of adaptive thresholding
Ci: Retained coefficients.
Cbth: Coefficients before thresholding.
Cath: Coefficients after thresholding.
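The adaptive search can be sketched as follows. This is one possible reading of Figure 6, not the authors' MATLAB code: it assumes the candidate thresholds are the sorted coefficient magnitudes, tried in ascending order, that the SNR is measured between the coefficient vectors before and after thresholding, and that the search keeps the last threshold whose SNR stays above the 19 dB target shown in the flow chart. The names `snr_db` and `adaptive_threshold` are hypothetical.

```python
import math


def snr_db(reference, approx):
    """SNR in dB between a reference vector and its approximation."""
    signal = sum(r * r for r in reference)
    noise = sum((r - a) ** 2 for r, a in zip(reference, approx))
    if noise == 0.0:
        return math.inf
    return 10.0 * math.log10(signal / noise)


def adaptive_threshold(cbth, target_db=19.0):
    """Return (T, Cath) for the largest tested T that keeps SNR above target_db."""
    candidates = sorted(abs(c) for c in cbth)  # ascending magnitudes
    best_t, best_cath = 0.0, list(cbth)
    for t in candidates:
        cath = [c if abs(c) > t else 0.0 for c in cbth]
        if snr_db(cbth, cath) > target_db:
            best_t, best_cath = t, cath
        else:
            break  # SNR can only drop as T grows, so stop here
    return best_t, best_cath


# Illustrative coefficients: two dominant values and four small ones.
t, cath = adaptive_threshold([10.0, -8.0, 0.5, -0.2, 0.1, 0.05])
print(t, cath)
```

Because raising the threshold can only remove more coefficients, the SNR is non-increasing along the sorted candidates, which is why stopping at the first failure is sufficient in this sketch.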
3.3. Lloyd-Max scalar quantization
After thresholding, a quantization process is performed. It approximates the retained DWAT coefficients with a finite set of values. There are two methods of quantization: scalar quantization and vector quantization. In general, quantization distorts the signal; this distortion can be minimized by using the Lloyd-Max scalar quantizer.
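The paper does not give the quantizer design details. As a minimal sketch of the Lloyd-Max idea on empirical data, assuming a training set of coefficients and a chosen number of levels, the reconstruction levels can be refined by alternating nearest-level assignment and centroid updates. The function names `lloyd_max` and `quantize`, the uniform initialization and the iteration cap are illustrative choices.

```python
def lloyd_max(samples, levels, iterations=50):
    """Design reconstruction levels by alternating assignment and centroid steps."""
    lo, hi = min(samples), max(samples)
    # Start from uniformly spaced levels (an arbitrary initialization).
    codebook = [lo + (hi - lo) * (k + 0.5) / levels for k in range(levels)]
    for _ in range(iterations):
        # Decision boundaries are midpoints between neighbouring levels.
        bounds = [(a + b) / 2.0 for a, b in zip(codebook, codebook[1:])]
        cells = [[] for _ in range(levels)]
        for s in samples:
            cells[sum(s > b for b in bounds)].append(s)
        # Each level moves to the mean of its cell; empty cells keep their level.
        updated = [sum(c) / len(c) if c else q for c, q in zip(cells, codebook)]
        if updated == codebook:
            break
        codebook = updated
    return codebook


def quantize(samples, codebook):
    """Map each sample to the index of its nearest reconstruction level."""
    return [min(range(len(codebook)), key=lambda k: abs(s - codebook[k]))
            for s in samples]


# Illustrative data: two clusters around -1 and +1.
data = [-1.0, -0.9, -1.1, 1.0, 0.9, 1.1]
book = lloyd_max(data, levels=2)
print(book, quantize(data, book))
```

The indices returned by `quantize` are what a later entropy coder would see; the decoder only needs the codebook to map them back to values.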
3.4. Encoding
To achieve the speech compression, we encode the quantized coefficients using a particular run-length encoding suited to our coefficient vector. This type of encoding codes only the runs of zeros, using two bytes: the first byte indicates the start of a sequence of zeros and the second gives the number of zeros [18]. This step is followed by Huffman coding in order to eliminate any redundancy caused by quantization. To reconstruct the speech signal, we reverse the different stages (wave atoms, quantization, coding).
4. TEST AND RESULTS
In this section, a MATLAB program is developed to implement the DWAT-based speech compression codec described in this paper. To evaluate the efficiency of the developed algorithm, a comparative study between the DWAT and DWT algorithms is performed using objective criteria: CR, SNR, PSNR and NRMSE. In all simulations, only source speech signals extracted from the TIMIT database are used [19].
Compression ratio (CR)

$$CR = \frac{\mathrm{Length}(x(n))}{\mathrm{Length}(cWC)} \tag{5}$$

where $cWC$ is the compressed wave atoms transform vector.
Signal to noise ratio (SNR)

$$SNR = 10 \log_{10}\left(\frac{\sigma_x^2}{\sigma_e^2}\right) \tag{6}$$

where $\sigma_x^2$ is the mean square of the speech signal and $\sigma_e^2$ is the mean square difference between the original and reconstructed signals.
Peak signal to noise ratio (PSNR)

$$PSNR = 10 \log_{10}\left(\frac{N X^2}{\|x - r\|^2}\right) \tag{7}$$

where $N$ is the length of the reconstructed signal, $X$ is the maximum absolute value of the signal $x$ (so that $X^2$ is its peak squared value), and $\|x - r\|^2$ is the energy of the error between the reconstructed and original signals.
Normalized root mean square error (NRMSE)

$$NRMSE = \sqrt{\frac{\sum_n \left(x(n) - r(n)\right)^2}{\sum_n \left(x(n) - \mu_x(n)\right)^2}} \tag{8}$$

where $x(n)$ is the speech signal, $r(n)$ is the reconstructed signal, and $\mu_x(n)$ is the mean of the speech signal.
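For readers who want to reproduce the four criteria, the sketch below implements definitions (5) to (8) on plain Python lists. It is an illustration, not the authors' MATLAB code: the function names are hypothetical, and $X$ in (7) is taken as the peak absolute amplitude of the original signal, which is an interpretation of the wording above.

```python
import math


def compression_ratio(x, cwc):
    """Eq. (5): length of the original signal over length of the compressed vector."""
    return len(x) / len(cwc)


def snr_db(x, r):
    """Eq. (6): ratio of mean signal power to mean error power, in dB."""
    err = sum((a - b) ** 2 for a, b in zip(x, r))
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(sum(a * a for a in x) / err)


def psnr_db(x, r):
    """Eq. (7): N times the peak squared amplitude over the error energy, in dB."""
    err = sum((a - b) ** 2 for a, b in zip(x, r))
    if err == 0.0:
        return math.inf
    peak_sq = max(a * a for a in x)
    return 10.0 * math.log10(len(r) * peak_sq / err)


def nrmse(x, r):
    """Eq. (8): error energy normalized by the energy of the mean-removed signal."""
    mean = sum(x) / len(x)
    num = sum((a - b) ** 2 for a, b in zip(x, r))
    den = sum((a - mean) ** 2 for a in x)
    return math.sqrt(num / den)


# Illustrative signal: a reconstruction with a uniform 10 % amplitude error.
x = [1.0, -1.0, 1.0, -1.0]
r = [0.9, -0.9, 0.9, -0.9]
print(snr_db(x, r), psnr_db(x, r), nrmse(x, r))
```

In this toy case the SNR and PSNR are both close to 20 dB because every sample sits at the peak amplitude; for real speech the PSNR is larger than the SNR by the peak-to-average power ratio of the signal.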
The test results of the proposed algorithm are summarized in Table 1.
Table 1. Performance evaluation of the proposed algorithm using TIMIT speech files
Audio file   Algorithm   CR        SNR (dB)   PSNR (dB)   NRMSE
sx27.wav     DWAT        10.1198   19.1324    39.8416     0.1105
sx11.wav     DWAT        12.4593   19.1370    36.7158     0.1104
sx12.wav     DWAT         9.8699   19.0968    37.3253     0.1110
sx37.wav     DWAT        12.4121   19.1279    37.6441     0.1106
sx57.wav     DWAT        10.4757   19.1354    35.7199     0.1105
sx26.wav     DWAT         8.4150   19.1210    36.4544     0.1106
sx243.wav    DWAT        12.3281   19.1394    39.0011     0.1104
From the above table, it is clear that our approach offers a high compression ratio, between 8.4 and 12.5 for the tested files. The SNR is about 19 dB on average, which is high enough to ensure good quality of the reconstructed signal. We can also remark that the adaptive thresholding yields a uniform SNR; in contrast, in [4] the threshold value is set manually, which produces a non-uniform SNR that varies between 10 dB and 22 dB. Hence, with a manual threshold, the quality of the reconstructed signal is not assured for all speech signals.
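The near-constant NRMSE in Table 1 follows from the near-constant SNR. As a sketch, assuming the speech signal has approximately zero mean (an assumption not stated in the paper), definitions (6) and (8) are linked as follows, and the sx27.wav row is consistent with this link:

```latex
\mathrm{NRMSE}
  = \sqrt{\frac{\sum_n \left(x(n) - r(n)\right)^2}{\sum_n x(n)^2}}
  = \sqrt{\frac{\sigma_e^2}{\sigma_x^2}}
  = 10^{-\mathrm{SNR}/20},
\qquad
10^{-19.1324/20} \approx 0.1105
```

Under this assumption the two criteria carry the same information, so the comparison between methods rests mainly on CR, SNR and PSNR.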
To evaluate the efficiency of our approach, a comparative study is carried out against the DWT-based approaches reported in [4], [20] and [8]. For the DWT algorithm, we use the Daubechies orthogonal wavelet db10 with five decomposition levels and global thresholding. Given the acoustic differences between male and female voices, we run comparison tests on three voices of each gender. The results are given in Table 2 and Table 3.
Table 2. The performance results of the DWAT and DWT algorithms for female voices
Audio file   Algorithm   CR        SNR (dB)   PSNR (dB)   NRMSE
sx69.wav     DWAT         9.8108   19.1299    35.4186     0.1105
             DWT          7.7978   18.0038    34.7076     0.1258
sx84.wav     DWAT        10.2915   19.1377    37.1492     0.1104
             DWT          8.2026   17.8513    36.5450     0.1281
sx210.wav    DWAT        13.3856   19.1320    36.2084     0.1105
             DWT          6.3563   18.1233    36.3855     0.1278
Table 3. The performance results of the DWAT and DWT algorithms for male voices
Audio file   Algorithm   CR        SNR (dB)   PSNR (dB)   NRMSE
sx156.wav    DWAT        12.9313   19.1000    38.0939     0.1109
             DWT          8.9302   17.9074    38.3127     0.1272
sx229.wav    DWAT         9.5145   19.1656    35.0638     0.1101
             DWT          6.5588   17.9410    33.8392     0.1268
sx289.wav    DWAT         9.8997   19.1141    37.0677     0.1107
             DWT          6.1226   17.9807    35.8804     0.1262
Table 2 and Table 3 show that the proposed system performs better than the DWT for both male and female voices. It improves the CR and SNR and decreases the NRMSE for every file; the PSNR is higher for four of the six files (the DWT gives a slightly higher PSNR for sx210.wav and sx156.wav). Although wavelet filter optimization has been used to improve DWT-based speech compression, as in [8], [21], [22], these methods cannot achieve the compression ratio obtained by the proposed algorithm.
5. CONCLUSION
In this paper, a new adaptive speech compression algorithm using discrete wave atoms is presented. The performance evaluation using objective criteria such as CR, SNR, PSNR and NRMSE shows that the developed algorithm gives a high compression ratio without destroying the quality of the reconstructed speech signal. A comparative study between our DWAT method and DWT methods demonstrates that the proposed algorithm increases the compression factor by 2.5 to 7 without sacrificing speech intelligibility or the signal-to-noise ratio.
REFERENCES
[1] K. Sayood, "Introduction to Data Compression, Third Edition," Morgan Kaufmann Publishers, 2006.
[2] W. Guo, et al., "A novel signal compression method based on optimal ensemble empirical mode decomposition for bearing vibration signals," Journal of Sound and Vibration.
[3] P. Vankateswaran, et al., "An efficient time domain speech compression algorithm based on LPC and sub-band coding techniques," Journal of Communication, vol/issue: 4(6), pp. 423-428, 2009.
[4] G. Rajesh, et al., "Speech Compression using Different Transform Techniques," IEEE International Conference on Computer and Communication Technology (ICCCT), pp. 146-151, 2011.
[5] O. Yamanaka, et al., "Image compression using wavelet transform and vector quantization with variable block size," Soft Computing in Industrial Applications, SMCia '08. IEEE Conference on, Muroran, pp. 359-364, 2008.
[6] D. Narmadha, et al., "An Optimal HSI Image Compression using DWT and CP," International Journal of Electrical and Computer Engineering (IJECE), vol/issue: 4(3), pp. 411-421, 2014.
[7] H. E. Suryavanshi, et al., "Digital Image Watermarking in Wavelet Domain," International Journal of Electrical and Computer Engineering (IJECE), vol/issue: 3(1), pp. 1-6, 2013.
[8] A. Kumar, et al., "The optimized wavelet filters for speech compression," Int J Speech Technol (Springer), 2012.
[9] O. Khalifa, et al., "Compression using Wavelet Transform," International Journal of Signal Processing (SPIJ), vol/issue: 2(5), pp. 17-26, 2008. ISSN (Online): 1985-2339.
[10] M. A. Mawla, et al., "Comparing Speech Compression Using Wavelets With Other Speech Compression Schemes," Student Conference on Research and Development, Proceedings, pp. 55-58, 2003.
[11] J. Rajeesh, et al., "Performance analysis of wave atom transform in texture classification," in Signal, Image and Video Processing, vol. 8, pp. 1-8, 2014.
[12] F. Foroozan, et al., "Wave atom based Compressive Sensing and adaptive beam forming in ultrasound imaging," Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, South Brisbane, QLD, pp. 2474-2478, 2015.
[13] Z. W. Qiang, et al., "Image Denoising Based on Wave Atoms and Cycle Spinning," Computational Intelligence and Security (CIS), Eighth International Conference on, Guangzhou, pp. 310-313, 2012.
[14] Z. Haddad, et al., "Wave atoms based compression method for fingerprint images," Pattern Recognition, vol. 46, pp. 2450-2464, 2013.
[15] H. Xu, et al., "ECG data compression based on wave atom transform," Multimedia Signal Processing (MMSP), IEEE 13th International Workshop on, Hangzhou, pp. 1-5, 2011.
[16] L. Demanet, et al., "Wave atoms and sparsity of oscillatory patterns," Applied and Computational Harmonic Analysis, vol/issue: 23(3), pp. 368-387, 2007.
[17] L. Demanet, "Curvelets, wave atoms, and wave equations," PhD Thesis, California Institute of Technology, 2006.
[18] Kinsner, et al., "Speech and Image Signal Compression with Wavelets," IEEE Wescanex Conference Proceedings, IEEE, New York, NY, pp. 368-375, 1993.
[19] V. Zue, et al., "Speech database development at MIT: TIMIT and beyond," Speech Communication, vol/issue: 9(4), pp. 351-356, 1990.
[20] H. Ayadi, "Speech compression using wavelets," Electrical & Computer Engineering Department, Islamic University of Gaza, Palestine, 2010.
[21] N. Aloui, et al., "A New algorithm for QMF Banks Design and Its Application in Speech Compression using DWT," International Arab Journal of Information Technology, 2015.
[22] N. Aloui, et al., "Genetic Algorithm for Designing QMF Banks and Its Application In Speech Compression using Wavelets," I. J. Image, Graphics and Signal Processing, vol. 6, pp. 1-8, 2013.
BIOGRAPHIES OF AUTHORS
Souha Bousselmi was born in Bizerte, Tunisia, in 1986. She received the Master's degree in Electronics from the Faculty of Sciences of Tunis (FST) in 2012. Currently, she is pursuing the Ph.D. degree in Electronics with the Faculty of Sciences of Tunis, in the laboratory of Innovation of communicant and cooperative mobiles (Innov'Com), the Higher School of Communication of Tunis (SUPCOM). Her research interests include audio signal compression.

Noureddine Aloui received the master's and doctorate degrees from the Sciences Faculty of Tunis; he is a research member in the Innov'Com group, Signal Processing Laboratory. His field of interest concerns digital signal processing.

Adnane Cherif received the engineer, master's and doctorate degrees from the National Engineering School of Tunis (ENIT), Tunisia. He is a Professor at the Science Faculty of Tunis and is responsible for the Innov'Com group, Signal Processing Laboratory, Science Faculty of Tunis, Tunisia. His field of interest concerns digital signal processing and speech processing.