TELKOMNIKA, Vol. 11, No. 9, September 2013, pp. 5359~5364
ISSN: 2302-4046
Received March 6, 2013; Revised June 12, 2013; Accepted June 24, 2013
Improved a Priori SNR Estimation for Speech
Enhancement Incorporating Speech Distortion
Component
Shifeng Ou*, Chao Geng, Ying Gao
Institute of Science and Technology for Opto-electronic Information, Yantai University, Yantai 264005, Shandong, China
*Corresponding author, e-mail: 250800719@qq.com
Abstract
The well-known decision-directed (DD) approach drastically limits the level of musical noise, but the estimated a priori SNR matches the previous frame rather than the current one. Plapous introduced a two-step noise reduction (TSNR) technique to refine the a priori SNR estimate of the DD approach. However, the performance of that method depends on the accuracy of the speech estimated in its second step. In this paper, we propose an improved approach to a priori SNR estimation in the DCT domain which, like the TSNR method, has two steps. In the second step, we consider the two components of the estimation error between the speech signal and its estimate, namely the speech distortion component and the residual noise component, and take the estimated speech minus its speech distortion as a refined estimate of the clean speech signal. Because the speech distortion component is offset, the estimated a priori SNR is more accurate. Objective test results show the improved performance of the proposed approach.
Keywords: speech enhancement, signal to noise ratio, speech distortion, noise reduction
Copyright © 2013 Universitas Ahmad Dahlan. All rights reserved.
1. Introduction
Most existing voice communication systems are designed to process noise-free speech. However, the speech signals fed to these systems are often degraded by additive noise. The problem of enhancing speech degraded by uncorrelated additive noise, when only the noisy speech is available, has therefore been widely addressed over the past few decades and remains an active field of research. Many approaches have been investigated to achieve spectral enhancement, including hard or soft decision estimation, spectral subtraction, Wiener filtering, and minimum mean square error (MMSE) estimation [1-4]. These methods are widely used because they are fairly straightforward to implement, effective in removing various background noises, and computationally inexpensive.
Almost all of these speech enhancement approaches rely on the estimation of a short-time spectral gain, which is a function of the a priori SNR. The estimation of the a priori SNR is therefore a crucial part of a speech enhancement algorithm [5]. An erroneous estimate of this parameter leads to speech distortion, musical noise, or reduced noise reduction. In the meantime, piracy becomes increasingly rampant as customers can easily duplicate and redistribute the received multimedia content to a large audience.
Many existing a priori SNR estimation techniques require either experimentally pre-specified weighting factors or prior assumptions about the parameters of the signal model. The well-established decision-directed (DD) approach is computationally efficient and performs quite well in noise reduction applications [6, 7], but it has a serious drawback: the estimated a priori SNR follows the shape of the instantaneous SNR with a delay of one short-time frame. To overcome this problem of the decision-directed approach, a method called the two-step noise reduction (TSNR) technique was presented to refine the estimate of the a priori SNR [5, 8]. Several a priori SNR estimation approaches based on higher-order moments have also been proposed [9]; they have shown promising results in a number of applications and are of particular value when dealing with a mixture of normal-Laplace processes.
In this paper, an effective a priori SNR estimator for noisy speech enhancement is proposed, in which the speech distortion component is incorporated into the estimated speech to refine the estimated a priori SNR. The proposed algorithm not only achieves better noise reduction, but also remedies the deficiency in suppressing musical noise and further reduces the echo because of the enhanced tracking speed of the a posteriori SNR. In simulations with speech signals degraded by diverse noises, the proposed method shows improved performance over the other two methods on a number of measures.
The paper is organized as follows. In Section 2 we review the speech enhancement problem and the decision-directed approach, which is the one most frequently used in speech communication systems. In Section 3 we present a novel SNR estimation approach that employs the speech distortion component. In Section 4 we show that the proposed approach outperforms the decision-directed and TSNR approaches for non-stationary as well as stationary noise in terms of several instrumental measures. Finally, in Section 5, we give our conclusions.
2. Problem Formulation
It is assumed that the noise signal $v(t)$ is additive, i.e. $y(t) = x(t) + v(t)$, with $x(t)$ and $y(t)$ the clean speech and noisy speech at time $t$. Taking the DCT of the observed signal gives:

$$Y_{n,k} = X_{n,k} + V_{n,k}, \quad k = 0, \ldots, K-1 \qquad (1)$$

where $X_{n,k}$, $Y_{n,k}$ and $V_{n,k}$ denote the DCT components of the clean speech, noisy speech and noise signals respectively, $K$ is the total number of frequency components, and $k$ and $n$ are the frequency and frame indices. The objective is to find an estimator $\hat{X}_{n,k}$ which minimizes the expected value of a given distortion measure conditioned on a set of spectral noisy features. Since the statistical model is generally nonlinear, and because no direct solution for the spectral estimation exists, we first derive an a priori SNR estimate from the noisy features. An estimate of $X_{n,k}$ is subsequently obtained by applying a spectral gain $G(n,k)$ to each short-time spectral component $Y_{n,k}$. The choice of the distortion measure determines the gain behavior, i.e., the tradeoff between noise reduction and speech distortion. However, the key parameter is the estimated a priori SNR, because it determines the efficiency of the speech enhancement for a given noise power spectral density.
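
The additive model in the DCT domain, Equation (1), can be checked on a toy frame. The sketch below is only an illustration: the paper does not state which DCT variant or normalization is used, so an orthonormal DCT-II is assumed, and the function name `dct2` and the sample values are hypothetical.

```python
import math

def dct2(frame):
    """Orthonormal DCT-II of a real-valued frame (direct O(K^2) form, for illustration)."""
    K = len(frame)
    coeffs = []
    for k in range(K):
        scale = math.sqrt(1.0 / K) if k == 0 else math.sqrt(2.0 / K)
        acc = sum(frame[t] * math.cos(math.pi * (t + 0.5) * k / K) for t in range(K))
        coeffs.append(scale * acc)
    return coeffs

# Toy frame (made-up values): clean samples x, noise samples v, noisy samples y = x + v.
x = [0.5, -0.2, 0.1, 0.7, -0.4, 0.0, 0.3, -0.1]
v = [0.05, -0.03, 0.02, 0.01, -0.04, 0.03, -0.02, 0.01]
y = [xi + vi for xi, vi in zip(x, v)]

X, V, Y = dct2(x), dct2(v), dct2(y)

# Because the DCT is linear, Y[k] equals X[k] + V[k] up to rounding error.
max_dev = max(abs(Yk - (Xk + Vk)) for Yk, Xk, Vk in zip(Y, X, V))
```

Because the transform is real-valued, the squared coefficients in the SNR definitions need no complex magnitude, which is one practical difference from a DFT-based formulation.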
Under the assumption that the DCT components at different indices $k$ are statistically independent, the estimate of the clean speech component is obtained as follows:

$$\hat{X}_{n,k} = G(n,k)\, Y_{n,k} \qquad (2)$$

where $G(n,k)$ is the gain function. In general, it can be expressed as a function of the a posteriori SNR and the a priori SNR, defined as follows:

$$\mathrm{SNR}_{post}(n,k) = \frac{Y_{n,k}^2}{\lambda_V(n,k)} \qquad (3)$$

$$\mathrm{SNR}_{prio}(n,k) = \frac{E[X_{n,k}^2]}{\lambda_V(n,k)} \qquad (4)$$

where $\lambda_V(n,k) = E[V_{n,k}^2]$ is assumed to be known, since it can easily be computed during speech pauses. The a priori SNR is estimated according to the so-called DD approach [4]:

$$\widehat{\mathrm{SNR}}_{prio}^{DD}(n,k) = \beta\, \frac{\hat{X}_{n-1,k}^2}{\lambda_V(n,k)} + (1-\beta)\, \max\big(\mathrm{SNR}_{post}(n,k) - 1,\ 0\big) \qquad (5)$$

where $\hat{X}_{n-1,k}$ is the estimated speech component at the previous frame.
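
A minimal per-bin sketch of the DD estimate in Equation (5) follows. The function and variable names are hypothetical, the weighting 0.98 is the value reported later in the paper, and the numeric inputs are made up purely to show the behavior.

```python
def snr_post(Y, noise_var):
    """A posteriori SNR of Equation (3) for one real DCT coefficient."""
    return Y * Y / noise_var

def dd_prior_snr(X_prev_hat, Y, noise_var, beta=0.98):
    """Decision-directed a priori SNR of Equation (5) for one bin.

    X_prev_hat: enhanced coefficient of the previous frame at this bin.
    Y:          noisy coefficient of the current frame.
    noise_var:  noise variance at this bin (assumed known).
    """
    return (beta * X_prev_hat ** 2 / noise_var
            + (1.0 - beta) * max(snr_post(Y, noise_var) - 1.0, 0.0))

# Speech onset: nothing was estimated in the previous frame, the current frame is strong.
onset = dd_prior_snr(X_prev_hat=0.0, Y=3.0, noise_var=1.0)

# Speech offset: the previous frame was strong, the current frame is weak.
offset = dd_prior_snr(X_prev_hat=2.0, Y=0.5, noise_var=1.0)
```

With these inputs the onset estimate is about 0.16 although the instantaneous SNR is 8, and the offset estimate is about 3.92 although the current frame is mostly noise. Both cases show the estimate leaning on the previous frame, because the first term carries almost all of the weight.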
Several variants of the gain function $G(n,k)$ have been reported in the literature, such as the Wiener, spectral subtraction, or maximum likelihood estimates. Without loss of generality, the gain function is chosen here as the Wiener filter, as in [5]:

$$G(n,k) = \frac{\mathrm{SNR}_{prio}(n,k)}{\mathrm{SNR}_{prio}(n,k) + 1} \qquad (6)$$

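
As a small illustration of Equations (2) and (6), the sketch below applies a Wiener gain bin by bin. The helper names and the example values are hypothetical; the a priori SNR is taken on a linear scale, not in dB.

```python
def wiener_gain(snr_prio):
    """Wiener gain of Equation (6) for a non-negative a priori SNR (linear scale)."""
    return snr_prio / (snr_prio + 1.0)

def apply_gain(Y_frame, snr_prio_frame):
    """Equation (2) applied bin by bin: X_hat[k] = G[k] * Y[k]."""
    return [wiener_gain(s) * y for y, s in zip(Y_frame, snr_prio_frame)]

# Made-up noisy coefficients and a priori SNR values for three bins.
Y_frame = [4.0, -2.0, 0.5]
snr_prio_frame = [9.0, 1.0, 0.0]
X_hat_frame = apply_gain(Y_frame, snr_prio_frame)
```

The gain stays between 0 and 1: a bin with a high a priori SNR passes almost unchanged, while a bin with an a priori SNR of zero is suppressed entirely.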
3. A Priori SNR Estimation Incorporating Speech Distortion Component
Some analysis of the behavior of the DD approach has been reported in [5, 6]. The results indicate that the DD approach can drastically limit the level of musical noise, but the estimated a priori SNR follows the instantaneous SNR with a frame delay. Consequently, since the gain function $G(n,k)$ depends on the a priori SNR, the $G(n,k)$ computed at the current frame matches the previous frame, and the performance of the speech enhancement system is degraded. To remove this drawback of the DD approach while maintaining its advantages, Plapous proposed to compute the a priori SNR for the next frame using the DD approach and to apply it to the current frame, compensating for the frame delay. This leads to the TSNR approach, which refines the estimate of the a priori SNR in two steps [5].
In the first step, the gain function $G(n,k)$ is computed using the DD approach as described in the previous section:

$$G(n,k) = \frac{\widehat{\mathrm{SNR}}_{prio}^{DD}(n,k)}{\widehat{\mathrm{SNR}}_{prio}^{DD}(n,k) + 1} \qquad (7)$$

In the second step, this gain is used to refine the a priori SNR estimated by the DD approach, and the TSNR estimate is obtained using the following equation:

$$\mathrm{SNR}_{prio}^{TSNR}(n,k) = \frac{\big(G(n,k)\, Y_{n,k}\big)^2}{\lambda_V(n,k)} \qquad (8)$$

The a priori SNR estimated using the DD approach shows good properties but suffers from a frame delay, which is removed by the second step of the TSNR algorithm. This technique can therefore respond quickly to an abrupt increase in the speech signal without introducing musical noise.
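
The two TSNR steps, Equations (5), (7) and (8), can be sketched for a single bin as follows. This is an illustrative reading of the equations rather than a reference implementation; the function name and the input values are hypothetical.

```python
def tsnr_prior_snr(X_prev_hat, Y, noise_var, beta=0.98):
    """Return (DD estimate, TSNR estimate) of the a priori SNR for one bin."""
    post = Y * Y / noise_var
    # Step 1: decision-directed estimate, Equation (5), and its Wiener gain, Equation (7).
    snr_dd = beta * X_prev_hat ** 2 / noise_var + (1.0 - beta) * max(post - 1.0, 0.0)
    gain = snr_dd / (snr_dd + 1.0)
    # Step 2: re-estimate the a priori SNR from the gained current frame, Equation (8).
    snr_tsnr = (gain * Y) ** 2 / noise_var
    return snr_dd, snr_tsnr

# Abrupt speech onset with made-up values: silent previous frame, strong current frame.
snr_dd, snr_tsnr = tsnr_prior_snr(X_prev_hat=0.0, Y=10.0, noise_var=1.0)
```

For this onset the DD estimate is about 1.98, while the second step raises it to roughly 44, much closer to the instantaneous SNR of 99. This is the faster response to an abrupt increase described above.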
However, as we can see from Equation (8), the TSNR estimate of the a priori SNR depends on the estimate $\hat{X}_{n,k}$ of the clean speech component, since Equation (2) gives $\hat{X}_{n,k} = G(n,k)\, Y_{n,k}$. Moreover, it is known that the estimation error $e_{n,k}$ between the estimated speech component $\hat{X}_{n,k}$ and the corresponding actual speech component $X_{n,k}$ includes two parts. This can be formulated as:

$$e_{n,k} = \hat{X}_{n,k} - X_{n,k} = G(n,k)\, Y_{n,k} - X_{n,k} \qquad (9)$$

Substituting $Y_{n,k} = X_{n,k} + V_{n,k}$ into (9) gives:

$$e_{n,k} = \big(G(n,k) - 1\big) X_{n,k} + G(n,k)\, V_{n,k} \qquad (10)$$

where $\big(G(n,k) - 1\big) X_{n,k}$ represents the speech distortion component and $G(n,k)\, V_{n,k}$ is the residual noise component. To improve the accuracy of $\hat{X}_{n,k}$, we subtract the speech distortion $\big(G(n,k) - 1\big) X_{n,k}$ from the estimate $\hat{X}_{n,k}$ and take the result as the refined estimate $\hat{X}^{R}_{n,k}$ of the actual clean speech component:

$$\hat{X}^{R}_{n,k} = \hat{X}_{n,k} - \big(G(n,k) - 1\big) X_{n,k} \qquad (11)$$

The refined estimate is closer to the actual speech component than the former estimate in Equation (9). Since $X_{n,k}$ is unknown in practice, we substitute $\hat{X}_{n,k}$ for $X_{n,k}$ in the above equation and get:

$$\hat{X}^{R}_{n,k} = \big(2 - G(n,k)\big)\, \hat{X}_{n,k} = \big(2 - G(n,k)\big)\, G(n,k)\, Y_{n,k} \qquad (12)$$

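
The error decomposition in Equation (10) and the refinement in Equations (11) and (12) can be verified numerically for a single coefficient. The values of the clean component, noise component and gain below are made up for illustration, and the outcome of one example does not prove that the refined estimate is closer in every case.

```python
# Made-up single-bin values: clean component X, noise component V, gain G.
X, V, G = 2.0, 0.5, 0.6
Y = X + V                      # noisy component, Equation (1)

X_hat = G * Y                  # plain estimate, Equation (2)
e = X_hat - X                  # estimation error, Equation (9)

distortion = (G - 1.0) * X     # speech distortion component in Equation (10)
residual = G * V               # residual noise component in Equation (10)

# Equation (11) uses the true X, so only the residual noise remains as error.
refined_oracle = X_hat - distortion

# Equation (12) replaces the unknown X by X_hat, which is what can be computed in practice.
refined_practical = (2.0 - G) * G * Y

err_plain = abs(X_hat - X)
err_practical = abs(refined_practical - X)
```

In this example the plain estimate is 1.5 (error 0.5), the oracle refinement is 2.3 (error equal to the residual noise 0.3), and the practical refinement is 2.1 (error 0.1).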
The refined TSNR (R-TSNR) approach for a priori SNR estimation proposed in this paper can then be described by the following two steps:

$$\widehat{\mathrm{SNR}}_{prio}^{DD}(n,k) = \beta\, \frac{\hat{X}_{n-1,k}^2}{\lambda_V(n,k)} + (1-\beta)\, \max\big(\mathrm{SNR}_{post}(n,k) - 1,\ 0\big) \qquad (13)$$

$$\mathrm{SNR}_{prio}^{R\text{-}TSNR}(n,k) = \frac{\Big(\big(2 - G(n,k)\big)\, G(n,k)\, Y_{n,k}\Big)^2}{\lambda_V(n,k)} \qquad (14)$$

where $\beta$ is the control parameter, which can be adjusted to achieve the best result; in our experiments it is chosen as $\beta = 0.98$.
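
A per-bin sketch of the R-TSNR estimate, Equations (13), (7) and (14), is given below next to the TSNR estimate of Equation (8) for comparison. The function name and input values are hypothetical. How the final enhanced coefficient is derived from the refined SNR, and which enhanced value is fed back as the previous-frame estimate, is not modeled here.

```python
def rtsnr_prior_snr(X_prev_hat, Y, noise_var, beta=0.98):
    """Return (TSNR estimate, R-TSNR estimate) of the a priori SNR for one bin."""
    post = Y * Y / noise_var
    # Step 1: decision-directed estimate, Equation (13), and its Wiener gain, Equation (7).
    snr_dd = beta * X_prev_hat ** 2 / noise_var + (1.0 - beta) * max(post - 1.0, 0.0)
    gain = snr_dd / (snr_dd + 1.0)
    # TSNR second step, Equation (8), kept only for comparison.
    snr_tsnr = (gain * Y) ** 2 / noise_var
    # R-TSNR second step, Equation (14): uses the refined speech estimate of Equation (12).
    x_refined = (2.0 - gain) * gain * Y
    snr_rtsnr = x_refined ** 2 / noise_var
    return snr_tsnr, snr_rtsnr

# Same made-up onset as before: silent previous frame, strong current frame.
snr_tsnr, snr_rtsnr = rtsnr_prior_snr(X_prev_hat=0.0, Y=10.0, noise_var=1.0)
```

Since the gain lies between 0 and 1, the factor (2 - G) is at least 1, so the R-TSNR estimate is never below the TSNR estimate for the same inputs. In this example it is roughly 79 against roughly 44.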
4. Experimental Results
In this section, the performance of the proposed R-TSNR approach is tested for noisy speech enhancement and compared with that of the DD and TSNR approaches. The speech material used for the tests consists of six sentences spoken by three males and three females. The number of samples per frame is $K = 256$, with an overlap of 128 samples. The noise signals used in our evaluation include white noise (White), high frequency channel noise (HF), destroyer engine room noise (Destroyer), and babble noise (Babble). The speech signal is sampled at 8 kHz and degraded by these noises at SNRs of 0 dB, 5 dB, and 10 dB.
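
The test setup above can be sketched as follows, under stated assumptions: the input SNR is taken as a global ratio of average powers over the whole utterance, no analysis window is applied, and a sine tone with pseudo-random noise stands in for the real recordings. The function names are hypothetical.

```python
import math
import random

def scale_noise(speech, noise, snr_db):
    """Scale the noise so that speech power / noise power matches snr_db (global SNR)."""
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [gain * n for n in noise]

def split_frames(signal, frame_len=256, hop=128):
    """Frames of 256 samples with an overlap of 128 samples (hop of 128)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

# One second at 8 kHz: a 200 Hz tone as stand-in speech, Gaussian samples as stand-in noise.
rng = random.Random(0)
speech = [math.sin(2.0 * math.pi * 200.0 * t / 8000.0) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

scaled = scale_noise(speech, noise, snr_db=5.0)
noisy = [s + n for s, n in zip(speech, scaled)]
frames = split_frames(noisy)

measured_snr_db = 10.0 * math.log10(
    sum(s * s for s in speech) / sum(n * n for n in scaled))
```

One second of signal yields 61 full frames with these settings; any trailing samples that do not fill a frame are dropped in this sketch.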
Figure 1. The Original Speech (spectrogram; axes: Time (Sec.), Frequency)
Figure 2. The Noisy Speech (spectrogram; axes: Time (Sec.), Frequency)
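Figure 3. Enhanced Speech by DD Method (spectrogram; axes: Time (Sec.), Frequency)
Figure 4. Enhanced Speech by TSNR Method (spectrogram; axes: Time (Sec.), Frequency)
Figure 5. Enhanced Speech by R-TSNR Method (spectrogram; axes: Time (Sec.), Frequency)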
First, the results of the three speech enhancement algorithms are compared in the frequency domain by means of speech spectrograms. Figures 1 to 5 show the clean speech, the noisy speech corrupted by destroyer engine room noise at 0 dB SNR, and the speech enhanced using the DD, TSNR, and proposed methods, respectively. It can be seen from these results that the R-TSNR approach has a better noise reduction capability: it leaves less residual noise while keeping more of the speech signal's energy unchanged than the other two approaches.
Table 1. Comparison of SEGSNR of Enhanced Signal in Various Noise Conditions

Noise type   Input SNR   Output SEGSNR
                         DD      TSNR    R-TSNR
White        0 dB        4.72    4.90    5.29
             5 dB        7.04    7.29    7.69
             10 dB       8.93    8.96    9.32
HF           0 dB        4.66    4.75    5.11
             5 dB        7.09    7.21    7.59
             10 dB       9.16    9.25    9.50
Destroyer    0 dB        4.29    4.47    5.01
             5 dB        6.96    7.08    7.34
             10 dB       9.26    9.39    9.88
Babble       0 dB        2.78    2.86    2.96
             5 dB        4.72    4.81    5.11
             10 dB       7.13    7.30    7.57

Table 2. Comparison of LSD of Enhanced Signal in Various Noise Conditions

Noise type   Input SNR   Output LSD
                         DD      TSNR    R-TSNR
White        0 dB        8.32    7.94    7.73
             5 dB        7.32    7.11    6.89
             10 dB       7.00    6.67    6.42
HF           0 dB        8.07    7.82    7.61
             5 dB        6.62    6.42    5.98
             10 dB       5.96    5.84    5.58
Destroyer    0 dB        8.13    7.84    7.64
             5 dB        6.41    6.18    5.81
             10 dB       5.06    4.82    4.54
Babble       0 dB        8.15    7.92    7.66
             5 dB        6.18    5.85    5.61
             10 dB       4.85    4.64    4.37

The segmental SNR (SEGSNR) and log-spectral distortion (LSD) measures are adopted for the objective evaluation [10]. For the segmental SNR, only frames with segmental SNR values greater than -10 dB and less than 35 dB are considered. Table 1 gives the output SEGSNR of the enhanced speech signals obtained using the DD, TSNR and proposed R-TSNR algorithms under various noise conditions and levels. The LSD results are shown in Table 2, also for various noise conditions and levels. From the two tables, we can observe that the proposed algorithm always has a higher SEGSNR and a lower LSD than the other algorithms under all tested conditions.
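
A minimal sketch of the segmental SNR measure with the frame selection rule described above is given here. The exact framing used for the measure is not stated in the paper, so non-overlapping frames of 256 samples are assumed, and the function name and test signals are hypothetical.

```python
import math

def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Mean of per-frame SNRs in dB, keeping only frames strictly between lo_db and hi_db."""
    kept = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        y = enhanced[start:start + frame_len]
        signal_energy = sum(a * a for a in s)
        error_energy = sum((a - b) ** 2 for a, b in zip(s, y))
        if signal_energy == 0.0 or error_energy == 0.0:
            continue  # the per-frame SNR is undefined for silent or perfectly restored frames
        frame_snr = 10.0 * math.log10(signal_energy / error_energy)
        if lo_db < frame_snr < hi_db:
            kept.append(frame_snr)
    return sum(kept) / len(kept) if kept else float("nan")

# Made-up signals: the first frame has a 20 dB frame SNR, the second 40 dB (excluded).
clean = [1.0] * 512
enhanced = [0.9] * 256 + [0.99] * 256
segsnr_db = segmental_snr(clean, enhanced)
```

Here only the first frame falls inside the allowed range, so the result is 20 dB. Excluding the extreme frames keeps silent or nearly perfect frames from dominating the average.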
5. Conclusion
An improved expression for a priori SNR estimation for speech enhancement in the DCT domain has been proposed in this paper. Unlike traditional estimation approaches, we have deliberately considered the speech distortion component in the estimated speech. By incorporating this component to refine the estimator, improved performance is obtained. The performance of the proposed estimator has been examined using real speech signals in several noise environments. The experimental results were compared with those of the DD and TSNR methods and showed that the proposed estimator achieves better noise reduction than the other estimators.
Acknowledgements
This work was supported by NSFC under Grant Nos. 61005021, 61201457 and A Project of Shandong Province Higher Educational Science and Technology Program under contract J12LN27.
References
[1] MK Hasan, MSA Zilany, MR Khan. DCT Speech Enhancement with Hard and Soft Thresholding Criteria. Electronics Letters. 2002; 38(13): 669-670.
[2] T Inoue, H Saruwatari, Y Takahashi, K Shikano, K Kondo. Theoretical Analysis of Musical Noise in Generalized Spectral Subtraction Based on Higher Order Statistics. IEEE Transactions on Audio, Speech, and Language Processing. 2011; 19(6): 1770-1779.
[3] H Ding, I Soon, S Koh, C Yeo. A Spectral Filtering Method Based on Hybrid Wiener Filters for Speech Enhancement. Speech Communication. 2009; 51(3): 259-267.
[4] Y Ephraim, D Malah. Speech Enhancement Using a Minimum Mean-square Error Short-time Spectral Amplitude Estimator. IEEE Transactions on Acoustics, Speech and Signal Processing. 1984; 32(6): 1109-1121.
[5] C Plapous, C Marro, P Scalart. Improved Signal-to-noise Ratio Estimation for Speech Enhancement. IEEE Transactions on Audio, Speech, and Language Processing. 2006; 14(6): 2098-2108.
[6] O Cappé. Elimination of the Musical Noise Phenomenon with the Ephraim and Malah Noise Suppressor. IEEE Transactions on Speech and Audio Processing. 1994; 2(2): 345-349.
[7] K Suzumi, S Hiroshi, M Ryoichi, S Kiyohiro, K Kazunobu. Theoretical Analysis of Musical Noise Generation in Noise Reduction Methods with Decision-Directed a Priori SNR Estimator. Proceedings of International Workshop on Acoustic Signal Enhancement. Aachen. 2012; 1-4.
[8] X Zhang, H Jiang, J Zhang. Improved priori SNR estimation for sound enhancement with Gaussian statistical model. Proceedings of International Conference on Computer Science and Education. Melbourne. 2012; 1307-1310.
[9] T Moazzeni, A Amei, J Ma, Y Jiang. Statistical Model Based SNR Estimation Method for Speech Signals. Electronics Letters. 2012; 48(12): 727-729.
[10] K Kondo. Subjective Quality Measurement of Speech: Its Evaluation, Estimation and Applications. Berlin: Springer Press. 2012.