Speech Lab 1: Endpoint Detection

Uploaded by: s*** · IP location: Tianjin · Uploaded: 2022-09-05 · Format: DOCX · Pages: 17 · Size: 35.72 KB


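Experiment 1: Speech Signal Endpoint Detection

I. Objectives

1. Learn to use MATLAB and master MATLAB programming methods.
2. Master the basic concepts, theory, and methods of speech processing.
3. Implement endpoint detection for noisy speech signals in MATLAB.
4. Learn to analyze and process signals with MATLAB.
5. Learn to detect the endpoints of a speech signal using the short-time zero-crossing rate and the short-time energy.

II. Equipment and Software

HP D538, MATLAB

III. Principle

Endpoint detection is a very important step in speech signal processing; its accuracy directly affects the speed and the results of the processing that follows. This experiment uses an endpoint detection algorithm that combines the short-time zero-crossing rate with the short-time energy: the zero-crossing rate detects unvoiced sounds, the short-time energy detects voiced sounds, and the two together locate the endpoints of the signal.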

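The combined method is intended for signals with a relatively high signal-to-noise ratio. Detection of the input signal consists of two parts, short-time energy detection and short-time zero-crossing rate detection, with the energy test as the primary criterion and the zero-crossing rate test as a supplement. According to its statistical properties, speech can be divided into three classes: unvoiced sound, voiced sound, and silence (including background noise). In this algorithm, the short-time energy separates voiced sound from silence well. Unvoiced sound has low energy, so in the energy test it falls below the energy threshold and is misjudged as silence; the short-time zero-crossing rate, however, can separate unvoiced sound from silence. Combining the two tests makes it possible to detect both the speech segments (unvoiced and voiced) and the silent segments.

1. Short-time energy

The short-time average energy E_n of a speech signal at time n is defined as

$$E_n = \sum_{m=-\infty}^{\infty} [x(m)\,w(n-m)]^2 = \sum_{m=n-(N-1)}^{n} [x(m)\,w(n-m)]^2$$

where N is the window length. The short-time average energy is therefore the sum of the squared (windowed) sample values of one frame. In the special case of a rectangular window,

$$E_n = \sum_{m=n-(N-1)}^{n} x^2(m)$$

2. Short-time zero-crossing rate

A zero crossing occurs when the signal passes through zero, and the zero-crossing rate is the number of times per second that the signal value passes through zero. For a discrete-time sequence, a zero crossing means that consecutive sample values change sign, and the zero-crossing rate is the number of sign changes per sample. For a speech signal, it is the number of times the waveform crosses the horizontal axis (zero level) within one frame, which can be computed by counting sign changes between adjacent samples. If the window starts at n = 0, the short-time zero-crossing rate Z is

$$Z = \frac{1}{2} \sum_{n=0}^{N-1} \left| \operatorname{sgn}[s_w(n)] - \operatorname{sgn}[s_w(n-1)] \right|$$

$$\operatorname{sgn}(x) = \begin{cases} 1, & x \ge 0 \\ -1, & x < 0 \end{cases}$$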
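The two definitions above can be sketched in MATLAB as follows. This is a minimal illustration under stated assumptions, not the handout's program: the synthetic test signal, the sample rate, and the names frameLen, hop, E, and Z are all chosen for the example. The zero-crossing count uses only adjacent samples inside each frame, so the n = 0 term of the formula (which needs the sample before the frame) is omitted.

```matlab
% Minimal sketch: per-frame short-time energy and zero-crossing count
% with a rectangular window. All values below are illustrative assumptions.
fs       = 8000;                       % assumed sample rate (Hz)
t        = (0:fs-1)'/fs;               % one second of time stamps
x        = [zeros(2000,1); sin(2*pi*200*t(1:4000)); zeros(2000,1)];  % synthetic signal
frameLen = 200;                        % N, window length in samples
hop      = 100;                        % frame shift in samples

numFrames = floor((length(x) - frameLen)/hop) + 1;
E = zeros(numFrames,1);                % short-time energy of each frame
Z = zeros(numFrames,1);                % zero-crossing count of each frame

for k = 1:numFrames
    idx  = (k-1)*hop + (1:frameLen);   % sample indices of frame k
    s    = x(idx);
    E(k) = sum(s.^2);                  % rectangular window: sum of squares
    sg   = ones(size(s));              % sgn(x) =  1 for x >= 0
    sg(s < 0) = -1;                    % sgn(x) = -1 for x <  0
    Z(k) = 0.5*sum(abs(diff(sg)));     % sign changes between adjacent samples
end

subplot(2,1,1); plot(E); title('Short-time energy');
subplot(2,1,2); plot(Z); title('Zero-crossing count per frame');
```

Dividing Z by the frame length gives the rate per sample mentioned in the text; a tapered window such as Hamming could be applied to s before squaring if the general form of E_n is wanted.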

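The short-time zero-crossing rate can be regarded as a simple measure of signal frequency. Voiced sound has the largest short-time average magnitude and silence the smallest; unvoiced sound has the highest short-time zero-crossing rate, silence lies in between, and voiced sound has the lowest.

3. Short-time autocorrelation function

$$R_w(k) = \sum_{n=0}^{N-k-1} s_w(n)\, s_w(n+k)$$

R_w(k) is an even function; if s(n) is periodic, then R(k) is also periodic. It can be used for pitch period estimation and linear prediction analysis.

4. Determining the start and end of the speech signal

The short-time average magnitude and the short-time zero-crossing rate can be used to determine where the speech signal starts and ends. Endpoint detection can use feature parameters of the test signal such as the short-time energy or short-time log energy together with the zero-crossing rate, and apply a double-threshold decision rule: the zero-crossing rate detects unvoiced sound, the short-time energy detects voiced sound, and the two work together. First, two thresholds are set for each of the short-time energy and the zero-crossing rate. The lower threshold has a small value, is sensitive to changes in the signal, and is easily exceeded; the higher threshold has a larger value. Exceeding the low threshold does not necessarily mark the start of speech, since it may be caused by a very short burst of noise. Exceeding the high threshold, with the signal remaining above the low threshold over a subsequent user-defined interval, is taken to mark the start of speech.

IV. Procedure and Program

(1) Procedure:

1. Take a recording as the audio sample.
2. Using the formulas, write code to compute the short-time energy and the short-time zero-crossing rate of the speech signal, then plot both curves.
3. Adjust the energy thresholds.
4. Normalize the amplitude and set parameters such as the frame length, the short-time energy thresholds, and the zero-crossing rate threshold.
5. Write a program that implements speech endpoint detection.
6. Finally, obtain the speech endpoint detection plot.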
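The procedure above can be sketched as follows. This is a minimal illustration rather than the handout's program: the synthetic test signal, the threshold factors (eHigh, eLow, zThr), the minimum length minLen, and every variable name are assumptions made for the example, and the detector assumes the recording contains a single utterance.

```matlab
% Sketch of a double-threshold endpoint detector (illustrative assumptions throughout).
fs    = 8000;                                   % assumed sample rate (Hz)
n     = (0:fs-1)';
x     = 0.001*sin(2*pi*50*n/fs);                % weak background component
burst = 3001:6000;                              % assumed position of the "speech"
x(burst) = x(burst) + sin(2*pi*200*n(burst)/fs);
x     = x / max(abs(x));                        % amplitude normalization (step 4)

frameLen = 200; hop = 100;                      % frame length and shift (assumed)
numFrames = floor((length(x) - frameLen)/hop) + 1;
E = zeros(numFrames,1); Z = zeros(numFrames,1);
for k = 1:numFrames
    s    = x((k-1)*hop + (1:frameLen));
    E(k) = sum(s.^2);                           % short-time energy
    sg   = ones(size(s)); sg(s < 0) = -1;
    Z(k) = 0.5*sum(abs(diff(sg)));              % zero-crossing count
end

eHigh  = 0.25*max(E);      % high energy threshold (assumed factor)
eLow   = 0.05*max(E);      % low energy threshold (assumed factor)
zThr   = 0.25*frameLen;    % zero-crossing threshold (assumed)
minLen = 5;                % minimum number of high-energy frames (assumed)

hi = find(E > eHigh);      % frames that are confidently speech
if numel(hi) < minLen
    disp('No speech segment found.');
else
    startFrame = hi(1); endFrame = hi(end);
    % Widen the segment while the energy stays above the low threshold.
    while startFrame > 1 && E(startFrame-1) > eLow
        startFrame = startFrame - 1;
    end
    while endFrame < numFrames && E(endFrame+1) > eLow
        endFrame = endFrame + 1;
    end
    % Widen further while the zero-crossing count suggests unvoiced sound.
    while startFrame > 1 && Z(startFrame-1) > zThr
        startFrame = startFrame - 1;
    end
    while endFrame < numFrames && Z(endFrame+1) > zThr
        endFrame = endFrame + 1;
    end
    startSample = (startFrame-1)*hop + 1;
    endSample   = (endFrame-1)*hop + frameLen;
    fprintf('Speech from sample %d to %d (frames %d to %d)\n', ...
            startSample, endSample, startFrame, endFrame);
end
```

Searching outward from the high-threshold frames keeps short noise bursts that only cross the low threshold from being reported as speech; in practice the thresholds would be tuned on the recording, as step 3 suggests.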

(2) Flowchart of the speech endpoint detection program:

Figure 1.1: Flowchart of the speech endpoint detection program. [Flowchart image not recovered; surviving box-label fragments: start, input sample, amplitude, set, compute short-time energy, adjust, output, endpoint.]

Source program for the speech endpoint detection experiment:

clc; clear;
[x, fs] = wavread('2.wav');   % wavread is no longer available in newer MATLAB releases; audioread is its replacement
% y = end_point(x);
% f0 = pitch_sift(x, 0.38, fs); plot(f0);
% e_x = frame(x, 'lpc_spectrum', fs);
% plot(e_x(2,:));    % one dimension as it varies over time
% plot(e_x(:,89));   % variation across dimensions within one frame (needs e_x from the line above)
hold on;
c = melcepst(x, fs);
plot(c(89,:), 'k');

Definition of frame:

% function y = frame(x,func,SAMP_FREQ,l,step)
% where y is output on a frame by frame basis, x is input speech,
% and l is the window size. l and step are optional parameters,
% by default SAMP_FREQ is 8000, l is 200, and step is 100.
% func is a string e.g. 'pitch' that determines a function that you want
% to apply to x on a short-time basis.
% Note: the code below actually defaults to SAMP_FREQ=16000, l=SAMP_FREQ/40, step=l/2.
% Written by: Levent Arslan Apr. 11, 1994
%
function yy = frame(x,func,SAMP_FREQ,l,step)
[m,n]=size(x);
if m>n
    n=m;
else
    n=n;
    x=x';
end
if nargin < 3, SAMP_FREQ=16000; end;
if nargin < 4, l=SAMP_FREQ/40; end;
if nargin < 5, step=l/2; end;
num_frames=ceil(n/step);        % NUMBER OF FRAMES
x(n+1:n+2*l)=zeros(2*l,1);      % ADD ZEROS AT THE END OF THE SPEECH SIGNAL
i=[0:step:num_frames*step]';    % i is the arithmetic series of frame offsets, spaced by step
j=i*ones(1,l);
i=j+ones(num_frames+1,1)*[1:l];
y=reshape(x(i),num_frames+1,l)';
y=(hanning(l)*ones(1,num_frames+1)).*y;
for i=1:num_frames
    cmd=sprintf('yy(:,i)=%s(y(:,i));',func);
    eval(cmd);
end

Definition of melcepst:

function c=melcepst(s,fs,w,nc,p,n,inc,fl,fh)
%MELCEPST Calculate the mel cepstrum of a signal C=(S,FS,W,NC,P,N,INC,FL,FH)
%
% Simple use: c=melcepst(s,fs)        % calculate mel cepstrum with 12 coefs, 256 sample frames
%             c=melcepst(s,fs,'e0dD') % include log energy, 0th cepstral coef, delta and delta-delta coefs
%
% Inputs:
%   s    speech signal
%   fs   sample rate in Hz (default 11025)
%   nc   number of cepstral coefficients excluding 0'th coefficient (default 12)
%   n    length of frame (default power of 2 <30 ms)
%   p    number of filters in filterbank (default floor(3*log(fs)) )
%   inc  frame increment (default n/2)
%   fl   low end of the lowest filter as a fraction of fs (default = 0)
%   fh   high end of highest filter as a fraction of fs (default = 0.5)
%
%   w    any sensible combination of the following:
%
%        'R'  rectangular window in time domain
%        'N'  Hanning window in time domain
%        'M'  Hamming window in time domain (default)
%
%        't'  triangular shaped filters in mel domain (default)
%        'n'  hanning shaped filters in mel domain
%        'm'  hamming shaped filters in mel domain
%
%        'p'  filters act in the power domain
%        'a'  filters act in the absolute magnitude domain (default)
%
%        '0'  include 0'th order cepstral coefficient
%        'e'  include log energy
%        'd'  include delta coefficients (dc/dt)
%        'D'  include delta-delta coefficients (d^2c/dt^2)
%
%        'z'  highest and lowest filters taper down to zero (default)
%        'y'  lowest filter remains at 1 down to 0 frequency and
%             highest filter remains at 1 up to nyquist frequency
%
%        If 'ty' or 'ny' is specified, the total power in the fft is preserved.
%
% Outputs: c  mel cepstrum output: one frame per row
%
% Copyright (C) Mike Brookes 1997
%
% Last modified Thu Jun 15 09:14:48 2000
%
% VOICEBOX is a MATLAB toolbox for speech processing. Home page is at
% http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
%
% This program is free software; you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation; either version 2 of the License, or
% (at your option) any later version.
%
% This program is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You can obtain a copy of the GNU General Public License from
% /pub/gnu/COPYING-2.0 or by writing to
% Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
%
if nargin<2 fs=11025; end
if nargin<3 w='M'; end
if nargin<4 nc=12; end
if nargin<5 p=floor(3*log(fs)); end
if nargin<6 n=pow2(floor(log2(0.03*fs))); end
if nargin<9
    fh=0.5;
    if nargin<8
        fl=0;
        if nargin<
% [Part of the listing was lost in extraction here: the rest of the default-argument block
%  and the filterbank and cepstrum computation that defines pw, c and nf.]
if p>nc                         % condition inferred from the elseif branch below
    c(:,nc+1:end)=[];
elseif p<nc
    c=[c zeros(nf,nc-p)];
end
if ~any(w=='0')
    c(:,1)=[];
end
if any(w=='e')
    c=[log(sum(pw)).' c];
end
% calculate derivative
if any(w=='D')
    vf=(4:-1:-4)/60;
    af=(1:-1:-1)/2;
    ww=ones(5,1);
    cx=[c(ww,:); c; c(nf*ww,:)];
    vx=reshape(filter(vf,1,cx(:)),nf+10,nc);
    vx(1:8,:)=[];
    ax=reshape(filter(af,1,vx(:)),nf+2,
% [The source preview ends here, mid-statement.]
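The derivative step at the end of the listing pads the cepstral matrix by repeating its first and last rows, then runs a regression filter down the stacked columns. A minimal sketch of that idea on made-up data follows. It uses four padding rows per side so that the result has one row per frame, which is an assumption for illustration and not a reconstruction of the truncated listing; the matrix c is dummy data.

```matlab
% Illustrative delta (first-derivative) computation on a dummy cepstral matrix.
nf = 20;                          % number of frames (made-up)
c  = (1:nf)' * [1 2];             % column 1 rises by 1 per frame, column 2 by 2
nc = size(c, 2);                  % number of coefficient columns

vf = (4:-1:-4)/60;                % 9-tap regression filter, as in the listing
ww = ones(4,1);                   % ASSUMPTION: four padding rows per side
cx = [c(ww,:); c; c(nf*ww,:)];    % repeat the first and last rows as padding
vx = reshape(filter(vf, 1, cx(:)), nf+8, nc);
vx(1:8,:) = [];                   % drop the rows contaminated by filter start-up

disp(vx(5:nf-4,:));               % interior rows, unaffected by the padding
```

Because the filter taps are k/60 for k = 4 down to -4 and the sum of k^2 over that range is 60, the interior rows come out approximately equal to the per-frame slope of each column (1 and 2 for this dummy input); the first and last four rows are flattened by the repeated edge values.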
