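A Study of Speech Features Based on Phase Space Reconstruction
[Authors]陈亮; 张雄伟;
[Summary]This paper reconstructs the phase space of speech signals, studies the similar sequence repeatability of speech and its entropy information, and analyzes and compares the nonlinear characteristics of speech signals in phase space. Speech phonemes are classified according to the different spatial distributions of unvoiced and voiced sounds in multidimensional phase space. The nonlinear characteristics of speech signals in phase space can offer a new direction for speech recognition research.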
[Abstract]Based on Takens' theorem, the time-delay method is used in this paper to reconstruct the phase space of the speech signal. In the high-dimensional phase space, the similar sequence repeatability (RPT) of speech is calculated. Speech phonemes are then classified according to the RPT difference between voiced and unvoiced sounds. The proposed method offers a new approach to speech recognition research.
[Keywords]chaos; phase space reconstruction; similar sequence repeatability; entropy information;
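The time-delay reconstruction named in the abstract can be illustrated with a minimal sketch. The function name `delay_embed` and the parameters `dim` (embedding dimension) and `tau` (delay in samples) are assumptions made for illustration; the entry does not state how the authors chose these values or how RPT is computed on the resulting vectors.

```python
def delay_embed(signal, dim, tau):
    """Build delay vectors [x[i], x[i+tau], ..., x[i+(dim-1)*tau]].

    `dim` and `tau` are illustrative parameters, not values from the paper.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    # Number of complete delay vectors that fit inside the signal.
    n_vectors = len(signal) - (dim - 1) * tau
    if n_vectors <= 0:
        return []
    return [
        [signal[i + k * tau] for k in range(dim)]
        for i in range(n_vectors)
    ]


samples = [0, 1, 2, 3, 4, 5]
vectors = delay_embed(samples, dim=3, tau=2)
```

Each row of `vectors` is one point in the reconstructed phase space; a signal too short for the chosen `dim` and `tau` yields an empty list.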
|
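A Robust Speech Feature Extraction Framework Based on Sub-band Information
[Authors]张欣研; 王帆; 郑方; 徐明星; 吴文虎;
[Summary]This paper proposes a robust speech feature extraction framework. A noise estimation method based on the sub-band energy distribution estimates the sub-band noise of noisy speech without requiring silent segments. Spectral subtraction and spectral weighting are then combined to process the features, yielding features with high robustness. Experiments show that these features effectively improve the robustness of a speech recognition system: under strong noise (signal-to-noise ratio from 0 dB to 15 dB) the recognition rate improves by more than 20%, while for clean speech the recognition rate does not drop substantially. The feature processing also adapts well to many kinds of noise and achieves good noise resistance without classifying the noise in advance.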
[Abstract]In this paper, we propose a new method that integrates sub-band information into features through both sub-band weighting and spectral subtraction for robust speech recognition. The method adds only simple on-line noise estimation and sub-band processing, with the sub-bands defined by the filter banks of the common MFCC calculation, to the traditional MFCC algorithm to obtain robust MFCCs without any prior knowledge of the noise. Furthermore, other robust methods after the feature ex...
[Keywords]speech recognition; noise estimation; robust speech features;
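The combination of spectral subtraction and sub-band weighting can be sketched as follows. This is only an illustration under stated assumptions: the function `subtract_and_weight`, the `floor` parameter, and the weight formula are hypothetical, and the entry does not give the authors' actual noise estimator or weighting rule.

```python
def subtract_and_weight(band_energy, noise_energy, floor=0.01):
    """Spectral subtraction with a floor, plus a per-band reliability weight.

    The weight formula is an assumption for illustration only.
    """
    if len(band_energy) != len(noise_energy):
        raise ValueError("both inputs need one value per sub-band")
    cleaned, weights = [], []
    for energy, noise in zip(band_energy, noise_energy):
        # Subtract the noise estimate; the floor keeps the result positive.
        clean = max(energy - noise, floor * energy)
        cleaned.append(clean)
        # Hypothetical weight: share of the band attributed to speech.
        total = clean + noise
        weights.append(clean / total if total > 0 else 0.0)
    return cleaned, weights


cleaned, weights = subtract_and_weight([10.0, 4.0], [2.0, 5.0])
```

In this example the second sub-band is dominated by noise, so it is floored and receives a weight close to zero, while the first keeps most of its energy.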
|
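Voice Command Understanding Based on Word Graph Tree Expansion and Its Error-Tolerance Algorithm
[Authors]陈俊燕; 李涓子; 王作英;
[Summary]This paper presents exploratory research on algorithms for computer understanding of voice commands. First, a word graph tree expansion understanding algorithm is proposed to suit the structure of word graphs. Analysis and experimental comparison show that, with only a small drop in precision, it achieves much higher recall than the traditional N-best path understanding algorithm, while its computational cost is only comparable to that of the N-best algorithm with a very small number of sentence candidates. Second, based on analysis and observation of the experimental results, an effective error-tolerant command understanding algorithm is given, which raises understanding recall to 91.7% while keeping precision above 90% and reducing the understanding error rate by 13.5%, with a negligible increase in computational complexity.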
[Abstract]In order to build a more accurate and robust voice command system, a novel Word Graph Expansion algorithm for voice command understanding is presented in this paper. Experimental results show that this algorithm performs much better than the commonly used N-best algorithm while maintaining high computational efficiency. An error tolerance method is also put forward to improve the robustness of our voice command understanding module, which further decreases the understanding error ...
[Keywords]voice command; N-best path understanding algorithm; word graph expansion; top-down chart parsing; error tolerance;
|
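Audio Segmentation of Broadcast Speech
[Authors]贾磊; 穆向禺; 徐波;
[Summary]The broadcast television news segmentation system in this paper consists of three parts: segmentation, classification, and clustering. The segmentation part uses the algorithm proposed here, based on detecting the trend of entropy change, to detect acoustic change points in continuous speech audio and thereby separate audio of different types. Unlike traditional change point detection methods that require a threshold, it examines, for every possible split point within a window of given length, the trend of the signal entropy of the two resulting segments, which avoids segmentation errors caused by a poorly chosen threshold. The classification part uses a conventional Gaussian classifier based on Gaussian mixture models (GMM), and the clustering part uses a vector quantization (VQ) based speaker clustering algorithm. Applied to three 30-minute news recordings, the system successfully segmented the continuous audio, removed all background music, and grouped speech from the same speaker into one class with high accuracy, laying a good foundation for the classification and recognition of broadcast speech.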
[Abstract]Speaker change point detection based on the BIC criterion is the most widely used method for speaker change detection in broadcast segmentation. Although its author asserts that this method is free from thresholds, the requirement that the BIC value of a change point be above 0 is too strict for some short utterances. Because speakers differ from one another, the BIC value for two different speakers is spread over a large range in our tests. In this paper, a speaker change detection method based on the entropy changing trend is used to...
[Keywords]audio segmentation of broadcast speech; acoustic change point detection; BIC-based acoustic change point detection; entropy change trend;
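The general idea of scoring every candidate split point in a window by the entropy of the two resulting segments, without a threshold, can be sketched as follows. This is a simplified stand-in, not the authors' entropy-trend criterion, whose details the entry does not give: it assumes one-dimensional features, models each segment as a single Gaussian, and picks the split with the lowest length-weighted entropy. The names `gaussian_entropy`, `best_split`, and `min_len` are hypothetical.

```python
import math


def gaussian_entropy(values):
    """Differential entropy of a 1-D Gaussian fitted to `values`."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    # A small variance floor avoids log(0) on constant segments.
    return 0.5 * math.log(2 * math.pi * math.e * max(variance, 1e-12))


def best_split(values, min_len=2):
    """Return the split index with the lowest length-weighted entropy."""
    n = len(values)
    best_index, best_score = None, float("inf")
    for i in range(min_len, n - min_len + 1):
        left, right = values[:i], values[i:]
        score = (len(left) * gaussian_entropy(left)
                 + len(right) * gaussian_entropy(right)) / n
        if score < best_score:
            best_index, best_score = i, score
    return best_index


window = [0.0, 0.1, -0.1, 0.05, 5.0, 5.1, 4.9, 5.05]
split = best_split(window)
```

Because the choice is made by comparing candidates within the window rather than against a fixed value, no detection threshold has to be tuned.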
|
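A Fast Speaker Adaptation Algorithm Based on Maximum Likelihood Model Interpolation
[Authors]吕萍; 王作英; 陆大金;
[Summary]This paper proposes a new speaker adaptation algorithm, maximum likelihood model interpolation. Its basic idea is to exploit the correlation between speech units and obtain the speaker-adapted model of the test speaker as a linear combination of a set of speaker-dependent models under the maximum likelihood criterion. Two concrete adaptation algorithms within this interpolation framework are then introduced: mean linear interpolation and matrix linear interpolation. Experiments show that these algorithms converge well and substantially improve recognition performance with only 3 adaptation sentences.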
[Abstract]A novel speaker adaptation method named maximum likelihood model interpolation (MLMI) is proposed. The basic idea of MLMI is to compute the speaker-adapted (SA) model of a test speaker as a linear convex combination of a set of speaker-dependent (SD) models according to the maximum likelihood (ML) criterion. This method makes use of the correlation between speech units. Two concrete algorithms, mean linear interpolation and matrix linear interpolation, are then given. Experiments show that 3 adaptati...
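[Keywords]continuous speech recognition; speaker adaptation; maximum likelihood model interpolation; mean linear interpolation algorithm; matrix linear interpolation;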
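A toy version of a maximum likelihood convex combination can make the idea concrete. The sketch below is a strong simplification and not the MLMI algorithm itself: it assumes only two speaker-dependent models, each reduced to a single Gaussian mean with identity covariance, in which case the ML weight has a closed form that is clipped to [0, 1]. The function name `interpolate_two_means` is hypothetical.

```python
def interpolate_two_means(mean_a, mean_b, frames):
    """ML convex combination of two mean vectors (unit-variance Gaussians).

    Returns (weight_on_a, adapted_mean). Illustrative assumption only.
    """
    dim = len(mean_a)
    n = len(frames)
    # Sample mean of the adaptation frames.
    avg = [sum(frame[d] for frame in frames) / n for d in range(dim)]
    diff = [a - b for a, b in zip(mean_a, mean_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        # Identical models: any weight gives the same adapted mean.
        return 0.5, list(mean_a)
    # Unconstrained least-squares (equivalently ML) weight, then clipping;
    # clipping is exact here because the objective is concave in one variable.
    weight = sum((x - b) * d for x, b, d in zip(avg, mean_b, diff)) / denom
    weight = min(1.0, max(0.0, weight))
    adapted = [weight * a + (1 - weight) * b for a, b in zip(mean_a, mean_b)]
    return weight, adapted


weight, adapted = interpolate_two_means(
    [0.0, 0.0], [4.0, 0.0], [[1.0, 0.5], [1.0, -0.5]]
)
```

With real systems the combination would run over many models and full HMM parameters, which needs an iterative estimator rather than this closed form.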
|
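An Acoustic Experimental Study of Chinese Prosodic Boundaries
[Authors]胡伟湘; 徐波; 黄泰翼;
[Summary]Based on the prosodically annotated corpus ASCCD and from the perspective of speech signal analysis, this paper studies how the prosodic break patterns of Mandarin Chinese are reflected in three aspects of speech: duration, fundamental frequency, and intensity. A decision tree model for recognition and classification is built on extensive statistical analysis. Experiments show that these features describe the prosodic break patterns of read speech well.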
[Abstract]Based on a large speech corpus (ASCCD) with prosodic structure labels, this paper presents statistical results on the acoustic parameters of prosodic boundaries. We study syllable duration, intensity, and pitch at the boundary and select a series of acoustic parameters to train a CART. The results show that the parameters characterize the acoustic features of the prosodic boundary and that the trained CART can classify different boundaries efficiently. It is therefore possible to train a statistical model for prosodic boundary location in M...
[Keywords]prosodic boundary; prosodic structure; decision tree;
|
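A Method for Inferring Sense Combinations of Two-Character Words
[Authors]郑家恒; 钱揖丽; 李竞;
[Summary]Chinese characters are ideographic and rich in semantic content. They form a finite closed set, whereas Chinese words form an open, unbounded system. Taking "character senses as primitives, word senses as combinations" as its basic idea, this paper starts from character senses to study the sense combinations of two-character words. Using the curated 《现代汉语规范字典》, 《现代汉语词典》, and 《同义词词林》 as resources, it first automatically searches for and extracts the sense combinations of two-character words to build a knowledge base of character senses and word senses. It then adopts the semantic system of 《同义词词林》 and determines the combination types by computing measures such as semantic relatedness, providing a reference for research on sense combination in two-character words.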
[Abstract]As ideographs with abundant semantic content, Chinese characters form a closed set of limited size, while Chinese words form an open, unlimited system. Following the idea of "character sense elementalization and word sense combinationalization", this paper studies the combination of word senses with character senses as the starting point. First, it establishes a database of character senses and word senses by automatically searching for the combinations of two character words' word ...
[Keywords]word sense; semantic relatedness; sense combination of two-character words; lexicology;
|
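A Hybrid Word-Sense and Word Language Model and Its Application
[Authors]侯珺; 王作英;
[Summary]This paper proposes a statistical language model that mixes words and word senses, studies its performance in word sense tagging and Mandarin speech recognition, and compares it with the traditional word sense model and the word-based language model. The model describes the relation between word senses and words more accurately than the traditional word sense model and yields lower perplexity in word sense tagging. In Mandarin continuous speech recognition it outperforms the word-based trigram model and requires less storage.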
[Abstract]A hybrid semantic and word-based language model is proposed in this paper. The performance of the model is tested in semantic tagging and Mandarin speech recognition and compared with traditional N-gram and semantic language models. The hybrid model better describes the relation between semantics and words and achieves a lower perplexity on the tagging corpus. In Mandarin speech recognition, this model performs better and requires less memory than the word-based trigram model.
[Keywords]statistical language model; word sense model; word sense tagging; speech recognition;
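Perplexity, the measure used above to compare the models, is the inverse geometric mean of the probabilities a model assigns to each token of a test sequence. The short sketch below shows the standard computation; the function name `perplexity` and the sample probabilities are illustrative and are not taken from the paper.

```python
import math


def perplexity(probabilities):
    """Perplexity of a sequence given the model probability of each token."""
    if not probabilities:
        raise ValueError("at least one probability is required")
    if any(p <= 0 for p in probabilities):
        raise ValueError("probabilities must be positive")
    # Average log2 probability per token, negated and exponentiated.
    log_sum = sum(math.log2(p) for p in probabilities)
    return 2 ** (-log_sum / len(probabilities))


uniform_guess = perplexity([0.25, 0.25, 0.25, 0.25])
```

A model that always spreads its probability evenly over four choices has a perplexity of 4; a lower value means the model is less "confused" by the test data.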
|
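A Chinese Overlapping-Ambiguity Segmentation Method Based on Combining SVM and k-NN
[Authors]李蓉; 刘少辉; 叶世伟; 史忠植;
[Summary]This paper proposes a classification method combining support vector machines (SVM) and k-nearest neighbors (k-NN) to resolve overlapping pseudo-ambiguous strings. The disambiguation of such strings is first formalized as a classification process, and a representation for ambiguous strings is given. The solution is a supervised learning process: high-frequency pseudo-ambiguous strings are selected from the ambiguous strings, segmented correctly by hand, and used to train the SVM. An ambiguous string to be recognized is then segmented by the combined SVM and k-NN classifier. Experiments show that the method correctly handles 91.6% of overlapping ambiguous strings and is reasonably stable.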
[Abstract]This paper presents an algorithm based on the combination of Support Vector Machine (SVM) and k-nearest neighbor (k-NN) to deal with ambiguities in Chinese word segmentation. We regard ambiguity segmentation as a classification problem and propose a vector representation of the ambiguities. Solutions are found by supervised learning. After the ambiguities are selected and classified by hand, the high-frequency ambiguities are used to train the SVM. For the testing ambiguities, we classify them based on mixe...
[Keywords]support vector; class representative point; overlapping ambiguity; automatic Chinese word segmentation;
|
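Design of a Lexical Selection Model for English Generation in a Chinese-English Translation System
[Authors]陈毅东; 李堂秋; 洪青阳; 郑旭玲;
[Summary]This paper describes the design of an English lexical selection model based on example comparison and supplemented by semantic pattern matching. We first discuss the importance of lexical selection in the English generation stage of a Chinese-English translation system, then compare several possible selection strategies and propose our model, and then describe the structure of the generation lexicon and the selection algorithm in some detail. We also briefly introduce the semantic knowledge resource we use, HowNet (《知网》).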
[Abstract]This paper describes a model of English lexical selection based on example comparison and aided by semantic pattern matching. First, we argue for the importance of lexical selection during English generation in a Chinese-English machine translation system. Then we compare several lexical selection strategies and put forward our model. Finally, we describe the structure of the generation lexicon and the algorithm for our method in detail. This paper also briefly introduces the semantic knowledge reso...
[Keywords]example-based; semantic pattern; similarity; HowNet;
|