Abstract: For large speech corpora, speech segmentation is done either manually or automatically by machine. Manual segmentation makes quality easy to control, but it is inefficient and costly; automatic segmentation is efficient, but locating its segmentation errors afterwards is extremely laborious. This paper therefore proposes a human-computer interactive speech segmentation system in which the operator can choose an automatic segmentation algorithm, set the segmentation parameters, and correct faulty automatic results, while the system automatically generates the label files used for HTK training. The study uses 1,000 Pumi speech files collected by the research group, with Pumi isolated words as the segmentation unit. Fully automatic segmentation produces unavoidable errors that require a heavy checking workload afterwards, whereas with the proposed system an operator without a high level of segmentation-related expertise can reach nearly 100% segmentation accuracy.
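The workflow described in the abstract (automatic segmentation with adjustable parameters, manual correction, then HTK label output) can be illustrated with a minimal sketch. The abstract does not say which segmentation algorithms the system offers, so the short-time energy threshold below is only an assumed stand-in, and every function and parameter name (`segment_by_energy`, `threshold_ratio`, `apply_corrections`, `to_htk_label`) is hypothetical. The one fixed convention is that HTK label files give times as integers in units of 100 ns.

```python
def frame_energies(samples, frame_len, hop):
    """Short-time energy (sum of squared samples) of each full frame."""
    last_start = max(len(samples) - frame_len + 1, 0)
    return [sum(x * x for x in samples[i:i + frame_len])
            for i in range(0, last_start, hop)]


def segment_by_energy(samples, rate, frame_ms=25, hop_ms=10, threshold_ratio=0.1):
    """Assumed stand-in algorithm: return (start_s, end_s) pairs for runs of
    frames whose energy exceeds threshold_ratio * (maximum frame energy)."""
    frame_len = int(rate * frame_ms / 1000)
    hop = int(rate * hop_ms / 1000)
    energies = frame_energies(samples, frame_len, hop)
    if not energies:
        return []
    threshold = threshold_ratio * max(energies)
    segments, start = [], None
    for idx, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = idx  # first frame of a new word
        elif energy <= threshold and start is not None:
            # the word ends at the end of the last frame above the threshold
            end_sample = (idx - 1) * hop + frame_len
            segments.append((start * hop / rate, end_sample / rate))
            start = None
    if start is not None:  # word still open at the end of the file
        segments.append((start * hop / rate, len(samples) / rate))
    return segments


def apply_corrections(segments, corrections):
    """Human-in-the-loop step: replace automatic segments by index with
    boundaries supplied by the operator."""
    return [corrections.get(i, seg) for i, seg in enumerate(segments)]


def to_htk_label(segments, words):
    """Format segments as HTK label lines 'start end name' (100 ns units)."""
    lines = [f"{round(start_s * 1e7)} {round(end_s * 1e7)} {word}"
             for (start_s, end_s), word in zip(segments, words)]
    return "\n".join(lines) + "\n"
```

In this sketch the operator would tune `frame_ms`, `hop_ms`, and `threshold_ratio`, inspect the resulting segments, pass any fixed boundaries to `apply_corrections`, and only then write the label text to disk.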
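Basic information:
CLC number: TN912.3
Citation:
[1] Guo Lin (郭琳), Su Jie (苏洁), Li Yufang (李余芳), et al. A human-computer interactive speech segmentation system [J]. Journal of Yunnan Minzu University (Natural Sciences Edition), 2016, 25(01): 87-91.
Funding:
Scientific Research Fund of the Yunnan Provincial Department of Education (2014Z091); Graduate Innovation Fund of Yunnan Minzu University (2015YJCXY285)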
2016-01-10