Citation: XU Yuanhong, GUO Qiong. Application of Data Mining Techniques in Speech Recognition[J]. Journal of Technology, 2019, 19(1): 84-87. DOI: 10.3969/j.issn.2096-3424.2019.01.012

Application of Data Mining Techniques in Speech Recognition

Abstract: Using data mining techniques to identify the source of a speech signal, and thereby to authenticate the speaker's identity and assign operation permissions, is of considerable theoretical and practical significance. This paper studies speaker recognition in two settings: utterances with the same spoken content and utterances with different spoken content. Mel-frequency cepstral coefficients (MFCC), which are widely used in speaker recognition, are adopted for feature extraction, and the dynamic time warping (DTW) algorithm is used for pattern matching and classification. In the different-content setting, the K-means++ algorithm and principal component analysis (PCA) are applied to the MFCC matrices before DTW to reduce their dimensionality and cluster them, so that the templates to be matched have similar or identical dimensions. The results show that, in the same-content setting, good recognition performance can be obtained once an appropriate threshold is selected.
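The template-matching step described in the abstract can be illustrated with a minimal sketch of DTW followed by a threshold decision. This is not the authors' implementation: the function names `dtw_distance` and `same_speaker`, the Euclidean frame distance, the normalization by sequence length, and the threshold value are all assumptions made for illustration, and short toy vectors stand in for real MFCC frames.

```python
import math


def dtw_distance(a, b):
    """Length-normalized DTW distance between two sequences of feature vectors.

    `a` and `b` are lists of equal-dimension float vectors (stand-ins for
    MFCC frames). The two sequences may have different lengths.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between frames
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of `a` repeated against b[j-1]
                cost[i][j - 1],      # frame of `b` repeated against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    # Dividing by (n + m) is an assumed normalization so that scores of
    # utterances with different lengths stay comparable.
    return cost[n][m] / (n + m)


def same_speaker(template, probe, threshold):
    """Accept the probe if its DTW distance to the template is within the threshold."""
    return dtw_distance(template, probe) <= threshold


# Toy data: `probe_slow` is the template spoken "more slowly" (frames repeated),
# while `other` is a shifted sequence standing in for a different speaker.
template = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
probe_slow = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [2.0, 2.0]]
other = [[5.0, 5.0], [6.0, 6.0], [7.0, 7.0]]

THRESHOLD = 0.5  # hypothetical value; in practice it would be tuned on held-out data
print(same_speaker(template, probe_slow, THRESHOLD))
print(same_speaker(template, other, THRESHOLD))
```

Because DTW allows frames to be repeated during alignment, the time-stretched probe aligns to the template at zero cost, whereas the shifted sequence does not. The choice of threshold then determines the trade-off between false acceptances and false rejections.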

     
