SUN Chenying, SHEN Xizhong. Research and Application of LSTM and GRU in Urban Sound Classification[J]. Journal of Technology, 2020, 20(2): 158-164. DOI: 10.3969/j.issn.2096-3424.2020.02.008

Research and Application of LSTM and GRU in Urban Sound Classification

    Abstract: Different types of sound affect the physical and mental health of urban residents in different ways. Classifying urban sounds accurately supports their effective evaluation and, in turn, the management of urban sound. Deep learning has already been applied to speech recognition, where the recurrent neural network (RNN) has been the most prominent approach. Because the basic RNN suffers from pronounced vanishing gradients, high network loss, and low accuracy, improved RNN variants were used to classify urban background noise. Long short-term memory (LSTM) and gated recurrent unit (GRU) networks were used to build deep recurrent neural network models, and their accuracy was tested and analyzed on UrbanSound8K, a public dataset of recorded urban sounds. The models were implemented on a Mel-frequency cepstral coefficient (MFCC) baseline, and their results improved markedly on those of the basic RNN.
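As a rough illustration of the gated recurrence the abstract refers to, the following sketch runs a single GRU cell over a sequence of MFCC-like frames and maps the final hidden state to ten class probabilities. It is not the authors' model. The weights are random and untrained, biases are omitted, and the names `GRUCell`, `classify`, `N_MFCC`, and `HIDDEN` are hypothetical, as are the sizes of 13 coefficients per frame and 8 hidden units. The update rule follows one common convention, h_t = (1 - z) * h_{t-1} + z * n.

```python
import math
import random

N_MFCC = 13        # assumed number of MFCC coefficients per frame
HIDDEN = 8         # assumed hidden-state size
NUM_CLASSES = 10   # UrbanSound8K defines 10 sound classes


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vec):
    # Plain matrix-vector product on nested lists.
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def rand_matrix(rows, cols, rng, scale=0.1):
    # Small random weights; a real model would learn these by training.
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Single GRU cell with update (z), reset (r), and candidate (n) paths."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        self.W = {g: rand_matrix(hidden_size, input_size, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_size, hidden_size, rng) for g in "zrn"}

    def _pre(self, gate, x, h):
        # Pre-activation: W[gate] @ x + U[gate] @ h (biases omitted).
        return [a + b for a, b in zip(matvec(self.W[gate], x), matvec(self.U[gate], h))]

    def step(self, x, h):
        z = [sigmoid(v) for v in self._pre("z", x, h)]   # update gate
        r = [sigmoid(v) for v in self._pre("r", x, h)]   # reset gate
        gated_h = [ri * hi for ri, hi in zip(r, h)]      # reset applied to old state
        n = [math.tanh(v) for v in self._pre("n", x, gated_h)]  # candidate state
        # Blend old state and candidate; z near 0 carries the old state through.
        return [(1.0 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(frames, cell, w_out):
    # Run the cell over all frames, then classify from the final hidden state.
    h = [0.0] * cell.hidden_size
    for frame in frames:
        h = cell.step(frame, h)
    return softmax(matvec(w_out, h)), h
```

The update gate lets the previous state pass through almost unchanged when z is close to zero, which is the mechanism usually credited with easing the vanishing-gradient problem of the basic RNN. In practice the frames would come from an MFCC extractor applied to each audio clip rather than from random numbers.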

     
