Laser & Optoelectronics Progress, 2020, 57(8): 081006, Published online: 2020-04-03
Optical Music Recognition Method Combining Multi-Scale Residual Convolutional Neural Network and Bi-Directional Simple Recurrent Units
Keywords: digital image processing; optical music recognition; convolutional neural network; multi-scale feature fusion; simple recurrent units
Abstract
Optical music recognition is valuable in fields such as music information retrieval and computer-aided instruction. Traditional frameworks involve complicated processing steps and achieve low accuracy, while deep-learning-based models take a long time to train and make large errors on difficult notes. To address these problems, an improved convolutional recurrent neural network is proposed to raise recognition accuracy. First, different types of noise are added to the original scores to augment the score images and improve the robustness of the trained model. Then, a multi-scale residual convolutional neural network extracts note features from the score images, which improves subsequent recognition accuracy. Finally, a bi-directional simple recurrent unit network recognizes the note features and speeds up training convergence. Experimental results show that the average symbol error rate of the improved network model falls to 0.3234%, that convergence is faster, and that training takes about one third of the time required by a traditional convolutional recurrent neural network.
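The abstract reports an average symbol error rate but does not define it. As a minimal sketch, assuming the definition commonly used in sequence recognition (total edit distance between predicted and reference symbol sequences, divided by the total number of reference symbols, expressed as a percentage), the metric could be computed as follows. The function names and the example tokens are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference symbol
                curr[j - 1] + 1,     # insert a predicted symbol
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def symbol_error_rate(refs: List[Sequence[str]], hyps: List[Sequence[str]]) -> float:
    """Assumed definition: 100 * total edits / total reference symbols."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_symbols = sum(len(r) for r in refs)
    return 100.0 * total_edits / total_symbols


# Hypothetical token sequences for one score image (illustrative only).
reference = [["clef-G2", "time-3/4", "note-C4_quarter", "note-E4_quarter", "barline"]]
predicted = [["clef-G2", "time-3/4", "note-C4_quarter", "note-F4_quarter", "barline"]]
print(symbol_error_rate(reference, predicted))  # one substitution in five symbols: 20.0
```

Under this assumed definition, a rate of 0.3234% would correspond to roughly one symbol-level edit per 300 reference symbols.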
Qiong Wu, Qiang Li, Xin Guan. Optical Music Recognition Method Combining Multi-Scale Residual Convolutional Neural Network and Bi-Directional Simple Recurrent Units[J]. Laser & Optoelectronics Progress, 2020, 57(8): 081006.