Infrared and Laser Engineering, 2018, 47(2): 0203008, published online: 2018-04-26

RGB-D object recognition based on hybrid convolutional auto-encoder extreme learning machine
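Author affiliations
1 School of Electronics and Information, Northwestern Polytechnical University, Xi'an, Shaanxi 710072
2 Science and Technology on Transient Impact Laboratory, Beijing 102202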
Abstract
Learning rich representations efficiently is essential in RGB-D object recognition and is key to achieving high generalization performance. To address the long training time of convolutional neural networks, a hybrid convolutional auto-encoder extreme learning machine (HCAE-ELM) structure is proposed. It consists of a convolutional neural network (CNN) and an auto-encoder extreme learning machine (AE-ELM), combining the effectiveness of the CNN with the fast training of the AE-ELM. Convolution and pooling layers extract low-level features from the RGB and depth images separately. A shared layer then merges the features of the two modalities and feeds them to the AE-ELM to obtain higher-level features. The final features are classified by an extreme learning machine (ELM), which gives better generalization with faster learning. HCAE-ELM was evaluated on the standard RGB-D object dataset. Experimental results show that, compared with deep learning methods and other ELM methods, the proposed model achieves better testing accuracy with significantly shorter training time.
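The data flow described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the random (untrained) convolution filters, single-channel stand-in for RGB, tanh activations, layer sizes, regularization constant C, synthetic data, and every function name are hypothetical choices made here for illustration.

```python
import numpy as np

def conv_pool(images, filters, pool=2):
    """Valid cross-correlation with fixed filters, tanh, then average pooling.
    images: (n, H, W); filters: (k, f, f). Returns flattened features (n, k*oh*ow)."""
    n = images.shape[0]
    k, f, _ = filters.shape
    # All f x f patches: shape (n, H-f+1, W-f+1, f, f)
    windows = np.lib.stride_tricks.sliding_window_view(images, (f, f), axis=(1, 2))
    maps = np.tanh(np.einsum("nhwij,kij->nkhw", windows, filters))
    oh, ow = maps.shape[2] // pool, maps.shape[3] // pool
    maps = maps[:, :, :oh * pool, :ow * pool]
    pooled = maps.reshape(n, k, oh, pool, ow, pool).mean(axis=(3, 5))
    return pooled.reshape(n, -1)

def ridge_solve(H, T, C):
    """Regularized least squares: beta = (H^T H + I/C)^-1 H^T T."""
    return np.linalg.solve(H.T @ H + np.eye(H.shape[1]) / C, H.T @ T)

def random_hidden(X, n_hidden, rng):
    """Random, untrained hidden layer as used in ELMs (weights scaled by 1/sqrt(d))."""
    W = rng.standard_normal((X.shape[1], n_hidden)) / np.sqrt(X.shape[1])
    b = rng.standard_normal(n_hidden)
    return W, b

def ae_elm_features(X, n_hidden, C, rng):
    """ELM auto-encoder: learn beta that reconstructs X from a random hidden
    layer, then reuse beta^T as the encoder. (Orthogonalizing W is omitted.)"""
    W, b = random_hidden(X, n_hidden, rng)
    H = np.tanh(X @ W + b)
    beta = ridge_solve(H, X, C)          # shape (n_hidden, d)
    return np.tanh(X @ beta.T)           # shape (n, n_hidden)

def elm_train(X, y, n_hidden, n_classes, C, rng):
    """ELM classifier: random hidden layer, closed-form output weights."""
    W, b = random_hidden(X, n_hidden, rng)
    H = np.tanh(X @ W + b)
    T = np.eye(n_classes)[y]             # one-hot targets
    return W, b, ridge_solve(H, T, C)

def elm_predict(X, model):
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# Synthetic two-class data standing in for RGB and depth images (assumption).
rng = np.random.default_rng(0)
n, size = 40, 12
y = np.arange(n) % 2
rgb = rng.standard_normal((n, size, size)) + y[:, None, None]
depth = rng.standard_normal((n, size, size)) - y[:, None, None]

# 1) Low-level features per modality, 2) shared layer, 3) AE-ELM, 4) ELM classifier.
f_rgb = conv_pool(rgb, rng.standard_normal((4, 3, 3)))
f_depth = conv_pool(depth, rng.standard_normal((4, 3, 3)))
shared = np.hstack([f_rgb, f_depth])
high = ae_elm_features(shared, 32, 1e3, rng)
model = elm_train(high, y, 64, 2, 1e3, rng)
pred = elm_predict(high, model)
```

Both the AE-ELM and the ELM classifier obtain their output weights from a single linear solve rather than iterative back-propagation, which is the general reason ELM-style stages train quickly. How the convolutional filters are obtained in the paper is not stated in the abstract, so they are simply drawn at random in this sketch.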

Yin Yunhua, Li Huifang. RGB-D object recognition based on hybrid convolutional auto-encoder extreme learning machine[J]. Infrared and Laser Engineering, 2018, 47(2): 0203008. (in Chinese)

