Optics and Precision Engineering, 2018, 26(5): 1231. Published online: 2018-08-14

Three-dimensional semantic scene reconstruction network

Three-dimensional reconstruction of semantic scene based on RGB-D map
Author affiliations
1 School of Applied Technology, Changchun University of Technology, Changchun 130000, Jilin, China
2 Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130031, Jilin, China
Abstract
Inferring the three-dimensional geometry of an object from incomplete visual information is an essential capability of a machine vision system, and recognizing the semantics of the objects in a scene is at the core of such a system. Traditional methods usually perform these two tasks separately. This paper couples scene reconstruction tightly with object semantics and proposes a three-dimensional semantic scene reconstruction network that takes only a single depth map as input and performs both semantic classification and reconstruction of the 3D scene. First, an end-to-end 3D convolutional neural network is built: its input is a depth map, a 3D context module learns the region inside the camera frustum, and the output is a set of 3D voxels carrying semantic labels. Second, a synthetic 3D scene dataset with dense volumetric labels is constructed to train the deep learning model. Finally, experiments show that, compared with existing semantic classification and scene reconstruction methods, the reception region of the reconstructed semantic scene increases by 2.0%. The results indicate that the 3D network reconstructs scenes well and labels scene semantics with high accuracy.
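The abstract describes an end-to-end 3D convolutional network that voxelizes a single depth map inside the camera frustum, applies a 3D context module, and predicts a semantic label for every voxel. The PyTorch sketch below is only an illustration of that kind of architecture, not the authors' actual design: the TSDF input encoding, channel width, dilation rates of the context module, 12-class output, and 60×36×60 grid are all assumptions made for the example.

```python
# Minimal sketch of a 3D semantic scene reconstruction network:
# voxelized depth input -> 3D encoder -> dilated 3D context module ->
# per-voxel semantic logits. All sizes are illustrative assumptions.
import torch
import torch.nn as nn


class ContextModule3D(nn.Module):
    """Aggregates multi-scale 3D context with parallel dilated convolutions."""

    def __init__(self, channels: int, dilations=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv3d(channels, channels, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm3d(channels),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        self.fuse = nn.Conv3d(channels * len(dilations), channels, kernel_size=1)

    def forward(self, x):
        # Concatenate branches with different receptive fields, then fuse.
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))


class SemanticSceneCompletionNet(nn.Module):
    """Depth-derived voxel volume in, per-voxel semantic labels out."""

    def __init__(self, num_classes: int = 12, width: int = 16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv3d(1, width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(width, width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.context = ContextModule3D(width)
        # One logit per class (including "empty") for every voxel.
        self.head = nn.Conv3d(width, num_classes, kernel_size=1)

    def forward(self, tsdf_volume):
        # tsdf_volume: (B, 1, D, H, W) TSDF encoding of the depth map
        # restricted to the camera frustum (assumed input representation).
        feats = self.encoder(tsdf_volume)
        feats = self.context(feats)
        return self.head(feats)  # (B, num_classes, D, H, W)


if __name__ == "__main__":
    net = SemanticSceneCompletionNet(num_classes=12)
    volume = torch.randn(1, 1, 60, 36, 60)   # illustrative grid size
    logits = net(volume)
    labels = logits.argmax(dim=1)            # per-voxel semantic label
    print(labels.shape)                      # torch.Size([1, 60, 36, 60])
```

Dilated 3D convolutions are a common way to enlarge the receptive field over the voxel grid without downsampling, which is one plausible reading of the "3D context module" mentioned in the abstract; the paper itself should be consulted for the exact layer configuration.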

LIN Jin-hua, WANG Yan-jie. Three-dimensional reconstruction of semantic scene based on RGB-D map[J]. Optics and Precision Engineering, 2018, 26(5): 1231.
