Asynchronous Parallel GPU Acceleration Method Based on Total Variation Minimization Model
College of Information System Engineering, PLA Information Engineering University, Zhengzhou, Henan 450002, China
Figures & Tables
Fig. 1. (a) Synchronous parallel computing; (b) asynchronous parallel computing
Fig. 2. Structure of the GPU cluster
Fig. 3. Original image and reconstructed middle slices of the four experiments at the 2000th iteration. (a) Original image; (b) result of experiment 1; (c) result of experiment 2; (d) synchronous parallel result of experiment 3; (e) asynchronous parallel result of experiment 3; (f) synchronous parallel result of experiment 4; (g) asynchronous parallel result of experiment 4
Table 1. Parameters of the experimental platform for reconstruction acceleration

Item | GPU computing server I | GPU computing server II |
---|---|---|
CPU | Intel Ivy Bridge E5-2630 v2 (2.6 GHz) | Intel Ivy Bridge E5-2630 v2 (2.6 GHz) |
GPU | Tesla K20 (5 GB) ×1 | Tesla K20 (5 GB) ×1, Tesla K40 (12 GB) ×1 |
RAM | 32 GB | 32 GB |
Operating system | Windows 7, 64 bit | Windows 7, 64 bit |
Number of servers | 2 | 2 |
Table 2. Scanning parameters of the cone-beam CT system

Item | Parameter |
---|---|
Scanning angle range /(°) | 0-360 |
Number of detector elements | 1024×1024 |
Distance from source to rotation center /mm | 600 |
Distance from source to detector /mm | 1200 |
Number of projections | 60 |
Projection data size | 1024×1024×60 |
Reconstruction size | 512×512×512 |
Pixel size /(mm×mm) | 0.25×0.25 |
Table 3. RMSE of the results at the 2000th iteration and average time per iteration for the four experiments

Result | Experiment 1 | Experiment 2 | Experiment 3, synchronous parallel | Experiment 3, asynchronous parallel | Experiment 4, synchronous parallel | Experiment 4, asynchronous parallel |
---|---|---|---|---|---|---|
RMSE | 6.78×10⁻⁴ | 6.78×10⁻⁴ | 6.78×10⁻⁴ | 5.69×10⁻⁴ | 6.78×10⁻⁴ | 5.34×10⁻⁴ |
Average time per iteration /s | 6842.00 | 130.50 | 53.93 | 51.49 | 53.26 | 45.67 |
Speedup ratio (compared with CPU) | - | 52.4 | 126.9 | 132.9 | 128.5 | 149.8 |
Speedup ratio (compared with single GPU) | - | - | 2.42 | 2.53 | 2.45 | 2.86 |
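The speedup ratios in Table 3 can be reproduced from the average times per iteration. The short Python sketch below is only an illustration of that arithmetic, not code from the paper. It assumes that experiment 1 is the CPU baseline and experiment 2 the single-GPU run, which is what the dashes in the two speedup rows suggest; the dictionary keys and the `speedup` helper are hypothetical names chosen for this example.

```python
# Average time per iteration in seconds, copied from Table 3.
# Assumption: Experiment 1 is the CPU baseline and Experiment 2 the
# single-GPU run (inferred from the "-" entries in the speedup rows).
AVG_TIME_S = {
    "Experiment 1": 6842.00,
    "Experiment 2": 130.50,
    "Experiment 3 (sync)": 53.93,
    "Experiment 3 (async)": 51.49,
    "Experiment 4 (sync)": 53.26,
    "Experiment 4 (async)": 45.67,
}


def speedup(baseline: str, run: str) -> float:
    """Return the baseline's time per iteration divided by the run's."""
    return AVG_TIME_S[baseline] / AVG_TIME_S[run]


if __name__ == "__main__":
    for run in AVG_TIME_S:
        vs_cpu = speedup("Experiment 1", run)
        vs_gpu = speedup("Experiment 2", run)
        print(f"{run:22s} vs CPU: {vs_cpu:6.1f}   vs single GPU: {vs_gpu:5.2f}")
```

The same helper also gives the gain of asynchronous over synchronous execution within one experiment: `speedup("Experiment 4 (sync)", "Experiment 4 (async)")` is roughly 1.17, against roughly 1.05 for experiment 3.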
Wanli Lu, Ailong Cai, Zhizhong Zheng, Linyuan Wang, Lei Li, Bin Yan. Asynchronous Parallel GPU Acceleration Method Based on Total Variation Minimization Model[J]. Acta Optica Sinica, 2018, 38(4): 0411004.