Chinese Optics Letters, 2020, 18 (6): 061001, Published Online: May 12, 2020

Object tracking method based on joint global and local feature descriptor of 3D LIDAR point cloud

Author Affiliations
1 State Key Laboratory of Pulsed Power Laser Technology, National University of Defense Technology, Hefei 230037, China
2 Anhui Provincial Key Laboratory of Electronic Restriction, National University of Defense Technology, Hefei 230037, China
3 The Military Representative Bureau of the Ministry of Equipment Development, Central Military Commission in Beijing, Beijing 100191, China
Abstract
To fully describe the structural information of the point cloud when the LIDAR-object distance is long, a joint global and local feature (JGLF) descriptor is constructed. Compared with five typical descriptors, JGLF achieves a higher object recognition rate as the LIDAR-object distance changes. For the situation in which an airborne LIDAR approaches the object, the particle filtering (PF) algorithm is used as the tracking framework: particle weights are updated by comparing the differences between JGLFs to track the object. It is verified that the proposed algorithm tracks 13.95% more accurately, and more stably, than the basic PF algorithm.

Object tracking technology is widely used in both civil and military fields, including unmanned driving[1], intelligent robots[2,3], ballistic missile tracking[4], and so on[5]. Particle filtering (PF)[6] is a filtering method based on the Monte Carlo method and recursive Bayesian estimation. It is not restricted by system conditions and can effectively estimate parameters and filter states under nonlinear and non-Gaussian conditions. Therefore, PF has been widely used in practical moving object tracking systems[6–8].

The data obtained by LIDAR[9] have high accuracy and are not easily affected by illumination changes[10]. Besides object tracking in two-dimensional (2D) space, PF has also been applied extensively in three-dimensional (3D) space. However, on the one hand, many researchers treat the 3D information obtained by LIDAR only as a supplement to 2D image information, and few fully explore the structural information latent in the 3D data. Choi and Christensen[11] used photometric (color) and geometric (3D points and surface normals) features to determine the likelihood of each particle. Held et al.[12] combined descriptors of 3D shape, color, and motion to fully describe the object. On the other hand, Zhou[13] directly used the segmented original point cloud to enable a robot to grasp dynamic objects successfully. However, for a faster object, the original point cloud data change as the LIDAR-object distance and the viewing angle of the LIDAR platform change. Such changes affect methods that use the original point cloud directly and decrease the stability of the object description and of the tracking process.

In addition, the above studies were all conducted at close LIDAR-object distances, where the amount of point cloud data is relatively abundant. As the LIDAR-object distance grows, the amount of data shrinks, which makes the object harder to describe; few studies address this situation. In 2D space, joint color features have been used to enrich the description when the camera-object distance is long. Ning et al.[14] built a joint color-texture histogram to enhance the object description. Li et al.[15] proposed a joint color space descriptor to enhance the robustness of pose estimation of a point cloud scene within the PF framework.

In this Letter, the LIDAR is assumed to be on an airborne platform that tracks an object starting from an initial distance of 3 km from the scene. No data obtained under this situation are openly available for tracking; thus, the experimental data in this Letter have not been documented before. The LIDAR-object distance changes continuously as the platform moves. The LIDAR has 64 laser beams, and its repetition frequency and frame frequency are 10 kHz and 20 Hz, respectively. The structure of the proposed method is shown in Fig. 1. First, an object is chosen. To fully describe it, an n×341-dimensional joint global and local feature (JGLF) descriptor is proposed. An object tracking method is then built within the PF framework: by comparing the Bhattacharyya distances[16] between the JGLF of the initial particles and that of the chosen object, particle weights are calculated and particle resampling is performed. Once the new particles are obtained, the object is tracked.

Fig. 1. Proposed object tracking method of point cloud based on JGLF.


Neither local nor global feature descriptors of the 3D LIDAR point cloud perform well when the data are sparse. To make the most of the 3D structural information, the JGLF descriptor is proposed. When the platform is far from the object and the imaging resolution remains the same, the JGLF descriptor performs better. It is calculated as follows.

The relationship between each point and the other points in its k neighborhood is calculated one by one over the 3D dataset P. For a point $P_q$ of the point cloud P, a 1×33-dimensional eigenvector $\mathrm{FPFH}(P_q)$ is obtained by calculating the fast point feature histogram (FPFH)[27]. The points used for calculating these local features are then collected into the set $Q_q$ $(q = 1, 2, \ldots, n)$. Each set $Q_q$ is treated as a whole to extract global features by calculating the viewpoint feature histogram (VFH) $\mathrm{VFH}(Q_q)$[18], a 1×308-dimensional eigenvector.
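Under our reading, each $Q_q$ simply collects the query point together with the k neighbors used in its FPFH computation. The following is a brute-force numpy sketch of that neighborhood construction (the function names knn_indices and neighbor_set are ours, not from the Letter; a k-d tree would be used in practice):

```python
import numpy as np

def knn_indices(points: np.ndarray, q: int, k: int) -> np.ndarray:
    """Indices of the k nearest neighbors of point q (brute force, O(n))."""
    d2 = np.sum((points - points[q]) ** 2, axis=1)  # squared distances to all points
    return np.argsort(d2)[1:k + 1]                  # skip index 0: the point itself

def neighbor_set(points: np.ndarray, q: int, k: int) -> np.ndarray:
    """Q_q: the query point plus the points used for its local FPFH."""
    return np.append(knn_indices(points, q, k), q)
```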

First, the scalar normalizer $\mathrm{sum}$ is computed as the sum of all elements of the two eigenvectors:

$$\mathrm{sum} = \sum_{i=1}^{33} \mathrm{FPFH}(P_q)_i + \sum_{j=1}^{308} \mathrm{VFH}(Q_q)_j.$$

The JGLF descriptor is then constructed as

$$\mathrm{JGLF}(P_q) = \mathrm{FPFH}(P_q) \times T_A + \mathrm{VFH}(Q_q) \times T_B,$$

where $T_A$ (33×341) and $T_B$ (308×341) place each normalized histogram into its own block of the 1×341 result:

$$T_A = \Big[\underbrace{\operatorname{diag}\big(\tfrac{1}{33\,\mathrm{sum}}\big)}_{33}\;\;\underbrace{\mathbf{0}}_{308}\Big], \qquad T_B = \Big[\underbrace{\mathbf{0}}_{33}\;\;\underbrace{\operatorname{diag}\big(\tfrac{1}{308\,\mathrm{sum}}\big)}_{308}\Big].$$
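To make the fusion concrete, here is a minimal numpy sketch, assuming the 1×33 FPFH and 1×308 VFH histograms have already been computed (e.g., with PCL's estimators); the function name jglf_descriptor and the matrix-free formulation are ours, exploiting the block structure of $T_A$ and $T_B$:

```python
import numpy as np

def jglf_descriptor(fpfh: np.ndarray, vfh: np.ndarray) -> np.ndarray:
    """Fuse a 1x33 FPFH and a 1x308 VFH into a 1x341 JGLF vector.

    Equivalent to FPFH @ T_A + VFH @ T_B: because T_A and T_B are block
    matrices of scaled identities and zeros, the product reduces to
    scaling each histogram and writing it into its own block.
    """
    assert fpfh.size == 33 and vfh.size == 308
    total = fpfh.sum() + vfh.sum()        # the scalar "sum" from the text
    jglf = np.empty(341)
    jglf[:33] = fpfh / (33.0 * total)     # FPFH block, scaled by 1/(33*sum)
    jglf[33:] = vfh / (308.0 * total)     # VFH block, scaled by 1/(308*sum)
    return jglf
```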

PF[7,19–21] uses weighted particles to approximate the state distribution of the object at time k. One possible state of the object is represented by the state and measurement equations

$$x_k = f_k(x_{k-1}, v_{k-1}), \qquad y_k = h_k(x_k, n_k).$$

The PF algorithm consists of six steps: particle initialization, state prediction, sequential importance sampling, weight calculation, weight normalization, and particle resampling, as shown in Fig. 2. In the particle initialization step, the total number of particles is set to n, the object to be tracked is selected, and its JGLF descriptor is extracted using the method shown in Fig. 1. State prediction follows $X_k = X_{k-1} + v_{k-1}$, where $v_{k-1}$ is random Gaussian noise. It is assumed that the importance distribution at moment k depends only on the state value $x_{k-1}$ at moment k−1 and the measured value $y_k$ at moment k:

$$q(x_k \mid x_{0:k-1}, y_{1:k}) = q(x_k \mid x_{k-1}, y_k),$$

so the weights are updated as

$$w_k^{(i)} \propto w_{k-1}^{(i)} \cdot \frac{p(y_k \mid x_k^{(i)})\, p(x_k^{(i)} \mid x_{k-1}^{(i)})}{q(x_k^{(i)} \mid x_{k-1}^{(i)}, y_k)}.$$
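As an illustration, the following is a minimal numpy sketch of one predict-reweight-normalize cycle under the random-walk state model above, with the transition prior used as the importance distribution, so the weight update reduces to multiplying by the likelihood. The names pf_step, likelihood, sigma, and rng are ours; likelihood is a placeholder for the JGLF-based weight defined below.

```python
import numpy as np

def pf_step(particles, weights, likelihood, sigma, rng):
    """One PF iteration: predict, reweight, normalize.

    particles: (N, d) array of states; weights: (N,) array summing to 1;
    likelihood: callable mapping one state to p(y_k | x_k).
    """
    # State prediction: X_k = X_{k-1} + v_{k-1}, with Gaussian noise v
    particles = particles + rng.normal(0.0, sigma, particles.shape)
    # With q(x_k | x_{k-1}, y_k) = p(x_k | x_{k-1}), w_k is proportional
    # to w_{k-1} * p(y_k | x_k)
    weights = weights * np.array([likelihood(p) for p in particles])
    weights = weights / weights.sum()
    return particles, weights
```

With likelihood bound to the JGLF-based weight described next, this cycle followed by the resampling step sketched later constitutes one tracking frame (rng would be, e.g., np.random.default_rng()).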

Fig. 2. Flow chart of the proposed object tracking method.


The Bhattacharyya distance between the JGLF of the particle swarm and that of the object is introduced to measure the particle weight in the proposed algorithm. The larger the weight, the higher the similarity between the particle and the object. In the PF algorithm, the weight represents the importance of each particle, and here it is characterized by the Bhattacharyya distance. Assume two n×m-dimensional vectors $h(n, m)$ and $g(n, m)$; their Bhattacharyya coefficient is

$$b_k(h, g) = \sum_{i=1}^{n} \sum_{j=1}^{m} \sqrt{h(i, j) \times g(i, j)}.$$

$b_k$ is called the Bhattacharyya coefficient, and its value range is [0, 1]. The Bhattacharyya distance is calculated as

$$d(k) = \sqrt{1 - b_k(h, g)}.$$
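A direct numpy sketch of both quantities for histograms normalized to sum to 1 (the function names are ours):

```python
import numpy as np

def bhattacharyya(h: np.ndarray, g: np.ndarray) -> float:
    """Bhattacharyya coefficient b_k = sum_ij sqrt(h_ij * g_ij), in [0, 1]
    for histograms normalized to sum to 1."""
    return float(np.sqrt(h * g).sum())

def bhattacharyya_distance(h: np.ndarray, g: np.ndarray) -> float:
    """Bhattacharyya distance d = sqrt(1 - b_k(h, g))."""
    return float(np.sqrt(1.0 - bhattacharyya(h, g)))
```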

If the calculated d(k) value is smaller, the two histograms are more similar, and therefore the similarity between the two point clouds is higher.

Thus, the particle weight should be increased. Therefore, the weight is calculated as

$$w_k^{(i)} = \big[1 - d(k)^2\big]^2.$$
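Using bhattacharyya_distance from the sketch above, the weight rule is one line; note that $1 - d(k)^2 = b_k$, so the weight is simply the squared Bhattacharyya coefficient (jglf_weight is our naming):

```python
def jglf_weight(particle_jglf, object_jglf):
    """Unnormalized particle weight w = [1 - d^2]^2 = b_k^2:
    the more similar the JGLF histograms, the larger the weight."""
    d = bhattacharyya_distance(particle_jglf, object_jglf)
    return (1.0 - d * d) ** 2
```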

Then, the weight of each particle is normalized as

$$w_k^{(i)} = \frac{w_k^{(i)}}{\sum_{i=1}^{N} w_k^{(i)}}.$$

$N_{\mathrm{eff}}$, the effective particle number, is estimated as

$$N_{\mathrm{eff}} = \frac{1}{\sum_{i=1}^{N_s} \big(w_k^{(i)}\big)^2},$$

and if $N_{\mathrm{eff}} < N_T$, where $N_T$ is the particle number threshold, resampling is performed.

A new group of particles is generated in the update area. First, the cumulative probability is calculated and normalized:

$$c_k^{(0)} = 0, \qquad c_k^{(i)} = c_k^{(i-1)} + w_k^{(i)}, \qquad c_k^{(i)} = \frac{c_k^{(i)}}{\sum_{i=1}^{N} c_k^{(i)}}.$$

Then, a set of random numbers obeying the uniform distribution $u \sim U(0, 1)$ is generated. For each u, the smallest index satisfying $c_k^{(i)} \ge u$ is found and denoted $u_{\min}$, and the particle state value is set as $x_{k-1}^{(i)} = x_{k-1}^{(u_{\min})}$. After completing the above steps, the mean object state can be expressed as

$$E(x_k) = \sum_{i=1}^{N_s} w_k^{(i)} x_k^{(i)}.$$
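The resampling and state-estimation steps can be sketched as follows (numpy; multinomial resampling via the cumulative weights, with np.searchsorted finding the smallest index whose cumulative probability covers each uniform draw). The names resample_if_needed and state_mean are ours:

```python
import numpy as np

def resample_if_needed(particles, weights, n_threshold, rng):
    """Resample when the effective particle number falls below the threshold."""
    n_eff = 1.0 / np.sum(weights ** 2)           # N_eff = 1 / sum_i (w_i)^2
    if n_eff < n_threshold:
        c = np.cumsum(weights)                   # cumulative probability c_k
        u = rng.uniform(0.0, 1.0, size=len(weights))
        idx = np.searchsorted(c, u)              # smallest i with c[i] >= u
        idx = np.minimum(idx, len(weights) - 1)  # guard against rounding at c[-1]
        particles = particles[idx]
        weights = np.full(len(weights), 1.0 / len(weights))
    return particles, weights

def state_mean(particles, weights):
    """Object state estimate E(x_k) = sum_i w_i * x_i."""
    return np.average(particles, weights=weights, axis=0)
```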

The end of object tracking is determined by the length of the data sequence. To verify the performance of the proposed algorithm, experiments are implemented with Visual Studio 2017 (VS2017) and the Point Cloud Library (PCL)[22] on a computer with a 3.5 GHz CPU and 8 GB of memory. The experiments test both the object recognition ability of the JGLF descriptor and the proposed tracking algorithm. Since no data obtained from an aircraft platform tracking an object are openly available, the data used in this Letter are simulated with the software Blender[23]; to reproduce the data, one can refer to the operating instructions of the software.

To evaluate the proposed algorithm, the single-frame intersection ratio R, the tracking accuracy, and the running time are calculated to evaluate, respectively, the accuracy, stability, and real-time performance of the algorithm. The single-frame intersection ratio is the ratio of the number of points in the overlap between the particle bounding box and the actual object point cloud to the number $N_g$ of points of the actual object in that frame. The calculation equation[24] is

$$R = \frac{\sum_{(x_i, y_i, z_i) \in A \cap G} p_i}{N_g} \times 100\%.$$

Here, $(x_i, y_i, z_i)$ is the spatial position of a point $p_i$ lying in the overlap of the calculated particle set A and the corresponding ground-truth set G. The larger the ratio, the better the tracking in that frame. The threshold of R is set to 50%: when the calculated ratio of a frame is larger than 50%, the frame is considered successfully tracked.
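A sketch of this metric, approximating the overlap A∩G by the ground-truth points that fall inside the particle bounding box (an assumption on our part; any overlap test with the same semantics would do), together with the overall accuracy S defined next:

```python
import numpy as np

def intersection_ratio(box_min, box_max, gt_points):
    """Single-frame intersection ratio R (%): the fraction of the N_g
    ground-truth points lying inside the particle bounding box."""
    inside = np.all((gt_points >= box_min) & (gt_points <= box_max), axis=1)
    return 100.0 * inside.sum() / len(gt_points)

def tracking_accuracy(ratios, threshold=50.0):
    """Overall accuracy S (%): fraction of frames with R above the threshold."""
    return 100.0 * np.mean(np.asarray(ratios) > threshold)
```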

The tracking accuracy S over the whole tracking process is the ratio of the number of frames satisfying the threshold condition to the total number of frames, and the average running time is calculated to evaluate the real-time performance of the entire process. In different simulation scenarios, feature template libraries are trained in advance to compare, by feature matching, the object recognition capabilities of the JGLF, VFH[18], clustered VFH (CVFH)[17], global radius-based surface descriptor (GRSD)[25], ensemble of shape functions (ESF)[26], and FPFH[27] descriptors. The average object recognition rate is used as the evaluation standard, and the result is shown in Fig. 3.

Fig. 3. Comparison of the object recognition rate at different distances between LIDAR and the object.


As Fig. 3 shows, the object recognition capability of every descriptor decreases as the LIDAR-object distance grows. The ESF descriptor performs best, followed by the JGLF descriptor. Comparing only FPFH and VFH: for LIDAR-object distances from 0.5 km to 1.5 km, the average object recognition rate of FPFH is higher than that of VFH, while beyond 2 km VFH performs better, which shows that a single local or a single global feature is still insufficient to describe the object over a continuously changing LIDAR-object distance. CVFH performs additional segmentation and clustering in advance, so its performance is relatively better, but the descriptor itself is rather unstable, which can explain the increase in the 2.25 to 2.5 km range.

From Table 1, the average object recognition rate of JGLF is 15.5% lower than that of ESF, making it the second most accurate descriptor, but it runs 33 ms faster than ESF and is more stable (its standard deviation is smaller).

Table 1. Object Recognition Ability Comparison of Six Descriptors

| Descriptor | Recognition Rate, Mean (%) | Recognition Rate, Std. Dev. (%) | Average Running Time (ms) |
|---|---|---|---|
| FPFH | 62.37 | 13.17 | 5 |
| VFH | 64.34 | 11.25 | 3.6 |
| CVFH | 68.9 | 17.79 | 4.5 |
| GRSD | 39.85 | 16.28 | 31 |
| ESF | 87.59 | 16.71 | 39 |
| JGLF | 72.09 | 10.81 | 6 |


The tracking result of the proposed method is shown in Fig. 4.

Fig. 4. Results of the PF point cloud tracking algorithm based on JGLF in frame n: (a) n=1, (b) n=61, (c) n=121, (d) n=181, and (e) n=241. (f) Comparison between particles in frame 186.


In Fig. 4, each tracking result is drawn as a bounding box. As the object moves and the LIDAR-object distance becomes shorter, the tracking bounding box follows the object well. To observe the difference between the particles generated by the proposed algorithm and by the basic PF algorithm more clearly, Fig. 4(f) shows, for frame 186, the actual point cloud data in black, the result of the basic PF algorithm in blue, and the result of the JGLF-based PF tracking algorithm proposed in this Letter in red. The spatial distribution of the red particles is closer to the actual object point cloud than that of the blue ones.

The red line in Fig. 5 marks the tracking accuracy threshold, set at 50%. The blue line is the per-frame intersection ratio of the proposed algorithm, and the green line is that of the basic PF algorithm. At the beginning, the LIDAR-object distance is long, so the results are relatively poor because of the small amount of point cloud data; as the LIDAR and the object move closer, performance improves. However, the basic algorithm still performs badly in some frames, mainly because it cannot sufficiently describe the difference between the object and the particles. Over the whole process, the tracking accuracy of the basic PF algorithm is 84.87%, while that of the proposed PF algorithm is 98.82%, an improvement of 13.95%, which shows that JGLF makes better use of the point cloud information.

Fig. 5. Comparison of the object tracking effect between the two algorithms.


Table 2 shows the tracking results of five algorithms: the basic PF algorithm and PF algorithms based on FPFH, VFH, CVFH, and the JGLF of this Letter. The PF algorithms based on FPFH, VFH, CVFH, and JGLF all outperform the basic one in tracking accuracy and stability. Because the proposed algorithm combines global and local features, it achieves the highest tracking accuracy and the highest average single-frame intersection ratio. Compared with the four other algorithms, the tracking accuracy increases by 13.95%, 9.78%, 8.06%, and 7.57%, respectively, and the mean single-frame R increases by 15.2%, 7.34%, 5.8%, and 7.68%, respectively. In per-frame stability, the proposed algorithm is second only to the CVFH-based one. Despite the added computational complexity, the running time of the proposed algorithm increases by only 0.52 ms. Since the LIDAR acquires data at 50 ms intervals, the 12.96 ms running time of the proposed algorithm accounts for only about 26% of the data acquisition interval, which demonstrates good real-time performance.

Table 2. Comparison of Tracking Results of Five Algorithms

| Algorithm | Tracking Accuracy (%) | R of Single Frame, Mean (%) | R of Single Frame, Std. Dev. (%) | Average Running Time (ms) | CPU Utilization (%) |
|---|---|---|---|---|---|
| Basic algorithm | 84.87 | 72.81 | 17.75 | 12.44 | 5 |
| Algorithm based on FPFH | 89.04 | 80.67 | 12.47 | 12.71 | 7 |
| Algorithm based on VFH | 90.76 | 82.21 | 14.59 | 12.68 | 7 |
| Algorithm based on CVFH | 91.25 | 80.33 | 9.07 | 12.82 | 8 |
| Proposed algorithm | 98.82 | 88.01 | 11.96 | 12.96 | 8 |


In summary, for object tracking at long distance, the lack of object information is a difficult problem. With the development of LIDAR, the structural information of the 3D point cloud can meet this need. For the situation with only sparse 3D LIDAR point cloud data, because the LIDAR platform is far from the object, the JGLF descriptor is proposed. Compared with existing feature descriptors of the 3D LIDAR point cloud, the proposed descriptor achieves better accuracy, stability, and real-time performance in object recognition. Using PF as the framework, an object tracking algorithm is proposed. The comparison experiments demonstrate the better performance of the proposed algorithm, which raises the tracking accuracy to 98.82%. The results also show that when object tracking must be performed at long distance, the 3D LIDAR point cloud helps improve the accuracy and stability of the tracking results.

References

[1] Chen T., Dai B., Liu D., Song J., in International Congress on Image & Signal Processing (IEEE, 2016), p. 1566.

[2] Braun M., Hofele M., Schanz J., Ruck S., Pohl M., Börret R., Riegel H., Proc. SPIE 10909, 109090V (2019).

[3] Zhou Z. Y., Wang J. J., Zhu Z. F., Yang D. H., Wu J., Optik 158, 639 (2018).

[4] Yu M., Gong L., Oh H., Chen W. H., Chambers J., IEEE Trans. Aerosp. Electron. Syst. 54, 1066 (2017).

[5] Zhao J., Xiao G., Zhang X., Bavirisetti D. P., Chin. Opt. Lett. 17, 031001 (2019).

[6] Sangale S. P., Rahane S. B., in IEEE Inventive Computation Technologies International Conference (2016), p. 1.

[7] Sedai S., Bennamoun M., Huynh D. Q., IEEE Trans. Image Process. 22, 4286 (2013).

[8] Fatima U., Yadav J. P. S., Goel R. K., Int. J. Comput. Appl. 173, 30 (2017).

[9] Chen Z. D., Fan R. W., Ye G. C., Luo T., Guan J. Y., Zhou Z. G., Chen D. Y., Chin. Opt. Lett. 16, 041101 (2018).

[10] Li M., Hu Y., Zhao N., Guo L., IEEE Geosci. Remote Sens. Lett. 16, 962 (2019).

[11] Choi C., Christensen H. I., in International Conference on Intelligent Robots and Systems (IEEE, 2014), p. 1084.

[12] Held D., Levinson J., Thrun S., Savarese S., Int. J. Robot. Res. 35, 30 (2016).

[13] Zhou B. N., Research of Fast Object Tracking Algorithm in 3D Point Cloud Environment, Master Thesis (Wuhan University of Science and Technology, 2018).

[14] Ning J., Zhang L., Zhang D., Wu C., Int. J. Pattern Recogn. Artif. Intell. 23, 1245 (2009).

[15] Li S., Koo S., Lee D., in IEEE/RSJ International Conference on Intelligent Robots and Systems (2015), p. 6079.

[16] Bi S., Broggi M., Beer M., Mech. Syst. Signal Process. 117, 437 (2019).

[17] Aldoma A., Vincze M., Blodow N., Gossow D., Gedikli S., Rusu R. B., Bradski G., in IEEE International Conference on Computer Vision Workshops (2011), p. 6.

[18] Rusu R. B., Bradski G., Thibaux R., Hsu J., in IEEE/RSJ International Conference on Intelligent Robots and Systems (2010), paper 11689992.

[19] Rusu R. B., KI-Künstliche Intell. 24, 345 (2010).

[20] Leizea I., Álvarez H., Borro D., Comput. Vision Image Understand. 133, 51 (2015).

[21] Ahmad A., Lima P., Robot. Auton. Syst. 61, 1084 (2013).

[22] Aldoma A., IEEE Robot. Autom. Mag. 19, 80 (2012).

[23] Blender, .

[24] Ragab K., Int. J. Pattern Recogn. Artif. Intell. 30, 1660004 (2016).

[25] Marton Z. C., Pangercic D., Blodow N., Beetz M., Int. J. Robot. Res. 30, 1378 (2011).

[26] Wohlkinger W., Vincze M., in IEEE International Conference on Robotics and Biomimetics (2012).

[27] Rusu R. B., Blodow N., Beetz M., in IEEE International Conference on Robotics and Automation (2009), p. 3212.

Qishu Qian, Yihua Hu, Nanxiang Zhao, Minle Li, Fucai Shao, Xinyuan Zhang. Object tracking method based on joint global and local feature descriptor of 3D LIDAR point cloud[J]. Chinese Optics Letters, 2020, 18(6): 061001.
