Journal of Jilin University (Engineering and Technology Edition) ›› 2025, Vol. 55 ›› Issue (2): 709-721. doi: 10.13229/j.cnki.jdxbgxb.20230543

Three-dimensional object detection algorithm based on multi-scale candidate fusion and optimization

Hua CAI1, Yan-yang ZHENG1, Qiang FU2, Sheng-yu WANG1, Wei-gang WANG3, Zhi-yong MA3

  1. School of Electronic Information Engineering, Changchun University of Science and Technology, Changchun 130022, China
  2. School of Opto-Electronic Engineering, Changchun University of Science and Technology, Changchun 130022, China
  3. No. 2 Department of Urology, The First Hospital of Jilin University, Changchun 130061, China
  • Received: 2023-05-30  Online: 2025-02-01  Published: 2025-04-16

Abstract:

To address missed targets and the large number of background points included during keypoint sampling in point cloud-based object detection, an improved algorithm based on the PV-RCNN network is proposed. A region proposal fusion network and weighted non-maximum suppression (NMS) merge the proposals generated at different scales and eliminate redundant ones. A segmentation network separates foreground points from the original point cloud, and object center points are identified from the proposals. Gaussian density functions estimate regional density and assign different sampling weights accordingly, which eases sampling in sparse areas. Experiments on the KITTI dataset show that the algorithm improves average precision at the moderate difficulty level by 0.39%, 1.31%, and 0.63% for cars, pedestrians, and cyclists, respectively. Generalization experiments were also conducted on the Waymo Open Dataset. The results suggest that the proposed algorithm achieves higher accuracy than most existing 3D object detection networks.
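As an illustration of the weighted NMS idea mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes axis-aligned 2D boxes instead of rotated 3D boxes, strictly positive scores, and a hypothetical `iou_thresh` parameter. Overlapping proposals are fused by score-weighted averaging of their coordinates rather than being discarded.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), axis-aligned (assumption)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def weighted_nms(boxes: List[Box], scores: List[float],
                 iou_thresh: float = 0.5) -> List[Tuple[Box, float]]:
    """Fuse overlapping proposals by score-weighted coordinate averaging.

    Scores are assumed to be strictly positive.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    used = [False] * len(boxes)
    merged = []
    for i in order:
        if used[i]:
            continue
        # Cluster: the current top box plus every unused box that overlaps it.
        cluster = [j for j in order
                   if not used[j]
                   and (j == i or iou(boxes[i], boxes[j]) >= iou_thresh)]
        for j in cluster:
            used[j] = True
        total = sum(scores[j] for j in cluster)
        fused = tuple(sum(scores[j] * boxes[j][k] for j in cluster) / total
                      for k in range(4))
        # The fused box keeps the highest score in its cluster.
        merged.append((fused, scores[i]))
    return merged


proposals = [(0.0, 0.0, 10.0, 10.0), (1.0, 1.0, 11.0, 11.0), (50.0, 50.0, 60.0, 60.0)]
confidences = [0.9, 0.6, 0.8]
result = weighted_nms(proposals, confidences)
```

In this example the first two proposals overlap and are fused into one box, while the third is kept on its own. Unlike standard NMS, the lower-scoring box still contributes to the final coordinates.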

Key words: computer vision, 3D object detection, region proposal fusion, weighted non-maximum suppression, keypoint sampling

CLC Number: TP391

Fig.1 System structure diagram

Fig.2 Flowchart of weighted non-maximum suppression

Fig.3 Foreground point segmentation network

Fig.4 Schematic diagram of keypoint sampling based on center point density
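The density-based sampling idea can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the formulation used in the paper: density is estimated with an isotropic Gaussian kernel of hypothetical bandwidth `sigma`, weights are taken as inverse density, and keypoints are drawn without replacement. The function names are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def gaussian_density(points: Sequence[Point], sigma: float = 1.0) -> List[float]:
    """Kernel density at each point: sum of Gaussian kernels over all points."""
    densities = []
    for p in points:
        total = 0.0
        for q in points:
            d2 = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-d2 / (2.0 * sigma ** 2))
        densities.append(total)  # always >= 1 because of the self term
    return densities


def sampling_weights(densities: Sequence[float]) -> List[float]:
    """Inverse-density weights normalised to sum to 1 (sparse regions weigh more)."""
    inv = [1.0 / d for d in densities]
    s = sum(inv)
    return [w / s for w in inv]


def sample_keypoints(points: Sequence[Point], k: int,
                     sigma: float = 1.0, seed: int = 0) -> List[Point]:
    """Draw up to k distinct keypoints, favouring low-density regions."""
    weights = sampling_weights(gaussian_density(points, sigma))
    rng = random.Random(seed)
    remaining = list(range(len(points)))
    chosen = []
    for _ in range(min(k, len(points))):
        idx = rng.choices(remaining, weights=[weights[i] for i in remaining])[0]
        chosen.append(points[idx])
        remaining.remove(idx)
    return chosen


# Five points packed near the origin and one isolated point far away.
cloud = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
         (0.1, 0.1, 0.0), (0.05, 0.05, 0.0), (20.0, 20.0, 0.0)]
weights = sampling_weights(gaussian_density(cloud))
keypoints = sample_keypoints(cloud, k=3)
```

Here the isolated point receives the largest weight, so it is more likely to be selected than any single point in the dense cluster. This is the effect that density-weighted sampling is meant to achieve in sparse, distant regions.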

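Table 1 Classification criteria for the three difficulty levels in the KITTI dataset detection task

| Level | Easy | Moderate | Hard |
|---|---|---|---|
| Minimum bounding box height | 40 px | 25 px | 25 px |
| Maximum occlusion level | Fully visible | Partly occluded | Difficult to see |
| Maximum truncation | 15% | 30% | 50% |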
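The criteria in Table 1 can be read as a simple rule: an object belongs to the easiest level whose three thresholds it satisfies. The sketch below illustrates this under the assumption that occlusion is encoded as an integer (0 = fully visible, 1 = partly occluded, 2 = difficult to see) and truncation as a fraction in [0, 1]. The function name `kitti_difficulty` is hypothetical.

```python
from typing import Optional

# (level name, min box height in pixels, max occlusion level, max truncation)
LEVELS = [
    ("easy", 40.0, 0, 0.15),
    ("moderate", 25.0, 1, 0.30),
    ("hard", 25.0, 2, 0.50),
]


def kitti_difficulty(height_px: float, occlusion: int,
                     truncation: float) -> Optional[str]:
    """Return the easiest level whose thresholds the object meets, else None."""
    for name, min_height, max_occlusion, max_truncation in LEVELS:
        if (height_px >= min_height
                and occlusion <= max_occlusion
                and truncation <= max_truncation):
            return name
    return None  # object is too small, too occluded, or too truncated
```

For example, a fully visible box 30 px tall fails the 40 px requirement of the easy level and is therefore assigned to the moderate level.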

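Table 2 Comparison of quantitative detection performance of different algorithms on the KITTI test set

3D AP (%): Car at IoU=0.7, Pedestrian and Cyclist at IoU=0.5

| Algorithm | Type | Car Easy | Car Moderate | Car Hard | Pedestrian Easy | Pedestrian Moderate | Pedestrian Hard | Cyclist Easy | Cyclist Moderate | Cyclist Hard |
|---|---|---|---|---|---|---|---|---|---|---|
| VoxelNet [11] | One-stage | 77.47 | 65.11 | 57.73 | 39.48 | 33.69 | 31.50 | 61.22 | 48.36 | 44.37 |
| SECOND [12] | One-stage | 84.65 | 75.96 | 68.71 | 45.31 | 35.52 | 33.14 | 75.83 | 60.82 | 53.67 |
| PointPillars [29] | One-stage | 82.58 | 74.31 | 68.99 | 51.45 | 41.92 | 38.89 | 77.10 | 58.65 | 51.92 |
| Point-GNN [30] | One-stage | 88.33 | 79.47 | 72.29 | 51.92 | 43.77 | 40.14 | 78.60 | 63.48 | 57.08 |
| IA-SSD [27] | One-stage | 88.34 | 80.13 | 75.04 | 46.51 | 39.03 | 35.61 | 78.35 | 61.94 | 55.70 |
| TANet [31] | Two-stage | 84.39 | 75.94 | 68.82 | 53.72 | 44.34 | 40.49 | 75.70 | 59.44 | 52.53 |
| Part-A2 [32] | Two-stage | 87.81 | 78.49 | 73.51 | 53.10 | 43.35 | 40.06 | 79.17 | 63.52 | 56.93 |
| PointRCNN [21] | Two-stage | 86.96 | 75.64 | 70.70 | 47.98 | 39.37 | 36.01 | 74.96 | 58.82 | 52.53 |
| STD [16] | Two-stage | 87.95 | 79.71 | 75.09 | 53.29 | 42.47 | 38.35 | 78.69 | 61.59 | 55.30 |
| PV-RCNN [17] | Two-stage | 90.25 | 81.43 | 76.82 | 52.17 | 43.29 | 40.29 | 78.60 | 63.71 | 57.65 |
| Ours | Two-stage | 90.32 | 81.82 | 77.11 | 53.86 | 44.60 | 40.84 | 79.52 | 64.34 | 58.03 |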

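Table 3 3D mAP of different algorithms under the R11 standard on the KITTI validation set

| Algorithm | Year | mAP3D Easy/% | mAP3D Moderate/% | mAP3D Hard/% |
|---|---|---|---|---|
| SECOND [12] | 2018 | 88.61 | 78.62 | 77.22 |
| PointPillars [29] | 2019 | 86.62 | 76.06 | 68.91 |
| STD [16] | 2019 | 89.70 | 79.80 | 79.30 |
| PointRCNN [21] | 2019 | 88.88 | 78.63 | 77.38 |
| Part-A2 [32] | 2020 | 89.47 | 79.47 | 78.54 |
| PV-RCNN [17] | 2020 | 89.35 | 83.69 | 78.70 |
| VoxelRCNN [33] | 2021 | 89.41 | 84.52 | 78.93 |
| JPV-Net [34] | 2022 | 89.71 | 84.61 | 79.09 |
| Ours | 2025 | 90.54 | 84.72 | 79.81 |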
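The R11 standard refers to average precision interpolated at 11 equally spaced recall positions (0, 0.1, ..., 1.0). A minimal sketch of that computation is given below. It assumes the precision-recall pairs have already been obtained from matched detections, and the function name `ap_r11` is hypothetical.

```python
from typing import Sequence


def ap_r11(recalls: Sequence[float], precisions: Sequence[float]) -> float:
    """11-point interpolated average precision.

    At each recall position r in {0.0, 0.1, ..., 1.0}, take the maximum
    precision among points whose recall is at least r (0 if there are none),
    then average the 11 values.
    """
    total = 0.0
    for i in range(11):
        r = i / 10.0
        candidates = [p for rc, p in zip(recalls, precisions) if rc >= r]
        total += max(candidates) if candidates else 0.0
    return total / 11.0


# Toy precision-recall curve with three operating points.
example_ap = ap_r11([0.2, 0.4, 0.6], [1.0, 0.5, 0.5])
```

In the toy curve, the recall positions 0 to 0.2 take precision 1.0, positions 0.3 to 0.6 take 0.5, and positions above 0.6 take 0, so the result is 5/11.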

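Table 4 Comparison of algorithm running time on the KITTI dataset

| Algorithm | VoxelNet [11] | Point-RCNN [21] | PV-RCNN [17] | STD [16] | EQ-PVRCNN [35] | Ours |
|---|---|---|---|---|---|---|
| Running time/ms | 220 | 100 | 80 | 80 | 200 | 140 |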

Table 5 Comparison of quantitative detection performance on the Waymo Open Dataset validation set

All values are mAP/mAPH in %.

| Algorithm | Vehicle LEVEL 1 | Vehicle LEVEL 2 | Pedestrian LEVEL 1 | Pedestrian LEVEL 2 | Cyclist LEVEL 1 | Cyclist LEVEL 2 |
|---|---|---|---|---|---|---|
| SECOND [12] | 72.3/71.7 | 63.9/63.3 | 68.7/58.2 | 60.7/51.3 | 60.6/59.3 | 58.3/57.0 |
| PointPillars [29] | 72.1/71.5 | 63.6/63.1 | 70.6/56.7 | 62.8/50.3 | 64.4/62.3 | 61.9/59.9 |
| Centerpoint [36] | 76.6/76.0 | 68.9/68.4 | 79.0/73.4 | 71.0/65.8 | 72.1/71.0 | 69.5/68.5 |
| IA-SSD [27] | 70.5/69.7 | 61.6/60.9 | 69.4/58.5 | 60.3/50.7 | 67.7/65.3 | 65.0/62.7 |
| CenterFormer [37] | 75.2/74.7 | 70.2/69.7 | 78.6/73.0 | 73.6/68.3 | 72.3/71.3 | 69.8/68.8 |
| VoxSeT [38] | 74.5/74.0 | 66.0/65.6 | 80.0/72.4 | 72.5/65.4 | 71.6/70.3 | 69.0/67.8 |
| PV-RCNN [17] | 77.5/76.9 | 69.0/68.4 | 75.0/65.6 | 66.0/57.6 | 67.8/66.4 | 65.4/64.0 |
| Ours | 78.5/78.1 | 72.5/71.8 | 80.1/75.3 | 73.8/67.8 | 74.3/73.6 | 70.9/68.7 |

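Table 6 Comparison of vehicle detection results within different distance ranges on the Waymo Open Dataset

Vehicle 3D mAP (IoU=0.7)

| Model | Year | 0~30 m | 30~50 m | 50 m-Inf |
|---|---|---|---|---|
| PV-RCNN [17] | 2020 | 91.92 | 69.21 | 42.17 |
| Voxel-RCNN [33] | 2021 | 92.49 | 74.09 | 53.15 |
| CT3D [39] | 2021 | 92.51 | 75.07 | 55.36 |
| VoxSeT [38] | 2022 | 91.13 | 75.75 | 54.23 |
| Ours | 2025 | 92.10 | 75.81 | 55.42 |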

Fig.5 Visual examples of detection by the algorithm on the KITTI dataset

Fig.6 Visual examples of detection by the algorithm on the Waymo Open Dataset

Fig.7 Comparison of detection results between the proposed algorithm and the baseline algorithm in distant sparse areas on the KITTI dataset

Fig.8 Visual comparison between the proposed algorithm and current state-of-the-art algorithms on the Waymo Open Dataset

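Table 8 Comparative results of ablation experiments

mAP (moderate)/% is reported for Car, Pedestrian, and Cyclist.

| Algorithm | Region proposal fusion | Weighted NMS | Region density sampling | Car | Pedestrian | Cyclist |
|---|---|---|---|---|---|---|
| Baseline network (PV-RCNN) | | | | 81.43 | 43.29 | 63.71 |
| Experimental network 1 | | | | 81.60 | 44.24 | 64.05 |
| Experimental network 2 | | | | 81.76 | 44.35 | 64.09 |
| Proposed algorithm | | | | 81.82 | 44.70 | 64.34 |

References:

1 Qian R, Garg D, Wang Y, et al. End-to-end pseudo-lidar for image-based 3d object detection[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 5881-5890.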
2 Wang Z, Huang Z, Fu J, et al. Object as query: equipping any 2D object detector with 3D detection ability[J]. arXiv Preprint, 2023: No. 2301.02364.
3 Tao Bo, Yan Fu-wu, Yin Zhi-shuai, et al. 3D object detection algorithm based on high-precision map enhancement[J]. Journal of Jilin University (Engineering and Technology Edition), 2023, 53(3): 802-809.
4 Yang Z, Zhou Y, Chen Z, et al. 3D-man: 3D multi-frame attention network for object detection[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 1863-1872.
5 Li Y, Yu A W, Meng T, et al. Deepfusion: lidar-camera deep fusion for multi-modal 3d object detection[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 17182-17191.
6 Cai Hua, Kou Ting-ting, Yang Yi-ning, et al. Three-dimensional vehicle multiple target tracking based on trajectory optimization[J]. Journal of Jilin University (Engineering and Technology Edition), 2024, 54(8): 2338-2347.
7 Zheng A, Zhang Y, Zhang X, et al. Progressive end-to-end object detection in crowded scenes[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 857-866.
8 Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection[C]∥ 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 779-788.
9 Waleed A, Sherif A, Mahmoud Z, et al. Yolo3D: end-to-end real-time 3D oriented object bounding box detection from lidar point cloud[C]∥Computer Vision-ECCV 2018 Workshops, Munich, Germany, 2018: 716-728.
10 Zhou Y, Sun P, Zhang Y, et al. End-to-end multi-view fusion for 3D object detection in lidar point clouds[C]∥Proceedings of the Conference on Robot Learning, Cambridge, USA, 2020: 923-932.
11 Zhou Y, Tuzel O. Voxelnet: end-to-end learning for point cloud based 3d object detection[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 4490-4499.
12 Yan Y, Mao Y, Li B. Second: sparsely embedded convolutional detection[J]. Sensors, 2018, 18(10): No.18103337.
13 Qi C R, Su H, Mo K, et al. Pointnet: deep learning on point sets for 3d classification and segmentation[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 652-660.
14 Qi C R, Yi L, Su H, et al. Pointnet++: deep hierarchical feature learning on point sets in a metric space[C]∥Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, 2017: 5105-5114.
15 Qi C R, Liu W, Wu C, et al. Frustum pointnets for 3d object detection from rgb-d data[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 918-927.
16 Yang Z, Sun Y, Liu S, et al. Std: sparse-to-dense 3d object detector for point cloud[C]∥Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, South Korea, 2019: 1951-1960.
17 Shi S, Guo C, Jiang L, et al. PV-RCNN: point-voxel feature set abstraction for 3d object detection[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 10529-10538.
18 Geiger A, Lenz P, Urtasun R. Are we ready for autonomous driving? the kitti vision benchmark suite[C]∥2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, USA, 2012: 3354-3361.
19 Sun P, Kretzschmar H, Dotiwalla X, et al. Scalability in perception for autonomous driving: Waymo open dataset[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 2446-2454.
20 Ye M, Xu S, Cao T. Hvnet: hybrid voxel network for lidar based 3D object detection[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 1631-1640.
21 Shi S, Wang X, Li H. Pointrcnn: 3D object proposal generation and detection from point cloud[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 770-779.
22 Liu Z, Tang H, Lin Y, et al. Point-voxel CNN for efficient 3D deep learning[C]∥Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, Canada, 2019: 965-975.
23 Tian Feng, Jiang Wen-wen, Liu Fang, et al. Hybrid element and original point cloud 3D target detection method[J]. Journal of Chongqing University of Technology (Natural Science), 2022, 36(11): 108-117.
24 Shi S, Jiang L, Deng J, et al. PV-RCNN++: point-voxel feature set abstraction with local vector representation for 3D object detection[J]. International Journal of Computer Vision, 2023, 131(2): 531-551.
25 Che Yun-long, Yuan Liang, Sun Li-hui. 3D object detection method based on strong semantic key point sampling[J]. Computer Engineering and Applications, 2024, 60(9): 254-260.
26 He C, Zeng H, Huang J, et al. Structure aware single-stage 3D object detection from point cloud[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 11873-11882.
27 Zhang Y, Hu Q, Xu G, et al. Not all points are equal: learning highly efficient point-based detectors for 3D lidar point clouds[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 18953-18962.
28 Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]∥Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 2017: 2980-2988.
29 Lang A H, Vora S, Caesar H, et al. Pointpillars: fast encoders for object detection from point clouds[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 12697-12705.
30 Shi W, Rajkumar R. Point-GNN: graph neural network for 3d object detection in a point cloud[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, USA, 2020: 1711-1719.
31 Liu Z, Zhao X, Huang T, et al. Tanet: robust 3D object detection from point clouds with triple attention[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 11677-11684.
32 Shi S, Wang Z, Shi J, et al. From points to parts: 3D object detection from point cloud with part-aware and part-aggregation network[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 43(8): 2647-2664.
33 Deng J, Shi S, Li P, et al. Voxel R-CNN: towards high performance voxel-based 3d object detection[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(2): 1201-1209.
34 Song N, Jiang T, Yao J. JPV-Net: joint point-voxel representations for accurate 3D object detection[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2022, 36(2): 2271-2279.
35 Yang Z, Jiang L, Sun Y, et al. A unified query-based paradigm for point cloud understanding[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 8541-8551.
36 Yin T, Zhou X, Krahenbuhl P. Center-based 3D object detection and tracking[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 11784-11793.
37 Zhou Z, Zhao X, Wang Y, et al. Centerformer: center-based transformer for 3D object detection[C]∥European Conference on Computer Vision, Tel Aviv, Israel, 2022: 496-513.
38 He C, Li R, Li S, et al. Voxel set transformer: a set-to-set approach to 3D object detection from point clouds[C]∥2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 8417-8427.
39 Sheng H, Cai S, Liu Y, et al. Improving 3D object detection with channel-wise transformer[C]∥Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021: 2743-2752.