3D Group Detection Using Distance-Based Clustering
PRANTA, MD SEHABUB ZAMAN
2024/2025
Abstract
Detecting social groups in 3D space is essential for applications such as human-robot interaction, surveillance, and crowd behavior analysis. Traditional methods based on 2D visual data often struggle with spatial inaccuracies caused by perspective distortion and the lack of depth information, especially in crowded or occluded environments. This motivates a shift toward 3D sensing, particularly with LiDAR, which provides accurate spatial measurements and robust performance under varying lighting and environmental conditions. This thesis presents a real-time 3D group detection framework that relies solely on geometric proximity, without high-level semantic cues such as head orientation, body pose, or activity recognition. The proposed pipeline consists of three main stages: (1) 3D pedestrian detection from LiDAR point clouds using pre-trained deep learning-based detectors, (2) distance-based clustering of the detected pedestrian centroids with the DBSCAN algorithm, and (3) frame-wise group assignment and evaluation. Notably, the grouping stage requires no learning-based training, making the pipeline lightweight and easy to deploy. For evaluation, a centroid-matching strategy is used that compares predicted group centroids with annotated ground-truth group centroids via a thresholded Euclidean distance. This strategy addresses the limitations of traditional pairwise clustering metrics, especially where the semantic definition of a "group" varies across datasets. The system is tested on two public datasets, L-CAS and JRDB. The results demonstrate that the proposed method achieves reasonable performance, offering a practical and efficient solution for real-time 3D group detection.
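
The grouping stage described above can be illustrated with a minimal sketch: pedestrian centroids (e.g. the centers of 3D boxes produced by a pre-trained LiDAR detector) are clustered with scikit-learn's DBSCAN. The parameter values below (eps, min_samples) are illustrative assumptions, not the settings tuned in the thesis.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def group_pedestrians(centroids, eps=1.5, min_samples=2):
    """Assign a group label to each detected pedestrian centroid.

    centroids: (N, 3) array of pedestrian centroids (x, y, z) in metres.
    eps, min_samples: illustrative DBSCAN parameters (assumed values).
    Returns an array of group labels; -1 marks centroids that DBSCAN
    treats as noise, i.e. pedestrians not assigned to any group.
    """
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(centroids)

# Example frame: two close pairs and one pedestrian walking alone.
centroids = np.array([
    [0.0, 0.0, 0.0], [0.8, 0.2, 0.0],   # pair 1
    [5.0, 5.0, 0.0], [5.5, 4.7, 0.0],   # pair 2
    [10.0, 0.0, 0.0],                   # alone -> label -1
])
print(group_pedestrians(centroids))     # e.g. [0 0 1 1 -1]
```

Because DBSCAN needs no training and only a distance threshold, this step runs per frame on the detector output, which is what keeps the grouping stage lightweight.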
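The centroid-matching evaluation can likewise be sketched as a per-frame match between predicted and ground-truth group centroids under a Euclidean distance threshold. The greedy nearest-neighbour matching rule and the threshold value below are assumptions for illustration; the abstract does not specify them.

```python
import numpy as np

def match_group_centroids(pred, gt, dist_thresh=1.0):
    """Greedily match predicted group centroids to ground-truth centroids.

    pred, gt: (P, 3) and (G, 3) arrays of group centroids in metres.
    dist_thresh: maximum Euclidean distance for a valid match (assumed value).
    Returns (true_positives, false_positives, false_negatives) for one frame.
    """
    unmatched_gt = list(range(len(gt)))
    tp = 0
    for p in pred:
        if not unmatched_gt:
            break
        # Distance from this prediction to every still-unmatched ground truth.
        dists = np.linalg.norm(gt[unmatched_gt] - p, axis=1)
        best = int(np.argmin(dists))
        if dists[best] <= dist_thresh:
            tp += 1
            unmatched_gt.pop(best)
    fp = len(pred) - tp
    fn = len(unmatched_gt)
    return tp, fp, fn
```

Per-frame counts of this kind can be accumulated over a sequence to report precision and recall, which sidesteps the pairwise clustering metrics the abstract argues against.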
| File | Size | Format | Availability |
|---|---|---|---|
| Thesis_898502.pdf | 5.78 MB | Adobe PDF | not available |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14247/25189