Stabilization and Validation of 3D Object Position Using Multimodal Sensor Fusion and Semantic Segmentation

M.P. Muresan, I. Giosan, S. Nedevschi

Sensors 2020, 20, 1110; doi:10.3390/s20041110, pp. 1-33.

The stabilization and validation process of the measured position of objects is an important step for high‐level perception functions and for the correct processing of sensory data. The goal of this process is to detect and handle inconsistencies between different sensor measurements, which result from the perception system. The aggregation of the detections from different sensors consists in the combination of the sensorial data in one common reference frame for each identified object, leading to the creation of a super‐sensor. The result of the data aggregation may end up with errors such as false detections, misplaced object cuboids or an incorrect number of objects in the scene. The stabilization and validation process is focused on mitigating these problems. The current paper proposes four contributions for solving the stabilization and validation task, for autonomous vehicles, using the following sensors: trifocal camera, fisheye camera, long‐range RADAR (Radio detection and ranging), and 4‐layer and 16‐layer LIDARs (Light Detection and Ranging). We propose two original data association methods used in the sensor fusion and tracking processes. The first data association algorithm is created for tracking LIDAR objects and combines multiple appearance and motion features in order to exploit the available information for road objects. The second novel data association algorithm is designed for trifocal camera objects and has the objective of finding measurement correspondences to sensor fused objects such that the super‐sensor data are enriched by adding the semantic class information. The implemented trifocal object association solution uses a novel polar association scheme combined with a decision tree to find the best hypothesis–measurement correlations. Another contribution we propose for stabilizing object position and unpredictable behavior of road objects, provided by multiple types of complementary sensors, is the use of a fusion approach based on the Unscented Kalman Filter and a single‐layer perceptron. The last novel contribution is related to the validation of the 3D object position, which is solved using a fuzzy logic technique combined with a semantic segmentation image. The proposed algorithms have a real‐time performance, achieving a cumulative running time of 90 ms, and have been evaluated using ground truth data extracted from a high‐precision GPS (global positioning system) with 2 cm accuracy, obtaining an average error of 0.8 m.