Multi-Task Network for Panoptic Segmentation in Automated Driving

A. Petrovai, S. Nedevschi

Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 26-30 October 2019, pp. 2394-2401.

In this paper, we tackle the newly introduced panoptic segmentation task. Panoptic segmentation unifies semantic and instance segmentation and leverages the capabilities of these complementary tasks by providing pixel- and instance-level classification. Current state-of-the-art approaches employ either separate networks for each task or a single network for both tasks, with post-processing heuristics fusing the outputs into the final panoptic segmentation. Instead, our approach solves all three tasks, including panoptic segmentation, with an end-to-end learnable fully convolutional neural network. We build upon the Mask R-CNN framework with a shared backbone and individual network heads for each task. Our semantic segmentation head uses multi-scale information from the Feature Pyramid Network, while the panoptic head learns to fuse the semantic segmentation logits with a variable number of instance segmentation logits. Moreover, the panoptic head refines the outputs of the network, improving the semantic segmentation results. Experimental results on the challenging Cityscapes dataset demonstrate that the proposed solution achieves significant improvements for both panoptic segmentation and semantic segmentation.
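
The fusion the panoptic head learns can be pictured with a heuristic baseline. Below is a minimal sketch, assuming NumPy arrays of semantic logits (C, H, W) and soft instance masks (N, H, W); the paint-by-confidence rule is a common heuristic, not the authors' learned head.

import numpy as np

def fuse_panoptic(sem_logits, inst_masks, inst_classes, inst_scores, mask_thresh=0.5):
    """sem_logits: (C, H, W); inst_masks: (N, H, W) soft masks in [0, 1];
    inst_classes: (N,) thing-class ids; inst_scores: (N,) detection confidences."""
    h, w = sem_logits.shape[1:]
    pan_class = sem_logits.argmax(axis=0)          # start from the semantic prediction
    pan_inst = np.zeros((h, w), dtype=np.int32)    # 0 = stuff / no instance
    occupied = np.zeros((h, w), dtype=bool)
    for i in np.argsort(-inst_scores):             # paint high-confidence instances first
        mask = (inst_masks[i] > mask_thresh) & ~occupied
        pan_class[mask] = inst_classes[i]
        pan_inst[mask] = i + 1
        occupied |= mask
    return pan_class, pan_inst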

pdf

Curb Detection in Urban Traffic Scenarios Using LiDARs Point Cloud and Semantically Segmented Color Images

S.E.C. Deac, I. Giosan, S. Nedevschi

Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 26-30 October 2019, pp. 3433-3440.

In this paper we propose a robust curb detection method based on the fusion between semantically labeled camera images and a 3D point cloud coming from LiDAR sensors. The labels from the semantically enhanced cloud are used to reduce the curb search area. Several spatial cues are then computed on each candidate curb region. Based on these features, a candidate curb region is either rejected or refined to obtain a precise positioning of the curb points found inside it. A novel local model-based outlier removal algorithm is proposed to filter out erroneous curb points. Finally, a temporal integration of the detected curb points over multiple consecutive frames is used to densify the detection result. An objective evaluation of the proposed solution is done using a high-resolution digital map containing ground-truth curb points. The proposed system has proved capable of detecting curbs of any height (from 3 cm up to 30 cm) in complex urban road scenarios (straight roads, curved roads, intersections with traffic isles and roundabouts).
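
As a rough illustration of the search-area reduction step, the sketch below projects LiDAR points into the segmented image and keeps only points falling on labels that can border a curb. The label ids, camera model and threshold are illustrative assumptions, not the paper's values.

import numpy as np

CURB_ADJACENT_LABELS = (0, 1)   # hypothetical ids for 'road' and 'sidewalk'

def curb_candidates(points, K, labels_img):
    """points: (N, 3) LiDAR points in the camera frame; K: (3, 3) intrinsics;
    labels_img: (H, W) per-pixel semantic label ids."""
    pts = points[points[:, 2] > 0.1]               # keep points in front of the camera
    uvw = (K @ pts.T).T
    uv = (uvw[:, :2] / uvw[:, 2:3]).astype(int)    # pinhole projection to pixels
    h, w = labels_img.shape
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    pts, uv = pts[inside], uv[inside]
    labels = labels_img[uv[:, 1], uv[:, 0]]        # label under each projected point
    return pts[np.isin(labels, CURB_ADJACENT_LABELS)]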

pdf

Efficient instance and semantic segmentation for automated driving

A. Petrovai, S. Nedevschi

Proceedings of the 2019 IEEE Intelligent Vehicles Symposium (IV 2019), Paris, France, 9-12 June 2019, pp. 2575-2581.

Environment perception for automated vehicles is achieved by fusing the outputs of different sensors such as cameras, LiDARs and radars. Images provide a semantic understanding of the environment at object level using instance segmentation, but also at background level using semantic segmentation. We propose a fully convolutional residual network based on Mask R-CNN to achieve both semantic and instance-level recognition. We aim to develop an efficient network that can run in real time for automated driving applications without compromising accuracy. Moreover, we compare and experiment with two different backbone architectures: a classification-type network and a faster segmentation-type network based on dilated convolutions. Experiments demonstrate top results on the publicly available Cityscapes dataset.
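
The appeal of a dilated-convolution backbone for segmentation is that the receptive field grows without downsampling. The snippet below illustrates this with the standard receptive-field recurrence; the layer stack is a made-up example, not either of the compared backbones.

def effective_kernel(k, d):
    """Effective kernel size of a k x k convolution with dilation d."""
    return k + (k - 1) * (d - 1)

def receptive_field(layers):
    """layers: list of (kernel, stride, dilation) tuples, input to output."""
    rf, jump = 1, 1
    for k, s, d in layers:
        rf += (effective_kernel(k, d) - 1) * jump
        jump *= s
    return rf

# Four 3x3 layers: plain convolutions vs. dilations 1, 2, 4, 8 (no striding):
print(receptive_field([(3, 1, 1)] * 4))                    # 9
print(receptive_field([(3, 1, d) for d in (1, 2, 4, 8)]))  # 31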

pdf

Environment Perception Architecture using Images and 3D Data

H. Florea, R. Varga, S. Nedevschi

Proceedings of the 2018 14th IEEE International Conference on Intelligent Computer Communication and Processing (ICCP), Cluj-Napoca, Romania, September 7-9, 2018, pp. 223-228.

This paper discusses the architecture of an environment perception system for autonomous vehicles. The modules of the system are described briefly, and we focus on important changes in the architecture that enable: decoupling of data acquisition from data processing; synchronous data processing; parallel computation on the GPU and multiple CPU cores; efficient data passing using pointers; an adaptive architecture capable of working with a varying number of sensors. The experimental results compare execution times before and after the proposed optimizations. We achieve a 10 Hz frame rate for an object detection system working with 4 cameras and 4 LiDAR point clouds.
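
The decoupling of data acquisition from data processing can be pictured as a bounded producer/consumer queue. The sketch below is a minimal Python illustration of that pattern under assumed names and rates, not the system's actual multi-sensor implementation.

import queue
import threading
import time

buf = queue.Queue(maxsize=4)  # bounded buffer between acquisition and processing

def acquire(sensor_id):
    """Sensor thread: never blocks on slow processing; drops the oldest frame instead."""
    for frame_no in range(1000):
        measurement = (sensor_id, frame_no, time.time())
        try:
            buf.put_nowait(measurement)
        except queue.Full:
            try:
                buf.get_nowait()      # discard the oldest measurement
            except queue.Empty:
                pass
            buf.put_nowait(measurement)
        time.sleep(0.01)              # ~100 Hz acquisition

def process():
    """Processing loop: consumes measurements synchronously as they arrive."""
    while True:
        sensor_id, frame_no, stamp = buf.get()     # blocks until data is available
        # run detection / fusion on the measurement here

threading.Thread(target=acquire, args=(0,), daemon=True).start()
threading.Thread(target=process, daemon=True).start()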

pdf

A Fast RANSAC Based Approach for Computing the Orientation of Obstacles in Traffic Scenes

F. Oniga, S. Nedevschi

Proceedings of the 2018 14th IEEE International Conference on Intelligent Computer Communication and Processing (ICCP), Cluj-Napoca, Romania, September 7-9, 2018, pp. 209-214.

A low-complexity approach for computing the orientation of 3D obstacles detected from LiDAR data is proposed in this paper. The proposed method takes as input obstacles represented as cuboids without orientation (aligned with the reference frame). Each cuboid contains a cluster of obstacle locations (discrete grid cells). First, for each obstacle, the boundaries that are visible to the perception system are selected. A model consisting of two perpendicular lines is fitted to the set of boundary cells, one line for each presumed visible side. The dominant line is computed with a RANSAC approach. Then, the second line is searched for under a perpendicularity constraint with respect to the dominant line. The existence of the second line is used to validate the orientation. Finally, additional criteria are proposed to select the best orientation based on the free area of the cuboid (in top view) that is visible to the perception system.
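
A compact sketch of the two-perpendicular-lines idea follows: RANSAC fits the dominant boundary line, then a perpendicular second side is scored on the remaining cells. Iteration counts, tolerances and the support test are illustrative placeholders, not the paper's tuned procedure.

import numpy as np

def ransac_dominant_line(cells, iters=100, tol=0.1, seed=0):
    """cells: (N, 2) top-view boundary cells; returns (unit direction, inlier mask)."""
    rng = np.random.default_rng(seed)
    best_inl, best_dir = None, None
    for _ in range(iters):
        i, j = rng.choice(len(cells), size=2, replace=False)
        d = cells[j] - cells[i]
        if np.linalg.norm(d) < 1e-6:
            continue
        d = d / np.linalg.norm(d)
        normal = np.array([-d[1], d[0]])
        inl = np.abs((cells - cells[i]) @ normal) < tol   # point-to-line distance test
        if best_inl is None or inl.sum() > best_inl.sum():
            best_inl, best_dir = inl, d
    return best_dir, best_inl

def obstacle_orientation(cells, tol=0.1, min_support=5):
    d, inl = ransac_dominant_line(cells)
    rest = cells[~inl]
    # cells on a perpendicular second side share (roughly) one coordinate along d
    proj = rest @ d
    support = 0 if len(rest) == 0 else int((np.abs(proj - np.median(proj)) < tol).sum())
    valid = support >= min_support    # a supported second side validates the orientation
    return np.arctan2(d[1], d[0]), valid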

pdf

Real-Time Stereo Reconstruction Failure Detection and Correction Using Deep Learning

V.C. Miclea, S. Nedevschi, L. Miclea

Proceedings of the 2018 IEEE Intelligent Transportation Systems Conference (ITSC), Maui, Hawaii, USA, November 4-7, 2018, pp. 1095-1102.

This paper introduces a stereo reconstruction method that, besides producing accurate results in real time, is capable of detecting and concealing possible failures caused by one of the cameras. A classification of stereo camera sensor faults is first introduced, highlighting the most common types of defects. We then present a stereo camera failure detection method in which various additional checks are introduced with respect to the aforementioned error classification. Furthermore, we propose a novel error correction method based on CNNs (convolutional neural networks) that is capable of generating reliable disparity maps by using prior information provided by semantic segmentation in conjunction with the last available disparity. We highlight the efficiency of our approach by evaluating its performance in various driving scenarios and show that it produces accurate disparities on images from the KITTI stereo and raw datasets while running in real time on a regular GPU.
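
In the spirit of the fault classification mentioned above, a detector might start from simple per-frame sanity checks, as in the sketch below; the specific checks and thresholds are assumptions for illustration, not the paper's method.

import numpy as np

def camera_fault(img, prev_img=None, dark=5.0, bright=250.0, flat=1.0):
    """img: (H, W) grayscale frame as a float array in [0, 255]."""
    if img.mean() < dark:
        return "underexposed_or_dead"              # near-black frame
    if img.mean() > bright:
        return "overexposed"                       # saturated frame
    if img.std() < flat:
        return "flat_signal"                       # blocked lens or stuck sensor
    if prev_img is not None and np.array_equal(img, prev_img):
        return "frozen_frame"                      # transmission stalled
    return None                                    # frame looks healthy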

pdf

OREOS: Oriented Recognition of 3D Point Clouds in Outdoor Scenarios

Lukas Schaupp, Mathias Buerki, Renaud Dube, Roland Siegwart, and Cesar Cadena

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2019

We introduce a novel method for oriented place recognition with 3D LiDAR scans. A convolutional neural network is trained to extract compact descriptors from single 3D LiDAR scans. These can be used both to retrieve nearby place candidates from a map and to estimate the yaw discrepancy needed for bootstrapping local registration methods. We employ a triplet loss function for training and use a hard negative mining strategy to further increase the performance of our descriptor extractor. In an evaluation on the NCLT and KITTI datasets, we demonstrate that our method outperforms related state-of-the-art approaches based on both data-driven and handcrafted data representations in challenging long-term outdoor conditions.
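
The training objective can be summarized with a small sketch of a triplet loss with in-batch hard negative mining; the margin value and batch layout are assumptions, not the paper's settings.

import numpy as np

def triplet_loss_hard_mining(anchors, positives, negatives, margin=0.5):
    """anchors, positives: (B, D) descriptors; negatives: (B, K, D) candidates."""
    d_pos = np.linalg.norm(anchors - positives, axis=1)               # (B,)
    d_neg = np.linalg.norm(anchors[:, None, :] - negatives, axis=2)   # (B, K)
    d_hard = d_neg.min(axis=1)     # hardest negative = closest one in descriptor space
    return np.maximum(0.0, d_pos - d_hard + margin).mean()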

pdf   video

@inproceedings{SchauppIROS2019,
Title = {OREOS: Oriented Recognition of 3D Point Clouds in Outdoor Scenarios},
Author = {L. Schaupp and M. Buerki and R. Dube and R. Siegwart and C. Cadena},
Fullauthor = {Lukas Schaupp and Mathias Buerki and Renaud Dube and Roland Siegwart and Cesar Cadena},
Booktitle = {{IEEE/RSJ} International Conference on Intelligent Robots and Systems ({IROS})},
Month = {November},
Year = {2019},
}

VIZARD: Reliable Visual Localization for Autonomous Vehicles in Urban Outdoor Environments

Mathias Buerki, Lukas Schaupp, Marcin Dymczyk, Renaud Dube, Cesar Cadena, Roland Siegwart, and Juan Nieto

IEEE Intelligent Vehicles Symposium (IV) 2019

Changes in appearance are one of the main sources of failure for visual localization systems in outdoor environments. To address this challenge, we present VIZARD, a visual localization system for urban outdoor environments. By combining a local localization algorithm with the use of multi-session maps, a high localization recall can be achieved across vastly different appearance conditions. The fusion of the visual localization constraints with wheel odometry in a state estimation framework further guarantees smooth and accurate pose estimates. In an extensive experimental evaluation covering several hundred kilometers of driving in challenging urban outdoor environments, we analyze the recall and accuracy of our localization system, investigate its key parameters and boundary conditions, and compare different types of feature descriptors. Our results show that VIZARD is able to achieve nearly 100% recall with a localization accuracy below 0.5 m under varying outdoor appearance conditions, including at night-time.
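
The odometry/localization fusion idea can be caricatured with a constant-gain correction step, as in the minimal sketch below; a real system would use a full state estimator, and the gain and angle handling here are placeholders.

import numpy as np

def fuse_step(pose, odom_delta, vis_fix=None, gain=0.3):
    """pose, odom_delta, vis_fix: (3,) arrays [x, y, yaw]."""
    pose = pose + odom_delta                       # smooth but drifting prediction
    if vis_fix is not None:                        # drift-free but intermittent update
        innov = vis_fix - pose
        innov[2] = np.arctan2(np.sin(innov[2]), np.cos(innov[2]))  # wrap yaw error
        pose = pose + gain * innov
    return pose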

pdf   video

@inproceedings{BuerkiIV2019,
Title = {VIZARD: Reliable Visual Localization for Autonomous Vehicles in Urban Outdoor Environments},
Author = {M. Buerki and L. Schaupp and M. Dymczyk and R. Dube and C. Cadena and R. Siegwart and J. Nieto},
Fullauthor = {Mathias Buerki and Lukas Schaupp and Marcin Dymczyk and Renaud Dube and Cesar Cadena and Roland Siegwart and Juan Nieto},
Booktitle = {{IEEE} Intelligent Vehicles Symposium ({IV})},
Month = {June},
Year = {2019},
}

Object Classification Based on Unsupervised Learned Multi-Modal Features for Overcoming Sensor Failures

Julia Nitsch, Juan Nieto, Roland Siegwart, Max Schmidt, and Cesar Cadena

IEEE International Conference on Robotics and Automation (ICRA) 2019

For autonomous driving applications it is critical to know which types of road users and roadside infrastructure are present in order to plan driving manoeuvres accordingly. Therefore, autonomous cars are equipped with different sensor modalities to robustly perceive their environment. However, for classification modules based on machine learning techniques it is challenging to overcome unseen sensor noise. This work presents an object classification module operating on unsupervised learned multi-modal features with the ability to overcome gradual or total sensor failure. A two-stage approach is presented, composed of unsupervised feature training followed by the training of uni-modal and multi-modal classifiers. We propose a simple but effective decision module that switches between uni-modal and multi-modal classifiers based on the closeness in the feature space to the training data. Evaluations on the ModelNet40 dataset show that the proposed approach yields a 14% accuracy gain compared to a late-fusion approach when operating on noisy point cloud data and a 6% accuracy gain when operating on noisy image data.
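
The decision module's switching rule might look like the following sketch, which trusts a modality only when its features lie close to the training distribution; the distance measure, the use of a single training mean, and the threshold are illustrative assumptions.

import numpy as np

def choose_classifier(img_feat, pcl_feat, img_train_mean, pcl_train_mean, thresh):
    """Pick a classifier based on each modality's closeness to the training features."""
    img_ok = np.linalg.norm(img_feat - img_train_mean) < thresh
    pcl_ok = np.linalg.norm(pcl_feat - pcl_train_mean) < thresh
    if img_ok and pcl_ok:
        return "multi_modal"        # both modalities look in-distribution
    if img_ok:
        return "image_only"         # point cloud features look corrupted
    if pcl_ok:
        return "pointcloud_only"    # image features look corrupted
    return "multi_modal"            # no clearly reliable modality: use everything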

pdf

@inproceedings{NitschICRA2019,
Title = {Object Classification Based on Unsupervised Learned Multi-Modal Features for Overcoming Sensor Failures},
Author = {J. Nitsch and J. Nieto and R. Siegwart and M. Schmidt and C. Cadena},
Fullauthor = {Julia Nitsch and Juan Nieto and Roland Siegwart and Max Schmidt and Cesar Cadena},
Booktitle = {{IEEE} International Conference on Robotics and Automation ({ICRA})},
Month = {May},
Year = {2019},
}

Empty Cities: Image Inpainting for a Dynamic-Object-Invariant Space

Berta Bescos, Jose Neira, Roland Siegwart, and Cesar Cadena

IEEE International Conference on Robotics and Automation (ICRA) 2019

In this paper we present an end-to-end deep learning framework to turn images that show dynamic content, such as vehicles or pedestrians, into realistic static frames. This objective encounters two main challenges: detecting all the dynamic objects, and inpainting the static occluded background with plausible imagery. The former challenge is addressed by the use of a convolutional network that learns a multiclass semantic segmentation of the image. The second problem is approached with a conditional generative adversarial model that, taking as input the original dynamic image and its dynamic/static binary mask, is capable of generating the final static image. These generated images can be used for applications such as augmented reality or vision-based robot localization purposes. To validate our approach, we show both qualitative and quantitative comparisons against other state-of-the-art inpainting methods by removing the dynamic objects and hallucinating the static structure behind them. Furthermore, to demonstrate the potential of our results, we carry out pilot experiments that show the benefits of our proposal for visual place recognition.
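
One plausible way to wire the conditional generator's input and a mask-weighted reconstruction loss is sketched below; the channel layout and loss weights are assumptions for illustration, not the paper's exact training setup.

import numpy as np

def generator_input(rgb, dyn_mask):
    """rgb: (H, W, 3) in [0, 1]; dyn_mask: (H, W) binary, 1 = dynamic object."""
    return np.concatenate([rgb, dyn_mask[..., None]], axis=2)   # (H, W, 4) input

def masked_l1(pred, target, dyn_mask, w_inside=10.0, w_outside=1.0):
    """Emphasise reconstruction error inside the inpainted (dynamic) region."""
    err = np.abs(pred - target).mean(axis=2)
    weight = np.where(dyn_mask > 0, w_inside, w_outside)
    return (weight * err).mean()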

pdf   website   code   video

@inproceedings{BescosICRA2019,
Title = {Empty Cities: Image Inpainting for a Dynamic-Object-Invariant Space},
Author = {B. Bescos and J. Neira and R. Siegwart and C. Cadena},
Fullauthor = {Berta Bescos and Jose Neira and Roland Siegwart and Cesar Cadena},
Booktitle = {{IEEE} International Conference on Robotics and Automation ({ICRA})},
Month = {May},
Year = {2019},
}