An open visual-inertial mapping framework: maplab

This repository contains maplab, an open, research-oriented visual-inertial mapping framework, written in C++, for creating, processing, and manipulating multi-session maps. On the one hand, maplab can be considered a ready-to-use visual-inertial mapping and localization system. On the other hand, maplab provides the research community with a collection of multi-session mapping tools that include map merging, visual-inertial batch optimization, and loop closure.

Furthermore, it includes an online frontend, ROVIOLI, that can create visual-inertial maps and also track a global drift-free pose within a localization map.

Final Event Press Release

Already today, Lane Assist and Adaptive Cruise Control (ACC) ensure greater safety when driving on motorways and A-roads. In the city, however, automated driving functions are a completely different challenge, and the hurdles are much higher there. To overcome them, Volkswagen has joined forces with IBM, the Technical Universities of Cluj-Napoca and Prague, and ETH Zurich within the ‘UP-Drive’ project funded by the European Commission.

See the full story here.

Appearance-Based Landmark Selection for Visual Localization

Mathias Bürki, Cesar Cadena, Igor Gilitschenski, Roland Siegwart and Juan Nieto

Journal of Field Robotics (JFR) 2019

Visual localization in outdoor environments is subject to varying appearance conditions rendering it difficult to match current camera images against a previously recorded map. Although it is possible to extend the respective maps to allow precise localization across a wide range of differing appearance conditions, these maps quickly grow in size and become impractical to handle on a mobile robotic platform. To address this problem, we present a landmark selection algorithm that exploits appearance co‐observability for efficient visual localization in outdoor environments. Based on the appearance condition inferred from recently observed landmarks, a small fraction of landmarks useful under the current appearance condition is selected and used for localization. This greatly reduces the bandwidth consumption between the mobile platform and a map backend in a shared‐map scenario, and significantly lowers the demands on the computational resources on said mobile platform. We derive a landmark ranking function that exhibits high performance under vastly changing appearance conditions and is agnostic to the distribution of landmarks across the different map sessions. Furthermore, we relate and compare our proposed appearance‐based landmark ranking function to popular ranking schemes from information retrieval, and validate our results on the challenging University of Michigan North Campus long‐term vision and LIDAR data sets (NCLT), including an evaluation of the localization accuracy using ground‐truth poses. In addition, we investigate the computational and bandwidth resource demands. Our results show that by selecting 20–30% of landmarks using our proposed approach, we achieve localization performance similar to the baseline strategy using all landmarks.
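The selection idea above can be illustrated with a toy sketch (the function name, data layout, and scoring heuristic are illustrative, not the paper's exact formulation): landmarks that were frequently co-observed with the recently matched landmarks are ranked higher, and only a small top fraction of the map is used for localization.

```python
import numpy as np

def select_landmarks(coobs, recent_ids, fraction=0.25):
    """Rank landmarks by co-observability with recently observed ones.

    coobs      : (N, N) matrix; coobs[i, j] counts how often landmarks
                 i and j were observed together across map sessions.
    recent_ids : indices of landmarks matched in the last few frames,
                 used as a proxy for the current appearance condition.
    fraction   : share of the map to keep (the paper reports that
                 20-30% suffices for near-baseline performance).
    """
    # Score each landmark by how often it co-occurred with the
    # recently observed ones.
    scores = coobs[:, recent_ids].sum(axis=1)
    k = max(1, int(fraction * coobs.shape[0]))
    # Highest-scoring landmarks first.
    return np.argsort(scores)[::-1][:k]

# Toy map: landmarks 0-2 co-occur (e.g. daytime sessions),
# landmarks 3-5 co-occur (e.g. night sessions).
coobs = np.zeros((6, 6), dtype=int)
coobs[:3, :3] = 5
coobs[3:, 3:] = 5
np.fill_diagonal(coobs, 0)

selected = select_landmarks(coobs, recent_ids=[0, 1], fraction=0.5)
print(sorted(selected.tolist()))  # the "daytime" landmarks rank first
```

With landmarks 0 and 1 recently observed, the ranking keeps the daytime cluster and discards the night-time landmarks, which is the bandwidth saving the abstract describes.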


 title = {Appearance-Based Landmark Selection for Visual Localization},
 author = {M. Buerki and C. Cadena and I. Gilitschenski and R. Siegwart and J. Nieto},
 fullauthor ={Buerki, Mathias and Cadena, Cesar and Gilitschenski, Igor and Siegwart, Roland and Nieto, Juan},
 journal = {{Journal of Field Robotics}},
 year = {2019},
 volume = {36},
 number = {6},
 pages  = {1041--1073},

OREOS: Oriented Recognition of 3D Point Clouds in Outdoor Scenarios

Lukas Schaupp, Mathias Buerki, Renaud Dube, Roland Siegwart, and Cesar Cadena

IEEE/RSJ Int. Conference on Intelligent Robots and Systems (IROS) 2019

We introduce a novel method for oriented place recognition with 3D LiDAR scans. A Convolutional Neural Network is trained to extract compact descriptors from single 3D LiDAR scans. These can be used both to retrieve nearby place candidates from a map, and to estimate the yaw discrepancy needed for bootstrapping local registration methods. We employ a triplet loss function for training and use a hard negative mining strategy to further increase the performance of our descriptor extractor. In an evaluation on the NCLT and KITTI datasets, we demonstrate that our method outperforms related state-of-the-art approaches based on both data-driven and handcrafted data representations in challenging long-term outdoor conditions.
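Triplet loss with hard negative mining is a standard recipe, and a minimal numpy sketch makes the two ingredients named above concrete (toy descriptors and names are illustrative; the paper trains a CNN, not these hand-set vectors):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Triplet margin loss: the positive should be closer to the
    anchor than the negative, by at least the margin."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

def hardest_negative(anchor, negatives):
    """Hard negative mining: the negative closest to the anchor is
    the one that currently violates the margin the most."""
    dists = np.linalg.norm(negatives - anchor, axis=1)
    return negatives[np.argmin(dists)]

anchor    = np.array([0.0, 0.0])    # descriptor of the query scan
positive  = np.array([0.1, 0.0])    # scan from the same place
negatives = np.array([[5.0, 5.0],   # scans from other places
                      [0.3, 0.0]])

neg = hardest_negative(anchor, negatives)  # picks [0.3, 0.0]
loss = triplet_loss(anchor, positive, neg)
print(loss)  # ~0.3: margin still violated, so training keeps pushing
```

Mining the hardest negative rather than a random one keeps the loss informative once easy negatives are already well separated.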

pdf   video

Title = {{OREOS}: Oriented Recognition of 3D Point Clouds in Outdoor Scenarios},
Author = {L. Schaupp and M. Buerki and R. Dube and R. Siegwart and C. Cadena},
Fullauthor = {Lukas Schaupp and Mathias Buerki and Renaud Dube and Roland Siegwart and Cesar Cadena},
Booktitle = {{IEEE/RSJ} Int. Conference on Intelligent Robots and Systems ({IROS})},
Month = {November},
Year = {2019},

VIZARD: Reliable Visual Localization for Autonomous Vehicles in Urban Outdoor Environments

Mathias Buerki, Lukas Schaupp, Marcyn Dymczyk, Renaud Dube, Cesar Cadena, Roland Siegwart, and Juan Nieto

IEEE Intelligent Vehicles Symposium (IV) 2019

Changes in appearance are one of the main sources of failure in visual localization systems in outdoor environments. To address this challenge, we present VIZARD, a visual localization system for urban outdoor environments. By combining a local localization algorithm with the use of multi-session maps, a high localization recall can be achieved across vastly different appearance conditions. The fusion of the visual localization constraints with wheel-odometry in a state estimation framework further guarantees smooth and accurate pose estimates. In an extensive experimental evaluation on several hundreds of driving kilometers in challenging urban outdoor environments, we analyze the recall and accuracy of our localization system, investigate its key parameters and boundary conditions, and compare different types of feature descriptors. Our results show that VIZARD is able to achieve nearly 100% recall with a localization accuracy below 0.5 m under varying outdoor appearance conditions, including at night-time.

pdf   video

Title = {{VIZARD}: Reliable Visual Localization for Autonomous Vehicles in Urban Outdoor Environments},
Author = {M. Buerki and L. Schaupp and M. Dymczyk and R. Dube and C. Cadena and R. Siegwart and J. Nieto},
Fullauthor = {Mathias Buerki and Lukas Schaupp and Marcyn Dymczyk and Renaud Dube and Cesar Cadena and Roland Siegwart and Juan Nieto},
Booktitle = {{IEEE} Intelligent Vehicles Symposium ({IV})},
Month = {June},
Year = {2019},

Object Classification Based on Unsupervised Learned Multi-Modal Features for Overcoming Sensor Failures

Julia Nitsch, Juan Nieto, Roland Siegwart, Max Schmidt, and Cesar Cadena

IEEE International Conference on Robotics and Automation (ICRA) 2019

For autonomous driving applications it is critical to know which types of road users and roadside infrastructure are present in order to plan driving manoeuvres accordingly. Autonomous cars are therefore equipped with different sensor modalities to robustly perceive their environment. However, it is challenging for classification modules based on machine learning techniques to overcome unseen sensor noise. This work presents an object classification module operating on unsupervised learned multi-modal features with the ability to overcome gradual or total sensor failure. A two-stage approach is presented, composed of unsupervised feature training followed by training of uni-modal and multi-modal classifiers. We propose a simple but effective decision module that switches between uni-modal and multi-modal classifiers based on the closeness in the feature space to the training data. Evaluations on the ModelNet40 dataset show that the proposed approach has a 14% accuracy gain compared to a late-fusion approach operating on noisy point cloud data and a 6% accuracy gain when operating on noisy image data.
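The switching idea can be sketched in a few lines (all names, the nearest-neighbour distance test, and the threshold are illustrative assumptions, not the paper's exact decision rule): a modality whose feature lies far from anything seen in training is treated as unreliable and dropped.

```python
import numpy as np

def nearest_train_distance(feat, train_feats):
    """Closeness of a sample to the training data in feature space."""
    return np.linalg.norm(train_feats - feat, axis=1).min()

def classify(img_feat, pc_feat, train_img, train_pc,
             clf_multi, clf_img, clf_pc, thr=1.0):
    """Switch between multi-modal and uni-modal classifiers: a modality
    far from the training data (unseen noise or failure) is dropped."""
    img_ok = nearest_train_distance(img_feat, train_img) < thr
    pc_ok = nearest_train_distance(pc_feat, train_pc) < thr
    if img_ok and pc_ok:
        return clf_multi(img_feat, pc_feat)
    if img_ok:
        return clf_img(img_feat)
    # Fall back to the remaining modality (in practice one might
    # also flag the sample as fully out-of-distribution).
    return clf_pc(pc_feat)

train_img = np.zeros((4, 2))
train_pc = np.zeros((4, 2))
# Point-cloud feature far from anything seen in training -> image-only path.
label = classify(np.array([0.1, 0.0]), np.array([9.0, 9.0]),
                 train_img, train_pc,
                 clf_multi=lambda i, p: "multi",
                 clf_img=lambda i: "image-only",
                 clf_pc=lambda p: "pc-only")
print(label)  # image-only
```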


Title = {Object Classification Based on Unsupervised Learned Multi-Modal Features for Overcoming Sensor Failures},
Author = {J. Nitsch and J. Nieto and R. Siegwart and M. Schmidt and C. Cadena},
Fullauthor = {Julia Nitsch and Juan Nieto and Roland Siegwart and Max Schmidt and Cesar Cadena},
Booktitle = {{IEEE} International Conference on Robotics and Automation ({ICRA})},
Month = {May},
Year = {2019},

SegMap: Segment-based Mapping and Localization using Data-driven Descriptors

Renaud Dube, Andrei Cramariuc, Daniel Dugas, Hannes Sommer, Marcin Dymczyk, Juan Nieto, Roland Siegwart, and Cesar Cadena

International Journal of Robotics Research (IJRR) 2019

Precisely estimating a robot’s pose in a prior, global map is a fundamental capability for mobile robotics, e.g. autonomous driving or exploration in disaster zones. This task, however, remains challenging in unstructured, dynamic environments, where local features are not discriminative enough and global scene descriptors only provide coarse information. We therefore present SegMap: a map representation solution for localization and mapping based on the extraction of segments in 3D point clouds. Working at the level of segments offers increased invariance to view-point and local structural changes, and facilitates real-time processing of large-scale 3D data. SegMap exploits a single compact data-driven descriptor for performing multiple tasks: global localization, 3D dense map reconstruction, and semantic information extraction. The performance of SegMap is evaluated in multiple urban driving and search and rescue experiments. We show that the learned SegMap descriptor has superior segment retrieval capabilities, compared to state-of-the-art handcrafted descriptors. In consequence, we achieve a higher localization accuracy and a 6% increase in recall over the state of the art. These segment-based localizations allow us to reduce the open-loop odometry drift by up to 50%. SegMap is open source and available along with easy-to-run demonstrations.
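The retrieval step at the heart of the approach reduces to nearest-neighbour search in descriptor space; a toy sketch (function name and hand-set descriptors are illustrative — SegMap's descriptors come from a learned network, and the real system uses efficient k-NN structures, not a brute-force scan):

```python
import numpy as np

def retrieve_candidates(query_desc, map_descs, k=2):
    """Segment retrieval: the nearest neighbours of a query segment's
    descriptor among the map segments are candidate correspondences
    for global localization."""
    dists = np.linalg.norm(map_descs - query_desc, axis=1)
    return np.argsort(dists)[:k]

# Toy map of four segment descriptors.
map_descs = np.array([[0.0, 0.0],
                      [1.0, 1.0],
                      [0.1, 0.0],
                      [5.0, 5.0]])
cands = retrieve_candidates(np.array([0.06, 0.0]), map_descs)
print(cands.tolist())  # [2, 0]: the two segments closest in descriptor space
```

Candidate matches retrieved this way are then verified geometrically before being used as localization constraints.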


 title = {{SegMap}: Segment-based Mapping and Localization using Data-driven Descriptors},
 author = {R. Dube and A. Cramariuc and D. Dugas and H. Sommer and M. Dymczyk and J. Nieto and R. Siegwart and C. Cadena},
 fullauthor ={Renaud Dube and Andrei Cramariuc and Daniel Dugas and Hannes Sommer and Marcin Dymczyk and Juan Nieto and Roland Siegwart and Cesar Cadena},
 journal = {{International Journal of Robotics Research}},
 year = {2019},
 volume = {XX},
 number = {X},
 pages  = {1--16},

Multiple Hypothesis Semantic Mapping for Robust Data Association

Lukas Bernreiter, Abel Gawel, Hannes Sommer, Juan Nieto, Roland Siegwart and Cesar Cadena

IEEE Robotics and Automation Letters, 2019

We present a semantic mapping approach with multiple hypothesis tracking for data association. As semantic information has the potential to overcome ambiguity in measurements and place recognition, it forms an eminent modality for autonomous systems. This is particularly evident in urban scenarios with several similar-looking surroundings. Nevertheless, it requires the handling of a non-Gaussian and discrete random variable coming from object detectors. Previous methods facilitate semantic information for global localization and data association to reduce the instance ambiguity between the landmarks. However, many of these approaches do not deal with the creation of completely globally consistent representations of the environment and typically do not scale well. We utilize multiple hypothesis trees to derive a probabilistic data association for semantic measurements by means of position, instance, and class to create a semantic representation. We propose an optimized mapping method and make use of a pose graph to derive a novel semantic SLAM solution. Furthermore, we show that semantic covisibility graphs allow for a precise place recognition in urban environments. We verify our approach on a real-world outdoor dataset and demonstrate an average drift reduction of 33% w.r.t. the raw odometry source. Moreover, our approach produces 55% fewer hypotheses on average than a regular multiple hypothesis approach.
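A minimal sketch of the kind of association score such a system evaluates per hypothesis (the functional form and all names are illustrative assumptions, not the paper's formulation): a Gaussian position likelihood weighted by the detector's probability mass on the landmark's class.

```python
import numpy as np

def association_likelihood(meas_pos, meas_class_probs, lm_pos, lm_class, cov):
    """Score for associating a semantic measurement with a mapped
    landmark: geometric likelihood (Gaussian in position) times the
    detector's probability for the landmark's class."""
    diff = np.asarray(meas_pos) - np.asarray(lm_pos)
    maha = diff @ np.linalg.inv(cov) @ diff  # squared Mahalanobis distance
    return float(np.exp(-0.5 * maha) * meas_class_probs[lm_class])

cov = np.eye(2)
# Detector output for one measurement: 90% "car", 10% "tree".
probs = {"car": 0.9, "tree": 0.1}
# Two co-located candidate landmarks with different classes.
s_car  = association_likelihood([0.1, 0.0], probs, [0.0, 0.0], "car", cov)
s_tree = association_likelihood([0.1, 0.0], probs, [0.0, 0.0], "tree", cov)
print(s_car > s_tree)  # class agreement breaks the geometric tie
```

Scores like this, tracked over multiple hypotheses instead of committing greedily, are what let semantics disambiguate similar-looking urban surroundings.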


title = {Multiple Hypothesis Semantic Mapping for Robust Data Association},
author = {L. Bernreiter and A. Gawel and H. Sommer and J. Nieto and R. Siegwart and C. Cadena},
fullauthor = {Lukas Bernreiter and Abel Gawel and Hannes Sommer and Juan Nieto and Roland Siegwart and Cesar Cadena},
journal = {{IEEE Robotics and Automation Letters}},
year = {2019},

Empty Cities: Image Inpainting for a Dynamic-Object-Invariant Space

Berta Bescos, Jose Neira, Roland Siegwart, and Cesar Cadena

IEEE International Conference on Robotics and Automation (ICRA) 2019

In this paper we present an end-to-end deep learning framework to turn images that show dynamic content, such as vehicles or pedestrians, into realistic static frames. This objective encounters two main challenges: detecting all the dynamic objects, and inpainting the static occluded background with plausible imagery. The former challenge is addressed by the use of a convolutional network that learns a multiclass semantic segmentation of the image. The second problem is approached with a conditional generative adversarial model that, taking as input the original dynamic image and its dynamic/static binary mask, is capable of generating the final static image. These generated images can be used for applications such as augmented reality or vision-based robot localization. To validate our approach, we show both qualitative and quantitative comparisons against other state-of-the-art inpainting methods by removing the dynamic objects and hallucinating the static structure behind them. Furthermore, to demonstrate the potential of our results, we carry out pilot experiments that show the benefits of our proposal for visual place recognition.

pdf   website   code   video

Title = {Empty Cities: Image Inpainting for a Dynamic-Object-Invariant Space},
Author = {B. Bescos and J. Neira and R. Siegwart and C. Cadena},
Fullauthor = {Berta Bescos and Jose Neira and Roland Siegwart and Cesar Cadena},
Booktitle = {{IEEE} International Conference on Robotics and Automation ({ICRA})},
Month = {May},
Year = {2019},