maplab: An Open Framework for Research in Visual-inertial Mapping and Localization

Thomas Schneider, Marcin Dymczyk, Marius Fehr, Kevin Egger, Simon Lynen, Igor Gilitschenski and Roland Siegwart

IEEE Robotics and Automation Letters, 2018

Robust and accurate visual-inertial estimation is crucial to many of today’s challenges in robotics. Being able to localize against a prior map and obtain accurate and drift-free pose estimates can push the applicability of such systems even further. Most of the currently available solutions, however, either focus on a single-session use-case, lack localization capabilities, or do not provide an end-to-end pipeline. We believe that only a complete system, combining state-of-the-art algorithms, scalable multi-session mapping tools, and a flexible user interface, can become an efficient research platform. We therefore present maplab, an open, research-oriented visual-inertial mapping framework for processing and manipulating multi-session maps, written in C++. On the one hand, maplab can be seen as a ready-to-use visual-inertial mapping and localization system. On the other hand, maplab provides the research community with a collection of multi-session mapping tools that include map merging, visual-inertial batch optimization, and loop closure. Furthermore, it includes an online frontend that can create visual-inertial maps and also track a global drift-free pose within a localization map. In this paper, we present the system architecture, five use-cases, and evaluations of the system on public datasets. The source code of maplab is freely available for the benefit of the robotics research community.
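As a rough illustration of the multi-session workflow the abstract describes (merging per-session maps, closing loops between them, and refining everything with visual-inertial batch optimization before using the result as a localization map), here is a minimal Python sketch. The names Map, merge, detect_loop_closures, and optimize_vi are hypothetical placeholders chosen for this note, not maplab's actual C++ API.

# Hypothetical sketch of a multi-session mapping workflow similar in spirit
# to the one described above; none of these names come from maplab itself.

class Map:
    """Container for keyframes, landmarks, and IMU data of one session."""
    def __init__(self, keyframes, landmarks, imu_data):
        self.keyframes = keyframes
        self.landmarks = landmarks
        self.imu_data = imu_data

def merge(maps):
    """Stack several single-session maps into one multi-session map."""
    merged = Map([], [], [])
    for m in maps:
        merged.keyframes += m.keyframes
        merged.landmarks += m.landmarks
        merged.imu_data += m.imu_data
    return merged

def process_sessions(session_maps, detect_loop_closures, optimize_vi):
    # 1. Merge all single-session maps into one multi-session map.
    joint_map = merge(session_maps)
    # 2. Align the sessions by closing loops between them.
    constraints = detect_loop_closures(joint_map)
    # 3. Refine all states jointly with visual-inertial batch optimization.
    optimize_vi(joint_map, constraints)
    # The refined map can then serve as a localization map for an online
    # frontend that tracks a drift-free global pose against it.
    return joint_map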


@article{schneider2018maplab,
  title   = {maplab: An Open Framework for Research in Visual-inertial Mapping and Localization},
  author  = {T. Schneider and M. T. Dymczyk and M. Fehr and K. Egger and S. Lynen and I. Gilitschenski and R. Siegwart},
  journal = {{IEEE Robotics and Automation Letters}},
  year    = {2018},
}

Traffic Scene Segmentation based on Boosting over Multimodal Low, Intermediate and High Order Multi-range Channel Features

Arthur D. Costea and Sergiu Nedevschi

Proceedings of the 2017 IEEE Intelligent Vehicles Symposium (IV), June 11-14, 2017, Redondo Beach, CA, USA, pp. 74-81

In this paper we introduce a novel multimodal boosting-based solution for semantic segmentation of traffic scenarios. Local structure and context are captured from both monocular color and depth modalities in the form of image channels. We define multiple channel types at three different levels: low, intermediate and high order channels. The low order channels are computed using a multimodal multiresolution filtering scheme and capture structure and color information from lower receptive fields. For the intermediate order channels, we employ deep convolutional channels that are able to capture more complex structures, having a larger receptive field. The high order channels are scale-invariant channels that consist of spatial, geometric and semantic channels. These channels are enhanced by additional pyramidal context channels, capturing context at multiple levels. The semantic segmentation is achieved by a boosting-based classification scheme over superpixels using multi-range channel features and pyramidal context features. A presegmentation is used to generate semantic channels as input for a more powerful final segmentation. The final segmentation is refined using a superpixel-level dense CRF. The proposed solution is evaluated on the Cityscapes segmentation benchmark and achieves competitive results at low computational costs. It is the first boosting-based solution that is able to keep up with the performance of deep learning-based approaches.
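To make the superpixel classification step concrete, the following Python sketch shows one plausible way to classify superpixels with a boosted classifier over pooled channel features. The mean-pooling and the AdaBoost classifier are stand-ins for the multi-range channel features and boosting scheme described above, not the authors' implementation.

# Hypothetical sketch: boosted classification of superpixels from pooled
# channel features. The real system uses multi-range and pyramidal context
# features; here we only average each channel inside the superpixel.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def superpixel_features(channels, superpixel_labels, num_superpixels):
    """channels: (H, W, C) stack of image channels (color, depth, CNN, ...).
    superpixel_labels: (H, W) integer superpixel id per pixel."""
    feats = np.zeros((num_superpixels, channels.shape[2]))
    for s in range(num_superpixels):
        mask = superpixel_labels == s
        feats[s] = channels[mask].mean(axis=0)  # mean-pool each channel
    return feats

# Training: one feature row and one semantic label per superpixel.
# X_train and y_train would come from annotated frames.
clf = AdaBoostClassifier(n_estimators=200)
# clf.fit(X_train, y_train)
# y_pred = clf.predict(superpixel_features(channels, sp_labels, n_sp))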


Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age

Cesar Cadena, Luca Carlone, Henry Carrillo, Yasir Latif, Davide Scaramuzza, Jose Neira, Ian Reid and John J. Leonard

IEEE Transactions on Robotics 32 (6) pp 1309-1332, 2016

Simultaneous Localization and Mapping (SLAM)consists in the concurrent construction of a model of the environment (the map), and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications, and witnessing a steady transition of this technology to industry. We survey the current state of SLAM. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial to those who are users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues, that still deserve careful scientific investigation. The paper also contains the authors’ take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved?
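The "de-facto standard formulation" referred to above is maximum-a-posteriori estimation over a factor graph. As a brief orienting sketch (with notation chosen for this note rather than taken from the paper), under Gaussian noise it reduces to a nonlinear least-squares problem:

\[
\mathcal{X}^{\star} \;=\; \arg\max_{\mathcal{X}} \, p(\mathcal{X} \mid \mathcal{Z})
\;=\; \arg\min_{\mathcal{X}} \sum_{k} \big\lVert h_k(\mathcal{X}_k) - z_k \big\rVert^{2}_{\Omega_k},
\]

where \(\mathcal{X}\) collects the robot poses and landmarks, \(z_k\) are the measurements with information matrices \(\Omega_k\), and \(h_k\) are the corresponding measurement models.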


@article{cadena2016slam,
  title   = {Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age},
  author  = {C. Cadena and L. Carlone and H. Carrillo and Y. Latif and D. Scaramuzza and J. Neira and I. Reid and J.J. Leonard},
  journal = {{IEEE Transactions on Robotics}},
  year    = {2016},
  volume  = {32},
  number  = {6},
  pages   = {1309--1332},
}

Appearance-Based Landmark Selection for Efficient Long-Term Visual Localization

Mathias Buerki, Igor Gilitschenski, Elena Stumm, Roland Siegwart, and Juan Nieto

International Conference on Intelligent Robots and Systems (IROS) 2016

We present an online landmark selection method for efficient and accurate visual localization under changing appearance conditions. The wide range of conditions encountered during long-term visual localization by, e.g., fleets of autonomous vehicles offers the potential to exploit redundancy and reduce data usage by selecting only those visual cues which are relevant at the given time. Therefore, co-observability statistics guide landmark ranking and selection, significantly reducing the amount of information used for localization while maintaining or even improving accuracy.
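A minimal sketch of what co-observability-guided ranking could look like: for each candidate landmark we count how often it was observed together with landmarks already matched under the current conditions, and keep only the top-ranked subset. The scoring rule and names below are assumptions made for illustration, not the method's actual formulation.

# Hypothetical sketch: rank landmarks by co-observation with recently
# matched landmarks, then keep only the top-k for localization.
from collections import Counter

def rank_landmarks(cooccurrence, matched_ids, k):
    """cooccurrence: dict mapping landmark id -> Counter of landmark ids
    it was co-observed with (built offline from the map).
    matched_ids: landmarks successfully matched in the last frames."""
    scores = Counter()
    for m in matched_ids:
        for lm_id, count in cooccurrence.get(m, {}).items():
            scores[lm_id] += count  # higher score = co-observed more often
    return [lm_id for lm_id, _ in scores.most_common(k)]

# Example usage with toy data:
cooc = {1: Counter({2: 5, 3: 1}), 2: Counter({1: 5, 4: 2})}
selected = rank_landmarks(cooc, matched_ids=[1, 2], k=3)  # -> [2, 1, 4]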


@inproceedings{buerki2016landmark,
  title      = {Appearance-Based Landmark Selection for Efficient Long-Term Visual Localization},
  author     = {M. Buerki and I. Gilitschenski and E. Stumm and R. Siegwart and J. Nieto},
  fullauthor = {Mathias Buerki and Igor Gilitschenski and Elena Stumm and Roland Siegwart and Juan Nieto},
  booktitle  = {{IEEE/RSJ} International Conference on Intelligent Robots and Systems ({IROS})},
  address    = {Daejeon, Korea},
  month      = {October},
  year       = {2016},
}