<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Brachmann, Eric</style></author><author><style face="normal" font="default" size="100%">Rother, Carsten</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Learning Less is More - 6D Camera Localization via 3D Surface Regression</style></title><secondary-title><style face="normal" font="default" size="100%">Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2018</style></year><pub-dates><date><style  face="normal" font="default" size="100%">nov</style></date></pub-dates></dates><urls><web-urls><url><style face="normal" font="default" size="100%">http://arxiv.org/abs/1711.10228</style></url></web-urls></urls><pages><style face="normal" font="default" size="100%">4654–4662</style></pages><isbn><style face="normal" font="default" size="100%">9781538664209</style></isbn><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Popular research areas like autonomous driving and augmented reality have renewed the interest in image-based camera localization. In this work, we address the task of predicting the 6D camera pose from a single RGB image in a given 3D environment. With the advent of neural networks, previous works have either learned the entire camera localization process, or multiple components of a camera localization pipeline. Our key contribution is to demonstrate and explain that learning a single component of this pipeline is sufficient.
This component is a fully convolutional neural network for densely regressing so-called scene coordinates, defining the correspondence between the input image and the 3D scene space. The neural network is prepended to a new end-to-end trainable pipeline. Our system is efficient, highly accurate, robust in training, and exhibits outstanding generalization capabilities. It exceeds state-of-the-art consistently on indoor and outdoor datasets. Interestingly, our approach surpasses existing techniques even without utilizing a 3D model of the scene during training, since the network is able to discover 3D scene geometry automatically, solely from single-view constraints.</style></abstract></record></records></xml>