{"id":692,"date":"2015-02-18T16:11:23","date_gmt":"2015-02-18T16:11:23","guid":{"rendered":"http:\/\/cvl.file2.wcms.tu-dresden.de\/CVLD\/?page_id=692"},"modified":"2018-01-15T12:46:03","modified_gmt":"2018-01-15T12:46:03","slug":"scene-understanding","status":"publish","type":"page","link":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/research\/scene-understanding\/","title":{"rendered":"Scene Understanding"},"content":{"rendered":"<h1>This Page is Under Construction, and the content below is obsolete.<\/h1>\n<p><a href=\"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-content\/uploads\/2015\/02\/3DunderstandV1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignright wp-image-1274 \" src=\"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-content\/uploads\/2015\/02\/3DunderstandV1-300x82.png\" alt=\"3DunderstandV1\" width=\"318\" height=\"87\" srcset=\"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-content\/uploads\/2015\/02\/3DunderstandV1-300x82.png 300w, https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-content\/uploads\/2015\/02\/3DunderstandV1.png 1018w\" sizes=\"auto, (max-width: 318px) 100vw, 318px\" \/><\/a>In this theme we develop efficient inference techniques for 3D scene understanding. Our ultimate goal is to take a few RGB(D) images and output the full scene graph in real time, with all objects present in the scene, their corresponding attributes, and their 3D spatial relationships, e.g. \u201cobject A is supported by object B\u201d. While this is a very active research area with many ongoing efforts, we currently focus on a few research directions. 
In particular, these are 3D pose estimation of known object instances or classes, and semantic segmentation of (stereo) images.<\/p>\n<p><strong>Running Projects:<\/strong><\/p>\n<ul>\n<li><a title=\"Pose Estimation\" href=\"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/research\/scene-understanding\/pose-estimation\/\">Object Instance Recognition and Pose Estimation<\/a> (jointly with Prof. Gumhold&#8217;s team (TUD))<\/li>\n<li><a href=\"http:\/\/kylezheng.org\/densesegattobj\/\">Dense Semantic Segmentation with Objects and Attributes<\/a> (this project is run by Shuai Zheng from Phil Torr&#8217;s team in Oxford)<\/li>\n<li><a href=\"http:\/\/www.ifl.tu-dresden.de\/?dir=Forschung\/Aktuelle_Projekte\/Rollverkehrsmanagement\/LiDAR\">3D Semantic Segmentation and Tracking on an airfield<\/a> (supporting Prof. Fricke&#8217;s team (TUD); DFG project)<\/li>\n<\/ul>\n<p><strong>Selected Publications:<\/strong><\/p>\n<ul>\n<li>E. Brachmann, F. Michel, A. Krull, M. Y. Yang, S. Gumhold, C. Rother, <a href=\"http:\/\/wwwpub.zih.tu-dresden.de\/~cvweb\/publications\/papers\/2016\/rgbpose.pdf\" target=\"_blank\" rel=\"noopener\">Uncertainty-Driven 6D Pose Estimation of Objects and Scenes from a Single RGB Image<\/a>, CVPR 2016<\/li>\n<li>F. Michel, A. Krull, E. Brachmann, M. Y. Yang, S. Gumhold, C. Rother, <a href=\"http:\/\/wwwpub.zih.tu-dresden.de\/~cvweb\/publications\/papers\/2015\/Pose_Estimation_of_Kinematic_Chain_Instances_via_Object_Coordinate_Regression-Michel-BMVC15.pdf\">Pose Estimation of Kinematic Chain Instances via Object Coordinate Regression<\/a>, BMVC 2015<\/li>\n<li>A. Krull, F. Michel, E. Brachmann, S. Gumhold, S. Ihrke, C. Rother, <a href=\"http:\/\/wwwpub.zih.tu-dresden.de\/%7Ecvweb\/publications\/papers\/2014\/accv2014finalpaper.pdf\">6-DOF Model Based Tracking via Object Coordinate Regression<\/a>, ACCV 2014 (the associated system won a demo honorable mention award at ACCV 2014)<\/li>\n<li>E. Brachmann, A. Krull, F. Michel, S. Gumhold, J. 
Shotton, C. Rother, <a href=\"http:\/\/wwwpub.zih.tu-dresden.de\/%7Ecvweb\/publications\/papers\/2014\/PoseEstimationECCV2014.pdf\">Learning 6D Object Pose Estimation using 3D Object Coordinates<\/a>, ECCV 2014<\/li>\n<li>V. Vineet, C. Rother, P. H. S. Torr, Higher Order Priors for Joint Intrinsic Image, Objects, and Attributes Estimation, NIPS 2013<\/li>\n<li>M. Bleyer, C. Rhemann, C. Rother, Extracting 3D Scene-consistent Object Proposals and Depth from Stereo Images, ECCV 2012<\/li>\n<\/ul>\n<p><strong>Datasets:<\/strong><\/p>\n<ul>\n<li><a href=\"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/iccv2015-articulation-challenge\/\">Articulated Objects Dataset<\/a><\/li>\n<li><a href=\"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/iccv2015-occlusion-challenge\/\">Occluded Object Dataset<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>This Page is Under Construction, and the content below is obsolete. In this theme we develop efficient inference techniques for 3D Scene understanding. 
Our ultimate goal is to take a few RGB(D) images and output in real-time the full scene-graph, with all objects present in the scene, their corresponding attributes and their 3D spatial relationship, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":125,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"inline_featured_image":false,"footnotes":""},"class_list":["post-692","page","type-page","status-publish","hentry","post"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Scene Understanding - Computer Vision and Learning Lab Heidelberg<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/research\/scene-understanding\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Scene Understanding - Computer Vision and Learning Lab Heidelberg\" \/>\n<meta property=\"og:description\" content=\"This Page is Under Construction, and the content below is obsolete. In this theme we develop efficient inference techniques for 3D Scene understanding. 
Our ultimate goal is to take a few RGB(D) images and output in real-time the full scene-graph, with all objects present in the scene, their corresponding attributes and their 3D spatial relationship, [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/research\/scene-understanding\/\" \/>\n<meta property=\"og:site_name\" content=\"Computer Vision and Learning Lab Heidelberg\" \/>\n<meta property=\"article:modified_time\" content=\"2018-01-15T12:46:03+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-content\/uploads\/2015\/02\/3DunderstandV1-300x82.png\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/research\\\/scene-understanding\\\/\",\"url\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/research\\\/scene-understanding\\\/\",\"name\":\"Scene Understanding - Computer Vision and Learning Lab 
Heidelberg\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/research\\\/scene-understanding\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/research\\\/scene-understanding\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/wp-content\\\/uploads\\\/2015\\\/02\\\/3DunderstandV1-300x82.png\",\"datePublished\":\"2015-02-18T16:11:23+00:00\",\"dateModified\":\"2018-01-15T12:46:03+00:00\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/research\\\/scene-understanding\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/research\\\/scene-understanding\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/research\\\/scene-understanding\\\/#primaryimage\",\"url\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/wp-content\\\/uploads\\\/2015\\\/02\\\/3DunderstandV1.png\",\"contentUrl\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/wp-content\\\/uploads\\\/2015\\\/02\\\/3DunderstandV1.png\",\"width\":1018,\"height\":280},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/research\\\/scene-understanding\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Research\",\"item\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/research\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Scene 
Understanding\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/#website\",\"url\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/\",\"name\":\"Computer Vision and Learning Lab Heidelberg\",\"description\":\"Computer Vision and Learning Lab Heidelberg\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/hci.iwr.uni-heidelberg.de\\\/vislearn\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Scene Understanding - Computer Vision and Learning Lab Heidelberg","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/research\/scene-understanding\/","og_locale":"en_US","og_type":"article","og_title":"Scene Understanding - Computer Vision and Learning Lab Heidelberg","og_description":"This Page is Under Construction, and the content below is obsolete. In this theme we develop efficient inference techniques for 3D Scene understanding. Our ultimate goal is to take a few RGB(D) images and output in real-time the full scene-graph, with all objects present in the scene, their corresponding attributes and their 3D spatial relationship, [&hellip;]","og_url":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/research\/scene-understanding\/","og_site_name":"Computer Vision and Learning Lab Heidelberg","article_modified_time":"2018-01-15T12:46:03+00:00","og_image":[{"url":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-content\/uploads\/2015\/02\/3DunderstandV1-300x82.png","type":"","width":"","height":""}],"twitter_misc":{"Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/research\/scene-understanding\/","url":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/research\/scene-understanding\/","name":"Scene Understanding - Computer Vision and Learning Lab Heidelberg","isPartOf":{"@id":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/#website"},"primaryImageOfPage":{"@id":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/research\/scene-understanding\/#primaryimage"},"image":{"@id":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/research\/scene-understanding\/#primaryimage"},"thumbnailUrl":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-content\/uploads\/2015\/02\/3DunderstandV1-300x82.png","datePublished":"2015-02-18T16:11:23+00:00","dateModified":"2018-01-15T12:46:03+00:00","breadcrumb":{"@id":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/research\/scene-understanding\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/research\/scene-understanding\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/research\/scene-understanding\/#primaryimage","url":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-content\/uploads\/2015\/02\/3DunderstandV1.png","contentUrl":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-content\/uploads\/2015\/02\/3DunderstandV1.png","width":1018,"height":280},{"@type":"BreadcrumbList","@id":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/research\/scene-understanding\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/"},{"@type":"ListItem","position":2,"name":"Research","item":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/research\/"},{"@type":"ListItem","position":3,"name":"Scene 
Understanding"}]},{"@type":"WebSite","@id":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/#website","url":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/","name":"Computer Vision and Learning Lab Heidelberg","description":"Computer Vision and Learning Lab Heidelberg","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-json\/wp\/v2\/pages\/692","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-json\/wp\/v2\/comments?post=692"}],"version-history":[{"count":29,"href":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-json\/wp\/v2\/pages\/692\/revisions"}],"predecessor-version":[{"id":3627,"href":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-json\/wp\/v2\/pages\/692\/revisions\/3627"}],"up":[{"embeddable":true,"href":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-json\/wp\/v2\/pages\/125"}],"wp:attachment":[{"href":"https:\/\/hci.iwr.uni-heidelberg.de\/vislearn\/wp-json\/wp\/v2\/media?parent=692"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}