<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Uta Büchler</style></author><author><style face="normal" font="default" size="100%">Biagio Brattoli</style></author><author><style face="normal" font="default" size="100%">Björn Ommer</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Improving Spatiotemporal Self-Supervision by Deep Reinforcement Learning</style></title><secondary-title><style face="normal" font="default" size="100%">Proceedings of the European Conference on Computer Vision (ECCV)</style></secondary-title></titles><keywords><keyword><style face="normal" font="default" size="100%">action recognition</style></keyword><keyword><style face="normal" font="default" size="100%">deep reinforcement learning</style></keyword><keyword><style face="normal" font="default" size="100%">image understanding</style></keyword><keyword><style face="normal" font="default" size="100%">self-supervision</style></keyword><keyword><style face="normal" font="default" size="100%">shuffling</style></keyword></keywords><dates><year><style face="normal" font="default" size="100%">2018</style></year></dates><publisher><style face="normal" font="default" size="100%">(UB and BB contributed equally)</style></publisher><pub-location><style face="normal" font="default" size="100%">Munich, Germany</style></pub-location><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Self-supervised learning of convolutional neural networks can harness large amounts of cheap unlabeled data to train powerful feature representations. As a surrogate task, we jointly address ordering of visual data in the spatial and temporal domain. 
The permutations of training samples, which are at the core of self-supervision by ordering, have so far been sampled randomly from a fixed preselected set. Based on deep reinforcement learning, we propose a sampling policy that adapts to the state of the network being trained. Thus, new permutations are sampled according to their expected utility for updating the convolutional feature representation. Experimental evaluation on unsupervised and transfer learning tasks demonstrates competitive performance on standard benchmarks for image and video classification and nearest-neighbor retrieval.</style></abstract></record></records></xml>