Neural Style Transfer

Below, we provide a selective overview of our research on neural style transfer. For a comprehensive list, please visit our publications page.

Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes
Kotovenko, D., Wright, M., Heimbrecht, A., and Ommer, B. (2021).
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

There have been many successful implementations of neural style transfer in recent years. In most of these works, the stylization process is confined to the pixel domain. However, we argue that this representation is unnatural because paintings usually consist of brushstrokes rather than pixels. We propose a method to stylize images by optimizing parameterized brushstrokes instead of pixels and further introduce a simple differentiable rendering mechanism. Our approach significantly improves visual quality and enables additional control over the stylization process such as controlling the flow of brushstrokes through user input. We provide qualitative and quantitative evaluations that show the efficacy of the proposed parameterized representation.
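To give a feel for the core idea, here is a minimal sketch of optimizing brushstroke parameters through a toy differentiable renderer. The Gaussian-blob stroke model, the pixel-wise target loss, and all names below are our own simplifying assumptions, not the paper's actual renderer or objective.

```python
import torch

# Toy differentiable "renderer": each stroke is a soft Gaussian blob
# splatted onto the canvas, so gradients flow back to stroke parameters.
def render(pos, color, log_sigma, H=64, W=64):
    ys = torch.linspace(0, 1, H)
    xs = torch.linspace(0, 1, W)
    yy, xx = torch.meshgrid(ys, xs, indexing="ij")
    grid = torch.stack([yy, xx], dim=-1)                       # (H, W, 2)
    d2 = ((grid[None] - pos[:, None, None, :]) ** 2).sum(-1)   # (N, H, W)
    alpha = torch.exp(-d2 / (2 * torch.exp(log_sigma)[:, None, None] ** 2))
    canvas = (alpha[..., None] * color[:, None, None, :]).sum(0)
    return canvas / (alpha.sum(0)[..., None] + 1e-6)           # (H, W, 3)

N = 64
pos = torch.rand(N, 2, requires_grad=True)               # stroke centers
color = torch.rand(N, 3, requires_grad=True)             # RGB per stroke
log_sigma = torch.full((N,), -3.0, requires_grad=True)   # stroke size

target = torch.rand(64, 64, 3)  # stand-in; the paper optimizes a style loss
opt = torch.optim.Adam([pos, color, log_sigma], lr=0.02)

for step in range(200):
    opt.zero_grad()
    loss = ((render(pos, color, log_sigma) - target) ** 2).mean()
    loss.backward()
    opt.step()  # strokes, not pixels, are the optimization variables
```

The point of the sketch is the change of variables: because rendering is differentiable, any image-space loss can drive gradient descent directly on stroke positions, colors, and sizes.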

arXiv | Project page | Code

Content and Style Disentanglement for Artistic Style Transfer
Kotovenko, D., Sanakoyeu, A., Lang, S., and Ommer, B. (2019).
Proceedings of the Intl. Conf. on Computer Vision (ICCV).

Artists rarely paint in a single style throughout their career. More often, they change styles or develop variations of them. In addition, artworks in different styles, and even within one style, depict real content differently: while Picasso's Blue Period displays a vase in bluish tones but as a whole, his Cubist works deconstruct the object. To produce artistically convincing stylizations, style transfer models must be able to reflect these changes and variations. Recently, many works have aimed to improve the style transfer task, but have neglected to address the observations described above. We present a novel approach that captures the particularities of a style and the variations within it, and that separates style from content. This is achieved by introducing two novel losses: a fixpoint triplet style loss to learn subtle variations within one style or between different styles, and a disentanglement loss to ensure that the stylization is not conditioned on the real input photo. In addition, the paper proposes various evaluation methods to measure the importance of both losses for the validity, quality, and variability of the final stylizations. We provide qualitative results to demonstrate the performance of our approach.
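The fixpoint triplet style loss builds on the standard triplet formulation. As a hedged sketch of that underlying mechanism (a simplification, not the paper's exact loss), a stylization's style embedding can be pulled toward its target style and pushed away from another style:

```python
import torch
import torch.nn.functional as F

def triplet_style_loss(e_anchor, e_pos, e_neg, margin=0.2):
    """Standard triplet margin loss on style embeddings.

    A simplified stand-in for the paper's fixpoint triplet style loss:
    the stylized image's embedding (anchor) should lie closer to its
    target style (positive) than to a different style (negative).
    """
    d_pos = (e_anchor - e_pos).pow(2).sum(dim=1)
    d_neg = (e_anchor - e_neg).pow(2).sum(dim=1)
    return F.relu(d_pos - d_neg + margin).mean()

# Illustrative usage with random stand-in embeddings.
e_a, e_p, e_n = torch.randn(8, 128), torch.randn(8, 128), torch.randn(8, 128)
print(triplet_style_loss(e_a, e_p, e_n))
```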

arXiv | Project page | Code

Using a Transformation Content Block For Image Style Transfer
Kotovenko, D., Sanakoyeu, A., Lang, S., Ma, P., and Ommer, B. (2019).
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

Style transfer has recently received a lot of attention, since it allows us to study fundamental challenges in image understanding and synthesis. Recent work has significantly improved the representation of color and texture, as well as computational speed and image resolution. The explicit transformation of image content has, however, been mostly neglected: while artistic style affects formal characteristics of an image, such as color, shape, or texture, it also deforms, adds, or removes content details. This paper explicitly focuses on a content- and style-aware stylization of a content image. To this end, we introduce a content transformation module between the encoder and decoder. Moreover, we utilize similar content appearing in photographs and style samples to learn how style alters content details, and we generalize this to other class details. Additionally, this work presents a novel normalization layer critical for high-resolution image synthesis. The robustness and speed of our model enable video stylization in real time and high definition. We perform extensive qualitative and quantitative evaluations to demonstrate the validity of our approach.
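To make the encoder / transformation-module / decoder layout concrete, here is a minimal sketch in PyTorch. The channel counts, the residual design of the bottleneck block, and the use of instance normalization are our assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ContentTransformBlock(nn.Module):
    """Residual bottleneck block standing in for the paper's content
    transformation module; the exact design here is an assumption."""
    def __init__(self, ch=128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch),
        )

    def forward(self, x):
        return x + self.body(x)  # transform content features residually

class Stylizer(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.transform = ContentTransformBlock(128)  # between encoder/decoder
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2), nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2), nn.Conv2d(64, 3, 3, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.transform(self.encoder(x)))

out = Stylizer()(torch.randn(1, 3, 256, 256))  # -> (1, 3, 256, 256)
```

The key design point is that content is modified in feature space, at the bottleneck, rather than by post-processing pixels.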

arXiv | Project page | Code

A Style-Aware Content Loss for Real-time HD Style Transfer
Sanakoyeu, A., Kotovenko, D., Lang, S., and Ommer, B. (2018).
Proceedings of the European Conference on Computer Vision (ECCV) (Oral).

Recently, style transfer has received a lot of attention. While much of this research has aimed at speeding up processing, the approaches still lack a principled, art-historical foundation: a style is more than just a single image or an artist, yet previous work is limited to a single instance of a style or shows no benefit from additional images. Moreover, previous work has relied on a direct comparison of art in the domain of RGB images, or on CNNs pre-trained on ImageNet, which requires millions of labeled object bounding boxes and can introduce an extra bias, since it was assembled without artistic considerations. To circumvent these issues, we propose a style-aware content loss, which is trained jointly with a deep encoder-decoder network for real-time, high-resolution stylization of images and videos. We propose a quantitative measure for evaluating the quality of a stylized image, and we also have art historians rank patches from our approach against those from previous work. These and our qualitative results, ranging from small image patches to megapixel stylistic images and videos, show that our approach better captures the subtle nature in which a style affects content.
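As a hedged sketch of the central idea (our simplification, not the paper's exact formulation), a style-aware content loss compares the input photo and its stylization in the feature space of the jointly trained encoder, rather than in RGB space or in the features of a fixed ImageNet-pretrained network:

```python
import torch
import torch.nn as nn

def style_aware_content_loss(encoder: nn.Module,
                             photo: torch.Tensor,
                             stylized: torch.Tensor) -> torch.Tensor:
    """Measure content similarity in the *learned* encoder's feature space;
    a simplified stand-in for the paper's style-aware content loss."""
    return (encoder(photo) - encoder(stylized)).pow(2).mean()

# Illustrative usage with a toy encoder and random images.
enc = nn.Sequential(nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU())
photo = torch.randn(1, 3, 128, 128)
stylized = torch.randn(1, 3, 128, 128)
print(style_aware_content_loss(enc, photo, stylized))
```

Because the encoder is trained jointly with the stylization network, the notion of "content" it measures adapts to the style at hand instead of being fixed by ImageNet pre-training.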

arXiv | Project page | Code