We present Diffusion in Style, a simple method to adapt Stable Diffusion to any desired style, using only a small set of target images. It is based on the key observation that the style of the images generated by Stable Diffusion is tied to the initial latent tensor. Fine-tuning without adapting this initial latent tensor to the style is slow, expensive, and impractical, especially when only a few target style images are available. In contrast, fine-tuning becomes much easier when the initial latent tensor is also adapted. Diffusion in Style is orders of magnitude more sample-efficient and faster than existing approaches. It also generates more pleasing images, as shown by qualitative and quantitative comparisons.
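To make the role of the initial latent tensor concrete, the following is a minimal sketch of one possible way to adapt it, assuming the adaptation amounts to replacing the standard normal starting noise with a Gaussian whose per-element mean and standard deviation are estimated from the latents of the target style images. The abstract does not specify the procedure, so this assumption, the function names `fit_style_noise` and `sample_initial_latent`, and the toy latent values are all hypothetical and for illustration only.

```python
import random
import statistics


def fit_style_noise(style_latents):
    """Estimate per-element mean and std over a small set of flattened latents.

    Assumption: each entry of style_latents is the flattened latent of one
    target style image (toy lists of floats here, not real encoder outputs).
    """
    columns = list(zip(*style_latents))  # one tuple per latent element
    mean = [statistics.fmean(c) for c in columns]
    std = [statistics.pstdev(c) for c in columns]
    return mean, std


def sample_initial_latent(mean, std, rng):
    """Draw an initial latent from N(mean, std^2) instead of N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mean, std)]


# Toy stand-ins for encoded style images (hypothetical values, 4 elements each).
style_latents = [
    [0.9, -0.2, 0.4, 1.1],
    [1.1, -0.4, 0.6, 0.9],
    [1.0, -0.3, 0.5, 1.0],
]

mean, std = fit_style_noise(style_latents)
rng = random.Random(0)  # fixed seed so the sketch is reproducible
z_T = sample_initial_latent(mean, std, rng)
```

In this reading, `z_T` would take the place of the usual standard normal starting latent during both fine-tuning and generation, so that the starting point already reflects the statistics of the target style.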