Latent Space Editing in Transformer-Based Flow Matching
Tao Hu · David Zhang · Meng Tang · Pascal Mettes · Deli Zhao · Cees Snoek
Event URL: https://openreview.net/forum?id=Bi6E5rPtBa
This paper strives for image editing via generative models. Flow Matching is an emerging generative modeling technique that offers the advantage of simple and efficient training. Simultaneously, a transformer-based U-ViT has recently been proposed to replace the commonly used UNet for better scalability and performance in generative modeling. Flow Matching with a transformer backbone thus offers the potential for scalable and high-quality generative modeling, but its latent structure and editing ability are as of yet unknown. We therefore adopt this setting and explore how to edit images through latent space manipulation. We introduce an editing space, which we call $u$-space, that can be manipulated in a controllable, accumulative, and composable manner. Additionally, we propose a tailored sampling solution to enable sampling with the more efficient adaptive step-size ODE solvers. Lastly, we put forth a straightforward yet powerful method for achieving fine-grained and nuanced editing using text prompts. Our framework is simple and efficient, all while being highly effective at editing images while preserving the essence of the original content. We will provide our source code and include it in the appendix.
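The abstract names two concrete ingredients: flow matching training of a velocity field, and sampling with adaptive step-size ODE solvers. The paper's $u$-space construction and tailored sampler are not spelled out on this page, so the following is only a minimal sketch of the generic building blocks, assuming a PyTorch setup with torchdiffeq installed; ToyVelocityNet is a hypothetical stand-in for the U-ViT backbone, the loss is the standard straight-path conditional flow matching objective, and dopri5 stands in for an adaptive step-size solver, not the paper's own method.

# Minimal sketch of flow matching training plus adaptive-step ODE sampling.
# ToyVelocityNet is a hypothetical stand-in for the transformer (U-ViT) backbone.
import torch
import torch.nn as nn
from torchdiffeq import odeint  # adaptive step-size ODE solvers (e.g. dopri5)

class ToyVelocityNet(nn.Module):
    """Predicts a velocity v(x_t, t); stands in for the real backbone."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 256), nn.SiLU(), nn.Linear(256, dim))

    def forward(self, t, x):
        # Accept either a scalar time (from the ODE solver) or per-sample times.
        t_feat = t.expand(x.shape[0], 1) if t.dim() == 0 else t.reshape(-1, 1)
        return self.net(torch.cat([x, t_feat], dim=-1))

def flow_matching_loss(model, x1):
    """Standard conditional flow matching objective: regress the constant
    velocity x1 - x0 along the straight path x_t = (1 - t) x0 + t x1."""
    x0 = torch.randn_like(x1)          # noise endpoint
    t = torch.rand(x1.shape[0], 1)     # uniform time in [0, 1]
    xt = (1 - t) * x0 + t * x1         # point on the straight path
    target = x1 - x0                   # target velocity
    return ((model(t, xt) - target) ** 2).mean()

def sample(model, n=8, dim=64):
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 with an adaptive solver."""
    x0 = torch.randn(n, dim)
    ts = torch.tensor([0.0, 1.0])
    with torch.no_grad():
        xs = odeint(model, x0, ts, rtol=1e-5, atol=1e-5, method="dopri5")
    return xs[-1]                      # samples at t = 1

model = ToyVelocityNet()
loss = flow_matching_loss(model, torch.randn(16, 64))
loss.backward()
print(float(loss), sample(model).shape)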
Author Information
Tao Hu (University of Amsterdam)
David Zhang (University of Amsterdam)
Meng Tang (University of California, Merced)
Pascal Mettes (University of Amsterdam)
Deli Zhao (Alibaba Group)
Cees Snoek (University of Amsterdam)
More from the Same Authors
- 2023: Neural Networks Are Graphs! Graph Neural Networks for Equivariant Processing of Neural Networks
  David Zhang · Miltiadis (Miltos) Kofinas · Yan Zhang · Yunlu Chen · Gertjan Burghouts · Cees Snoek
- 2023 Poster: Cones: Concept Neurons in Diffusion Models for Customized Generation
  Zhiheng Liu · Ruili Feng · Kai Zhu · Yifei Zhang · Kecheng Zheng · Yu Liu · Deli Zhao · Jingren Zhou · Yang Cao
- 2023 Poster: RLEG: Vision-Language Representation Learning with Diffusion-based Embedding Generation
  Liming Zhao · Kecheng Zheng · Yun Zheng · Deli Zhao · Jingren Zhou
- 2023 Poster: Composer: Creative and Controllable Image Synthesis with Composable Conditions
  Lianghua Huang · Di Chen · Yu Liu · Yujun Shen · Deli Zhao · Jingren Zhou
- 2023 Poster: MetaModulation: Learning Variational Feature Hierarchies for Few-Shot Learning with Fewer Tasks
  Wenfang Sun · Yingjun Du · Xiantong Zhen · Fan Wang · Ling Wang · Cees Snoek
- 2023 Oral: Cones: Concept Neurons in Diffusion Models for Customized Generation
  Zhiheng Liu · Ruili Feng · Kai Zhu · Yifei Zhang · Kecheng Zheng · Yu Liu · Deli Zhao · Jingren Zhou · Yang Cao
- 2023 Poster: Unlocking Slot Attention by Changing Optimal Transport Costs
  Yan Zhang · David Zhang · Simon Lacoste-Julien · Gertjan Burghouts · Cees Snoek
- 2022 Poster: Principled Knowledge Extrapolation with GANs
  Ruili Feng · Jie Xiao · Kecheng Zheng · Deli Zhao · Jingren Zhou · Qibin Sun · Zheng-Jun Zha
- 2022 Spotlight: Principled Knowledge Extrapolation with GANs
  Ruili Feng · Jie Xiao · Kecheng Zheng · Deli Zhao · Jingren Zhou · Qibin Sun · Zheng-Jun Zha
- 2022 Poster: Region-Based Semantic Factorization in GANs
  Jiapeng Zhu · Yujun Shen · Yinghao Xu · Deli Zhao · Qifeng Chen
- 2022 Spotlight: Region-Based Semantic Factorization in GANs
  Jiapeng Zhu · Yujun Shen · Yinghao Xu · Deli Zhao · Qifeng Chen
- 2021 Poster: Understanding Noise Injection in GANs
  Ruili Feng · Deli Zhao · Zheng-Jun Zha
- 2021 Spotlight: Understanding Noise Injection in GANs
  Ruili Feng · Deli Zhao · Zheng-Jun Zha
- 2021 Poster: Uncertainty Principles of Encoding GANs
  Ruili Feng · Zhouchen Lin · Jiapeng Zhu · Deli Zhao · Jingren Zhou · Zheng-Jun Zha
- 2021 Spotlight: Uncertainty Principles of Encoding GANs
  Ruili Feng · Zhouchen Lin · Jiapeng Zhu · Deli Zhao · Jingren Zhou · Zheng-Jun Zha