DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion - Zhuanzhi Papers (专知论文)


Video · Stable Diffusion · Image guidance · Temporal consistency · Synthesis

April 14, 2023

DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion


Johanna Karras, Aleksander Holynski, Ting-Chun Wang, Ira Kemelmacher-Shlizerman

From arXiv. Project page: https://grail.cs.washington.edu/projects/dreampose/

We present DreamPose, a diffusion-based method for generating animated fashion videos from still images. Given an image and a sequence of human body poses, our method synthesizes a video containing both human and fabric motion. To achieve this, we transform a pretrained text-to-image model (Stable Diffusion) into a pose-and-image guided video synthesis model, using a novel finetuning strategy, a set of architectural changes to support the added conditioning signals, and techniques to encourage temporal consistency. We fine-tune on a collection of fashion videos from the UBC Fashion dataset. We evaluate our method on a variety of clothing styles and poses, and demonstrate that our method produces state-of-the-art results on fashion video animation. Video results are available on our project page.

