- [2026.04] Paper accepted to CVPR 2026 as a Highlight!
- [2026.04] arXiv preprint and project page are released.
- Release project page
- Release arXiv paper
- Release inference code & pretrained weights (May 2026)
- Release Gradio demo
We present Vanast, a unified framework that generates garment-transferred human animation videos directly from a single human image, garment images, and a pose-guidance video. Conventional two-stage pipelines treat image-based virtual try-on and pose-driven animation as separate processes, which often leads to identity drift, garment distortion, and front-back inconsistency. Our model addresses these issues by performing the entire process in a single unified pass, achieving coherent synthesis. To enable this setting, we construct large-scale triplet supervision. Our data generation pipeline (i) generates identity-preserving human images in alternative outfits that differ from the garment catalog images, (ii) captures triplets covering both upper and lower garments, overcoming the limitation of existing data that pairs only a single garment with a posed video, and (iii) assembles diverse in-the-wild triplets that require no garment catalog images. We further introduce a Dual Module architecture for video diffusion transformers that stabilizes training, preserves pretrained generative quality, and improves garment accuracy, pose adherence, and identity preservation, while also supporting zero-shot garment interpolation. Together, these contributions allow Vanast to produce high-fidelity, identity-consistent animation across a wide range of garment types.
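Since the inference code is not yet released (see the roadmap above), the snippet below is only a minimal sketch of what the single-pass interface described in the abstract might look like. The `TryOnAnimationInput` structure, the `generate_animation` function, and all file names are hypothetical placeholders, not the actual Vanast API.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical sketch of the unified single-pass interface described in the
# abstract. All names here are placeholders, not the released API.

@dataclass
class TryOnAnimationInput:
    human_image: str           # path to a single human (identity) image
    garment_images: List[str]  # upper and/or lower garment images
    pose_video: str            # path to the pose-guidance video

def generate_animation(inp: TryOnAnimationInput, num_frames: int = 48) -> list:
    """Placeholder for the unified try-on + animation pass.

    A conventional two-stage pipeline would first run image-based try-on and
    then feed the result to a pose-driven animator; Vanast instead performs
    both jointly, which is what this single call is meant to convey.
    """
    raise NotImplementedError("Awaiting the official code release (May 2026).")

# Example usage (hypothetical file names):
# inp = TryOnAnimationInput(
#     human_image="person.png",
#     garment_images=["top.png", "skirt.png"],
#     pose_video="dance_pose.mp4",
# )
# frames = generate_animation(inp)
```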
If you find our work useful, please consider citing:
```bibtex
@inproceedings{cha2026vanast,
  title     = {Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision},
  author    = {Cha, Hyunsoo and Woo, Wonjung and Kim, Byungjun and Joo, Hanbyul},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2026}
}
```

This work was conducted at SNU VCLab.
This project is licensed under CC BY 4.0.