by NVlabs
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
Demo | HuggingFace | ComfyUI | SGLang | Cosmos-RL
SANA is an efficiency-oriented codebase for high-resolution image and video generation, providing complete training and inference pipelines. This repository contains code for SANA, SANA-1.5, SANA-Sprint, SANA-Video, SANA-WM, SANA-Streaming, and Sol-RL. More details can be found in our documentation.
Join our Discord to engage in discussions with the community! If you have any questions, run into issues, or are interested in contributing, don't hesitate to reach out!
- The diffusers pipeline issue is solved. See the Solved PR.
- diffusers supports Sana-LoRA fine-tuning! Sana-LoRA trains and converges very quickly. See [Guidance] or [diffusers docs].
- diffusers has Sana! All Sana models are released in diffusers safetensors format, and the diffusers pipelines SanaPipeline and SanaPAGPipeline, as well as DPMSolverMultistepScheduler (with FlowMatching), are all supported now. We prepared a Model Card to help you choose.

We introduce SANA, a series of efficient diffusion models for high-resolution image and video generation:
Key Techniques:
In summary, SANA is a fully open-source framework integrating efficient training, fast inference, and flexible deployment for both image and video generation. It can be deployed on laptop GPUs with less than 8GB of VRAM via 4-bit quantization.
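To give a feel for why 4-bit quantization matters for the sub-8GB figure, here is a back-of-the-envelope estimate of weight storage. It is illustrative arithmetic only, not a measurement: it counts just the diffusion transformer weights (parameter counts taken from the model names in this README) and ignores the text encoder, the VAE, activations, and quantization overhead such as scales, so real VRAM use will be higher. The helper name `weight_memory_gb` is hypothetical and not part of the SANA codebase.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Parameter counts (in billions) as listed in this README.
models = [("Sana 0.6B", 0.6), ("Sana 1.6B", 1.6), ("Sana-1.5 4.8B", 4.8)]

for name, params in models:
    bf16 = weight_memory_gb(params, 16)  # bfloat16: 16 bits per weight
    int4 = weight_memory_gb(params, 4)   # 4-bit quantized weights
    print(f"{name}: bf16 ~{bf16:.1f} GB, 4-bit ~{int4:.1f} GB")
```

Under these assumptions, going from 16-bit to 4-bit weights cuts weight storage by a factor of four, which leaves more of an 8GB budget for the other components.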
git clone https://github.com/NVlabs/Sana.git
cd Sana && ./environment_setup.sh sana
import torch
from diffusers import SanaPipeline

# Load the SANA-1.5 1.6B (1024px) pipeline in bfloat16.
pipe = SanaPipeline.from_pretrained(
    "Efficient-Large-Model/SANA1.5_1.6B_1024px_diffusers",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Keep the VAE and text encoder in bfloat16 as well.
pipe.vae.to(torch.bfloat16)
pipe.text_encoder.to(torch.bfloat16)

prompt = 'a cyberpunk cat with a neon sign that says "Sana"'
image = pipe(
    prompt=prompt,
    height=1024,
    width=1024,
    guidance_scale=4.5,
    num_inference_steps=20,
    # Fixed seed so the result is reproducible.
    generator=torch.Generator(device="cuda").manual_seed(42),
)[0]  # the first element of the pipeline output is the list of images

image[0].save("sana.png")
[!TIP] Upgrade to diffusers>=0.32.0 to use SanaPipeline. More details can be found in the Docs.
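If you want to check the version requirement programmatically before importing SanaPipeline, a minimal sketch using only the standard library might look like the following. The helper names (`parse_release`, `meets_minimum`, `diffusers_status`) are hypothetical, and the parser is deliberately simple: it compares only the leading numeric components, so a pre-release such as `0.32.0rc1` is treated the same as `0.32.0`.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(v: str) -> tuple:
    """Turn '0.32.0' or '0.33.0.dev0' into a tuple of leading integers."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "0.32.0") -> bool:
    """Compare release tuples, padding the shorter one with zeros."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def diffusers_status() -> str:
    """Report whether the installed diffusers satisfies the minimum."""
    try:
        installed = version("diffusers")
    except PackageNotFoundError:
        return "diffusers is not installed"
    if meets_minimum(installed):
        return f"diffusers {installed} is new enough for SanaPipeline"
    return f"diffusers {installed} is too old; upgrade to >=0.32.0"


print(diffusers_status())
```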
| Methods (1024x1024) | Throughput (samples/s) | Latency (s) | Params (B) | Speedup | FID ↓ | CLIP ↑ | GenEval ↑ | DPG ↑ |
|---|---|---|---|---|---|---|---|---|
| FLUX-dev | 0.04 | 23.0 | 12.0 | 1.0× | 10.15 | 27.47 | 0.67 | 84.0 |
| Sana-0.6B | 1.7 | 0.9 | 0.6 | 39.5× | 5.81 | 28.36 | 0.64 | 83.6 |
| Sana-0.6B | 1.7 | 0.9 | 0.6 | 39.5× | 5.61 | 28.80 | 0.68 | 84.2 |
| Sana-1.6B | 1.0 | 1.2 | 1.6 | 23.3× | 5.92 | 28.94 | 0.69 | 84.5 |
| Sana-1.5 1.6B | 1.0 | 1.2 | 1.6 | 23.3× | 5.70 | 29.12 | 0.82 | 84.5 |
| Sana-1.5 4.8B | 0.26 | 4.2 | 4.8 | 6.5× | 5.99 | 29.23 | 0.81 | 84.7 |
| Models | Latency (s) | Params (B) | VBench Total ↑ | Quality ↑ | Semantic ↑ |
|---|---|---|---|---|---|
| Wan-2.1-14B | 1897 | 14 | 83.73 | 85.77 | 75.58 |
| Wan-2.1-1.3B | 400 | 1.3 | 83.38 | 85.67 | 74.22 |
| SANA-Video-2B | 36 | 2 | 84.05 | 84.63 | 81.73 |
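The video table reports raw latency rather than a speedup column, so the relative speed has to be derived. The short sketch below divides each baseline's latency by SANA-Video-2B's latency, using only the numbers in the table above. The dictionary and the helper name `speedup_over` are illustrative, and the ratios inherit whatever hardware and settings the original measurements used, which this README excerpt does not specify.

```python
# Latency in seconds, copied from the table above.
latency_s = {
    "Wan-2.1-14B": 1897,
    "Wan-2.1-1.3B": 400,
    "SANA-Video-2B": 36,
}


def speedup_over(baseline: str, target: str = "SANA-Video-2B") -> float:
    """How many times faster `target` is than `baseline` (latency ratio)."""
    return latency_s[baseline] / latency_s[target]


for baseline in ("Wan-2.1-14B", "Wan-2.1-1.3B"):
    ratio = speedup_over(baseline)
    print(f"SANA-Video-2B is about {ratio:.1f}x faster than {baseline}")
```

By this measure SANA-Video-2B is roughly 53 times faster than Wan-2.1-14B and roughly 11 times faster than Wan-2.1-1.3B.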
We will try our best to achieve:
diffusers: https://github.com/huggingface/diffusers/pull/10234
Thanks to the following open-sourced projects:
Thanks to the following open-sourced codebases for their wonderful work!
Thanks to Paper2Video for generating the video of Jeason presenting SANA. Refer to Paper2Video for more details.
Thanks go to these wonderful contributors:
@misc{xie2024sana,
title={Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer},
author={Enze Xie and Junsong Chen and Junyu Chen and Han Cai and Haotian Tang and Yujun Lin and Zhekai Zhang and Muyang Li and Ligeng Zhu and Yao Lu and Song Han},
year={2024},
eprint={2410.10629},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2410.10629},
}
@misc{xie2025sana,
title={SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer},
author={Xie, Enze and Chen, Junsong and Zhao, Yuyang and Yu, Jincheng and Zhu, Ligeng and Lin, Yujun and Zhang, Zhekai and Li, Muyang and Chen, Junyu and Cai, Han and others},
year={2025},
eprint={2501.18427},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2501.18427},
}
@misc{chen2025sanasprint,
title={SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation},
author={Junsong Chen and Shuchen Xue and Yuyang Zhao and Jincheng Yu and Sayak Paul and Junyu Chen and Han Cai and Song Han and Enze Xie},
year={2025},
eprint={2503.09641},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2503.09641},
}
@misc{chen2025sanavideo,
title={SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer},
author={Chen, Junsong and Zhao, Yuyang and Yu, Jincheng and Chu, Ruihang and Chen, Junyu and Yang, Shuai and Wang, Xianbang and Pan, Yicheng and Zhou, Daquan and Ling, Huan and others},
year={2025},
eprint={2509.24695},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2509.24695},
}
@misc{li2026fp4,
title={FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling},
author={Li, Yitong and Chen, Junsong and Xue, Shuchen and Zeren, Pengcuo and Fu, Siyuan and Yang, Dinghao and Tang, Yangyang and Bai, Junjie and Luo, Ping and Han, Song and others},
year={2026},
eprint={2604.06916},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2604.06916},
}
@misc{zhu2026sanawm,
title={SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer},
author={Haoyi Zhu and Haozhe Liu and Yuyang Zhao and Tian Ye and Junsong Chen and Jincheng Yu and Tong He and Song Han and Enze Xie},
year={2026},
eprint={2605.15178},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2605.15178},
}
@misc{zhao2026sanastreamingrealtimestreamingvideo,
title={SANA-Streaming: Real-time Streaming Video Editing with Hybrid Diffusion Transformer},
author={Yuyang Zhao and Yicheng Pan and Qiyuan He and Jincheng Yu and Junsong Chen and Tian Ye and Haozhe Liu and Enze Xie and Song Han},
year={2026},
eprint={2605.30409},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2605.30409},
}