Run Mochi in ComfyUI on a Consumer GPU

Nov 04, 2024

We are excited to announce that ComfyUI now has optimized support for Genmo’s latest model, Mochi! This integration brings state-of-the-art video generation capabilities to the ComfyUI community, even if you're working with consumer-grade GPUs.

The weights and architecture for Mochi 1 (480P) are open and available, and Mochi 1 HD is coming later this year.

Unpacking Mochi: Highlights of the model

1. State-of-the-art performance

Mochi sets a new benchmark for open-source video generation, delivering high-fidelity motion. It also stands out for exceptional prompt adherence.

2. Apache 2.0 License

Mochi is released under the Apache 2.0 license, making it a fantastic choice for developers and creators. You can use, modify, and integrate Mochi into your workflows without restrictive licensing hurdles.

3. Runs on consumer GPUs, fast

Mochi now fits on consumer GPUs such as the RTX 4090. The Mochi node in ComfyUI supports multiple attention backends, which lets the model run in under 24 GB of VRAM.

Try Mochi on ComfyUI

Try the following steps to run the Mochi model immediately with a standard workflow.

Update to the latest version of ComfyUI
Download the Mochi weights (the diffusion model) into the ComfyUI/models/diffusion_models folder
Make sure a text encoder is in your ComfyUI/models/clip folder
Download the VAE into the ComfyUI/models/vae folder
Find the example workflow here and start your generation!
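Before loading the workflow, it can help to confirm that each folder from the steps above actually contains a weights file. The snippet below is a minimal sketch, not part of ComfyUI: the function name `missing_model_dirs`, the list of file extensions, and the `ComfyUI` install path are all assumptions for illustration.

```python
from pathlib import Path

# Folders named in the steps above, relative to the ComfyUI install directory.
REQUIRED_DIRS = ("models/diffusion_models", "models/clip", "models/vae")

# File extensions treated as model weights (an assumption; adjust as needed).
WEIGHT_SUFFIXES = {".safetensors", ".ckpt", ".pt", ".pth", ".bin"}


def missing_model_dirs(comfy_root, required=REQUIRED_DIRS):
    """Return the required folders that are absent or hold no weight files."""
    root = Path(comfy_root)
    missing = []
    for rel in required:
        folder = root / rel
        has_weights = folder.is_dir() and any(
            p.is_file() and p.suffix.lower() in WEIGHT_SUFFIXES
            for p in folder.iterdir()
        )
        if not has_weights:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # "ComfyUI" is a placeholder path; point it at your own install.
    for rel in missing_model_dirs("ComfyUI"):
        print(f"No weight files found in {rel}")
```

The check only looks at file extensions, so it cannot tell whether a file is the right model. It is meant to catch a download that landed in the wrong folder.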

Low RAM Solution

If you run short of RAM when using the workflow above, we recommend the following changes:

Switch the text encoder: Try the fp8 scaled model as an alternative to t5xxl_fp16.
Switch the diffusion model: Use the fp8_scaled diffusion model as an alternative to the bf16 model.
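The reason both switches help is simple arithmetic: fp16 and bf16 store each weight in 2 bytes, while fp8 stores it in 1 byte, so the weights alone take roughly half the memory. The sketch below illustrates this. The helper `weights_gib` is hypothetical, and the 10-billion parameter count is an illustrative assumption, not a figure from this post. Actual memory use will also include activations and other overhead.

```python
# Bytes used to store a single weight in each format.
BYTES_PER_PARAM = {"fp16": 2, "bf16": 2, "fp8": 1}


def weights_gib(num_params, dtype):
    """Approximate size of the weights alone, in GiB, for a given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    # Illustrative parameter count only (an assumption, not from this post).
    example_params = 10_000_000_000
    for dtype in ("bf16", "fp8"):
        print(f"{dtype}: about {weights_gib(example_params, dtype):.1f} GiB")
```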

A Simplified Way to Start

We have also prepared an all-in-one packaged checkpoint that lets you skip the text encoder and VAE configuration. Try these steps:

Download the packaged checkpoint into the ComfyUI/models/checkpoints folder
Run a simplified video generation workflow as above.

Note: This checkpoint packages the fp8_scaled diffusion model and the fp8 scaled text encoder by default.

Enjoy your creation!

A guest post by

Jo Zhang

ComfyUI Blog
