ACE-Step 1.5 XL: Commercial-Grade Music Generation in ComfyUI

A 4B-parameter open-source music model that generates full songs in seconds — locally on consumer hardware

Apr 17, 2026

Music generation just got a serious upgrade. ACE-Step 1.5 XL brings a 4B-parameter Diffusion Transformer decoder to the ACE-Step framework, delivering audio quality that competes with commercial music models — and it runs locally on your GPU.

It comes in three variants: xl-base for maximum versatility, xl-sft for peak audio quality, and xl-turbo for speed. All three are released under the MIT license, which permits commercial use, and are trained on legally compliant data.

Sample tracks (2:00 each):

Dark Synthwave (Instrumental)
Melodic Dubstep (Female Vocal)
Ambient Electronic (Female Vocal)

Key Highlights

Commercial-Grade Quality — Evaluation metrics place output between Suno v4.5 and v5, with 4B parameters delivering richer audio than the 2B predecessors
Ultra-Fast Generation — Under 2 seconds per full song on an A100, under 10 seconds on an RTX 3090. xl-turbo cuts inference to just 8 steps (~6x faster than base/sft)
Flexible Duration — Generate anything from 10-second loops to full 10-minute compositions
1000+ Instruments and Styles — Fine-grained timbre description across a massive range of musical genres
50+ Language Lyrics — Prompt with lyrics for structure and style control in over 50 languages
Commercially Licensed — MIT license; trained on licensed music, royalty-free and public-domain music, and synthetic MIDI-to-audio data
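
To put the speed figures above in perspective, here is a minimal sketch that converts them into a real-time factor, meaning seconds of audio produced per second of compute. The 2-minute song length is an assumption (roughly the length of the sample tracks), not a figure the benchmark states. Because the generation times are quoted as upper bounds ("under 2 seconds"), the resulting factors are lower bounds.

```python
def realtime_factor(audio_seconds: float, generation_seconds: float) -> float:
    """Seconds of audio produced per second of generation time."""
    if generation_seconds <= 0:
        raise ValueError("generation_seconds must be positive")
    return audio_seconds / generation_seconds


# Assumption: a "full song" is 2 minutes long. The article does not say.
SONG_SECONDS = 120.0

# Upper bounds on generation time quoted in the highlights above.
TIME_BOUNDS = {"A100": 2.0, "RTX 3090": 10.0}

for gpu, bound in TIME_BOUNDS.items():
    factor = realtime_factor(SONG_SECONDS, bound)
    print(f"{gpu}: at least {factor:.0f}x real time")
```

Under that assumption, an A100 works out to at least 60x real time and an RTX 3090 to at least 12x. Longer songs may not scale linearly, so treat this as a rough guide only.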

Pick Your Variant

All three XL models share the same 4B-parameter architecture.

XL-Base — Most versatile, highest diversity. For maximum creative range.

Download Workflow

XL-SFT — Peak audio quality, some loss in diversity. For clean final outputs.

Try on Comfy Cloud

Download Workflow

XL-Turbo — 8 steps, ~6x faster, no classifier-free guidance (CFG). For fast iteration.

Try on Comfy Cloud

Download Workflow
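
If you script your setup, the choice above can be written down as a small lookup. This is only an illustrative sketch: the `Variant` class, the priority keys, and `pick_variant` are hypothetical names made up for this example and are not part of ComfyUI or ACE-Step. The traits are copied from the descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    traits: str
    use_for: str


# Hypothetical priority keys mapped to the three variants described above.
VARIANTS = {
    "diversity": Variant("xl-base", "most versatile, highest diversity",
                         "maximum creative range"),
    "quality": Variant("xl-sft", "peak audio quality, some loss in diversity",
                       "clean final outputs"),
    "speed": Variant("xl-turbo", "8 steps, ~6x faster, no CFG",
                     "fast iteration"),
}


def pick_variant(priority: str) -> Variant:
    """Return the variant matching a priority, or raise ValueError."""
    try:
        return VARIANTS[priority]
    except KeyError:
        options = ", ".join(sorted(VARIANTS))
        raise ValueError(f"unknown priority {priority!r}; choose from: {options}") from None


choice = pick_variant("speed")
print(f"{choice.name}: {choice.traits} (for {choice.use_for})")
```

A common pattern would be to iterate on prompts with the speed-oriented variant and then re-render the keepers with the quality-oriented one, though whether a prompt transfers cleanly between variants is something to verify yourself.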

Getting Started

Download or update ComfyUI to the latest version, or visit Comfy Cloud
Open the Template Library and search for “ACE Step”
Select a workflow
Follow the guide in the workflow to download the model
Update the prompt, then hit Run
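
If you would rather queue generations from a script than click Run, the last two steps can be sketched against a local ComfyUI server's HTTP API. This assumes a ComfyUI instance running at its default address (`http://127.0.0.1:8188`) and a workflow exported in API format. The node id `"14"`, the input name `"tags"`, and the file name in the usage note are hypothetical placeholders; look up the real values in your own exported JSON.

```python
import copy
import json
import urllib.request

# ComfyUI's default local address; adjust if your server runs elsewhere.
COMFY_SERVER = "http://127.0.0.1:8188"


def set_input(workflow: dict, node_id: str, input_name: str, value) -> dict:
    """Return a copy of an API-format workflow with one node input replaced."""
    updated = copy.deepcopy(workflow)
    updated[node_id]["inputs"][input_name] = value
    return updated


def build_request(workflow: dict, server: str = COMFY_SERVER) -> urllib.request.Request:
    """Build the POST /prompt request that queues a workflow for execution."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{server}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow: dict, server: str = COMFY_SERVER) -> dict:
    """Send the workflow to a running ComfyUI server and return its JSON reply."""
    with urllib.request.urlopen(build_request(workflow, server)) as response:
        return json.loads(response.read())


def queue_from_file(path: str, node_id: str, input_name: str, value) -> dict:
    """Load an exported workflow, change one input, and queue it."""
    with open(path, encoding="utf-8") as f:
        workflow = json.load(f)
    return queue_workflow(set_input(workflow, node_id, input_name, value))
```

With the server running, a call such as `queue_from_file("ace_step_xl_turbo_api.json", "14", "tags", "dark synthwave, instrumental")` would queue one generation. The model download in the previous step still has to be done first, since the script only submits the workflow.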

As always, enjoy creating!

Mahlon Stacy

Apr 18

I downloaded the XL-base model. No satisfaction. Singer sounds like she's in a tin shower. Instruments seem like they have no coherence, compared to the same song on Suno.

Even your provided samples don't sound very good to me.

Tyrannicides

Apr 19

It’s like when I first used v3.5

Now we just need LoRAs for it.

How about a prefab ACE-Step 1.5 LoRA training workflow?

ComfyUI Newsletter
