And then, through a LoRA adapter, you can ground the diffuser on the base model's distribution (essentially have it "compare" its proposals against what the base model would have generated). In effect, you get byte-for-byte identical output for the same seed, only roughly twice as fast (and the speedup should be even larger for batched tasks).
I'm not an expert, more of a "practicing enthusiast," so I might be missing something, but at first glance this reads as super exciting to me.
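To make the "compare proposals against the base model" idea concrete, here is a toy Python sketch of generic draft-then-verify decoding as I understand it. It is not I-DLM's actual method: `base_next`, `draft_next`, and `decode_verified` are made-up names, both "models" are tiny deterministic functions, and I use greedy decoding rather than seeded sampling to keep it short. The point is only that if every proposed token is checked against what the base model would have produced, the final output cannot differ from base-only decoding.

```python
from typing import List, Tuple

VOCAB = 11  # toy vocabulary size (assumption, purely illustrative)


def base_next(prefix: List[int]) -> int:
    """Hypothetical stand-in for the base model's greedy next token."""
    return (sum(prefix) * 31 + len(prefix) * 7) % VOCAB


def draft_next(prefix: List[int]) -> int:
    """Hypothetical fast drafter: agrees with the base model most of the
    time, but is deliberately wrong whenever len(prefix) is a multiple of 4."""
    tok = base_next(prefix)
    return (tok + 1) % VOCAB if len(prefix) % 4 == 0 else tok


def decode_base(prompt: List[int], n: int) -> List[int]:
    """Plain one-token-at-a-time decoding with the base model."""
    out = list(prompt)
    for _ in range(n):
        out.append(base_next(out))
    return out[len(prompt):]


def decode_verified(prompt: List[int], n: int, k: int = 4) -> Tuple[List[int], int]:
    """Draft up to k tokens, then keep only what the base model agrees with.

    Returns the generated tokens and the number of verification rounds.
    In a real system each round would be one batched forward pass of the
    base model; here it is simulated with a loop.
    """
    out = list(prompt)
    target = len(prompt) + n
    rounds = 0
    while len(out) < target:
        # 1. The drafter proposes a block of tokens, conditioning on its own guesses.
        proposal: List[int] = []
        for _ in range(min(k, target - len(out))):
            proposal.append(draft_next(out + proposal))
        # 2. Verify: accept matching tokens; at the first mismatch, keep the
        #    base model's token instead and discard the rest of the block.
        rounds += 1
        for tok in proposal:
            expected = base_next(out)
            out.append(expected)
            if tok != expected:
                break
    return out[len(prompt):], rounds


tokens, rounds = decode_verified([1, 2, 3], 20)
print(tokens == decode_base([1, 2, 3], 20), rounds)
```

The speedup in this picture comes from needing fewer verification rounds than generated tokens, which depends entirely on how often the drafter agrees with the base model.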
> 2025-04-12: Released I-DLM-8B, I-DLM-32B, and I-DLM-8B-LoRA on HuggingFace.
Is this old already? Not saying that's a bad thing, since it seems very sophisticated. Just curious whether there's been an update.