
Mercury Diffusion LLM Achieves Record Inference Speeds
Researchers from Inception Labs have released Mercury, a diffusion-based language model that achieves inference speeds exceeding 1,000 tokens per second while remaining competitive with autoregressive models on quality benchmarks.
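The speed advantage comes from how diffusion models decode: instead of emitting one token per model call left to right, a diffusion-style decoder refines many positions of the sequence in parallel at each step. The toy sketch below (not Mercury's actual algorithm; the `TARGET` string and step counts are illustrative assumptions) contrasts the number of "model calls" each strategy needs for the same output.

```python
import random

MASK = "_"
TARGET = list("hello world")  # stand-in for the model's final prediction


def diffusion_decode(length, steps):
    """Toy diffusion-style decoding: unmask several positions per step,
    all within a single (hypothetical) parallel model call."""
    seq = [MASK] * length
    masked = list(range(length))
    per_step = max(1, length // steps)  # positions revealed per model call
    calls = 0
    while masked:
        calls += 1
        # "denoise" a batch of positions at once in this model call
        for pos in random.sample(masked, min(per_step, len(masked))):
            seq[pos] = TARGET[pos]
            masked.remove(pos)
    return "".join(seq), calls


def autoregressive_decode(length):
    """Baseline: one token per model call, strictly left to right."""
    seq = []
    for i in range(length):
        seq.append(TARGET[i])  # each token costs one sequential model call
    return "".join(seq), length


random.seed(0)
text, diff_calls = diffusion_decode(len(TARGET), steps=4)
_, ar_calls = autoregressive_decode(len(TARGET))
print(text, diff_calls, ar_calls)  # diffusion finishes in fewer model calls
```

Because each diffusion step fills in a batch of tokens, total sequential model calls shrink roughly by the batch size, which is the intuition behind throughput figures like 1,000+ tokens per second.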
