
DiffusionGemma runs 4x faster than Gemma in parallel text generation
Google DeepMind’s open model changes the “token-by-token” default and hits local GPUs with real speed gains.
By Lama Al-Rashid · 3 min
