Tags
2 pages
Inference Acceleration
Gemma 4 MTP Tuning: Pushing Toward 120 tokens/s With an assistant Draft Model
What Is Gemma 4 assistant-MTP? How Multi-Token Prediction Draft Models Speed Up Inference