
Join us virtually this Wednesday at 6 pm to continue our monthly Paper Review series! Google released the Gemma 4 model family and TurboQuant KV cache compression about a week apart. We’ll look at what’s actually new in the architecture and how TurboQuant holds up against its claims, then walk through fitting these models on different hardware.
Part I – Foundations
Part II – Deep Dive
Links:
Gemma 4 Model Card: https://ai.google.dev/gemma/docs/core/model_card_4
Gemma 4 HF Blog: https://huggingface.co/blog/gemma4
TurboQuant Paper: https://arxiv.org/abs/2504.19874
Details: