Gemma 4 models use a training trick to reduce their memory footprint
TL;DR: Gemma 4 models trained with quantization-aware training (QAT) are now available for download. QAT reduces the models' size and memory footprint, and these open-source models retain quality better than versions produced with post-training quantization (PTQ). The…
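
To make the memory claim concrete, here is a minimal sketch of the rounding step that both QAT and PTQ rely on, plus the storage arithmetic behind the smaller footprint. The 4-bit width, the symmetric per-tensor scaling, and the names `fake_quantize` and `weight_bytes` are assumptions chosen for illustration. The summary above does not say which bit width or quantization scheme the Gemma 4 QAT checkpoints use.

```python
def fake_quantize(weights, bits=4):
    """Snap each weight to a symmetric signed integer grid, then map it back to a float.

    Hypothetical helper: per-tensor scaling and the 4-bit default are assumptions.
    """
    qmax = 2 ** (bits - 1) - 1        # largest integer level, e.g. 7 for 4 bits
    qmin = -(2 ** (bits - 1))         # smallest integer level, e.g. -8 for 4 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0

    quantized = []
    for w in weights:
        q = round(w / scale)          # nearest integer level
        q = max(qmin, min(qmax, q))   # clamp to the representable range
        quantized.append(q * scale)   # back to a float on the coarse grid
    return quantized


def weight_bytes(n_params, bits):
    """Bytes needed to store n_params weights at the given bit width."""
    return n_params * bits / 8


weights = [0.31, -0.72, 0.05, 0.48]
approx = fake_quantize(weights, bits=4)
errors = [abs(a - w) for a, w in zip(approx, weights)]

# Going from 16-bit to 4-bit storage shrinks the weights by this factor.
shrink = weight_bytes(1_000_000, 16) / weight_bytes(1_000_000, 4)
```

In general terms, PTQ applies this kind of rounding once to a model that has already been trained, whereas QAT runs the forward pass through it during training, so the weights can adapt to the rounding error before the model is shipped at low precision.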




