
What is the story
Google has announced its latest AI model, Gemma 3n. The new addition to the Gemma family of open AI models was previewed last month at Google I/O. Unlike Gemini, which is a closed, proprietary system, Gemma is designed so that developers can download and modify it to suit their needs. The latest version natively processes image, audio, and video inputs to generate text output.
Runs on devices with only 2GB of memory
Gemma 3n can run on a device with as little as 2GB of memory, which makes it highly accessible. It is said to outperform its predecessors at tasks such as coding and reasoning. The model comes in two sizes named for their effective parameter counts: E2B and E4B. Their raw parameter counts are 5B and 8B respectively, but architectural innovations let them run with a memory footprint comparable to that of traditional 2B and 4B models.
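To make the gap between raw and effective parameters concrete, here is a rough back-of-the-envelope sketch. It only multiplies parameter counts by an assumed storage size per parameter; the 2-bytes-per-parameter figure and the helper names are illustrative assumptions, not numbers published by Google, so the results will not match the 2GB figure quoted above.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed to hold model weights, in GB (1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Raw counts come from the article; effective counts are read off the
# model names (E2B = 2B effective, E4B = 4B effective).
MODELS = {
    "E2B": {"raw": 5e9, "effective": 2e9},
    "E4B": {"raw": 8e9, "effective": 4e9},
}

# Assumption for illustration only: 16-bit weights, i.e. 2 bytes each.
BYTES_PER_PARAM = 2


def footprint_table(models=MODELS, bytes_per_param=BYTES_PER_PARAM):
    """Return raw vs. effective weight memory for each model size."""
    rows = {}
    for name, counts in models.items():
        rows[name] = {
            "raw_gb": weight_memory_gb(counts["raw"], bytes_per_param),
            "effective_gb": weight_memory_gb(counts["effective"], bytes_per_param),
        }
    return rows


if __name__ == "__main__":
    for name, row in footprint_table().items():
        print(f"{name}: raw {row['raw_gb']:.1f} GB, "
              f"effective {row['effective_gb']:.1f} GB")
```

Whatever precision is assumed, the ratio is what matters: keeping only the effective parameters in fast memory cuts the weight footprint to 40% (E2B) or 50% (E4B) of the raw total.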
Supports text understanding in 140 languages
Gemma 3n features new components such as Per-Layer Embeddings (PLE) for memory efficiency, new audio and MobileNet-V5-based vision encoders optimized for on-device use cases, and the MatFormer architecture for compute flexibility. The model is multilingual, supporting text understanding in 140 languages and multimodal understanding in 35 languages. It also shows overall improvements in math, coding, and reasoning tasks.
Gemma 3n's efficiency comes from a new architecture
Gemma 3n's efficiency comes from a new architecture called MatFormer, which allows a single model to run at different sizes for different tasks. The larger E4B model is the first model with fewer than 10 billion parameters to achieve an LMArena score above 1,300, and it introduces more advanced features.
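The idea of one model running at several sizes can be illustrated with a toy feed-forward block in which a smaller sub-model reuses a prefix of the larger model's hidden units. This is a minimal sketch of the nesting idea only, under the assumption that sub-models share the leading slice of each weight matrix; the function names, dimensions, and random weights are hypothetical and do not come from Gemma 3n itself.

```python
import random


def relu(value: float) -> float:
    return value if value > 0 else 0.0


def ffn(x, w_in, w_out, hidden_units):
    """Feed-forward block that uses only the first `hidden_units` neurons.

    w_in has shape (hidden, d) and w_out has shape (d, hidden), stored as
    lists of rows. Smaller sub-models read a prefix of the same weights.
    """
    hidden = [
        relu(sum(w * xi for w, xi in zip(w_in[j], x)))
        for j in range(hidden_units)
    ]
    return [
        sum(w_out[i][j] * hidden[j] for j in range(hidden_units))
        for i in range(len(w_out))
    ]


# Hypothetical toy dimensions: 3 input features, 8 hidden units at full size.
random.seed(0)
D_MODEL, D_HIDDEN = 3, 8
W_IN = [[random.uniform(-1, 1) for _ in range(D_MODEL)] for _ in range(D_HIDDEN)]
W_OUT = [[random.uniform(-1, 1) for _ in range(D_HIDDEN)] for _ in range(D_MODEL)]
X = [0.5, -1.0, 2.0]

if __name__ == "__main__":
    # The same stored weights serve both sizes; only the slice width changes.
    print("small sub-model:", ffn(X, W_IN, W_OUT, hidden_units=4))
    print("full model:     ", ffn(X, W_IN, W_OUT, hidden_units=D_HIDDEN))
```

Because the small configuration touches only a slice of the stored weights, it needs less compute and memory per token, while the full configuration remains available when quality matters more.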
Audio and vision features
Gemma 3n's audio features include on-device speech-to-text transcription and speech translation, using an encoder that can process speech in fine detail. On the vision side, the model comes with a new encoder called MobileNet-V5, which is much faster and more efficient than its predecessor. It can process video at up to 60 frames per second on a Google Pixel device.
