Artificial Intelligence 6 min read

How Google’s Open‑Source Gemma Model Brings LLM Power to Your Laptop

Google’s newly released open‑source Gemma models let developers run powerful large‑language‑model workloads on notebooks, workstations, or cloud platforms, offering competitive performance, extensive tooling, and built‑in safety measures for responsible AI deployment.

21CTO

Feb 22, 2024

How Google’s Open‑Source Gemma Model Brings LLM Power to Your Laptop

Google announced the open‑source large language model family Gemma, released shortly after Gemini 1.5, enabling developers and researchers to build and run AI models on cloud, data‑center, notebook, or PC environments.

Gemma, a lightweight open‑source model family co‑developed by Google DeepMind and others, is built on the same core technology as Gemini. The name derives from the Latin word for “gem”. The models are publicly available at ai.google.dev/gemma .

Two versions are currently open‑source: Gemma 2B and Gemma 7B. Google also released tools to help developers fine‑tune, manage, and deploy these models.

Google emphasizes that Gemma and Gemini share the same architecture and infrastructure components, allowing Gemma 2B and 7B to outperform other open‑source models of comparable size. In benchmark tests, Gemma 7B surpasses Llama 2 7B in inference, mathematics, and code generation, and outperforms the open‑source Mistral 7B on several datasets.

Gemma’s pretrained and instruction‑tuned models can run directly on a developer’s laptop, workstation, desktop, or on Google Cloud services such as Vertex AI and Google Kubernetes Engine (GKE). Vertex AI provides extensive MLOps tools, built‑in fine‑tuning options, and one‑click deployment, while GKE supports custom deployments on GPUs, TPUs, and CPUs.

The model supports a wide range of frameworks, including Keras 3.0, native PyTorch, JAX, and Hugging Face Transformers, and can be fine‑tuned with user data. Google has optimized Gemma for various AI hardware platforms, collaborating with Nvidia to ensure high‑performance execution on Nvidia GPUs, Google Cloud TPUs, and RTX AI PCs.

Gemma is designed according to Google’s Safe and Responsible AI principles. The research team uses automated methods to filter personal and sensitive data from the training set and applies reinforcement learning from human feedback (RLHF) to ensure safe behavior. Risk mitigation includes red‑team exercises, automated threat testing, and assessments of dangerous capabilities.

To facilitate responsible AI evaluation, Google released a new Responsible Generative AI Toolkit, offering example safety classifiers, model debugging tools, and guidelines for safe model development and deployment.

For easy adoption, Gemma integrates with common notebook environments such as Kaggle and Colab, and works with tools like Hugging Face, MaxText, Nvidia MeMo, and TensorRT‑LLM. Google also provides free Kaggle and Colab tiers, $300 in Google Cloud credits, and up to $500,000 worth of Cloud credits for research projects.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Model Deployment Google AI AI safety open-source LLM Gemma

Written by

21CTO

21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.