Google Strikes Back: Gemini’s New Features Take on Claude Code

The article reviews Google Gemini’s three‑pronged rollout: a Mac desktop app with global shortcuts and window sharing, a Gemini CLI enhanced with Subagents that keep context clean and enable parallel expert tasks, and the Gemini 3.1 Flash TTS model with Audio Tags. It compares each to competitors and highlights practical use cases and limitations.

Old Zhang's AI Learning

Apr 17, 2026

Gemini Mac Desktop App

Google released a native Swift Gemini client for macOS, built with the Antigravity team in a few days. Core features:

Global shortcuts: Option + Space opens a mini chat window on any screen; Option + Shift + Space opens the full chat UI. Both shortcuts are configurable.

Window sharing: The app can capture the current window’s content (documents, code, spreadsheets) and answer context‑aware questions such as “What bug does this Python script have?” without copying text.

Creative capabilities: Built‑in image generation (Nano Banana) and video generation (Veo) turn the desktop into a creation workstation.

Multi‑device sync: Chat history and memory sync across devices under the same Google account.

System requirements:

macOS Sequoia (15.0) or later

Apple Silicon (M‑series) only

8 GB RAM or more

200 MB free disk space

Stable internet connection

Free to use

Download: https://gemini.google/mac

Gemini CLI Subagents

When Gemini CLI works on complex tasks, the main Agent’s context window grows large, which degrades response quality. Subagents address this by giving the main Agent a team of specialized experts, each with an isolated context window, a dedicated system prompt, its own tool set, and a separate MCP server.
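To make the isolation idea concrete, here is a minimal conceptual sketch in Python. It is not Gemini CLI code: `AgentContext` and `run_subagent` are hypothetical names, and a "context window" is modeled as a plain list of messages. The point is only that the intermediate work stays inside the Subagent and the main Agent receives a single summary.

```python
from dataclasses import dataclass, field


@dataclass
class AgentContext:
    """Toy stand-in for a context window: a system prompt plus a message list."""
    system_prompt: str
    messages: list[str] = field(default_factory=list)

    def add(self, message: str) -> None:
        self.messages.append(message)


def run_subagent(task: str, system_prompt: str) -> str:
    """Run a task in a fresh, isolated context and return only a short summary."""
    ctx = AgentContext(system_prompt=system_prompt)
    ctx.add(f"task: {task}")
    # Simulated intermediate tool output; it never leaves this function.
    for i in range(50):
        ctx.add(f"tool output {i} (hypothetical)")
    return f"summary of '{task}' after {len(ctx.messages)} internal messages"


main = AgentContext(system_prompt="You are the main Agent.")
main.add("user: trace the auth module")
main.add(run_subagent("trace the auth module", "You investigate codebases."))

# The main context holds the user message and one summary, nothing else.
print(len(main.messages))
```

In this toy model the main context grows by one message per delegated task, regardless of how much work the Subagent did internally.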

Built‑in Subagents

generalist: inherits all tools; suited for bulk refactoring, high‑output tasks.

codebase_investigator: focuses on code‑base exploration, architecture analysis, dependency tracing, and bug root‑cause identification.

cli_help: answers configuration, command, and usage questions.

Experimental browser_agent

It can automate browser actions (form filling, button clicking). It requires Chrome 144+ and must be enabled in settings.json.

Custom Subagent definition

A custom Subagent is defined by a single Markdown file placed in .gemini/agents/ (project‑level) or ~/.gemini/agents/ (global). Example definition for a frontend specialist:

---
name: frontend-specialist
description: Frontend specialist in building high-performance, accessible, and scalable web applications.
tools:
  - read_file
  - grep_search
  - glob
  - list_directory
  - web_fetch
  - google_web_search
model: inherit
---

You are a Senior Frontend Specialist and UI/UX Architect.
Your goal is to design and implement exceptional, production‑grade user interfaces.

### Core Principles:
- Architecture & Scalability
- Performance & Optimization
- Accessibility (A11y)

Configuration fields (all optional unless noted):

name: unique identifier used with the @ syntax.

description: description that the main Agent uses to decide when to dispatch the Subagent.

tools: list of authorized tools; supports wildcards such as * (all) or mcp_* (all MCP tools).

model: model to use; default inherit (inherits the main Agent’s model).

temperature: sampling temperature, range 0‑2.

max_turns: maximum dialogue turns, default 30.

timeout_mins: timeout in minutes, default 10.
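For readers who generate these files from a script, the following Python sketch renders a definition from the fields listed above. It is an illustration under assumptions, not part of Gemini CLI: `SubagentSpec` and `write_spec` are hypothetical names, the `<name>.md` file name is a guess, and values are written verbatim with no YAML escaping (a bare `*` wildcard or a description containing a colon would need quoting).

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


@dataclass
class SubagentSpec:
    """Fields and defaults taken from the article's field list."""
    name: str
    description: str
    system_prompt: str
    tools: list[str] = field(default_factory=list)
    model: str = "inherit"
    temperature: Optional[float] = None
    max_turns: int = 30
    timeout_mins: int = 10

    def to_markdown(self) -> str:
        """Render frontmatter plus prompt; values are not YAML-escaped."""
        if self.temperature is not None and not 0 <= self.temperature <= 2:
            raise ValueError("temperature must be in the range 0-2")
        lines = ["---", f"name: {self.name}", f"description: {self.description}"]
        if self.tools:
            lines.append("tools:")
            lines.extend(f"  - {tool}" for tool in self.tools)
        lines.append(f"model: {self.model}")
        if self.temperature is not None:
            lines.append(f"temperature: {self.temperature}")
        lines.append(f"max_turns: {self.max_turns}")
        lines.append(f"timeout_mins: {self.timeout_mins}")
        lines.extend(["---", "", self.system_prompt, ""])
        return "\n".join(lines)


def write_spec(spec: SubagentSpec, agents_dir: Path) -> Path:
    """Write the definition as <name>.md (the file name is an assumption)."""
    agents_dir.mkdir(parents=True, exist_ok=True)
    path = agents_dir / f"{spec.name}.md"
    path.write_text(spec.to_markdown(), encoding="utf-8")
    return path


spec = SubagentSpec(
    name="frontend-specialist",
    description="Frontend specialist for accessible, scalable web applications.",
    system_prompt="You are a Senior Frontend Specialist and UI/UX Architect.",
    tools=["read_file", "grep_search", "glob"],
    temperature=0.2,
)

# Write into a temporary directory rather than a real .gemini/agents/ folder.
with tempfile.TemporaryDirectory() as tmp:
    written = write_spec(spec, Path(tmp) / ".gemini" / "agents")
    print(written.name)
    print(written.read_text(encoding="utf-8"))
```

Pointing `agents_dir` at `.gemini/agents/` or `~/.gemini/agents/` would correspond to the project‑level and global locations described earlier.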

Parallel execution

Subagents run in parallel; total execution time approximates that of the slowest Subagent. Parallelism is ideal for read‑only tasks (analysis, research, testing) because concurrent file edits can conflict.
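The timing claim is easy to check with a small simulation. The sketch below uses Python's asyncio with sleeps standing in for Subagent work; `fake_subagent` and the durations are made up for illustration and say nothing about how Gemini CLI schedules tasks internally.

```python
import asyncio
import time


async def fake_subagent(name: str, seconds: float) -> str:
    """Stand-in for a read-only Subagent task; sleeps instead of doing work."""
    await asyncio.sleep(seconds)
    return f"{name} finished"


async def run_all() -> tuple[list[str], float]:
    """Run three simulated tasks concurrently and measure wall-clock time."""
    start = time.perf_counter()
    results = await asyncio.gather(
        fake_subagent("analysis", 0.1),
        fake_subagent("research", 0.3),
        fake_subagent("testing", 0.2),
    )
    return list(results), time.perf_counter() - start


results, elapsed = asyncio.run(run_all())
print(results)
# Elapsed time tracks the slowest task (0.3 s), not the 0.6 s sum.
print(f"elapsed: {elapsed:.2f}s")
```

Because none of the simulated tasks write shared state, they can overlap freely; tasks that edit the same files would need to be serialized, which is why the article recommends parallelism for read‑only work.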

Example invocation: @codebase_investigator Help me map out the call chain of the authentication module

Example batch: @generalist Update the License header in every file in the project

Security mechanisms

Tool isolation: each Subagent can only use explicitly authorized tools.

Recursive protection: Subagents cannot call other Subagents, preventing infinite loops and token explosion.

Policy Engine (optional): fine‑grained permission control, e.g., allow only git push for a specific Subagent.
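The first two mechanisms can be pictured with a short Python sketch. This is a conceptual model only, not Gemini CLI's implementation: `is_tool_allowed`, `dispatch`, and `NestedSubagentError` are hypothetical names, and matching the `*` and `mcp_*` wildcards with shell‑style globbing is an assumption.

```python
from fnmatch import fnmatchcase


def is_tool_allowed(tool: str, allowed_patterns: list[str]) -> bool:
    """Return True if the tool matches any explicitly authorized pattern."""
    return any(fnmatchcase(tool, pattern) for pattern in allowed_patterns)


class NestedSubagentError(RuntimeError):
    """Raised when a Subagent tries to dispatch another Subagent."""


def dispatch(target: str, caller_is_subagent: bool) -> str:
    """Allow dispatch only from the main Agent, so call depth never exceeds one."""
    if caller_is_subagent:
        raise NestedSubagentError(f"Subagents cannot call {target}")
    return f"dispatched {target}"


read_only = ["read_file", "grep_search", "glob"]
print(is_tool_allowed("read_file", read_only))            # explicitly listed
print(is_tool_allowed("write_file", read_only))           # not listed
print(is_tool_allowed("mcp_github_search", ["mcp_*"]))    # hypothetical MCP tool name
print(dispatch("codebase_investigator", caller_is_subagent=False))
```

Refusing every nested dispatch is the simplest way to rule out loops: with a maximum depth of one, no chain of Subagents can call back into itself.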

Current Subagents can be listed with the /agents command.

Gemini 3.1 Flash TTS

The latest text‑to‑speech model scores Elo 1211 on the Artificial Analysis TTS leaderboard, placing it in the “high quality, low price” quadrant.

Key innovations

Audio Tags: embed directives in text to control speaking style, scene direction, and speaker‑level specifics (tone, speed, accent).

Scene Direction: set environment and dialogue instructions, e.g., “a late‑night broadcast with a warm, low voice.”

Speaker‑level Specificity: assign independent audio profiles to each role; inline tags can switch profiles mid‑sentence.

Seamless Export: after tuning parameters in Google AI Studio, export directly to Gemini API code for reuse.
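As a rough picture of how a tagged, multi‑speaker script might be assembled before it is sent to the model, here is a small Python helper. Everything about the notation is an assumption for illustration: the `Scene:` line, the `Speaker: text` layout, and the square‑bracket tags are placeholders, so check the model's documentation for the actual Audio Tag syntax. `Line` and `build_script` are hypothetical names, and no API call is made.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """One utterance; `tag` is an optional inline style directive."""
    speaker: str
    text: str
    tag: str = ""


def build_script(scene: str, lines: list[Line]) -> str:
    """Join a scene direction and tagged speaker lines into one prompt string."""
    parts = [f"Scene: {scene}"]
    for line in lines:
        # Bracket notation is a placeholder, not confirmed Audio Tag syntax.
        prefix = f"[{line.tag}] " if line.tag else ""
        parts.append(f"{line.speaker}: {prefix}{line.text}")
    return "\n".join(parts)


script = build_script(
    "a late-night broadcast with a warm, low voice",
    [
        Line("Host", "Welcome back to the show."),
        Line("Guest", "Glad to be here.", tag="softly"),
    ],
)
print(script)
```

Keeping the script as structured data until the last step makes it easy to swap a speaker's tag without hand‑editing the prompt text.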

Additional highlights:

Supports 70+ languages, including Chinese.

Native multi‑role dialogue enables podcast and audiobook creation.

SynthID watermarks mark generated audio as AI‑created.

Model card: https://deepmind.google/models/model-cards/gemini-3-1-flash-audio/
