reverse‑SynthID: Open‑Source Tool for Detecting and Removing Google Gemini’s Invisible SynthID Watermark

reverse‑SynthID is an open‑source Python project that uses FFT‑based spectral analysis and multi‑resolution codebooks to detect Google Gemini’s invisible SynthID watermark with about 90% accuracy. Its V3 removal path cuts phase coherence at the carrier frequencies by about 91% while keeping PSNR above 43 dB.

AI Open-Source Efficiency Guide

Core Capabilities

Detection: SynthID watermark detector with about 90% accuracy.

Discovery: Reveals that carrier frequencies are resolution‑dependent.

Removal: V3 multi‑resolution spectral bypass reduces carrier energy by 75.8% and phase coherence by 91.4%, and yields >43 dB PSNR.

Installation & Deployment

Environment Requirements

Python 3.10+

Git

Installation Steps

# 1. Clone repository
git clone https://github.com/aloshdenny/reverse-SynthID.git
cd reverse-SynthID

# 2. Create virtual environment
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

Quick Usage Examples

1. Build Multi‑Resolution Codebook

Command‑line:

python src/extraction/synthid_bypass.py build-codebook \
    --black gemini_black \
    --white gemini_white \
    --watermarked gemini_random \
    --output artifacts/spectral_codebook_v3.npz

Python API:

from src.extraction.synthid_bypass import SpectralCodebook

codebook = SpectralCodebook()
codebook.extract_from_references('gemini_black', 'gemini_white')  # add 1024×1024 profile
codebook.build_from_watermarked('gemini_random')               # add 1536×2816 profile
codebook.save('artifacts/spectral_codebook_v3.npz')

2. Run V3 Watermark Removal (any resolution)

Python API:

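import numpy as np
from PIL import Image  # assumption: Pillow is installed; any loader that yields an RGB array works

from src.extraction.synthid_bypass import SynthIDBypass, SpectralCodebook

# Load the multi-resolution codebook built in step 1
codebook = SpectralCodebook()
codebook.load('artifacts/spectral_codebook_v3.npz')

# image_rgb is assumed to be an H×W×3 NumPy array in RGB channel order
image_rgb = np.array(Image.open('input.png').convert('RGB'))

bypass = SynthIDBypass()
result = bypass.bypass_v3(image_rgb, codebook, strength='aggressive')
print(f"PSNR: {result.psnr:.1f} dB")
print(f"Profile resolution: {result.details['profile_resolution']}")
print(f"Exact match: {result.details['exact_match']}")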

Command‑line:

python src/extraction/synthid_bypass.py bypass input.png output.png \
    --codebook artifacts/spectral_codebook_v3.npz \
    --strength aggressive

Strength levels: gentle (≈45 dB) → moderate → aggressive (recommended) → maximum.

3. Detect Watermark

python src/extraction/robust_extractor.py detect image.png \
    --codebook artifacts/codebook/robust_codebook.pkl

Core Findings

Finding 1: Resolution‑Dependent Watermark

SynthID embeds carrier frequencies at absolute positions that vary with image resolution. A codebook built for 1024×1024 cannot directly remove the watermark from a 1536×2816 image because the carriers differ.

Resolution   Top carrier   Coherence   Source
1024×1024    (9, 9)        100.0%      100 black + 100 white reference images
1536×2816    (768, 704)    99.6%       88 watermarked images
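One way to look for a resolution-specific carrier is to average FFT magnitudes over a stack of same-size images and take the strongest non-DC bin. The sketch below is illustrative only: `top_carrier` and `synthetic_stack` are hypothetical helpers, the inputs are synthetic cosines plus noise rather than Gemini output, and the project's own discovery script may work differently.

```python
import numpy as np


def top_carrier(images: np.ndarray) -> tuple[int, int]:
    """Return the (row, col) FFT bin with the highest mean magnitude.

    images: array of shape (n, H, W) holding one channel per image.
    """
    spectra = np.fft.fft2(images, axes=(-2, -1))
    mean_mag = np.abs(spectra).mean(axis=0)
    mean_mag[0, 0] = 0.0                       # ignore the DC term
    h = images.shape[1]
    half = mean_mag[: h // 2 + 1]              # upper half only (see note below)
    row, col = np.unravel_index(np.argmax(half), half.shape)
    return int(row), int(col)


def synthetic_stack(n: int, size: int, fy: int, fx: int, seed: int = 0) -> np.ndarray:
    """n noisy square images that all share one fixed-phase cosine carrier."""
    rng = np.random.default_rng(seed)
    y, x = np.mgrid[0:size, 0:size]
    carrier = np.cos(2 * np.pi * (fy * y + fx * x) / size + 0.7)
    return carrier + rng.normal(0.0, 1.0, (n, size, size))


# Carrier positions here are arbitrary test values, not measured SynthID carriers.
small = top_carrier(synthetic_stack(20, 64, 9, 9))
large = top_carrier(synthetic_stack(20, 96, 12, 20))
print(small, large)
```

Only the upper half of the spectrum is searched because a real-valued image has a conjugate-symmetric FFT, so every carrier shows up twice.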

Finding 2: Phase Consistency – Fixed Model‑Level Key

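The green channel carries the strongest watermark signal.

Cross‑image phase coherence at the carrier frequencies exceeds 99.5%, which is what a key fixed at the model level, rather than chosen per image, would produce.

Black‑white cross‑validation passes with |cos(phase_diff)| > 0.90, confirming that these are true carriers.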
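Cross-image phase coherence can be measured as the length of the mean unit phasor at one FFT bin, which is 1 when every image has the same phase and near 0 when phases are random. This is a common definition and an assumption here; the source does not give the project's exact formula. `phase_coherence` is a hypothetical helper and the coefficients are synthetic.

```python
import numpy as np


def phase_coherence(coeffs) -> float:
    """Mean resultant length (0..1) of the phases of complex FFT coefficients.

    coeffs: one complex coefficient per image, all taken at the same bin.
    """
    coeffs = np.asarray(coeffs, dtype=complex)
    unit = coeffs / np.abs(coeffs)        # assumes no coefficient is exactly zero
    return float(np.abs(unit.mean()))


rng = np.random.default_rng(0)
# 200 images whose phase at the bin barely moves (a fixed key would look like this)
fixed = 5.0 * np.exp(1j * (0.7 + rng.normal(0.0, 0.02, 200)))
# 200 images with unrelated phases (ordinary image content)
unrelated = np.exp(1j * rng.uniform(-np.pi, np.pi, 200))

print(f"fixed phase:     {phase_coherence(fixed):.3f}")
print(f"unrelated phase: {phase_coherence(unrelated):.3f}")
```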

Finding 3: Carrier Frequency Structure (1024×1024)

Carrier     Phase coherence   Black‑white consistency
(9, 9)      100.00%           1.000
(5, 5)      100.00%           0.993
(10, 11)    100.00%           0.997
(13, 6)     100.00%           0.821
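The black-white consistency figure can be read as |cos(phase_diff)| between the carrier phase measured on black references and on white references, using the 0.90 threshold quoted in Finding 2. A minimal sketch under that reading follows; the helper names are hypothetical. Under that rule a value such as the 0.821 listed for (13, 6) would not pass.

```python
import cmath
import math


def black_white_consistency(coeff_black: complex, coeff_white: complex) -> float:
    """|cos| of the phase difference between two FFT coefficients (0..1)."""
    phase_diff = cmath.phase(coeff_black) - cmath.phase(coeff_white)
    return abs(math.cos(phase_diff))


def passes_cross_validation(coeff_black: complex, coeff_white: complex,
                            threshold: float = 0.90) -> bool:
    """True when the two reference sets agree on the carrier phase."""
    return black_white_consistency(coeff_black, coeff_white) > threshold


same = cmath.rect(1.0, 0.7)                      # phase 0.7 rad
inverted = cmath.rect(2.0, 0.7 + math.pi)        # same carrier, sign flipped
quadrature = cmath.rect(1.0, 0.7 + math.pi / 2)  # 90 degrees off

print(passes_cross_validation(same, inverted))    # sign flip still counts
print(passes_cross_validation(same, quadrature))  # unrelated phase does not
```

Because of the absolute value, an exact sign flip between the two reference sets counts as consistent, while a 90 degree offset scores 0.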

Technical Highlights

Three‑Generation Bypass Comparison

V1 – JPEG compression (Q50) – PSNR 37 dB – ~11% phase drop – Baseline.

V2 – Multi‑stage transform (noise, color, frequency) – PSNR 27‑37 dB – ~0% confidence drop – Quality trade‑off.

V3 – Multi‑resolution spectral codebook subtraction – PSNR >43 dB – 91% phase‑coherence drop – Best.

V3 Pipeline (Multi‑Resolution Spectral Bypass)

Input image (any resolution)
   │
   ▼
codebook.get_profile(H, W)
   ├─ Exact match ─────► Fast path
   └─ No exact match ──► Spatial resize + subtraction (fallback)
   │
   ▼
Multi‑channel iterative subtraction (aggressive → moderate → gentle)
   │
   ▼
Anti‑aliasing → Output
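The first branch of the pipeline, exact profile or nearest fallback, can be sketched as a dictionary lookup keyed by resolution. `select_profile` is a hypothetical stand-in for `codebook.get_profile`, and the distance used to pick the nearest profile is an assumption; the source does not say how the project ranks candidates.

```python
def select_profile(profiles: dict, height: int, width: int):
    """Return (profile, (h, w), exact) for the requested resolution.

    profiles maps (height, width) to a stored profile object.
    """
    if not profiles:
        raise ValueError("codebook contains no profiles")
    key = (height, width)
    if key in profiles:
        return profiles[key], key, True           # fast path
    # Assumed fallback rule: smallest squared distance in (height, width)
    nearest = min(
        profiles,
        key=lambda hw: (hw[0] - height) ** 2 + (hw[1] - width) ** 2,
    )
    return profiles[nearest], nearest, False      # caller must resize


# Placeholder strings stand in for real spectral profiles.
profiles = {(1024, 1024): "profile_1024", (1536, 2816): "profile_1536x2816"}
print(select_profile(profiles, 1536, 2816))
print(select_profile(profiles, 768, 1024))
```

Returning the matched resolution and the `exact` flag mirrors the three values unpacked from `get_profile` in the module overview.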

Core Advantages

SpectralCodebook: Stores resolution‑specific profiles (carrier location, amplitude, phase).

Automatic resolution selection: Chooses the exact profile when one exists, otherwise the nearest.

Direct known‑signal subtraction: Weighted by phase consistency and cross‑validation confidence.

Multi‑channel scheduling: Captures residual watermark energy missed by earlier channels (channel weights G = 1.0, R = 0.85, B = 0.70).

Performance Results

Aggregated Metrics (1536×2816, aggressive)

PSNR: 43.5 dB

SSIM: 0.997

Carrier energy reduction: 75.8%

Phase‑coherence reduction (top 5 carriers): 91.4%
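For reference, PSNR is 10·log10(MAX² / MSE), where MAX is the largest possible pixel value. The helper below is a minimal sketch of that standard formula for 8-bit images; `psnr` is a hypothetical name and is not taken from the project's code.

```python
import numpy as np


def psnr(original, processed, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two same-shape images."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(processed, dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")                # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


reference = np.zeros((8, 8), dtype=np.uint8)
off_by_one = np.ones((8, 8), dtype=np.uint8)   # every pixel differs by 1 level
print(f"{psnr(reference, off_by_one):.2f} dB")
```

An error of one grey level on every pixel gives an MSE of 1 and therefore about 48.13 dB, which puts the reported 43.5 dB in context.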

Cross‑Resolution Quality

Resolution   Profile match   PSNR      SSIM
1536×2816    Exact match     44.9 dB   0.996
1024×1024    Exact match     39.8 dB   0.977
768×1024     Fallback        40.6 dB   0.994

Project Structure

reverse-SynthID/
├── src/
│   ├── extraction/
│   │   ├── synthid_bypass.py      # V1/V2/V3 bypass + multi‑resolution codebook
│   │   ├── robust_extractor.py    # Multi‑scale detection (~90% accuracy)
│   │   ├── watermark_remover.py   # Frequency‑domain removal
│   │   └── ...
│   └── analysis/
│       ├── deep_synthid_analysis.py  # FFT/phase analysis scripts
│       └── synthid_codebook_finder.py # Carrier frequency discovery
├── gemini_black/   # 100 pure‑black images (1024×1024)
├── gemini_white/   # 100 pure‑white images (1024×1024)
├── gemini_random/  # 88 watermarked images (1536×2816)
├── artifacts/
│   ├── spectral_codebook_v3.npz   # Multi‑resolution V3 codebook
│   ├── codebook/                 # Detection codebooks (.pkl)
│   └── visualizations/           # FFT, phase, carrier visualizations
└── requirements.txt

Technical Deep Dive – SynthID Reverse Engineering

┌──────────────────────────────────────────────────────────────┐
│            SynthID Encoder (Gemini internals)                │
├──────────────────────────────────────────────────────────────┤
│ 1. Choose resolution‑dependent carrier frequencies           │
│ 2. Assign fixed phase values per carrier                     │
│ 3. Neural encoder adds learned noise pattern to the image    │
│ 4. Watermark is imperceptible, spread across the spectrum    │
└──────────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────────┐
│            SynthID Decoder (Google internals)                │
├──────────────────────────────────────────────────────────────┤
│ 1. Extract noise residual (wavelet denoising)                │
│ 2. FFT → check phase at known carrier frequencies            │
│ 3. If phase matches expected value → watermark present       │
└──────────────────────────────────────────────────────────────┘
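The three decoder steps in the box can be sketched as a residual, an FFT and a phase comparison. This is a toy illustration, not Google's decoder or the project's detector: a 3×3 box blur is swapped in for wavelet denoising to keep the example NumPy-only, and the function names, carrier position and expected phase are all hypothetical.

```python
import numpy as np


def noise_residual(channel: np.ndarray) -> np.ndarray:
    """Channel minus its 3x3 box blur (simple stand-in for wavelet denoising)."""
    h, w = channel.shape
    padded = np.pad(channel, 1, mode="wrap")
    blurred = sum(
        padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)
    ) / 9.0
    return channel - blurred


def phase_match_score(channel: np.ndarray, carriers: dict) -> float:
    """Mean cosine of the phase error over the carriers (1.0 = perfect match).

    carriers maps (row, col) FFT bins to the expected phase in radians.
    """
    spectrum = np.fft.fft2(noise_residual(channel))
    errors = [np.angle(spectrum[r, c]) - phase for (r, c), phase in carriers.items()]
    return float(np.mean(np.cos(errors)))


rng = np.random.default_rng(1)
size = 64
y, x = np.mgrid[0:size, 0:size]
carriers = {(9, 9): 0.7}                         # hypothetical carrier and phase
marked = rng.normal(0.0, 1.0, (size, size)) + np.cos(
    2 * np.pi * (9 * y + 9 * x) / size + 0.7
)
score = phase_match_score(marked, carriers)
print(f"phase match score: {score:.3f}")
```

A wrapped box blur has a real, symmetric kernel, so subtracting it changes carrier magnitude but not carrier phase. Turning the score into a yes/no decision would need a threshold, which is a further assumption.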

V3 Subtraction Strategy

Confidence: phase consistency × cross‑validation consistency.

DC exclusion: A soft ramp near DC suppresses low‑frequency bias.

Per‑bin subtraction: wm_magnitude × confidence × removal_fraction × channel_weight.

Safety cap: Subtraction never exceeds 90‑95% of image energy per bin.

Multi‑channel: Aggressive → moderate → gentle scheduling captures residual energy.
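The per-bin rule above can be sketched for a single channel as follows. This is a hedged illustration rather than the project's implementation: `subtract_watermark`, its parameters and the linear DC ramp with radius `dc_radius` are assumptions, the cap is applied to bin magnitude, and the "watermark" is a synthetic cosine.

```python
import numpy as np


def subtract_watermark(channel, wm_magnitude, wm_phase, confidence,
                       removal_fraction=0.8, channel_weight=1.0,
                       cap=0.9, dc_radius=3.0):
    """Subtract a known spectral pattern from one image channel."""
    h, w = channel.shape
    spectrum = np.fft.fft2(channel)

    # Soft ramp: 0 at DC, rising linearly to 1 at dc_radius bins from DC
    fy = np.fft.fftfreq(h) * h
    fx = np.fft.fftfreq(w) * w
    radius = np.hypot(fy[:, None], fx[None, :])
    dc_ramp = np.clip(radius / dc_radius, 0.0, 1.0)

    # Per-bin amount: wm_magnitude x confidence x removal_fraction x channel_weight
    amount = wm_magnitude * confidence * removal_fraction * channel_weight * dc_ramp
    # Safety cap: never remove more than `cap` of what the bin actually holds
    amount = np.minimum(amount, cap * np.abs(spectrum))

    cleaned = spectrum - amount * np.exp(1j * wm_phase)
    return np.real(np.fft.ifft2(cleaned))


rng = np.random.default_rng(0)
size = 64
y, x = np.mgrid[0:size, 0:size]
pattern = 2.0 * np.cos(2 * np.pi * (9 * y + 9 * x) / size + 0.7)  # stand-in watermark
host = rng.normal(128.0, 1.0, (size, size))
marked = host + pattern

profile = np.fft.fft2(pattern)                  # known pattern spectrum
cleaned = subtract_watermark(marked, np.abs(profile), np.angle(profile),
                             confidence=np.ones((size, size)),
                             removal_fraction=1.0, cap=0.95)

before = np.abs(np.fft.fft2(marked)[9, 9])
after = np.abs(np.fft.fft2(cleaned)[9, 9])
print(f"carrier magnitude: {before:.0f} -> {after:.0f}")
```

A multi-channel pass would call this once per colour channel with a different `channel_weight`, for example the G = 1.0, R = 0.85, B = 0.70 weights listed under Core Advantages.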

Core Modules

synthid_bypass.py

SpectralCodebook – multi‑resolution watermark fingerprint:

codebook = SpectralCodebook()
codebook.extract_from_references('gemini_black', 'gemini_white')
codebook.build_from_watermarked('gemini_random')
codebook.save('codebook.npz')

# Later usage
codebook.load('codebook.npz')
profile, res, exact = codebook.get_profile(1536, 2816)  # automatic selection

SynthIDBypass – three generations of bypass:

bypass = SynthIDBypass()
result = bypass.bypass_simple(image, jpeg_quality=50)          # V1
result = bypass.bypass_v2(image, strength='aggressive')      # V2
result = bypass.bypass_v3(image, codebook, strength='aggressive')  # V3 (best)

robust_extractor.py

Multi‑scale detector (~90% accuracy):

# Assumes src/extraction is on the import path
from robust_extractor import RobustSynthIDExtractor

extractor = RobustSynthIDExtractor()
extractor.load_codebook('artifacts/codebook/robust_codebook.pkl')

# image: an image array loaded beforehand
result = extractor.detect_array(image)
print(f"Watermarked: {result.is_watermarked}, confidence: {result.confidence:.4f}")

References

- Project GitHub: https://github.com/aloshdenny/reverse-SynthID