reverse‑SynthID: Open‑Source Tool for Detecting and Removing Google Gemini’s Invisible SynthID Watermark
reverse‑SynthID is an open‑source Python project that uses FFT‑based spectral analysis and multi‑resolution codebooks to detect Google Gemini’s invisible SynthID watermark with about 90% accuracy. It can also remove the watermark: the V3 bypass reports a PSNR above 43 dB and a 91% drop in phase coherence.
Core Capabilities
Detection: ~90% accurate SynthID watermark detector.
Discovery: Reveals that carrier frequencies are resolution‑dependent.
Removal: V3 multi‑resolution spectral bypass reduces carrier energy by 75.8% and phase coherence by 91.4%, and yields >43 dB PSNR.
Installation & Deployment
Environment Requirements
Python 3.10+
Git
Installation Steps
# 1. Clone repository
git clone https://github.com/aloshdenny/reverse-SynthID.git
cd reverse-SynthID
# 2. Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
Quick Usage Examples
1. Build Multi‑Resolution Codebook
Command‑line:
python src/extraction/synthid_bypass.py build-codebook \
--black gemini_black \
--white gemini_white \
--watermarked gemini_random \
--output artifacts/spectral_codebook_v3.npz
Python API:
from src.extraction.synthid_bypass import SpectralCodebook
codebook = SpectralCodebook()
codebook.extract_from_references('gemini_black', 'gemini_white') # add 1024×1024 profile
codebook.build_from_watermarked('gemini_random') # add 1536×2816 profile
codebook.save('artifacts/spectral_codebook_v3.npz')
2. Run V3 Watermark Removal (any resolution)
Python API:
from src.extraction.synthid_bypass import SynthIDBypass, SpectralCodebook
codebook = SpectralCodebook()
codebook.load('artifacts/spectral_codebook_v3.npz')
bypass = SynthIDBypass()
# image_rgb is not defined in this snippet: it is the input image as an RGB array, loaded beforehand
result = bypass.bypass_v3(image_rgb, codebook, strength='aggressive')
print(f"PSNR: {result.psnr:.1f} dB")
print(f"Profile resolution: {result.details['profile_resolution']}")
print(f"Exact match: {result.details['exact_match']}")
Command‑line:
python src/extraction/synthid_bypass.py bypass input.png output.png \
--codebook artifacts/spectral_codebook_v3.npz \
--strength aggressive
Strength levels: gentle (≈45 dB) → moderate → aggressive (recommended) → maximum.
3. Detect Watermark
python src/extraction/robust_extractor.py detect image.png \
--codebook artifacts/codebook/robust_codebook.pkl
Core Findings
Finding 1: Resolution‑Dependent Watermark
SynthID embeds carrier frequencies at absolute positions that vary with image resolution. A codebook built for 1024×1024 cannot directly remove the watermark from a 1536×2816 image because the carriers differ.
Resolution: 1024×1024 – Top carrier (9, 9) – Coherence 100.0% – Source: 100 black + 100 white reference images.
Resolution: 1536×2816 – Top carrier (768, 704) – Coherence 99.6% – Source: 88 watermarked images.
Finding 2: Phase Consistency – Fixed Model‑Level Key
Green channel carries the strongest watermark signal.
Cross‑image phase coherence at carrier frequencies exceeds 99.5%.
Black‑white cross‑validation passes with |cos(phase_diff)| > 0.90, confirming true carriers.
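Cross‑image phase coherence can be read as the length of the mean unit phasor at each FFT bin: close to 1.0 when every image carries the same phase there, close to 0 when phases are random. The sketch below illustrates that idea on synthetic data. It is not the project's code: the function names phase_coherence and bw_consistency, the 64×64 size, and the test carrier are assumptions made for the example.

```python
import numpy as np

def phase_coherence(images):
    """Per-bin coherence |mean(exp(i*phase))| over a stack of shape (N, H, W)."""
    spectra = np.fft.fft2(images, axes=(-2, -1))
    phasors = spectra / (np.abs(spectra) + 1e-12)  # unit-length phasors
    return np.abs(phasors.mean(axis=0))

def bw_consistency(phase_black, phase_white):
    """|cos(phase difference)|: 1.0 when two phases agree up to a sign flip."""
    return np.abs(np.cos(phase_black - phase_white))

# Synthetic stand-in: the same weak carrier at bin (9, 9), buried in fresh noise
# for every image. Real inputs would be a stack of watermarked images.
rng = np.random.default_rng(0)
H = W = 64
yy, xx = np.mgrid[0:H, 0:W]
carrier = 0.5 * np.cos(2 * np.pi * (9 * yy / H + 9 * xx / W) + 0.7)
stack = np.stack([rng.normal(0.0, 1.0, (H, W)) + carrier for _ in range(50)])

coherence = phase_coherence(stack)
print(f"carrier bin (9, 9):   {coherence[9, 9]:.3f}")
print(f"ordinary bin (20, 3): {coherence[20, 3]:.3f}")
```

The carrier bin stays near 1.0 because the fixed phase survives averaging, while an ordinary bin averages toward 0 as more images are added.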
Finding 3: Carrier Frequency Structure (1024×1024)
(9, 9) – Phase coherence 100.00% – Black‑white consistency 1.000
(5, 5) – Phase coherence 100.00% – Black‑white consistency 0.993
(10, 11) – Phase coherence 100.00% – Black‑white consistency 0.997
(13, 6) – Phase coherence 100.00% – Black‑white consistency 0.821
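One way a carrier table like the one above could be used is to compare an image's FFT phase at those bins against stored reference phases. The following is a rough sketch of that check, not the project's detector: the function carrier_match_score is hypothetical, the expected phases are made up, and the sketch works on the raw image rather than on a denoised residual.

```python
import numpy as np

def carrier_match_score(image, carriers, expected_phases):
    """Mean cos(observed - expected) phase over carrier bins; near 1.0 on a match."""
    spectrum = np.fft.fft2(image)
    observed = np.array([np.angle(spectrum[ky, kx]) for ky, kx in carriers])
    return float(np.mean(np.cos(observed - np.asarray(expected_phases))))

# Bin positions taken from the list above; the phases are invented for the demo.
carriers = [(9, 9), (5, 5), (10, 11), (13, 6)]
expected = [0.7, -1.2, 2.0, 0.1]

rng = np.random.default_rng(1)
H = W = 128
yy, xx = np.mgrid[0:H, 0:W]
clean = rng.normal(0.0, 1.0, (H, W))
# Synthetic "watermark": one cosine per carrier with the expected phase.
mark = sum(
    0.3 * np.cos(2 * np.pi * (ky * yy / H + kx * xx / W) + phi)
    for (ky, kx), phi in zip(carriers, expected)
)
marked = clean + mark

marked_score = carrier_match_score(marked, carriers, expected)
clean_score = carrier_match_score(clean, carriers, expected)
print(f"marked: {marked_score:.3f}")
print(f"clean:  {clean_score:.3f}")
```

A real detector would need a decision threshold calibrated on known watermarked and clean images; none is proposed here.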
Technical Highlights
Three‑Generation Bypass Comparison
V1 – JPEG compression (Q50) – PSNR 37 dB – ~11% phase drop – Baseline.
V2 – Multi‑stage transform (noise, color, frequency) – PSNR 27‑37 dB – ~0% confidence drop – Quality trade‑off.
V3 – Multi‑resolution spectral codebook subtraction – PSNR >43 dB – 91% phase‑coherence drop – Best.
V3 Pipeline (Multi‑Resolution Spectral Bypass)
Input image (any resolution)
        │
        ▼
codebook.get_profile(H, W) ──► Exact match?
        ├─ Yes ──► fast path
        └─ No  ──► Spatial resize + subtraction (fallback)
        │
        ▼
Multi‑channel iterative subtraction (aggressive → moderate → gentle)
        │
        ▼
Anti‑aliasing → Output
Core Advantages
SpectralCodebook: Stores resolution‑specific profiles (carrier location, amplitude, phase).
Automatic resolution selection: Chooses exact or nearest profile.
Direct known‑signal subtraction: Weighted by phase consistency and cross‑validation confidence.
Multi‑channel scheduling: Captures residual watermark energy missed by earlier channels (G = 1.0, R = 0.85, B = 0.70).
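The "exact or nearest profile" selection can be pictured as a dictionary lookup keyed by resolution with a fallback. The sketch below is an assumption about how such a lookup might work, not the project's implementation: select_profile is a hypothetical name, and "nearest" is defined here as the smallest log‑scale difference in height and width, which the source does not specify. The return shape mirrors the (profile, resolution, exact) tuple that codebook.get_profile returns in the article's examples.

```python
import math

def select_profile(profiles, height, width):
    """Return (profile, resolution, exact) for the exact or closest stored size."""
    key = (height, width)
    if key in profiles:
        return profiles[key], key, True

    def log_distance(res):
        # Assumed metric: sum of absolute log-ratios of height and width.
        h, w = res
        return abs(math.log(h / height)) + abs(math.log(w / width))

    nearest = min(profiles, key=log_distance)
    return profiles[nearest], nearest, False

# Toy codebook holding only the top carrier reported for each resolution.
profiles = {
    (1024, 1024): {"top_carrier": (9, 9)},
    (1536, 2816): {"top_carrier": (768, 704)},
}

print(select_profile(profiles, 1536, 2816)[1:])  # stored size: exact match
print(select_profile(profiles, 768, 1024)[1:])   # unseen size: nearest profile
```

With an inexact match the carriers sit at different absolute bins, which is why the pipeline falls back to spatial resizing instead of subtracting the stored profile directly.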
Performance Results
Aggregated Metrics (1536×2816, aggressive)
PSNR: 43.5 dB
SSIM: 0.997
Carrier energy reduction: 75.8%
Phase‑coherence reduction (top 5 carriers): 91.4%
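For readers who want to reproduce metrics of this kind, PSNR and carrier energy reduction can be computed as below. This is a generic sketch, not the project's evaluation code: the function names are assumptions, carrier energy is taken to be the summed squared FFT magnitude at the carrier bins, and the inputs are synthetic single‑channel arrays.

```python
import numpy as np

def psnr(original, processed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    diff = original.astype(np.float64) - processed.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))

def carrier_energy_reduction(before, after, carriers):
    """Fractional drop in summed |FFT|^2 over the given (ky, kx) bins."""
    spec_before = np.fft.fft2(before.astype(np.float64))
    spec_after = np.fft.fft2(after.astype(np.float64))
    energy_before = sum(abs(spec_before[ky, kx]) ** 2 for ky, kx in carriers)
    energy_after = sum(abs(spec_after[ky, kx]) ** 2 for ky, kx in carriers)
    return float(1.0 - energy_after / energy_before)

# PSNR demo: every pixel off by one grey level gives an MSE of exactly 1.
rng = np.random.default_rng(0)
a = rng.integers(0, 200, (32, 32)).astype(np.uint8)
b = a + 1
psnr_db = psnr(a, b)
print(f"PSNR: {psnr_db:.2f} dB")

# Energy demo: halving a carrier's amplitude removes 75% of its energy.
H = W = 64
yy, xx = np.mgrid[0:H, 0:W]
wave = np.cos(2 * np.pi * (9 * yy / H + 9 * xx / W))
reduction = carrier_energy_reduction(100 + 10 * wave, 100 + 5 * wave, [(9, 9)])
print(f"Carrier energy reduction: {reduction:.1%}")
```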
Cross‑Resolution Quality
1536×2816 – Exact match – PSNR 44.9 dB – SSIM 0.996
1024×1024 – Exact match – PSNR 39.8 dB – SSIM 0.977
768×1024 – Fallback – PSNR 40.6 dB – SSIM 0.994
Project Structure
reverse-SynthID/
├── src/
│   ├── extraction/
│   │   ├── synthid_bypass.py            # V1/V2/V3 bypass + multi‑resolution codebook
│   │   ├── robust_extractor.py          # Multi‑scale detection (~90% accuracy)
│   │   ├── watermark_remover.py         # Frequency‑domain removal
│   │   └── ...
│   └── analysis/
│       ├── deep_synthid_analysis.py     # FFT/phase analysis scripts
│       └── synthid_codebook_finder.py   # Carrier frequency discovery
├── gemini_black/                        # 100 pure‑black images (1024×1024)
├── gemini_white/                        # 100 pure‑white images (1024×1024)
├── gemini_random/                       # 88 watermarked images (1536×2816)
├── artifacts/
│   ├── spectral_codebook_v3.npz         # Multi‑resolution V3 codebook
│   ├── codebook/                        # Detection codebooks (.pkl)
│   └── visualizations/                  # FFT, phase, carrier visualizations
└── requirements.txt
Technical Deep Dive – SynthID Reverse Engineering
┌──────────────────────────────────────────────────────────────┐
│ SynthID Encoder (Gemini internals) │
├──────────────────────────────────────────────────────────────┤
│ 1. Choose resolution‑dependent carrier frequencies │
│ 2. Assign fixed phase values per carrier │
│ 3. Neural encoder adds learned noise pattern to the image │
│ 4. Watermark is imperceptible, spread across the spectrum │
└──────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────┐
│ SynthID Decoder (Google internals) │
├──────────────────────────────────────────────────────────────┤
│ 1. Extract noise residual (wavelet denoising) │
│ 2. FFT → check phase at known carrier frequencies │
│ 3. If phase matches expected value → watermark present │
└──────────────────────────────────────────────────────────────┘
V3 Subtraction Strategy
Confidence = phase consistency × cross‑validation consistency.
DC exclusion: Soft‑ramp suppresses low‑frequency bias.
Per‑bin subtraction: wm_magnitude × confidence × removal_fraction × channel_weight.
Safety cap: Subtraction never exceeds 90‑95% of image energy per bin.
Multi‑channel: Aggressive → moderate → gentle scheduling captures residual energy.
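The strategy listed above can be sketched for a single channel as follows. This is an illustrative reading, not the project's code: subtract_watermark and its parameters are hypothetical, the soft ramp is a simple linear ramp over dc_radius bins, the cap is applied to per‑bin magnitude rather than energy, and the watermark spectrum is assumed to be known exactly.

```python
import numpy as np

def subtract_watermark(image, wm_spectrum, confidence, removal_fraction=0.8,
                       channel_weight=1.0, cap=0.9, dc_radius=3.0):
    """Subtract a known watermark spectrum from one channel, bin by bin."""
    H, W = image.shape
    spectrum = np.fft.fft2(image)

    # DC exclusion: weight 0 at DC, rising linearly to 1 at dc_radius bins away.
    fy = np.fft.fftfreq(H) * H
    fx = np.fft.fftfreq(W) * W
    radius = np.hypot(fy[:, None], fx[None, :])
    dc_ramp = np.clip(radius / dc_radius, 0.0, 1.0)

    # Per-bin amount: wm_magnitude x confidence x removal_fraction x channel_weight.
    amount = (np.abs(wm_spectrum) * confidence * removal_fraction
              * channel_weight * dc_ramp)
    # Safety cap: never remove more than `cap` of the image's magnitude in a bin.
    amount = np.minimum(amount, cap * np.abs(spectrum))

    # Subtract along the watermark's phase direction, then return to pixels.
    cleaned = spectrum - amount * np.exp(1j * np.angle(wm_spectrum))
    return np.real(np.fft.ifft2(cleaned))

# Synthetic demo: noise plus one strong carrier at bin (9, 9).
rng = np.random.default_rng(2)
H = W = 64
yy, xx = np.mgrid[0:H, 0:W]
watermark = 2.0 * np.cos(2 * np.pi * (9 * yy / H + 9 * xx / W) + 0.7)
image = rng.normal(0.0, 1.0, (H, W)) + watermark

confidence = 1.0 * 0.993  # phase consistency x cross-validation consistency
cleaned = subtract_watermark(image, np.fft.fft2(watermark), confidence,
                             removal_fraction=1.0)

before = abs(np.fft.fft2(image)[9, 9])
after = abs(np.fft.fft2(cleaned)[9, 9])
print(f"carrier magnitude: {before:.0f} -> {after:.0f}")
```

In this demo the cap is the binding limit at the carrier bin, so roughly a tenth of the carrier magnitude remains after one pass. That leftover is the kind of residual the multi‑channel, multi‑pass scheduling is described as targeting.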
Core Modules
synthid_bypass.py
SpectralCodebook – multi‑resolution watermark fingerprint:
codebook = SpectralCodebook()
codebook.extract_from_references('gemini_black', 'gemini_white')
codebook.build_from_watermarked('gemini_random')
codebook.save('codebook.npz')
# Later usage
codebook.load('codebook.npz')
profile, res, exact = codebook.get_profile(1536, 2816) # automatic selection
SynthIDBypass – three generations of bypass:
bypass = SynthIDBypass()
result = bypass.bypass_simple(image, jpeg_quality=50) # V1
result = bypass.bypass_v2(image, strength='aggressive') # V2
result = bypass.bypass_v3(image, codebook, strength='aggressive') # V3 (best)
robust_extractor.py
Multi‑scale detector (~90% accuracy):
from robust_extractor import RobustSynthIDExtractor
extractor = RobustSynthIDExtractor()
extractor.load_codebook('artifacts/codebook/robust_codebook.pkl')
result = extractor.detect_array(image)
print(f"Watermarked: {result.is_watermarked}, confidence: {result.confidence:.4f}")
References
- Project GitHub: https://github.com/aloshdenny/reverse-SynthID
