
How to Build Robust Dark Watermarks and Boost OCR Accuracy in Web Apps

This article walks through the evolution of watermark techniques, demonstrates how to harden a front‑end watermark against deletion, invisibility, and covering using MutationObserver and canvas, introduces a low‑visibility dark watermark with decode logic, and details OCR integration and optimization to improve recognition accuracy in screenshot‑search scenarios.

Architect

Nov 2, 2024


Background

Watermarks are widely used in internal platforms to embed user‑identifying text (e.g., username or user ID) as a semi‑transparent overlay, helping trace data leaks when screenshots are taken.

Watermark Evolution

Early watermarks were fragile: a single DOM edit was enough to remove them. Successive versions added resistance to deletion, hiding, and covering, and finally a low‑visibility "dark" mode that is hard to notice, turning the watermark into a layered defense.

Basic Watermark Implementation (V1)

The simplest version draws the watermark text onto a canvas, converts it to an image, and sets it as a tiled background on the page.
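A minimal sketch of this step, assuming a browser environment. The function names (`watermarkLayerCss`, `drawWatermarkTile`, `mountWatermark`) and the tile size, font, rotation, opacity, and z-index are illustrative choices, not values from the article.

```typescript
// Hypothetical helper: CSS for a full-page, click-through layer tiled with the image.
function watermarkLayerCss(tileUrl: string): string {
  return [
    "position:fixed",
    "inset:0",
    "pointer-events:none", // let clicks pass through to the page
    "z-index:9999", // illustrative value
    `background-image:url("${tileUrl}")`,
    "background-repeat:repeat",
  ].join(";");
}

// Draw the text once on an off-screen canvas and export it as a PNG data URL.
function drawWatermarkTile(text: string, size = 240): string {
  const canvas = document.createElement("canvas");
  canvas.width = size;
  canvas.height = size;
  const ctx = canvas.getContext("2d");
  if (!ctx) throw new Error("2D canvas context unavailable");
  ctx.translate(size / 2, size / 2);
  ctx.rotate(-Math.PI / 6); // tilt the text by 30 degrees
  ctx.font = "16px sans-serif";
  ctx.fillStyle = "rgba(0, 0, 0, 0.15)"; // semi-transparent text
  ctx.textAlign = "center";
  ctx.fillText(text, 0, 0);
  return canvas.toDataURL("image/png");
}

// Create the overlay div and attach it to the page.
function mountWatermark(text: string): HTMLDivElement {
  const layer = document.createElement("div");
  layer.style.cssText = watermarkLayerCss(drawWatermarkTile(text));
  document.body.appendChild(layer);
  return layer;
}
```

Calling `mountWatermark("user-42")` once after login would tile the text across the viewport; the user ID here is a placeholder.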

Anti‑Deletion Watermark (V2)

Because the watermark resides in a div, removing that element from the DOM instantly erases the watermark. To prevent this, the article introduces MutationObserver to monitor node removals and automatically re‑insert the deleted element.
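The re-insertion logic might look roughly like the sketch below. It assumes the watermark is a direct child of the observed parent; `wasNodeRemoved` and `guardAgainstRemoval` are hypothetical names, and `RemovalRecord` is a reduced view of `MutationRecord` so the check can be exercised without a DOM.

```typescript
// Minimal shape of the MutationRecord fields this sketch reads.
interface RemovalRecord {
  removedNodes: ArrayLike<unknown>;
}

// True if any record reports that `target` itself was removed.
function wasNodeRemoved(records: ArrayLike<RemovalRecord>, target: unknown): boolean {
  for (let i = 0; i < records.length; i++) {
    const removed = records[i].removedNodes;
    for (let j = 0; j < removed.length; j++) {
      if (removed[j] === target) return true;
    }
  }
  return false;
}

// Re-append the watermark whenever it is removed from its parent.
function guardAgainstRemoval(watermark: HTMLElement, parent: HTMLElement): MutationObserver {
  const observer = new MutationObserver((records) => {
    if (wasNodeRemoved(records, watermark)) {
      parent.appendChild(watermark);
    }
  });
  // childList reports additions and removals of direct children only.
  observer.observe(parent, { childList: true });
  return observer;
}
```

Because only direct children are observed, removing an ancestor of the watermark would not be caught by this sketch; observing `document.body` with `subtree: true` is one way to widen the net, at some performance cost.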

Anti‑Invisibility Watermark (V3)

Even if the element cannot be removed, changing its style (e.g., setting opacity:0) makes it invisible. The same MutationObserver watches attribute changes and restores the original style whenever it is altered.
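A sketch of the attribute guard, under the assumption that the watermark is styled through its own `style` and `class` attributes. The names `TrustedAttrs`, `tamperedAttributes`, and `guardAgainstRestyle` are hypothetical, and the `hidden` attribute is included as one more obvious way to hide an element.

```typescript
interface TrustedAttrs {
  style: string | null;
  className: string | null;
}

// Pure check: which guarded attributes differ from the trusted snapshot?
function tamperedAttributes(
  current: TrustedAttrs & { hidden: boolean },
  trusted: TrustedAttrs
): string[] {
  const changed: string[] = [];
  if (current.style !== trusted.style) changed.push("style");
  if (current.className !== trusted.className) changed.push("class");
  if (current.hidden) changed.push("hidden");
  return changed;
}

// Snapshot the attributes once, then restore them whenever they drift.
function guardAgainstRestyle(watermark: HTMLElement): MutationObserver {
  const trusted: TrustedAttrs = {
    style: watermark.getAttribute("style"),
    className: watermark.getAttribute("class"),
  };
  const restore = (attr: string, value: string | null): void => {
    if (value === null) watermark.removeAttribute(attr);
    else watermark.setAttribute(attr, value);
  };
  const observer = new MutationObserver(() => {
    const changed = tamperedAttributes(
      {
        style: watermark.getAttribute("style"),
        className: watermark.getAttribute("class"),
        hidden: watermark.hasAttribute("hidden"),
      },
      trusted
    );
    changed.forEach((attr) => {
      if (attr === "style") restore("style", trusted.style);
      else if (attr === "class") restore("class", trusted.className);
      else watermark.removeAttribute("hidden");
    });
  });
  observer.observe(watermark, {
    attributes: true,
    attributeFilter: ["style", "class", "hidden"],
  });
  return observer;
}
```

Restoring an attribute triggers the observer again, but the second pass finds nothing changed and writes nothing, so the guard does not loop. Rules injected through a separate stylesheet are outside what this sketch detects.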

Anti‑Cover Watermark (V4)

Attackers can place another element with a higher z-index over the watermark, effectively covering it. The solution monitors z-index changes on all page elements and pins the watermark's z-index at 2147483646, one below the maximum 32‑bit signed integer (2147483647) that browsers commonly treat as the z-index ceiling, so that it remains on top.
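One way to sketch this guard is shown below. It assumes that any other element whose z-index reaches the watermark's value gets pushed just below it; that capping policy, the sweep over `body *`, and the names `cappedZIndex` and `enforceTopLayer` are assumptions of this example rather than details from the article.

```typescript
const MAX_Z_INDEX = 2147483647; // largest 32-bit signed integer
const WATERMARK_Z_INDEX = MAX_Z_INDEX - 1;

// Return a replacement z-index for a non-watermark element, or null if none is needed.
function cappedZIndex(computed: string): number | null {
  const value = parseInt(computed, 10); // "auto" parses to NaN
  if (isNaN(value) || value < WATERMARK_Z_INDEX) return null;
  return WATERMARK_Z_INDEX - 1;
}

function enforceTopLayer(watermark: HTMLElement): MutationObserver {
  const target = String(WATERMARK_Z_INDEX);
  const sweep = (): void => {
    // Write only when something differs, so the observer settles instead of looping.
    if (watermark.style.zIndex !== target) {
      watermark.style.setProperty("z-index", target, "important");
    }
    document.querySelectorAll<HTMLElement>("body *").forEach((el) => {
      if (el === watermark) return;
      const capped = cappedZIndex(getComputedStyle(el).zIndex);
      if (capped !== null) {
        el.style.setProperty("z-index", String(capped), "important");
      }
    });
  };
  sweep();
  const observer = new MutationObserver(sweep);
  observer.observe(document.body, {
    subtree: true,
    childList: true,
    attributes: true,
    attributeFilter: ["style", "class"],
  });
  return observer;
}
```

Sweeping every element on each mutation is expensive on large pages, so a real implementation would likely debounce the sweep. If the watermark's style attribute is also being restored by an attribute observer, set the z-index before capturing the trusted style, otherwise the two guards will keep undoing each other.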

Dark Watermark (V5)

To make the watermark invisible to the human eye while still being recoverable, the article uses a low‑opacity technique that encodes data in the RGBA channels: two channels are set to 0 or 255, leaving only one channel to carry the hidden information. The hidden data can later be extracted with a custom decodeWatermark function.
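The article's `decodeWatermark` is not reproduced here. As a stand-in, the sketch below shows one plausible reading of the idea: keep a single channel (red, by assumption), stretch its small variations to the full 0 to 255 range, and zero the other two color channels. The contrast-stretch approach and the choice of red are assumptions of this example.

```typescript
// Hypothetical decoder: keep the red channel, stretch it to full range,
// and zero green and blue. Input and output are flat RGBA byte arrays.
function decodeWatermark(rgba: Uint8ClampedArray): Uint8ClampedArray {
  let min = 255;
  let max = 0;
  for (let i = 0; i < rgba.length; i += 4) {
    if (rgba[i] < min) min = rgba[i];
    if (rgba[i] > max) max = rgba[i];
  }
  const range = max - min || 1; // avoid division by zero on a flat image
  const out = new Uint8ClampedArray(rgba.length);
  for (let i = 0; i < rgba.length; i += 4) {
    out[i] = ((rgba[i] - min) * 255) / range; // stretched red
    out[i + 1] = 0; // green off
    out[i + 2] = 0; // blue off
    out[i + 3] = 255; // fully opaque
  }
  return out;
}
```

In a browser, the input would typically come from `ctx.getImageData(0, 0, width, height).data` on a canvas holding the screenshot, and the result would be written back with `putImageData`. A watermark that shifts red by only two levels (for example 248 against a 250 background) becomes a full black-versus-red contrast after decoding.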

Dark Watermark Decoding Scheme

Three decoding approaches were evaluated:

Color‑channel masking (used in the demo).

Image binarization (produces black‑white output).

Composite‑mode overlay mask (chosen for final implementation).

The chosen method sets the canvas 2D context's globalCompositeOperation to 'overlay' and blends a solid‑color mask over the screenshot to separate the hidden watermark from the background.
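A minimal sketch of the overlay pass, assuming the screenshot has already been drawn onto the canvas. `applyOverlayMask` and its `passes` parameter are hypothetical; repeating the fill is an assumption of this example, based on the fact that each overlay pass stretches contrast further.

```typescript
// Fill the canvas with a solid color using the "overlay" blend mode.
// Assumes the screenshot is already drawn on the context's canvas.
function applyOverlayMask(
  ctx: CanvasRenderingContext2D,
  width: number,
  height: number,
  color: string,
  passes = 1
): void {
  const previous = ctx.globalCompositeOperation;
  ctx.globalCompositeOperation = "overlay";
  ctx.fillStyle = color;
  for (let i = 0; i < passes; i++) {
    ctx.fillRect(0, 0, width, height);
  }
  // Restore the blend mode so later drawing is unaffected.
  ctx.globalCompositeOperation = previous;
}
```

A typical call would be `applyOverlayMask(ctx, canvas.width, canvas.height, "#000", 3)` after `ctx.drawImage(screenshot, 0, 0)`; the pass count of 3 is only a starting point to tune by eye.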

OCR Integration

After extracting the dark watermark, the workflow needs to read textual information (e.g., page ID) from the screenshot. The article compares three OCR solutions:

Local tesseract.js – free and front‑end only but lower accuracy.

Feishu OCR API – high accuracy but requires registration, possible fees, and business negotiation.

Company internal OCR service – stable, fast, and supports region‑based cropping.

The final decision is to use the internal OCR service as the primary engine, with tesseract.js as a fallback.
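The primary-plus-fallback arrangement can be sketched as an ordered list of engines. `OcrEngine` and `recognizeWithFallback` are hypothetical names, and treating an empty result as a failure is an assumption of this example; each real engine would be a thin wrapper around the internal service or tesseract.js.

```typescript
// An OCR engine takes a base64 image and resolves to the recognized text.
type OcrEngine = (imageBase64: string) => Promise<string>;

// Try each engine in order; fall through on errors or empty results.
async function recognizeWithFallback(
  imageBase64: string,
  engines: OcrEngine[]
): Promise<string> {
  let lastError: unknown = new Error("no OCR engine configured");
  for (const engine of engines) {
    try {
      const text = (await engine(imageBase64)).trim();
      if (text.length > 0) return text;
      lastError = new Error("OCR engine returned empty text");
    } catch (err) {
      lastError = err; // remember the failure and try the next engine
    }
  }
  throw lastError;
}
```

With this shape, the call site would pass the internal-service wrapper first and the tesseract.js wrapper second, so the local engine only runs when the primary one fails.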

Improving OCR Accuracy

Two main bottlenecks were identified:

Background noise interfering with watermark readability (already mitigated by the dark‑watermark improvements).

Large or noisy images slowing down OCR and reducing precision.

To address the second issue, the article introduces a region‑selection step using react-image-crop. Users crop the area of interest, producing a smaller base64 image that is fed to the OCR engine, dramatically reducing noise and processing time.
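The cropping step might be implemented along the lines below. The sketch assumes the crop rectangle arrives in pixels measured on the displayed image, which then has to be scaled to the image's natural size before drawing; `CropRect`, `toNaturalCrop`, and `cropToBase64` are hypothetical names, and the react-image-crop component itself is left out to keep the example self-contained.

```typescript
// Crop rectangle in pixels, measured on the image as displayed.
interface CropRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Convert a crop measured on the displayed image into source-pixel coordinates.
function toNaturalCrop(
  crop: CropRect,
  displayedWidth: number,
  displayedHeight: number,
  naturalWidth: number,
  naturalHeight: number
): CropRect {
  const sx = naturalWidth / displayedWidth;
  const sy = naturalHeight / displayedHeight;
  return {
    x: Math.round(crop.x * sx),
    y: Math.round(crop.y * sy),
    width: Math.round(crop.width * sx),
    height: Math.round(crop.height * sy),
  };
}

// Copy the selected region onto a small canvas and export it as a data URL.
function cropToBase64(image: HTMLImageElement, crop: CropRect): string {
  const src = toNaturalCrop(
    crop, image.width, image.height, image.naturalWidth, image.naturalHeight
  );
  const canvas = document.createElement("canvas");
  canvas.width = src.width;
  canvas.height = src.height;
  const ctx = canvas.getContext("2d");
  if (!ctx) throw new Error("2D canvas context unavailable");
  ctx.drawImage(image, src.x, src.y, src.width, src.height, 0, 0, src.width, src.height);
  return canvas.toDataURL("image/png");
}
```

Scaling matters because a screenshot shown at half size would otherwise be cropped at the wrong offset and at half the resolution the OCR engine could have used.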

Custom Contrast Adjustment

The overlay decoding works differently on light and dark backgrounds. For light backgrounds, a black (#000) overlay accentuates the watermark; for dark backgrounds, a white (#fff) overlay is needed. The implementation therefore lets users select the appropriate base color; automatic detection of the dominant tone is left for a future version.
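The light-versus-dark rule follows from the standard per-channel overlay blend formula, which the sketch below reproduces on values in the range 0 to 1. The `pickOverlayColor` function is a hypothetical take on the automatic detection mentioned above; its mean-luminance test and the threshold of 128 are assumptions of this example, not the article's design.

```typescript
// Overlay blend for one channel, with backdrop and source in [0, 1].
function overlayChannel(backdrop: number, source: number): number {
  return backdrop <= 0.5
    ? 2 * source * backdrop
    : 1 - 2 * (1 - source) * (1 - backdrop);
}

// Hypothetical auto-detection: black overlay for light images, white for dark ones.
function pickOverlayColor(rgba: Uint8ClampedArray): "#000" | "#fff" {
  const pixels = rgba.length / 4;
  let sum = 0;
  for (let i = 0; i < rgba.length; i += 4) {
    // Rough luma from the Rec. 709 weights, applied to raw channel values.
    sum += 0.2126 * rgba[i] + 0.7152 * rgba[i + 1] + 0.0722 * rgba[i + 2];
  }
  return pixels > 0 && sum / pixels >= 128 ? "#000" : "#fff";
}
```

On a light backdrop, a black source maps 1.0 to 1.0 and 0.98 to 0.96, so a gap of 0.02 between background and watermark doubles to 0.04. On a dark backdrop, a white source does the mirror image, mapping 0 to 0 and 0.02 to 0.04. Picking the wrong color flattens the image toward pure black or pure white instead.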

Final Thoughts

By combining hardened front‑end watermarking, a robust dark‑watermark decoding pipeline, and an optimized OCR workflow with region cropping, the screenshot‑search feature becomes reliable and efficient across diverse application scenarios, turning a seemingly simple watermark into a powerful diagnostic tool.


