How to Embed Invisible Zero‑Width Watermarks in Web Pages to Trace Leaks

Learn how to use Unicode zero‑width characters as hidden watermarks in web content, encode employee IDs into plain text, detect leaks with a simple JavaScript tool, and understand the encoding/decoding process, limitations, and removal methods for this low‑cost defensive technique.

Rare Earth Juejin Tech Community

When a confidential internal document was leaked to a competitor, the boss demanded an investigation. The leaked text looked ordinary, but if a hidden identifier has been embedded in it with zero‑width Unicode characters, the leaker's employee ID can be recovered from the text itself, with no visible change to the content.

What Are Zero‑Width Characters?

Unicode defines several characters that occupy no visual width and render no pixels. The most common are:

\u200b – Zero Width Space

\u200c – Zero Width Non‑Joiner

\u200d – Zero Width Joiner

In a browser console you can see they are invisible:

console.log('A' + '\u200b' + 'B'); // displays as "AB" (the middle character is invisible)
console.log(('A' + '\u200b' + 'B').length); // prints 3
(Figure: console output showing the invisible character)
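Because these characters are invisible, it can help to make them visible while debugging. The following is a minimal sketch; revealZeroWidth is a hypothetical helper name, not something from the original article, and it only handles the three characters listed above.

```javascript
// Hypothetical debugging helper: replace each zero-width character
// with a visible \uXXXX escape so it can be seen in logs.
function revealZeroWidth(text) {
  return text.replace(/[\u200b\u200c\u200d]/g, ch =>
    '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0'));
}

console.log(revealZeroWidth('A\u200bB')); // prints A\u200bB
```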

How the Blind Watermark Works

The idea is simple: encode a secret string (e.g., an employee ID such as User_9527) into binary, 8 bits per character, then replace each binary digit with a zero‑width character (0 → \u200b, 1 → \u200c). A separator (\u200d) can be used to delimit the hidden segment.
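As a small worked example of this mapping, the sketch below encodes the single character 'U' (character code 85). The variable names are illustrative only.

```javascript
// 'U' has character code 85, which is 01010101 in 8-bit binary.
const bits = 'U'.charCodeAt(0).toString(2).padStart(8, '0');
console.log(bits); // prints 01010101

// Map each bit to its zero-width stand-in: 0 -> \u200b, 1 -> \u200c.
const hidden = [...bits].map(b => (b === '0' ? '\u200b' : '\u200c')).join('');
console.log(hidden.length); // prints 8
```

One visible character of the secret therefore costs eight invisible characters in the watermarked text.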

Step‑by‑Step Process

Prepare a codebook: map 0 to \u200b and 1 to \u200c; use \u200d as a delimiter.

Encode (inject watermark):

Convert each character of the secret text to an 8‑bit binary string. This works only for characters with codes below 256, such as ASCII.

Replace each bit with the corresponding zero‑width character.

Insert the resulting invisible string into the visible text (e.g., after the first character).

Decode (extract watermark):

Extract all zero‑width characters from the leaked text.

Map them back to binary digits.

Group bits into bytes and convert back to characters to recover the secret.
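The steps above mention \u200d as a delimiter. One way the delimiter could be used is sketched below; frameSecret and extractFramed are hypothetical names, and the sketch assumes an ASCII secret and visible text that contains no zero‑width joiners of its own (emoji sequences, for example, do contain them).

```javascript
const ZERO = '\u200b';  // stands for bit 0
const ONE = '\u200c';   // stands for bit 1
const DELIM = '\u200d'; // marks the start and end of the hidden segment

// Encode an ASCII secret and wrap it in delimiters.
function frameSecret(secret) {
  const bits = [...secret]
    .map(ch => ch.charCodeAt(0).toString(2).padStart(8, '0'))
    .join('');
  const payload = [...bits].map(b => (b === '0' ? ZERO : ONE)).join('');
  return DELIM + payload + DELIM;
}

// Find the first delimited segment and decode it; returns null if none is found.
function extractFramed(text) {
  const match = text.match(/\u200d([\u200b\u200c]+)\u200d/);
  if (!match) return null;
  const bits = [...match[1]].map(c => (c === ZERO ? '0' : '1')).join('');
  if (bits.length % 8 !== 0) return null; // truncated or corrupted payload
  let out = '';
  for (let i = 0; i < bits.length; i += 8) {
    out += String.fromCharCode(parseInt(bits.slice(i, i + 8), 2));
  }
  return out;
}

const marked = 'Top' + frameSecret('User_9527') + ' secret';
console.log(extractFramed(marked)); // prints User_9527
```

Framing the payload lets the decoder ignore stray zero‑width characters elsewhere in the text and reject a payload that was cut off mid‑byte.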

JavaScript Implementation (≈30 lines)

Encoding function:

// Zero‑width character dictionary
const zeroWidthMap = {
  '0': '\u200b', // Zero Width Space
  '1': '\u200c'  // Zero Width Non‑Joiner
};

function textToBinary(text) {
  return text.split('').map(ch => ch.charCodeAt(0).toString(2).padStart(8, '0')).join('');
}

function encodeWatermark(text, secret) {
  const binary = textToBinary(secret);
  const hiddenStr = binary.split('').map(b => zeroWidthMap[b]).join('');
  // Insert after the first character (or distribute randomly)
  return text.slice(0, 1) + hiddenStr + text.slice(1);
}

// Test
const originalText = "Confidential company document. Do not distribute!";
const userWorkId = "User_9527";
const watermarkText = encodeWatermark(originalText, userWorkId);
console.log('Original:', originalText);
console.log('Watermarked:', watermarkText);
console.log('Length comparison:', originalText.length, watermarkText.length);
(Figure: result of watermark injection)

Decoding function:

// Reverse dictionary
const binaryMap = {
  '\u200b': '0',
  '\u200c': '1'
};

function decodeWatermark(text) {
  const hiddenChars = text.match(/[\u200b\u200c]/g);
  if (!hiddenChars) return 'No watermark found';
  const binaryStr = hiddenChars.map(c => binaryMap[c]).join('');
  let result = '';
  for (let i = 0; i < binaryStr.length; i += 8) {
    const byte = binaryStr.slice(i, i + 8);
    result += String.fromCharCode(parseInt(byte, 2));
  }
  return result;
}

// Test extraction
const leakerId = decodeWatermark(watermarkText);
console.log('Leaker employee ID:', leakerId); // Leaker employee ID: User_9527
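The functions above use one 8‑bit group per character, so a secret containing characters with codes above 255 would not round‑trip. A possible variant, sketched here under the assumption that the runtime provides the global TextEncoder and TextDecoder (modern browsers, Node.js 11+), encodes the secret's UTF‑8 bytes instead. The function names and the sample ID are hypothetical.

```javascript
// Encode the UTF-8 bytes of the secret, 8 zero-width characters per byte.
function encodeSecretUtf8(secret) {
  const bytes = new TextEncoder().encode(secret); // Uint8Array of UTF-8 bytes
  return Array.from(bytes, byte =>
    byte.toString(2).padStart(8, '0')
      .replace(/0/g, '\u200b')
      .replace(/1/g, '\u200c')
  ).join('');
}

// Collect the zero-width characters, rebuild the bytes, and decode them as UTF-8.
function decodeSecretUtf8(text) {
  const hidden = text.match(/[\u200b\u200c]/g) || [];
  const bytes = [];
  for (let i = 0; i + 8 <= hidden.length; i += 8) {
    const bits = hidden.slice(i, i + 8)
      .map(c => (c === '\u200b' ? '0' : '1'))
      .join('');
    bytes.push(parseInt(bits, 2));
  }
  return new TextDecoder().decode(new Uint8Array(bytes));
}

const secret = 'Łukasz_9527'; // 'Ł' is U+0141, outside the 8-bit range
const stamped = 'Q' + encodeSecretUtf8(secret) + '3 report';
console.log(decodeSecretUtf8(stamped) === secret); // prints true
```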

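When the watermarked string is copied into WeChat, Feishu, or another platform, the invisible characters usually travel with the visible text, so the secret can be recovered later. This holds only as long as the destination does not strip or normalize zero‑width characters and the copied portion includes the point where the watermark was inserted.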

Can the Watermark Be Removed?

Yes, but in practice only when the leaker suspects it is there; non‑technical users typically won’t notice. A knowledgeable insider can either:

Manually retype the content, which eliminates the hidden characters (costly).

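Run a script such as text.replace(/[\u200b-\u200f]/g, '') to strip the zero‑width characters used here. That range covers \u200b through \u200d plus the left‑to‑right and right‑to‑left marks (\u200e, \u200f), but not other invisible characters such as \u2060 or \ufeff.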

Thus the technique offers low cost and high stealth, but it is not unbreakable.
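The removal script mentioned above can be extended into a small check‑and‑strip utility. This is a sketch only: the character list is an assumption chosen for illustration, not an exhaustive inventory of invisible Unicode characters, and the function names are hypothetical.

```javascript
// Invisible characters commonly used for this kind of watermark:
// U+200B..U+200D, U+2060 (word joiner), and U+FEFF (zero width no-break space).
const INVISIBLE = /[\u200b-\u200d\u2060\ufeff]/g;

// Count how many invisible characters a piece of text contains.
function countInvisible(text) {
  return (text.match(INVISIBLE) || []).length;
}

// Remove the invisible characters, leaving only the visible text.
function stripInvisible(text) {
  return text.replace(INVISIBLE, '');
}

const sample = 'A\u200b\u200cB\u200dC';
console.log(countInvisible(sample)); // prints 3
console.log(stripInvisible(sample)); // prints ABC
```

Blanket stripping has a cost: the zero‑width joiner and non‑joiner are legitimate in some text, for example emoji sequences and Persian, so removing them everywhere can change how such content renders.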

Takeaway

This zero‑width character watermark is a defensive programming tool that lets developers embed invisible identifiers in web pages without altering the visual appearance. It demonstrates how subtle Unicode features can be leveraged for security‑related purposes, and it can be a useful answer in technical interviews when asked about non‑obvious content protection methods.
