
Understanding PHP's JSON_ERROR_UTF16: Unicode Decoding Issues and How to Resolve Them

This article explains why json_decode in PHP 7.0 reports JSON_ERROR_UTF16 when it encounters malformed Unicode surrogate pairs, reproduces the problem with example code, and walks through the re2c scanner rule that produces the error.

php中文网 Courses

The article discusses JSON_ERROR_UTF16, an error code introduced in PHP 7.0. json_last_error() returns it after json_decode fails on a string containing an invalid Unicode surrogate pair.

Example that reproduces the error:

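<code>&lt;?php
// Surrogates in the wrong order: low (\ude00) before high (\ud83d).
var_dump(json_decode('["\ude00\ud83d"]')); // NULL
echo json_last_error(), PHP_EOL;  // 10
echo JSON_ERROR_UTF16, PHP_EOL;   // 10, the value of JSON_ERROR_UTF16
</code>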

The correct escape sequence for the grinning face emoji (😀, U+1F600) is \ud83d\ude00, a high surrogate followed by a low surrogate. The example above has the two halves swapped. The author encountered the issue while calling a third-party API that returned malformed Unicode.
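To make the effect of the ordering concrete, the sketch below decodes both variants and compares the outcomes. It assumes PHP 7.0 or later; `describeDecode` is a hypothetical helper name used only for illustration.

```php
<?php
// Hypothetical helper: decode a JSON string and report value and error code.
function describeDecode(string $json): array
{
    $value = json_decode($json);
    return [
        'value' => $value,
        'error' => json_last_error(),
    ];
}

// Single-quoted PHP strings keep \u as a literal backslash followed by "u",
// so json_decode sees the escapes unchanged.
$valid    = describeDecode('["\ud83d\ude00"]'); // high surrogate first
$reversed = describeDecode('["\ude00\ud83d"]'); // low surrogate first

var_dump($valid['error'] === JSON_ERROR_NONE);     // bool(true)
var_dump($valid['value'][0] === "\u{1F600}");      // bool(true)
var_dump($reversed['value']);                      // NULL
var_dump($reversed['error'] === JSON_ERROR_UTF16); // bool(true)
```

Because json_decode also returns NULL for the valid JSON document `null`, checking json_last_error() is the more reliable way to detect the failure.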

In the author's tests on PHP 5.6, the same JSON string decoded successfully, with the unrecognizable escapes replaced by \uFFFD (the Unicode replacement character). This difference led the author to compare the json_decode source code in the two PHP versions.

In PHP 7.0, the JSON scanner (lexer) is generated by re2c from the json_scanner.re file. It matches \uXXXX escapes using separate high-surrogate and low-surrogate ranges because a single 16-bit UTF-16 code unit cannot represent code points above U+FFFF. Those code points are encoded as a surrogate pair: a high surrogate (U+D800 to U+DBFF) followed by a low surrogate (U+DC00 to U+DFFF).
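The surrogate arithmetic can be worked through in a few lines. This is a minimal sketch of the standard UTF-16 encoding rule, not code from PHP's source; `toSurrogatePair` is a hypothetical helper name.

```php
<?php
// Split a code point above U+FFFF into its UTF-16 surrogate pair.
// toSurrogatePair is a hypothetical helper, not part of PHP.
function toSurrogatePair(int $codePoint): array
{
    if ($codePoint < 0x10000 || $codePoint > 0x10FFFF) {
        throw new InvalidArgumentException('Code point must be in U+10000..U+10FFFF');
    }
    $offset = $codePoint - 0x10000;       // 20-bit value
    $high   = 0xD800 | ($offset >> 10);   // top 10 bits
    $low    = 0xDC00 | ($offset & 0x3FF); // bottom 10 bits
    return [$high, $low];
}

list($high, $low) = toSurrogatePair(0x1F600);
printf("\\u%04x\\u%04x\n", $high, $low); // prints \ud83d\ude00
```

The high half always lands in D800 to DBFF and the low half in DC00 to DFFF, which is why the order of the two escapes can be checked from their leading hex digits alone.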

The relevant lexer rule for a surrogate pair (two consecutive \uXXXX escapes, four bytes in UTF-16) is:

<code>UTF16_4 = UTFPREF [dD] [89abAB] HEX{2} UTFPREF [dD] [c-fC-F] HEX{2} ;</code>
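Read as a regular expression, the rule requires a \uD8xx to \uDBxx escape immediately followed by a \uDCxx to \uDFxx escape. The snippet below is a rough PCRE translation for experimenting, not the scanner itself. It assumes that UTFPREF expands to a backslash followed by `u` or `U` and that HEX is a single hexadecimal digit; the constant name is made up for this example.

```php
<?php
// Rough PCRE translation of UTF16_4 (assumed expansion of UTFPREF and HEX).
// In a single-quoted PHP string, \\\\ becomes \\ in the pattern, which
// matches one literal backslash.
const UTF16_4_PATTERN = '/\\\\[uU][dD][89abAB][0-9a-fA-F]{2}\\\\[uU][dD][c-fC-F][0-9a-fA-F]{2}/';

var_dump(preg_match(UTF16_4_PATTERN, '\ud83d\ude00')); // int(1): high, then low
var_dump(preg_match(UTF16_4_PATTERN, '\ude00\ud83d')); // int(0): low, then high
```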

If the surrogates arrive in the wrong order (\ude00\ud83d instead of \ud83d\ude00), the scanner cannot match this pattern. json_decode then returns null, and json_last_error() returns JSON_ERROR_UTF16.
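One possible workaround, when the malformed input comes from a source that cannot be fixed, is to substitute unpaired surrogate escapes with \ufffd before decoding, which approximates the PHP 5.6 result the author observed. The following is a sketch under that assumption, not a method from the article: `replaceLoneSurrogates` and `decodeLenient` are hypothetical names, and the substitution discards the original (broken) data, so it only suits cases where that loss is acceptable.

```php
<?php
// Hypothetical helper: replace unpaired \uD800-\uDFFF escapes with \ufffd.
function replaceLoneSurrogates(string $json): string
{
    $pattern = '/
        \\\\[uU][dD][89abAB][0-9a-fA-F]{2}\\\\[uU][dD][c-fC-F][0-9a-fA-F]{2}  # valid pair
      | (\\\\[uU][dD][89a-fA-F][0-9a-fA-F]{2})                                # lone surrogate
      | \\\\.                                                                 # any other escape
    /x';

    return preg_replace_callback($pattern, function (array $m) {
        // Only the second alternative fills capture group 1.
        return isset($m[1]) && $m[1] !== '' ? '\ufffd' : $m[0];
    }, $json);
}

// Hypothetical wrapper: decode normally, retry once on JSON_ERROR_UTF16.
function decodeLenient(string $json)
{
    $value = json_decode($json, true);
    if ($value === null && json_last_error() === JSON_ERROR_UTF16) {
        $value = json_decode(replaceLoneSurrogates($json), true);
    }
    return $value;
}

$result = decodeLenient('["\ude00\ud83d"]');
var_dump($result === ["\u{FFFD}\u{FFFD}"]); // bool(true)
```

The third alternative in the pattern consumes other escapes such as an escaped backslash, so text like `\\ud83d` (a literal backslash followed by the characters "ud83d") is left untouched.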

The article concludes with a brief note that the analysis is based on the author's understanding and invites further discussion from more experienced developers.
