Senior Brother's Insights
Jan 10, 2020 · Fundamentals
Why Java’s char Can’t Represent All Unicode Characters – Code Units vs. Code Points
This article explains how Java stores characters as UTF‑16 code units, why a single 16‑bit char cannot cover the entire Unicode range, and how surrogate pairs fill the gap. It then compares string length, encoded byte length, and char array size for common Chinese characters, emojis, and rare Chinese glyphs.
Code Point · Java · Surrogate Pair
9 min read
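
As a quick preview of the comparison described in the summary, here is a minimal sketch. The three sample characters (U+4E2D, U+1F600, U+20BB7), the class name CodeUnitDemo, and its helper methods are assumptions chosen for illustration and are not taken from the article. UTF-8 is also just one possible encoding for the byte-length column.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names and sample characters are illustrative assumptions.
class CodeUnitDemo {

    // Number of UTF-16 code units, which is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points (a surrogate pair counts once).
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Byte length when the string is encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "\u4E2D",        // U+4E2D, a common Chinese character inside the BMP
            "\uD83D\uDE00",  // U+1F600, an emoji stored as a surrogate pair
            "\uD842\uDFB7"   // U+20BB7, a rare Chinese glyph stored as a surrogate pair
        };
        for (String s : samples) {
            System.out.printf("U+%X codeUnits=%d codePoints=%d charArray=%d utf8Bytes=%d%n",
                    s.codePointAt(0), codeUnits(s), codePoints(s),
                    s.toCharArray().length, utf8Bytes(s));
        }

        // A single char holds only one half of a supplementary character.
        char first = samples[1].charAt(0);
        System.out.println("first char is a high surrogate: " + Character.isHighSurrogate(first));
    }
}
```

The BMP character occupies one char, while each supplementary character occupies two chars yet still counts as a single code point. That gap between code units and code points is why iterating with charAt can split a character in half, whereas codePointAt and codePointCount work on whole characters.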
