Tag

UTF-8

Lobster Programming
Feb 27, 2025 · Fundamentals

Why Garbled Characters Appear: Exploring ASCII, GB2312, GBK & Unicode

This article explains how character encoding works—from ASCII and its extensions to Chinese GB2312 and GBK, through Unicode's UCS‑2, UCS‑4, and the versatile UTF‑8—showing why mismatched encodings produce garbled text and why UTF‑8 is the default in Spring Boot.

ASCII, GB2312, GBK
9 min read
Efficient Ops
Jan 13, 2025 · Cloud Native

What’s New in Prometheus 3.0? Explore the Latest Cloud‑Native Monitoring Features

Prometheus 3.0 introduces a brand‑new UI, full UTF‑8 support, native OTLP metrics ingestion, native histograms, performance gains, and guidance on high‑cardinality, alert rule, storage, and high‑availability concerns for modern cloud‑native monitoring deployments.

OTLP, Prometheus, UTF-8
5 min read
Code Mala Tang
Aug 16, 2024 · Fundamentals

Why Emoji Turn into Question Marks? Master Unicode Encoding and Fix Socket Transmission

This article explains why emojis become garbled when transmitted via sockets, explores Unicode encoding fundamentals—including UTF‑8, BMP and high‑code‑point characters—and provides practical solutions using codePointAt, TextEncoder, and TextDecoder to ensure correct emoji handling.
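One way an emoji can turn into a replacement character in transit is when its multi-byte UTF-8 sequence is split across two reads and each chunk is decoded on its own. The sketch below is a Python analogue of the encode/decode round trip, not the article's JavaScript code; the message, the chunk boundary, and the variable names are assumptions chosen for illustration.

```python
import codecs

message = "ok 👍"
payload = message.encode("utf-8")  # the emoji (U+1F44D) occupies 4 bytes

# Assume the transport delivers the payload in two chunks and the
# boundary happens to fall in the middle of the emoji's byte sequence.
chunks = [payload[:5], payload[5:]]

# Decoding a chunk in isolation cannot complete the emoji, so the
# dangling bytes are replaced with U+FFFD.
naive_first = chunks[0].decode("utf-8", errors="replace")

# An incremental decoder buffers the incomplete sequence until the
# remaining bytes arrive.
decoder = codecs.getincrementaldecoder("utf-8")()
received = "".join(decoder.decode(chunk) for chunk in chunks)
received += decoder.decode(b"", final=True)

print(repr(naive_first))
print(received)
```

Buffering until a full message has arrived, or decoding incrementally as here, keeps characters outside the Basic Multilingual Plane intact.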

Socket, TextDecoder, TextEncoder
11 min read
Java Tech Enthusiast
Jul 27, 2024 · Fundamentals

The Story Behind the Creation of UTF-8 and Its Advantages

Rob Pike and Ken Thompson devised UTF‑8 in 1992 at Bell Labs, turning a three‑day prototype into the web’s dominant Unicode encoding. The scheme is variable‑length, ASCII‑compatible, and prefix‑free, with the lead byte indicating the length of each sequence; the article credits these properties for its efficiency, robustness, and adoption on more than 96% of sites.

Computer Science, History, UTF-8
6 min read
Java Tech Enthusiast
Apr 21, 2024 · Fundamentals

Decoding Binary UTF-8 Signage in a Public Restroom Using Java

The article explains how a binary message on a multilingual public‑restroom sign was decoded by identifying UTF‑8 byte patterns, extracting the first 24 bits to reveal the Chinese character “向”, and providing a Java program that parses the entire bit string into readable Chinese text.
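As a rough illustration of the decoding step described above, the sketch below groups a bit string into bytes and decodes them as UTF-8. It is written in Python rather than the article's Java, the function name `decode_utf8_bits` is hypothetical, and the 24-bit sample is simply the UTF-8 encoding of “向” (U+5411), not a transcription of the actual sign.

```python
def decode_utf8_bits(bits: str) -> str:
    """Decode a string of '0'/'1' characters as UTF-8 text."""
    bits = "".join(bits.split())  # drop whitespace between bit groups
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    # Convert each group of 8 bits into one byte.
    raw = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return raw.decode("utf-8")


# The lead byte 1110xxxx announces a three-byte sequence; the next two
# bytes both start with the continuation prefix 10.
first_24_bits = "11100101 10010000 10010001"
print(decode_utf8_bits(first_24_bits))
```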

Binary Encoding, Java, UTF-8
4 min read
Top Architecture Tech Stack
Feb 23, 2024 · Fundamentals

Understanding Character Encoding: ASCII, GB2312, Unicode, and UTF-8

This article explains the history, purpose, and differences of major character encodings—including ASCII, GB2312, Unicode, and UTF-8—while showing how they are used and converted in modern computing environments.

ASCII, GB2312, UTF-8
11 min read
360 Tech Engineering
Jul 18, 2023 · Fundamentals

Understanding Characters, Character Sets, and Encoding: From ASCII to Unicode

This article explains the concepts of characters, character sets, and character encoding, describes how computers store and render text using methods like ASCII, GB2312, Unicode, and UTF‑8/16/32, and discusses why garbled text occurs across different languages and systems.

ASCII, UTF-8, Unicode
10 min read
Sohu Tech Products
Jul 12, 2023 · Fundamentals

The Mystery of Character Encoding: Unicode, UTF‑8, UTF‑16, GBK and Emoji

This article explains the fundamentals of character encoding, covering Unicode’s universal character set, the structure of its planes and surrogate areas, the variable‑length UTF‑8 and UTF‑16 encodings, Chinese‑specific GBK encoding, and practical iOS code examples for handling Unicode, emojis and regular‑expression based Chinese character detection.

GBK, UTF-8, Unicode
12 min read
Laravel Tech Community
Dec 28, 2022 · Information Security

Apache SpamAssassin 4.0 – New Features and Improvements

Apache SpamAssassin 4.0 introduces comprehensive Unicode support, enhanced geolocation, improved Bayesian filtering for non‑English mail, better SSL client certificate handling, new DKIM/SPF and URL‑expansion plugins, and an ExtractText plugin for attachment analysis, representing a major upgrade over the 3.4 series.

Bayesian Filtering, Open-source, SpamAssassin
3 min read
Tencent Cloud Developer
May 17, 2022 · Fundamentals

A Comprehensive History and Overview of Character Encoding and Unicode

The article traces character encoding from early telegraph and Morse code through ASCII, ISO national sets and Chinese standards, explains Unicode’s unification and its UTF‑8/‑16/‑32 forms, and shows how modern languages—especially JavaScript—handle code points, highlighting the cultural and technical significance for developers.

ASCII, History, JavaScript
31 min read
Architect's Tech Stack
Apr 12, 2022 · Fundamentals

JDK 18 (Java 18) GA Release: New Features, Enhancements, and Upcoming Versions

JDK 18 has reached general availability as a short‑term release with six months of support. It delivers nine JEPs, including UTF‑8 as the default charset, a simple web server, the incubating Vector API, and the Foreign Function & Memory API, ahead of JDK 19 in September and the next LTS, JDK 21, in 2023.

Foreign Function API, JDK 18, Java
6 min read
IT Services Circle
Mar 4, 2022 · Fundamentals

Understanding Character Encoding: From GBK and UTF-8 to Unicode

This tutorial explains the origins and evolution of character encoding, covering early ASCII, Chinese GBK/GB18030, the universal Unicode standard, UTF‑8 variable‑length encoding, and practical differences between Python 2 and Python 3 with code examples.

ASCII, GBK, Python
9 min read
Full-Stack Internet Architecture
Nov 12, 2021 · Databases

Understanding MySQL Encoding Mechanism and Solving Chinese Character Query Issues

This article explains MySQL's character encoding workflow, illustrates why queries containing Chinese characters fail without proper settings, and shows how to configure JDBC URLs, server variables, and Docker‑based MySQL instances to ensure lossless UTF‑8 handling.

Database Configuration, Docker, JDBC
9 min read
Xueersi Online School Tech Team
Jul 9, 2021 · Fundamentals

Understanding Character Encoding and Redis SDS Dynamic String Implementation

This article explains how computers store text using binary, introduces ASCII, Unicode and UTF‑8 encoding rules, discusses the limitations of C‑style null‑terminated strings, and describes Redis's Simple Dynamic String (SDS) data structure, its old and new versions, advantages, and related APIs.

C strings, Redis, SDS
14 min read
ByteFE
Feb 10, 2021 · Frontend Development

Handling Unicode and Supplementary Characters in JavaScript

This article explains how JavaScript processes Unicode characters, demonstrates the limitations of legacy APIs like charCodeAt and fromCharCode with supplementary characters, and introduces modern methods such as codePointAt, fromCodePoint, Unicode escape syntax, surrogate pairs, and polyfills for full Unicode support.
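The surrogate-pair arithmetic behind supplementary characters can be sketched as follows. This is a Python illustration of the UTF-16 rule rather than the article's JavaScript, and the helper name `to_surrogate_pair` is hypothetical.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary code point into UTF-16 high/low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need surrogates")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(ord("😀"))  # U+1F600
print(f"{high:04X} {low:04X}")

# Cross-check against Python's own UTF-16 encoder (big-endian, no BOM).
encoded = "😀".encode("utf-16-be")
```

An API that reads one 16-bit unit at a time, such as `charCodeAt`, sees only one half of this pair, which is why code-point-aware methods are needed for such characters.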

JavaScript, UTF-8, Unicode
10 min read
macrozheng
Feb 8, 2021 · Fundamentals

Why Do You See “锟斤拷” in Text? Uncover the Encoding Mystery

This article explains how character encoding works, using ASCII, Unicode, UTF‑8 and GBK examples to reveal why the garbled string “锟斤拷” appears when mismatched encodings are processed, and shows the underlying byte‑level transformations.
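The byte-level transformation can be reproduced in a few lines. The following is a minimal Python sketch, assuming the usual two-step mix-up: GBK bytes are first decoded as UTF-8 with replacement, and the resulting U+FFFD characters are later re-encoded as UTF-8 and read back as GBK. The sample text and variable names are illustrative only.

```python
# Step 1: GBK-encoded text is wrongly decoded as UTF-8. The bytes do not
# form valid UTF-8, so the decoder substitutes U+FFFD.
gbk_bytes = "你好".encode("gbk")
misread = gbk_bytes.decode("utf-8", errors="replace")

# Step 2: each U+FFFD is written out as the three UTF-8 bytes EF BF BD.
two_replacements = "\ufffd\ufffd".encode("utf-8")

# Step 3: those six bytes are read as GBK, which pairs them up as
# EFBF, BDEF, BFBD and maps each pair to a Chinese character.
garbled = two_replacements.decode("gbk")

print(two_replacements.hex(" "))
print(garbled)
```

Once step 1 has happened the original bytes are lost, so the garbled string cannot be converted back to the source text.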

ASCII, GBK, UTF-8
4 min read
Laravel Tech Community
Jul 7, 2020 · Backend Development

PHP rawurldecode Function and Custom UTF-8 URL Decoding

This article explains PHP's rawurldecode() function for decoding URL‑encoded strings, shows its signature, parameters, return value, provides a simple usage example, notes UTF‑8 considerations, and presents a custom utf8RawUrlDecode() function for handling non‑standard %uXXXX sequences.

URL decoding, UTF-8, backend
3 min read
360 Tech Engineering
Apr 22, 2020 · Fundamentals

Understanding Unicode Encoding and Implementing Emoji Detection in Java

This article explains Unicode's structure, encoding ranges, UTF-8/16/32 representations, byte order considerations, and provides Java code to detect emojis in strings, illustrating practical usage of Unicode concepts for text processing.

Java, UTF-16, UTF-8
14 min read
Huajiao Technology
Apr 21, 2020 · Fundamentals

Understanding Unicode Encoding (UTF-8, UTF-16, UTF-32) and Emoji Detection in Java

This article explains the Unicode standard, its code planes and ranges, the three UTF encoding forms (UTF-8, UTF-16, UTF-32), compares their storage characteristics, discusses byte order marks, and provides Java code for detecting emoji characters in strings.

Java, UTF-16, UTF-32
11 min read
Architecture Digest
Mar 8, 2020 · Databases

MySQL Encoding Process and Character Set Handling

This article explains how MySQL’s character_set parameters such as character_set_client and character_set_results control the encoding and decoding of client commands and query results, illustrates common pitfalls with UTF‑8, GBK and Latin‑1, and provides practical commands to avoid garbled text.

Database, GBK, MySQL
10 min read