Unmasking the Algorithm Myth: Russ Cox’s 15‑Year Quest to Simplify Float‑to‑Decimal Conversion
Russ Cox revisits the half‑century‑old challenge of converting binary floating‑point numbers to shortest, correct decimal strings, presenting a new unrounded‑scaling algorithm that outperforms Dragonbox and Ryū while remaining simple, and explains its integration into Go 1.27 with detailed benchmarks and proofs.
Russ Cox, a core Go contributor, revisits the long‑standing problem of converting a binary float64 value to a human‑readable decimal string that is both correct and the shortest possible representation. He frames the difficulty as an “impossible triangle” of three requirements.
The Impossible Triangle
Correctness : Parse(Print(f)) == f must always hold; no precision may be lost.
Shortest : The output string must be the shortest among all strings that round‑trip to the original value (e.g., 0.3 instead of 0.2999999999999999889).
Speed : Conversion speed directly impacts throughput in large‑scale data processing such as JSON serialization.
Historical Evolution
Dragon4 (1990): Guarantees correctness and shortest output but relies on slow big‑integer arithmetic.
Grisu3 (2010): Used in Google V8, extremely fast but does not guarantee shortest output; about 0.5% of cases fall back to a slower path.
Ryū (2018) & Dragonbox (2020): Achieve correctness and shortest output without big‑integer operations by using complex table‑lookup tricks, resulting in peak performance but very intricate code.
Cox’s goal is to combine Ryū‑level speed and correctness with the simplicity of the 2011 approach.
Core Technique – “Unrounded Scaling”
The new algorithm is built around a mathematical primitive called Fast Unrounded Scaling, which works with an “unrounded number” ⟨x⟩ consisting of three parts:
Integer part : floor(x) ½ bit : a flag indicating whether the fractional part is ≥ 0.5 ( x - floor(x) >= 0.5)
Sticky bit : a flag indicating whether any non‑zero fractional remainder remains ( x has a non‑zero tail)
This representation retains all information needed for correct round‑to‑even rounding while allowing cheap bitwise operations ( | and &) to manipulate the value until the final truncation step.
Scaling Magic
Floating‑point printing requires converting f = m * 2^e to a decimal form d * 10^p. Cox pre‑computes 128‑bit approximations of 10^p and multiplies m * 2^e by these tables. Remarkably, for 64‑bit floats he shows that a full 128‑bit product is unnecessary: computing only the high 64 bits of the product and using the low‑bit “sticky” information yields a completely correct result.
This insight, dubbed “Omit Needless Multiplications”, reduces the conversion to a handful of native 64‑bit multiplications.
From Theory to Go 1.27
Based on the primitive, Cox constructs an algorithm family:
FixedWidth : Fixed‑point printing (e.g., %.2f).
Shortest : Shortest‑representation printing (e.g., %g).
Parse : String‑to‑float parsing.
Performance Crush
Benchmarks on Apple M4 and AMD Ryzen 9 show:
Fixed‑width printing : The new uscale implementation outperforms glibc, double‑conversion, and even Ryū.
Shortest printing : Matches or exceeds the industry‑leading Dragonbox while having far simpler code.
Parsing : Beats the current fastest fast_float (Eisel‑Lemire) implementation.
Consequently, Go 1.27 will integrate this algorithm (or parts of it) directly into the standard library, giving fmt.Sprintf, json.Marshal and strconv.ParseFloat a noticeable speed boost without any code changes.
Proof by Computation
Cox also provides a full mathematical proof written in Ivy, an APL‑style language. Instead of using a formal verifier like Coq, he validates correctness by executing code that checks every possible float64 input, demonstrating “proof by computation”.
Conclusion
The 15‑year journey from the 2011 prototype to the 2026 breakthrough illustrates that extreme performance does not require extreme complexity. By returning to the mathematical roots of the problem, Cox delivers an algorithm that is both simple and fast, ready to become the new “heart” of Go’s standard library.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
TonyBai
Tony Bai's tech world (tonybai.com). Not satisfied with just "knowing how", we strive for mastery. Focused on Go language internals, high-quality engineering practices, and cloud‑native architecture, exploring cutting‑edge intersections of Go and AI. Gophers who pursue technology are welcome—follow me and evolve with Go.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
