Why JavaScript’s split Loses Data and How the Reversible String Split Proposal Fixes It
This article covers the recent TC39 progress of the reversible‑string‑split proposal, outlines the criteria for advancing to Stage 1, compares JavaScript’s split behavior with that of other languages, and describes the new splitN method, which restores reversibility while leaving existing split semantics untouched.
In the latest TC39 meeting only the reversible‑string‑split proposal advanced from Stage 0 to Stage 1, while other notable proposals such as array‑from‑async, native enum types, and Intl.Segmenter v2 made no progress.
Advancing from Stage 0 to Stage 1 requires:
Finding a TC39 member to act as champion for the proposal.
Clearly defining the problem, need, and a rough solution.
Providing examples of the problem and solution.
Discussing the API shape, key algorithms, semantics, and implementation risks; Stage 1 proposals may undergo significant changes.
Reversible String Split
Proposal link: https://github.com/tc39/proposal-reversible-string-split
JavaScript’s split method splits a string by a separator (string or RegExp) and optionally takes a limit on the number of elements in the returned array. When a limit is provided, splitting stops once that many elements have been collected and the rest of the string is discarded:
const str = 'a|b|c|d|e';
// ["a", "b"], the rest of the string is discarded
console.log(str.split("|", 2));
In most other languages, split returns the remainder after reaching the limit, making the operation reversible (the original string can be reconstructed with join). Examples:
// Java
class Playground {
    public static void main(String[] args) {
        String s = "a|b|c|d|e|f";
        // The separator is a regex, so the pipe must be escaped
        for (String val : s.split("\\|", 2)) {
            System.out.println(val);
        }
    }
}
// a
// b|c|d|e|f
// Rust
fn main() {
    let v = "a|b|c|d|e|f".splitn(2, "|").collect::<Vec<_>>();
    println!("{:?}", v);
}
// ["a", "b|c|d|e|f"]
// Go
package main
import (
    "fmt"
    "strings"
)
func main() {
    fmt.Printf("%#v", strings.SplitN("a|b|c|d|e|f", "|", 2))
}
// []string{"a", "b|c|d|e|f"}
# Python
print('a|b|c|d|e|f'.split('|', 2))
# ['a', 'b', 'c|d|e|f']
Because these languages retain the leftover part, joining the split result with the separator reproduces the original string. (Python’s second argument is the maximum number of splits rather than the number of pieces, which is why 2 yields three elements.) In pseudocode, the invariant is:
join(Separator, Value.split(Separator, Limit)) == Value
The proposal introduces a new splitN method that behaves like the Java, Rust, and Go versions: it performs at most N‑1 splits and returns an array of up to N elements, the last of which is the remaining substring.
console.log("a|b|c|d|e|f".splitN("|", 2));
// ["a", "b|c|d|e|f"]
JavaScript’s split has discarded the remainder since its early implementation in Netscape Navigator 4 (1997); the behavior was first formally documented in ECMAScript 3.
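The splitN behavior described above can be approximated in userland today. The following is a minimal sketch, not the proposal's specification text: the standalone function form, the argument order, and the error handling are assumptions made for illustration, and only non-empty string separators are handled (no RegExp support).

```javascript
// Hypothetical userland helper that mimics the described splitN behavior.
// Assumptions: `separator` is a non-empty string, `n` is a positive integer.
function splitN(str, separator, n) {
  if (typeof separator !== "string" || separator.length === 0) {
    throw new TypeError("this sketch only handles non-empty string separators");
  }
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError("n must be a positive integer");
  }
  const parts = [];
  let start = 0;
  // Perform at most n - 1 splits.
  while (parts.length < n - 1) {
    const index = str.indexOf(separator, start);
    if (index === -1) break; // no more separators left
    parts.push(str.slice(start, index));
    start = index + separator.length;
  }
  // The final element keeps everything that was not split off.
  parts.push(str.slice(start));
  return parts;
}

const value = "a|b|c|d|e|f";
const kept = splitN(value, "|", 2);      // remainder retained
const dropped = value.split("|", 2);     // remainder discarded

console.log(kept);
console.log(kept.join("|") === value);    // round trip succeeds
console.log(dropped.join("|") === value); // round trip fails
```

Because the last element carries the unsplit remainder, joining the result with the same separator reproduces the input, which the built-in split with a limit cannot guarantee.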
Conclusion: The JavaScript Chinese Interest Group (JSCIG), led by He Shijun and supported by the Alibaba Front‑end Standardization Team, invites developers to discuss ECMAScript topics on GitHub: https://github.com/JSCIG/es-discuss/discussions.
Taobao Frontend Technology
