Optimized Chinese Character Stroke‑Order Animation: Glyph Extraction, Stroke Segmentation, Median Generation, and Order Determination
This article describes how to integrate and extend the open‑source HanziWriter library for Android, iOS and web, extracting TrueType glyph data, segmenting strokes with corner detection and neural‑network scoring, generating stroke medians via Voronoi diagrams, and determining correct stroke order using a Hungarian‑algorithm match.
Background
Chinese character stroke‑order animation is a common requirement in language education. By adopting the open‑source HanziWriter editor and deploying it on the Dali Smart Homework Lamp, the team extended the original web implementation to Android and iOS, improving performance and adding fine‑grained control such as selective stroke/segment rendering, correction, and tracing capabilities.
Key Technologies
1. Glyph Data Extraction
Glyphs represent the visual shape of individual characters; a TrueType Font (TTF) file contains a glyf table that stores contour definitions and adjustment instructions. The extraction workflow adapts font boundaries to a 1024×1024 grid, then retrieves contour points and path data for each glyph.
function ttfGlyphLoader(font, index, parseGlyph, data, position, buildPath) {
return function () {
var glyph = new _glyph.Glyph({ index: index, font: font });
glyph.path = function () {
parseGlyph(glyph, data, position);
var path = buildPath(font.glyphs, glyph);
path.unitsPerEm = font.unitsPerEm;
return path;
};
return glyph;
};
}
var index = font.charToGlyphIndex(character);
var glyph = font.glyphs.get(index);
var path = svg.convertCommandsToPath(glyph.path.commands);
After extraction, the glyph contours are rendered as SVG paths.
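The adaptation of font boundaries to the 1024×1024 grid mentioned above can be sketched as a simple rescale from the font's own coordinate space (`unitsPerEm`). The helper name and rounding choice here are illustrative assumptions, not library API:

```javascript
// Map a glyph coordinate from font units onto the 1024x1024 extraction grid.
// A font declares its own resolution via unitsPerEm (commonly 1000 or 2048).
function scaleToGrid(point, unitsPerEm, gridSize = 1024) {
  const scale = gridSize / unitsPerEm;
  return [Math.round(point[0] * scale), Math.round(point[1] * scale)];
}

// For a font with unitsPerEm = 1000, the point (500, 250) lands at (512, 256).
scaleToGrid([500, 250], 1000); // [512, 256]
```

Normalizing every glyph to the same grid lets the later corner thresholds and distance constants apply uniformly across fonts.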
2. Stroke Extraction
Raw glyph contours lack explicit stroke information; overlapping strokes appear as a single component. To animate stroke order, the algorithm must split the contour into individual strokes by detecting special corner points where strokes intersect.
2.1 Corner Detection (Endpoint Parsing)
Corners are concave points where strokes intersect. By computing the angle of the tangents on either side of each endpoint (falling back to the control‑point direction when the control point is far enough away), the algorithm flags points whose signed angular change exceeds a predefined threshold as corners.
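Before the real implementation below, the angle test can be illustrated in isolation. The threshold value and the orientation convention (contours wound so that a concave turn reads as a negative angular change) are assumptions for the sketch:

```javascript
// Toy version of the corner test: a joint is a corner when the signed
// angular change between incoming and outgoing tangents is strongly negative.
const MIN_CORNER_ANGLE = 0.1 * Math.PI; // assumed sample threshold

const angle = ([x, y]) => Math.atan2(y, x);

// Signed angular difference normalized into (-PI, PI].
const subtractAngles = (a, b) => {
  let d = a - b;
  while (d <= -Math.PI) d += 2 * Math.PI;
  while (d > Math.PI) d -= 2 * Math.PI;
  return d;
};

const isCorner = (tangentIn, tangentOut) =>
  subtractAngles(angle(tangentOut), angle(tangentIn)) < -MIN_CORNER_ANGLE;

// A sharp clockwise turn (rightward, then downward) reads as concave:
isCorner([1, 0], [0, -1]); // true
// A gentle continuation does not:
isCorner([1, 0], [1, 0.1]); // false
```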
function Endpoint(paths, index) {
this.index = index;
const path = paths[index[0]];
const n = path.length;
this.indices = [[index[0], (index[1] + n - 1) % n], index];
this.segments = [path[(index[1] + n - 1) % n], path[index[1]]];
this.point = this.segments[0].end;
assert(Point.valid(this.point), this.point);
assert(Point.equal(this.point, this.segments[1].start), path);
this.tangents = [
Point.subtract(this.segments[0].end, this.segments[0].start),
Point.subtract(this.segments[1].end, this.segments[1].start),
];
const threshold = Math.pow(MIN_CORNER_TANGENT_DISTANCE, 2);
if (this.segments[0].control !== undefined &&
Point.distance2(this.point, this.segments[0].control) > threshold) {
this.tangents[0] = Point.subtract(this.point, this.segments[0].control);
}
if (this.segments[1].control !== undefined &&
Point.distance2(this.point, this.segments[1].control) > threshold) {
this.tangents[1] = Point.subtract(this.segments[1].control, this.point);
}
this.angles = this.tangents.map(Point.angle);
const diff = Angle.subtract(this.angles[1], this.angles[0]);
this.corner = diff < -MIN_CORNER_ANGLE;
return this;
}
2.2 Corner Scoring with a Neural Network
Detected corners are scored using a pre‑trained neural network to estimate the likelihood that two corners belong to the same stroke. The network’s output serves as a weight for the subsequent matching step.
const scoreCorners = (ins, out, classifier) => {
return classifier(getFeatures(ins, out));
};
import { NEURAL_NET_TRAINED_FOR_STROKE_EXTRACTION } from '/lib/net';
import { stroke_extractor } from '/lib/stroke_extractor';
Meteor.startup(() => {
const input = new convnetjs.Vol(1, 1, 8 /* feature vector dimensions */);
const net = new convnetjs.Net();
net.fromJSON(NEURAL_NET_TRAINED_FOR_STROKE_EXTRACTION);
const weight = 0.8;
const trainedClassifier = (features) => {
input.w = features;
const softmax = net.forward(input).w;
return softmax[1] - softmax[0];
};
stroke_extractor.combinedClassifier = (features) => {
return stroke_extractor.handTunedClassifier(features) + weight * trainedClassifier(features);
};
});
2.3 Corner Matching via the Hungarian Algorithm
Using the neural‑network scores, a cost matrix is built and the Hungarian algorithm finds an optimal one‑to‑one corner pairing.
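What the Hungarian algorithm produces can be shown by brute force on a tiny score matrix: the one‑to‑one assignment that maximizes the total score. The real algorithm runs in O(n³); this O(n!) version is only for intuition and is not the library's implementation:

```javascript
// Find the one-to-one row->column assignment maximizing the total score
// by enumerating all permutations (feasible only for tiny matrices).
function bestAssignment(matrix) {
  const n = matrix.length;
  const perms = (items) =>
    items.length <= 1
      ? [items]
      : items.flatMap((x, i) =>
          perms(items.filter((_, j) => j !== i)).map((rest) => [x, ...rest]));
  let best = null;
  let bestScore = -Infinity;
  for (const p of perms([...Array(n).keys()])) {
    const score = p.reduce((sum, col, row) => sum + matrix[row][col], 0);
    if (score > bestScore) { bestScore = score; best = p; }
  }
  return best; // best[i] = column matched to row i
}

const scores = [
  [5, 1, 0],
  [2, 4, 1],
  [0, 2, 6],
];
bestAssignment(scores); // [0, 1, 2]
```

In the pipeline, rows and columns are both corners and the entries are classifier scores, so the result pairs each corner with its most plausible partner.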
const matchCorners = (corners, classifier) => {
const matrix = [];
for (let i = 0; i < corners.length; i++) {
matrix.push([]);
for (let j = 0; j < corners.length; j++) {
matrix[i].push(scoreCorners(corners[i], corners[j], classifier)); // corner similarity
}
}
for (let i = 0; i < corners.length; i++) {
for (let j = 0; j < corners.length; j++) {
const reversed_score = matrix[j][i] - REVERSAL_PENALTY;
if (reversed_score > matrix[i][j]) {
matrix[i][j] = reversed_score;
}
}
}
return (new Hungarian(matrix)).x_match;
};
2.4 Bridge Construction and Stroke Path Extraction
Matched corner pairs form bridges; connecting these bridges splits the original contour into separate stroke paths.
const getBridges = (endpoints, classifier) => {
const result = [];
const corners = endpoints.filter(x => x.corner);
const matching = matchCorners(corners, classifier);
for (let i = 0; i < corners.length; i++) {
const j = matching[i];
if (j <= i && matching[j] === i) continue;
result.push([Point.clone(corners[i].point), Point.clone(corners[j].point)]);
}
result.map(checkBridge);
return result;
};
const stroke_paths = extractStrokes(paths, endpoints, bridges, log);
const strokes = stroke_paths.map(x => svg.convertPathsToSVGPath([x]));
3. Stroke Median Generation
After stroke segmentation, the algorithm generates a skeleton of median points for each stroke. First, the contour is densified by polygon approximation; then a Voronoi (Thiessen) diagram is computed over the densified points, and the diagram's interior vertices trace the stroke's skeleton, which serves as the stroke median.
svg.getPolygonApproximation = (path, approximation_error) => {
const result = [];
approximation_error = approximation_error || 64;
for (let x of path) {
const control = x.control || Point.midpoint(x.start, x.end);
const distance = Math.sqrt(Point.distance2(x.start, x.end));
const num_points = Math.floor(distance / approximation_error);
for (let i = 0; i < num_points; i++) {
const t = (i + 1) / (num_points + 1);
const s = 1 - t;
result.push([
s * s * x.start[0] + 2 * s * t * control[0] + t * t * x.end[0],
s * s * x.start[1] + 2 * s * t * control[1] + t * t * x.end[1]
]);
}
result.push(x.end);
}
return result;
};
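The Voronoi diagram computed over the densified points contains vertices both inside and outside the stroke outline; only the interior vertices belong to the skeleton that becomes the median. A standard ray‑casting point‑in‑polygon test can do this filtering. This is a sketch of the idea, not the library's actual filtering code:

```javascript
// Ray-casting point-in-polygon test: count crossings of a horizontal ray
// from the point; an odd count means the point lies inside the polygon.
function insidePolygon([px, py], polygon) {
  let inside = false;
  for (let i = 0, j = polygon.length - 1; i < polygon.length; j = i++) {
    const [xi, yi] = polygon[i];
    const [xj, yj] = polygon[j];
    if ((yi > py) !== (yj > py) &&
        px < ((xj - xi) * (py - yi)) / (yj - yi) + xi) {
      inside = !inside;
    }
  }
  return inside;
}

const contour = [[0, 0], [10, 0], [10, 10], [0, 10]];
insidePolygon([5, 5], contour);  // true
insidePolygon([15, 5], contour); // false
```

Applying this test to each Voronoi vertex (and keeping only edges whose endpoints both pass) leaves the interior skeleton from which the median polyline is read off.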
var sites = [{x:300,y:300}, {x:100,y:100}, {x:200,y:500}, {x:250,y:450}, {x:600,y:150}];
var bbox = {xl:0, xr:800, yt:0, yb:600};
var voronoi = new Voronoi();
var result = voronoi.compute(sites, bbox);
4. Stroke Order Determination
The final step aligns the generated stroke medians with a known character decomposition database (e.g., the Hanzi decomposition table). By applying another Hungarian‑algorithm match between the medians of the target character and those of the reference decomposition, a plausible stroke order is produced and exported as JSON.
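One simple per‑stroke cost for this match is the average distance between corresponding sample points of a candidate median and a reference median. The metric below is an illustrative assumption (the actual pipeline may combine several features); it only shows how the cost matrix for the Hungarian match over strokes could be filled:

```javascript
// Score a candidate median against a reference median by the mean distance
// between corresponding points (both polylines resampled to equal length).
function medianDistance(a, b) {
  const n = Math.min(a.length, b.length);
  let total = 0;
  for (let i = 0; i < n; i++) {
    total += Math.hypot(a[i][0] - b[i][0], a[i][1] - b[i][1]);
  }
  return total / n;
}

// A low score means the two strokes likely correspond.
medianDistance([[0, 0], [10, 0]], [[1, 0], [11, 0]]); // 1
```

Running the Hungarian algorithm over the resulting matrix assigns each extracted stroke to a reference stroke, and the reference ordering then dictates the animation order.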
{
"strokes": [
"M 531 651 Q 736 675 868 663 Q 893 662 899 670 Q 906 683 894 696 Q 863 724 817 744 Q 801 750 775 740 Q 712 725 483 694 Q 185 660 168 657 Q 162 658 156 657 Q 141 657 141 645 Q 140 632 160 618 Q 178 605 211 594 Q 221 590 240 599 Q 348 629 470 644 L 531 651 Z",
"M 435 100 Q 407 107 373 116 Q 360 120 361 112 Q 361 103 373 94 Q 445 39 491 -5 Q 503 -15 518 2 Q 560 60 553 141 Q 541 447 548 561 Q 548 579 550 596 Q 556 624 549 635 Q 545 642 531 651 C 509 671 457 671 470 644 Q 485 629 485 573 Q 488 443 488 148 Q 487 112 477 99 Q 464 92 435 100 Z"
],
"medians": [
[[153,645],[177,634],[219,628],[416,663],[794,706],[823,702],[887,679]],
[[478,644],[518,610],[518,101],[495,55],[450,68],[369,110]]
]
}
References
https://hanziwriter.org
https://www.skishore.me/makemeahanzi
https://github.com/skishore/makemeahanzi/issues/3
ByteDance Dali Intelligent Technology Team