Big Data 11 min read

Comparative Study of JSON Processing Methods in MaxCompute

The study compares MaxCompute JSON extraction functions—FROM_JSON, get_json_object, and custom JMESPath/JSONPath UDFs—showing simple field extraction with get_json_object is fastest, while complex queries benefit from FROM_JSON or JMESPath, and outlines corresponding JSON generation methods and best‑practice recommendations.

DaTaobao Tech
DaTaobao Tech
DaTaobao Tech
Comparative Study of JSON Processing Methods in MaxCompute

Background: Processing large JSON fields in MaxCompute often requires extracting specific parts or generating new JSON structures.

Methods for JSON extraction compared: FROM_JSON, get_json_object, and custom UDFs based on JMESPath or JSONPath.

Selection criteria: simple field extraction (use get_json_object); need to retain the original JSON or apply complex logic (use JMESPath); other cases (use FROM_JSON).

Performance test on a table with 3,484,503 rows shows extraction speeds (rec/ms): get_json_object 146.951 (head) / 70.277 (tail), custom JMESPath UDFs ~15‑16, FROM_JSON 46.958 (head) / 29.239 (tail). Simple extraction is fastest, while complex logic incurs higher cost.

Recommendations: use FROM_JSON or JMESPath for complex queries, get_json_object for straightforward field or array‑index extraction.

JSON generation methods in MaxCompute: TO_JSON, custom UDFs ( bi_to_json_string, bi_to_json_string_ex, bi_put_json_values, bi_merge_json_array_distinct, bi_sort_json_array), and JMESPath‑based concatenation.

Example of TO_JSON: SELECT to_json(named_struct('a',1,'b',2)); Example of a custom UDF for merging values:

SELECT bi_put_json_values('{"fav":[12,3,4]}','name',bi_to_json_string_ex(name),'age',bi_to_json_string_ex(age));

Example of JMESPath merge to add a field to each array element:

SELECT jmespath_get_json('[{"a":1,"b":"b0"},{"a":1,"b":"b1"}]','map(&merge(@,{"c":1}),@)');

Conclusion: choose the method based on whether you need to generate JSON from scratch, modify shallow structures, or perform conditional, deep transformations.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

JSONMaxComputeUDFDataEngineeringJMESPath
DaTaobao Tech
Written by

DaTaobao Tech

Official account of DaTaobao Technology

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.