Big Data 11 min read

Comparative Study of JSON Processing Methods in MaxCompute

The study compares MaxCompute JSON extraction functions—FROM_JSON, get_json_object, and custom JMESPath/JSONPath UDFs—showing simple field extraction with get_json_object is fastest, while complex queries benefit from FROM_JSON or JMESPath, and outlines corresponding JSON generation methods and best‑practice recommendations.

DaTaobao Tech
DaTaobao Tech
DaTaobao Tech
Comparative Study of JSON Processing Methods in MaxCompute

Background: Processing large JSON fields in MaxCompute often requires extracting specific parts or generating new JSON structures.

Methods for JSON extraction compared: FROM_JSON , get_json_object , and custom UDFs based on JMESPath or JSONPath.

Selection criteria: simple field extraction (use get_json_object ); need to retain the original JSON or apply complex logic (use JMESPath); other cases (use FROM_JSON ).

Performance test on a table with 3,484,503 rows shows extraction speeds (rec/ms): get_json_object 146.951 (head) / 70.277 (tail), custom JMESPath UDFs ~15‑16, FROM_JSON 46.958 (head) / 29.239 (tail). Simple extraction is fastest, while complex logic incurs higher cost.

Recommendations: use FROM_JSON or JMESPath for complex queries, get_json_object for straightforward field or array‑index extraction.

JSON generation methods in MaxCompute: TO_JSON , custom UDFs ( bi_to_json_string , bi_to_json_string_ex , bi_put_json_values , bi_merge_json_array_distinct , bi_sort_json_array ), and JMESPath‑based concatenation.

Example of TO_JSON :

SELECT to_json(named_struct('a',1,'b',2));

Example of a custom UDF for merging values:

SELECT bi_put_json_values('{"fav":[12,3,4]}','name',bi_to_json_string_ex(name),'age',bi_to_json_string_ex(age));

Example of JMESPath merge to add a field to each array element:

SELECT jmespath_get_json('[{"a":1,"b":"b0"},{"a":1,"b":"b1"}]','map(&merge(@,{"c":1}),@)');

Conclusion: choose the method based on whether you need to generate JSON from scratch, modify shallow structures, or perform conditional, deep transformations.

PerformanceJSONMaxComputeUDFDataEngineeringJMESPath
DaTaobao Tech
Written by

DaTaobao Tech

Official account of DaTaobao Technology

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.