Microsoft Introduces Jigsaw: An AI Tool to Boost Large Language Model Code Generation

Microsoft's Jigsaw tool leverages post‑processing and user feedback to improve large language model code synthesis, especially for the Python Pandas API, raising accuracy to over 80% and aiming to automate code verification and debugging for developers.


Microsoft announced Jigsaw, a new tool designed to improve the performance of large language models (LLMs) for code generation by employing post‑processing techniques that understand program syntax and semantics and by incorporating user feedback.

Jigsaw accepts multi‑modal input, combining natural‑language intent with input/output examples, to synthesize code for the Python Pandas API, a widely used data‑science library.

According to Microsoft, as LLMs evolve to generate code from developer intent, Jigsaw can play a key role in enhancing system accuracy.

Large language models such as OpenAI's Codex can generate code from natural‑language descriptions, but the output may fail to compile, fail to run, or be semantically incorrect, so developers must still review it.

Jigsaw aims to automate parts of this review process, increasing developer productivity when using Codex‑like models.

The tool automates checking whether generated code compiles, interpreting error messages, and testing the code against expected input/output examples to ensure quality.
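The validation step described above can be sketched in a few lines of Python. This is a minimal illustration, not Jigsaw's actual implementation: it assumes the generated snippet reads its input from a variable named `df` and leaves its result in `out` (both names are hypothetical conventions chosen for this example).

```python
import pandas as pd

def validate_candidate(code: str, df_in: pd.DataFrame,
                       df_expected: pd.DataFrame) -> bool:
    """Check that a generated snippet compiles, runs, and reproduces
    the expected output dataframe."""
    try:
        compiled = compile(code, "<generated>", "exec")  # syntax check
    except SyntaxError:
        return False
    scope = {"pd": pd, "df": df_in.copy()}
    try:
        exec(compiled, scope)  # runtime check
    except Exception:
        return False
    result = scope.get("out")
    # Semantic check: the result must match the expected I/O example.
    return isinstance(result, pd.DataFrame) and result.equals(df_expected)
```

A candidate that fails any of the three stages — syntax, execution, or output comparison — is rejected and can be handed to a repair stage.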

In the ICSE 2022 paper “Jigsaw: Large Language Models meet Program Synthesis”, Microsoft evaluated Jigsaw on Python Pandas tasks, allowing users to provide an English description, input dataframe, and expected output dataframe, and receive synthesized code.

Jigsaw preprocesses English queries with appropriate context to build inputs for LLMs, creating correct outputs in about 30% of cases; when code fails, a post‑processing stage initiates repairs.
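A query of this shape — an English description plus an input/expected‑output dataframe pair — might be serialized into an LLM prompt as sketched below. The prompt format and field names here are illustrative assumptions, not Jigsaw's actual preprocessing.

```python
import pandas as pd

# Hypothetical multi-modal query: natural-language intent plus an
# input/expected-output dataframe pair, as the paper describes.
query = {
    "intent": "Keep rows where score is at least 50 and sort by name",
    "input": pd.DataFrame({"name": ["bo", "al"], "score": [40, 90]}),
    "expected": pd.DataFrame({"name": ["al"], "score": [90]}),
}

def build_prompt(q: dict) -> str:
    """Serialize the query into a text prompt for a code-generating LLM."""
    return (
        f"# Task: {q['intent']}\n"
        f"# Input dataframe:\n{q['input'].to_string(index=False)}\n"
        f"# Expected output:\n{q['expected'].to_string(index=False)}\n"
        "# Python Pandas code:\n"
    )
```

Including the concrete example pair in the prompt gives the model context beyond the English description, and the same pair later doubles as a test case for the generated code.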

The post‑processing applies three transformations derived from observed failure patterns in GPT‑3 and Codex, making the approach useful for both models.
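To give a flavor of what such a repair transformation can look like, the sketch below rewrites undefined variable names in a failing snippet to the closest in‑scope name. This is a simplified, hypothetical stand‑in for the kind of AST‑level transformation the paper describes, not one of Jigsaw's actual three transformations.

```python
import ast
import difflib

def repair_variable_names(code: str, in_scope: list[str]) -> str:
    """Rewrite variable reads that are not in scope to the closest
    in-scope name, returning the repaired source."""
    tree = ast.parse(code)

    class Renamer(ast.NodeTransformer):
        def visit_Name(self, node: ast.Name) -> ast.Name:
            # Only repair reads; leave assignment targets alone.
            if isinstance(node.ctx, ast.Load) and node.id not in in_scope:
                match = difflib.get_close_matches(node.id, in_scope, n=1)
                if match:
                    node.id = match[0]
            return node

    return ast.unparse(Renamer().visit(tree))
```

For example, a model that hallucinates the name `dff` when only `df` is in scope produces code this pass can mechanically fix before re-running the validation step.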

Experiments show Codex alone achieves roughly 30% accuracy, while Jigsaw raises accuracy above 60%, and with user feedback exceeds 80%; future work will extend Jigsaw beyond Pandas to other APIs and programming languages.

More details can be found on Microsoft’s official blog.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: AI, large language models, software engineering, code synthesis, Microsoft Jigsaw, program synthesis, Python Pandas
Written by

IT Services Circle

Delivering cutting-edge internet insights and practical learning resources. We're a passionate and principled IT media platform.
