Artificial Intelligence 6 min read

Testing Claude 3.7’s Code Generation: A Hands‑On AI Coding Journey

The author conducts a step‑by‑step experiment with Claude 3.7, prompting it to create a complex Python program that draws vertically arranged letters, iteratively fixing installation errors, visual glitches, and shape imperfections, and ultimately producing an 880‑line script that rivals human‑written code.

Software Engineering 3.0 Era

Mar 16, 2025

Testing Claude 3.7’s Code Generation: A Hands‑On AI Coding Journey

Recent media hype quoted Anthropic CEO Dario Amodei claiming AI could write 90% of code within six months and almost all code within a year. To assess this, the author performed a personal test using Claude 3.7 Sonnet.

The test began with a challenging prompt: write a Python program that composes the words “software engineering 3.0” into a vertical “SOFTWARE ENGINEERING” shape, with the two words separated and displayed vertically. Claude generated an initial script of about 100 lines that complied with coding standards, was readable, and produced the expected result without runtime errors.

Expanding the task, the author asked Claude to extend the program to over 800 lines, covering more detailed drawing logic. The resulting code remained syntactically correct and fulfilled the visual requirements.

When the script failed because the matplotlib library was missing, the author installed it with pip install matplotlib numpy pillow, fed the error message back to Claude, and received an immediate correction that resolved the import issue.

After the fix, the program ran but the generated image was incomplete. Claude responded by splitting the image into four parts and saving the output as an SVG file, addressing the visual gap.

Further prompting asked for the letters to be arranged in two vertical rows. Claude expanded the code to roughly 300 lines, improving the layout and successfully rendering two rows of letters.

The author also tried the same prompt with another model (豆包), which misinterpreted the instructions and produced nonsensical output, highlighting a clear performance gap between the two models.

Despite progress, Claude’s output still had minor flaws: the letter O was discontinuous, the left side of N was broken, and the shapes of A and W were not realistic. Additional iterations refined these issues, eventually producing a final script of over 880 lines. The script included robust error handling, such as checking for the svgwrite library and falling back to PNG generation when SVG creation failed.

In the end, the author observes that Claude 3.7 generated code that is well‑structured, with clear variable names and comments (in Chinese), and that its overall quality approaches or even exceeds that of many human programmers.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Python AI code generation code debugging Claude 3.7 LLM prompting

Written by

Software Engineering 3.0 Era

With large models (LLMs) reshaping countless industries, software engineering is leading the charge into the Software Engineering 3.0 era—model-driven development and operations. This account focuses on the new paradigms, theories, and methods of SE 3.0, and showcases its tools and practices.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.