How AI Can Turn a Code Maze into a Knowledge Highway for New Developers
New developer Li Ming’s frustrating onboarding experience highlights hidden business rules, undocumented code, and poor knowledge transfer, prompting him to build an AI‑driven knowledge base that links code changes, requirements, and operational docs, ultimately streamlining troubleshooting, accelerating feature development, and improving knowledge retention across teams.
1. Origin
Li Ming, a newly hired R&D member at an internet company, started his career with high expectations but faced a harsh reality within his first few weeks.
First week: When assigned a feature, his mentor only said, "This functionality was done before, refer to the historical code." The repository lacked comments, variable names were cryptic, and no requirement documents existed. After a blind modification, the feature caused a production incident due to a hidden business rule known only to senior staff.
Second week: Tester Xiao Zhang asked whether the change would affect order status flow. Li Ming was unaware of the related workflow and could not answer.
Third week: The product manager demanded an urgent fix for a legacy issue, but only scattered meeting notes from three years ago were found. The operations team spent excessive time answering repetitive questions like "What does this error mean?" and "Which service does this depend on?"
Reflection: Li Ming wondered why every change felt like defusing a bomb, why knowledge was trapped in senior employees' minds, and how an AI that could directly map code to requirements would be invaluable.
2. Solution Approach
The repeated frustrations made Li Ming realize the problems required a systematic solution.
He imagined linking all scattered knowledge points to resolve the issues.
First attempt: Remembering his mentor's mention of large-model technology, he wrote a simple script that indexed requirement documents and code commit records, enabling keyword search across related documents.
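A first pass like this can be a few dozen lines. The sketch below is illustrative, not Li Ming's actual script: the document IDs, texts, and tokenizer are hypothetical, and it simply builds an inverted index over requirement documents and commit messages, then answers keyword queries.

```python
import re
from collections import defaultdict

def build_index(docs):
    """docs: mapping of document id -> raw text (requirement docs, commit messages)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Crude tokenizer: lowercase alphanumeric runs (keeps tags like #101).
        for token in re.findall(r"[a-z0-9_#-]+", text.lower()):
            index[token].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every keyword in the query."""
    tokens = re.findall(r"[a-z0-9_#-]+", query.lower())
    if not tokens:
        return set()
    hits = [index.get(t, set()) for t in tokens]
    return set.intersection(*hits)

# Hypothetical corpus: one requirement document and one commit message.
docs = {
    "REQ-101": "order status flow: pending -> paid -> shipped",
    "commit-a1b2": "Fix #101 guard order status transition for refunds",
}
index = build_index(docs)
print(sorted(search(index, "order status")))  # both documents mention these keywords
```

Even this naive AND-of-keywords lookup already answers questions like "which commits touched the order status flow?", which is exactly the blind-searching pain from the first week.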
Initial validation: After he set up a basic intelligent agent, the product manager queried a historical feature; the agent retrieved the two-year-old requirement document and explained it, a clear improvement over blind searching.
System upgrade: Inspired by the initial success, Li Ming identified three key pillars:
Basic query: allow newcomers and product staff to quickly find standard answers to common business problems.
Knowledge association: connect code changes with requirement documents and incident records to build a demand‑driven knowledge base.
Intelligent hints: automatically surface historical experience when developing new requirements.
Practical application: While developing a new feature, he aggregated related historical requirements, code, and operations records, making the understanding deeper for himself and enabling new interns to onboard quickly.
3. Large‑Model Application Stage‑1
Stage 1 uses basic prompt engineering: asking a general-purpose model common work-related questions directly.
4. Large‑Model Application Stage‑2
4.1 Architecture Diagram
4.2 Technical Route
Note: This example uses DIFY (a large-model workflow platform). Internal teams should use their own secure large-model tools to avoid permission and legal risks.
4.3 Result Demonstration – DMS Technical Expert Practice
4.3.1 Recommended Corpus
Examples of essential documents:
Classic requirement TRD and ERD collections.
ERD documents help the model quickly understand system architecture and explain business knowledge.
TRD documents enable the model to provide professional technical opinions and answer system/technology questions.
System‑wide documentation (database design, system design, business function sharing) supplements the knowledge base.
Recommended: R&D notes and common issues – the expert can combine documentation with historical cases to prevent incidents.
Examples:
Historical online issues to avoid recurrence.
R&D/Product Q/A documents to help quickly locate and solve problems.
Required: DMS system PRD – helps the model understand business and answer specific requirement questions.
Required: Collection of common system pitfalls (e.g., pre‑warming before release, shared Redis risks, MQ traffic spikes).
4.3.2 Recommended Prompts
1. Problem answering: provide accurate information for product managers and assist developers or non‑system engineers.
2. Solution guidance: explain system‑level issues and offer solutions; support product teams with business knowledge.
3. Detailed system introduction: explain database design, system design, or business flow using ERD, TRD, etc.
4. Precautions: when R&D raises concerns, combine historical cases to give advice; for product queries, reference common issues and operation manuals.
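The four roles above can be combined into a single system prompt. A sketch of the wording (illustrative only, not the team's actual prompt):

```text
You are a DMS technical expert. Answer only from the attached knowledge base
(TRD/ERD documents, PRDs, R&D notes, historical incident records).
- For product managers: explain business impact in plain language and cite the
  relevant requirement document.
- For engineers: give system-level analysis and cite historical incidents or
  known pitfalls (e.g. pre-warming before release, shared Redis risks).
- When asked about precautions, combine documentation with historical cases.
- If the knowledge base does not cover the question, say so instead of guessing.
```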
4.3.3 Example
5. Large‑Model Application Stage‑3
5.1 Architecture Diagram
5.2 Implementation Route
5.2.1 Step 1: Bind requirement name to code
Method 1 – Issues API: if a commit message includes an issue/PR number (e.g., Fix #123), retrieve the associated code via the GitHub API:
curl -H "Authorization: token YOUR_TOKEN" \
"https://api.github.com/repos/{owner}/{repo}/issues/{issue_number}"
The returned JSON contains a pull_request field (for PRs) or a timeline_url for querying associated commits.
Then use the GitHub Commits API to fetch the specific code changes:
curl -H "Authorization: token YOUR_TOKEN" \
"https://api.github.com/repos/{owner}/{repo}/commits/{commit_sha}"
Method 2 – Search API: search commits directly when the file or commit message contains a requirement tag such as [REQ-123]:
curl -H "Authorization: token YOUR_TOKEN" \
"https://api.github.com/search/commits?q=repo:{owner}/{repo}+[REQ-123]+in:message"
Note: Code search requires GitHub Advanced Security.
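The matching side of both methods can be automated before any API call. The sketch below is a hypothetical helper: it extracts issue numbers (Fix #123) and requirement tags ([REQ-123], following the example above) from commit messages, and builds the Issues API URL used in the curl example without actually calling it.

```python
import re

# "Fix #123", "closes #45", etc. -> issue number; "[REQ-123]" -> requirement tag.
ISSUE_RE = re.compile(r"(?:fix|close|resolve)[sd]?\s+#(\d+)", re.IGNORECASE)
REQ_RE = re.compile(r"\[(REQ-\d+)\]")

def extract_refs(commit_message):
    """Return (issue numbers, requirement tags) referenced by a commit message."""
    return ISSUE_RE.findall(commit_message), REQ_RE.findall(commit_message)

def issue_url(owner, repo, issue_number):
    """Build the GitHub Issues API URL from the curl example (not called here)."""
    return f"https://api.github.com/repos/{owner}/{repo}/issues/{issue_number}"

issues, reqs = extract_refs("Fix #123: settlement rounding [REQ-123]")
print(issues, reqs)  # ['123'] ['REQ-123']
print(issue_url("acme", "dms", issues[0]))
```

Running this over the commit log yields (commit, requirement) pairs ready for the cleaning and upload step that follows.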
5.2.2 Step 2: Clean and annotate data, upload to knowledge base
curl --location --request POST 'https://api.dify.ai/v1/datasets' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{"name": "name", "permission": "only_me"}'
curl --location --request POST 'https://api.dify.ai/v1/datasets/{dataset_id}/documents/{document_id}/segments' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{"segments": [{"content": "Detailed content of requirement description 1", "answer": "Corresponding code implementation 1", "keywords": ["keyword 1", "keyword 2"]}, {"content": "Detailed content of requirement description 2", "answer": "Corresponding code implementation 2", "keywords": ["keyword 3", "keyword 4"]}]}'
5.2.3 Step 3: Configure workflow (illustrative diagram)
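The clean-and-upload step above can be scripted instead of hand-running curl. A sketch, assuming the segments payload shape shown in the curl example (the record contents and keyword lists are illustrative), that builds the request body from cleaned requirement/code pairs:

```python
import json

def build_segments(pairs):
    """pairs: list of (requirement_text, code_text, keywords) from the cleaning step."""
    return {
        "segments": [
            {"content": req, "answer": code, "keywords": list(keywords)}
            for req, code, keywords in pairs
        ]
    }

# Hypothetical cleaned pair: one requirement mapped to its implementation.
pairs = [
    ("Support POS flag for JD-bean payment", "if (param.isPosBean()) { ... }",
     ["POS", "JD beans"]),
]
payload = build_segments(pairs)
print(json.dumps(payload, indent=2))

# To upload, POST this payload to
# https://api.dify.ai/v1/datasets/{dataset_id}/documents/{document_id}/segments
# with the Bearer token, exactly as in the curl example (not executed here).
```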
5.3 Result Display
5.3.1 Historical Change Retrieval
Combine the "Transaction History Requirement Changes" knowledge base to retrieve modified code for a given change.
5.3.2 Historical Change Analysis
For product managers who cannot read code, the system summarizes the impact of changes based on the knowledge base.
5.3.3 Code Generation from TRD
Example class path:
com.jd.xstore.settlement.center.biz.service.CommonSettlementFacadeSaasImpl#calculateTotalPrice
PRD modifications:
Support POS flag for using JD beans.
Query JD member system for total beans, deduction amount, and conversion ratio.
Calculate deductible amount based on bean totals and ratios, returning remaining beans even if not used.
Perform asset simulations.
Return bean deductible amount, deduction quantity, total bean volume, and remaining balance.
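The PRD steps above amount to a bounded deduction calculation. A minimal sketch of the core arithmetic, with hypothetical numbers and field names (the real implementation lives in CommonSettlementFacadeSaasImpl and also performs asset simulations):

```python
def bean_deduction(total_beans, ratio, order_amount, use_beans):
    """
    total_beans:  beans held by the member (from the JD member system)
    ratio:        beans per unit of currency (e.g. 100 beans = 1 yuan)
    order_amount: payable amount of the order, in yuan
    use_beans:    the POS flag indicating whether the customer pays with beans
    """
    # Maximum amount the member's beans could cover, capped by the order amount.
    deductible = min(total_beans / ratio, order_amount)
    # Actual deduction applies only when the POS flag is set.
    deduction_amount = deductible if use_beans else 0.0
    beans_used = int(deduction_amount * ratio)
    return {
        "deductible_amount": deductible,
        "deduction_amount": deduction_amount,
        "beans_used": beans_used,
        "total_beans": total_beans,
        "remaining_beans": total_beans - beans_used,  # returned even when unused
    }

print(bean_deduction(total_beans=1500, ratio=100, order_amount=10.0, use_beans=True))
```

With 1,500 beans at 100 beans per yuan, a 10-yuan order is fully covered, 1,000 beans are spent, and 500 remain, matching the "return remaining beans even if not used" requirement.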
5.3.4 Similar Past Designs
Design considerations for adding a new SendPayParam type and required support.
6. Summary
Stage 1 – Basic Applications: AI assists developers in generating code snippets, testers in writing test cases, and product managers in drafting requirement documents, modestly improving efficiency.
Stage 2 – Knowledge Integration: After initial success, Li Ming built a system‑level knowledge‑base template, developed intelligent retrieval that points to exact document locations, and encouraged departments to improve documentation.
Stage 3 – Deep Applications: The mature system enables code change traceability, rapid requirement analysis for newcomers, AI‑assisted code generation, and experience inheritance by suggesting implementation ideas and key points.
The progressive plan transforms fragmented knowledge into a systematic, sustainable knowledge‑preservation mechanism.
7. Future Optimizations
Identified areas for continuous improvement:
Code generation quality depends on requirement change frequency; stable modules with little change history yield only basic generated code.
Knowledge association accuracy needs enhancement; stricter linking of commits to explicit requirement documents would improve precision.
RAG‑based generation relies heavily on accurate query‑to‑requirement matching and recall.
Li Ming plans to incorporate these optimizations into the next development phase, believing that persistent effort will lead to success.
JD Tech Talk
Official JD Tech public account delivering best practices and technology innovation.
