Building High‑Quality Code Fine‑Tuning Datasets with UnitEval: An Open‑Source Toolkit
UnitEval is an open‑source toolbox that unifies prompts, runs a static code‑quality pipeline, and exposes extensible quality thresholds so that high‑quality code datasets for AI fine‑tuning can be generated automatically. This article walks through its design principles, workflow, and usage.
Overview
UnitEval is an open‑source toolbox for building high‑quality code fine‑tuning datasets. It enforces a unified prompt format, a static code‑quality pipeline, and configurable quality thresholds.
Design Principle 1 – Unified Prompt
The same prompt template is used by the picker, the fine‑tuning data generator, and the evaluation runtime. A simplified template looks like:
Complete ${context.language} code, return rest code, no explaining
${context.framework}
``` ${context.language}
${context.relatedCode}
```
Code:
``` ${context.language}
${beforeCursor}
```

Design Principle 2 – Code‑Quality Pipeline
Before a source file is added to the dataset, UnitEval runs static analysis via the ArchGuard platform. Checks include code complexity, various bad‑smell categories (code, test), and architectural rules such as controller API design and repository SQL design. The pipeline can be extended with additional validators (e.g., OpenAPI validation, software composition analysis).
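To make the extension point concrete, here is a minimal sketch of how an extra validator could plug into such a pipeline. The `CodeValidator` interface and the line-count check are illustrative stand-ins, not UnitEval's or ArchGuard's actual API:

```kotlin
// Hypothetical validator abstraction: each check decides whether a
// source file is clean enough to enter the dataset.
interface CodeValidator {
    fun validate(source: String): Boolean
}

// Example check: reject files longer than a configured line count.
class MaxLineCountValidator(private val maxLines: Int = 300) : CodeValidator {
    override fun validate(source: String) = source.lines().size <= maxLines
}

// A file passes the pipeline only if every validator accepts it.
fun passesPipeline(source: String, validators: List<CodeValidator>) =
    validators.all { it.validate(source) }
```

Additional validators (OpenAPI validation, software composition analysis) would simply be further implementations added to the list.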
Design Principle 3 – Extensible Quality Thresholds
Quality checks are packaged as a Maven artifact. The built‑in CodeQualityType enum defines the available rule groups:
```kotlin
enum class CodeQualityType {
    BadSmell,
    TestBadSmell,
    JavaController,
    JavaRepository,
    JavaService,
}
```

Threshold values are supplied through a data class, for example:
```kotlin
data class BsThresholds(
    val bsLongParasLength: Int = 5,
    val bsIfSwitchLength: Int = 8,
    val bsLargeLength: Int = 20,
    val bsMethodLength: Int = 30,
    val bsIfLinesLength: Int = 3,
)
```

Custom rule sets can be added programmatically:
```kotlin
val ruleset = RuleSet(
    RuleType.SQL_SMELL,
    "normal",
    UnknownColumnSizeRule(),
    LimitTableNameLengthRule()
    // more rules …
)
```

Workflow
Picker Phase
Read a YAML configuration file to discover project repositories.
Clone each repository with git clone.
Select a language‑specific worker (currently Java and TypeScript are supported).
Run the language‑specific code‑quality checks defined in the pipeline.
Combine the analysis results with the unified prompt template to create a dataset entry.
Emit the generated fine‑tuning dataset.
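The picker steps above can be sketched as a single pipeline. Everything here is a hypothetical stand-in for UnitEval's real classes (`ProjectConfig`, `fetchSources`, `runQualityChecks` are invented names); only the step order follows the workflow:

```kotlin
// One entry per repository from the YAML configuration.
data class ProjectConfig(val repoUrl: String, val language: String)

// Stub: in UnitEval this would `git clone` the repo and read its source files.
fun fetchSources(config: ProjectConfig): List<String> =
    listOf("fun add(a: Int, b: Int) = a + b")

// Stub: in UnitEval this runs the static code-quality pipeline.
fun runQualityChecks(source: String): Boolean = source.isNotBlank()

fun buildDataset(configs: List<ProjectConfig>): List<String> =
    configs
        // Select a language-specific worker (Java and TypeScript today).
        .filter { it.language in setOf("java", "typescript") }
        .flatMap { cfg ->
            fetchSources(cfg)
                .filter(::runQualityChecks)
                // Combine passing files with the unified prompt template.
                .map { code ->
                    "Complete ${cfg.language} code, return rest code, no explaining\n$code"
                }
        }
```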
Eval Phase
Read evaluation configuration (LLM model, prompt template, etc.).
Execute the PromptScript using the Chocolate Factory runtime.
Validate the model output with the factory’s ValidateRule implementations.
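The eval phase follows the same shape: load a config, run the prompt through a model, then apply validation rules to the output. This skeleton mirrors those roles but does not use the Chocolate Factory's actual types; `EvalConfig` and the `complete` callback are assumptions:

```kotlin
// Evaluation configuration: which model to call and with what prompt.
data class EvalConfig(val model: String, val promptTemplate: String)

// A single validation rule applied to the model output.
fun interface ValidateRule {
    fun isValid(output: String): Boolean
}

// Run the prompt against the model (injected as a callback here),
// then accept the output only if every rule passes.
fun evaluate(
    config: EvalConfig,
    complete: (model: String, prompt: String) -> String,
    rules: List<ValidateRule>,
): Boolean {
    val output = complete(config.model, config.promptTemplate)
    return rules.all { it.isValid(output) }
}
```

Injecting the LLM call as a function keeps the evaluation loop testable without network access.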
Getting Started
Clone the repository: https://github.com/unit-mesh/unit-eval
Add the Maven artifacts to your Gradle build:

```kotlin
dependencies {
    implementation("cc.unitmesh:unit-picker:0.1.5")
    implementation("cc.unitmesh:code-quality:0.1.5")
}
```

Alternatively, download the released JAR file and run it directly.
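With the artifacts on the classpath, the thresholds from Design Principle 3 can be tuned for your project. The `BsThresholds` class below is copied from the article; the tightened values are just an example:

```kotlin
// Bad-smell thresholds as defined in the docs (defaults shown).
data class BsThresholds(
    val bsLongParasLength: Int = 5,
    val bsIfSwitchLength: Int = 8,
    val bsLargeLength: Int = 20,
    val bsMethodLength: Int = 30,
    val bsIfLinesLength: Int = 3,
)

// Stricter project-specific limits; unspecified fields keep their defaults.
val strict = BsThresholds(bsMethodLength = 20, bsLongParasLength = 4)
```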
phodal
A prolific open-source contributor who constantly starts new projects. Passionate about sharing software development insights to help developers improve their KPIs. Currently active in IDEs, graphics engines, and compiler technologies.