Tagged articles
4 articles
Instant Consumer Technology Team
Jul 9, 2025 · Artificial Intelligence

How Easy Dataset Automates High‑Quality LLM Fine‑Tuning Data from Unstructured Docs

The article introduces Easy Dataset, a GUI‑driven framework that transforms heterogeneous documents into high‑quality, persona‑driven fine‑tuning data for large language models. It details the framework's architecture, core contributions, and experimental validation on financial QA, and compares it with existing data‑synthesis tools.

Fine-tuning · GUI · LLM
12 min read
phodal
Jan 7, 2024 · Artificial Intelligence

How UnitGen Generates High‑Quality Code Datasets for Private AI Models

UnitGen, a dataset generation framework derived from UnitEval, combines unified prompts, quality pipelines, and extensible thresholds with language‑specific context strategies and ArchGuard checks to produce both documentation and test datasets for private AI code‑generation models. It builds on the open‑source Chapi AST engine.

AI code generation · AST analysis · Software Testing
8 min read