Tagged articles
23 articles
AI Cyberspace
Jan 29, 2026 · Artificial Intelligence

Step‑by‑Step Guide to Efficient LLM Fine‑Tuning with LoRA, QLoRA, and Llama‑Factory

This tutorial explains the concepts, methods, and practical commands for fine‑tuning large language models using efficient techniques like LoRA and QLoRA, covering model selection, resource considerations, Docker deployment, dataset preparation, training configuration, evaluation metrics, model merging, and deployment with GGUF and Ollama.

GGUF · GPU memory optimization · LLM fine-tuning
0 likes · 27 min read
AI Engineering
Jan 7, 2026 · Artificial Intelligence

Unsloth-MLX: Fine‑Tune LLMs on Mac and Seamlessly Move Code to Cloud GPUs

Unsloth‑MLX leverages Apple’s MLX framework to let Mac users with Apple Silicon fine‑tune large language models locally with a single import change, offering zero‑cost migration to cloud GPUs, supporting SFT, DPO, ORPO, GRPO training, and export to HuggingFace or GGUF formats.

Apple Silicon · GPU cloud · LLM fine-tuning
0 likes · 4 min read
Baobao Algorithm Notes
Oct 31, 2025 · Artificial Intelligence

How Risk‑Sensitive Reinforcement Learning Improves LLM Pass@K Performance

This article analyzes why standard reinforcement learning can degrade Pass@K metrics after fine‑tuning large language models, introduces a risk‑sensitive RL objective that reshapes the advantage estimator, and demonstrates through bandit and mathematical‑reasoning experiments that the RS‑GRPO method consistently boosts diversity and overall Pass@K scores across multiple LLMs.

Exploration-Exploitation · LLM fine-tuning · RS-GRPO
0 likes · 12 min read
Code Mala Tang
Oct 9, 2025 · Artificial Intelligence

Fine‑Tune a Language Model for Band Trivia with Hugging Face PEFT

This tutorial walks through installing Python dependencies, preparing a JSON‑based QA dataset, and using Hugging Face's PEFT library to fine‑tune a small FLAN‑T5 model so it can answer questions about AC/DC and other bands without that knowledge being supplied as context at inference time.

FAQ model · Hugging Face · LLM fine-tuning
0 likes · 12 min read
Data Party THU
Sep 2, 2025 · Artificial Intelligence

Gradient-Based Multi-Objective Deep Learning: Theory, Algorithms, and LLM Applications

This tutorial provides a systematic overview of gradient‑based multi‑objective optimization for deep learning, covering core solution strategies, algorithmic details, convergence and generalization analyses, and demonstrates how these methods can be applied to fine‑tune and align large language models.

Deep Learning · Gradient Methods · LLM fine-tuning
0 likes · 3 min read
21CTO
Aug 30, 2025 · Artificial Intelligence

10 Must‑Use Open‑Source AI Tools Every Developer Should Try

This article presents a curated list of ten open‑source AI tools—from instant prototyping agents and reactive notebooks to fast LLM fine‑tuning, ethical hacking assistants, local ChatGPT interfaces, and database‑integrated machine learning—explaining their key features, benefits, and why developers should adopt them to boost productivity and maintain privacy.

AI coding assistant · LLM fine-tuning · developer tools
0 likes · 19 min read
AI Algorithm Path
Jul 19, 2025 · Artificial Intelligence

Understanding LoRA and QLoRA: Techniques for Efficient LLM Fine‑Tuning

This article explains how low‑rank adaptation (LoRA) and its quantized variant (QLoRA) compress large language model weights, reduce training cost, and enable flexible adapter switching, while detailing matrix decomposition, training mechanics, and trade‑offs with concrete examples and quantitative analysis.

Adapter · LLM fine-tuning · LoRA
0 likes · 11 min read
Amap Tech
May 19, 2025 · Artificial Intelligence

Group Policy Gradient: Direct Objective Optimization for Faster Reinforcement Learning

The article introduces Group Policy Gradient (GPG), a reinforcement‑learning framework that eliminates surrogate loss functions and critic models, directly optimizes the original objective, reduces bias and variance, and achieves state‑of‑the‑art performance on both single‑modal and multimodal tasks.

AI research · LLM fine-tuning · bias reduction
0 likes · 7 min read
Alibaba Cloud Native
Mar 23, 2025 · Cloud Native

Fine‑Tune Large Language Models on Kubernetes with Argo Workflows

This article explains the challenges of fine‑tuning large language models, why Argo Workflows is an ideal Kubernetes‑native solution, and provides a step‑by‑step example using DeepSeek, covering data preparation, model selection, training, evaluation, and the benefits of automation and scalability.

AI pipelines · Argo Workflows · LLM fine-tuning
0 likes · 8 min read
Sohu Tech Products
Mar 19, 2025 · Artificial Intelligence

Easy DataSet: An Open‑Source Tool for Building Domain‑Specific Datasets and Fine‑Tuning Large Language Models

The article introduces Easy DataSet, an open‑source tool that streamlines the creation of domain‑specific datasets by aggregating public data sources, chunking Markdown documents, generating and managing QA pairs with configurable LLM endpoints, and exporting them in common formats, while outlining its architecture and future roadmap.

AI · Data Management · LLM fine-tuning
0 likes · 30 min read
Ops Development & AI Practice
Mar 19, 2025 · Artificial Intelligence

How to Fine‑Tune Large Language Models: From PEFT to Knowledge Injection

This article provides a comprehensive guide to customizing pre‑trained large language models through fine‑tuning techniques—including parameter‑efficient methods, data preparation, knowledge injection, and robust evaluation—offering practical steps, best practices, and domain‑specific considerations for achieving superior task performance.

LLM fine-tuning · data preparation · knowledge injection
0 likes · 18 min read
Architect
Mar 9, 2025 · Artificial Intelligence

Experiments with Reinforcement Learning Fine‑Tuning of a 0.5B Qwen Model on the KK Dataset

The author reports a series of reinforcement‑learning fine‑tuning experiments on a 0.5‑billion‑parameter Qwen instruct model using the KK dataset, detailing reward‑design adjustments, curriculum‑style data scaling, observed convergence issues, and hypotheses about why small models fail to develop long reasoning chains.

LLM fine-tuning · curriculum learning · reinforcement learning
0 likes · 11 min read
DevOps
May 29, 2024 · Artificial Intelligence

End-to-End Task-Oriented Dialogue Agent Construction Using Monte Carlo Simulation and LLM Fine-Tuning

This article presents an end‑to‑end approach for building task‑oriented dialogue agents by simulating user behavior with Monte Carlo methods, generating training data via LLMs, and efficiently fine‑tuning multiple large language models using LLaMA Factory, demonstrating significant improvements in intent recognition, slot filling, and contextual understanding.

Data Generation · LLM fine-tuning · Monte Carlo simulation
0 likes · 17 min read
360 Smart Cloud
Apr 15, 2024 · Artificial Intelligence

Fine‑Tuning Qwen‑14B Large Language Model: A Complete Guide

This article provides a comprehensive tutorial on fine‑tuning the Qwen‑14B large language model, covering the motivation, fine‑tuning concepts, step‑by‑step workflow, required code, DeepSpeed training parameters, testing scripts, and deployment using FastChat and the 360AI platform.

AI Model Deployment · DeepSpeed · FastChat
0 likes · 9 min read
NewBeeNLP
Feb 22, 2024 · Artificial Intelligence

Practical Tips for CPT, SFT, and LoRA in Large Language Model Fine‑Tuning

This article shares hands‑on guidance on using continual pre‑training (CPT), supervised fine‑tuning (SFT), and LoRA adapters for large language models, covering dataset size requirements, learning‑rate scheduling, warm‑up ratios, epoch strategies, and practical routing choices based on real‑world experiments.

CPT · LLM fine-tuning · LoRA
0 likes · 12 min read
NewBeeNLP
Feb 5, 2024 · Artificial Intelligence

How HiFT Slashes GPU Memory for LLM Fine‑Tuning with Hierarchical Optimization

HiFT introduces a layer‑wise hierarchical fine‑tuning strategy that freezes most parameters per step, reduces optimizer state memory, and adapts mixed‑precision training, enabling 7B and 13B models to be fine‑tuned on 16‑31 GB GPUs while maintaining competitive performance.

GPU Memory · HiFT · LLM fine-tuning
0 likes · 12 min read
DataFunTalk
Dec 29, 2023 · Artificial Intelligence

Enterprise Knowledge Assistant: Leveraging Vector Databases and Large Language Models

This article explores the emerging enterprise knowledge assistant paradigm in the era of large models, detailing traditional knowledge management challenges, solution architecture using vector databases and LLMs, core technologies such as ETL pipelines, reranking, secure fine‑tuning, and future prospects for intelligent enterprise applications.

Enterprise AI · LLM fine-tuning · knowledge management
0 likes · 11 min read
Baidu Geek Talk
Dec 19, 2023 · Industry Insights

Inside Baidu Search Innovation Contest: Winning AI Solutions Across Five Tracks

The second Baidu Search Innovation Contest attracted over 2,800 participants from 45 regions, featured five AI‑focused tracks, and highlighted champion teams that employed techniques such as LoRA‑fine‑tuned LLMs, vector‑intersection Top‑K search, GPU‑optimized algorithms, and diffusion‑based image generation to push the boundaries of search technology.

AI competition · GPU Optimization · LLM fine-tuning
0 likes · 12 min read
DaTaobao Tech
Oct 25, 2023 · Artificial Intelligence

Prompt Engineering, LLM Supervised Fine‑Tuning, and Mobile Tmall AI Assistant Application

The article explains prompt engineering techniques, supervised fine‑tuning of large language models, and their practical deployment in the Mobile Tmall AI shopping assistant, detailing ChatGPT’s generation steps, Transformer architecture, prompt clarity, delimiters, role‑play, few‑shot and chain‑of‑thought prompting, SFT versus pre‑training, LoRA adapters, data collection, Qwen‑14B training configuration, SDK‑based inference, and comprehensive evaluation.

AI Assistant · LLM fine-tuning · Model Deployment
0 likes · 14 min read
Baobao Algorithm Notes
Jul 5, 2023 · Artificial Intelligence

Session‑Level Sample Organization for Decoder‑Only LLM Fine‑Tuning

This article explains how to restructure multi‑turn dialogue data into single session‑level training samples for decoder‑only large language models, leveraging causal attention and simple position IDs, and provides a concrete implementation, performance results, and a gradient‑weight analysis.

ChatGLM2 · LLM fine-tuning · Prompt engineering
0 likes · 7 min read
phodal
Apr 12, 2023 · Artificial Intelligence

Four AI‑Driven Code Generation Techniques: From Example‑Based to Metadata‑Assisted

This article explores four distinct fine‑tuned LLaMA/ChatGLM approaches for AI‑assisted code generation—example‑based, test‑driven, metadata‑augmented, and information‑matching—detailing their training data, prompts, sample inputs and outputs, and evaluating their strengths, limitations, and suitable application scenarios.

AI code generation · LLM fine-tuning · code synthesis
0 likes · 11 min read