Tagged articles

fine-tuning

139 articles · Page 2 of 2

Apr 15, 2024 · Artificial Intelligence

Fine‑Tuning Large Language Models: A Practical Guide Using Qwen‑14B on the 360AI Platform

This article explains the concept, motivations, and step‑by‑step workflow for fine‑tuning large language models—specifically Qwen‑14B—covering data preparation, training commands with DeepSpeed, hyper‑parameter settings, evaluation, and deployment via FastChat, all illustrated with code snippets and configuration details.

AI · DeepSpeed · FastChat

0 likes · 10 min read

Rare Earth Juejin Tech Community

Apr 12, 2024 · Artificial Intelligence

Typical Business and Technical Architectures for Large Language Model Applications

This article reviews the common business and technical architectures used in large language model (LLM) applications, explains AI Embedded, AI Copilot, and AI Agent modes—including single‑ and multi‑agent systems—and offers guidance on selecting appropriate technology stacks such as prompt‑only, function‑calling agents, RAG, and fine‑tuning.

AI Agent · LLM · RAG

0 likes · 9 min read

NewBeeNLP

Apr 7, 2024 · Artificial Intelligence

Can Large Language Models Learn Recommendation Knowledge? A NL‑Simulated Auxiliary Task

This article reviews a recent study that bridges the knowledge gap between large language models and recommendation systems by generating natural‑language auxiliary tasks, fine‑tuning the models, and achieving notable performance gains on Amazon domain benchmarks.

AI research · Large Language Models · fine-tuning

0 likes · 4 min read

OPPO Kernel Craftsman

Mar 22, 2024 · Artificial Intelligence

InternLM Model Fine-Tuning Tutorial with XTuner: Chat Format and Practical Implementation Guide

This tutorial walks through fine‑tuning Shanghai AI Lab’s open‑source InternLM models with XTuner, explaining chat‑format conventions, loading and inference (including multimodal InternLM‑XComposer), dataset preparation, configuration sections, DeepSpeed acceleration, and memory‑efficient QLoRA details for 7B‑parameter chat models.

Chat Format · DeepSpeed · HuggingFace

0 likes · 22 min read

TAL Education Technology

Mar 20, 2024 · Artificial Intelligence

Understanding AI: From Brain Differences to Data Science Practices and Large Model Applications

This article explains why current AI cannot achieve self‑awareness, outlines data‑science steps for large models—including preprocessing, exploratory analysis, modeling, and evaluation—then surveys general and vertical applications of large language models and details a complete machine‑learning workflow with transformer fine‑tuning techniques.

AI · Applications · Large Language Models

0 likes · 14 min read

Xiaohe Frontend Team

Mar 6, 2024 · Artificial Intelligence

What the New “Generative AI Act Two” Reveals About the Next AI Wave

Sequoia Capital’s “Generative AI Act Two” report highlights a shift from hype‑driven model releases to user‑centric, end‑to‑end solutions, emphasizing the rise of foundational models as components, the importance of developer tools, emerging RAG and fine‑tuning techniques, and the evolving competitive landscape.

AI market · Foundational models · Generative AI

0 likes · 6 min read

Alibaba Cloud Big Data AI Platform

Feb 29, 2024 · Artificial Intelligence

Deploy and Fine‑Tune Qwen1.5 LLM with Alibaba PAI‑QuickStart

This article introduces Alibaba Cloud's open‑source Qwen1.5 large language model series, highlights its multilingual support, human‑preference alignment, and long‑context capabilities, and provides step‑by‑step guidance on using PAI‑QuickStart for model deployment, fine‑tuning, and Python SDK integration.

Model Deployment · PAI-QuickStart · Qwen1.5

0 likes · 9 min read

Rare Earth Juejin Tech Community

Feb 18, 2024 · Artificial Intelligence

Llama 2: Open Foundation and Fine‑Tuned Chat Models – Overview and Technical Details

The article provides a comprehensive overview of Meta’s Llama 2 series, detailing model sizes, pre‑training data, architectural enhancements, supervised fine‑tuning, RLHF procedures, safety evaluations, reward‑model training, and iterative improvements, highlighting its open‑source release and comparative performance.

AI safety · Llama2 · RLHF

0 likes · 27 min read

Open Source Tech Hub

Jan 28, 2024 · Artificial Intelligence

How to Fine‑Tune a Text Classification Model with ModelScope’s PyTorch Trainer

This guide explains how to use ModelScope’s trainer components to fine‑tune a pretrained backbone for text classification, covering dataset loading, configuration modification, trainer construction, training, evaluation, prediction, and checkpoint management with concrete code examples.

ModelScope · PyTorch · Text Classification

0 likes · 11 min read

DeWu Technology

Jan 22, 2024 · Artificial Intelligence

How to Integrate Business Systems with LLMs: Prompt, RAG, and Fine‑Tuning Strategies

This article outlines three practical approaches—direct prompting, retrieval‑augmented generation (RAG), and fine‑tuning—to connect enterprise applications to large language models, explains key prompt‑engineering techniques, details RAG workflow and vector‑database integration, and provides step‑by‑step guidance for fine‑tuning on the KubeAI platform.

AI for business · KubeAI · LLM integration

0 likes · 20 min read

Alibaba Cloud Big Data AI Platform

Jan 12, 2024 · Artificial Intelligence

How to Fine‑Tune and Deploy Mixtral 8x7B MOE Model on Alibaba Cloud PAI

This guide walks AI developers through downloading the Mixtral 8x7B MOE large language model, fine‑tuning it with Swift or DeepSpeed on Alibaba Cloud PAI‑DSW, testing inference with Transformers, and finally deploying the tuned model as an online service using PAI‑EAS.

Alibaba Cloud · DeepSpeed · Mixtral

0 likes · 13 min read

Alibaba Cloud Big Data AI Platform

Jan 12, 2024 · Artificial Intelligence

Deploy and Fine‑Tune Mixtral‑8x7B on Alibaba Cloud PAI: A Step‑by‑Step Guide

This guide introduces the open‑source Mixtral‑8x7B large language model, explains its architecture and performance, and provides detailed instructions for using Alibaba Cloud PAI‑QuickStart to deploy, invoke via API or SDK, and fine‑tune the model with LoRA on Lingjun GPU resources.

Alibaba Cloud PAI · Mixtral · Model Deployment

0 likes · 16 min read

Sohu Tech Products

Dec 27, 2023 · Artificial Intelligence

OCR-Based Video Review System: Technology Selection, Optimization, and Model Fine-Tuning

This article describes an OCR‑based video review system built on PaddleOCR’s DB detector and SVTR recognizer, combined with multi‑level frame deduplication, message‑queue task decoupling, Redis prioritization, and dynamic thread‑pool scheduling. With models fine‑tuned on 5,000 samples, the system cut the daily frame count from 794 million to 3.6 million, automatically detects over 230 abnormal videos per day, and replaced three manual reviewers. Future plans include GPU acceleration and cross‑instance gRPC dispatch.

AI · OCR · PaddleOCR

0 likes · 20 min read

Alibaba Cloud Big Data AI Platform

Dec 15, 2023 · Artificial Intelligence

How to Fine‑Tune and Deploy Qwen‑72B‑Chat on Alibaba Cloud PAI

This guide walks you through preparing the environment, downloading Qwen‑72B‑Chat, performing LoRA fine‑tuning on PAI‑DSW, merging weights, and deploying the model for offline inference, web UI, API, and PAI SDK services on Alibaba Cloud.

LoRA · PAI · Qwen-72B

0 likes · 12 min read

DataFunSummit

Nov 13, 2023 · Artificial Intelligence

SWIFT: A Scalable Light‑Weight Training and Inference Framework for Efficient Model Fine‑Tuning

SWIFT is an open‑source, PyTorch‑based framework that integrates multiple efficient fine‑tuning methods such as LoRA, QLoRA, Adapter, and the proprietary ResTuning, enabling developers to fine‑tune large language and multimodal models on consumer‑grade GPUs with significantly reduced memory and compute requirements.

LoRA · ModelScope · PyTorch

0 likes · 13 min read

Baobao Algorithm Notes

Nov 7, 2023 · Artificial Intelligence

A Complete Technical Guide to LLM Foundations, Advanced Topics, Fine‑Tuning, and LangChain Applications

This article provides an in‑depth technical overview of large language models (LLMs), covering core model families, architectural differences, emergent abilities, common challenges such as repetition and token limits, detailed fine‑tuning strategies including PEFT, practical guidance for training custom models, and a thorough introduction to the LangChain framework with code examples, core concepts, and troubleshooting tips for building LLM‑powered applications.

LLM · LangChain · Vector Store

0 likes · 97 min read

UCloud Tech

Sep 11, 2023 · Artificial Intelligence

Build a Soul‑Healing Chatbot with LangChain & Llama 2: A Step‑by‑Step Guide

This article walks through constructing a domain‑specific, soul‑healing chatbot using LangChain and Llama 2, comparing fine‑tuning versus external knowledge bases, detailing environment setup, data loading, text splitting, embedding with a Chinese model, vector store creation, prompt engineering, inference, and optimization strategies.

Knowledge Base · LangChain · Llama 2

0 likes · 14 min read

AI Large Model Application Practice

Sep 6, 2023 · Artificial Intelligence

Prompt Engineering vs Fine‑Tuning: How to Choose the Best Strategy for Reliable LLM Outputs

This article compares Prompt Engineering and Supervised Fine‑Tuning for large language models, explains their principles, showcases common prompt patterns such as Chain‑of‑Thought, ReAct and Self‑Ask, outlines fine‑tuning stages and trade‑offs, and provides practical guidance on selecting the most suitable approach for specific enterprise AI Agent scenarios.

AI Agent · LLM · Prompt engineering

0 likes · 17 min read

JD Tech

Jul 31, 2023 · Artificial Intelligence

Local Deployment, Fine‑tuning, and Inference of the Open‑source Alpaca‑LoRA Model on GPU Servers

This article details the step‑by‑step process of installing GPU drivers, setting up a Python environment, deploying the open‑source Alpaca‑LoRA large language model, fine‑tuning it with Chinese data on a multi‑GPU server, and running inference, while discussing practical challenges and performance observations.

Alpaca · GPU · LoRA

0 likes · 14 min read

Rare Earth Juejin Tech Community

Jul 30, 2023 · Artificial Intelligence

ChatGPT Technical Analysis Series – Part 2: GPT1, GPT2, and GPT3 (Encoder vs Decoder, Zero‑Shot, and Scaling)

This article reviews the evolution of the GPT family from GPT‑1 to GPT‑3, comparing encoder and decoder architectures, explaining the shift from supervised fine‑tuning to zero‑shot and few‑shot learning, and highlighting the architectural and training innovations that enabled large‑scale language models.

GPT · LLM · Transformer

0 likes · 13 min read

Network Intelligence Research Center (NIRC)

Jul 29, 2023 · Artificial Intelligence

Getting Started with GPT: How Generative Pre‑Training and Discriminative Fine‑Tuning Work

This article explains GPT's two‑stage learning—unsupervised generative pre‑training on large raw corpora followed by discriminative fine‑tuning on labeled tasks—detailing the underlying Transformer decoder architecture, loss functions, and task‑specific input transformations.

GPT · Generative Pre‑Training · NLP

0 likes · 5 min read

Alibaba Cloud Developer

Jul 26, 2023 · Artificial Intelligence

How to Fine‑Tune and Deploy Llama 2 on Alibaba Cloud PAI: Step‑by‑Step Guide

This guide walks you through using Alibaba Cloud's PAI platform to quickly fine‑tune Llama 2 with LoRA or full‑parameter methods, deploy the models as online inference services, and launch an interactive WebUI, covering preparation, data formatting, training jobs, and deployment details.

AI Deployment · Alibaba Cloud · Llama2

0 likes · 15 min read

Alibaba Cloud Big Data AI Platform

Jul 25, 2023 · Artificial Intelligence

Fine‑Tune and Deploy Llama 2 on Alibaba Cloud PAI in Minutes

This guide walks you through using Meta's open‑source Llama 2 models on Alibaba Cloud's PAI platform, covering low‑code LoRA fine‑tuning, full‑parameter fine‑tuning with PAI‑DSW, and rapid WebUI deployment via PAI‑EAS, complete with step‑by‑step instructions, code snippets, and resource requirements.

AI · Alibaba Cloud · Llama2

0 likes · 16 min read

Baobao Algorithm Notes

Jul 23, 2023 · Artificial Intelligence

Why Cold Starts, Reward Hacking, and Evaluation Matter in LLM Training

The article analyzes key challenges in large‑language‑model pipelines—including the necessity of cold‑start pretraining, the pitfalls of reward‑model hacking, efficiency‑effectiveness trade‑offs, evaluation difficulties, and downstream fine‑tuning limits—offering practical insights for more reliable LLM development.

Efficiency · LLM · RLHF

0 likes · 9 min read

Programmer DD

Jul 10, 2023 · Artificial Intelligence

OpenAI Opens GPT‑4 API to All Paying Users – What Developers Need to Know

OpenAI has opened its GPT‑4 API, along with GPT‑3.5 Turbo, DALL·E and Whisper APIs, to all paying customers, announced fine‑tuning tests, a new 8K‑token context, upcoming model retirements, and plans to expand access to new developers by month‑end.

API · GPT-4 · OpenAI

0 likes · 3 min read

Cloud Native Technology Community

Jun 28, 2023 · Artificial Intelligence

Building and Deploying Custom Large Language Models with Alauda Cloud‑Native MLOps

This article explains how enterprises can use the Alauda MLOps platform to quickly set up, fine‑tune, and deploy private large language models on cloud‑native infrastructure, covering notebook preparation, GPU allocation, model download, inference service creation, distributed training pipelines, and Docker image building.

AI · MLOps · fine-tuning

0 likes · 9 min read

DataFunTalk

Jun 23, 2023 · Artificial Intelligence

DeepKE-LLM: An Open‑Source Large Language Model Toolkit for Knowledge Extraction

DeepKE-LLM is an open‑source, extensible knowledge‑graph extraction framework that leverages large language models for entity, relation, and attribute extraction, supports multiple LLMs, provides installation scripts, various usage modes, fine‑tuning pipelines, and integrates with the KnowLM project for advanced instruction‑following capabilities.

DeepKE · Knowledge Extraction · LLM

0 likes · 8 min read

Rare Earth Juejin Tech Community

Jun 12, 2023 · Artificial Intelligence

Comprehensive Guide to Using OpenAI APIs: Models, Prompts, Embeddings, Fine‑Tuning, LangChain, and Multimodal Applications

This article provides a detailed, step‑by‑step tutorial on OpenAI’s language models, API endpoints, prompt engineering, embeddings, moderation, fine‑tuning, LangChain workflows, memory management, and multimodal capabilities such as audio transcription and image generation, complete with code examples and practical usage tips.

API · Embedding · LangChain

0 likes · 45 min read

JD Retail Technology

May 18, 2023 · Artificial Intelligence

Local Deployment, Inference, and Fine‑tuning of the Vicuna‑7B Large Language Model

This article details the step‑by‑step process of preparing the environment, merging weights, installing dependencies, running inference, evaluating Vicuna‑7B against other models, and attempting fine‑tuning, while highlighting performance results, encountered issues, and future work for large language model deployment.

GPU · Model Deployment · Vicuna

0 likes · 11 min read

JD Retail Technology

May 16, 2023 · Artificial Intelligence

Deploying and Fine‑Tuning the Alpaca‑LoRA Large Language Model on a Multi‑GPU Server

This guide details the end‑to‑end process of installing GPU drivers, setting up a Python environment, deploying the open‑source Alpaca‑LoRA model, fine‑tuning it with Chinese data on a multi‑GPU server, and performing inference, while highlighting practical challenges and performance observations.

Alpaca-LoRA · Deep Learning · GPU

0 likes · 11 min read

DataFunSummit

Apr 21, 2023 · Artificial Intelligence

Fine‑Tuning a ViT Image Classification Model on a Small Flower Dataset Using ModelScope

This tutorial walks through the complete process of fine‑tuning a Vision Transformer (ViT) model for 14‑class flower image classification on ModelScope, covering dataset preparation, model loading, training configuration, evaluation, and inference with practical code examples.

Deep Learning · ModelScope · Python

0 likes · 14 min read

Top Architect

Apr 21, 2023 · Artificial Intelligence

Fine‑Tuning LLaMA‑7B with Alpaca‑LoRA to Build a Chinese ChatGPT

This article explains why and how to fine‑tune the LLaMA‑7B model using the cheap Alpaca‑LoRA approach, covering hardware requirements, dataset preparation, LoRA training, optional model merging and quantization, and provides ready‑to‑run code snippets for single‑ and multi‑GPU setups.

Alpaca-LoRA · GPU · LLM

0 likes · 10 min read

ELab Team

Sep 23, 2022 · Artificial Intelligence

Fine‑Tune a Chinese BERT Model for Cloze Tasks in 30 Minutes

This tutorial walks you through NLP fundamentals, the evolution of BERT, the concept of pre‑trained models, and a step‑by‑step guide to fine‑tune a Chinese BERT on a cloze‑style task, complete with code snippets and verification results.

BERT · Chinese · Cloze Task

0 likes · 13 min read

DataFunTalk

Jun 30, 2022 · Artificial Intelligence

OBERT: A Billion‑Parameter Pretrained Language Model for Large‑Scale NLP Applications

The OPPO XiaoBu team introduced OBERT, a series of 100M‑, 300M‑, and 1B‑parameter pretrained language models that leverage massive TB‑scale corpora, multi‑granular masking, retrieval‑augmented training, and distributed acceleration to achieve state‑of‑the‑art results on CLUE and KgCLUE benchmarks while enabling efficient industrial deployment.

Knowledge augmentation · NLP · fine-tuning

0 likes · 12 min read

Code DAO

May 19, 2022 · Artificial Intelligence

Semi‑Supervised Training Methods for Transformers

This article explains an end‑to‑end semi‑supervised training pipeline for Transformer‑based NLP models, detailing the unsupervised language‑model pre‑training, supervised fine‑tuning, and the internal architecture of embeddings, encoder layers, and downstream tasks such as text classification and NER.

BERT · Masked Language Model · NLP

0 likes · 9 min read

Sohu Tech Products

Nov 4, 2020 · Artificial Intelligence

Understanding BERT: Architecture, Pre‑training, Fine‑tuning and Applications in Modern NLP

This article provides a comprehensive overview of BERT and related NLP advances, covering its historical context, model architecture, input‑output mechanisms, comparisons with CNNs, word‑embedding evolution, pre‑training strategies like MLM and next‑sentence prediction, and practical guidance for fine‑tuning and feature extraction.

BERT · NLP · Transformer

0 likes · 17 min read

Xueersi Online School Tech Team

Jan 17, 2020 · Artificial Intelligence

Fine‑tuning BERT for Sentence Pair Similarity in an Online Education Platform

This article describes how a BERT‑based model is fine‑tuned to compute sentence‑pair similarity for improving recommendation accuracy in an online school, detailing the architecture, training mechanisms, code implementation, experimental results, and future extensions such as sentiment analysis.

BERT · Chinese NLP · Deep Learning

0 likes · 20 min read

Amap Tech

Jan 3, 2020 · Artificial Intelligence

Machine Learning Solutions for User Feedback Intelligence at Amap (Gaode Maps)

Amap replaced its rule‑based feedback pipeline with a three‑stage, LSTM‑driven system that combines word2vec embeddings and structured fields, achieving over 96% classification accuracy, cutting manual workload by 80%, and slashing per‑task costs while enabling scalable, data‑driven map quality improvements.

Gaode Maps · LSTM · NLP

0 likes · 14 min read

DataFunTalk

Nov 24, 2018 · Artificial Intelligence

Comprehensive Guide to Fine‑Tuning BERT on Chinese Datasets

This article provides a step‑by‑step guide for fine‑tuning Google’s open‑source BERT on Chinese datasets, covering model download, processor customization, code examples, training commands, and insights into the underlying TensorFlow estimator architecture and deployment considerations.

BERT · Chinese NLP · TensorFlow

0 likes · 11 min read
