Tagged articles
1023 articles
Tencent Cloud Developer
Nov 8, 2023 · Artificial Intelligence

Comprehensive Overview of AI Agents: Concepts, Technical Frameworks, and Applications

The article surveys modern AI agents—software entities powered by large language models that perceive multimodal inputs, reason via brain modules, act through tools or embodied actions, employ retrieval‑augmented generation and chain‑of‑thought planning, and can operate singly (e.g., AutoGPT) or collaboratively via frameworks like Microsoft’s AutoGen—while highlighting current challenges such as controllability, memory limits, parallelism, and reliability.

AI agents · Agent Architecture · AutoGen
0 likes · 34 min read
DataFunSummit
Nov 5, 2023 · Artificial Intelligence

Enhancing Recommendation Models with Scaling Law via HCNet and MemoNet: A Memory‑Based Feature‑Combination Approach

This article presents a memory‑driven architecture (HCNet and MemoNet) that equips recommendation models with scaling‑law characteristics by storing and retrieving arbitrary feature‑combination embeddings. It evaluates multi‑hash codebooks, memory‑restoring strategies, and key‑feature selection, and reports significant offline and online performance gains.

feature interaction · large language models · memory networks
0 likes · 15 min read
Model Perspective
Nov 2, 2023 · Artificial Intelligence

Why Mathematical Modelers Must Embrace LLMs and Forget Outdated Skills

The article explains how rapid advances in data and large language models force mathematical modelers to continuously update their models and skills, discard obsolete knowledge, and adopt lifelong learning to stay effective in a fast‑changing AI‑driven environment.

Data Science · artificial intelligence · continuous learning
0 likes · 6 min read
Baidu Geek Talk
Nov 2, 2023 · Artificial Intelligence

AI-Powered Code Defect Detection: Leveraging Code Knowledge Graphs and Large Language Models

The paper presents an AI‑driven static analysis framework that builds code knowledge graphs to extract relevant slices and leverages large language models for multilingual defect prediction, achieving up to 80% F1, detecting 662 defects across 1,100 C++ modules with a 26.9% recall gain over traditional rule‑based scanners.

BERT · Software quality · code defect detection
0 likes · 9 min read
Baidu Intelligent Cloud Tech Hub
Nov 1, 2023 · Databases

How BES Powers Large-Scale Vector Search for AI Applications

This article explains the principles of vector databases, outlines the engineering practices of Baidu Intelligent Cloud BES for large‑scale vector retrieval, discusses optimization techniques such as HNSW, IVF and filter integration, and presents real‑world AI use cases and future development directions.

AI · BES · Elasticsearch
0 likes · 16 min read
DataFunSummit
Oct 30, 2023 · Artificial Intelligence

Exploring General AI, Large Language Models, Knowledge Graphs, and Reinforcement Learning – Insights from DataFun

This article presents a comprehensive overview of DaGuan Data's explorations in general artificial intelligence, large language models, knowledge graphs, reinforcement learning, compute and data requirements, and the emerging concept of Human‑Centric AGI, supplemented by a detailed Q&A session.

AGI · Knowledge Graphs · artificial intelligence
0 likes · 18 min read
DataFunSummit
Oct 27, 2023 · Artificial Intelligence

ChatGPT Technology, Domesticization Attempts, and Open‑Source Large Models

This article reviews the evolution and challenges of ChatGPT technology, describes the authors' efforts to localize and commercialize the model for the Chinese market, and introduces their open‑source Chinese large‑model initiative, including training methods, performance gaps, and future improvement directions.

ChatGPT · Chinese NLP · Model Localization
0 likes · 11 min read
Baidu Tech Salon
Oct 25, 2023 · Artificial Intelligence

Intelligent Question Answering Technology in Baidu Search: Development, Modeling, and Retrieval‑Enhanced Generation

The article surveys Baidu Search’s intelligent question‑answering system, tracing its evolution from feature‑engineered retrieval to large pre‑trained and generative models, and detailing hierarchical readers, multi‑teacher distillation, retrieval‑enhanced generation, and instruction decomposition as key techniques for delivering fast, accurate, citation‑rich answers.

Baidu Search · Retrieval Augmented Generation · knowledge distillation
0 likes · 18 min read
Baidu Geek Talk
Oct 25, 2023 · Artificial Intelligence

How Baidu Search Is Transforming Machine Question Answering with Large‑Scale AI Models

This article reviews the evolution of machine question answering, from early feature‑engineered systems to modern large‑language‑model‑driven retrieval‑augmented generation, outlines Baidu Search’s current Retriever‑Reader architecture, discusses challenges such as semantic complexity, latency and answer quality, and presents solutions including hierarchical DocMRC modeling, multi‑teacher knowledge distillation, and instruction decomposition for efficient, high‑quality answers.

Baidu · Retrieval Augmented Generation · knowledge distillation
0 likes · 18 min read
DataFunTalk
Oct 25, 2023 · Artificial Intelligence

Applying Large Language Models to Wireless Network Intelligent Operations: Opportunities, Challenges, and Platform Construction

This article examines how large language model technology can be leveraged for intelligent operation of wireless communication networks, analyzing its advantages, current challenges, platform architecture, experimental validation, and future research directions within the telecom industry.

AI · Knowledge Graph · intelligent operation
0 likes · 17 min read
Zuoyebang Tech Team
Oct 19, 2023 · Artificial Intelligence

How AI and Big Data Are Transforming Education: Insights from Zuoyebang’s Chief Scientist

At the GET2023 Education Technology Conference, Zuoyebang’s chief scientist Song Yang detailed how AI, large language models, big data, and smart hardware are reshaping learning experiences across subjects, from math problem generation to interactive programming assistants, and outlined the company’s vision for AI‑driven education.

AI in Education · Educational Technology · large language models
0 likes · 12 min read
Alimama Tech
Oct 18, 2023 · Artificial Intelligence

Technical Challenges and Directions for Large‑Model Applications in E‑commerce

Taobao Group’s ten large‑model challenges target e‑commerce AI by demanding domain‑specific pre‑training, multi‑step reasoning, extended context handling, factual reliability, intelligent tool orchestration, robust retrieval integration, fuzzy‑intent tool selection, scalable multi‑objective RLHF, improved query rewriting, and knowledge‑driven recommendation.

RLHF · e‑commerce · knowledge hallucination
0 likes · 16 min read
DaTaobao Tech
Oct 18, 2023 · Artificial Intelligence

Large Model Application Challenges for E-commerce

Taobao Group’s ten large‑model e‑commerce challenges call for researchers to build domain‑specific data pipelines, mitigate forgetting, balance expertise with generality, enable multi‑step reasoning, handle long contexts, reduce hallucinations, integrate tool use, improve fuzzy intent detection, apply multi‑objective RLHF, and generate cognitively novel recommendations.

Query Understanding · RLHF · knowledge hallucination
0 likes · 14 min read
Baidu Geek Talk
Oct 16, 2023 · Industry Insights

What Is AI‑Native Thinking and Why It Will Shape the Next Wave of Applications

The article explores the concept of AI‑native thinking, outlines the mindset and conditions needed for AI‑native applications, showcases examples such as Baidu Wenku and a legal‑assistant hackathon project, and discusses platform support, technical foundations, and emerging opportunities in the large‑model era.

AI-native · Baidu · application design
0 likes · 14 min read
Baidu Geek Talk
Oct 11, 2023 · Artificial Intelligence

How Baidu’s Qianfan 2.0 Supercharges Large‑Model Development and Deployment

The article reviews Baidu Cloud’s Qianfan 2.0 platform, detailing its expanded model catalog, dataset library, Chinese‑language enhancements, compression and speed gains, robust AI infrastructure, application templates, and end‑to‑end data‑labeling pipeline that together lower cost and accelerate large‑model adoption across industries.

AI Platform · Cloud AI · Model Deployment
0 likes · 14 min read
JD Cloud Developers
Oct 10, 2023 · Artificial Intelligence

Do Large Language Models Have a Mind? Attention, Emergence & Compression Explained

This article examines whether ChatGPT and other large language models exhibit true Theory of Mind, detailing the role of attention mechanisms, neural network architecture, emergent abilities, the Chinese‑room argument, and how compression of massive textual data underlies their apparent intelligence.

Attention Mechanism · Emergence · Neural Networks
0 likes · 30 min read
Baobao Algorithm Notes
Oct 9, 2023 · Artificial Intelligence

Demystifying RLHF and PPO for Large Language Models: Theory and Practice

This article explains why Reinforcement Learning from Human Feedback (RLHF) is crucial for LLM intelligence, outlines the three-stage training pipeline, details InstructGPT's reward model and PPO optimization, and provides a practical guide to implementing RLHF with deep‑learning frameworks.

PPO · RLHF · Reward Modeling
0 likes · 17 min read
DataFunSummit
Sep 30, 2023 · Artificial Intelligence

Causal Inference from the Perspective of Large Models

This presentation by senior AI architect He Gang explores how large language models and LLM‑powered agents can enhance causal inference tasks, detailing model‑assisted analysis, agent‑based inference methods, and multi‑agent simulations to advance causal research.

AI · LLM agents · large language models
0 likes · 2 min read
NetEase LeiHuo Testing Center
Sep 22, 2023 · Artificial Intelligence

Understanding Large Language Models and Prompt Engineering: A Practical Guide

This article provides an introductory overview of large language models (LLMs), compares popular models, explains their underlying principles, and offers practical guidance on prompt engineering, model evaluation, usage tips, and safety considerations, helping readers effectively select and apply LLMs in various scenarios.

AI · LLM · Model Evaluation
0 likes · 44 min read
Tencent Tech
Sep 20, 2023 · Artificial Intelligence

Why Do Large Language Models Hallucinate and How to Reduce It?

The article explains why large language models generate hallucinations—due to data errors, training conflicts, and inference uncertainty—and outlines data‑cleaning, model‑level feedback, knowledge augmentation, constraint techniques, and post‑processing methods such as the “Truth‑seeking” algorithm to mitigate the issue.

AI Safety · Data Quality · Knowledge Retrieval
0 likes · 8 min read
DataFunSummit
Sep 19, 2023 · Artificial Intelligence

Advances in Information Extraction: From PLM to LLM Paradigms at Alibaba DAMO Academy

This article reviews Alibaba DAMO Academy's research on information extraction, covering background concepts, PLM-era extraction paradigms, few‑shot extraction techniques, and the emerging LLM‑era approaches, while also sharing practical insights, benchmark results, and future directions.

Alibaba DAMO · Few‑Shot Learning · Retrieval Augmented Generation
0 likes · 24 min read
Ximalaya Technology Team
Sep 18, 2023 · Artificial Intelligence

Understanding Autonomous and Autopilot AI Agents: Insights from Industry Experts

The article surveys the rise of LLM‑powered AI agents, defining them as LLM + memory + planning + tool use, contrasting fully autonomous agents with human‑guided autopilot/copilot variants, outlining their benefits, risks such as hallucinations and unsafe actions, and urging modular frameworks and oversight for reliable enterprise deployment.

AI agents · Agent Framework · LLM
0 likes · 27 min read
AntTech
Sep 12, 2023 · Artificial Intelligence

Ensuring Trustworthy and Secure AI: Insights from the 2023 Pujiang Innovation Forum

The 2023 Pujiang Innovation Forum highlighted the rapid rise of generative AI, its associated security and privacy risks, and presented Ant Group's multi‑stage, multi‑layered approach—including data, training, and inference controls and three core defense technologies—to achieve safe, reliable, and open knowledge sharing in the era of large language models.

information security · knowledge sharing · large language models
0 likes · 10 min read
DaTaobao Tech
Sep 11, 2023 · Artificial Intelligence

Large Language Model Upgrade Paths and Architecture Selection

This article analyzes upgrade paths of major LLMs—ChatGLM, LLaMA, Baichuan—detailing performance, context length, and architectural changes, then examines essential capabilities, data cleaning, tokenizer and attention design, and offers practical guidance for balanced scaling and efficient model construction.

Baichuan · ChatGLM · LLM architecture
0 likes · 32 min read
DataFunSummit
Sep 9, 2023 · Artificial Intelligence

Evolution of AIGC Technology and Its Applications in Life Sciences

This article reviews the development of AIGC and generative AI technologies—including image, text, and molecular generation—explains key model advances such as diffusion and large language models, discusses their impact on drug discovery, and outlines current challenges, opportunities, and future directions.

AI in Life Sciences · AIGC · drug discovery
0 likes · 14 min read
DataFunTalk
Sep 8, 2023 · Artificial Intelligence

Knowledge Processing in the Era of Large Models: New Opportunities and New Challenges

This article examines how large language models and knowledge graphs complement each other, discussing their respective strengths, integration techniques such as prompt engineering and knowledge editing, and outlining future research directions for building large knowledge models that combine linguistic understanding with structured knowledge representation.

AI · Knowledge Graphs · Model Alignment
0 likes · 27 min read
Alibaba Cloud Developer
Aug 28, 2023 · Artificial Intelligence

AI-Driven Application Engineering: From Prompt Engineering to Autonomous Agents

This article examines how the rapid rise of generative AI reshapes application engineering by outlining AI's core characteristics, the challenges developers face, the evolution of prompt and chain-of-thought techniques, the emergence of agents and tool integration, and the future direction toward AI‑centric computing architectures.

AI · Prompt engineering · agents
0 likes · 20 min read
FunTester
Aug 22, 2023 · Artificial Intelligence

The Current State and Future Outlook of AI‑Driven Software Testing

The article examines how large‑language models, test‑case generation technologies, and model‑driven testing are reshaping software testing, discusses the challenges of applying AI to testing, and outlines future directions and skill sets for professionals seeking to leverage AI in quality assurance.

AI · Knowledge Graphs · large language models
0 likes · 14 min read
DataFunTalk
Aug 21, 2023 · Artificial Intelligence

Can We Build Large-Scale Models for Recommendation Systems?

In this talk, Zhang Pengtao, a Sina Weibo technical expert with a Ph.D. in computer applications, explores how the strong memory capabilities of NLP large language models inspire the design of independent memory mechanisms for recommendation systems, covering model concepts, HCNet & MemoNet, experimental results, and practical takeaways for enhancing recommendation model performance.

AI · Memory Mechanisms · Recommendation Systems
0 likes · 2 min read
DataFunTalk
Aug 19, 2023 · Artificial Intelligence

Applying Large Language Models to Zhihu's Bridge Platform: Use Cases, Challenges, and Solutions

This article details how Zhihu's internal Bridge platform integrates large language models for business analysis, knowledge taxonomy, natural‑language‑to‑filter conversion, and ad‑hoc data queries, describing the workflow, technical hurdles, iterative improvements, and future directions.

AI for business analytics · Prompt engineering · knowledge taxonomy
0 likes · 12 min read
DataFunTalk
Aug 16, 2023 · Artificial Intelligence

Data Engineering, Automated Evaluation, and Knowledge Graph Integration in Large Model Development

This article presents a comprehensive overview of data engineering practices, pre‑training data composition, automated model evaluation techniques, and the synergistic use of knowledge graphs within large‑scale AI model research, highlighting pipelines, quality criteria, and practical case studies.

Knowledge Graph · automation evaluation · data engineering
0 likes · 29 min read
Bilibili Tech
Aug 15, 2023 · Backend Development

Bilibili Customer Service System Architecture and Implementation

The article explains Bilibili's self‑developed customer‑service platform, describing its modular architecture, core workflows, and implementation of features such as intelligent QA with Faiss vector search, Redis‑based seat scheduling, a robust workstation, permission control, and exploration of large language models, highlighting improvements in interception rate, satisfaction, and handling time.

Backend Development · Customer Service System · Faiss vector search
0 likes · 20 min read
DataFunSummit
Aug 14, 2023 · Artificial Intelligence

State of GPT: A Programmer’s Guide to Large Language Model Fundamentals, Training, and Applications

This article provides programmers with a comprehensive overview of large language models—including their evolution, core concepts, data pipelines, model architectures, training techniques such as 3D parallelism, supervised fine‑tuning, RLHF, open‑source recipes, and emerging application ecosystems—while also highlighting current challenges and future directions.

Fine‑tuning · LLM applications · RLHF
0 likes · 43 min read
php Courses
Aug 14, 2023 · Artificial Intelligence

Guide to the Five Most Powerful Large Language Models and How to Choose Them

This article explains the fundamentals of modern large language models, outlines five of the most powerful LLMs (GPT‑4, Claude 2, Llama 2, Orca, and Cohere), and provides practical guidance on selecting and applying them across business and development use cases.

AI applications · Claude 2 · GPT-4
0 likes · 9 min read
DataFunTalk
Aug 13, 2023 · Artificial Intelligence

Applying Large Language Models to Search Advertising Satisfaction: From DNN to ERNIE and Prompt Learning

The article details how Baidu's Fengchao team leverages large language models, including a transition from DNN embeddings to ERNIE, introduces multi‑level tokenization and discrete core‑word inputs, and applies prompt learning and AIGC techniques to improve search advertising satisfaction and industry‑specific relevance modeling.

AIGC · Baidu · large language models
0 likes · 22 min read
DataFunTalk
Aug 9, 2023 · Artificial Intelligence

Key Technologies for Domain‑Specific Large Models: Insights from the World AI Conference

This report, based on Professor Xiao Yanghua’s presentation at the World AI Conference, examines why vertical domains need general large models, outlines their key capabilities such as open‑world understanding, combinatorial innovation, evaluation, complex instruction execution, task planning, and symbolic reasoning, and discusses current limitations and optimization strategies for domain‑specific deployment.

AI Evaluation · Model Optimization · Vertical AI
0 likes · 17 min read
Efficient Ops
Aug 8, 2023 · Artificial Intelligence

Rethinking Software Development in the Age of Large Language Models

The article examines fundamental challenges of applying large language models to software engineering—such as scale limits, lack of abstract reasoning, hidden tacit knowledge, and maintenance difficulties—and proposes practical recommendations for integrating AI with disciplined development practices.

AI integration · development automation · knowledge management
0 likes · 7 min read
Baidu Intelligent Cloud Tech Hub
Aug 8, 2023 · Artificial Intelligence

Unlocking LMOps: How Enterprises Can Master Large Model Operations

This article explains the evolution from traditional machine learning to the current large‑model era, introduces LMOps concepts and key technologies, compares them with MLOps, and showcases Baidu Cloud's Qianfan platform as a practical solution for building, deploying, and managing large language models in industry.

AI Operations · Baidu Cloud · LMOps
0 likes · 22 min read
DataFunTalk
Jul 27, 2023 · Artificial Intelligence

Applying AIGC in E‑commerce: Product Copy and Image Generation with Large Language Models

This article shares recent AIGC practices in e‑commerce, detailing product copy generation using GPT‑based models, image creation with Stable Diffusion, the evolution of large language models, technical solutions, experimental results, and future opportunities for AI‑driven automation in online retail.

AIGC · e‑commerce · image generation
0 likes · 18 min read
Baidu Geek Talk
Jul 26, 2023 · Artificial Intelligence

Insights on AIGC Development and Commercial Applications by Baidu's Chief Architect

Baidu’s chief architect Li Shuanglong outlined how AIGC, driven by advanced large‑language and multimodal models, is already powering commercial tools such as automated copywriting, 2D digital‑human video creation and lead‑generation chatbots, while emphasizing future progress in engineering scalability, algorithmic fidelity, data quality, and scenario‑focused applications.

AI commercialization · AI research · AIGC
0 likes · 8 min read
Rare Earth Juejin Tech Community
Jul 24, 2023 · Artificial Intelligence

Comprehensive Survey of Large Language Models: History, Key Technologies, Resources, and Future Directions

This article provides a detailed overview of large language models (LLMs), tracing their evolution from statistical and neural language models to modern pre‑trained transformers, discussing scaling, training, adaptation, utilization, evaluation methods, available resources, and outlining current challenges and future research directions.

Model Scaling · Pre‑training · Prompt engineering
0 likes · 26 min read
Alibaba Cloud Developer
Jul 19, 2023 · Artificial Intelligence

Mastering Prompt Engineering: Techniques, Tips, and Real-World Examples

This comprehensive guide explores prompt engineering for large language models, covering its background, fundamental concepts, prompt formats, construction principles, advanced techniques like few‑shot, zero‑shot, and chain‑of‑thought prompting, as well as practical examples, evaluation metrics, and future directions.

Few-Shot · LLM · Prompt engineering
0 likes · 33 min read
ZhongAn Tech Team
Jul 14, 2023 · Artificial Intelligence

Exploring AIGC Applications in Insurance: Insights from ZhongAn Insurance CTO Jiang Jiyun

The interview with ZhongAn Insurance CTO Jiang Jiyun discusses how the company leverages AIGC technologies such as large language models, embeddings, and prompt engineering to enhance marketing, intelligent customer service, and data security, while highlighting practical challenges and best practices for AI adoption in the insurance sector.

AIGC · Embedding · Prompt engineering
0 likes · 15 min read
21CTO
Jul 8, 2023 · Artificial Intelligence

Unlocking LangChain: Build End-to-End LLM Apps with Chains, Agents, and Memory

This article introduces LangChain—a modular framework for constructing large‑language‑model applications—covering its core components, asynchronous support, prompt engineering, memory handling, chain and agent workflows, token considerations, embedding techniques, and a step‑by‑step Python example that culminates in a Gradio‑based conversational chatbot.

AI Development · Embedding · LangChain
0 likes · 20 min read
DeWu Technology
Jul 5, 2023 · Artificial Intelligence

Fine-tuning Large Language Models with LoRA/QLoRA and Deploying via GPTQ Quantization on KubeAI

The article explains how LoRA and its 4‑bit QLoRA extension dramatically reduce trainable parameters and GPU memory for fine‑tuning large language models, while GPTQ post‑training quantization compresses weights for cheap inference, and shows how KubeAI integrates these techniques into a one‑click workflow for 7 B, 13 B, and 33 B models from data upload to API deployment.

GPTQ · KubeAI · LoRA
0 likes · 13 min read
DataFunSummit
Jun 30, 2023 · Artificial Intelligence

Roundtable on Large‑Model‑Based Recommendation Systems: Opportunities, Challenges, and Future Directions

In this expert roundtable, leading researchers and engineers discuss the current state of recommendation systems, how large language models can reshape the field, the technical and practical challenges involved, and practical advice for practitioners looking to adopt AI‑driven personalization solutions.

AI · Recommendation Systems · dialogue recommendation
0 likes · 36 min read
DataFunSummit
Jun 28, 2023 · Artificial Intelligence

OPPO's CHAOS Pretrained Large Model and GammaE Knowledge‑Graph Multi‑hop Reasoning: Techniques and Insights

This article presents OPPO Research Institute's recent advances in large‑model AI, detailing the CHAOS pretrained model that topped the CLUE leaderboard, the knowledge‑enhanced training pipeline, and the GammaE model for multi‑hop reasoning over knowledge graphs, together with experimental results and practical training tips.

AI research · GammaE · Knowledge Graph
0 likes · 20 min read
Programmer DD
Jun 20, 2023 · Artificial Intelligence

Yann LeCun: Today's AI Still Below Dog Level – Inside Meta’s Voicebox, MusicGen & I‑JEPA

Meta’s chief AI scientist Yann LeCun warned that current large language models still fall short of human and even dog intelligence, citing their lack of real‑world understanding, while Meta unveiled three new generative AI models—Voicebox for speech, MusicGen for music, and I‑JEPA for image reasoning—showcasing both progress and remaining limitations.

Computer Vision · Music generation · Speech synthesis
0 likes · 7 min read
DataFunTalk
Jun 20, 2023 · Artificial Intelligence

How Recommendation Systems Work and Their Integration with ChatGPT

This article explains the fundamentals of recommendation systems, their digital representation, how ChatGPT and large language models are applied to enhance recommendation performance, and highlights emerging trends such as conversational recommendation and a recommended book on the subject.

AI · ChatGPT · Conversational AI
0 likes · 8 min read
DataFunSummit
Jun 14, 2023 · Artificial Intelligence

DataFun Summit 2023: Large Language Models and AIGC Conference

DataFun will host the DataFun Summit 2023 on June 17‑18, featuring three chairs and eight presenters who will discuss core topics such as large language model research, multimodal generation, reinforcement learning, tool learning, distributed training, and industry applications, with free registration via QR code.

AI Conference · AIGC · Multimodal
0 likes · 42 min read
Rare Earth Juejin Tech Community
Jun 14, 2023 · Artificial Intelligence

ChatGPT Practice Applications and Large Model Technology Insights from the Juejin Offline Salon

The article recaps a Beijing offline salon where experts and open‑source contributors discussed ChatGPT desktop applications, the development and deployment of ChatGPT‑Next‑Web, large‑language‑model challenges, the VisualGLM multimodal model, and product design considerations, providing technical insights and community perspectives on AI advancements.

AI · ChatGPT · Product Design
0 likes · 9 min read
Programmer DD
Jun 12, 2023 · Artificial Intelligence

Master Prompt Engineering: Guide ChatGPT to Deliver Precise Answers

This article explains prompt engineering for large language models like ChatGPT, covering its definition, essential techniques such as diverse prompting strategies, problem restatement, background provision, gradient prompting, example inclusion, role‑playing, and the importance of systematic experimentation and quantitative evaluation to achieve high‑quality, task‑specific AI outputs.

AI · ChatGPT · Prompt engineering
0 likes · 16 min read
Sohu Tech Products
Jun 7, 2023 · Artificial Intelligence

Multiscale PU Learning for Detecting AI‑Generated Text

Researchers from Peking University and Huawei present a multiscale positive‑unlabeled learning framework that significantly improves detection of AI‑generated short and long texts, addressing the difficulty of distinguishing AI‑written content from human writing and outperforming existing baselines on multiple benchmarks.

AI detection · Pu-Learning · large language models
0 likes · 8 min read
Python Programming Learning Circle
Jun 6, 2023 · Artificial Intelligence

Why ChatGPT Plus Performance Is Dropping and What OpenAI’s Roadmap Reveals

Recent reports indicate a noticeable decline in the performance of GPT‑4 in ChatGPT Plus, especially in coding accuracy, prompting speculation about model‑scaling growing pains, AI alignment trade‑offs, and OpenAI’s GPU‑limited roadmap, which includes cheaper models, longer context windows, fine‑tuning, and multimodal extensions.

AI Alignment · ChatGPT · GPT-4
0 likes · 8 min read
OPPO Amber Lab
Jun 5, 2023 · Information Security

How ChatGPT Impacts Security: Key Insights from the CSA Seminar

An online CSA seminar on May 30 examined ChatGPT’s security impact, presenting a whitepaper and four AI‑security interaction dimensions, while experts discussed telecom‑operator security‑GPT models, safe vertical‑domain large‑model training, and future industry implications.

AI Governance · AI security · ChatGPT
0 likes · 7 min read
Programmer DD
May 19, 2023 · Artificial Intelligence

Master Advanced Prompt Engineering: Boost LLM Performance with Proven Techniques

This article explains why effective prompt design—covering system messages, few‑shot learning, non‑dialogue scenarios, explicit instructions, output shaping, syntax cues, task decomposition, chain‑of‑thought, and real‑world context—is essential for reliable large language model results and provides practical examples and tips.

AI · Few‑Shot Learning · Prompt engineering
0 likes · 8 min read
Rare Earth Juejin Tech Community
May 5, 2023 · Artificial Intelligence

Limitations of Generative Pre‑trained Transformers: Hallucinations, Memory, Planning, and Architectural Proposals

The article critically examines GPT‑4 and similar transformer models, highlighting persistent hallucinations, outdated knowledge, insufficient domain coverage, lack of planning and memory, and proposes architectural extensions inspired by fast‑slow thinking and differentiable modules to overcome these fundamental constraints.

AI limitations · GPT-4 · Model architecture
0 likes · 24 min read
Architect
Apr 27, 2023 · Artificial Intelligence

Survey of Large Language Model Research: From GPT‑1 to ChatGPT and Open‑Source Alternatives

This article provides a comprehensive overview of the development of large language models, reviewing classic papers from GPT‑1 through GPT‑4, discussing open‑source implementations such as LLaMA, Alpaca, GLM, and ChatGLM, and analyzing training methods, datasets, and future research directions.

AI research · GPT · large language models
0 likes · 36 min read
Kuaishou Tech
Apr 23, 2023 · Artificial Intelligence

Kuaishou & Renmin AI Institute: Driving Multimodal Large Model Innovation

The article details how Kuaishou’s multimodal AI research, including its K7 trillion‑parameter model and VLUA algorithm, partners with Renmin University’s Gaoling AI Institute to launch a joint lab, produce cutting‑edge papers such as WebBrain and ChatImg, and advance recommendation and search technologies across the short‑video ecosystem.

AI · Industry collaboration · Recommendation Systems
0 likes · 17 min read
DataFunSummit
Apr 20, 2023 · Artificial Intelligence

Mengzi Lightweight Model Technology System and Advances in Small‑Scale and Retrieval‑Augmented Pretraining

This presentation introduces the Mengzi lightweight model technology stack, covering large‑scale pre‑training, motivations for lightweight models, detailed techniques such as knowledge and sequence‑relation enhancement, training optimization, model compression, retrieval‑augmented pre‑training, multimodal extensions, open‑source releases, and real‑world applications.

Multimodal · knowledge distillation · large language models
0 likes · 23 min read
IT Architects Alliance
Apr 20, 2023 · Artificial Intelligence

Overview of Prominent Large Language Models and Instruction‑Finetuned Variants

This article provides a comprehensive overview of major large language models—including GPT series, T5, LaMDA, LLaMA, BLOOM, and others—detailing their architectures, parameter scales, open‑source status, and the evolution of instruction‑fine‑tuning techniques that improve zero‑shot and few‑shot performance.

AI research · Instruction Tuning · LLM comparison
0 likes · 24 min read
Architect
Apr 19, 2023 · Artificial Intelligence

Emergence in Large Language Models: Phenomena, Explanations, and Implications

This article reviews the emergence phenomena observed in large language models, explains how model scale, in‑context learning and chain‑of‑thought prompting contribute to sudden performance gains, discusses small‑model alternatives, and explores the relationship between emergence and the training‑time Grokking effect.

AI research · Emergence · In-Context Learning
0 likes · 13 min read
DataFunTalk
Apr 19, 2023 · Artificial Intelligence

Is the Daily Emergence of Large Language Models Beneficial?

The article examines the rapid proliferation of large language models, weighing both the opportunities for experimentation and the drawbacks of noise, and argues that establishing authoritative Chinese LLM evaluation benchmarks is essential to guide meaningful progress in the field.

AI research · LLM evaluation · large language models
0 likes · 7 min read
Architect
Apr 14, 2023 · Artificial Intelligence

Overview of Prominent Large Language Models and Instruction Fine‑Tuning Techniques

The article surveys major large language models—including GPT‑3, T5, LaMDA, Jurassic‑1, MT‑NLG, Gopher, Chinchilla, PaLM, U‑PaLM, OPT, LLaMA, BLOOM, GLM‑130B, and ERNIE 3.0 Titan—explains their architectures, scaling trade‑offs, and then details instruction‑fine‑tuned variants such as T0, FLAN, GPT‑3.5, ChatGPT, GPT‑4, Alpaca and ChatGLM, providing references for further study.

AI · ChatGPT · GPT-3
0 likes · 27 min read
ITPUB
Apr 14, 2023 · Artificial Intelligence

How Do Generative, Perceptual, and Decision AI Interact? Insights from Jina AI’s Founder

In this interview, Jina AI’s founder Han Xiao examines the relationships among generative, perceptual, and decision AI, compares single‑modal and multimodal approaches, discusses large language model development, and evaluates the impact of ChatGPT on search and future AI commercialization.

AI · AI commercialization · Multimodal AI
0 likes · 11 min read
Programmer DD
Apr 14, 2023 · Artificial Intelligence

How DeepSpeed-Chat Accelerates ChatGPT‑Style Model Training by 15×

Microsoft open‑sourced DeepSpeed‑Chat, a toolkit that streamlines the end‑to‑end training and inference of ChatGPT‑like large language models using RLHF, delivering up to fifteen‑fold speedups and dramatically lower costs, even on a single GPU.

ChatGPT · DeepSpeed · RLHF
0 likes · 8 min read
Top Architect
Apr 12, 2023 · Artificial Intelligence

Data‑Centric AI Perspective on GPT Models: Training, Inference, and Maintenance

This article examines how large language models such as GPT‑1 through GPT‑4 succeed largely due to high‑quality, large‑scale training data, and explains the Data‑centric AI framework—training data development, inference data development, and data maintenance—while discussing prompt engineering, data‑driven improvements, and future trends in AI.

AI · Data‑Centric AI · GPT
0 likes · 19 min read
Architect
Apr 9, 2023 · Artificial Intelligence

Evaluating the Commonsense Knowledge and Reasoning Capabilities of ChatGPT and Other Large Language Models

This study systematically evaluates ChatGPT and other large language models on their ability to answer commonsense questions, assess their knowledge awareness, and utilize generated knowledge for reasoning, revealing strong QA performance but notable gaps in social and temporal commonsense and in leveraging contextual knowledge.

ChatGPT · NLP · commonsense reasoning
0 likes · 20 min read
DataFunSummit
Apr 7, 2023 · Artificial Intelligence

China's AI Startup Landscape: Who Holds the Greatest Advantage in the Large Model Race?

Amid the 2023 AI boom, veteran entrepreneurs, former tech executives, and academic teams in China are racing to launch large‑model ventures, each leveraging distinct funding, talent, and product strategies, while major tech giants simultaneously push their own AI offerings, reshaping the market dynamics.

AI startups · China AI · large language models
0 likes · 12 min read
Xiaohongshu Tech REDtech
Apr 3, 2023 · Industry Insights

What Drives Intelligent Recommendation and Search? Key Takeaways from Xiaohongshu’s CCF C³ Event

The CCF C³ event at Xiaohongshu gathered leading researchers and industry experts to dissect the latest advances, challenges, and future opportunities in intelligent recommendation and search, including multimodal content handling, decentralized distribution, cold‑start solutions, and the impact of large language models.

AI · Recommendation Systems · Search
0 likes · 11 min read
Baidu Geek Talk
Mar 21, 2023 · Artificial Intelligence

Infrastructure Challenges and Solutions for Large‑Scale AI Model Training

The article explains how the massive compute and storage demands of today’s large language models create a “compute wall” and “storage wall,” and describes Baidu Intelligent Cloud’s four‑layer full‑stack infrastructure—combining advanced parallelism techniques, optimized GPU networking, static‑graph compilation, and cost‑model‑driven placement—to train trillion‑parameter models efficiently.

AI Infrastructure · Cost Model · Distributed Training
0 likes · 27 min read
DataFunTalk
Mar 16, 2023 · Artificial Intelligence

Technical Optimizations and Breakthroughs of GPT‑4: Multimodal Capabilities, Alignment Strategies, and Predictable Scaling

The article summarizes the technical innovations behind GPT‑4, highlighting its multimodal abilities, improved alignment methods, scaling‑law‑based performance prediction, and remaining limitations, while referencing the official OpenAI technical report and community analyses.

AI research · Alignment · GPT-4
0 likes · 10 min read
21CTO
Mar 11, 2023 · Artificial Intelligence

Microsoft Announces Multimodal GPT-4: A New ‘iPhone Moment’ for AI

Microsoft Germany's CTO announced the imminent release of a multimodal GPT‑4, highlighting its ability to process text, images and video, while executives liken the breakthrough to an “iPhone moment” for AI, emphasizing new capabilities, industry disruption, and responsible data use.

AI Development · GPT-4 · Microsoft
0 likes · 6 min read
DataFunSummit
Feb 28, 2023 · Artificial Intelligence

Baidu Document Intelligence Technology Overview and Applications

This article presents a comprehensive overview of Baidu's document intelligence technologies—including the ERNIE‑Layout multimodal large model, the prompt‑based DocPrompt extraction system, layout and table understanding techniques, and PaddleNLP open‑source integration—detailing their architectures, challenges, solutions, performance benchmarks, and real‑world application cases across multiple industries.

DocPrompt · Document Intelligence · ERNIE-Layout
0 likes · 19 min read
21CTO
Feb 27, 2023 · Artificial Intelligence

What’s Next for Large Language Models? Emerging Trends Shaping AI

The article explores three emerging directions for next‑generation large language models—self‑generated training data, built‑in verification with external retrieval, and massive sparse‑expert architectures—highlighting recent research, practical challenges, and their potential to reshape AI development.

AI research · generative AI · large language models
0 likes · 17 min read
21CTO
Feb 23, 2023 · Artificial Intelligence

How Does ChatGPT Really Work? Inside the RLHF Training Process

This article explains ChatGPT’s architecture, the distinction between model capability and consistency, how next‑token and masked‑language‑model training lead to inconsistencies, and how OpenAI’s supervised fine‑tuning, reward‑model training, and PPO reinforcement learning (RLHF) are combined to improve alignment while highlighting the method’s limitations.

AI Alignment · ChatGPT · RLHF
0 likes · 15 min read
21CTO
Feb 20, 2023 · Artificial Intelligence

How the AI Arms Race Between Microsoft and Google Is Reshaping Search

The escalating competition between Microsoft and Google over large language models is driving new search experiences, reshaping AI research, and raising concerns about content quality, privacy, and the future of smaller AI innovators.

AI competition · Bing · ChatGPT
0 likes · 10 min read
dbaplus Community
Feb 18, 2023 · Artificial Intelligence

Why ChatGPT Still Gets It Wrong: Inside RLHF and Model Consistency

ChatGPT, OpenAI’s latest language model, builds on GPT‑3 but uses supervised fine‑tuning and Reinforcement Learning from Human Feedback (RLHF) to improve alignment, yet its training methods still cause consistency issues such as unhelpful responses, hallucinations, bias, and limited explainability.

ChatGPT · Model Alignment · PPO
0 likes · 17 min read
Architecture Digest
Feb 17, 2023 · Artificial Intelligence

Analyzing the Emergent Abilities of ChatGPT and the Technical Roadmap of GPT‑3.5

This article dissects how ChatGPT acquired its surprising capabilities by tracing the evolution from the original GPT‑3 model through instruction tuning, code‑based pre‑training, and reinforcement learning from human feedback, ultimately presenting a comprehensive technical roadmap for reproducing GPT‑3.5‑scale models.

ChatGPT · GPT-3.5 · Instruction Tuning
0 likes · 26 min read
DataFunTalk
Feb 15, 2023 · Artificial Intelligence

Three Emerging Directions for Next‑Generation Large Language Models

The article outlines three promising research avenues—self‑generated training data, model‑driven fact‑checking, and sparse expert architectures—that could shape the next wave of large language model innovation and address current limitations such as data scarcity and hallucinations.

AI research · large language models · model self‑improvement
0 likes · 14 min read
Open Source Linux
Feb 13, 2023 · Artificial Intelligence

How Does ChatGPT Work? Inside RLHF and Model Consistency

This article explains the inner workings of ChatGPT, detailing its evolution from GPT‑3, the role of reinforcement learning from human feedback (RLHF) in improving consistency, the training pipeline steps, and the limitations and evaluation methods of large language models.

AI · ChatGPT · Model Alignment
0 likes · 15 min read
DataFunSummit
Feb 12, 2023 · Artificial Intelligence

Claude vs. ChatGPT: Constitutional AI, RLAIF, and the Quest for Safer Large‑Language Models

This article reviews Anthropic's Claude assistant, explains the novel Constitutional AI (RLAIF) approach that replaces costly human‑feedback data with a set of natural‑language principles, compares Claude with ChatGPT on helpfulness and harmlessness, and details the supervised‑learning and reinforcement‑learning pipelines, data annotation, and experimental results that demonstrate superior safety performance.

AI Safety · Claude · Harmlessness
0 likes · 21 min read
Architect
Feb 9, 2023 · Artificial Intelligence

Emergent Abilities of Large Language Models: Complex Reasoning, Knowledge Reasoning, and Out‑of‑Distribution Robustness

This article reviews recent research on the emergent abilities of large language models—such as chain‑of‑thought reasoning, knowledge retrieval without external sources, and robustness to distribution shifts—examining scaling laws, model size thresholds, and the open questions surrounding a potential paradigm shift from fine‑tuning to in‑context learning.

AI research · chain-of-thought prompting · emergent abilities
0 likes · 23 min read
Top Architect
Feb 9, 2023 · Artificial Intelligence

How ChatGPT Works: Training, RLHF, and Consistency Issues

ChatGPT, OpenAI’s latest language model, builds on GPT‑3 and improves performance through supervised fine‑tuning, human‑feedback reinforcement learning (RLHF), and PPO optimization, addressing consistency challenges such as misaligned outputs, bias, and hallucinations while evaluating helpfulness, truthfulness, and harmlessness.

ChatGPT · Model Alignment · RLHF
0 likes · 15 min read
IT Architects Alliance
Feb 9, 2023 · Artificial Intelligence

How ChatGPT Works: Model Architecture, Training Strategies, and RLHF

ChatGPT, OpenAI’s latest language model, builds on GPT‑3 using supervised fine‑tuning and Reinforcement Learning from Human Feedback (RLHF) with PPO, addressing consistency issues by aligning model outputs with human preferences, while discussing training methods, limitations, and evaluation metrics.

AI Alignment · ChatGPT · PPO
0 likes · 15 min read
Top Architect
Feb 8, 2023 · Artificial Intelligence

A Technical Roadmap of GPT‑3.5: From Pre‑training to RLHF and Emerging Capabilities

This article analyses how ChatGPT and the GPT‑3.5 series evolved from the original GPT‑3 through large‑scale pre‑training, code‑based training, instruction tuning, and reinforcement learning from human feedback, identifying the origins of their language generation, in‑context learning, world knowledge, code understanding, chain‑of‑thought reasoning, and alignment capabilities while also outlining current limitations.

ChatGPT · GPT-3.5 · Instruction Tuning
0 likes · 27 min read