Tagged articles

1023 articles

Nov 8, 2023 · Artificial Intelligence

Comprehensive Overview of AI Agents: Concepts, Technical Frameworks, and Applications

The article surveys modern AI agents—software entities powered by large language models that perceive multimodal inputs, reason via brain modules, act through tools or embodied actions, employ retrieval‑augmented generation and chain‑of‑thought planning, and can operate singly (e.g., AutoGPT) or collaboratively via frameworks like Microsoft’s AutoGen—while highlighting current challenges such as controllability, memory limits, parallelism, and reliability.

AI agents, Agent Architecture, AutoGen

0 likes · 34 min read

DataFunSummit

Nov 5, 2023 · Artificial Intelligence

Enhancing Recommendation Models with Scaling Law via HCNet and MemoNet: A Memory‑Based Feature‑Combination Approach

This article presents a memory‑driven architecture (HCNet and MemoNet) that equips recommendation models with scaling‑law characteristics by storing and retrieving arbitrary feature‑combination embeddings, evaluates multi‑hash codebooks, memory‑restoring strategies, key‑feature selection, and demonstrates significant offline and online performance gains.

feature interaction, large language models, memory networks

0 likes · 15 min read

Model Perspective

Nov 2, 2023 · Artificial Intelligence

Why Mathematical Modelers Must Embrace LLMs and Forget Outdated Skills

The article explains how rapid advances in data and large language models force mathematical modelers to continuously update their models and skills, discard obsolete knowledge, and adopt lifelong learning to stay effective in a fast‑changing AI‑driven environment.

Data Science, artificial intelligence, continuous learning

0 likes · 6 min read

Baidu Geek Talk

Nov 2, 2023 · Artificial Intelligence

AI-Powered Code Defect Detection: Leveraging Code Knowledge Graphs and Large Language Models

The paper presents an AI‑driven static analysis framework that builds code knowledge graphs to extract relevant slices and leverages large language models for multilingual defect prediction, achieving up to 80% F1, detecting 662 defects across 1,100 C++ modules with a 26.9% recall gain over traditional rule‑based scanners.

BERT, Software quality, code defect detection

0 likes · 9 min read

Baidu Intelligent Cloud Tech Hub

Nov 1, 2023 · Databases

How BES Powers Large-Scale Vector Search for AI Applications

This article explains the principles of vector databases, outlines the engineering practices of Baidu Intelligent Cloud BES for large‑scale vector retrieval, discusses optimization techniques such as HNSW, IVF and filter integration, and presents real‑world AI use cases and future development directions.

AI, BES, Elasticsearch

0 likes · 16 min read

DataFunSummit

Oct 30, 2023 · Artificial Intelligence

Exploring General AI, Large Language Models, Knowledge Graphs, and Reinforcement Learning – Insights from DataFun

This article presents a comprehensive overview of DaGuan Data's explorations in general artificial intelligence, large language models, knowledge graphs, reinforcement learning, compute and data requirements, and the emerging concept of Human‑Centric AGI, supplemented by a detailed Q&A session.

AGI, Knowledge Graphs, artificial intelligence

0 likes · 18 min read

DataFunSummit

Oct 27, 2023 · Artificial Intelligence

ChatGPT Technology, Domesticization Attempts, and Open‑Source Large Models

This article reviews the evolution and challenges of ChatGPT technology, describes the authors' efforts to localize and commercialize the model for the Chinese market, and introduces their open‑source Chinese large‑model initiative, including training methods, performance gaps, and future improvement directions.

ChatGPT, Chinese NLP, Model Localization

0 likes · 11 min read

Baidu Tech Salon

Oct 25, 2023 · Artificial Intelligence

Intelligent Question Answering Technology in Baidu Search: Development, Modeling, and Retrieval‑Enhanced Generation

The article surveys Baidu Search’s intelligent question‑answering system, tracing its evolution from feature‑engineered retrieval to large pre‑trained and generative models, and detailing hierarchical readers, multi‑teacher distillation, retrieval‑enhanced generation, and instruction decomposition as key techniques for delivering fast, accurate, citation‑rich answers.

Baidu Search, Retrieval Augmented Generation, knowledge distillation

0 likes · 18 min read

Baidu Geek Talk

Oct 25, 2023 · Artificial Intelligence

How Baidu Search Is Transforming Machine Question Answering with Large‑Scale AI Models

This article reviews the evolution of machine question answering, from early feature‑engineered systems to modern large‑language‑model‑driven retrieval‑augmented generation, outlines Baidu Search’s current Retriever‑Reader architecture, discusses challenges such as semantic complexity, latency and answer quality, and presents solutions including hierarchical DocMRC modeling, multi‑teacher knowledge distillation, and instruction decomposition for efficient, high‑quality answers.

Baidu, Retrieval Augmented Generation, knowledge distillation

0 likes · 18 min read

DataFunTalk

Oct 25, 2023 · Artificial Intelligence

Applying Large Language Models to Wireless Network Intelligent Operations: Opportunities, Challenges, and Platform Construction

This article examines how large language model technology can be leveraged for intelligent operation of wireless communication networks, analyzing its advantages, current challenges, platform architecture, experimental validation, and future research directions within the telecom industry.

AI, Knowledge Graph, intelligent operation

0 likes · 17 min read

AI Large Model Application Practice

Oct 23, 2023 · Artificial Intelligence

Unlocking GPT‑4V: A Concise Guide to Multimodal Capabilities and Prompt Techniques

This article summarizes the GPT‑4V research paper, detailing its visual input modes, effective prompting strategies, diverse multimodal abilities, high‑value application scenarios, and ways to enhance the model with classic LLM techniques while noting current limitations.

AI applications, GPT-4V, large language models

0 likes · 17 min read

Zuoyebang Tech Team

Oct 19, 2023 · Artificial Intelligence

How AI and Big Data Are Transforming Education: Insights from Zuoyebang’s Chief Scientist

At the GET2023 Education Technology Conference, Zuoyebang’s chief scientist Song Yang detailed how AI, large language models, big data, and smart hardware are reshaping learning experiences across subjects, from math problem generation to interactive programming assistants, and outlined the company’s vision for AI‑driven education.

AI in Education, Educational Technology, large language models

0 likes · 12 min read

Alimama Tech

Oct 18, 2023 · Artificial Intelligence

Technical Challenges and Directions for Large‑Model Applications in E‑commerce

Taobao Group’s ten large‑model challenges target e‑commerce AI by demanding domain‑specific pre‑training, multi‑step reasoning, extended context handling, factual reliability, intelligent tool orchestration, robust retrieval integration, fuzzy‑intent tool selection, scalable multi‑objective RLHF, improved query rewriting, and knowledge‑driven recommendation.

RLHF, e‑commerce, knowledge hallucination

0 likes · 16 min read

DaTaobao Tech

Oct 18, 2023 · Artificial Intelligence

Large Model Application Challenges for E-commerce

Taobao Group’s ten large‑model e‑commerce challenges call for researchers to build domain‑specific data pipelines, mitigate forgetting, balance expertise with generality, enable multi‑step reasoning, handle long contexts, reduce hallucinations, integrate tool use, improve fuzzy intent detection, apply multi‑objective RLHF, and generate cognitively novel recommendations.

Query Understanding, RLHF, knowledge hallucination

0 likes · 14 min read

Baidu Geek Talk

Oct 16, 2023 · Industry Insights

What Is AI‑Native Thinking and Why It Will Shape the Next Wave of Applications

The article explores the concept of AI‑native thinking, outlines the mindset and conditions needed for AI‑native applications, showcases examples such as Baidu Wenku and a legal‑assistant hackathon project, and discusses platform support, technical foundations, and emerging opportunities in the large‑model era.

AI-native, Baidu, application design

0 likes · 14 min read

Baidu Geek Talk

Oct 11, 2023 · Artificial Intelligence

How Baidu’s Qianfan 2.0 Supercharges Large‑Model Development and Deployment

The article reviews Baidu Cloud’s Qianfan 2.0 platform, detailing its expanded model catalog, dataset library, Chinese‑language enhancements, compression and speed gains, robust AI infrastructure, application templates, and end‑to‑end data‑labeling pipeline that together lower cost and accelerate large‑model adoption across industries.

AI Platform, Cloud AI, Model Deployment

0 likes · 14 min read

JD Cloud Developers

Oct 10, 2023 · Artificial Intelligence

Do Large Language Models Have a Mind? Attention, Emergence & Compression Explained

This article examines whether ChatGPT and other large language models exhibit true Theory of Mind, detailing the role of attention mechanisms, neural network architecture, emergent abilities, the Chinese‑room argument, and how compression of massive textual data underlies their apparent intelligence.

Attention Mechanism, Emergence, Neural Networks

0 likes · 30 min read

Baobao Algorithm Notes

Oct 9, 2023 · Artificial Intelligence

Demystifying RLHF and PPO for Large Language Models: Theory and Practice

This article explains why Reinforcement Learning from Human Feedback (RLHF) is crucial for LLM intelligence, outlines the three-stage training pipeline, details InstructGPT's reward model and PPO optimization, and provides a practical guide to implementing RLHF with deep‑learning frameworks.

PPO, RLHF, Reward Modeling

0 likes · 17 min read

DataFunSummit

Sep 30, 2023 · Artificial Intelligence

Causal Inference from the Perspective of Large Models

This presentation by senior AI architect He Gang explores how large language models and LLM‑powered agents can enhance causal inference tasks, detailing model‑assisted analysis, agent‑based inference methods, and multi‑agent simulations to advance causal research.

AI, LLM agents, large language models

0 likes · 2 min read

NetEase LeiHuo Testing Center

Sep 22, 2023 · Artificial Intelligence

Understanding Large Language Models and Prompt Engineering: A Practical Guide

This article provides an introductory overview of large language models (LLMs), compares popular models, explains their underlying principles, and offers practical guidance on prompt engineering, model evaluation, usage tips, and safety considerations, helping readers effectively select and apply LLMs in various scenarios.

AI, LLM, Model Evaluation

0 likes · 44 min read

Tencent Tech

Sep 20, 2023 · Artificial Intelligence

Why Do Large Language Models Hallucinate and How to Reduce It?

The article explains why large language models generate hallucinations—due to data errors, training conflicts, and inference uncertainty—and outlines data‑cleaning, model‑level feedback, knowledge augmentation, constraint techniques, and post‑processing methods such as the “Truth‑seeking” algorithm to mitigate the issue.

AI Safety, Data Quality, Knowledge Retrieval

0 likes · 8 min read

DataFunSummit

Sep 19, 2023 · Artificial Intelligence

Advances in Information Extraction: From PLM to LLM Paradigms at Alibaba DAMO Academy

This article reviews Alibaba DAMO Academy's research on information extraction, covering background concepts, PLM-era extraction paradigms, few‑shot extraction techniques, and the emerging LLM‑era approaches, while also sharing practical insights, benchmark results, and future directions.

Alibaba DAMO, Few‑Shot Learning, Retrieval Augmented Generation

0 likes · 24 min read

Ximalaya Technology Team

Sep 18, 2023 · Artificial Intelligence

Understanding Autonomous and Autopilot AI Agents: Insights from Industry Experts

The article surveys the rise of LLM‑powered AI agents, defining them as LLM + memory + planning + tool use, contrasting fully autonomous agents with human‑guided autopilot/copilot variants, outlining their benefits, risks such as hallucinations and unsafe actions, and urging modular frameworks and oversight for reliable enterprise deployment.

AI agents, Agent Framework, LLM

0 likes · 27 min read

AntTech

Sep 12, 2023 · Artificial Intelligence

Ensuring Trustworthy and Secure AI: Insights from the 2023 Pujiang Innovation Forum

The 2023 Pujiang Innovation Forum highlighted the rapid rise of generative AI, its associated security and privacy risks, and presented Ant Group's multi‑stage, multi‑layered approach—including data, training, and inference controls and three core defense technologies—to achieve safe, reliable, and open knowledge sharing in the era of large language models.

information security, knowledge sharing, large language models

0 likes · 10 min read

DaTaobao Tech

Sep 11, 2023 · Artificial Intelligence

Large Language Model Upgrade Paths and Architecture Selection

This article analyzes upgrade paths of major LLMs—ChatGLM, LLaMA, Baichuan—detailing performance, context length, and architectural changes, then examines essential capabilities, data cleaning, tokenizer and attention design, and offers practical guidance for balanced scaling and efficient model construction.

Baichuan, ChatGLM, LLM architecture

0 likes · 32 min read

DataFunSummit

Sep 9, 2023 · Artificial Intelligence

Evolution of AIGC Technology and Its Applications in Life Sciences

This article reviews the development of AIGC and generative AI technologies—including image, text, and molecular generation—explains key model advances such as diffusion and large language models, discusses their impact on drug discovery, and outlines current challenges, opportunities, and future directions.

AI in Life Sciences, AIGC, drug discovery

0 likes · 14 min read

DataFunTalk

Sep 8, 2023 · Artificial Intelligence

Knowledge Processing in the Era of Large Models: New Opportunities and New Challenges

This article examines how large language models and knowledge graphs complement each other, discussing their respective strengths, integration techniques such as prompt engineering and knowledge editing, and outlining future research directions for building large knowledge models that combine linguistic understanding with structured knowledge representation.

AI, Knowledge Graphs, Model Alignment

0 likes · 27 min read

Continuous Delivery 2.0

Sep 7, 2023 · Artificial Intelligence

Google’s Internal Memo: “We Have No Moat, Neither Does OpenAI” – The Rise of Open‑Source AI

A leaked Google internal document titled “We Have No Moat, Neither Does OpenAI” argues that both companies are losing the AI arms race to rapidly advancing open‑source models, which achieve comparable performance at a fraction of the cost, prompting a strategic rethink for Google.

AI, Google, LoRA

0 likes · 16 min read

Alibaba Cloud Developer

Aug 28, 2023 · Artificial Intelligence

AI-Driven Application Engineering: From Prompt Engineering to Autonomous Agents

This article examines how the rapid rise of generative AI reshapes application engineering by outlining AI's core characteristics, the challenges developers face, the evolution of prompt and chain-of-thought techniques, the emergence of agents and tool integration, and the future direction toward AI‑centric computing architectures.

AI, Prompt engineering, agents

0 likes · 20 min read

FunTester

Aug 22, 2023 · Artificial Intelligence

The Current State and Future Outlook of AI‑Driven Software Testing

The article examines how large‑language models, test‑case generation technologies, and model‑driven testing are reshaping software testing, discusses the challenges of applying AI to testing, and outlines future directions and skill sets for professionals seeking to leverage AI in quality assurance.

AI, Knowledge Graphs, large language models

0 likes · 14 min read

DataFunTalk

Aug 21, 2023 · Artificial Intelligence

Can We Build Large-Scale Models for Recommendation Systems?

In this talk, Zhang Pengtao, a Sina Weibo technical expert with a Ph.D. in computer applications, explores how the strong memory capabilities of NLP large language models inspire the design of independent memory mechanisms for recommendation systems, covering model concepts, HCNet & MemoNet, experimental results, and practical takeaways for enhancing recommendation model performance.

AI, Memory Mechanisms, Recommendation Systems

0 likes · 2 min read

DataFunTalk

Aug 19, 2023 · Artificial Intelligence

Applying Large Language Models to Zhihu's Bridge Platform: Use Cases, Challenges, and Solutions

This article details how Zhihu's internal Bridge platform integrates large language models for business analysis, knowledge taxonomy, natural‑language‑to‑filter conversion, and ad‑hoc data queries, describing the workflow, technical hurdles, iterative improvements, and future directions.

AI for business analytics, Prompt engineering, knowledge taxonomy

0 likes · 12 min read

DataFunTalk

Aug 16, 2023 · Artificial Intelligence

Data Engineering, Automated Evaluation, and Knowledge Graph Integration in Large Model Development

This article presents a comprehensive overview of data engineering practices, pre‑training data composition, automated model evaluation techniques, and the synergistic use of knowledge graphs within large‑scale AI model research, highlighting pipelines, quality criteria, and practical case studies.

Knowledge Graph, automation evaluation, data engineering

0 likes · 29 min read

Bilibili Tech

Aug 15, 2023 · Backend Development

Bilibili Customer Service System Architecture and Implementation

The article explains Bilibili's self‑developed customer‑service platform, describing its modular architecture, core workflows, and implementation of features such as intelligent QA with Faiss vector search, Redis‑based seat scheduling, a robust workstation, permission control, and exploration of large language models, highlighting improvements in interception rate, satisfaction, and handling time.

Backend Development, Customer Service System, Faiss vector search

0 likes · 20 min read

DataFunSummit

Aug 14, 2023 · Artificial Intelligence

State of GPT: A Programmer’s Guide to Large Language Model Fundamentals, Training, and Applications

This article provides programmers with a comprehensive overview of large language models—including their evolution, core concepts, data pipelines, model architectures, training techniques such as 3D parallelism, supervised fine‑tuning, RLHF, open‑source recipes, and emerging application ecosystems—while also highlighting current challenges and future directions.

Fine‑tuning, LLM applications, RLHF

0 likes · 43 min read

php Courses

Aug 14, 2023 · Artificial Intelligence

Guide to the Five Most Powerful Large Language Models and How to Choose Them

This article explains the fundamentals of modern large language models, outlines the top five most powerful LLMs—including GPT‑4, Claude 2, Llama 2, Orca, and Cohere—and provides practical guidance on selecting and applying them across business and development use cases.

AI applications, Claude 2, GPT-4

0 likes · 9 min read

DataFunTalk

Aug 13, 2023 · Artificial Intelligence

Applying Large Language Models to Search Advertising Satisfaction: From DNN to ERNIE and Prompt Learning

The article details how Baidu's Fengchao team leverages large language models, including a transition from DNN embeddings to ERNIE, introduces multi‑level tokenization and discrete core‑word inputs, and applies prompt learning and AIGC techniques to improve search advertising satisfaction and industry‑specific relevance modeling.

AIGC, Baidu, large language models

0 likes · 22 min read

DataFunTalk

Aug 9, 2023 · Artificial Intelligence

Key Technologies for Domain‑Specific Large Models: Insights from the World AI Conference

This report, based on Professor Xiao Yanghua’s presentation at the World AI Conference, examines why vertical domains need general large models, outlines their key capabilities such as open‑world understanding, combinatorial innovation, evaluation, complex instruction execution, task planning, and symbolic reasoning, and discusses current limitations and optimization strategies for domain‑specific deployment.

AI Evaluation, Model Optimization, Vertical AI

0 likes · 17 min read

Efficient Ops

Aug 8, 2023 · Artificial Intelligence

Rethinking Software Development in the Age of Large Language Models

The article examines fundamental challenges of applying large language models to software engineering—such as scale limits, lack of abstract reasoning, hidden tacit knowledge, and maintenance difficulties—and proposes practical recommendations for integrating AI with disciplined development practices.

AI integration, development automation, knowledge management

0 likes · 7 min read

Baidu Intelligent Cloud Tech Hub

Aug 8, 2023 · Artificial Intelligence

Unlocking LMOps: How Enterprises Can Master Large Model Operations

This article explains the evolution from traditional machine learning to the current large‑model era, introduces LMOps concepts and key technologies, compares them with MLOps, and showcases Baidu Cloud's Qianfan platform as a practical solution for building, deploying, and managing large language models in industry.

AI Operations, Baidu Cloud, LMOps

0 likes · 22 min read

DataFunTalk

Jul 27, 2023 · Artificial Intelligence

Applying AIGC in E‑commerce: Product Copy and Image Generation with Large Language Models

This article shares recent AIGC practices in e‑commerce, detailing product copy generation using GPT‑based models, image creation with Stable Diffusion, the evolution of large language models, technical solutions, experimental results, and future opportunities for AI‑driven automation in online retail.

AIGC, e‑commerce, image generation

0 likes · 18 min read

Baidu Geek Talk

Jul 26, 2023 · Artificial Intelligence

Insights on AIGC Development and Commercial Applications by Baidu's Chief Architect

Baidu’s chief architect Li Shuanglong outlined how AIGC, driven by advanced large‑language and multimodal models, is already powering commercial tools such as automated copywriting, 2D digital‑human video creation and lead‑generation chatbots, while emphasizing future progress in engineering scalability, algorithmic fidelity, data quality, and scenario‑focused applications.

AI commercialization, AI research, AIGC

0 likes · 8 min read

Rare Earth Juejin Tech Community

Jul 24, 2023 · Artificial Intelligence

Comprehensive Survey of Large Language Models: History, Key Technologies, Resources, and Future Directions

This article provides a detailed overview of large language models (LLMs), tracing their evolution from statistical and neural language models to modern pre‑trained transformers, discussing scaling, training, adaptation, utilization, evaluation methods, available resources, and outlining current challenges and future research directions.

Model Scaling, Pre‑training, Prompt engineering

0 likes · 26 min read

Alibaba Cloud Developer

Jul 19, 2023 · Artificial Intelligence

Mastering Prompt Engineering: Techniques, Tips, and Real-World Examples

This comprehensive guide explores prompt engineering for large language models, covering its background, fundamental concepts, prompt formats, construction principles, advanced techniques like few‑shot, zero‑shot, and chain‑of‑thought prompting, as well as practical examples, evaluation metrics, and future directions.

Few-Shot, LLM, Prompt engineering

0 likes · 33 min read

Baidu Intelligent Cloud Tech Hub

Jul 17, 2023 · Artificial Intelligence

How Vector Retrieval Powers Large Language Models: Techniques and Practices

This article explains the fundamentals of vector retrieval, its role in enhancing large language models through embedding and prompt engineering, and details the algorithms, system architecture, and Baidu's engineering practices for building high‑performance vector databases.

AI, Embedding, Vector Retrieval

0 likes · 14 min read

ZhongAn Tech Team

Jul 14, 2023 · Artificial Intelligence

Exploring AIGC Applications in Insurance: Insights from ZhongAn Insurance CTO Jiang Jiyun

The interview with ZhongAn Insurance CTO Jiang Jiyun discusses how the company leverages AIGC technologies such as large language models, embeddings, and prompt engineering to enhance marketing, intelligent customer service, and data security, while highlighting practical challenges and best practices for AI adoption in the insurance sector.

AIGC, Embedding, Prompt engineering

0 likes · 15 min read

21CTO

Jul 8, 2023 · Artificial Intelligence

Unlocking LangChain: Build End-to-End LLM Apps with Chains, Agents, and Memory

This article introduces LangChain—a modular framework for constructing large‑language‑model applications—covering its core components, asynchronous support, prompt engineering, memory handling, chain and agent workflows, token considerations, embedding techniques, and a step‑by‑step Python example that culminates in a Gradio‑based conversational chatbot.

AI Development, Embedding, LangChain

0 likes · 20 min read

DeWu Technology

Jul 5, 2023 · Artificial Intelligence

Fine-tuning Large Language Models with LoRA/QLoRA and Deploying via GPTQ Quantization on KubeAI

The article explains how LoRA and its 4‑bit QLoRA extension dramatically reduce trainable parameters and GPU memory for fine‑tuning large language models, while GPTQ post‑training quantization compresses weights for cheap inference, and shows how KubeAI integrates these techniques into a one‑click workflow for 7B, 13B, and 33B models from data upload to API deployment.

GPTQ, KubeAI, LoRA

0 likes · 13 min read

Network Intelligence Research Center (NIRC)

Jul 1, 2023 · Artificial Intelligence

Prompting Large Language Models for Knowledge‑Based Visual Question Answering: The Prophet Framework

This article analyzes the Prophet framework, which leverages a traditional VQA model to generate answer candidates and in‑context examples that prompt GPT‑3, achieving state‑of‑the‑art performance on the challenging OK‑VQA and A‑OKVQA benchmarks.

GPT-3, MCAN, Prompt engineering

0 likes · 9 min read

DataFunSummit

Jun 30, 2023 · Artificial Intelligence

Roundtable on Large‑Model‑Based Recommendation Systems: Opportunities, Challenges, and Future Directions

In this expert roundtable, leading researchers and engineers discuss the current state of recommendation systems, how large language models can reshape the field, the technical and practical challenges involved, and practical advice for practitioners looking to adopt AI‑driven personalization solutions.

AI, Recommendation Systems, dialogue recommendation

0 likes · 36 min read

DataFunSummit

Jun 28, 2023 · Artificial Intelligence

OPPO's CHAOS Pretrained Large Model and GammaE Knowledge‑Graph Multi‑hop Reasoning: Techniques and Insights

This article presents OPPO Research Institute's recent advances in large‑model AI, detailing the CHAOS pretrained model that topped the CLUE leaderboard, the knowledge‑enhanced training pipeline, and the GammaE model for multi‑hop reasoning over knowledge graphs, together with experimental results and practical training tips.

AI research, GammaE, Knowledge Graph

0 likes · 20 min read

Zhengcaiyun Tech

Jun 28, 2023 · Artificial Intelligence

An Overview of ChatGPT: Architecture, Training Process, Advantages, Risks, and Practical Team Deployment

This article explains what GPT is, how it is trained, its strengths and limitations, the various risks it poses, and provides practical guidance on safely adopting large language models like ChatGPT within development teams, including code‑level analysis examples.

AI risks · ChatGPT · Model Training

0 likes · 13 min read

Programmer DD

Jun 20, 2023 · Artificial Intelligence

Yann LeCun: Today's AI Still Below Dog Level – Inside Meta’s Voicebox, MusicGen & I‑JEPA

Meta’s chief AI scientist Yann LeCun warned that current large language models still fall short of human and even dog intelligence, citing their lack of real‑world understanding, while Meta unveiled three new generative AI models—Voicebox for speech, MusicGen for music, and I‑JEPA for image reasoning—showcasing both progress and remaining limitations.

Computer Vision · Music generation · Speech synthesis

0 likes · 7 min read

DataFunTalk

Jun 20, 2023 · Artificial Intelligence

How Recommendation Systems Work and Their Integration with ChatGPT

This article explains the fundamentals of recommendation systems, their digital representation, and how ChatGPT and large language models are applied to enhance recommendation performance; it also highlights emerging trends such as conversational recommendation and recommends a book on the subject.

AI · ChatGPT · Conversational AI

0 likes · 8 min read

DataFunSummit

Jun 14, 2023 · Artificial Intelligence

DataFun Summit 2023: Large Language Models and AIGC Conference

DataFun will host the DataFun Summit 2023 on June 17‑18, featuring three chairs and eight presenters who will discuss core topics such as large language model research, multimodal generation, reinforcement learning, tool learning, distributed training, and industry applications, with free registration via QR code.

AI Conference · AIGC · Multimodal

0 likes · 42 min read

Rare Earth Juejin Tech Community

Jun 14, 2023 · Artificial Intelligence

ChatGPT Practice Applications and Large Model Technology Insights from the Juejin Offline Salon

The article recaps a Beijing offline salon where experts and open‑source contributors discussed ChatGPT desktop applications, the development and deployment of ChatGPT‑Next‑Web, large‑language‑model challenges, the VisualGLM multimodal model, and product design considerations, providing technical insights and community perspectives on AI advancements.

AI · ChatGPT · Product Design

0 likes · 9 min read

Programmer DD

Jun 12, 2023 · Artificial Intelligence

Master Prompt Engineering: Guide ChatGPT to Deliver Precise Answers

This article explains prompt engineering for large language models like ChatGPT, covering its definition, essential techniques such as diverse prompting strategies, problem restatement, background provision, gradient prompting, example inclusion, role‑playing, and the importance of systematic experimentation and quantitative evaluation to achieve high‑quality, task‑specific AI outputs.

AI · ChatGPT · Prompt engineering

0 likes · 16 min read

DataFunTalk

Jun 9, 2023 · Artificial Intelligence

Expert Roundtable on Causal Inference and Large Language Models: Opportunities and Challenges

Leading experts discuss how causal inference intersects with large language models, exploring opportunities, challenges, industry applications, and future research directions, while sharing personal journeys into causal reasoning and offering practical advice for practitioners.

AI research · causal inference · expert interview

0 likes · 16 min read

Sohu Tech Products

Jun 7, 2023 · Artificial Intelligence

Multiscale PU Learning for Detecting AI‑Generated Text

Researchers from Peking University and Huawei present a multiscale positive‑unlabeled learning framework that significantly improves detection of AI‑generated short and long texts, addressing the difficulty of distinguishing AI‑written content from human writing and outperforming existing baselines on multiple benchmarks.

AI detection · Pu-Learning · large language models

0 likes · 8 min read

Python Programming Learning Circle

Jun 6, 2023 · Artificial Intelligence

Why ChatGPT Plus Performance Is Dropping and What OpenAI’s Roadmap Reveals

Recent reports indicate a noticeable decline in ChatGPT Plus’s GPT‑4 performance, especially in coding accuracy, prompting speculation about scaling pains, AI alignment trade‑offs, and OpenAI’s GPU‑limited roadmap that includes cheaper models, longer context windows, fine‑tuning, and multimodal extensions.

AI Alignment · ChatGPT · GPT-4

0 likes · 8 min read

OPPO Amber Lab

Jun 5, 2023 · Information Security

How ChatGPT Impacts Security: Key Insights from the CSA Seminar

An online CSA seminar on May 30 examined ChatGPT’s security impact, presenting a whitepaper and four AI‑security interaction dimensions, while experts discussed telecom‑operator security‑GPT models, safe vertical‑domain large‑model training, and future industry implications.

AI Governance · AI security · ChatGPT

0 likes · 7 min read

Programmer DD

May 19, 2023 · Artificial Intelligence

Master Advanced Prompt Engineering: Boost LLM Performance with Proven Techniques

This article explains why effective prompt design—covering system messages, few‑shot learning, non‑dialogue scenarios, explicit instructions, output shaping, syntax cues, task decomposition, chain‑of‑thought, and real‑world context—is essential for reliable large language model results and provides practical examples and tips.

AI · Few‑Shot Learning · Prompt engineering

0 likes · 8 min read

Python Crawling & Data Mining

May 7, 2023 · Artificial Intelligence

Why Does ChatGPT Suddenly ‘Think Step‑by‑Step’? Unveiling the Chain‑of‑Thought Emergence

The article explains how ChatGPT’s surprising step‑by‑step reasoning, known as Chain of Thought, emerged as a technical breakthrough, links it to model scaling, cognitive System 1/2 theory, and the influence of code data in training large language models.

AI reasoning · ChatGPT · Emergence

0 likes · 10 min read

Rare Earth Juejin Tech Community

May 5, 2023 · Artificial Intelligence

Limitations of Generative Pre‑trained Transformers: Hallucinations, Memory, Planning, and Architectural Proposals

The article critically examines GPT‑4 and similar transformer models, highlighting persistent hallucinations, outdated knowledge, insufficient domain coverage, lack of planning and memory, and proposes architectural extensions inspired by fast‑slow thinking and differentiable modules to overcome these fundamental constraints.

AI limitations · GPT-4 · Model architecture

0 likes · 24 min read

Architect

Apr 27, 2023 · Artificial Intelligence

Survey of Large Language Model Research: From GPT‑1 to ChatGPT and Open‑Source Alternatives

This article provides a comprehensive overview of the development of large language models, reviewing classic papers from GPT‑1 through GPT‑4, discussing open‑source implementations such as LLaMA, Alpaca, GLM, and ChatGLM, and analyzing training methods, datasets, and future research directions.

AI research · GPT · large language models

0 likes · 36 min read

Kuaishou Tech

Apr 23, 2023 · Artificial Intelligence

Kuaishou & Renmin AI Institute: Driving Multimodal Large Model Innovation

The article details how Kuaishou’s multimodal AI research, including its K7 trillion‑parameter model and VLUA algorithm, partners with Renmin University’s Gaoling AI Institute to launch a joint lab, produce cutting‑edge papers such as WebBrain and ChatImg, and advance recommendation and search technologies across the short‑video ecosystem.

AI · Industry collaboration · Recommendation Systems

0 likes · 17 min read

DataFunSummit

Apr 20, 2023 · Artificial Intelligence

Mengzi Lightweight Model Technology System and Advances in Small‑Scale and Retrieval‑Augmented Pretraining

This presentation introduces the Mengzi lightweight model technology stack, covering large‑scale pre‑training, motivations for lightweight models, detailed techniques such as knowledge and sequence‑relation enhancement, training optimization, model compression, retrieval‑augmented pre‑training, multimodal extensions, open‑source releases, and real‑world applications.

Multimodal · knowledge distillation · large language models

0 likes · 23 min read

IT Architects Alliance

Apr 20, 2023 · Artificial Intelligence

Overview of Prominent Large Language Models and Instruction‑Finetuned Variants

This article provides a comprehensive overview of major large language models—including GPT series, T5, LaMDA, LLaMA, BLOOM, and others—detailing their architectures, parameter scales, open‑source status, and the evolution of instruction‑fine‑tuning techniques that improve zero‑shot and few‑shot performance.

AI research · Instruction Tuning · LLM comparison

0 likes · 24 min read

Architect

Apr 19, 2023 · Artificial Intelligence

Emergence in Large Language Models: Phenomena, Explanations, and Implications

This article reviews the emergence phenomena observed in large language models, explains how model scale, in‑context learning and chain‑of‑thought prompting contribute to sudden performance gains, discusses small‑model alternatives, and explores the relationship between emergence and the training‑time Grokking effect.

AI research · Emergence · In-Context Learning

0 likes · 13 min read

DataFunTalk

Apr 19, 2023 · Artificial Intelligence

Is the Daily Emergence of Large Language Models Beneficial?

The article examines the rapid proliferation of large language models, weighing both the opportunities for experimentation and the drawbacks of noise, and argues that establishing authoritative Chinese LLM evaluation benchmarks is essential to guide meaningful progress in the field.

AI research · LLM evaluation · large language models

0 likes · 7 min read

Architect

Apr 14, 2023 · Artificial Intelligence

Overview of Prominent Large Language Models and Instruction Fine‑Tuning Techniques

The article surveys major large language models—including GPT‑3, T5, LaMDA, Jurassic‑1, MT‑NLG, Gopher, Chinchilla, PaLM, U‑PaLM, OPT, LLaMA, BLOOM, GLM‑130B, and ERNIE 3.0 Titan—explains their architectures, scaling trade‑offs, and then details instruction‑fine‑tuned variants such as T0, FLAN, GPT‑3.5, ChatGPT, GPT‑4, Alpaca and ChatGLM, providing references for further study.

AI · ChatGPT · GPT-3

0 likes · 27 min read

ITPUB

Apr 14, 2023 · Artificial Intelligence

How Do Generative, Perceptual, and Decision AI Interact? Insights from Jina AI’s Founder

In this interview, Jina AI’s founder Shao Han examines the relationships among generative, perceptual, and decision AI, compares single‑modal and multimodal approaches, discusses large language model development, and evaluates the impact of ChatGPT on search and future AI commercialization.

AI · AI commercialization · Multimodal AI

0 likes · 11 min read

Programmer DD

Apr 14, 2023 · Artificial Intelligence

How DeepSpeed-Chat Accelerates ChatGPT‑Style Model Training by 15×

Microsoft open‑sourced DeepSpeed‑Chat, a toolkit that streamlines the end‑to‑end training and inference of ChatGPT‑like large language models using RLHF, delivering up to fifteen‑fold speedups and dramatically lower costs, even on a single GPU.

ChatGPT · DeepSpeed · RLHF

0 likes · 8 min read

Top Architect

Apr 12, 2023 · Artificial Intelligence

Data‑Centric AI Perspective on GPT Models: Training, Inference, and Maintenance

This article examines how large language models such as GPT‑1 through GPT‑4 succeed largely due to high‑quality, large‑scale training data, and explains the Data‑centric AI framework—training data development, inference data development, and data maintenance—while discussing prompt engineering, data‑driven improvements, and future trends in AI.

AI · Data‑Centric AI · GPT

0 likes · 19 min read

Architect

Apr 9, 2023 · Artificial Intelligence

Evaluating the Commonsense Knowledge and Reasoning Capabilities of ChatGPT and Other Large Language Models

This study systematically evaluates ChatGPT and other large language models on their ability to answer commonsense questions, assess their knowledge awareness, and utilize generated knowledge for reasoning, revealing strong QA performance but notable gaps in social and temporal commonsense and in leveraging contextual knowledge.

ChatGPT · NLP · commonsense reasoning

0 likes · 20 min read

DataFunSummit

Apr 7, 2023 · Artificial Intelligence

China's AI Startup Landscape: Who Holds the Greatest Advantage in the Large Model Race?

Amid the 2023 AI boom, veteran entrepreneurs, former tech executives, and academic teams in China are racing to launch large‑model ventures, each leveraging distinct funding, talent, and product strategies, while major tech giants simultaneously push their own AI offerings, reshaping the market dynamics.

AI startups · China AI · large language models

0 likes · 12 min read

DataFunTalk

Apr 6, 2023 · Artificial Intelligence

A Comprehensive Survey of Large Language Models: Background, Capabilities, Key Technologies, and Future Directions

This article reviews the rapid progress of large language models (LLMs), covering their historical development, scaling laws, emergent abilities, core technologies such as training and alignment, resource ecosystems, evaluation methods, safety concerns, and prospective research challenges.

AI research · Alignment · LLM

0 likes · 21 min read

Xiaohongshu Tech REDtech

Apr 3, 2023 · Industry Insights

What Drives Intelligent Recommendation and Search? Key Takeaways from Xiaohongshu’s CCF C³ Event

The CCF C³ event at Xiaohongshu gathered leading researchers and industry experts to dissect the latest advances, challenges, and future opportunities in intelligent recommendation and search, including multimodal content handling, decentralized distribution, cold‑start solutions, and the impact of large language models.

AI · Recommendation Systems · Search

0 likes · 11 min read

Python Programming Learning Circle

Mar 25, 2023 · Artificial Intelligence

Impact of ChatGPT and Large Language Models on the U.S. Labor Market

A recent OpenAI study estimates that roughly 80% of U.S. workers could see at least 10% of their tasks affected by ChatGPT and similar large language models, with about 19% of workers potentially seeing at least half of their tasks affected, highlighting widespread economic implications across all wage levels.

AI economics · ChatGPT · GPT‑4

0 likes · 9 min read

Baidu Geek Talk

Mar 21, 2023 · Artificial Intelligence

Infrastructure Challenges and Solutions for Large‑Scale AI Model Training

The article explains how the massive compute and storage demands of today’s large language models create a “compute wall” and “storage wall,” and describes Baidu Intelligent Cloud’s four‑layer full‑stack infrastructure—combining advanced parallelism techniques, optimized GPU networking, static‑graph compilation, and cost‑model‑driven placement—to train trillion‑parameter models efficiently.

AI Infrastructure · Cost Model · Distributed Training

0 likes · 27 min read

Efficient Ops

Mar 16, 2023 · Artificial Intelligence

What’s New in AI? Baidu’s Wenxin, GPT‑4 Multimodal, Docker’s Policy Shift & More

The article reviews the latest AI breakthroughs—including Baidu’s Wenxin Yiyan launch, OpenAI’s multimodal GPT‑4, Google’s PaLM API, Microsoft’s super‑computer investment—and also covers Docker’s paid‑plan warning, the US demand for TikTok’s sale, and a Stack Overflow technology survey.

Docker · GPT-4 · Google PaLM

0 likes · 7 min read

DataFunTalk

Mar 16, 2023 · Artificial Intelligence

Technical Optimizations and Breakthroughs of GPT‑4: Multimodal Capabilities, Alignment Strategies, and Predictable Scaling

The article summarizes the technical innovations behind GPT‑4, highlighting its multimodal abilities, improved alignment methods, scaling‑law‑based performance prediction, and remaining limitations, while referencing the official OpenAI technical report and community analyses.

AI research · Alignment · GPT-4

0 likes · 10 min read

21CTO

Mar 11, 2023 · Artificial Intelligence

Microsoft Announces Multimodal GPT-4: A New ‘iPhone Moment’ for AI

Microsoft Germany's CTO announced the imminent release of a multimodal GPT‑4, highlighting its ability to process text, images and video, while executives liken the breakthrough to an “iPhone moment” for AI, emphasizing new capabilities, industry disruption, and responsible data use.

AI Development · GPT-4 · Microsoft

0 likes · 6 min read

DataFunSummit

Feb 28, 2023 · Artificial Intelligence

Baidu Document Intelligence Technology Overview and Applications

This article presents a comprehensive overview of Baidu's document intelligence technologies—including the ERNIE‑Layout multimodal large model, the prompt‑based DocPrompt extraction system, layout and table understanding techniques, and PaddleNLP open‑source integration—detailing their architectures, challenges, solutions, performance benchmarks, and real‑world application cases across multiple industries.

DocPrompt · Document Intelligence · ERNIE-Layout

0 likes · 19 min read

21CTO

Feb 27, 2023 · Artificial Intelligence

What’s Next for Large Language Models? Emerging Trends Shaping AI

The article explores three emerging directions for next‑generation large language models—self‑generated training data, built‑in verification with external retrieval, and massive sparse‑expert architectures—highlighting recent research, practical challenges, and their potential to reshape AI development.

AI research · generative AI · large language models

0 likes · 17 min read

21CTO

Feb 23, 2023 · Artificial Intelligence

How Does ChatGPT Really Work? Inside the RLHF Training Process

This article explains ChatGPT’s architecture, the distinction between model capability and consistency, how next‑token and masked‑language‑model training lead to inconsistencies, and how OpenAI’s supervised fine‑tuning, reward‑model training, and PPO reinforcement learning (RLHF) are combined to improve alignment while highlighting the method’s limitations.

AI Alignment · ChatGPT · RLHF

0 likes · 15 min read

21CTO

Feb 20, 2023 · Artificial Intelligence

How the AI Arms Race Between Microsoft and Google Is Reshaping Search

The escalating competition between Microsoft and Google over large language models is driving new search experiences, reshaping AI research, and raising concerns about content quality, privacy, and the future of smaller AI innovators.

AI competition · Bing · ChatGPT

0 likes · 10 min read

dbaplus Community

Feb 18, 2023 · Artificial Intelligence

Why ChatGPT Still Gets It Wrong: Inside RLHF and Model Consistency

ChatGPT, OpenAI’s latest language model, builds on GPT‑3 but uses supervised fine‑tuning and Reinforcement Learning from Human Feedback (RLHF) to improve alignment, yet its training methods still cause consistency issues such as unhelpful responses, hallucinations, bias, and limited explainability.

ChatGPT · Model Alignment · PPO

0 likes · 17 min read

Architecture Digest

Feb 17, 2023 · Artificial Intelligence

Analyzing the Emergent Abilities of ChatGPT and the Technical Roadmap of GPT‑3.5

This article dissects how ChatGPT acquired its surprising capabilities by tracing the evolution from the original GPT‑3 model through instruction tuning, code‑based pre‑training, and reinforcement learning from human feedback, ultimately presenting a comprehensive technical roadmap for reproducing GPT‑3.5‑scale models.

ChatGPT · GPT-3.5 · Instruction Tuning

0 likes · 26 min read

DataFunTalk

Feb 15, 2023 · Artificial Intelligence

Three Emerging Directions for Next‑Generation Large Language Models

The article outlines three promising research avenues—self‑generated training data, model‑driven fact‑checking, and sparse expert architectures—that could shape the next wave of large language model innovation and address current limitations such as data scarcity and hallucinations.

AI research · large language models · model self‑improvement

0 likes · 14 min read

Architects' Tech Alliance

Feb 13, 2023 · Artificial Intelligence

Why ChatGPT’s Explosive Growth Is Redefining the AI Landscape

Within five days of its launch, ChatGPT attracted over one million users, prompting massive investment from Microsoft, sparking fierce competition with Google, and raising critical questions about the technology's scalability, reliability, and societal impact.

AI industry · ChatGPT · Microsoft

0 likes · 18 min read

Architects' Tech Alliance

Feb 13, 2023 · Artificial Intelligence

Do Large Language Models Really Have Theory of Mind? Stanford Study Reveals Surprising Results

A recent Stanford paper shows that GPT‑3.5 and its predecessor can pass classic Theory of Mind tests at levels comparable to 7‑9‑year‑old children, sparking debate over whether these abilities are genuine understanding or emergent by‑products of scaling.

AI Evaluation · GPT-3.5 · Stanford Research

0 likes · 10 min read

Open Source Linux

Feb 13, 2023 · Artificial Intelligence

How Does ChatGPT Work? Inside RLHF and Model Consistency

This article explains the inner workings of ChatGPT, detailing its evolution from GPT‑3, the role of reinforcement learning from human feedback (RLHF) in improving consistency, the training pipeline steps, and the limitations and evaluation methods of large language models.

AI · ChatGPT · Model Alignment

0 likes · 15 min read

DataFunSummit

Feb 12, 2023 · Artificial Intelligence

Claude vs. ChatGPT: Constitutional AI, RLAIF, and the Quest for Safer Large‑Language Models

This article reviews Anthropic's Claude assistant, explains the novel Constitutional AI (RLAIF) approach that replaces costly human‑feedback data with a set of natural‑language principles, compares Claude with ChatGPT across helpfulness and harmlessness, and details the supervision and reinforcement‑learning pipelines, data annotation, and experimental results that demonstrate superior safety performance.

AI Safety · Claude · Harmlessness

0 likes · 21 min read

Top Architect

Feb 11, 2023 · Artificial Intelligence

ChatGPT: Technical Overview, Architecture, Training Process, Limitations and Future Directions

This article provides a comprehensive technical overview of ChatGPT, covering its origins, underlying GPT architecture, reinforcement learning from human feedback, training stages, current limitations, and prospective improvements such as model compression, constitutional AI, and integration with AIGC technologies.

AIGC · ChatGPT · RLHF

0 likes · 18 min read

Laravel Tech Community

Feb 9, 2023 · Artificial Intelligence

Understanding ChatGPT: Architecture, Training Strategies, and Alignment Challenges

This article explains how ChatGPT builds on GPT‑3, describes the supervised‑plus‑reinforcement learning (RLHF) pipeline that fine‑tunes the model, compares model capability with consistency, and discusses the performance evaluation and remaining limitations of large language models.

Alignment · ChatGPT · Model Training

0 likes · 15 min read

Architect

Feb 9, 2023 · Artificial Intelligence

Emergent Abilities of Large Language Models: Complex Reasoning, Knowledge Reasoning, and Out‑of‑Distribution Robustness

This article reviews recent research on the emergent abilities of large language models—such as chain‑of‑thought reasoning, knowledge retrieval without external sources, and robustness to distribution shifts—examining scaling laws, model size thresholds, and the open questions surrounding a potential paradigm shift from fine‑tuning to in‑context learning.

AI research · chain-of-thought prompting · emergent abilities

0 likes · 23 min read

Top Architect

Feb 9, 2023 · Artificial Intelligence

How ChatGPT Works: Training, RLHF, and Consistency Issues

ChatGPT, OpenAI’s latest language model, builds on GPT‑3 and improves performance through supervised fine‑tuning and reinforcement learning from human feedback (RLHF) with PPO optimization, addressing consistency challenges such as misaligned outputs, bias, and hallucinations while evaluating helpfulness, truthfulness, and harmlessness.

ChatGPT · Model Alignment · RLHF

0 likes · 15 min read

IT Architects Alliance

Feb 9, 2023 · Artificial Intelligence

How ChatGPT Works: Model Architecture, Training Strategies, and RLHF

ChatGPT, OpenAI’s latest language model, builds on GPT‑3 using supervised fine‑tuning and Reinforcement Learning from Human Feedback (RLHF) with PPO, addressing consistency issues by aligning model outputs with human preferences, while discussing training methods, limitations, and evaluation metrics.

AI Alignment · ChatGPT · PPO

0 likes · 15 min read

Top Architect

Feb 8, 2023 · Artificial Intelligence

A Technical Roadmap of GPT‑3.5: From Pre‑training to RLHF and Emerging Capabilities

This article analyses how ChatGPT and the GPT‑3.5 series evolved from the original GPT‑3 through large‑scale pre‑training, code‑based training, instruction tuning, and reinforcement learning from human feedback, identifying the origins of their language generation, in‑context learning, world knowledge, code understanding, chain‑of‑thought reasoning, and alignment capabilities while also outlining current limitations.

ChatGPT · GPT-3.5 · Instruction Tuning

0 likes · 27 min read
