Tagged articles

Large Language Models

1206 articles · Page 11 of 13

May 11, 2024 · Artificial Intelligence

Will U.S. AI Export Controls Stall Global Large‑Model Development?

The United States is drafting a bipartisan bill to impose export controls on advanced proprietary AI models, aiming to shield its technology from China, Russia, North Korea and Iran, while confronting challenges of open‑source model regulation and potential geopolitical retaliation.

AI policy · China · Large Language Models

0 likes · 7 min read

Baidu Tech Salon

May 10, 2024 · Artificial Intelligence

Baidu Comate: Core Capabilities of Intelligent Code Assistant

The article surveys Baidu Comate, an AI‑powered code assistant built on the Wenxin (ERNIE) large model, tracing software development from the 1950s crisis through the internet and open‑source era to today’s AI‑driven tools, and highlights its features and demonstration at a global development conference.

AI coding · Baidu Comate · IDE plugin

0 likes · 7 min read

Architects' Tech Alliance

May 9, 2024 · Artificial Intelligence

AI Servers: Market Opportunities, Architecture, and Future Demand Driven by Generative AI

The article examines how the surge of generative AI (AIGC) is fueling rapid growth in AI server demand, detailing the emerging AIGC ecosystem, server hardware composition, model scaling, heterogeneous computing, training vs. inference workloads, market size forecasts, and the competitive landscape of AI server manufacturers.

AI Infrastructure · AI servers · GPU

0 likes · 15 min read

DataFunTalk

May 7, 2024 · Artificial Intelligence

Large Language Models and Knowledge Graphs: Recent Advances, Synergies, and Future Directions

This article reviews the rapid progress of large language models, compares them with knowledge graphs, explores how LLMs can aid knowledge extraction and completion, discusses how knowledge graphs can evaluate and enhance LLMs, and outlines future interactive integration between the two technologies.

AI · Evaluation · Large Language Models

0 likes · 12 min read

Rare Earth Juejin Tech Community

May 2, 2024 · Artificial Intelligence

Understanding Large Language Models: Principles, Training, Risks, and Application Security

This article provides a comprehensive overview of large language models (LLMs), explaining their core concepts, transformer architecture, training stages, and known shortcomings such as hallucination and the reversal curse, highlighting emerging security threats like prompt injection and jailbreaking, and offering guidance for safe deployment.

AI safety · LLM · Large Language Models

0 likes · 21 min read

21CTO

Apr 28, 2024 · Artificial Intelligence

5 Transformative Business Use Cases for Conversational AI

This article explores how conversational AI, powered by large language models, is reshaping enterprise operations across five key scenarios—from customer support assistants and AI‑driven data interfaces to HR bots, unstructured data processing, and multi‑agent digital assistants—highlighting benefits, implementation considerations, and privacy challenges.

Conversational AI · Customer Support · Data Integration

0 likes · 13 min read

DataFunTalk

Apr 26, 2024 · Artificial Intelligence

Large Language Models in the Automotive Industry: Overview, Impact, and Practical Exploration

This article examines how large language models such as GPT and Transformer‑based architectures are reshaping the automotive sector by enhancing in‑vehicle intelligence, streamlining product development, improving customer service, and redefining data analyst roles, while also presenting practical experiments, deployment challenges, and future directions.

Automotive AI · GPT · LLM applications

0 likes · 18 min read

Sohu Tech Products

Apr 24, 2024 · Artificial Intelligence

Evolution, Architecture, Training Data, Methods, and Performance of Meta's Llama Series (Llama 1, 2, 3)

Meta's Llama series has progressed from the 7B to 65B Llama‑1 in early 2023 to the 8B and 70B Llama‑3 in 2024, scaling training data from 1 T to over 15 T tokens, adopting decoder‑only Transformers with RMSNorm, SwiGLU, RoPE and GQA, and adding supervised fine‑tuning, RLHF and DPO, resulting in state‑of‑the‑art benchmark performance and a vibrant open‑source ecosystem.

AI · LLaMA · Large Language Models

0 likes · 25 min read

21CTO

Apr 23, 2024 · Artificial Intelligence

Deploy Large Language Models with vLLM and Quantization for Low Latency

This guide explains how to deploy open‑source large language models using vLLM, benchmark latency and throughput, and apply 8‑bit/4‑bit quantization techniques such as bitsandbytes and NF4 to achieve faster inference on limited‑GPU hardware.

LLM deployment · Large Language Models · Python

0 likes · 13 min read

MoonWebTeam

Apr 23, 2024 · Artificial Intelligence

Exploring Devika AI: An Open‑Source AI Programmer’s Capabilities and Limits

Devika AI, an open‑source AI programmer from Stition AI, is examined for its architecture, supported actions, installation steps, and real‑world performance across tasks such as building a Snake game, Conway’s Game of Life, Vue3 components, and unit‑test generation, highlighting strengths, weaknesses, and future potential.

Devika AI · Large Language Models · tool evaluation

0 likes · 21 min read

NewBeeNLP

Apr 22, 2024 · Artificial Intelligence

Why LLAMA‑3’s Scaling Laws Signal the Next AI Frontier

The article analyzes LLAMA‑3’s architectural tweaks, massive data expansion, scaling‑law implications, open‑source versus closed‑source dynamics, and the critical role of synthetic data in sustaining large‑model progress beyond 2025.

LLAMA-3 · Large Language Models · Open-source AI

0 likes · 10 min read

Baobao Algorithm Notes

Apr 21, 2024 · Artificial Intelligence

Why Llama 3’s Open‑Source Release Could Redefine Large‑Model Scaling and Synthetic Data

The article analyzes Llama 3’s architecture, training data expansion, model variants, Meta’s open‑source strategy, the evolving gap between open and closed models, and how future breakthroughs in synthetic data will shape scaling laws and large‑model progress through 2025 and beyond.

AI trends · Large Language Models · Llama3

0 likes · 12 min read

Xiaohe Frontend Team

Apr 21, 2024 · Artificial Intelligence

What’s New in Generative AI? VASA‑1, Llama‑3, Stable Diffusion 3 & More

The article reviews the latest breakthroughs in generative AI, including Microsoft’s VASA‑1 video synthesis model, Meta’s open‑source Llama‑3 large language model, Stability AI’s Stable Diffusion 3 API, Adobe’s integration of third‑party AI video tools into Premiere Pro, and a free image‑style‑recreation platform from Freepik, highlighting their technical details and potential applications.

AI tools · Diffusion Models · Generative AI

0 likes · 13 min read

AntTech

Apr 19, 2024 · Artificial Intelligence

OneKE: Open-Source Bilingual Knowledge Extraction Framework for Large Language Models

OneKE, an open‑source bilingual (Chinese‑English) knowledge extraction framework jointly developed by Ant Group and Zhejiang University, enables efficient extraction of entities, relations, and events to build domain knowledge graphs that enhance large language models’ reasoning, reduce hallucinations, and support applications in medical, financial, and governmental sectors.

Artificial Intelligence · Knowledge Extraction · Large Language Models

0 likes · 5 min read

DevOps

Apr 17, 2024 · Artificial Intelligence

Engineering Capabilities for Enterprise Large Model Applications: Prompt Engineering, RAG, and Fine‑Tuning

The article explores how enterprises can build and improve large‑model applications by combining prompt engineering, retrieval‑augmented generation (RAG), and fine‑tuning, discusses their relationships, optimization dimensions, testing challenges, and provides practical guidance for SE4AI implementation.

AI Engineering · Enterprise AI · Large Language Models

0 likes · 20 min read

AntTech

Apr 17, 2024 · Artificial Intelligence

LLMRG: Improving Recommendations through Large Language Model Reasoning Graphs

LLMRG introduces a novel framework that leverages large language models to construct personalized reasoning graphs, integrating chain reasoning, self‑verification, divergent extension, and knowledge‑base self‑improvement, thereby enhancing recommendation accuracy, interpretability, and performance across multiple benchmark datasets without additional user or item information.

AI · Large Language Models · Recommendation Systems

0 likes · 9 min read

360 Tech Engineering

Apr 15, 2024 · Artificial Intelligence

Fine‑Tuning Large Language Models: A Practical Guide Using Qwen‑14B on the 360AI Platform

This article explains the concept, motivations, and step‑by‑step workflow for fine‑tuning large language models—specifically Qwen‑14B—covering data preparation, training commands with DeepSpeed, hyper‑parameter settings, evaluation, and deployment via FastChat, all illustrated with code snippets and configuration details.

AI · DeepSpeed · FastChat

0 likes · 10 min read

Architects' Tech Alliance

Apr 13, 2024 · Industry Insights

Why AI Servers Are Poised for Explosive Growth: Trends, Architecture, and Demand Forecast

The article analyzes how the surge in AIGC and large language models is reshaping the AI server market, detailing hardware composition, the rise of heterogeneous computing, GPU advantages, demand calculations for models like GPT‑3, and the competitive landscape driving rapid industry growth.

AI servers · AIGC · GPU computing

0 likes · 16 min read

DataFunSummit

Apr 13, 2024 · Artificial Intelligence

Understanding and Mitigating Hallucinations in Large Language Model Industry Q&A with Knowledge Graphs

This article examines why large language models often produce hallucinations in industry question‑answering, defines the phenomenon, explores its data and training origins, proposes evaluation metrics, and presents practical strategies—including high‑quality fine‑tuning data, honest refusal mechanisms, advanced decoding methods, and external knowledge‑graph augmentation—to reduce hallucinations and improve reliability.

AI evaluation · Hallucination · Large Language Models

0 likes · 21 min read

NewBeeNLP

Apr 13, 2024 · Artificial Intelligence

How a Multimodal ‘Joke‑King’ Model Beats GPT‑4 at Humor Generation

A research team from Sun Yat‑sen University, Sea AI Lab and Harvard built a multimodal large model that learns to generate creative jokes and memes by training on the Oogiri‑GO dataset, introducing a Leap‑of‑Thought (LoT) paradigm and CLoT fine‑tuning, which outperforms GPT‑4 and other state‑of‑the‑art models in humor tasks.

CLoT · Large Language Models · Leap-of-Thought

0 likes · 9 min read

Data Thinking Notes

Apr 11, 2024 · Artificial Intelligence

How Financial Institutions Are Building Their Own Large Language Models

This article explores how the finance sector is creating specialized large language models—covering the shift from generic to domain‑specific models, training innovations, evaluation methods, and real‑world applications such as marketing, customer service, risk control, and operational analytics.

Applications · Large Language Models · Model Training

0 likes · 16 min read

Cloud Native Technology Community

Apr 11, 2024 · Cloud Native

Why Kubernetes Is the Ideal Platform for Deploying Large Language Models

Deploying large language models demands massive compute, flexible scaling, and robust resource management. This article explains how Kubernetes’s auto‑scaling, portability, cloud‑native features, observability tools, and multi‑tenant isolation make it the optimal platform for training, serving, and iterating LLM workloads.

Cloud Native · Kubernetes · Large Language Models

0 likes · 17 min read

DataFunSummit

Apr 9, 2024 · Artificial Intelligence

Knowledge Map for Large Model Application Development

This article outlines a comprehensive knowledge map for building large‑model applications, detailing a four‑layer technical architecture, development lifecycle, core elements such as prompt engineering and fine‑tuning, evaluation methods, and real‑world case studies across various AI use cases.

AI application development · Large Language Models · model fine-tuning

0 likes · 12 min read

NewBeeNLP

Apr 8, 2024 · Artificial Intelligence

What Will Recommendation Systems Look Like in 2026? Emerging Trends and Challenges

This article analyzes the current bottlenecks of conventional recommendation systems and outlines ten forward‑looking research directions for 2026, including retention improvement, user growth, content ecosystem, multi‑objective Pareto optimization, long‑term value estimation, site‑wide optimization, interactive recommendation, personalized modeling, decision‑theoretic framing, and the integration of large language models via the OneRec framework.

Large Language Models · User Retention · interactive recommendation

0 likes · 18 min read

NewBeeNLP

Apr 7, 2024 · Artificial Intelligence

Can Large Language Models Learn Recommendation Knowledge? A NL‑Simulated Auxiliary Task

This article reviews a recent study that bridges the knowledge gap between large language models and recommendation systems by generating natural‑language auxiliary tasks, fine‑tuning the models, and achieving notable performance gains on Amazon domain benchmarks.

AI research · Large Language Models · fine‑tuning

0 likes · 4 min read

DataFunTalk

Apr 4, 2024 · Artificial Intelligence

Enhancing Interactive Agents with Large Language Models: The SwiftSage Framework

This article reviews the challenges of text‑only large language model interaction, introduces benchmark environments such as ALFWorld and ScienceWorld, compares baseline reinforcement‑learning approaches, and presents SwiftSage, a hybrid system that combines a fast T5‑based small model with a powerful LLM for planning and grounding, demonstrating superior performance, efficiency, and cost‑effectiveness while outlining current limitations and future research directions.

AI · Large Language Models · SwiftSage

0 likes · 22 min read

DataFunTalk

Apr 3, 2024 · Artificial Intelligence

Future Directions of Recommendation Systems: Retention, User Growth, Content Ecosystem, Multi‑Objective Optimization, and Large‑Model Fusion

This presentation outlines the current bottlenecks of conventional recommendation pipelines and proposes a 2026 roadmap that includes retention improvement, user‑growth strategies, content‑ecosystem metrics, Pareto‑optimal multi‑objective optimization, long‑term value modeling, site‑wide spatial optimization, interactive recommendation, personalized modeling, and large‑model fusion through the OneRec framework.

Large Language Models · Recommendation Systems · User Retention

0 likes · 18 min read

DataFunTalk

Apr 2, 2024 · Artificial Intelligence

User Portrait Algorithms: From Ontology‑Based Methods to Deep Learning and Future Directions

This article provides a comprehensive overview of user portrait algorithms, covering their historical development, ontology‑based traditional approaches, deep‑learning enhancements, representation‑learning techniques such as lookalike, active‑learning driven iteration, and the integration of large‑model world knowledge, while also discussing current challenges and future research directions.

Active Learning · Deep Learning · Large Language Models

0 likes · 26 min read

DataFunTalk

Apr 1, 2024 · Artificial Intelligence

Large Model Applications in the Financial Sector: Practices, Knowledge Graphs, Ethics, and Ecosystem

This article presents a comprehensive overview of how large AI models are being applied in finance, covering development trends, practical use cases, knowledge‑graph integration, safety mechanisms, ethical considerations, and the evolving ecosystem of model‑centric solutions.

AI Applications · Large Language Models · finance AI

0 likes · 24 min read

DataFunSummit

Mar 31, 2024 · Artificial Intelligence

Challenges and Techniques in Distributed Training of Large Language Models

This article reviews the rapid development of large language models since 2019, outlines the historical background, identifies key challenges such as massive compute demand, memory constraints, and system complexity, and then details distributed training technologies—including data parallelism, pipeline parallelism, and advanced optimization strategies—while also discussing future research directions and answering common questions.

AI Infrastructure · DeepSpeed · Large Language Models

0 likes · 23 min read

DaTaobao Tech

Mar 29, 2024 · Artificial Intelligence

Text-to-SQL with Large Language Models: DIN-SQL Approach

The DIN‑SQL approach enhances Text‑to‑SQL performance by using large language models in a decomposed in‑context learning framework with schema linking, query classification, SQL generation, and self‑correction modules, achieving state‑of‑the‑art 85.3% execution accuracy on the Spider benchmark by breaking complex queries into manageable sub‑tasks.

AI research · Database Querying · Large Language Models

0 likes · 34 min read

Sohu Tech Products

Mar 27, 2024 · Artificial Intelligence

NVIDIA NeMo Framework, TensorRT‑LLM, and RAG for Large Language Model Solutions

NVIDIA’s LLM ecosystem combines the full‑stack NeMo Framework for data curation, distributed training, and fine‑tuning with inference acceleration through TensorRT‑LLM and Triton, plus Retrieval‑Augmented Generation and Guardrails, enabling efficient, low‑latency, knowledge‑grounded model deployment across clusters.

AI acceleration · Large Language Models · Model Training

0 likes · 16 min read

Alibaba Cloud Big Data AI Platform

Mar 26, 2024 · Artificial Intelligence

MoE LLMs: How Alibaba Cloud & NVIDIA Megatron-Core Accelerate Training

This article reviews the evolution of Mixture-of-Experts (MoE) models, details Alibaba Cloud’s collaboration with NVIDIA’s Megatron-Core to build a high-performance MoE framework, and presents extensive training optimizations, benchmark results, conversion tools, and best-practice guidelines for large-scale LLM development and deployment.

Alibaba Cloud · Large Language Models · Megatron-Core

0 likes · 18 min read

NewBeeNLP

Mar 21, 2024 · Artificial Intelligence

Mastering Large Language Model Training: Key Challenges and Optimization Strategies

This article examines the resource and efficiency challenges of scaling large language model training, explains data, model, pipeline, and tensor parallelism, and provides practical I/O, communication, and stability optimization techniques—including high‑availability storage, RDMA networking, NCCL tuning, and fault‑tolerant recovery—to improve throughput and reliability.

AI Engineering · I/O optimization · Large Language Models

0 likes · 15 min read

TAL Education Technology

Mar 20, 2024 · Artificial Intelligence

Understanding AI: From Brain Differences to Data Science Practices and Large Model Applications

This article explains why current AI cannot achieve self‑awareness, outlines data‑science steps for large models—including preprocessing, exploratory analysis, modeling, and evaluation—then surveys general and vertical applications of large language models and details a complete machine‑learning workflow with transformer fine‑tuning techniques.

AI · Applications · Large Language Models

0 likes · 14 min read

DataFunTalk

Mar 20, 2024 · Artificial Intelligence

Challenges and Optimization Techniques for Large Language Model Training

The article outlines the resource and efficiency challenges of scaling large language models, explains data and model parallelism strategies, and details practical I/O, communication, and stability optimizations—including high‑availability storage, RDMA networking, and fault‑tolerance measures—to improve training throughput and reliability.

AI Engineering · I/O optimization · Large Language Models

0 likes · 13 min read

DataFunTalk

Mar 17, 2024 · Artificial Intelligence

Leveraging Large Language Models to Enhance Comprehensive Graph Learning Capabilities

In this talk, researcher Jiang Zhuoren from Zhejiang University reviews the current state of large language models applied to graph learning, discusses their roles across various graph scenarios, and outlines promising research directions for unified cross‑domain graph learning.

Artificial Intelligence · Large Language Models · cross-domain learning

0 likes · 3 min read

Model Perspective

Mar 16, 2024 · Artificial Intelligence

What Watching a TV Drama Reveals About AI Model Training and Learning Strategies

The article draws parallels between expert viewers dissecting the drama "The Legend of Zhen Huan," efficient paper‑reading techniques, and the active‑prediction plus contrast‑learning approach that underpins modern AI model training, highlighting how proactive thinking boosts both personal and machine learning outcomes.

AI training · Active Learning · Large Language Models

0 likes · 8 min read

DataFunSummit

Mar 14, 2024 · Artificial Intelligence

Multi‑Level Efficiency Challenges and Emerging Paradigms for Large AI Models

The article examines how large AI models are moving toward a unified, low‑knowledge‑density paradigm that raises computational efficiency challenges across model, algorithm, framework, and infrastructure layers, while also highlighting NVIDIA's GTC 2024 China AI Day sessions that showcase practical solutions and upcoming training opportunities.

AI Infrastructure · AI conferences · Large Language Models

0 likes · 10 min read

Smart Era Software Development

Mar 13, 2024 · Industry Insights

2023 AI Landscape: Public Perceptions, Emerging Trends, and the Road to AGI

The article reviews 2023's rapid LLM advances, public hype versus long‑term reality, the lack of hard limits to AGI, the rise of imagination‑driven capabilities, startup challenges, model compression, multimodal breakthroughs, AI agents, and the persistent US‑China technology gap.

AGI · AI agents · AI startups

0 likes · 24 min read

21CTO

Mar 12, 2024 · Artificial Intelligence

How Google’s ‘Social Learning’ AI Framework Boosts Privacy‑Safe Model Training

Google’s newly unveiled “Social Learning” AI framework lets large models teach each other via natural language, improving task performance while avoiding direct use of sensitive data, and uses teacher‑student interactions, synthetic data, and instruction generation to enhance privacy‑preserving model training.

AI · Large Language Models · Privacy

0 likes · 4 min read

DataFunTalk

Mar 10, 2024 · Artificial Intelligence

Aligning Graph Models with Large Language Models for Open-Task Scenarios

This talk presents GraphTranslator, a framework that bridges pretrained graph models and large language models to enable unified handling of both predefined and open-ended graph analysis tasks by translating node representations into language tokens and training an alignment producer for node‑text pairs.

AI research · Graph Neural Networks · Large Language Models

0 likes · 3 min read

NewBeeNLP

Mar 10, 2024 · Industry Insights

What WWW'24 Papers Reveal About LLMs in Search & Recommendation

This overview summarizes six WWW 2024 industry papers that apply large language models to e‑commerce search, personalized query suggestion, article recommendation, collaborative filtering, and lifelong sequential behavior understanding, highlighting their methods, experimental results, deployment status, and emerging trends in LLM‑driven search and recommendation.

LLM · Large Language Models · Search

0 likes · 16 min read

DataFunTalk

Mar 7, 2024 · Artificial Intelligence

Enhancing Interactive Agents with Large Language Models: The SwiftSage Framework and Benchmark Analysis

This article reviews recent advances in using large language models for interactive embodied agents, introduces the SwiftSage dual‑model framework that combines a fast T5‑based small model with a powerful LLM for planning, evaluates it on benchmarks such as ALFWorld and ScienceWorld, and discusses efficiency, cost‑effectiveness, limitations, and future research directions.

AI · Large Language Models · SwiftSage

0 likes · 23 min read

Rare Earth Juejin Tech Community

Mar 7, 2024 · Artificial Intelligence

Anthropic Announces Claude 3 Model Family: Opus, Sonnet, and Haiku

Anthropic has launched the Claude 3 family of large language models (Opus, Sonnet, and Haiku), offering varying balances of intelligence, speed, and cost, with enhanced reasoning, multilingual, and vision capabilities, fewer refusals, and improved safety, now available via API in 159 countries.

AI safety · Anthropic · Claude 3

0 likes · 11 min read

Model Perspective

Mar 6, 2024 · Fundamentals

Why Managing a City Is Like Designing a Spaceship: Exploring Complex Systems

A look at how both spacecraft design and city governance exemplify complex systems, distinguishing closed from open systems, outlining the characteristics of complex and mega-complex systems, and linking these concepts to systems engineering pioneers like Qian Xuesen and to modern large language models.

Large Language Models · Qian Xuesen · open vs closed systems

0 likes · 9 min read

DataFunSummit

Mar 6, 2024 · Artificial Intelligence

Document Intelligence: Background, Technology, Large Models, and Enterprise Applications

This article presents a comprehensive overview of document intelligence, covering its background, technical evolution, large‑model advancements, and practical enterprise digital transformation use cases, with a focus on multimodal processing, unified document representation, and industry‑specific applications such as legal contract automation.

Enterprise Automation · Large Language Models · Multimodal AI

0 likes · 14 min read

Efficient Ops

Feb 27, 2024 · Artificial Intelligence

Can Large Language Models Truly Elevate Software Engineering? Insights and Roadmap

This article reviews the 2023 surge of large language models in software engineering, evaluates their current code generation, testing, and knowledge‑query capabilities, highlights persistent challenges in design and maintenance, and proposes concrete recommendations for advancing toward higher‑level intelligent development.

Generative AI · Large Language Models · Software Engineering

0 likes · 21 min read

NewBeeNLP

Feb 17, 2024 · Artificial Intelligence

How Sora Highlights the Next Leap Toward AGI and Shifts AI Competition

The article analyzes OpenAI's Sora video model, arguing that its integration of large‑language‑model reasoning with diffusion techniques marks a major step toward true world understanding, reshapes creative workflows, widens the AI talent gap, and accelerates the path to artificial general intelligence.

AGI · AI trends · Large Language Models

0 likes · 7 min read

NewBeeNLP

Feb 11, 2024 · Industry Insights

What 2023 Taught Us About LLMs and AI‑Guided Optimization

The author reviews a year of rapid progress in large language models, highlighting breakthrough papers such as Positional Interpolation, StreamingLLM, Deja Vu, and RLCD, and discusses how AI‑guided optimization techniques like SurCo, LANCER, and GenCo are reshaping research and industry applications.

LLM · Large Language Models · Transformers

0 likes · 13 min read

DataFunTalk

Feb 10, 2024 · Artificial Intelligence

Mitigating Hallucinations in Large Language Model Applications with Knowledge Graphs

This article examines the challenges of using large language models for industry Q&A, defines hallucination phenomena, evaluates their causes and impact, and proposes a set of strategies—including high‑quality fine‑tuning data, honest alignment, advanced decoding, and external knowledge‑graph augmentation—to reduce hallucinations and improve answer reliability.

Hallucination · Large Language Models · knowledge graph

0 likes · 21 min read

Cloud Native Technology Community

Feb 8, 2024 · Artificial Intelligence

How Retrieval‑Augmented Generation Boosts LLM Accuracy and Trust

Retrieval‑augmented generation (RAG) enhances large language models by fetching up‑to‑date, authoritative information from external sources, addressing hallucinations, outdated knowledge, and lack of citations, while offering cost‑effective implementation, improved relevance, user trust, and greater developer control through vector databases, semantic search, and prompt engineering.

AI · Large Language Models · RAG

0 likes · 10 min read

DataFunSummit

Feb 5, 2024 · Artificial Intelligence

Ant Group's Knowledge Graph: Overview, Construction, Applications, and Integration with Large Models

Ant Group shares its comprehensive knowledge graph initiatives, detailing the fundamentals, construction pipeline, fusion techniques, cognitive representations, diverse business applications, and the emerging synergy between knowledge graphs and large language models, illustrating how graph-based AI enhances accuracy, interpretability, and downstream services.

Artificial Intelligence · Data Integration · Graph Fusion

0 likes · 14 min read

MaGe Linux Operations

Jan 31, 2024 · Artificial Intelligence

Does Gemini Pro Really Outperform GPT‑4? A Deep Comparative Review

This article critically examines Google’s Gemini Pro against OpenAI’s GPT‑4 across reasoning, vision, token limits, benchmark data, and real‑world tasks, revealing where Gemini excels, where it falls short, and what to expect from the upcoming Gemini Ultra.

AI model comparison · Benchmark · GPT-4

0 likes · 13 min read

DataFunTalk

Jan 31, 2024 · Artificial Intelligence

Industry Trends and Challenges of Large Language Models in Enterprise Applications (2023 Review)

The article reviews the rapid development of large language models in enterprise settings, covering internal collaboration tools, AI assistants for development and marketing, multimodal generation, inference speed bottlenecks, resource constraints, and future directions such as open‑source models and academic‑industry cooperation.

AI assistants · AI in marketing · Enterprise AI

0 likes · 8 min read

Alibaba Cloud Big Data AI Platform

Jan 29, 2024 · Artificial Intelligence

Unlocking Sparse MoE Large Model Training with Megatron-Core on Alibaba Cloud

This article explains how Alibaba Cloud's PAI platform and NVIDIA's Megatron-Core enable efficient training of sparse Mixture-of-Experts (MoE) large language models, covering algorithm basics, the Megatron-Core MoE framework, weight conversion pipelines, and performance results on Mixtral‑8x7B.

Large Language Models · Megatron-Core · Mixture of Experts

0 likes · 18 min read

ZhongAn Tech Team

Jan 22, 2024 · Artificial Intelligence

Weekly Tech Overview: Major Industry Updates and AI Insights

This weekly tech overview summarizes major industry developments, including Huawei's HarmonyOS NEXT release, SenseTime's open‑source large language model InternLM2, the Apple‑Epic App Store dispute resolution, Xiaomi's 5G satellite terminal approval, Microsoft overtaking Apple in market value, and recent AI energy consumption concerns.

AI · HarmonyOS · Industry Updates

0 likes · 10 min read

Xiaohongshu Tech REDtech

Jan 20, 2024 · Artificial Intelligence

Decoding Xiaohongshu’s Recommendation System: How Ordinary Users Gain Visibility

Xiaohongshu’s recommendation system uses large‑scale multimodal embeddings, dual‑tower and graph models, and diversity techniques like DPP and SSD to quickly surface high‑quality user‑generated content, enabling ordinary users to gain visibility while balancing personalization, exploration, and efficient LLM‑augmented pipelines.

Large Language Models · Multimodal AI · Xiaohongshu

0 likes · 15 min read

Cognitive Technology Team

Jan 17, 2024 · Artificial Intelligence

Redis Founder antirez Reflects on Large Language Models in 2024

In his first 2024 blog post, Redis founder antirez shares a programmer's perspective on large language models, sharply critiques Google's search engine, describes current generative AI as both foolish and broadly knowledgeable, and argues that generative AI mainly amplifies the abilities of already strong developers.

AI Commentary · Large Language Models · Redis

0 likes · 2 min read

21CTO

Jan 14, 2024 · Artificial Intelligence

Can Large Language Models Really Boost Programming Productivity? Insights from Redis Founder

The article reflects on the Redis founder's 2024 blog about large language models, examining their strengths and limits in software development, illustrating how they can accelerate coding for experienced programmers while highlighting challenges in system programming and the need for careful prompt engineering.

AI programming · Large Language Models · productivity

0 likes · 19 min read

Rare Earth Juejin Tech Community

Jan 3, 2024 · Artificial Intelligence

Llama 2: Open Foundation and Fine‑Tuned Chat Models – Ghost Attention, RLHF Results, and Safety Evaluation

This article summarizes the Llama 2 series, describing the Ghost Attention technique for maintaining system‑message consistency across multi‑turn dialogs, presenting RLHF and human evaluation results, and discussing extensive safety pre‑training, benchmark assessments, and model release details.

AI evaluation · Ghost Attention · Large Language Models

0 likes · 20 min read

OPPO Kernel Craftsman

Dec 29, 2023 · Information Security

OPPO Releases White Paper on Mobile Application Trustworthy Technology at CAICT ICT+ Deep Observation Conference

At the CAICT ICT+ Deep Observation Conference, OPPO unveiled a white paper on mobile application trustworthy technology, analyzing lifecycle security risks, policy and patent developments, and the role of large‑model AI in intelligent terminals, while urging standardized security practices and accelerated AI‑driven vulnerability detection tools.

CAICT · Intelligent Terminals · Large Language Models

0 likes · 4 min read

OPPO Amber Lab

Dec 29, 2023 · Information Security

Large Models Transform Mobile App Security – Key Takeaways from OPPO’s White Paper

The 2024 China Academy of ICT deep‑observation summit in Shanghai unveiled OPPO’s new white paper on trustworthy mobile application technology, highlighting how large language models enhance smart terminal security, surveying industry trends, and charting future directions for secure, intelligent mobile ecosystems.

Large Language Models · OPPO · White Paper

0 likes · 6 min read

DataFunTalk

Dec 25, 2023 · Artificial Intelligence

Tool Learning with Foundation Models: Frameworks, Datasets, and Open‑Source Toolkits

This article reviews the emerging field of tool learning for large foundation models, outlining its background, categorization, core framework components, training strategies, and applications such as WebCPM, BMTools, and ToolBench, while highlighting recent research results and open‑source resources.

AI tools · Foundation Models · Large Language Models

0 likes · 21 min read

Java High-Performance Architecture

Dec 22, 2023 · Artificial Intelligence

Is Google Gemini Echoing Baidu? A Deep Dive into Model Contamination

The article investigates recent tests showing that Google Gemini sometimes claims to be Baidu's AI, reproduces Baidu‑related responses, and appears to have its Chinese and English corpora contaminated with competitor data, highlighting the challenges of data provenance in large language models.

AI model contamination · AI testing · Baidu Wenxin

0 likes · 6 min read

DataFunTalk

Dec 21, 2023 · Artificial Intelligence

Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning – Best Long Paper at EMNLP 2023

At EMNLP 2023, the joint WeChat AI and Peking University paper 'Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning' won the Best Long Paper award, revealing that label tokens act as anchors driving information aggregation in shallow layers and prediction flow in deep layers, and proposing methods to improve and diagnose in‑context learning.

AI research · In-Context Learning · Information Flow

0 likes · 13 min read

DataFunTalk

Dec 19, 2023 · Artificial Intelligence

Enterprise Large‑Model Deployment and Data Governance: Insights from Deepexi’s President

The article examines how enterprises can adopt domain‑specific large models by balancing demand‑side cost‑reduction needs with supply‑side mature training techniques, discusses team composition, fine‑tuning methods, data governance for unstructured data, and outlines Deepexi’s product ecosystem designed to improve efficiency, performance, and user experience.

AI Deployment · Enterprise AI · Large Language Models

0 likes · 13 min read

21CTO

Dec 17, 2023 · Artificial Intelligence

Why AI‑Native Apps Matter: Insights from Baidu, ByteDance Ban, and New PHP Server

The article examines Baidu CEO Li Yanhong’s call to focus on AI‑native applications, reports ByteDance’s suspension by OpenAI for misusing GPT, outlines Google’s phased removal of third‑party cookies, and announces the release of the Go‑based PHP server FrankenPHP 1.0.

AI-native applications · Large Language Models · PHP server

0 likes · 7 min read

DataFunSummit

Dec 14, 2023 · Artificial Intelligence

Enterprise Large‑Model Deployment: Data Governance, Fine‑Tuning Strategies, and Cost Economics

The article examines how enterprises can adopt domain‑specific large language models by addressing data governance, model fine‑tuning techniques, dataset balance, and product architecture to achieve cost‑effective, high‑performance AI solutions across various business scenarios.

Large Language Models · Model Fine‑tuning · cost efficiency

0 likes · 14 min read

Huawei Cloud Developer Alliance

Dec 14, 2023 · Artificial Intelligence

Unlocking LLaMA: Key Innovations, Architecture Insights, and MindSpore Inference Guide

This article reviews the LLaMA large‑language‑model series, covering its background, architectural innovations such as Add&Norm, SwiGLU, and RoPE, a known reversal‑curse bug, and provides step‑by‑step MindSpore Transformers code for model configuration, inference, and pipeline usage while previewing the upcoming LLaMA‑2 session.

AI · LLaMA · Large Language Models

0 likes · 6 min read

DataFunTalk

Dec 12, 2023 · Artificial Intelligence

Challenges and Considerations of Recommendation Systems: Evaluation, Data Leakage, and the Role of Large Models

This article examines recommendation system problem definitions, differences between academia and industry, offline evaluation pitfalls and data leakage issues, data construction challenges with datasets like MovieLens, and evaluates whether large language models can serve as effective solutions for modern recommendation tasks.

Large Language Models · Recommendation Systems · data leakage

0 likes · 20 min read

21CTO

Dec 7, 2023 · Artificial Intelligence

Google Gemini vs GPT‑4: Can the New AI Model Outperform ChatGPT?

Google's Gemini AI suite, unveiled in December, brings three model sizes—Nano, Pro, and Ultra—to power Bard and other services, claims superior performance over GPT‑4 across most benchmarks, and introduces multimodal capabilities that signal a major shift in the AI landscape.

AI language model · GPT-4 comparison · Google Gemini

0 likes · 6 min read

JD Tech

Nov 30, 2023 · Artificial Intelligence

Understanding ChatGPT: Mechanisms, Attention, Emergence, and the Chinese Room

This article examines the principles behind ChatGPT, detailing its continuation-based operation, the role of attention mechanisms and transformer architecture, the scaling of neural networks that leads to emergent abilities, and interprets these phenomena through the lenses of compression theory and the Chinese Room thought experiment.

Attention Mechanism · ChatGPT · Large Language Models

0 likes · 27 min read

AntTech

Nov 24, 2023 · Artificial Intelligence

Code Model Evaluation Framework and the CodeFuseEval Benchmark Overview

This article presents a comprehensive overview of code large‑model evaluation, describing the need for multi‑dimensional benchmarks, the CodeFuseEval benchmark suite, dataset construction, evaluation methods, framework architecture, result visualisation, and future directions for enterprise‑grade code generation models.

AI · Benchmark · CodeFuseEval

0 likes · 12 min read

Ant R&D Efficiency

Nov 24, 2023 · Artificial Intelligence

CodeFuseEval: An Enterprise‑Level Multi‑Task Benchmark for Evaluating Code Large Models

CodeFuseEval is an enterprise‑grade, multi‑task benchmark that evaluates code‑generation large models across six languages and thousands of real‑world tasks using both objective metrics (pass@k, BLEU, CodeBLEU) and expert human review, with an open‑source framework, continuous dataset expansion, and a focus on correctness, efficiency, robustness, and service‑level quality.

AI · Benchmark · Evaluation

0 likes · 12 min read

DataFunTalk

Nov 21, 2023 · Artificial Intelligence

Improving Efficiency of Large-Scale Distributed Training for Large Language Models

Recent advances in large language models have dramatically increased model size and training data, leading to soaring computational costs; this article examines the scaling trends, hardware utilization challenges, distributed training techniques, and ethical considerations, highlighting methods to improve efficiency, reduce costs, and mitigate environmental impact.

AI ethics · Large Language Models · compute optimization

0 likes · 29 min read

Baobao Algorithm Notes

Nov 21, 2023 · Artificial Intelligence

How Much Data Do You Need for a 10B LLM? Decoding Scaling Laws

This article explains how scaling laws can answer common LLM development questions—such as the data required for a 10B model, the model size achievable with 1 TB of data, and the optimal compute‑data‑model trade‑off for a fixed GPU budget—by presenting core formulas, practical derivations, and insights from OpenAI, DeepMind and Google.

Data Requirements · Large Language Models · Model Size

0 likes · 12 min read

360 Smart Cloud

Nov 20, 2023 · Artificial Intelligence

Overview of Recent Open‑Source AI Models and Tools (November 2023)

This article summarizes a collection of newly released open‑source AI projects covering natural‑language processing, multimodal processing, intelligent agents, recommendation systems, and model training acceleration, providing brief descriptions, key capabilities, and links to their repositories.

AI · Large Language Models · Multimodal

0 likes · 9 min read

Ximalaya Technology Team

Nov 16, 2023 · Artificial Intelligence

How AI Agents Turn One-Line Prompts Into Fully Functional Apps in Minutes

ChatDev, an AI‑driven software development platform, claims to create complete applications from a single prompt in about three minutes and at a cost of roughly two yuan, leveraging a multi‑agent workflow, a custom 100‑billion‑parameter model, and open‑source frameworks to dramatically cut development time and expense.

AI agents · ChatDev · Industry Analysis

0 likes · 13 min read

Architect

Nov 8, 2023 · Artificial Intelligence

AI Agents Unleashed: From Assistants API to Multi‑Agent Frameworks

The article dissects the rise of AI agents—from OpenAI's Assistants API and multimodal perception‑brain‑action pipelines to retrieval‑augmented generation, tool‑use strategies, single‑ and multi‑agent deployments, and emerging frameworks like AutoGen—while highlighting concrete examples, benchmark results, and current limitations.

AI agents · Assistants API · Embodied AI

0 likes · 38 min read

Tencent Cloud Developer

Nov 8, 2023 · Artificial Intelligence

Comprehensive Overview of AI Agents: Concepts, Technical Frameworks, and Applications

The article surveys modern AI agents—software entities powered by large language models that perceive multimodal inputs, reason via brain modules, act through tools or embodied actions, employ retrieval‑augmented generation and chain‑of‑thought planning, and can operate singly (e.g., AutoGPT) or collaboratively via frameworks like Microsoft’s AutoGen—while highlighting current challenges such as controllability, memory limits, parallelism, and reliability.

AI agents · AutoGen · Large Language Models

0 likes · 34 min read

DataFunSummit

Nov 5, 2023 · Artificial Intelligence

Enhancing Recommendation Models with Scaling Law via HCNet and MemoNet: A Memory‑Based Feature‑Combination Approach

This article presents a memory‑driven architecture (HCNet and MemoNet) that equips recommendation models with scaling‑law characteristics by storing and retrieving arbitrary feature‑combination embeddings, evaluates multi‑hash codebooks, memory‑restoring strategies, key‑feature selection, and demonstrates significant offline and online performance gains.

Large Language Models · Scaling Law · feature interaction

0 likes · 15 min read

Model Perspective

Nov 2, 2023 · Artificial Intelligence

Why Mathematical Modelers Must Embrace LLMs and Forget Outdated Skills

The article explains how rapid advances in data and large language models force mathematical modelers to continuously update their models and skills, discard obsolete knowledge, and adopt lifelong learning to stay effective in a fast‑changing AI‑driven environment.

Artificial Intelligence · Large Language Models · continuous learning

0 likes · 6 min read

Baidu Geek Talk

Nov 2, 2023 · Artificial Intelligence

AI-Powered Code Defect Detection: Leveraging Code Knowledge Graphs and Large Language Models

The paper presents an AI‑driven static analysis framework that builds code knowledge graphs to extract relevant slices and leverages large language models for multilingual defect prediction, achieving up to 80% F1, detecting 662 defects across 1,100 C++ modules with a 26.9% recall gain over traditional rule‑based scanners.

BERT · Large Language Models · code defect detection

0 likes · 9 min read

Baidu Intelligent Cloud Tech Hub

Nov 1, 2023 · Databases

How BES Powers Large-Scale Vector Search for AI Applications

This article explains the principles of vector databases, outlines the engineering practices of Baidu Intelligent Cloud BES for large‑scale vector retrieval, discusses optimization techniques such as HNSW, IVF and filter integration, and presents real‑world AI use cases and future development directions.

AI · BES · Elasticsearch

0 likes · 16 min read

DataFunSummit

Oct 30, 2023 · Artificial Intelligence

Exploring General AI, Large Language Models, Knowledge Graphs, and Reinforcement Learning – Insights from DataFun

This article presents a comprehensive overview of DaGuan Data's explorations in general artificial intelligence, large language models, knowledge graphs, reinforcement learning, compute and data requirements, and the emerging concept of Human‑Centric AGI, supplemented by a detailed Q&A session.

AGI · Artificial Intelligence · Large Language Models

0 likes · 18 min read

DataFunSummit

Oct 27, 2023 · Artificial Intelligence

ChatGPT Technology, Domesticization Attempts, and Open‑Source Large Models

This article reviews the evolution and challenges of ChatGPT technology, describes the authors' efforts to localize and commercialize the model for the Chinese market, and introduces their open‑source Chinese large‑model initiative, including training methods, performance gaps, and future improvement directions.

ChatGPT · Chinese NLP · Large Language Models

0 likes · 11 min read

Baidu Tech Salon

Oct 25, 2023 · Artificial Intelligence

Intelligent Question Answering Technology in Baidu Search: Development, Modeling, and Retrieval‑Enhanced Generation

The article surveys Baidu Search’s intelligent question‑answering system, tracing its evolution from feature‑engineered retrieval to large pre‑trained and generative models, and detailing hierarchical readers, multi‑teacher distillation, retrieval‑enhanced generation, and instruction decomposition as key techniques for delivering fast, accurate, citation‑rich answers.

Baidu Search · Knowledge Distillation · Large Language Models

0 likes · 18 min read

Baidu Geek Talk

Oct 25, 2023 · Artificial Intelligence

How Baidu Search Is Transforming Machine Question Answering with Large‑Scale AI Models

This article reviews the evolution of machine question answering, from early feature‑engineered systems to modern large‑language‑model‑driven retrieval‑augmented generation, outlines Baidu Search’s current Retriever‑Reader architecture, discusses challenges such as semantic complexity, latency and answer quality, and presents solutions including hierarchical DocMRC modeling, multi‑teacher knowledge distillation, and instruction decomposition for efficient, high‑quality answers.

Baidu · Knowledge Distillation · Large Language Models

0 likes · 18 min read

DataFunTalk

Oct 25, 2023 · Artificial Intelligence

Applying Large Language Models to Wireless Network Intelligent Operations: Opportunities, Challenges, and Platform Construction

This article examines how large language model technology can be leveraged for intelligent operation of wireless communication networks, analyzing its advantages, current challenges, platform architecture, experimental validation, and future research directions within the telecom industry.

AI · Large Language Models · intelligent operation

0 likes · 17 min read

AI Large Model Application Practice

Oct 23, 2023 · Artificial Intelligence

Unlocking GPT‑4V: A Concise Guide to Multimodal Capabilities and Prompt Techniques

This article summarizes the GPT‑4V research paper, detailing its visual input modes, effective prompting strategies, diverse multimodal abilities, high‑value application scenarios, and ways to enhance the model with classic LLM techniques while noting current limitations.

AI Applications · GPT-4V · Large Language Models

0 likes · 17 min read

Zuoyebang Tech Team

Oct 19, 2023 · Artificial Intelligence

How AI and Big Data Are Transforming Education: Insights from Zuoyebang’s Chief Scientist

At the GET2023 Education Technology Conference, Zuoyebang’s chief scientist Song Yang detailed how AI, large language models, big data, and smart hardware are reshaping learning experiences across subjects, from math problem generation to interactive programming assistants, and outlined the company’s vision for AI‑driven education.

AI in Education · Educational Technology · Large Language Models

0 likes · 12 min read

Alimama Tech

Oct 18, 2023 · Artificial Intelligence

Technical Challenges and Directions for Large‑Model Applications in E‑commerce

Taobao Group’s ten large‑model challenges target e‑commerce AI by demanding domain‑specific pre‑training, multi‑step reasoning, extended context handling, factual reliability, intelligent tool orchestration, robust retrieval integration, fuzzy‑intent tool selection, scalable multi‑objective RLHF, improved query rewriting, and knowledge‑driven recommendation.

E‑Commerce · Large Language Models · RLHF

0 likes · 16 min read

DaTaobao Tech

Oct 18, 2023 · Artificial Intelligence

Large Model Application Challenges for E-commerce

Taobao Group’s ten large‑model e‑commerce challenges call for researchers to build domain‑specific data pipelines, mitigate forgetting, balance expertise with generality, enable multi‑step reasoning, handle long contexts, reduce hallucinations, integrate tool use, improve fuzzy intent detection, apply multi‑objective RLHF, and generate cognitively novel recommendations.

Large Language Models · RLHF · knowledge hallucination

0 likes · 14 min read

Baidu Geek Talk

Oct 16, 2023 · Industry Insights

What Is AI‑Native Thinking and Why It Will Shape the Next Wave of Applications

The article explores the concept of AI‑native thinking, outlines the mindset and conditions needed for AI‑native applications, showcases examples such as Baidu Wenku and a legal‑assistant hackathon project, and discusses platform support, technical foundations, and emerging opportunities in the large‑model era.

AI-native · Baidu · Industry insight

0 likes · 14 min read

Baidu Geek Talk

Oct 11, 2023 · Artificial Intelligence

How Baidu’s Qianfan 2.0 Supercharges Large‑Model Development and Deployment

The article reviews Baidu Cloud’s Qianfan 2.0 platform, detailing its expanded model catalog, dataset library, Chinese‑language enhancements, compression and speed gains, robust AI infrastructure, application templates, and end‑to‑end data‑labeling pipeline that together lower cost and accelerate large‑model adoption across industries.

AI platform · Large Language Models · Model Deployment

0 likes · 14 min read

JD Cloud Developers

Oct 10, 2023 · Artificial Intelligence

Do Large Language Models Have a Mind? Attention, Emergence & Compression Explained

This article examines whether ChatGPT and other large language models exhibit true Theory of Mind, detailing the role of attention mechanisms, neural network architecture, emergent abilities, the Chinese‑room argument, and how compression of massive textual data underlies their apparent intelligence.

Attention Mechanism · Large Language Models · Theory of Mind

0 likes · 30 min read

Baobao Algorithm Notes

Oct 9, 2023 · Artificial Intelligence

Demystifying RLHF and PPO for Large Language Models: Theory and Practice

This article explains why Reinforcement Learning from Human Feedback (RLHF) is crucial for LLM intelligence, outlines the three-stage training pipeline, details InstructGPT's reward model and PPO optimization, and provides a practical guide to implementing RLHF with deep‑learning frameworks.

Artificial Intelligence · Large Language Models · PPO

0 likes · 17 min read

DataFunSummit

Sep 30, 2023 · Artificial Intelligence

Causal Inference from the Perspective of Large Models

This presentation by senior AI architect He Gang explores how large language models and LLM‑powered agents can enhance causal inference tasks, detailing model‑assisted analysis, agent‑based inference methods, and multi‑agent simulations to advance causal research.

AI · LLM Agents · Large Language Models

0 likes · 2 min read

NetEase LeiHuo Testing Center

Sep 22, 2023 · Artificial Intelligence

Understanding Large Language Models and Prompt Engineering: A Practical Guide

This article provides an introductory overview of large language models (LLMs), compares popular models, explains their underlying principles, and offers practical guidance on prompt engineering, model evaluation, usage tips, and safety considerations, helping readers effectively select and apply LLMs in various scenarios.

AI · LLM · Large Language Models

0 likes · 44 min read
