Tagged articles
47 articles
Data Party THU
Apr 29, 2026 · Artificial Intelligence

Claude Opus 4.7 System Prompt Leak: Decoding Its 10 Core Design Decisions

The article dissects the leaked Claude Opus 4.7 system prompt, revealing ten intertwined design decisions—from treating psychological reconstruction as a danger signal to dynamic safety‑policy upgrades—that together shape the model’s self‑restraint, tool‑use, memory handling, and risk‑aware behavior.

AI Safety · Claude · Language Model
0 likes · 8 min read
Smart Workplace Lab
Apr 23, 2026 · Artificial Intelligence

Think Standard Scripts Solve It? Uncover the Real Issue with High‑EQ AI Prompt Tuning

The article explains why using formal, standard language makes AI‑generated workplace messages sound robotic and presents a three‑step protocol—high‑quality phrase extraction, persona‑mapping prompts, and forbidden‑word rules—to feed the model with emotionally intelligent corpora for more natural communication.

AI prompt engineering · Language Model · Prompt Design
0 likes · 5 min read
Smart Workplace Lab
Apr 16, 2026 · Industry Insights

Boost AI Communication Trust: Empathy Prompt Templates & Risk Checklist

This guide explains why AI‑generated messages often feel robotic, presents a set of prompt templates that inject emotion, relationship, and cultural context into LLM outputs, and offers a risk‑assessment checklist to ensure safe, high‑impact workplace communication.

Language Model · Prompt engineering · ai
0 likes · 6 min read
Machine Learning Algorithms & Natural Language Processing
Mar 7, 2026 · Artificial Intelligence

Transformer Hidden States Can Reconstruct Input with 100% Accuracy – New Invertibility Study

A recent paper from Sapienza University's GLADIA Lab shows that mainstream Transformer language models are injective, enabling a novel SIPIT algorithm to recover original text from hidden states with perfect accuracy, while extensive experiments confirm the models retain all input information.

Injective · Invertibility · Language Model
0 likes · 11 min read
Efficient Ops
Aug 27, 2025 · Artificial Intelligence

Why DeepSeek V3.1 Randomly Inserts the Chinese Character “极” – Token Bug Explained

DeepSeek’s latest V3.1 model unexpectedly injects the Chinese character “极” into generated text, a token‑ID mix‑up that breaks code compilation, JSON parsing, and academic writing. Users traced the issue to adjacent token IDs and proposed two main hypotheses: dataset contamination or a model shortcut.

AI Safety · DeepSeek · Language Model
0 likes · 4 min read
Data Party THU
Aug 18, 2025 · Artificial Intelligence

Why Google’s Gemma 3 270M Model Is a Game‑Changer for Edge AI

Google’s newly released Gemma 3 270M is a compact 270‑million‑parameter language model that combines a large token vocabulary, energy‑efficient INT4 quantization, strong instruction‑following, and production‑ready checkpoints, making it ideal for fine‑tuning, on‑device deployment, and a wide range of low‑latency AI tasks.

Gemma 3 · Google AI · Language Model
0 likes · 7 min read
JD Tech Talk
Mar 5, 2025 · Artificial Intelligence

GLM: General Language Model Pretraining with Autoregressive Blank Infilling

GLM introduces a unified pretraining framework that combines autoregressive blank‑filling with 2D positional encoding and span‑shuffle, achieving superior performance over BERT, T5 and GPT on a range of NLU and generation tasks such as SuperGLUE, text‑filling, and language modeling.

2D positional encoding · GLM · Language Model
0 likes · 27 min read
JD Cloud Developers
Mar 5, 2025 · Artificial Intelligence

How GLM’s Autoregressive Blank‑Filling Beats BERT, T5, and GPT

GLM introduces a universal language model that combines autoregressive blank‑filling with 2D positional encoding and span‑shuffle training, achieving superior performance over BERT, T5, and GPT across NLU, conditional and unconditional generation tasks, as demonstrated on SuperGLUE and other benchmarks.

Language Model · NLU · Transformer
0 likes · 29 min read
AI Algorithm Path
Feb 20, 2025 · Artificial Intelligence

What Is Perplexity in Large Language Models?

The article explains perplexity as a metric for evaluating large language models, walks through a step‑by‑step probability calculation for a sample sentence, shows how to normalize by sentence length using the geometric mean, and demonstrates that lower perplexity indicates a more accurate and less uncertain model.

Language Model · Perplexity · ai
0 likes · 6 min read
Code Mala Tang
Jan 31, 2025 · Artificial Intelligence

Master DeepSeek: 7 Prompt Engineering Tricks to Boost AI Responses

This guide presents seven practical prompt‑engineering techniques—clear goals, structured queries, domain terminology, concrete examples, scoped questions, step‑by‑step breakdowns, and multi‑turn interactions—to help users get more accurate and useful answers from DeepSeek.

AI prompts · DeepSeek · Language Model
0 likes · 6 min read
360 Tech Engineering
Dec 17, 2024 · Artificial Intelligence

Innovative Multimodal Architectures: IAA for Extending Language Models and BDM for Chinese-Native AI Painting

The article introduces two 360 AI Research Institute projects—IAA, an architecture that equips frozen language models with multimodal capabilities via plug‑in layers, and BDM, a Chinese‑native diffusion model compatible with the Stable Diffusion ecosystem—detailing their motivations, designs, benchmark results, and open‑source resources.

Chinese AI painting · Language Model · Multimodal AI
0 likes · 6 min read
Infra Learning Club
Oct 30, 2024 · Artificial Intelligence

How GPT-3 Evolved: From Transformer Roots to Massive Language Models

The article traces the development of the GPT series, from the 2017 Transformer breakthrough through GPT‑1, GPT‑2, and GPT‑3’s 175 billion parameters to later models like Codex and ChatGPT, highlighting key papers, architectural choices, and the surprising role of OpenAI’s decoder‑only approach.

GPT-3 · Google · Language Model
0 likes · 4 min read
Architect's Alchemy Furnace
Apr 27, 2024 · Artificial Intelligence

28 Powerful ChatGPT Prompt Techniques to Supercharge Your Work

This guide presents 28 practical ChatGPT prompt strategies—from role‑playing experts and setting response length to crafting resumes, weekly reports, and product PRDs—helping readers boost productivity, creativity, and learning across personal and professional tasks.

AI productivity · ChatGPT · Language Model
0 likes · 33 min read
21CTO
Feb 2, 2024 · Artificial Intelligence

WeChat’s App Bloats 1400×, China’s Quantum Computer Reaches 1M Users, AI2 Releases OLMo

Recent tech headlines reveal WeChat’s iOS app ballooning to 712 MB, China’s third‑generation superconducting quantum computer “Wukong” surpassing one million remote accesses, AI2 unveiling the open‑source OLMo language model, and Google planning to retire the Bard brand in favor of Gemini, highlighting rapid shifts across mobile, quantum, and AI domains.

Google · Language Model · Open-source AI
0 likes · 7 min read
Huolala Tech
Nov 23, 2023 · Artificial Intelligence

How HuoLaLa Built a Custom ASR System to Boost Accuracy and Cut Costs

This article details HuoLaLa's development of an in‑house Automatic Speech Recognition system, covering its architecture, VAD optimization, language‑model and hot‑word enhancements, punctuation restoration, task and resource scheduling, and the resulting improvements in accuracy and cost efficiency.

ASR · Language Model · VAD
0 likes · 18 min read
DataFunTalk
Nov 2, 2023 · Artificial Intelligence

Enhancing Language and Vision Models with External Knowledge and Tools: OREO‑LM, REVEAL, and AVIS

This article reviews recent research on augmenting language and multimodal models with external knowledge sources and tool‑calling mechanisms, covering three systems—OREO‑LM for knowledge‑graph reasoning, REVEAL for multi‑source visual‑language pretraining, and AVIS for dynamic tool selection—and their experimental results and implications.

Language Model · Multimodal · knowledge graph
0 likes · 28 min read
DataFunSummit
Sep 22, 2023 · Artificial Intelligence

Exploring Game AI Agents: Review, LLM‑Driven Exploration, and Future Directions

This article reviews the evolution of game AI agents, examines how large language models (LLMs) can drive new AI behaviors in games, and discusses practical case studies across genres such as Werewolf‑style, war‑SLG, and MOBA games, concluding with challenges and future research directions.

AI agents · Game Development · LLM
0 likes · 31 min read
Open Source Linux
Sep 8, 2023 · Artificial Intelligence

How ChatGPT Works: Inside the Neural Network That Generates Human‑Like Text

This article explains the inner workings of ChatGPT, covering how large language models predict the next token using probability distributions, the role of embeddings, the transformer architecture with attention heads, training methods, loss functions, and why such a massive neural network can produce coherent, human‑like language.

ChatGPT · Language Model · Neural Networks
0 likes · 79 min read
58 Tech
Jun 21, 2023 · Artificial Intelligence

GPU Hotword Enhancement for WeNet End-to-End Speech Recognition

This article explains the design, implementation, and experimental evaluation of hot‑word augmentation in WeNet's GPU runtime, detailing how character‑ and word‑based language model scoring is extended to boost recognition of rare proper nouns in both streaming and non‑streaming ASR services.

ASR · CTC decoder · GPU
0 likes · 12 min read
Full-Stack Trendsetter
May 15, 2023 · Artificial Intelligence

Do You Really Understand ChatGPT, the Era‑Defining AI?

This article explains what ChatGPT is and how it builds on natural language processing and the Transformer-based GPT series, details its model-size growth, architectural enhancements, and multilingual support, and walks through the tokenization-to-generation pipeline that enables coherent AI-driven conversations.

ChatGPT · Deep Learning · GPT-3
0 likes · 8 min read
21CTO
Apr 16, 2023 · Artificial Intelligence

Why ChatGPT Isn't a New Revolution: History, Tech, and Real Impact

In this talk, Wu Jun explains the decades‑long evolution of language models, why ChatGPT sparked hype yet isn’t a breakthrough, how massive compute and data power it, and what practical effects it has on creators, energy use, and the tech industry.

AI history · ChatGPT · Language Model
0 likes · 20 min read
dbaplus Community
Apr 15, 2023 · Artificial Intelligence

Why ChatGPT Isn't a New Revolution: Insights from AI Pioneer Wu Jun

In a live talk, AI veteran Wu Jun explains why the hype around ChatGPT is overblown, traces the history of language models from the 1970s, details the massive compute and data requirements, and discusses the real impact of large‑scale AI on society and work.

AI hype · ChatGPT · Language Model
0 likes · 20 min read
Programmer DD
Apr 10, 2023 · Artificial Intelligence

Why ChatGPT Sparks Panic and What Its Real Technical Foundations Are

In this talk, AI expert Wu Jun explains why ChatGPT has caused widespread fear, traces the historical development of language models from the 1970s to today, clarifies the massive computational and data requirements, and discusses the real impact and opportunities of large‑scale AI systems.

AI hype · ChatGPT · Deep Learning
0 likes · 20 min read
Top Architect
Mar 1, 2023 · Artificial Intelligence

Understanding the Internals of ChatGPT: Neural Networks, Embeddings, and Training Techniques

This article provides a comprehensive overview of how ChatGPT works, covering its probabilistic text generation, transformer architecture, embedding representations, neural network training processes, and the underlying principles that enable large language models to produce coherent and meaningful human-like language.

ChatGPT · Language Model · Neural Networks
0 likes · 80 min read
IT Architects Alliance
Feb 23, 2023 · Artificial Intelligence

Training a Positive Review Generator with RLHF and PPO

This article demonstrates how to use Reinforcement Learning from Human Feedback (RLHF) with a PPO algorithm and a sentiment‑analysis model to train a language model that generates positive product reviews, covering task definition, data sampling, reward evaluation, model optimization, and experimental results.

GPT · Language Model · PPO
0 likes · 11 min read
Architect
Feb 19, 2023 · Artificial Intelligence

Training a Positive Review Generator with RLHF and PPO

This article demonstrates how to apply Reinforcement Learning from Human Feedback (RLHF) using a sentiment‑analysis model as a reward function and Proximal Policy Optimization (PPO) to fine‑tune a language model that generates positive product reviews, complete with code snippets and experimental results.

Language Model · PPO · RLHF
0 likes · 10 min read
DataFunSummit
Feb 8, 2023 · Artificial Intelligence

Technical Architecture and Training Process of ChatGPT

ChatGPT, a dialogue-focused language model, builds on the GPT family and employs techniques such as Reinforcement Learning from Human Feedback (RLHF), the TAMER framework, and a three-stage training pipeline (supervised fine‑tuning, reward modeling, and PPO reinforcement learning) to achieve advanced conversational capabilities.

ChatGPT · GPT · Language Model
0 likes · 7 min read
MoonWebTeam
Dec 30, 2022 · Artificial Intelligence

What Makes ChatGPT So Powerful? A Deep Dive into Its Technology and Applications

ChatGPT, OpenAI’s conversational AI launched on November 30, 2022, builds on GPT‑3 and advanced training methods like supervised fine‑tuning and reinforcement learning from human feedback, offering versatile applications from search assistance to code generation, while also revealing notable limitations and future commercial prospects.

Applications · ChatGPT · Language Model
0 likes · 17 min read
Xiaohongshu Tech REDtech
Nov 11, 2022 · Artificial Intelligence

Language Model as a Service and Black‑Box Optimization: Insights from Prof. Qiu Xipeng’s Talk

Prof. Qiu Xipeng’s talk highlighted how large language models can be offered as a service and efficiently adapted via in‑context learning, lightweight label‑tuning, and gradient‑free black‑box optimization, showcasing a unified asymmetric Transformer (CPT) that handles understanding, generation, ABSA and NER tasks while reducing resource demands.

Black-Box Optimization · LLM · Language Model
0 likes · 15 min read
DataFunTalk
Jul 30, 2021 · Artificial Intelligence

Fundamentals of Natural Language Processing: Language Models, Smoothing, and Basic Tasks

This article provides a comprehensive overview of natural language processing fundamentals, covering the challenges of language modeling, N‑gram and Markov assumptions, smoothing techniques such as discounting and add‑one, evaluation via perplexity, basic tasks like Chinese word segmentation, subword tokenization, POS tagging, syntactic and semantic parsing, and a range of downstream applications including information extraction, sentiment analysis, question answering, machine translation, and dialogue systems.

Language Model · NLP · Subword Tokenization
0 likes · 29 min read
58 Tech
Dec 11, 2020 · Artificial Intelligence

Weighted Finite State Transducers (WFST) in Traditional Speech Recognition: Principles and Optimization

This article explains the role of Weighted Finite State Transducers in conventional HMM‑based speech recognition, covering language models, pronunciation dictionaries, WFST definitions, semiring theory, composition and determinization operations, decoding graph construction (HCLG), lattice rescoring, and practical optimization techniques for real‑world scenarios.

ASR · Language Model · WFST
0 likes · 23 min read
Sohu Tech Products
Nov 25, 2020 · Artificial Intelligence

Illustrated Guide to GPT-2: Detailed Explanation of the Decoder‑Only Transformer Model

This article provides a comprehensive, illustrated walkthrough of OpenAI's GPT‑2 language model, covering its decoder‑only Transformer architecture, self‑attention mechanisms, token processing, training data, differences from BERT, and applications beyond language modeling, enriched with visual diagrams and code snippets for deeper understanding.

GPT-2 · Language Model · Self-Attention
0 likes · 24 min read
Didi Tech
Nov 5, 2020 · Artificial Intelligence

Self-Learning Platform for Speech Recognition Model Optimization at DiDi

DiDi’s self‑learning ASR platform lets non‑technical users upload business data, automatically train, test and deploy models with semi‑supervised learning, hot‑word updates and LSTM rescoring, creating a closed‑loop pipeline that boosted vehicle voice‑interaction accuracy from around 80% to over 95% within months.

Language Model · Semi-supervised Learning · acoustic model
0 likes · 14 min read
58 Tech
Mar 2, 2020 · Artificial Intelligence

Low-Quality Text Detection Using Unsupervised Language Model Perplexity

This article proposes a method to identify low-quality text in business data by training a large-scale unsupervised language model to compute sentence perplexity, converting the detection problem into a threshold decision, and details model design, challenges, optimizations, and online performance results.

BERT · Language Model · NLP
0 likes · 13 min read
DataFunTalk
Sep 3, 2019 · Artificial Intelligence

Forward Neural Networks and Their Applications in Language Modeling, Ranking, and Recommendation

This article excerpt explains the structure and training of feed‑forward neural networks, illustrates their use in neural language models, describes deep structured semantic models for ranking tasks, and details two‑stage recommendation systems such as YouTube, covering both theoretical formulas and practical deployment considerations.

Artificial Intelligence · Language Model · forward neural network
0 likes · 13 min read
WeChat Backend Team
Sep 3, 2019 · Artificial Intelligence

How Tencent Scaled Massive n‑gram Language Models for Real‑Time Speech Recognition

This article presents a distributed system that efficiently supports large‑scale n‑gram language models for automatic speech recognition by introducing caching, a two‑level distributed index, batch processing, and a cascading fault‑tolerance mechanism, demonstrating robust scalability and low communication overhead in Tencent's WeChat ASR service.

Language Model · N-gram · caching
0 likes · 35 min read
Tencent Cloud Developer
Aug 25, 2019 · Artificial Intelligence

Understanding Intelligent Speech Recognition Technology

Intelligent speech recognition converts spoken audio to text using a pipeline of feature extraction, acoustic and language modeling, where deep neural networks—especially CNN, LSTM, and hybrid CLDNN architectures—drive high accuracy, enabling mobile voice input, call‑center transcription, legal record keeping, and Tencent Cloud ASR’s 97% Mandarin accuracy with speaker separation and on‑premises deployment.

Language Model · Tencent Cloud · acoustic model
0 likes · 7 min read
Tencent Cloud Developer
Jul 17, 2019 · Artificial Intelligence

Design and Implementation of a Multi‑Turn Conversational Chatbot

The article outlines the design and implementation of a multi‑turn conversational chatbot, detailing how natural‑language understanding converts user utterances into structured representations, a CNN‑LSTM language model classifies topics, intents, and sentiments, and an XML‑based answer engine orchestrates tasks and services for real‑world deployment.

Chatbot · Language Model · ai
0 likes · 9 min read
58 Tech
Jun 27, 2019 · Artificial Intelligence

Spelling Correction System for 58.com Search Engine: Rule‑Based and Statistical Methods

This article describes the design and implementation of a spelling‑correction module for 58.com’s search engine, covering common query errors, rule‑based and statistical language‑model approaches, offline dictionary generation, n‑gram and Viterbi decoding, online workflow, and practical examples.

Language Model · Query Processing · Viterbi algorithm
0 likes · 15 min read
58 Tech
Feb 20, 2019 · Artificial Intelligence

Building and Deploying Language Models for Text Quality Evaluation and Generation

This article explains the concepts, training pipeline, deployment formats, and practical applications of language models—particularly LSTM‑based models—for evaluating and generating text quality in a real‑world rental listing platform, highlighting data preparation, model training, and online serving techniques.

Deployment · LSTM · Language Model
0 likes · 16 min read
58 Tech
Jan 22, 2019 · Artificial Intelligence

Chinese Word Segmentation: Challenges, Methods, and Practical Practices

The article explains why Chinese word segmentation is essential for NLP tasks, outlines its fundamental difficulties such as ambiguity and out‑of‑vocabulary words, reviews dictionary‑based, statistical, and CRF approaches, and shares practical experiences from 58 Search’s production system.

CRF · Language Model · NLP
0 likes · 21 min read
iQIYI Technical Product Team
Sep 14, 2018 · Artificial Intelligence

Limitations of Language Models in Voice Interaction and HomeAI Solutions

iQIYI HomeAI tackles the bottleneck of static language models in voice assistants by separating phonetic and semantic processing, correcting ASR errors at the intent‑recognition layer with pinyin‑enhanced entity correction, thereby reducing error amplification in video‑on‑demand interactions and paving the way for adaptive, personalized voice experiences.

Language Model · ai · intent recognition
0 likes · 7 min read
dbaplus Community
Nov 10, 2016 · Artificial Intelligence

Demystifying Recurrent Neural Networks: Theory, Training, and Implementation

This article explains the fundamentals of recurrent neural networks (RNNs), their role in language modeling, various RNN architectures such as bidirectional and deep RNNs, the back‑propagation through time (BPTT) training algorithm, gradient challenges, vectorization techniques, and provides a step‑by‑step code implementation.

BPTT · Deep Learning · Language Model
0 likes · 21 min read