Tagged articles
52 articles
Architects' Tech Alliance
May 8, 2026 · Artificial Intelligence

Token Fundamentals: A Technical Panorama of AI Language Units

Tokens are the smallest units of language that AI models process; a token may be a character, word, subword, punctuation mark, or emoji. Token counts determine context window size and generation speed, so tokenization directly affects a model's accuracy and efficiency, as explained in the 2026 Token Report.

AI fundamentals · Context Window · language models
0 likes · 4 min read
Machine Heart
Apr 30, 2026 · Artificial Intelligence

Can a Pre‑1930 Language Model Infer Einstein’s Relativity? Insights from the Talkie‑1930 Project

Researchers built a 13‑billion‑parameter model trained only on texts published before 1931, called Talkie‑1930, and used surprise‑based metrics, programming tests, and a modern‑twin comparison to explore how far such a historically‑constrained model can extrapolate future knowledge and reveal data‑leakage challenges.

AI research · HumanEval · data leakage
0 likes · 10 min read
AI Explorer
Apr 29, 2026 · Artificial Intelligence

Tencent Open‑Sources Hy‑MT: Offline Translation for 33 Languages Beats Google Translate

Tencent’s Hy‑MT1.5‑1.8B‑1.25bit model, now open‑source, runs entirely offline on smartphones, supports 33 languages, and—according to internal tests—delivers translation quality that surpasses Google Translate’s online service, highlighting the impact of 1.25‑bit quantization on model size and performance.

1.25bit quantization · Hy-MT · Mobile AI
0 likes · 6 min read
Machine Heart
Apr 28, 2026 · Artificial Intelligence

LangFlow Demonstrates Continuous Diffusion Matching Discrete Models via Better Training

LangFlow revisits continuous diffusion for language modeling, showing that earlier performance gaps were due to suboptimal training and evaluation, and through embedding‑space diffusion, a log‑NSR noise schedule, and a Gumbel‑based information schedule it matches or exceeds discrete diffusion and autoregressive baselines on standard and zero‑shot benchmarks.

Evaluation Metrics · Gumbel distribution · Langflow
0 likes · 16 min read
Machine Learning Algorithms & Natural Language Processing
Mar 14, 2026 · Artificial Intelligence

Can Large Language Models Get Stronger Without Human Language Training? A New Pre‑Pre‑Training Path

A recent study shows that pre‑training Transformers on synthetic, non‑language data generated by Neural Cellular Automata can boost language‑model performance by up to 6%, accelerate convergence by 40%, and improve downstream reasoning, even outperforming models trained on massive natural‑text corpora.

In-Context Learning · Neural Cellular Automata · Pre‑training
0 likes · 12 min read
AI Step-by-Step
Mar 10, 2026 · Artificial Intelligence

5 Essential Prompting Techniques to Make AI Truly Boost Your Productivity

The article explains that merely choosing the right AI tool is insufficient; real efficiency comes from asking clear, well‑structured questions, and it outlines five practical prompting methods—including specifying goals, providing background, breaking tasks into steps, defining output format, and iterating drafts—to turn AI into a time‑saving collaborator.

AI prompting · Prompt engineering · language models
0 likes · 9 min read
Qborfy AI
Mar 2, 2026 · Artificial Intelligence

Master Prompt Engineering: A 4‑Step Method to Make AI Give Exactly What You Want

This article explains why asking AI the right way matters, introduces a practical four‑step prompting framework—role, background, task, format—illustrates each step with concrete examples, reveals a hidden “sample” trick, and shows how iterative refinement can turn generic replies into precise, useful results.

AI communication · Prompt engineering · effective prompting
0 likes · 10 min read
Data Party THU
Feb 15, 2026 · Artificial Intelligence

Why Retrieval‑Augmented Generation Is Still Fragile: Boosting Generalization and Evidence‑Based Answers

Although modern information access is faster than ever, retrieval‑augmented generation systems remain vulnerable, especially under distribution shift. It is therefore crucial both to improve retriever generalization across domains and languages and to ensure that generators produce evidence‑grounded responses or refuse when evidence is lacking.

AI Robustness · RAG · evidence grounding
0 likes · 3 min read
Network Intelligence Research Center (NIRC)
Dec 30, 2025 · Artificial Intelligence

Bridging Tokenizer Gaps: Cross-Tokenizer Knowledge Distillation at AAAI 2026

This paper introduces SeDi, a semantics‑ and distribution‑aware cross‑tokenizer knowledge distillation framework that aligns teacher and student token spaces via bipartite graph components and top‑K re‑encoding, achieving state‑of‑the‑art performance and lower exposure bias on multiple LLM benchmarks.

AI research · cross-tokenizer distillation · entropy alignment
0 likes · 10 min read
HyperAI Super Neural
Nov 15, 2025 · Artificial Intelligence

AI Paper Weekly: Scale Pretraining, Game Agents, Attention, Context Engineering

This weekly roundup highlights five recent AI research papers—including CoCa’s contrastive captioning model, the Game‑TARS framework for scalable game agents, Kimi Linear’s efficient attention architecture, the Continuous Autoregressive Language Model (CALM), and a comprehensive survey of Context Engineering—summarizing their core contributions and providing direct links.

AI · Context Engineering · attention architecture
0 likes · 6 min read
Xiaohongshu Tech REDtech
Nov 4, 2025 · Artificial Intelligence

Unveiling the Law of Capacity Gap: Boosting Language Model Distillation Efficiency

At ACL 2025, a collaborative paper introduced the Law of Capacity Gap, revealing a linear 2.5× optimal teacher‑student size relationship in language model distillation, dramatically cutting compute costs and achieving Pareto‑optimal efficiency, with the MiniMA model as a successful demonstration.

Distillation · MiniMA · artificial-intelligence
0 likes · 7 min read
Data Party THU
Oct 2, 2025 · Artificial Intelligence

Bridging Human and Machine Learning: Meta Prompt Tuning and Lifelong Few-Shot Language Models

This article presents a comprehensive study on enhancing language models with few‑shot and continual learning techniques, introducing Meta Prompt Tuning, Dynamic Module Expansion, and the LFPT5 framework to achieve more human‑like, efficient, and adaptable learning across evolving tasks.

Lifelong Learning · continual learning · language models
0 likes · 8 min read
Data Party THU
Sep 18, 2025 · Artificial Intelligence

Can Language Models Self‑Optimize? Inside the STOP Framework

Researchers introduce the Self‑Taught Optimizer (STOP), a scaffolding‑based framework that lets large language models iteratively improve their own code without altering model weights, demonstrating superior performance on tasks like LPN, exploring diverse strategies such as beam search and genetic algorithms, while also highlighting security risks like sandbox bypass and reward hacking.

AI Safety · language models · recursive self-improvement
0 likes · 11 min read
HyperAI Super Neural
Sep 15, 2025 · Artificial Intelligence

AI Papers This Week: Red‑Team LMs, Multi‑View 3D Tracking, Protein Rep., Crypto Vulnerability Detection

This weekly roundup highlights five recent AI papers: a red‑team study of language models that reveals scaling challenges and releases a large attack dataset, a data‑driven multi‑view 3D point‑tracking method, the FusionProt framework for unified protein representation, an analysis of why language models hallucinate, and CryptoScope, an LLM‑based system for automated cryptographic vulnerability detection.

3D tracking · AI · Red Teaming
0 likes · 6 min read
Data Thinking Notes
Sep 10, 2025 · Artificial Intelligence

Why Do Language Models Hallucinate? Uncovering the Statistical Roots

OpenAI’s latest research reveals that language model hallucinations stem from training and evaluation incentives that favor confident guesses over acknowledging uncertainty, and proposes revised scoring methods that reward modesty, highlighting statistical mechanisms behind false answers and offering pathways to reduce hallucinations.

AI Safety · evaluation · hallucination
0 likes · 10 min read
Architect
Sep 9, 2025 · Artificial Intelligence

Why Do Language Models Hallucinate? Insights from OpenAI’s New Study

This article explains why large language models often produce confident but incorrect answers, detailing statistical inevitability, data scarcity, and model capacity limits, and proposes concrete solutions such as confidence thresholds and allowing abstention to reduce hallucinations.

AI Safety · Prompt engineering · evaluation
0 likes · 8 min read
Baobao Algorithm Notes
Sep 9, 2025 · Artificial Intelligence

Why Do Language Models Hallucinate? Roots, Risks, and a New Evaluation Approach

The article analyzes OpenAI's study on language‑model hallucinations, explaining how statistical limits in pre‑training and flawed binary evaluation incentives cause false answers, and proposes a confidence‑threshold scoring system that rewards honest "I don’t know" responses to improve reliability.

AI Safety · Model Alignment · confidence threshold
0 likes · 8 min read
AI Frontier Lectures
Jun 19, 2025 · Artificial Intelligence

Essential Multimodal Datasets for AI Research – Links, Stats, and Quick Overview

This article compiles a curated list of widely used multimodal datasets—including CLEVR, Visual Genome, Pangea, Touch‑Vision‑Language, WIT, and more—providing download URLs, key statistics, and brief descriptions to help researchers quickly locate the right data for vision‑language and multimodal model training.

AI · Datasets · language models
0 likes · 9 min read
AI Algorithm Path
Jun 8, 2025 · Artificial Intelligence

Autoregressive vs Diffusion Language Models: Principles, Trade‑offs, and Future Directions

The article compares autoregressive and diffusion language models, detailing their mathematical foundations, training and inference pipelines, performance trade‑offs such as speed, coherence and diversity, and explores hybrid approaches and emerging research directions for more efficient and controllable text generation.

AI research · Text Generation · Transformer
0 likes · 17 min read
NewBeeNLP
Sep 5, 2024 · Artificial Intelligence

Why RLHF Is Irreplaceable: Uncovering the Limits of SFT

The article analyzes why supervised fine‑tuning (SFT) cannot replace reinforcement learning from human feedback (RLHF), highlighting SFT's lack of negative feedback and backward‑looking capability, and explains how RLHF’s reward model addresses these fundamental shortcomings.

RLHF · Reward Modeling · SFT
0 likes · 7 min read
DataFunSummit
Jul 22, 2024 · Artificial Intelligence

From BERT to LLM: Language Model Applications in 360 Advertising Recommendation

This talk explores how 360's advertising recommendation system leverages language models—from BERT to large‑scale LLMs—to improve user interest modeling, feature extraction, and conversion‑rate prediction, detailing practical challenges, engineering solutions, experimental results, and future research directions.

Advertising · BERT · LLM
0 likes · 18 min read
Sohu Tech Products
Mar 20, 2024 · Artificial Intelligence

Comparison of Base LLM and Instruction Tuned LLM

The diagram contrasts a Base LLM, which merely predicts the next word from training data and can continue stories or answer simple facts but may generate unsafe text, with an Instruction‑Tuned LLM that is fine‑tuned via RLHF to understand and follow commands, delivering more accurate, useful, and safe responses.

AI · AI applications · BASE model
0 likes · 7 min read
php Courses
Nov 30, 2023 · Information Security

ChatGPT Repeat Prompt Vulnerability Exposes Sensitive Personal Information

Researchers discovered that prompting ChatGPT with repeated words can cause the model to leak private data such as phone numbers and email addresses, highlighting a serious repeat‑prompt vulnerability that reveals substantial personally identifiable information from its training corpus.

ChatGPT · PII · arXiv
0 likes · 3 min read
DataFunSummit
Nov 16, 2023 · Artificial Intelligence

Application of Language Models in Molecular Structure Prediction

This talk presents how large language models are leveraged for predicting protein, antibody, and RNA structures, covering background, model stability, generative approaches, antibody-specific models, RNA modeling, and protein‑RNA interaction prediction, along with experimental results and future research directions.

AI for Biology · Generative Models · RNA modeling
0 likes · 17 min read
Architect
Oct 12, 2023 · Artificial Intelligence

Evolution of Language Models: From Statistical N‑grams to GPT‑4

This article provides a comprehensive overview of natural language processing and language‑model research, tracing the historical development from early rule‑based and statistical N‑gram models through neural network approaches such as RNN, LSTM, ELMo, and Transformer, and detailing the architectures, strengths, and limitations of the GPT series up to GPT‑4, while also discussing evaluation metrics, practical applications, and future challenges.

GPT · NLP · artificial intelligence
0 likes · 34 min read
Zhuanzhuan Tech
Sep 28, 2023 · Artificial Intelligence

Evolution of Language Models and an Overview of the GPT Series

This article surveys the development of natural language processing from early rule‑based systems through statistical n‑gram models, neural language models, RNNs, LSTMs, ELMo, Transformers and BERT, and then details the architecture, training methods, advantages and limitations of the GPT‑1, GPT‑2, GPT‑3, ChatGPT and GPT‑4 models, concluding with a discussion of future challenges and references.

Deep Learning · GPT · NLP
0 likes · 30 min read
Rare Earth Juejin Tech Community
Aug 1, 2023 · Artificial Intelligence

Do Language Models Learn Language in the Same Stages as Children? An Analysis of GPT‑2 Developmental Trajectories

This article reviews a study that compares the stage‑wise language acquisition of infants with the learning trajectory of GPT‑2, using linguistic probes and statistical tests to determine whether deep language models follow sequential or parallel learning patterns similar to children.

AI research · GPT-2 · developmental learning
0 likes · 17 min read
360 Quality & Efficiency
Jul 21, 2023 · Artificial Intelligence

Prompt Engineering: Principles, Design Guidelines, and Practical Use Cases with ChatGPT

This article introduces prompt engineering for ChatGPT, explains key design principles, and demonstrates a series of practical applications such as text classification, summarization, role‑playing, terminal emulation, output formatting, temperature control, iterative fine‑tuning, and reverse‑engineering of prompts.

AI Prompt Design · ChatGPT · Prompt engineering
0 likes · 8 min read
Sohu Tech Products
Jul 19, 2023 · Artificial Intelligence

Understanding the Inner Workings of ChatGPT and Neural Networks

This article explains how ChatGPT generates text by predicting the next token using large language models, describes the role of probability, temperature, and attention mechanisms in transformers, and discusses neural network training, embeddings, semantic spaces, and the broader implications for artificial intelligence research.

ChatGPT · Neural Networks · artificial intelligence
0 likes · 79 min read
21CTO
Jun 16, 2023 · Artificial Intelligence

Why Are LLM Stacks Becoming Essential for Modern Companies?

A comprehensive look at how companies are rapidly adopting large language model APIs, retrieval techniques, and custom model strategies, revealing key statistics, emerging toolchains, and the shifting balance between closed‑source LLM services and open‑source custom stacks.

AI adoption · Custom Models · LLM
0 likes · 8 min read
Airbnb Technology Team
May 23, 2023 · Artificial Intelligence

Applying Text Generation Models to Scalable Customer Support at Airbnb

Airbnb replaced its XLM‑RoBERTa ranking with an MT5 encoder‑decoder for content recommendation, built a real‑time generative assistant for reply suggestions and intent detection, and deployed a T5‑based paraphrase chatbot, showing that large‑scale pre‑trained transformers improve relevance, agent efficiency, and user satisfaction.

AI · Airbnb · Prompt engineering
0 likes · 12 min read
Python Programming Learning Circle
Mar 17, 2023 · Artificial Intelligence

Analysis of New Bing’s Behavior Compared to ChatGPT: Issues, User Experiences, and Underlying AI Models

The article examines the public testing of the new Bing chatbot, contrasting its internet‑enabled, citation‑rich responses and occasional erratic, immature behavior with ChatGPT’s more stable output, while exploring user‑reported failures, speculative technical reasons, and the ethical implications of deploying advanced language models.

AI behavior · Bing · ChatGPT
0 likes · 8 min read
Architect's Guide
Feb 9, 2023 · Artificial Intelligence

Why ChatGPT Is So Powerful: A Technical Overview of NLP Model Evolution

This article explains why ChatGPT performs so well by tracing the evolution of natural‑language processing from rule‑based grammars through statistical n‑gram models to neural architectures like RNNs, LSTMs, attention mechanisms, Transformers, and the massive data and training methods that power modern large language models.

ChatGPT · NLP · Transformer
0 likes · 14 min read
Architect
Feb 6, 2023 · Artificial Intelligence

Understanding How ChatGPT Works: RLHF, PPO, and Consistency Challenges

This article explains the underlying mechanisms of ChatGPT, including its GPT‑3 foundation, the role of supervised fine‑tuning, human‑feedback reinforcement learning (RLHF), PPO optimization, consistency issues, evaluation metrics, and the limitations of these training strategies, with references to key research papers.

AI Alignment · ChatGPT · PPO
0 likes · 16 min read
DataFunSummit
Jan 14, 2023 · Artificial Intelligence

Key Transformer Model Papers Across Language, Vision, Speech, and Time‑Series Domains

This article surveys the most influential Transformer‑based research papers—from the original Attention Is All You Need work to recent models such as Autoformer and FEDformer—covering breakthroughs in natural language processing, computer vision, speech recognition, and long‑term series forecasting, and provides download links for each.

AI · Time-Series Forecasting · Transformer
0 likes · 17 min read
21CTO
Dec 7, 2022 · Artificial Intelligence

Why Did Stack Overflow Ban ChatGPT Answers? Insights and Community Reactions

Stack Overflow recently banned AI‑generated answers from ChatGPT after discovering thousands of inaccurate responses that required expert review, prompting a heated community debate about the benefits and risks of AI assistance on the platform.

AI policy · ChatGPT · content quality
0 likes · 4 min read
政采云技术
Jul 5, 2022 · Artificial Intelligence

Overview of Natural Language Processing Techniques and Their Evolution

This article provides a comprehensive overview of natural language processing, covering its definition, historical development from one‑hot encoding to modern models such as word2vec, ELMo, GPT, and BERT, and discusses the advantages, limitations, and key concepts of each technique.

NLP · Word Embedding · artificial intelligence
0 likes · 23 min read
JD Cloud Developers
Mar 11, 2022 · Artificial Intelligence

How JD’s NR‑Rino Model Cracked the DROP Benchmark with 90% Accuracy

The JD Intelligent Customer Service team’s NR‑Rino model topped the DROP leaderboard at 90.26% accuracy by enhancing multi‑head predictor architecture and training strategies, showcasing advanced discrete reasoning for machine reading comprehension and promising broader AI applications in finance, logistics, and health.

AI · DROP · NR-Rino
0 likes · 9 min read
DataFunSummit
Jan 25, 2022 · Artificial Intelligence

Intelligent Lyric Generation for Music: Techniques, Models, and Future Directions

This article explores how AI and natural language processing technologies are applied to music lyric creation, covering background challenges, rhyme retrieval methods, advanced language models such as SongNet, decoding strategies, style transfer, and a multi‑level generation platform that aims to streamline professional songwriting.

AI lyric generation · SongNet · Style Transfer
0 likes · 14 min read
DataFunSummit
Nov 14, 2021 · Artificial Intelligence

Overview of Pre‑training Models and the UER‑py Framework for Natural Language Processing

This article introduces the importance of pre‑training in natural language processing, reviews classic pre‑training models such as Skip‑thoughts, BERT, GPT‑2 and T5, presents the modular UER‑py framework and its Chinese resources, compares it with Huggingface Transformers, and outlines practical deployment steps in industry.

NLP · UER-py · language models
0 likes · 22 min read
DataFunTalk
Sep 23, 2020 · Artificial Intelligence

From Word Embedding to BERT: A Comprehensive Overview of Pre‑training Model Development in NLP

This article surveys the evolution of pre‑training models for natural language processing, detailing model architectures such as Encoder‑AE, Decoder‑AR, Encoder‑Decoder, Prefix LM, and PLM, analyzing why models like RoBERTa, T5, and GPT‑3 excel, and offering practical guidance for building strong pre‑training systems.

BERT · NLP · Transformer
0 likes · 47 min read
DataFunTalk
Jun 23, 2019 · Artificial Intelligence

Understanding XLNet: Differences from BERT, Innovations, and Experimental Analysis

This article examines XLNet, contrasting it with BERT by detailing its novel permutation language modeling, dual‑stream attention, and larger pre‑training data, and analyzes experimental results that show XLNet’s superior performance on reading‑comprehension, GLUE, and other NLP tasks, especially for long documents.

BERT · NLP · Permutation Language Model
0 likes · 27 min read
Alibaba Cloud Developer
Jun 5, 2019 · Artificial Intelligence

Tracing the Evolution of Language Models: From N‑grams to GPT‑2

This article reviews the historical development of natural language processing language models, covering expert rule‑based systems, statistical n‑grams, smoothing techniques, neural network models such as NNLM, RNN, word2vec, GloVe, ELMo, and the transformer‑based breakthroughs of GPT, BERT and GPT‑2, and summarizes their impact on modern NLP tasks.

BERT · Deep Learning · GPT
0 likes · 25 min read
Hulu Beijing
Apr 4, 2019 · Artificial Intelligence

How BERT, GPT, and ELMo Revolutionize Language Feature Representation

Natural language processing, a cornerstone of AI, relies on language models to capture linguistic features; this article reviews classic pre‑training models—ELMo, GPT, and BERT—explaining their architectures, training objectives, and how they boost downstream NLP tasks despite data‑scarcity challenges.

BERT · Deep Learning · ELMo
0 likes · 10 min read
21CTO
Jul 5, 2017 · Artificial Intelligence

Can AI Learn to Write Like a Chinese Novelist? Exploring Deep Learning in Literature

This article examines how deep‑learning‑based AI models, from symbolic and statistical NLP methods to Karpathy's recurrent network, progressively learn to generate Chinese wuxia novels, poetry, and web fiction, revealing both their surprising advances and inherent limitations.

AI · Deep Learning · Text Generation
0 likes · 15 min read