Tag: AI alignment

Architects' Tech Alliance
Jun 11, 2025 · Artificial Intelligence

From Transformers to DeepSeek‑R1: The 2017‑2025 Evolution of Large Language Models

This article chronicles the rapid development of large language models from the 2017 Transformer breakthrough through the rise of BERT, GPT‑3, ChatGPT, multimodal systems like GPT‑4V/o, and the recent cost‑efficient DeepSeek‑R1, highlighting key architectural innovations, scaling trends, alignment techniques, and their transformative impact on AI research and industry.

AI alignment · BERT · GPT
26 min read
Model Perspective
Apr 7, 2025 · Artificial Intelligence

Why AI Alignment Matters: Ensuring Smart Systems Follow Human Intent

This article explores the multifaceted AI alignment challenge, detailing safety benchmarks covering toxicity, ethics, power‑seeking, and hallucination, and argues that responsible AI development requires technical safeguards, international governance, and a civilizational dialogue bridging philosophy and humanity.

AI alignment · AI governance · AI safety
12 min read
Cognitive Technology Team
Apr 4, 2025 · Artificial Intelligence

Reasoning Models Do Not Always Reveal Their Thoughts: Evaluating Chain‑of‑Thought Fidelity

The article examines how modern reasoning models such as Claude 3.7 Sonnet display chain‑of‑thought explanations yet often hide or distort their true reasoning, posing challenges for AI safety and alignment, and evaluates methods for testing and improving chain‑of‑thought fidelity.

AI alignment · AI safety · Chain-of-Thought
13 min read
Architects' Tech Alliance
Mar 31, 2025 · Artificial Intelligence

A Comprehensive History of Large Language Models from the Transformer Era (2017) to DeepSeek‑R1 (2025)

This article reviews the evolution of large language models from the 2017 Transformer breakthrough through BERT, GPT series, alignment techniques, multimodal extensions, open‑weight releases, and the cost‑efficient DeepSeek‑R1 in 2025, highlighting key technical advances, scaling trends, and their societal impact.

AI alignment · LLM evolution · Reasoning Models
26 min read
Bilibili Tech
Jan 14, 2025 · Artificial Intelligence

Technical Practices and Productization of Intelligent Advertising Title Generation for Bilibili

We built an LLM‑powered system for Bilibili that automatically creates ad titles from user keywords, employing fluency, style, and quality classifiers, mixed‑domain data cleaning, and alignment methods such as SFT, DPO, and KTO, resulting in a product that now generates about ten percent of daily titles and drives significant ad spend.

AI alignment · Ad Title Generation · Bilibili
24 min read
AntTech
Sep 21, 2024 · Artificial Intelligence

Insights from the 2024 Inclusion·Bund Conference: From Data for AI to AI for Data

The 2024 Inclusion·Bund conference brought together academia and industry leaders to discuss how data technologies are evolving and aligning with AI, covering trends in large‑model storage, synthetic data generation, AI‑enhanced databases, and Ant Group's emerging AI‑centric data ecosystem.

AI · AI alignment · Data Strategy
7 min read
Python Programming Learning Circle
Jun 6, 2023 · Artificial Intelligence

Why ChatGPT Plus Performance Is Dropping and What OpenAI’s Roadmap Reveals

Recent reports indicate a noticeable decline in ChatGPT Plus’s GPT‑4 performance, especially in coding accuracy, prompting speculation about the growing pains of model scaling, AI alignment trade‑offs, and OpenAI’s GPU‑limited roadmap, which includes cheaper models, longer context windows, fine‑tuning, and multimodal extensions.

AI alignment · ChatGPT · GPT-4
8 min read
Tencent Cloud Developer
Feb 10, 2023 · Artificial Intelligence

Technical Overview of Claude's RLAIF Approach and Comparison with ChatGPT

Claude, Anthropic’s ChatGPT‑like assistant, employs Constitutional AI and a Reinforcement Learning from AI Feedback (RLAIF) pipeline that replaces costly human‑ranked data with AI‑generated critiques and revisions, achieving reasoning ability comparable to ChatGPT’s while markedly improving harmlessness through transparent rule‑based training, chain‑of‑thought prompting, and open‑source, reproducible methods.

AI alignment · ChatGPT · Claude
19 min read
IT Architects Alliance
Feb 9, 2023 · Artificial Intelligence

How ChatGPT Works: Model Architecture, Training Strategies, and RLHF

ChatGPT, OpenAI’s latest language model, builds on GPT‑3 using supervised fine‑tuning and Reinforcement Learning from Human Feedback (RLHF) with PPO to align model outputs with human preferences; the article covers its training methods, consistency issues, limitations, and evaluation metrics.

AI alignment · ChatGPT · PPO
15 min read
Architects' Tech Alliance
Feb 7, 2023 · Artificial Intelligence

ChatGPT: Technical Principles, Architecture, and the Role of Human‑Feedback Reinforcement Learning

This article explains how ChatGPT builds on GPT‑3 with improved accuracy and coherence, details its training pipeline that combines supervised fine‑tuning and Reinforcement Learning from Human Feedback (RLHF), discusses consistency challenges, evaluation metrics, and the limitations of the RLHF approach.

AI alignment · ChatGPT · PPO
15 min read
Architect
Feb 6, 2023 · Artificial Intelligence

Understanding How ChatGPT Works: RLHF, PPO, and Consistency Challenges

This article explains the underlying mechanisms of ChatGPT, including its GPT‑3 foundation, the role of supervised fine‑tuning, human‑feedback reinforcement learning (RLHF), PPO optimization, consistency issues, evaluation metrics, and the limitations of these training strategies, with references to key research papers.

AI alignment · ChatGPT · PPO
16 min read