Tagged articles

268 articles

Page 2 of 3

Jun 8, 2025 · Artificial Intelligence

Autoregressive vs Diffusion Language Models: Principles, Trade‑offs, and Future Directions

The article compares autoregressive and diffusion language models, detailing their mathematical foundations, training and inference pipelines, performance trade‑offs such as speed, coherence and diversity, and explores hybrid approaches and emerging research directions for more efficient and controllable text generation.

AI research · Text Generation · Transformer

0 likes · 17 min read

DataFunTalk

Jun 8, 2025 · Artificial Intelligence

Why Autoregressive Video Models Like MAGI-1 May Outperform Diffusion Approaches

The article examines the current dominance of diffusion models in commercial video generation, contrasts them with autoregressive methods, and details how the open‑source MAGI‑1 model combines both paradigms to achieve longer, more controllable video synthesis while addressing scalability and quality challenges.

AI research · Autoregressive Models · Diffusion Models

0 likes · 70 min read

Architect

Jun 7, 2025 · Artificial Intelligence

Mass Framework: Boosting Multi‑Agent Design with Smarter Prompts & Topologies

The Mass framework, developed by Google and the University of Cambridge, automates multi‑agent system design by jointly optimizing prompts and topologies in three stages, demonstrating significant performance gains over existing methods across various tasks while highlighting the importance of coordinated prompt‑topology optimization.

AI research · Mass framework · Topology Design

0 likes · 6 min read

Xiaohongshu Tech REDtech

Jun 6, 2025 · Artificial Intelligence

How dots.llm1 Sets New Benchmarks for Open‑Source MoE Language Models

dots.llm1, an open‑source 142‑billion‑parameter Mixture‑of‑Experts language model from hi lab, achieves Qwen2.5‑72B‑level performance after training on 11.2 T high‑quality tokens, and the release includes full models, intermediate checkpoints, and detailed training pipelines for the research community.

AI research · Mixture of Experts · Training Efficiency

0 likes · 10 min read

Alibaba Cloud Developer

Jun 5, 2025 · Artificial Intelligence

How Deep (Re)Search Transforms Code Search and AI-Powered Knowledge Retrieval

This article systematically explains the concepts of Deep Search and Deep Research, contrasts them with traditional Retrieval‑Augmented Generation, reviews leading commercial and open‑source solutions, details their architecture for code retrieval, and outlines future plans for specialized code‑search agents.

AI research · Knowledge Retrieval · Retrieval Augmented Generation

0 likes · 13 min read

Baobao Algorithm Notes

Jun 4, 2025 · Artificial Intelligence

Do Recent LLM‑RL Papers Overstate Their Gains? A Critical Review

This article critically examines seven high‑profile reinforcement‑learning papers for large language models, exposing flawed baseline evaluations, unrealistic settings, and modest actual improvements despite bold claims of dramatic performance gains.

AI research · LLM · baseline evaluation

0 likes · 8 min read

Kuaishou Tech

Jun 4, 2025 · Artificial Intelligence

KwaiCoder-AutoThink-preview: An Automatic‑Thinking Large Model Enhanced with Step‑SRPO Reinforcement Learning

The KwaiPilot team released the KwaiCoder‑AutoThink‑preview model, which introduces a novel automatic‑thinking training paradigm and a process‑supervised reinforcement‑learning method called Step‑SRPO, enabling the model to dynamically switch between thinking and non‑thinking modes, reduce inference cost, and achieve up to 20‑point gains on code and math benchmarks while handling large‑scale codebases.

AI research · Model Optimization · Reinforcement Learning

0 likes · 12 min read

Xiaohongshu Tech REDtech

Jun 3, 2025 · Artificial Intelligence

Beyond One-Size-Fits-All: Tailored Benchmarks for Efficient Evaluation

The TailoredBench framework dramatically reduces large‑language‑model evaluation cost and error by using a global probe set, model‑specific source selection, extensible K‑Medoids clustering, and calibration, achieving up to 300× speedup and a 31.4% MAE reduction across diverse benchmarks.

AI research · K-Medoids · LLM evaluation

0 likes · 10 min read

AntTech

May 31, 2025 · Artificial Intelligence

Machine Reasoning and Deep Thinking: Insights from Ant Financial’s NLP Lead Wu Wei

The article explores how DeepSeek R1 and long chains of thought have revived interest in machine reasoning, tracing the evolution of natural‑language models, defining reasoning as logical knowledge composition, and outlining future research directions in efficient reasoning architectures and deep‑thinking applications.

AI research · Efficient Reasoning · Large Language Models

0 likes · 8 min read

ShiZhen AI

May 28, 2025 · Artificial Intelligence

Claude Finally Gets Voice: Anthropic Adds Speech to Its AI Assistant

Anthropic has introduced a voice mode for Claude, enabling English users to speak and type interchangeably with five voice personalities, while a new 3D AI startup, SpAItial, showcases photorealistic room generation and researchers present INTUITOR, a confidence‑driven training method that improves AI reasoning.

AI research · Anthropic · Claude

0 likes · 7 min read

AI Frontier Lectures

May 28, 2025 · Artificial Intelligence

How Token‑Shuffle Enables 2048×2048 Autoregressive Image Generation

The article analyzes the Token‑Shuffle method, which reduces visual token redundancy to allow high‑resolution (2048×2048) autoregressive image generation, detailing its architecture, training pipeline, experimental results, efficiency gains, and comparisons with diffusion and other AR models.

AI research · Autoregressive Models · High‑Resolution Image Generation

0 likes · 17 min read

AI Frontier Lectures

May 27, 2025 · Artificial Intelligence

Can One-Step Generative Modeling Beat Multi-Step Diffusion? Inside MeanFlow

The article presents MeanFlow, a novel one‑step generative modeling framework that replaces instantaneous velocity with an average‑velocity field, achieving a record‑low FID of 3.43 on ImageNet 256×256 with a single function evaluation and outperforming both prior single‑step and multi‑step diffusion models.

AI research · FID · ImageNet

0 likes · 7 min read

Baobao Algorithm Notes

May 26, 2025 · Artificial Intelligence

When Should Large Language Models Think? 10 Cutting‑Edge Strategies to Boost Reasoning Efficiency

This article reviews ten recent papers that tackle the over‑thinking problem in large language models by shortening chain‑of‑thought reasoning, introducing dynamic early‑exit, adaptive thinking triggers, and reinforcement‑learning‑based training, showing how models can maintain or improve accuracy while dramatically reducing token usage and latency.

AI research · Model Pruning · adaptive inference

0 likes · 38 min read

AI Frontier Lectures

May 21, 2025 · Artificial Intelligence

How BGE’s New Code and Multimodal Vector Models Set New Retrieval Benchmarks

The article introduces three BGE vector models—BGE‑Code‑v1, BGE‑VL‑v1.5, and BGE‑VL‑Screenshot—detailing their architectures, open‑source resources, benchmark results on CoIR, Code‑RAG, MMEB, and MVRB, and their impact on code and multimodal retrieval research.

AI research · Multimodal AI · Open-source models

0 likes · 8 min read

JD Tech

May 20, 2025 · Artificial Intelligence

How Re‑parameterization and Adaptive Learning Boost Visual Deep Learning Efficiency

The award‑winning project from Tsinghua University and JD Retail introduces re‑parameterization model design, cross‑scene adaptive learning, and platform‑aware compression to overcome accuracy‑efficiency trade‑offs in visual deep learning, achieving over 20% accuracy gains and more than 50% inference speedup in real‑world e‑commerce deployments.

AI research · Computer Vision · adaptive models

0 likes · 6 min read

AI Frontier Lectures

May 19, 2025 · Artificial Intelligence

DreamO: Multi‑Condition Image Customization with a 400M Flux‑Based Model

DreamO, a collaborative effort by ByteDance and Peking University, introduces a unified 400M‑parameter framework built on Flux‑1.0‑dev that enables simultaneous control of identity, style, appearance, and virtual try‑on, offering open‑source, low‑cost, and fast image customization comparable to commercial large models.

AI research · DreamO · Flux model

0 likes · 6 min read

Amap Tech

May 19, 2025 · Artificial Intelligence

Group Policy Gradient: Direct Objective Optimization for Faster Reinforcement Learning

The article introduces Group Policy Gradient (GPG), a reinforcement‑learning framework that eliminates surrogate loss functions and critic models, directly optimizes the original objective, reduces bias and variance, and achieves state‑of‑the‑art performance on both single‑modal and multimodal tasks.

AI research · LLM fine-tuning · Policy Gradient

0 likes · 7 min read

Amap Tech

May 12, 2025 · Artificial Intelligence

How G3PT Uses Autoregressive Modeling to Revolutionize 3D Generation

The paper introduces G3PT, a groundbreaking autoregressive 3D generation model that employs a Cross‑Scale Querying Transformer and multi‑scale tokenization to produce high‑quality meshes from a single image, outperforming diffusion‑based methods and revealing a scaling law for 3D generation.

3D generation · AI research · G3PT

0 likes · 9 min read

AI Frontier Lectures

May 10, 2025 · Artificial Intelligence

Can the ‘Canon’ Layer Unlock New Limits in Large Language Models?

A new study introduces the lightweight “Canon” layer for large language models, showing how it improves information flow, reasoning depth, and scalability across Transformers, linear attention, and state‑space architectures, while offering a controlled synthetic pre‑training benchmark for deeper architectural analysis.

AI research · Large Language Models · Mamba

0 likes · 11 min read

Baobao Algorithm Notes

Apr 28, 2025 · Artificial Intelligence

What Makes Qwen3 the Next Leap in Large Language Models?

The article announces Qwen3, detailing its flagship 235B and smaller MoE models, superior benchmark performance, extensive multilingual support, expanded pretraining data, four‑stage post‑training, flexible thinking modes, deployment guides for SGLang, vLLM, and Ollama, and future plans toward AGI‑level capabilities.

AI research · Deployment · Qwen3

0 likes · 15 min read

DataFunTalk

Apr 25, 2025 · Artificial Intelligence

Does Reinforcement Learning Really Expand Reasoning Capacity in Large Language Models? Insights from Recent Empirical Study

Recent empirical research by Tsinghua’s LeapLab and Shanghai Jiao Tong University reveals that reinforcement learning with verifiable rewards (RLVR) improves sampling efficiency but does not extend the fundamental reasoning abilities of large language models beyond their base capabilities, as demonstrated across mathematics, code, and visual reasoning benchmarks.

AI research · Large Language Models · RLVR

0 likes · 12 min read

Architect

Apr 21, 2025 · Artificial Intelligence

Microsoft Research Releases BitNet b1.58 2B4T: A 1‑Bit Native Large Language Model with Ultra‑Low Memory and Energy Consumption

Microsoft Research introduced BitNet b1.58 2B4T, a native 1‑bit large language model with 2 billion parameters trained on 4 trillion tokens, achieving only 0.4 GB non‑embedding memory, 0.028 J decoding energy, and 29 ms CPU latency while matching full‑precision performance.

1-bit LLM · AI research · BitNet

0 likes · 7 min read

AI Frontier Lectures

Apr 18, 2025 · Artificial Intelligence

From RL’s Early Days to Its Future: A Four‑Stage Evolution of Reinforcement Learning

This reflective essay traces reinforcement learning’s decade‑long evolution through four stages—early algorithmic foundations, application‑driven growth, problem‑construction focus, and speculative future—while critiquing the expanding definition and its impact on research and industry.

AI research · RL evolution · RLHF

0 likes · 9 min read

21CTO

Apr 17, 2025 · Artificial Intelligence

What’s New in OpenAI’s GPT‑4.1? Bigger Context, Faster, Cheaper AI

OpenAI has launched GPT‑4.1, a multimodal AI model that expands context windows to one million tokens, improves coding and instruction following, offers cheaper Mini and Nano variants, and signals a shift in its release roadmap, including plans to retire GPT‑4 and delay GPT‑5.

AI research · GPT-4.1 · OpenAI

0 likes · 5 min read

Baobao Algorithm Notes

Apr 16, 2025 · Artificial Intelligence

Why Reinforcement Learning Finally Works: The Second Half of AI

The article argues that AI has entered its second half, where reinforcement learning finally generalizes thanks to large‑scale language pretraining and reasoning, shifting focus from building ever better models to redefining problems, evaluation methods, and real‑world utility.

AI research · industry trends

0 likes · 16 min read

DevOps

Apr 13, 2025 · Artificial Intelligence

The Amazing Magic of GPT‑4o and a Speculative Technical Roadmap

This article reviews the breakthrough image‑generation capabilities of GPT‑4o, showcases diverse examples, and offers a detailed speculation on its underlying autoregressive architecture, tokenization methods, VQ‑VAE/GAN advances, and training strategies that could explain its performance.

AI research · GPT-4o · Image Generation

0 likes · 16 min read

AntTech

Apr 10, 2025 · Artificial Intelligence

Ant Group Presents Four AI Research Papers at ICLR 2025 Live Showcase

At the ICLR 2025 live session in Singapore, Ant Group showcased four cutting‑edge papers—CodePlan, Animate‑X, Group Position Embedding, and OmniKV—demonstrating advances in large‑language‑model reasoning, universal character animation, layout‑aware document understanding, and efficient long‑context inference.

AI research · Large Language Models · Multimodal

0 likes · 6 min read

DevOps

Apr 7, 2025 · Artificial Intelligence

Meta Llama 4 Scout, Maverick, and Behemoth: Architecture, NoPE Innovation, and Training Advances

The article introduces Meta's newly open‑sourced Llama 4 series, including Scout with a 10 million‑token context window, Maverick with 400 billion parameters, and the upcoming Behemoth teacher model, detailing their mixture‑of‑experts architecture, the NoPE layers that drop explicit positional encoding, training pipelines, performance benchmarks, and infrastructure improvements for large‑scale AI research.

AI research · Context Window · Llama 4

0 likes · 8 min read

AntTech

Mar 31, 2025 · Artificial Intelligence

Ant Group Papers Accepted at ICLR 2025: Summaries and Links

The article presents the abstracts, publication types, links, and research areas of seventeen Ant Group papers accepted at ICLR 2025, covering topics such as embodied robot co‑design, efficient distributed training for large language models, optimization via LLMs, character animation, interactive frame interpolation, KV‑cache management, and privacy‑preserving Transformers.

AI research · Ant Group · ICLR2025

0 likes · 23 min read

AI Frontier Lectures

Mar 30, 2025 · Artificial Intelligence

Do Large Language Models Mirror Human Brain Language Processing? Google’s Groundbreaking Findings

Google researchers discovered a linear relationship between brain activity recorded during natural conversation and the internal embeddings of a speech‑to‑text large language model, revealing that acoustic and lexical representations from the model can accurately predict neural responses in both language comprehension and production.

AI research · Google · Large Language Models

0 likes · 8 min read

AI Frontier Lectures

Mar 30, 2025 · Artificial Intelligence

How NOVA Generates High‑Quality Video Autoregressively Without Vector Quantization

This article provides an in‑depth analysis of the NOVA model, a non‑quantized autoregressive video generation framework that combines frame‑by‑frame temporal prediction with set‑by‑set spatial prediction, uses diffusion loss for token estimation, and achieves state‑of‑the‑art results on multiple video and image benchmarks.

AI research · Autoregressive Model · NOVA

0 likes · 15 min read

AI Frontier Lectures

Mar 29, 2025 · Artificial Intelligence

How MMGDreamer Achieves Precise Geometry Control in 3D Indoor Scene Generation

MMGDreamer introduces a mixed‑modality graph and a dual‑branch diffusion model that combine text, image, and relational cues to generate highly realistic, geometrically controllable 3D indoor scenes, outperforming prior methods on multiple quantitative and qualitative benchmarks.

3D scene generation · AI research · Computer Vision

0 likes · 11 min read

MaGe Linux Operations

Mar 26, 2025 · Artificial Intelligence

Why Qwen2.5‑VL‑32B Is the New AI Breakthrough for Vision and Math

Alibaba's newly released Qwen2.5‑VL‑32B multimodal model delivers state‑of‑the‑art visual and textual performance, offering human‑aligned responses, superior mathematical reasoning, fine‑grained image understanding, and efficient deployment features that make it a compelling tool for developers and AI researchers alike.

AI research · Qwen2.5-VL-32B · large language model

0 likes · 9 min read

AI Frontier Lectures

Mar 25, 2025 · Artificial Intelligence

What Drives Alignment in Multimodal Large Language Models? A Comprehensive Review

This article provides an in‑depth review of alignment algorithms for multimodal large language models, covering application scenarios, dataset construction methods, evaluation benchmarks, current challenges, and future research directions, while summarizing contributions from leading academic institutions.

AI research · alignment algorithms · dataset construction

0 likes · 22 min read

AI Algorithm Path

Mar 20, 2025 · Artificial Intelligence

Understanding Multimodal Large Language Models: Recent Advances and Comparative Analysis

This article surveys the latest multimodal large language model research, dissecting the design, training strategies, and performance trade‑offs of models such as Llama 3.2, Molmo, NVLM, Qwen2‑VL, Pixtral, MM1.5, Emu3, and Janus, and highlights the challenges of fair cross‑model evaluation.

AI research · Cross-Attention · Large Language Models

0 likes · 16 min read

AIWalker

Mar 14, 2025 · Artificial Intelligence

Dynamic Tanh Lets He Kaiming and LeCun Drop Transformer Normalization in 9 Lines

Researchers He Kaiming, Yann LeCun and colleagues propose a 9‑line Dynamic Tanh (DyT) layer that replaces LayerNorm/RMSNorm in Transformers, showing comparable or superior accuracy across vision, language, speech and DNA tasks while also reducing inference latency on modern GPUs.

AI research · Deep Learning · Dynamic Tanh

0 likes · 18 min read

JD Tech

Mar 12, 2025 · Artificial Intelligence

From Low‑Resource Large Model Training to Dynamic Margin Selection: A JD Engineer’s Journey

The article recounts a JD retail engineer’s rapid growth through tackling low‑resource large‑model training, developing a margin‑based dynamic data selection method (DynaMS) that earned an ICLR paper, and sharing practical insights on aligning business needs with cutting‑edge AI research.

AI research · ICLR · Low‑Resource Training

0 likes · 11 min read

AIWalker

Mar 11, 2025 · Artificial Intelligence

Introducing FAR: A Frequency‑Progressive Autoregressive Paradigm for Image Generation

The paper presents FAR, a frequency‑aware autoregressive framework that predicts image tokens from low‑frequency to high‑frequency components using a continuous tokenizer, and demonstrates its efficiency and quality on ImageNet and text‑to‑image benchmarks compared with existing AR and VAR methods.

AI research · Autoregressive Models · FAR

0 likes · 20 min read

DataFunTalk

Mar 7, 2025 · Artificial Intelligence

DeepSeek R1 Technical Report: Insights into Reasoning Models and Their Impact

This presentation reviews the development, technical details, and societal impact of DeepSeek's R1 model, explaining its reasoning capabilities, training pipeline, comparisons with other models, and future directions for AI research and product applications.

AI research · DeepSeek · R1

0 likes · 53 min read

AIWalker

Mar 5, 2025 · Artificial Intelligence

Attention Distillation in Diffusion Models: CVPR 2025 Technique Outperforms Traditional Image Generation

The paper introduces a novel attention‑distillation loss and a guided‑sampling scheme that together enable diffusion models to faithfully transfer visual features from reference images, dramatically speeding synthesis and surpassing prior plug‑and‑play attention methods across style transfer, text‑to‑image generation, and texture synthesis tasks.

AI research · Diffusion Models · Image Generation

0 likes · 15 min read

Data Thinking Notes

Mar 4, 2025 · Artificial Intelligence

Unlock AI-Powered Research: The DeepSeek‑R1 & DeepResearch Guide

Compiled by Tsinghua University experts, this guide systematically analyzes the DeepSeek‑R1 reasoning model and DeepResearch platform, offering multi‑model comparisons, real‑world case studies, and end‑to‑end AI‑driven solutions from data collection to report generation for researchers.

AI research · Data Automation · DeepSeek

0 likes · 6 min read

Architect

Mar 3, 2025 · Artificial Intelligence

Unlocking Reasoning LLMs: Methods, DeepSeek R1 Insights, and Cost‑Effective Strategies

This article examines how to build and improve reasoning‑capable large language models, explains the definition and use‑cases of reasoning models, details DeepSeek‑R1’s training pipeline, compares four key enhancement methods—including inference‑time scaling, pure RL, SFT + RL, and distillation—and offers budget‑friendly advice.

AI research · DeepSeek · Inference Scaling

0 likes · 27 min read

Tencent Cloud Developer

Feb 27, 2025 · Artificial Intelligence

DeepSeek LLM Series (V1‑V3, R1) Technical Overview and Analysis

The DeepSeek technical overview details the evolution from the dense 67 B V1 model through the 236 B MoE‑based V2 and 671 B V3 with FP8 training, to the RL‑only R1 series that learns reasoning without supervision. It highlights innovations such as Grouped‑Query Attention, Multi‑Head Latent Attention, auxiliary‑loss‑free load balancing for MoE, Multi‑Token Prediction, and knowledge distillation, and reports state‑of‑the‑art benchmark results and open‑source reproduction projects.

AI research · DeepSeek · Mixture of Experts

0 likes · 37 min read

Architecture Digest

Feb 25, 2025 · Artificial Intelligence

DeepSeek Distillation Technology: Overview, Innovations, Architecture, Training, Performance, and Challenges

DeepSeek’s distillation technology combines data and model distillation to transfer knowledge from large teacher models to compact student models, detailing its definitions, principles, key innovations, architecture, training methods, performance gains, and challenges, especially in multimodal contexts.

AI research · DeepSeek · Large Language Models

0 likes · 16 min read

Architect

Feb 22, 2025 · Artificial Intelligence

How Open‑Source Projects Reproduced DeepSeek‑R1 and Pushed LLM Limits

This article reviews the most notable open‑source reproductions of DeepSeek‑R1—including Open R1, OpenThoughts, LIMO and DeepScaleR—detailing their data pipelines, training steps, reinforcement‑learning strategies, dataset constructions, and benchmark results that demonstrate how small, high‑quality data can rival massive‑scale models.

AI research · DeepSeek-R1 · Model Scaling

0 likes · 26 min read

AIWalker

Feb 22, 2025 · Artificial Intelligence

FlexTok Achieves High‑Quality Visual Reconstruction with as Few as 8 Tokens, Outperforming TiTok

FlexTok introduces a variable‑length 1‑D image tokenizer that can reconstruct images with as few as eight tokens, surpasses TiTok in FID and MAE across multiple token budgets, and serves as a hierarchical visual vocabulary for autoregressive image generation.

AI research · FlexTok · autoregressive generation

0 likes · 23 min read

NewBeeNLP

Feb 21, 2025 · Artificial Intelligence

Do Scaling Laws Still Hold? Analyzing Grok‑3, DeepSeek and LLM Training Trends

The article examines whether pre‑training scaling laws remain valid, compares Grok‑3’s architecture and training strategy with DeepSeek models, and explores how different scaling approaches (pre‑training, RL‑based, and test‑time) affect the cost‑effectiveness and intelligence of large language models.

AI research · Grok-3 · scaling laws

0 likes · 11 min read

Architect

Feb 20, 2025 · Artificial Intelligence

Why Long CoT and In‑Context RL Are the Next Frontier for LLMs

The article analyses recent breakthroughs such as OpenAI's o1, Long CoT, and test‑time search, arguing that enabling LLMs to perform self‑critique and reinforcement learning with long output sequences is essential for future AI performance, while warning against overly structured workflows.

AI research · In‑Context RL · LLM

0 likes · 12 min read

AIWalker

Feb 20, 2025 · Artificial Intelligence

Transfusion: A Single Model for Unified Image Generation and Understanding

Transfusion is a 7B‑parameter transformer that jointly trains language modeling and diffusion losses on mixed text‑image data, enabling seamless text generation, image generation, and image understanding within one model and outperforming prior multimodal approaches such as Chameleon across multiple benchmarks.

AI research · Image Generation · Language Modeling

0 likes · 20 min read

Baobao Algorithm Notes

Feb 19, 2025 · Artificial Intelligence

How X‑R1’s New Open‑Source 0.5B/1.5B/3B Models Enable LoRA and Chinese Inference

The X‑R1 release introduces fully open‑source 0.5B, 1.5B and 3B models with one‑click training scripts, LoRA fine‑tuning support, Chinese inference capabilities, detailed reward‑curve visualizations, and quick‑start instructions for both CUDA and Ascend platforms.

AI research · Chinese inference · LoRA

0 likes · 5 min read

AIWalker

Feb 16, 2025 · Artificial Intelligence

VARGPT: A Unified Autoregressive Architecture for Multimodal Understanding and Generation

VARGPT is a novel multimodal large language model that unifies visual understanding and autoregressive image generation within a single architecture, extending LLaVA with next‑token and next‑scale prediction, trained through three staged data‑curated phases and achieving superior performance on numerous vision‑language benchmarks.

AI research · Image Generation · Multimodal

0 likes · 20 min read

Baobao Algorithm Notes

Feb 13, 2025 · Artificial Intelligence

How to Build and Improve Reasoning LLMs: Methods, Trade‑offs, and DeepSeek Insights

This article explains what reasoning language models are, when they are needed, and reviews four main techniques— inference‑time scaling, pure reinforcement learning, combined SFT + RL, and distillation—illustrated with DeepSeek‑R1’s development, cost analysis, and low‑budget alternatives.

AI research · DeepSeek · Inference Scaling

0 likes · 27 min read

ZhongAn Tech Team

Feb 10, 2025 · Artificial Intelligence

Weekly AI Technology Overview: OpenAI ChatGPT Search, Deep Research, DeepSeek Advances, and Industry Insights

This week’s AI roundup covers OpenAI opening ChatGPT Search to all users, the launch of Deep Research for automated multi‑step research, NetEase Youdao’s integration of DeepSeek‑R1, Figure ending its robotics partnership with OpenAI, the low‑cost s1 model, OpenAI’s Stargate data‑center plans, Google’s antitrust probe, DeepSeek’s traffic surge, and top AI scientist Xu joining Alibaba.

AI research · ChatGPT · DeepSeek

0 likes · 9 min read

Top Architect

Feb 9, 2025 · Artificial Intelligence

DeepSeek‑R1: Training Pipeline, Reinforcement‑Learning Techniques, and Experimental Results

The article reviews DeepSeek‑R1’s training methodology—including cold‑start data collection, multi‑stage RL fine‑tuning, SFT data generation, and model distillation—highlights its performance comparable to OpenAI‑o1‑1217, and discusses key contributions, reward design, successful experiments, and failed attempts.

AI research · DeepSeek · LLM

0 likes · 12 min read

Architect

Feb 6, 2025 · Artificial Intelligence

DeepSeek‑R1: Reinforcement‑Learning‑Driven Long‑Chain Reasoning for Large Language Models

The article reviews DeepSeek‑R1, detailing its reinforcement‑learning‑based training pipeline that uses minimal supervised data, cold‑start fine‑tuning, multi‑stage RL, rejection‑sampling SFT, and distillation to achieve reasoning performance comparable to OpenAI‑o1‑1217, while also discussing successful contributions and failed experiments.

AI research · DeepSeek-R1 · LLM reasoning

0 likes · 11 min read

AIWalker

Feb 4, 2025 · Artificial Intelligence

Meta’s Open‑Source MILS Enables LLMs to See and Hear Without Training – SOTA on Images, Video, and Audio

The paper introduces MILS, a training‑free multimodal iterative LLM solver that lets large language models perceive and generate across image, video, and audio domains, achieving new state‑of‑the‑art results without any task‑specific data or fine‑tuning.

AI research · LLM · MILS

0 likes · 18 min read

AIWalker

Jan 17, 2025 · Artificial Intelligence

InternLM 3.0: Boosting Model Performance with Only 4 TB of Training Data

Shanghai AI Laboratory’s InternLM 3.0 upgrade demonstrates that refining data quality—measured as intelligence‑per‑token—can replace massive datasets, achieving higher reasoning and dialogue capabilities with just 4 TB of tokens, cutting training cost by over 75 % while approaching GPT‑4‑level performance.

AI research · InternLM · Model Evaluation

0 likes · 9 min read

360 Zhihui Cloud Developer

Jan 9, 2025 · Artificial Intelligence

Unlocking Efficient Large Model Fine‑Tuning: LoRA, LoRA+, rsLoRA, DoRA & PiSSA Explained

This article introduces the fundamentals of large‑model fine‑tuning, compares popular parameter‑efficient methods such as LoRA and its variants, presents experimental results on the Qwen2.5‑7B model, and discusses current challenges and future research directions.

AI research · LoRA · large model fine-tuning

0 likes · 17 min read

DevOps

Jan 7, 2025 · Artificial Intelligence

Microsoft’s 2025 AI Predictions: Stronger Models, AI Agents, AI Companions, Efficient Resources, Testing & Customization, and Accelerated Scientific Research

Microsoft outlines six 2025 AI forecasts—including more powerful models, autonomous AI agents reshaping work, AI companions aiding daily life, greener resource use, rigorous testing and customization, and AI-driven scientific breakthroughs—highlighting how these advances will transform industries, research, and everyday experiences.

2025 predictions · AI models · AI research

0 likes · 8 min read

21CTO

Jan 2, 2025 · Artificial Intelligence

2025 AI Breakthroughs: Unlimited Memory & Intelligent Agents, Says Eric Schmidt

Former Google CEO Eric Schmidt warns that AI is on the brink of a transformative era, highlighting three 2025 breakthroughs—unlimited context memory, autonomous AI agents, and text‑to‑action programming—while also stressing the looming risks of energy consumption, security threats, and the need for ethical safeguards.

AI Safety · AI memory · AI research

0 likes · 14 min read

DaTaobao Tech

Dec 30, 2024 · Artificial Intelligence

AI Research Highlights: AAAI 2025 & NeurIPS 2024 Breakthroughs in Image Generation

This article compiles recent AI research breakthroughs presented at AAAI 2025 and NeurIPS 2024, summarizing eight papers on multi‑condition image generation, mixed auto‑regressive models, hallucination mitigation in vision‑language models, quantized diffusion denoising, facial part swapping, language‑guided concept vectors, attribution consistency, and video virtual try‑on, with links to each work.

AAAI 2025 · AI research · Diffusion Models

0 likes · 13 min read

Baobao Algorithm Notes

Dec 16, 2024 · Artificial Intelligence

What Do Leading Open‑Source LLMs Do After Pretraining? A Deep Dive into Post‑Training Strategies

This article surveys the post‑training pipelines of major open‑source large language models released this year, detailing their alignment algorithms, data synthesis, reward modeling, DPO/GRPO variants, long‑context handling, tool use, and model‑averaging techniques, and highlights emerging trends such as data‑centric pipelines and iterative weak‑to‑strong alignment.

AI research · Alignment · LLM

0 likes · 99 min read

DataFunTalk

Nov 30, 2024 · Artificial Intelligence

Interview with Rich Sutton on Continuous Learning, Reinforcement Learning, and the Future of AI

In this extensive interview, Rich Sutton critiques the focus on transient deep learning, advocates for continuous learning, discusses the reward hypothesis, outlines research challenges, offers advice to emerging scholars, and predicts breakthroughs in AI understanding by 2030‑2040.

AI research · Reinforcement Learning · continuous learning

0 likes · 27 min read

Alipay Experience Technology

Nov 27, 2024 · Artificial Intelligence

EchoMimicV2: High‑Quality Audio‑Driven Half‑Body Human Animation with Simple Inputs

EchoMimicV2 is an open‑source digital‑human framework that generates high‑quality half‑body animation videos from a single reference image, an audio clip, and a hand‑gesture sequence, addressing challenges of facial portrait limits, complex condition injection, and inference latency in audio‑driven animation.

AI research · Diffusion Models · Digital Human

0 likes · 18 min read

360 Tech Engineering

Nov 15, 2024 · Artificial Intelligence

Advances in Multimodal Large Models and Document Understanding Presented at the 2024 Global Machine Learning Conference (Beijing)

At the 2024 Global Machine Learning Conference in Beijing, 360 AI Research Institute showcased cutting‑edge multimodal large‑model research, fine‑grained open‑world object detection, and document understanding technologies, highlighting open‑source releases, real‑world deployments, and competitive achievements in AI competitions.

AI research · Multimodal AI · document understanding

0 likes · 7 min read

Tencent Cloud Developer

Nov 6, 2024 · Artificial Intelligence

Overview of Tencent Hunyuan Large and 3D Generation Model Open‑Source Release

Tencent has open‑sourced its 389‑billion‑parameter Hunyuan Large Mixture‑of‑Experts model—featuring 52 B active parameters, 256 K token context, novel routing, KV‑cache compression, and advanced training optimizations that beat leading open‑source models—and its first text‑to‑3D/image‑to‑3D Hunyuan 3D Generation model, both downloadable via GitHub, Hugging Face, and Tencent Cloud.

3D generation · AI research · Mixture of Experts

0 likes · 9 min read

Meituan Technology Team

Oct 31, 2024 · Artificial Intelligence

Selected Meituan Papers from CIKM 2024: Summaries of Eight Research Works

This article highlights eight Meituan research papers accepted at CIKM 2024—spanning self‑supervised sequential recommendation, rating‑consistent explanation generation, CTR prediction via recommendation pre‑training, cross‑domain interest transfer, multimodal vector retrieval, design‑aware poster layout, order‑fulfillment cycle‑time forecasting, and delivery‑scope substitution—offering insights from both internal and university collaborations.

AI research · CTR prediction · Cross‑Domain Recommendation

0 likes · 16 min read

Baobao Algorithm Notes

Oct 30, 2024 · Artificial Intelligence

How to Choose High-Quality Instruction Data for LLM Fine‑Tuning: Methods Compared

This article surveys and categorizes instruction data selection techniques for large language model fine‑tuning, explaining metric‑based, trainable‑LLM, powerful‑LLM, and small‑model approaches, detailing representative papers, their pipelines, and empirical findings on data quality and diversity.

AI research · Data Quality · Instruction Tuning

0 likes · 15 min read

AntTech

Oct 29, 2024 · Artificial Intelligence

Three Ant Group Papers Featured at EMNLP 2024: Dynamic Transformers, Plug‑and‑Play Visual Reasoner, and Efficient Fine‑Tuning of Large Language Models

This announcement introduces three Ant Group papers accepted at EMNLP 2024—Mixture‑of‑Modules for dynamic Transformer assembly, a plug‑and‑play visual reasoning framework built via data synthesis, and a layer‑wise importance‑aware efficient fine‑tuning method for large language models—highlighting their innovations and upcoming live presentations.

AI research · EMNLP 2024 · Large Language Models

0 likes · 6 min read

Baobao Algorithm Notes

Oct 24, 2024 · Artificial Intelligence

How NoteLLM-2 Boosts Multimodal Recommendations with In-Context Learning

NoteLLM-2 introduces multimodal In-Context Learning and Late Fusion to overcome visual‑modality bias in end‑to‑end fine‑tuned large representation models, delivering significant gains over baseline multimodal LLMs and traditional retrieval methods in recommendation tasks.

AI research · Recommendation Systems · contrastive learning

0 likes · 11 min read

Alibaba Cloud Big Data AI Platform

Oct 16, 2024 · Artificial Intelligence

How VICTORIA Revolutionizes Multi‑Object Image Editing with Language‑Aware Diffusion

The VICTORIA algorithm, presented by Alibaba Cloud AI Platform PAI and South China University of Technology at ACM MM 2024, leverages linguistic dependency parsing to guide cross‑attention in Stable Diffusion, enabling accurate, training‑free multi‑object image editing while preserving spatial structure and achieving state‑of‑the‑art results on benchmark datasets.

AI research · Diffusion Models · Stable Diffusion

0 likes · 10 min read

Alibaba Cloud Big Data AI Platform

Oct 15, 2024 · Artificial Intelligence

How VICTORIA Boosts Text‑Guided Image Editing with Language‑Aware Diffusion

The VICTORIA algorithm, presented by Alibaba Cloud's PAI team at ACM MM2024, leverages linguistic dependency parsing and cross‑attention control to overcome multi‑object editing challenges in training‑free text‑guided image editing, delivering precise, structure‑preserving results across diverse scenes.

AI research · Diffusion Models · image manipulation

0 likes · 6 min read

Network Intelligence Research Center (NIRC)

Oct 8, 2024 · Artificial Intelligence

Two NIRC Papers Accepted at NeurIPS 2024: FM-Delta Compression and GLAFF Forecasting

The Beijing University of Posts and Telecommunications' Network Intelligence Research Center (NIRC) had two papers accepted to NeurIPS 2024, presenting FM-Delta, a lossless compression technique that halves storage and cuts cloud costs by over 40%, and GLAFF, a global‑local fusion framework that markedly improves the robustness of time‑series forecasting across multiple domains.

AI research · FM-Delta · GLAFF

0 likes · 8 min read

Baobao Algorithm Notes

Oct 7, 2024 · Artificial Intelligence

Decoding OpenAI’s o1: How RL and Process‑Supervised Reward Models Might Power the Next LLM

The author speculates on OpenAI’s o1 architecture, proposing that it relies on reinforcement learning guided by a generalizable, process‑supervised reward model, and outlines data collection, multi‑model generation, and training tweaks needed to realize such a system.

AI research · LLM · RLHF

0 likes · 8 min read

Fighter's World

Sep 30, 2024 · Artificial Intelligence

Exploring Google NotebookLM: Use Cases, Interaction Experience, and Key Insights

The author reviews Google NotebookLM, describing how it aids deep paper reading, boosts chat willingness with guided prompts, maintains conversation coherence through self‑play insights, highlights the audio‑overview feature, and reflects on AI concepts such as the "bitter lesson" and the limits of self‑play in open scenarios.

AI research · Google · LLM

0 likes · 22 min read

Kuaishou Tech

Sep 27, 2024 · Artificial Intelligence

XPSR: Cross‑modal Priors for Diffusion‑based Image Super‑Resolution

The paper introduces XPSR, a diffusion‑based image super‑resolution method that incorporates cross‑modal semantic priors from a large multimodal language model, achieving state‑of‑the‑art performance on both reference and no‑reference quality metrics across synthetic and real‑world video restoration tasks.

AI research · ECCV2024 · cross‑modal priors

0 likes · 8 min read

DataFunSummit

Sep 13, 2024 · Artificial Intelligence

Research on Domain Large Models by Fudan University Knowledge Workshop Lab

This article presents the Fudan University Knowledge Workshop Lab's comprehensive research on domain large models, covering background, domain adaptation, capability enhancement, collaborative workflows, challenges such as inference cost and alignment, and proposed solutions including source‑enhanced training, self‑correction mechanisms, and hybrid retrieval‑augmented generation.

AI research · Knowledge Graphs · domain adaptation

0 likes · 16 min read

Baobao Algorithm Notes

Sep 5, 2024 · Artificial Intelligence

Why Small LLMs Are the Secret Weapon for Scaling Large Model Research

The article explains how homologous small language models—trained on the same tokenizer and data as their large counterparts—serve as cheap, fast experimental platforms that can predict large‑model performance, guide pre‑training decisions, and support techniques like distillation and reward modeling.

AI research · LLM scaling · Qwen2

0 likes · 13 min read

360 Tech Engineering

Aug 29, 2024 · Artificial Intelligence

FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance

FancyVideo is an open‑source UNet‑based video generation model that supports arbitrary resolutions, aspect ratios, styles, and motion dynamics by introducing a Cross‑frame Textual Guidance Module (CTGM) with temporal injectors, refiners, and boosters, achieving state‑of‑the‑art results on multiple benchmarks and enabling versatile applications such as video extension, backtracking, and frame interpolation.

AI research · UNet · Video Generation

0 likes · 6 min read

AntTech

Aug 28, 2024 · Artificial Intelligence

Ant Group’s Selected Papers at KDD2024: Abstracts and Highlights

The article presents a curated collection of Ant Group's research papers accepted at KDD2024, summarizing each paper's title, type, link, source, relevant fields, and abstract, covering topics such as graph mining, large language models, fraud detection, recommendation systems, and multimodal medical AI.

AI research · Ant Group · KDD2024

0 likes · 31 min read

Alibaba Cloud Big Data AI Platform

Aug 20, 2024 · Artificial Intelligence

How DAFNet Enables Efficient Sequential Editing of Large Language Models

This article introduces DAFNet, a dynamic auxiliary fusion framework that enables efficient sequential editing of large language models by injecting knowledge with reduced resource costs while preserving model reliability, generalization, and mitigating hallucination, and details its dataset, architecture, and evaluation results.

AI research · dynamic auxiliary fusion · model editing

0 likes · 10 min read

Alibaba Cloud Big Data AI Platform

Aug 19, 2024 · Artificial Intelligence

How Long‑Tail Knowledge Boosts Retrieval‑Augmented Large Language Models

The paper introduces a method that classifies user queries into ordinary and long‑tail types, applying retrieval‑augmented generation only to long‑tail queries, which improves large language model efficiency and accuracy by leveraging specialized knowledge detection metrics and an extended RAG pipeline.

AI research · ECE metric · Retrieval Augmented Generation

0 likes · 9 min read

Alibaba Cloud Big Data AI Platform

Aug 11, 2024 · Artificial Intelligence

Alibaba Cloud PAI’s Breakthroughs in Chinese Diffusion, Prompting, and LLM Knowledge Editing

Recent ACL 2024 papers from Alibaba Cloud’s PAI platform showcase open‑source Chinese diffusion models, an interactive multi‑turn prompt generator, a long‑tail knowledge‑aware retrieval‑augmented LLM approach, and a dynamic fusion network for sequential model editing, all integrated into cloud services.

AI research · Diffusion Models · Retrieval Augmented Generation

0 likes · 11 min read

JD Retail Technology

Jul 15, 2024 · Artificial Intelligence

Can Task‑Aware Decoding Tame LLM Hallucinations? Insights from IJCAI 2024

This article reviews the IJCAI 2024‑presented Task‑aware Decoding (TaD) technique, explains how it mitigates large‑language‑model hallucinations when combined with Retrieval‑augmented Generation, and details experimental results, practical deployments, and future research directions.

AI research · IJCAI2024 · LLM

0 likes · 19 min read

21CTO

Jul 10, 2024 · Information Security

Did a Hacker Breach OpenAI’s Internal AI Discussions? Implications for Security

A New York Times report reveals that a hacker accessed OpenAI's internal messaging system, exposing employee discussions on AI advancements and sparking concerns about foreign espionage, internal security practices, and the broader national‑security implications of AI technology.

AI research · AI security · OpenAI

0 likes · 4 min read

DataFunSummit

Jul 9, 2024 · Artificial Intelligence

Applying Large Language Models to Recommendation Systems at Ant Group

This article details Ant Group's research on integrating large language models into recommendation pipelines, covering background challenges, knowledge extraction, teacher‑student distillation, experimental results, and practical Q&A for improving bias, efficiency, and cold‑start performance.

AI research · Ant Group · Large Language Models

0 likes · 14 min read

Baobao Algorithm Notes

Jul 9, 2024 · Artificial Intelligence

Why Step-Level DPO Is Revolutionizing LLM Math Reasoning

This article reviews recent step‑level DPO research, compares it with instance‑level DPO, explains the underlying Monte Carlo Tree Search formulation, and presents the author’s own replication experiments that demonstrate consistent performance gains across multiple LLM sizes on GSM8K and MATH benchmarks.

AI research · LLM alignment · MCTS

0 likes · 10 min read

Meituan Technology Team

Jun 27, 2024 · Artificial Intelligence

Meituan Technical Team's Three Papers Accepted at SIGIR 2024: Ad Auction Integration, Federated Recommendation, and POI Recommendation

The article highlights three Meituan research papers accepted at SIGIR 2024—covering deep automated mechanism design for ad auction, a retrieval‑enhanced vertical federated recommendation framework, and disentangled contrastive hypergraph learning for next POI recommendation—and announces an online sharing event where the authors will present their work.

AI research · Ad Auction · Federated Recommendation

0 likes · 9 min read

Alibaba Cloud Developer

Jun 27, 2024 · Artificial Intelligence

How to Supercharge Retrieval‑Augmented Generation: Papers, Techniques, and Real‑World Tips

This article surveys the main challenges of deploying large language models, introduces key RAG optimization papers such as RAPTOR, Self‑RAG, and CRAG, and compiles practical engineering tricks—including chunking, query rewriting, hybrid and progressive retrieval—to help practitioners build more accurate and efficient RAG systems.

AI research · LLM optimization · RAG

0 likes · 22 min read

Xiaohongshu Tech REDtech

Jun 20, 2024 · Artificial Intelligence

Xiaohongshu 2024 Large Model Frontier Paper Sharing Live Event

On June 27, 2024, Xiaohongshu’s technical team will livestream a two‑hour session across WeChat Channels, Bilibili, Douyin and Xiaohongshu, showcasing six top‑conference papers on large‑model advances—including early‑stopping and fine‑grained self‑consistency, novel evaluation methods, negative‑sample‑assisted distillation, and LLM‑based note recommendation—followed by a Q&A and recruitment briefing.

AI research · Large Language Models · Model Evaluation

0 likes · 12 min read

Alibaba Cloud Big Data AI Platform

Jun 18, 2024 · Artificial Intelligence

Free-Prompt-Editing: Efficient Text-Guided Image Editing with Stable Diffusion

The paper introduces Free-Prompt-Editing (FPE), a novel, efficient algorithm for text‑guided image editing that leverages probe analysis of cross‑ and self‑attention maps in Stable Diffusion, demonstrates its superiority over existing methods through extensive experiments, and provides open‑source implementation for both synthetic and real‑image editing.

AI research · Stable Diffusion · attention maps

0 likes · 12 min read

Alibaba Cloud Big Data AI Platform

Jun 17, 2024 · Artificial Intelligence

How Free-Prompt-Editing Revolutionizes Text-Guided Image Editing with Stable Diffusion

The paper introduces Free-Prompt-Editing, a concise and efficient algorithm that replaces self‑attention maps during denoising to achieve high‑quality text‑guided image edits without source prompts, and demonstrates its superiority over existing methods on both synthetic and real images.

AI research · Free-Prompt-Editing · attention mechanisms

0 likes · 6 min read

DataFunTalk

Jun 15, 2024 · Artificial Intelligence

Research on Domain Large Models by Fudan University Knowledge Factory Lab

This article presents Fudan University's Knowledge Factory Lab research on domain large models, covering background, challenges, data selection, source‑enhanced tagging, capability improvements, self‑correction, collaborative workflows, and retrieval‑augmented generation for practical AI deployment.

AI research · Large Language Models · domain adaptation

0 likes · 16 min read

DataFunSummit

Jun 6, 2024 · Artificial Intelligence

MetaGPT: Multi‑Agent Collaboration and Agent Capability Enhancement

This article introduces MetaGPT, an open‑source multi‑agent framework that leverages large language models to automate software development, data science, and simulation tasks, detailing its development, impact, experimental results, memory and reasoning enhancements, and comparisons with related systems.

AI research · Agent Memory · LLM agents

0 likes · 21 min read

NewBeeNLP

May 28, 2024 · Artificial Intelligence

How Generative Models Are Redefining Recommendation Systems

This article reviews recent advances in generative recommendation, highlighting challenges such as item representation and multimodal fusion, and summarizing four key research papers that propose novel tokenization, collaborative integration, and transformer-based multimodal approaches to improve recommendation performance.

AI research · Generative Recommendation · LLM

0 likes · 8 min read

360 Tech Engineering

May 17, 2024 · Artificial Intelligence

360VL: An Open‑Source Multimodal Large Language Model Based on Llama‑3‑70B

The article introduces 360VL, an open‑source multimodal large language model built on Llama‑3‑70B, describes its novel C‑abs bridge architecture for high‑resolution visual understanding, outlines the two‑stage training with bilingual data, and presents benchmark results showing superior performance over prior LMMs.

AI research · Llama3 · Multimodal

0 likes · 8 min read

NewBeeNLP

May 15, 2024 · Artificial Intelligence

How Large Language Models and Knowledge Graphs Can Boost Each Other

This talk reviews recent advances in large language models, compares them with knowledge graphs, explores how LLMs enhance knowledge extraction and completion, examines how knowledge graphs aid LLM evaluation and safe deployment, and outlines future interactive integration between the two technologies.

AI research · Knowledge Graphs · Large Language Models

0 likes · 13 min read

Rare Earth Juejin Tech Community

May 15, 2024 · Artificial Intelligence

OpenAI Unveils GPT‑4o: An Omni‑Capable Multimodal Model Offered Free to All Users

OpenAI introduced GPT‑4o, a free, omni‑capable multimodal model that processes text, audio, and images together, delivers near‑human response latency, showcases impressive live demos, and will soon be available via a discounted API, marking a significant step forward in end‑to‑end AI research.

AI research · GPT-4o · Multimodal AI

0 likes · 7 min read

21CTO

Apr 8, 2024 · Artificial Intelligence

How Naver’s HyperCLOVA X Advances Multilingual AI for Asian Languages

Naver’s newly unveiled HyperCLOVA X large‑language model, detailed in an arXiv technical report, claims superior cross‑lingual reasoning for Asian languages, especially Korean, by pre‑training on a data mix of Korean, multilingual text and code, achieving state‑of‑the‑art translation and multilingual capabilities.

AI research · HyperCLOVA X · Korean NLP

0 likes · 4 min read

NewBeeNLP

Apr 7, 2024 · Artificial Intelligence

Can Large Language Models Learn Recommendation Knowledge? A NL‑Simulated Auxiliary Task

This article reviews a recent study that bridges the knowledge gap between large language models and recommendation systems by generating natural‑language auxiliary tasks, fine‑tuning the models, and achieving notable performance gains on Amazon domain benchmarks.

AI research · Fine-tuning · Large Language Models

0 likes · 4 min read
