Tagged articles
44 articles
AI Cyberspace
Feb 13, 2026 · Artificial Intelligence

How Attention Mechanisms Revolutionized Computer Vision and Machine Translation

This article traces the evolution of attention mechanisms from their inaugural application in computer vision and machine translation to their central role in modern Transformer models, detailing the underlying RNN‑Attention designs, the breakthrough in sequence alignment, and the innovations that enabled high‑performance, parallelizable deep learning architectures.

Attention Mechanism · Computer Vision · Deep Learning
0 likes · 14 min read
HyperAI Super Neural
Jan 9, 2026 · Artificial Intelligence

How HY-MT1.5 Achieves 1 GB Mobile Translation with a 1.8B Model

The article explains how Tencent's open‑source HY‑MT1.5 tackles the high‑cost, large‑parameter barrier of neural machine translation by offering a 1.8 B‑parameter model that runs on roughly 1 GB of RAM, processes 50 tokens in 0.18 s, supports 33 languages, and uses on‑policy distillation to retain top‑tier accuracy, while providing a step‑by‑step online demo and free compute credits for new users.

HY-MT1.5 · Mobile AI · On-Policy Distillation
0 likes · 5 min read
Bilibili Tech
Oct 31, 2025 · Artificial Intelligence

RIVAL: Adversarial RL Framework Elevates Conversational Subtitle Translation

RIVAL (Reinforcement Learning with Iterative and Adversarial Optimization) introduces an adversarial game between a reward model and a translation LLM, combining qualitative preference rewards with quantitative metrics like BLEU, to overcome distribution shift in RLHF and achieve superior performance on conversational subtitle and WMT translation tasks.

BLEU · LLM · Reward Modeling
0 likes · 13 min read
Tencent Technical Engineering
Apr 16, 2025 · Artificial Intelligence

Understanding Transformer Architecture for Chinese‑English Translation: A Practical Guide

This practical guide walks through the full Transformer architecture for Chinese‑to‑English translation, detailing encoder‑decoder structure, tokenization and embeddings, batch handling with padding and masks, positional encodings, parallel teacher‑forcing, self‑ and multi‑head attention, and the complete forward and back‑propagation training steps.

Positional Encoding · PyTorch · Self-Attention
0 likes · 26 min read
vivo Internet Technology
Feb 12, 2025 · Artificial Intelligence

Bidirectional Optimization of NLLB-200 and ChatGPT for Low-Resource Language Translation

The paper proposes a bidirectional optimization framework that fine‑tunes the low‑resource NLLB‑200 translation model with LoRA using data generated by ChatGPT, while also translating low‑resource prompts with NLLB before feeding them to LLMs, thereby improving multilingual translation quality yet requiring careful validation of noisy synthetic data.

Fine-tuning · LLM · LoRA
0 likes · 28 min read
Baobao Algorithm Notes
Nov 24, 2024 · Artificial Intelligence

How Marco‑o1 Merges Chain‑of‑Thought Fine‑Tuning with Monte‑Carlo Tree Search for Superior Reasoning

The article introduces Marco‑o1, an open‑source LLM that enhances complex reasoning by fine‑tuning on Chain‑of‑Thought data, integrating Monte‑Carlo Tree Search, introducing mini‑step actions and a reflection mechanism, and evaluates its performance on multilingual math and translation benchmarks.

LLM · Monte Carlo Tree Search · artificial intelligence
0 likes · 15 min read
System Architect Go
Oct 24, 2024 · Artificial Intelligence

How to Fine‑Tune Translation Models on Kubernetes Docs with LoRA

This article walks through the complete process of fine‑tuning both domain‑specific and large‑language translation models on Kubernetes documentation, covering data preparation, model selection, training configurations, the differences between Seq2Seq and CausalLM, and how LoRA can dramatically reduce resource usage while improving performance.

AI · Fine-tuning · LLM
0 likes · 7 min read
Baidu Tech Salon
Jun 24, 2024 · Artificial Intelligence

Paperpolisher: AI-Powered Academic Paper Translation and Polishing Assistant

Paperpolisher is an AI-powered tool using Baidu's ERNIE large model and Comate to translate and polish Chinese academic papers into high-quality English, leveraging large paper datasets and retrieval augmentation, streamlining code generation and improving acceptance chances for submissions to top conferences.

AI coding assistant · Baidu Comate · ERNIE large model
0 likes · 9 min read
Rare Earth Juejin Tech Community
Jun 12, 2024 · Artificial Intelligence

A Simple Introduction to the Transformer Model

This article provides a comprehensive, beginner-friendly explanation of the Transformer architecture, covering its encoder‑decoder structure, self‑attention, multi‑head attention, positional encoding, residual connections, decoding process, final linear and softmax layers, and training considerations, illustrated with numerous diagrams and code snippets.

Deep Learning · Neural Networks · Self-Attention
0 likes · 24 min read
DataFunSummit
Mar 3, 2024 · Artificial Intelligence

Instruction Fine-Tuning Practices for Huawei's Pangu Large Language Model

This presentation details the concepts, methodologies, and experimental results of instruction fine‑tuning for Huawei's Pangu large language model, covering model scale, architecture, training strategies, data quality, parallelism techniques, and case studies on Chinese‑English translation and Thai language adaptation.

Efficient Fine-Tuning · instruction fine-tuning · machine translation
0 likes · 19 min read
DataFunTalk
Sep 19, 2023 · Artificial Intelligence

Simultaneous Speech Translation: Technical Background, System Architecture, and Key Challenges

This article reviews the technical background of simultaneous speech translation, compares offline and real‑time scenarios, details ASR and MT technologies, describes the system architecture and design strategies, and discusses the major challenges and solutions for deploying robust, low‑latency translation services.

ASR · Huawei · Real-Time
0 likes · 16 min read
21CTO
Apr 27, 2023 · Artificial Intelligence

Demystifying Transformers: A Step‑by‑Step Guide to Self‑Attention and Architecture

This article explains the Transformer model—from its encoder‑decoder structure and self‑attention mechanism to multi‑head attention, positional encoding, residual connections, training loss, and inference strategies—providing a clear, visual walkthrough for readers new to modern NLP architectures.

Deep Learning · Self-Attention · Transformer
0 likes · 21 min read
NetEase LeiHuo Testing Center
Mar 31, 2023 · Artificial Intelligence

Comparative Evaluation of Deepl and ChatGPT Machine Translation for Game Localization

This article investigates the translation quality of DeepL and ChatGPT for the game 'Naraka: Bladepoint' by comparing their outputs against professional human translations across Chinese‑English, Chinese‑Spanish, and English‑Spanish pairs using BLEU scores and manual assessment, revealing strengths and limitations of each system.

AIGC · BLEU · ChatGPT
0 likes · 12 min read
Model Perspective
Nov 17, 2022 · Artificial Intelligence

How Mathematics Sparked the Rise of Modern Linguistics and NLP

This article traces the historical convergence of mathematics and linguistics, from 19th‑century pioneers to post‑war computer‑driven research, highlighting how statistical, probabilistic, and formal methods laid the foundation for machine translation, morphological analysis, and contemporary natural language processing.

history of linguistics · machine translation · mathematical linguistics
0 likes · 7 min read
DataFunTalk
Sep 27, 2022 · Artificial Intelligence

Contrastive Learning for Text Generation: Motivation, Methodology, Experiments, and Discussion (CoNT Framework)

This article reviews the integration of contrastive learning into text generation, explains why it helps mitigate exposure bias, introduces the CoNT framework with three key improvements, presents extensive experiments on translation, summarization, code comment and data‑to‑text tasks, and discusses practical deployment considerations.

AI · CoNT · Text Generation
0 likes · 21 min read
DataFunTalk
Jul 30, 2022 · Artificial Intelligence

Technical Analysis of Huawei’s Offline Speech‑to‑Text and Length‑Constrained Speech Translation Systems in IWSLT 2022

This article reviews the IWSLT 2022 competition tasks, explains Huawei’s cascade offline speech‑to‑text translation pipeline, details four major technical innovations—including ensemble‑based ASR de‑noise, context‑aware re‑ranking, domain‑controlled training, and length‑control strategies—and presents experimental results that demonstrate Huawei’s leading performance across multiple language directions.

ASR · Huawei · IWSLT
0 likes · 18 min read
21CTO
Jul 9, 2022 · Artificial Intelligence

Meta Unveils NLLB-200: Open‑Source AI Model Translating 200 Languages

Meta has open‑sourced its new NLLB‑200 model, a single AI system that translates 200 languages with up to 44 % higher quality than its predecessor, supporting numerous low‑resource languages and powering billions of daily translations across Facebook and Instagram to improve user experience and content safety.

Meta · NLLB-200 · machine translation
0 likes · 3 min read
DataFunTalk
Jan 16, 2022 · Artificial Intelligence

DeltaLM: A Multilingual Pretrained Encoder‑Decoder Model for Neural Machine Translation and Zero‑Shot Transfer

DeltaLM is a new multilingual pretrained encoder‑decoder model that leverages a pretrained encoder and a novel decoder to improve multilingual neural machine translation, offering efficient training, strong cross‑language transfer, zero‑shot translation, and superior performance on various translation and summarization tasks.

DeltaLM · NMT · machine translation
0 likes · 13 min read
DataFunSummit
Nov 18, 2021 · Artificial Intelligence

Enterprise Applications and Research of Speech Translation

This article reviews recent advances in speech translation, discusses ByteDance's practical deployments, compares cascade and end‑to‑end modeling approaches, introduces improved encoder‑decoder architectures and training strategies, and reports state‑of‑the‑art results on the IWSLT 2021 benchmark.

AI · ByteDance · End-to-End
0 likes · 15 min read
DataFunTalk
Oct 5, 2021 · Artificial Intelligence

From Technology to Experience: Vivo Machine Translation Deployment Practice

This article presents a comprehensive guide to deploying machine translation at Vivo, covering business analysis, algorithm choices beyond standard NMT, language detection challenges, data collection and cleaning, scientific evaluation methods, and engineering optimizations to deliver a seamless user experience.

AI · Engineering · NMT
0 likes · 20 min read
Volcano Engine Developer Services
Sep 25, 2021 · Artificial Intelligence

Cutting‑Edge AI from ByteDance & OPPO: Audio, NLP, and Translation

The ByteDance Engine Developer Community Meetup featured senior engineers from ByteDance and OPPO who presented the latest advances in intelligent audio signal processing, natural language processing for recommendation, entity linking in knowledge graphs, and multimedia machine translation, highlighting practical applications and performance challenges.

Knowledge Graph · Recommendation Systems · artificial intelligence
0 likes · 4 min read
Tencent Tech
Jul 22, 2021 · Artificial Intelligence

How Tencent Dominated WMT2021: Winning Five News‑Track Translation Tasks

Tencent’s machine‑translation teams clinched five first‑place wins in the WMT2021 news track—covering Chinese‑English, Japanese‑English and English‑German limited‑resource tasks—outperforming 82 competing teams and showcasing the impact of its AI‑driven translation engine across its products.

AI competition · BLEU · Tencent
0 likes · 4 min read
iQIYI Technical Product Team
Jul 9, 2021 · Artificial Intelligence

iQIYI Multi‑Language Subtitle Machine Translation: Practice, Model Exploration, and Deployment

iQIYI’s multi‑language subtitle machine‑translation system combines a one‑to‑many transformer, context‑fusion encoding, four custom attention masks, masked language modeling, global decoding loss, reconstruction and error‑correction modules, plus pronoun, idiom and name‑handling tricks, achieving higher quality than third‑party services and even surpassing human translation for several languages.

Error Correction · One-to-Many Model · Subtitle Translation
0 likes · 17 min read
DataFunTalk
Feb 20, 2021 · Artificial Intelligence

Industrial-Scale Machine Translation at Bytedance: Applications, Demos, and Research Advances

This article presents ByteDance's industrial machine‑translation platform, describing its global deployment, diverse product demos, underlying sequence‑to‑sequence models, BERT‑enhanced training strategies, prune‑tune sparsity techniques, multilingual pre‑training, document translation, and a high‑performance inference engine.

BERT · machine translation · multilingual NLP
0 likes · 19 min read
DataFunTalk
Feb 9, 2021 · Artificial Intelligence

Multimodal AI Research: Video-Aware Dialog, Dual-Channel Reasoning, and Multimodal Machine Translation

This article surveys recent multimodal AI research, covering video scene‑aware dialog with a GPT‑2 based unified pre‑training framework, dual‑channel multi‑hop reasoning for visual dialog, capsule‑network‑enhanced multimodal machine translation, and graph‑neural‑network‑driven multimodal translation, highlighting experimental results and future directions.

Graph Neural Network · Multimodal AI · Multimodal Learning
0 likes · 12 min read
New Oriental Technology
Feb 1, 2021 · Artificial Intelligence

Neural Machine Translation: Seq2Seq, Beam Search, BLEU, Attention Mechanisms, and GNMT Improvements

This article explains key concepts of neural machine translation, covering Seq2Seq encoder‑decoder models, beam search strategies, BLEU evaluation, various attention mechanisms, and the enhancements introduced in Google's Neural Machine Translation system to improve speed, OOV handling, and translation quality.

BLEU · Beam Search · GNMT
0 likes · 11 min read
DataFunTalk
Jan 10, 2021 · Artificial Intelligence

Didi's Machine Translation System: Architecture, Techniques, and WMT2020 Competition Experience

This article presents a comprehensive overview of Didi's machine translation platform, covering its evolution from statistical to neural models, the Transformer architecture with relative position and larger FFN, data preparation, training strategies such as back‑translation and knowledge distillation, deployment optimizations with TensorRT, and the team's successful participation in the WMT2020 news translation task.

BLEU · Neural Networks · TensorRT
0 likes · 14 min read
Ctrip Technology
Nov 12, 2020 · Artificial Intelligence

Ctrip Machine Translation Platform: Architecture, Data Construction, Algorithm Design, and Performance Optimization

This article presents a comprehensive overview of Ctrip's multilingual machine translation platform, detailing demand analysis, system architecture, data pipeline, algorithmic innovations such as task‑space fusion and term‑translation interventions, as well as extensive performance optimizations for low‑resource languages.

AI · Ctrip · Model Optimization
0 likes · 20 min read
Didi Tech
Oct 27, 2020 · Artificial Intelligence

Didi's Machine Translation System: Architecture, Techniques, and WMT2020 Competition Experience

Didi's machine translation system combines a Transformer‑big architecture with relative position representations, enlarged feed‑forward networks, iterative back‑translation, knowledge distillation, and domain fine‑tuning. Optimized with TensorRT for inference speed, it achieved a BLEU score of 36.6 and third place in the WMT2020 Chinese‑to‑English news task.

BLEU · Neural Networks · TensorRT
0 likes · 15 min read
DataFunTalk
May 6, 2020 · Artificial Intelligence

Application of Large-Scale Pretrained Models in Alibaba Machine Translation

This article reviews how large‑scale pretrained language models have reshaped NLP, outlines the challenges of applying them to machine translation, introduces the APT framework and the GRET architecture for better encoder‑decoder integration, and reports experimental gains and future research directions.

AI · APT framework · GRET
0 likes · 10 min read
21CTO
Apr 20, 2020 · Artificial Intelligence

Why DeepL’s Neural Translation Beats Google: Inside the AI Engine

This article examines DeepL’s translation system, comparing its neural‑network‑driven output to Google and other services, detailing its Icelandic HPC infrastructure, data collection, architectural choices, language support, strengths, limitations, and expert opinions on why it often delivers more natural translations.

AI · Comparison · HPC
0 likes · 9 min read
DataFunTalk
Apr 10, 2020 · Artificial Intelligence

Improving Machine Translation: Addressing Exposure Bias, Efficient Decoding, and Non‑Autoregressive Models

This article reviews recent research on machine translation that tackles the training‑inference distribution gap, exposure bias, and slow autoregressive decoding by introducing scheduled sampling, differentiable sequence‑level losses, cube‑pruning, and sequence‑aware non‑autoregressive decoding, showing BLEU gains and significant speedups.

BLEU · NLP · cube pruning
0 likes · 16 min read
Qunar Tech Salon
Sep 12, 2019 · Artificial Intelligence

A Comprehensive Overview of Attention Mechanisms in Deep Learning

This article systematically reviews the history, core concepts, variants, and practical implementations of attention mechanisms—from early additive and multiplicative forms to self‑attention, multi‑head attention, and recent transformer‑based models—highlighting why attention has become fundamental in modern AI research.

Deep Learning · NLP · Self-Attention
0 likes · 16 min read
Ctrip Technology
May 21, 2019 · Artificial Intelligence

A Brief Overview of Machine Translation: History, Neural Models, and Practical Insights

This article surveys the evolution of machine translation from early rule‑based systems to modern neural architectures, explains how translation engines are trained, highlights recent advances such as attention and Transformers, and shares practical experience and current challenges in the field.

Attention Mechanism · Neural Networks · Transformer
0 likes · 11 min read
ITPUB
Mar 6, 2019 · Artificial Intelligence

Why WeChat’s Translation Glitches Reveal Hidden AI Challenges

A recent WeChat translation bug that turned a name into bizarre Chinese phrases sparked a deep dive into neural machine translation, exposing algorithmic shortcomings, training‑data biases, and the broader uncertainties that affect modern AI‑driven translators.

AI · NMT · Neural Networks
0 likes · 10 min read
DataFunTalk
Mar 6, 2019 · Artificial Intelligence

Baidu Chinese Text Correction Technology Overview

This article presents a comprehensive overview of Baidu's Chinese text correction technology, covering its background, error types, system architecture, key detection, candidate recall and ranking methods, core language and knowledge techniques, and real-world applications in open-domain and scenario-specific contexts.

Baidu · machine translation · text correction
0 likes · 13 min read
DataFunTalk
Feb 27, 2019 · Artificial Intelligence

Human‑Interactive Machine Translation: Research, Techniques, and Productization

This article reviews the current state of machine translation, explores the challenges of ambiguity, quality, and domain specificity, and presents human‑in‑the‑loop translation techniques—including attention‑enhanced models, transformer architectures, and online learning—while discussing practical productization and deployment considerations.

AI productization · Human-in-the-Loop · Online Learning
0 likes · 16 min read
Hulu Beijing
Sep 27, 2018 · Artificial Intelligence

From Rules to Neural Networks: The Evolution of Machine Translation

This article traces the history of machine translation—from early rule‑based systems through statistical models that leveraged parallel corpora, to modern neural network approaches—while highlighting current applications, challenges, and future directions in the field.

AI applications · Neural Networks · machine translation
0 likes · 9 min read
AntTech
Aug 1, 2018 · Artificial Intelligence

Highlights and Paper Summaries from ACL 2018 Conference

An extensive overview of ACL 2018, featuring acceptance statistics, award-winning papers, tutorial insights, and concise summaries of notable research across machine translation, semantic parsing, question answering, domain adaptation, text classification, summarization, dialogue systems, generation, and related tools.

ACL 2018 · Dialogue Systems · NLP
0 likes · 12 min read
Hulu Beijing
Dec 20, 2017 · Artificial Intelligence

How Attention Mechanisms Transform Seq2Seq Models for Better Translation

This article explains why attention mechanisms were introduced into Seq2Seq models, how they address the limitations of fixed‑length encoding, the role of bidirectional RNNs, and showcases their impact on machine translation and image captioning with illustrative diagrams.

Attention Mechanism · RNN · Seq2Seq
0 likes · 10 min read
Alibaba Cloud Developer
Sep 20, 2016 · Artificial Intelligence

What ACL 2016 Tutorials Reveal About the Future of NLP and Deep Learning

The article reviews ACL 2016’s tutorial program, summarizing key talks on computer‑aided translation, neural machine translation, semantic sense representation, short‑text understanding, and highlights selected papers on multimodal translation, coverage modeling, and language‑vision grounding, illustrating deep learning’s impact on NLP research.

ACL 2016 · Deep Learning · NLP
0 likes · 13 min read
Architects Research Society
Oct 4, 2015 · Artificial Intelligence

Bayesian Thinking on Your Feet: Embedding Generative Models in Reinforcement Learning for Sequentially Revealed Data

This NSF‑funded project aims to develop algorithms that incrementally process partially observed data, integrating generative models with reinforcement‑learning policies to decide when to act, applied to simultaneous machine translation and quiz‑bowl style question answering.

Bayesian inference · Generative Models · machine translation
0 likes · 4 min read