Tagged articles

383 articles

Feb 1, 2023 · Artificial Intelligence

Video Object of Interest Segmentation (VOIS): Task, Dataset, and Dual-Path Transformer Approach

The paper presents Video Object of Interest Segmentation (VOIS), a new e‑commerce task that locates and segments video instances matching a given product image, introduces the LiveVideos dataset of 2,418 Taobao live‑stream clips, and proposes a dual‑path Swin‑Transformer with cross‑fusion modules that outperforms existing VOS/VIS baselines.

Dataset · Transformer · instance segmentation

0 likes · 11 min read

DataFunSummit

Jan 15, 2023 · Artificial Intelligence

Intelligent Writing: AIGC Technologies, Models, Evaluation Metrics, and Real‑World Applications

This article surveys the evolution of AI‑generated content for intelligent writing, covering its definition, key technologies from RNN Seq2Seq to Transformer‑based models such as UniLM, T5, BART and GPT series, evaluation datasets and metrics, product deployments by Datagrand, and the remaining challenges and future directions.

AI writing · AIGC · GPT

0 likes · 25 min read

DataFunSummit

Jan 14, 2023 · Artificial Intelligence

Key Transformer Model Papers Across Language, Vision, Speech, and Time‑Series Domains

This article surveys the most influential Transformer‑based research papers, from the original "Attention Is All You Need" to recent models such as Autoformer and FEDformer, covering breakthroughs in natural language processing, computer vision, speech recognition, and long‑term time‑series forecasting, and provides download links for each.

AI · Time-Series Forecasting · Transformer

0 likes · 17 min read

21CTO

Jan 13, 2023 · Artificial Intelligence

How Google’s Muse Is Redefining Text‑to‑Image Generation with Parallel Decoding

Google’s new Muse model, a Transformer‑based text‑to‑image system running on TPUv4, claims to generate 256×256 images in 0.5 seconds—far faster than Imagen—while delivering unprecedented photorealism and deep language understanding through parallel decoding and large‑scale LLM‑conditioned training.

AI research · Google Muse · LLM conditioning

0 likes · 4 min read

AntTech

Dec 19, 2022 · Artificial Intelligence

TransVCL: Attention‑Enhanced Video Copy Localization Network with Flexible Supervision

TransVCL introduces an end‑to‑end attention‑enhanced video copy localization network that leverages a custom Transformer, correlation‑Softmax similarity matrix, and temporal alignment module, combined with a semi‑supervised learning framework, achieving state‑of‑the‑art performance on VCSL and VCDB benchmarks.

AI · Semi-supervised Learning · Transformer

0 likes · 13 min read

DataFunTalk

Dec 17, 2022 · Artificial Intelligence

Efficient Spatiotemporal Self‑Attention Transformer (Patch Shift Transformer) for Video Action Recognition

This article introduces a lightweight spatiotemporal self‑attention transformer, called Patch Shift Transformer, which achieves competitive video action recognition performance on datasets such as Kinetics‑400, Sth‑v1/v2, and Diving48 without increasing computational cost or parameters, and details its design, experiments, and speed advantages.

ECCV 2022 · Transformer · patch shift

0 likes · 5 min read

Rare Earth Juejin Tech Community

Dec 8, 2022 · Artificial Intelligence

ChatGPT: Development History, Technical Principles, and Future Investment Trends

This article reviews ChatGPT’s rapid rise, compares it with GPT‑3, explains the underlying transformer and reinforcement‑learning‑from‑human‑feedback technologies, outlines the evolution of natural‑language processing, and discusses emerging AI investment opportunities and future trends.

AI · ChatGPT · NLP

0 likes · 12 min read

HelloTech

Oct 19, 2022 · Artificial Intelligence

Intelligent Creative System: Types, Quality Evaluation, Generation Models, and Optimization

The Intelligent Creative System defines advertising creatives across formats, evaluates image and text quality using reference‑based metrics and models like DeepBIQ, generates multimodal ads via GANs and Transformers, and selects optimal variants through bandit‑based CTR prediction and multimodal fusion, enabling scalable, data‑driven creative production.

AI · Bandit Model · GAN

0 likes · 10 min read

Rare Earth Juejin Tech Community

Oct 10, 2022 · Artificial Intelligence

A Beginner’s Journey into Vision Transformers (ViT) for Computer Vision Engineers

This article introduces the fundamentals of Vision Transformers (ViT) for computer‑vision developers, starting with an overview of the transformer architecture, followed by a detailed explanation of self‑attention and multi‑head attention and step‑by‑step PyTorch code examples that illustrate query, key, and value computation and attention scoring.

PyTorch · Self-Attention · Transformer

0 likes · 12 min read

HaoDF Tech Team

Oct 8, 2022 · Artificial Intelligence

Exploring Transformer Technology and Its Applications in NLP, Computer Vision, and OCR at Haodf.com

This article introduces the Transformer architecture, explains its attention mechanism, details its adaptations for natural language processing, computer vision, and OCR tasks, and presents experimental results of various models such as BERT, ELECTRA, Swin Transformer, and CRNN-BCN on large-scale medical data from Haodf.com.

Model Evaluation · NLP · OCR

0 likes · 39 min read

Kuaishou Audio & Video Technology

Sep 29, 2022 · Artificial Intelligence

How DeViT Revolutionizes Video Inpainting with Deformed Vision Transformers

The article introduces DeViT, a novel Deformed Vision Transformer framework for video inpainting that leverages a deformable patch homography estimator, mask‑pruned attention, and spatio‑temporal weight adaptation, achieving state‑of‑the‑art results on benchmark datasets and highlighting its potential for advanced video editing tools.

DeViT · Multimedia · Transformer

0 likes · 10 min read

ELab Team

Sep 23, 2022 · Artificial Intelligence

Fine‑Tune a Chinese BERT Model for Cloze Tasks in 30 Minutes

This tutorial walks you through NLP fundamentals, the evolution of BERT, the concept of pre‑trained models, and a step‑by‑step guide to fine‑tune a Chinese BERT on a cloze‑style task, complete with code snippets and verification results.

BERT · Chinese · Cloze Task

0 likes · 13 min read

Alimama Tech

Sep 21, 2022 · Artificial Intelligence

EXTR: Click-Through Rate Prediction with Externalities in E-Commerce Sponsored Search

The paper introduces EXTR, a Transformer‑based CTR prediction model that jointly encodes diverse externalities from surrounding organic results and ads and infers missing ad placements via a Potential Allocation Generator. It achieves better AUC, COPC, and LogLoss on Taobao data and has been deployed in Alibaba’s advertising system.

Advertising · Externalities · Transformer

0 likes · 11 min read

Architects' Tech Alliance

Aug 31, 2022 · Artificial Intelligence

Performance Evaluation of Transformer Models on the Inspur NF5488A5 GPU Server

This article presents a detailed benchmark of four Transformer models of varying sizes trained on the high‑end Inspur NF5488A5 GPU server, compares its NVSwitch‑based interconnect with a PCIe‑based system, and analyzes the impact of model scale, tensor parallelism, and hardware bandwidth on training efficiency.

DeepSpeed · GPU server · Megatron-LM

0 likes · 12 min read

Alibaba Cloud Developer

Jul 28, 2022 · Artificial Intelligence

Unlock Chinese Text‑to‑Image Generation with EasyNLP: Models, Code & Tutorials

This article introduces EasyNLP's Chinese text‑to‑image generation framework, explains the underlying Transformer‑VQGAN architecture, provides model specifications, showcases sample outputs, and offers step‑by‑step code and command‑line instructions for fine‑tuning and inference.

Chinese AI · EasyNLP · Transformer

0 likes · 20 min read

Alibaba Cloud Big Data AI Platform

Jul 11, 2022 · Artificial Intelligence

How Structure-Aware Sparse Attention Boosts Long-Code Transformers

The SASA model, a structure‑aware sparse‑attention Transformer developed by Alibaba Cloud PAI and Prof. Gao Ming’s team, improves long‑code sequence processing by sparsifying self‑attention using top‑k frequency and AST pattern matrices, achieving higher performance and lower memory/computation costs on CodeXGLUE benchmarks.

AST · Code Understanding · Long Sequences

0 likes · 8 min read

DataFunTalk

Jul 9, 2022 · Artificial Intelligence

Graph Neural Networks Enter the Transformer Era – Seminar by Dr. Zheng Shuxin

The LOGS seminar on July 9, 2022 featured Dr. Zheng Shuxin from Microsoft Research presenting an overview of Transformer models, their success in NLP and CV, recent breakthroughs in applying Transformers to graph data, and future directions for graph processing.

AI Seminar · Microsoft research · Transformer

0 likes · 4 min read

Xiaohongshu Tech REDtech

Jun 20, 2022 · Artificial Intelligence

Action Sequence Verification in Videos with CosAlignment Transformer (CAT)

The paper introduces Action Sequence Verification (ASV), a task that determines whether two videos follow the same ordered actions, provides the Chemical Sequence Verification dataset and re‑annotated COIN‑SV and Diving48‑SV sets, and proposes the CosAlignment Transformer (CAT) with intra‑step feature extraction, a Transformer‑based inter‑step encoder, and a sequence‑alignment loss that outperforms prior baselines and serves as a pre‑training model for video retrieval and classification.

Action Verification · Computer Vision · Dataset

0 likes · 7 min read

Python Programming Learning Circle

Jun 5, 2022 · Artificial Intelligence

The Rise of Hugging Face: From Emoji Logo to Leading AI Platform

From its quirky start as an iPhone chatbot for teenagers to its role as the central hub for open‑source transformer models, Hugging Face has grown into a fast‑rising AI platform, securing $100 million Series C funding, serving thousands of organizations, and aiming to democratize machine learning.

AI · Funding · Hugging Face

0 likes · 7 min read

DaTaobao Tech

May 24, 2022 · Artificial Intelligence

GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection

GEN‑VLKT introduces a Guided‑Embedding Network with position‑ and instance‑guided embeddings to remove costly post‑processing and leverages CLIP‑based visual‑linguistic knowledge transfer for interaction understanding, achieving state‑of‑the‑art HOI detection performance and zero‑shot capability, now deployed in Alibaba’s Taobao services.

CLIP · HOI detection · Transformer

0 likes · 7 min read

DataFunTalk

May 7, 2022 · Artificial Intelligence

Intelligent Recommendation Selling Point Generation: Architecture, Core AI Techniques, Model Development, and Product Impact

This article explains how JD's intelligent recommendation selling point system leverages NLP, BERT, Transformer and pointer‑generator models to automatically create short, personalized product highlights, describing the technical background, system architecture, model training pipeline, online/offline monitoring, and the resulting business benefits.

BERT · NLP · Recommendation Systems

0 likes · 13 min read

AntTech

Apr 27, 2022 · Artificial Intelligence

Pyraformer: Low-Complexity Pyramidal Attention for Long-Range Time Series Modeling and Forecasting

The paper introduces Pyraformer, a low‑complexity pyramidal‑attention Transformer that captures multi‑scale temporal dependencies with linear time and space complexity, achieving superior single‑step and long‑range forecasting performance on real‑world datasets while supporting green‑computing capacity management.

Pyraformer · Transformer · long-range dependencies

0 likes · 14 min read

DataFunSummit

Apr 25, 2022 · Artificial Intelligence

Token‑Level Pipeline Parallelism for Transformer‑based Language Models (TeraPipe)

The article introduces a token‑level pipeline parallelism strategy that splits the sequence‑length dimension of Transformer‑based language models, explains why this approach is feasible, presents a dynamic‑programming formulation for optimal slicing, discusses engineering challenges, and evaluates its performance on large GPT models.

Performance Optimization · Pipeline Parallelism · Token-level

0 likes · 13 min read

Code DAO

Apr 18, 2022 · Artificial Intelligence

Transformer‑Based Denoising AutoEncoder (TSDAE) for Job Description Embeddings (Job2Vec)

This article explains how TSDAE, a transformer‑based denoising auto‑encoder, converts noisy job description sentences into robust vector embeddings, details its training process, loss function, dataset preparation, and demonstrates using FAISS for similarity search on the resulting Job2Vec representations.

Autoencoder · FAISS · NLP

0 likes · 4 min read

Kuaishou Large Model

Apr 6, 2022 · Artificial Intelligence

How Transformers Revolutionize Image Style Transfer: Introducing StyTr²

This article reviews the limitations of traditional CNN‑based image stylization, explains how Transformer architectures overcome these issues with global context and self‑attention, and presents the novel StyTr² method with content‑aware positional encoding that achieves superior, detail‑preserving style transfer results.

Computer Vision · Deep Learning · Transformer

0 likes · 8 min read

IEG Growth Platform Technology Team

Feb 14, 2022 · Artificial Intelligence

Multimodal Evolution and Application in Tencent Game Advertising System

This article describes the end‑to‑end multimodal modeling pipeline—covering text, image, and video understanding, model evolution from shallow to deep networks, key‑frame extraction, fine‑tuning, and multimodal fusion—used in Tencent's game ad exchange platform, along with practical deployment challenges and solutions.

Advertising · CNN · Multimodal Learning

0 likes · 22 min read

Baobao Algorithm Notes

Jan 14, 2022 · Artificial Intelligence

Visualize Transformer Attention with BertViz: Install and Example Walkthrough

This guide introduces BertViz, an interactive visualization tool for transformer models such as BERT, GPT‑2 and T5, explains how to install it via pip along with required dependencies, and demonstrates head, model, and neuron view visualizations with code examples in Jupyter.

Attention Visualization · BertViz · NLP

0 likes · 6 min read

Baobao Algorithm Notes

Jan 14, 2022 · Artificial Intelligence

BERT Interview Q&A: Decoding CLS, Masks, Complexity, and More

An in‑depth Q&A breaks down core BERT concepts—from the purpose of the [CLS] token and masking strategies to self‑attention complexity, sparse attention tricks, subword handling of OOV words, warm‑up learning rates, GPT’s unidirectional nature, and ALBERT’s parameter sharing—providing concise explanations for each.

BERT · Masking · Self-Attention

0 likes · 7 min read

Baobao Algorithm Notes

Jan 7, 2022 · Interview Experience

Essential Transformer Interview Cheat Sheet: 11 Must‑Know Q&A

This concise guide presents eleven frequently asked Transformer interview questions with clear English explanations covering self‑attention formulas, scaling, alternative designs, LayerNorm vs. BatchNorm, positional embeddings, multi‑head mechanisms, and BPE tokenization, helping candidates deliver solid, theory‑backed answers.

BERT · Deep Learning · LayerNorm

0 likes · 6 min read

Kuaishou Tech

Jan 5, 2022 · Artificial Intelligence

How a New Bilingual Video Text Dataset and Transformer Spotter Advance Video OCR

This article reviews the NeurIPS 2021 paper introducing BOVText, a large‑scale bilingual video‑text dataset with over 2,000 videos and 1.75 million frames, and describes its transformer‑based end‑to‑end video text spotter that integrates EAST encoding into DETR, covering dataset collection, annotation, architecture, and experimental results.

BOVText · DETR · Transformer

0 likes · 12 min read

Code DAO

Dec 30, 2021 · Artificial Intelligence

Exemplar Transformers Enable 8× Faster CPU‑Compatible Visual Tracking

Researchers at ETH Zurich introduce Exemplar Transformers, a novel Transformer layer that accelerates visual object tracking by eight times, runs in real‑time on CPUs, and improves robustness when integrated into a Siamese‑based tracker, achieving state‑of‑the‑art performance on six benchmark datasets.

Benchmark · CPU · Siamese tracker

0 likes · 5 min read

Code DAO

Dec 17, 2021 · Artificial Intelligence

Applying UNETR Transformer for 3D Medical Image Segmentation

This article walks through using the UNETR transformer architecture to segment 3D brain MRI scans from the BRATS dataset, detailing environment setup, data preprocessing with MONAI, model construction, training with DiceCE loss, validation metrics, and visualizing the best‑performing model outputs.

3D segmentation · BRATS · MONAI

0 likes · 16 min read

Baobao Algorithm Notes

Dec 15, 2021 · Artificial Intelligence

Why Can BERT’s Token, Segment, and Position Embeddings Be Added? A Deep Dive into Positional Encoding

This article revisits the long‑standing question of why BERT’s token, segment, and position embeddings are summed, critiques earlier explanations, and presents findings from the ICLR‑2021 paper “Rethinking Positional Encoding in Language Pre‑training” that show removing the token‑position cross term speeds convergence and improves downstream GLUE scores.

BERT · Embedding · Language Pretraining

0 likes · 6 min read

Code DAO

Dec 14, 2021 · Artificial Intelligence

Building a Chess AI from Scratch: Combining AlphaZero and Transformers (Part 2)

This article walks through constructing a learnable chess AI by integrating AlphaZero‑style Monte Carlo Tree Search with a decoder‑only Transformer, detailing the game tree logic, model architecture, input and output encodings, self‑play training loop, and code implementation in PyTorch.

AlphaZero · MonteCarloTreeSearch · PyTorch

0 likes · 23 min read

Code DAO

Dec 7, 2021 · Artificial Intelligence

Key Deep Learning Architectures for Image Captioning: Encoders, Decoders, Attention & Multimodal Models

This article surveys deep‑learning image captioning, detailing the image encoder, sequence decoder, attention mechanisms and multimodal designs, comparing encoder‑decoder, detection‑backbone, transformer and dense captioning architectures, and explaining generation strategies and BLEU evaluation.

BLEU · CNN · Deep Learning

0 likes · 9 min read

DataFunSummit

Nov 21, 2021 · Artificial Intelligence

Sequential Recommendation Algorithms: Overview and Techniques

This article surveys sequential recommendation methods, covering standard models such as pooling, RNN, CNN, attention, and Transformer, as well as long‑short term, multi‑interest, multi‑behavior approaches, and recent advances like contrastive learning, highlighting their impact on recommendation performance.

RNN · Transformer · attention

0 likes · 8 min read

JD Retail Technology

Nov 16, 2021 · Artificial Intelligence

Intelligent Online Selling Point Extraction for E‑Commerce Recommendation (IOSPE) Wins AAAI 2022 Innovation Award

The IOSPE system, which uses BERT‑based scoring, transformer‑pointer generation, and personalized distribution to automatically extract and generate selling points for millions of e‑commerce products, earned the AAAI 2022 Artificial Intelligence Innovation Application Award and has boosted click‑through rates and user dwell time across JD.com platforms.

AI · BERT · Innovation Award

0 likes · 6 min read

JD Retail Technology

Nov 16, 2021 · Artificial Intelligence

Automatic Product Copywriting for E-Commerce: The APCG System and Its AI Innovations

The APCG system, awarded the AAAI 2022 Innovation Application Prize, automatically generates e‑commerce product copy using a Transformer‑Pointer network and a pretrained sequence‑to‑sequence model, incorporates quality control, employs novel pretraining tasks, and has produced millions of descriptions that boost CTR, CVR, and GMV.

AI · Transformer · copywriting

0 likes · 6 min read

58 Tech

Oct 12, 2021 · Artificial Intelligence

Seq2Seq Approaches for Phone Number Extraction from Two‑Speaker Voice Dialogues

This article presents a practical study of extracting phone numbers from two‑speaker voice dialogues using Seq2Seq models—including LSTM, GRU with attention and feature fusion, and Transformer—detailing data characteristics, model architectures, training strategies, experimental results, and comparative analysis showing the GRU‑Attention approach achieving the best performance.

GRU · LSTM · NLP

0 likes · 13 min read

DataFunTalk

Sep 12, 2021 · Artificial Intelligence

Overview of Pretraining Models and the UER‑py Framework for Natural Language Processing

This article reviews the background and evolution of pre‑training models in NLP, introduces classic models such as Skip‑thoughts, BERT, and T5, and details the modular UER‑py framework, its comparison with HuggingFace Transformers, available Chinese pre‑trained weights, and practical deployment workflows.

NLP · Transformer · UER-py

0 likes · 21 min read

Python Programming Learning Circle

Aug 30, 2021 · Artificial Intelligence

DeepDebug: Transformer‑Based Automatic Debugging Using Large Pretrained Models

The paper presents DeepDebug, a transformer‑based system that leverages large pretrained models and extensive synthetic and real‑world data to automatically localize and fix bugs in Python code, achieving significant improvements in patch generation success rates and reduction of false positives on benchmarks such as QuixBugs.

Software Engineering · Transformer · automatic debugging

0 likes · 12 min read

TiPaiPai Technical Team

Jun 18, 2021 · Artificial Intelligence

Mastering Text Recognition: Encoder & Decoder Strategies Explained

This article reviews modern text‑recognition systems, detailing how encoders such as CNN, CNN‑BiLSTM, and Transformer‑based models extract visual features, and how decoders like Position Attention, Transformer decoders, and RNN Seq2Seq align variable‑length text, while also discussing CTC loss and practical design choices.

CNN · CTC · Decoder

0 likes · 9 min read

Meituan Technology Team

Jun 3, 2021 · Artificial Intelligence

VisTR: End-to-End Video Instance Segmentation with Transformers

VisTR redefines video instance segmentation as an end‑to‑end sequence‑to‑sequence task, using a CNN backbone, Transformer encoder‑decoder with instance queries, and Hungarian matching to jointly predict masks, classes, and tracks across frames, achieving state‑of‑the‑art accuracy (40.1 AP) and 57.7 FPS on YouTube‑VIS.

Transformer · Video Instance Segmentation · VisTR

0 likes · 21 min read

DataFunTalk

May 15, 2021 · Artificial Intelligence

Multi‑Interest Recall Techniques in iQIYI Short‑Video Recommendation

The article reviews the evolution of iQIYI's short‑video recommendation recall pipeline, detailing multi‑interest recall methods such as clustering‑based recall, MOE‑based recall, single‑activation multi‑interest networks, regularization strategies, dynamic capacity handling, and multimodal extensions, and discusses their impact on recommendation performance.

Transformer · iQIYI · machine learning

0 likes · 15 min read

Cyber Elephant Tech Team

Apr 28, 2021 · Artificial Intelligence

Understanding BERT: From Encoder-Decoder to Transformer and Attention

This article explains the BERT model by first reviewing the Encoder-Decoder framework, then detailing the attention mechanism—including self-attention and multi-head attention—before describing the full Transformer architecture and finally outlining BERT’s encoder-only design, training stages, and fine-tuning applications.

BERT · Encoder-Decoder · NLP

0 likes · 15 min read

DataFunTalk

Apr 17, 2021 · Artificial Intelligence

Personalized Re-ranking for Recommendation (RecSys'19)

This article introduces a personalized re‑ranking model for recommendation systems, explaining the limitations of traditional point‑wise ranking, describing the PRM architecture with input, encoding, and output layers using multi‑head attention and pre‑trained personalization features, and presenting experimental results and future extensions.

CTR · Transformer · attention

0 likes · 7 min read

DataFunTalk

Apr 16, 2021 · Artificial Intelligence

Live Streaming Recommendation Ranking Model Evolution and Multi‑Objective Learning at Alibaba 1688

This article presents a comprehensive overview of Alibaba's 1688 live‑streaming recommendation system, detailing core challenges such as heterogeneous behavior modeling, multi‑objective optimization, and bias mitigation, and describing four successive model iterations—from feature‑engineered GBDT to attention‑based heterogeneous networks and transformer architectures—along with experimental results and practical insights.

Recommendation Systems · Transformer · bias mitigation

0 likes · 22 min read

DataFunTalk

Apr 10, 2021 · Artificial Intelligence

2020 Computer Vision Breakthroughs: Self‑Supervised Learning, Transformer Attention Modeling, and Neural Radiance Fields

The talk reviews three major 2020 advances in computer vision—self‑supervised learning surpassing supervised pre‑training, the successful adoption of Transformer‑based attention models for detection and classification, and the emergence of Neural Radiance Fields for view synthesis—while highlighting related research from Microsoft Research Asia and the broader community.

2020 · AI breakthroughs · Computer Vision

0 likes · 19 min read

DataFunTalk

Apr 3, 2021 · Artificial Intelligence

A Survey of User Behavior Sequence Modeling for Search and Recommendation Advertising

User behavior sequence modeling, which is crucial for ranking in search and recommendation advertising, has evolved from simple pooling to attention, RNN, capsule, and Transformer architectures, with industrial applications across e‑commerce, social, video, and music platforms. Future directions include time‑aware, multi‑dimensional, and self‑supervised approaches.

Deep Learning · Recommendation Systems · Sequence Modeling

0 likes · 24 min read

Sohu Tech Products

Feb 17, 2021 · Artificial Intelligence

Improving BERT Pre‑training with RealFormer: Principles, Implementation, and Empirical Evaluation

This article analyzes the RealFormer modification to the Transformer architecture, details its implementation in BERT, and presents extensive experiments showing that while RealFormer can boost performance on low‑label‑count classification tasks, its benefits diminish or disappear as the number of classes grows.

BERT · RealFormer · Residual

0 likes · 12 min read

Liangxu Linux

Feb 3, 2021 · Artificial Intelligence

Build a DIY AI Bot for Honor of Kings with Transformers, scrcpy & minitouch

Learn how to create a low‑cost AI bot for the mobile game Honor of Kings by capturing the phone screen with scrcpy, generating action commands from game images using a Transformer model, and executing those commands via minitouch, complete with setup steps, required tools, and code links.

Game Automation · Mobile Development · Transformer

0 likes · 6 min read

New Oriental Technology

Jan 25, 2021 · Artificial Intelligence

Transformer Model: Attention Mechanism in Machine Translation

The 2017 Transformer model introduced by Vaswani et al. revolutionized machine translation by relying solely on attention mechanisms, outperforming traditional RNN and CNN approaches through parallel processing and improved contextual understanding.

AI · Attention Mechanism · NLP

0 likes · 4 min read

Python Crawling & Data Mining

Jan 9, 2021 · Artificial Intelligence

Build a DIY AI Bot for Honor of Kings Using Transformers and scrcpy

This tutorial shows how to create a low‑cost AI for the mobile game Honor of Kings by mirroring the phone with scrcpy, generating action commands from game screenshots using a Transformer model, and executing them via minitouch on Android devices.

AI · Mobile Automation · Python

0 likes · 6 min read

58 Tech

Dec 30, 2020 · Artificial Intelligence

qa_match V1.3: Lightweight Deep Learning QA Matching Tool with Semi‑Automatic Knowledge‑Base Mining and Transformer‑Enhanced Pre‑training

The qa_match open‑source tool from 58 Tongcheng, now at version 1.3, introduces semi‑automatic knowledge‑base mining for cold‑start and online scenarios and upgrades its Simple Pre‑trained Model (SPTM) with Transformer‑based feature representation to improve question‑answer matching performance.

DEC clustering · Transformer · knowledge base mining

0 likes · 10 min read

DataFunSummit

Dec 14, 2020 · Artificial Intelligence

LightSeq: High‑Performance Open‑Source Inference Engine for Transformers, GPT and Other NLP Models

This article introduces LightSeq, an open‑source, GPU‑accelerated inference engine that dramatically speeds up Transformer‑based models such as BERT and GPT by up to 14× over TensorFlow, supports multiple decoding strategies, integrates seamlessly with major deep‑learning frameworks, and provides detailed performance benchmarks and technical optimizations.

Deep Learning · GPU · Inference

0 likes · 15 min read

Sohu Tech Products

Nov 25, 2020 · Artificial Intelligence

Illustrated Guide to GPT-2: Detailed Explanation of the Decoder‑Only Transformer Model

This article provides a comprehensive, illustrated walkthrough of OpenAI's GPT‑2 language model, covering its decoder‑only Transformer architecture, self‑attention mechanisms, token processing, training data, differences from BERT, and applications beyond language modeling, enriched with visual diagrams and code snippets for deeper understanding.

AI · GPT-2 · Language Model

0 likes · 24 min read

Sohu Tech Products

Nov 11, 2020 · Artificial Intelligence

Illustrated Transformer: Comprehensive Explanation and Code Implementation

This article provides a step‑by‑step illustrated guide to the Transformer architecture, covering its macro structure, detailed self‑attention mechanisms, multi‑head attention, positional encoding, residual connections, decoder operation, training process, loss functions, and includes complete PyTorch and custom Python code examples.

NLP · PyTorch · Self-Attention

0 likes · 33 min read

Sohu Tech Products

Nov 4, 2020 · Artificial Intelligence

Understanding BERT: Architecture, Pre‑training, Fine‑tuning and Applications in Modern NLP

This article provides a comprehensive overview of BERT and related NLP advances, covering its historical context, model architecture, input‑output mechanisms, comparisons with CNNs, word‑embedding evolution, pre‑training strategies like MLM and next‑sentence prediction, and practical guidance for fine‑tuning and feature extraction.

BERT · Fine-tuning · NLP

0 likes · 17 min read

Didi Tech

Oct 27, 2020 · Artificial Intelligence

Didi's Machine Translation System: Architecture, Techniques, and WMT2020 Competition Experience

Didi's machine translation system combines a Transformer‑big architecture with relative position representations, enlarged feed‑forward networks, iterative back‑translation, knowledge distillation, and domain fine‑tuning, and is optimized for speed with TensorRT. It achieved a BLEU score of 36.6 and third place in the WMT2020 Chinese‑to‑English news task.

BLEU · Neural Networks · TensorRT

0 likes · 15 min read

Meituan Technology Team

Sep 24, 2020 · Artificial Intelligence

Multimodal Recall Solution for KDD Cup 2020: ImageBERT and LXMERT Based Approach

The second‑place team tackled KDD Cup 2020’s Multimodal Recall challenge by fine‑tuning ImageBERT and LXMERT on query‑image pairs, generating negatives, applying AMSoftmax and multi‑similarity losses, ensembling weighted predictions, and using score‑based post‑processing, boosting NDCG@5 to 0.8352 and powering Meituan’s multimodal search pipeline.

ImageBERT · KDD Cup 2020 · LXMERT

0 likes · 23 min read

DataFunTalk

Sep 23, 2020 · Artificial Intelligence

From Word Embedding to BERT: A Comprehensive Overview of Pre‑training Model Development in NLP

This article surveys the evolution of pre‑training models for natural language processing, detailing model architectures such as Encoder‑AE, Decoder‑AR, Encoder‑Decoder, Prefix LM, and PLM, analyzing why models like RoBERTa, T5, and GPT‑3 excel, and offering practical guidance for building strong pre‑training systems.

BERT · NLP · Transformer

0 likes · 47 min read

Didi Tech

Aug 23, 2020 · Artificial Intelligence

DiDi AI Labs Achieves Third Place in WMT2020 News Translation Task

DiDi AI Labs’ NLP team earned third place in the WMT2020 Chinese‑to‑English news translation task with a 36.6 BLEU score, using an enhanced Transformer‑2 model that incorporates self‑attention, relative positional attention, iterative back‑translation, knowledge distillation, data cleaning, ensembling, and other techniques, now deployed across DiDi’s international services.

BLEU · DiDi AI Labs · NLP

0 likes · 5 min read

Tencent Advertising Technology

Jun 2, 2020 · Artificial Intelligence

Turning Ad Click Sequences into Age & Gender Predictions with Transformers

This article shares a competition winner's step‑by‑step solution for predicting user age and gender from ad click sequences, treating IDs as words, using word2vec embeddings, a custom transformer‑LSTM model, dual‑task loss, and weight‑search post‑processing.

Advertising · NLP · Transformer

0 likes · 7 min read

Didi Tech

May 25, 2020 · Artificial Intelligence

How Didi Harnesses Cutting‑Edge Speech Recognition: From ASR Basics to Transformer Models

This article provides a comprehensive technical overview of modern speech recognition, covering Didi’s driver‑assistant and smart‑customer‑service applications, fundamental ASR concepts, classic GMM‑HMM methods, deep‑learning breakthroughs such as DNN‑HMM, CTC, attention‑based and transformer models, practical training tricks, signal‑processing steps, and multimodal fusion techniques.

ASR · CTC · Deep Learning

0 likes · 16 min read

Meituan Technology Team

Apr 16, 2020 · Artificial Intelligence

Transformer Applications in Meituan Search Ranking: Practice and Experience

Meituan’s search ranking system integrates Transformer‑based models across feature engineering, behavior sequence modeling, and re‑ranking, adapting AutoInt‑style embeddings and multi‑stage attention mechanisms to boost QV_CTR and NDCG, while outlining future enhancements with BERT, graph neural networks, and reinforcement learning.

Meituan · Transformer · behavior modeling

0 likes · 16 min read

Qunar Tech Salon

Mar 5, 2020 · Artificial Intelligence

Content Tagging Technology for Short Videos at iQIYI: Challenges and Model Evolution

This article describes iQIYI's short‑video content tagging system, outlining the challenges of extracting type and abstract tags from multimodal data, detailing the evolution from text‑only models to image‑fusion, BERT‑enhanced, and video‑frame models, and discussing their applications and future directions.

BERT · Multimodal Learning · Transformer

0 likes · 11 min read

58 Tech

Mar 2, 2020 · Artificial Intelligence

Low-Quality Text Detection Using Unsupervised Language Model Perplexity

This article proposes a method to identify low-quality text in business data by training a large-scale unsupervised language model to compute sentence perplexity, converting the detection problem into a threshold decision, and details model design, challenges, optimizations, and online performance results.

BERT · Language Model · NLP

0 likes · 13 min read

DataFunTalk

Feb 27, 2020 · Artificial Intelligence

Content Tagging Technology for Short Videos: Challenges and Model Evolution at iQIYI

This article examines the challenges of short‑video content tagging and describes iQIYI's multi‑stage evolution from simple text‑only models to sophisticated multimodal architectures that fuse cover images, BERT embeddings, and video frames to improve tag generation accuracy.

BERT · Multimodal Learning · Transformer

0 likes · 12 min read

iQIYI Technical Product Team

Feb 14, 2020 · Artificial Intelligence

Content Tagging Technology for Short Videos: Challenges and Multi‑Modal Model Evolution at iQIYI

iQIYI’s short‑video tagging system tackles multimodal fusion, open‑set and abstract tags by evolving from a text‑only model through cover‑image, BERT‑vector, and video‑frame fusion architectures, enabling automated labeling, personalized recommendation, and semantic search while planning to add OCR, audio, and knowledge‑graph enhancements.

BERT · Multimodal Learning · Transformer

0 likes · 13 min read

Qunar Tech Salon

Sep 12, 2019 · Artificial Intelligence

A Comprehensive Overview of Attention Mechanisms in Deep Learning

This article systematically reviews the history, core concepts, variants, and practical implementations of attention mechanisms—from early additive and multiplicative forms to self‑attention, multi‑head attention, and recent transformer‑based models—highlighting why attention has become fundamental in modern AI research.

Deep Learning · NLP · Self-Attention

0 likes · 16 min read

Alibaba Cloud Developer

Aug 27, 2019 · Artificial Intelligence

How Transformers Enable Personalized Outfit Generation for Fashion Recommendation

This article presents a Transformer‑based framework that simultaneously generates visually compatible outfits and personalizes recommendations by leveraging multimodal item embeddings and user behavior, achieving significant gains in compatibility prediction, fill‑in‑the‑blank accuracy, and click‑through rate on Alibaba's iFashion platform.

Deep Learning · Multimodal Learning · Transformer

0 likes · 15 min read

Alibaba Cloud Developer

Aug 7, 2019 · Artificial Intelligence

How KOBE Transforms Personalized Recommendation Reason Generation with Transformers

This article introduces KOBE, a knowledge‑based personalized text generation system that leverages Transformer architecture, attribute fusion, and external knowledge graphs to produce fluent, domain‑aware recommendation reasons for e‑commerce products, with a case study on the Spring Festival cloud theme.

Knowledge Graph · Text Generation · Transformer

0 likes · 13 min read

Alibaba Cloud Developer

Jul 9, 2019 · Artificial Intelligence

Demystifying Attention: A Clear Guide to Its History, Types, and Why It Works

This article systematically reviews the evolution of attention mechanisms—from early additive and multiplicative forms to self‑attention and multi‑head variants—explaining their core three‑step framework, key differences, and why they have become essential across NLP, vision, and broader AI applications.

Deep Learning · NLP · Self-Attention

0 likes · 19 min read

Alibaba Cloud Developer

Jun 5, 2019 · Artificial Intelligence

Tracing the Evolution of Language Models: From N‑grams to GPT‑2

This article reviews the historical development of natural language processing language models, covering expert rule‑based systems, statistical n‑grams, smoothing techniques, neural network models such as NNLM, RNN, word2vec, GloVe, ELMo, and the transformer‑based breakthroughs of GPT, BERT and GPT‑2, and summarizes their impact on modern NLP tasks.

BERT · Deep Learning · GPT

0 likes · 25 min read

Ctrip Technology

May 21, 2019 · Artificial Intelligence

A Brief Overview of Machine Translation: History, Neural Models, and Practical Insights

This article surveys the evolution of machine translation from early rule‑based systems to modern neural architectures, explains how translation engines are trained, highlights recent advances such as attention and Transformers, and shares practical experience and current challenges in the field.

Attention Mechanism · Neural Networks · Transformer

0 likes · 11 min read

Sohu Tech Products

Apr 11, 2019 · Artificial Intelligence

Media Domain Named Entity Recognition: Techniques, Evolution, and Sohu’s Practical Implementation

This article reviews the challenges of media‑domain named entity recognition, outlines the evolution from rule‑based methods through traditional machine‑learning and deep‑learning models to attention‑based Transformers, and details Sohu’s practical Bi‑LSTM‑CRF system with data‑annotation strategies and performance results.

Bi-LSTM · CRF · Deep Learning

0 likes · 12 min read

DataFunTalk

Mar 13, 2019 · Artificial Intelligence

A Comprehensive Overview of NLP Development and Deep Learning Models

This article reviews the history of natural language processing, explains key deep‑learning models such as NNLM, Word2vec, CNN, RNN, attention mechanisms, and Transformers, and discusses their applications, future trends, and practical considerations in NLP tasks.

NLP · Transformer · attention

0 likes · 38 min read

DataFunTalk

Feb 27, 2019 · Artificial Intelligence

Human‑Interactive Machine Translation: Research, Techniques, and Productization

This article reviews the current state of machine translation, explores the challenges of ambiguity, quality, and domain specificity, and presents human‑in‑the‑loop translation techniques—including attention‑enhanced models, transformer architectures, and online learning—while discussing practical productization and deployment considerations.

AI productization · Human-in-the-Loop · Online Learning

0 likes · 16 min read

Tencent TDS Service

Jan 24, 2019 · Artificial Intelligence

Unlocking BERT: How Its Transformer Engine Powers State-of-the-Art Text Classification

This article explains BERT’s architecture—from its bidirectional Transformer encoder and attention mechanisms to its pre‑training tasks—and presents extensive experiments showing its superior performance on various Chinese and English text‑classification benchmarks across multiple datasets.

BERT · NLP · Transformer

0 likes · 22 min read

Sohu Tech Products

Jan 9, 2019 · Artificial Intelligence

Understanding the Transformer Model: Attention, Self‑Attention, and Multi‑Head Mechanisms

This article provides a comprehensive, step‑by‑step explanation of the Transformer architecture, covering its encoder‑decoder structure, self‑attention, multi‑head attention, positional encoding, residual connections, and training processes, illustrated with diagrams and code snippets to aid readers new to neural machine translation.

Deep Learning · Neural Machine Translation · Positional Encoding

0 likes · 16 min read

Sohu Tech Products

Oct 10, 2018 · Artificial Intelligence

Optimizing News Recall with DDPG Reinforcement Learning and Transformer Architecture

This article explains how reinforcement learning, specifically the DDPG algorithm combined with Transformer-based networks, is applied to improve large‑scale news recall systems, detailing the business scenario, algorithm selection, model architecture, speed optimizations, training challenges, and observed online performance gains.

AI · DDPG · Transformer

0 likes · 13 min read

21CTO

Sep 15, 2018 · Backend Development

Laravel Architecture Deep Dive: Repositories, Services, Presenters, Transformers

The article summarizes a video on Laravel project structuring, explaining how separating responsibilities into layers such as Repository for data access, Service for business logic, Presenter for view preparation, Transformer for data shaping, and Formatter for consistent API responses improves maintainability and scalability.

Backend Architecture · Laravel · Presenter

0 likes · 6 min read

Alibaba Cloud Developer

May 11, 2018 · Artificial Intelligence

How Suffix Prediction Boosts English‑Russian Neural Machine Translation Accuracy

Researchers introduce a novel suffix‑prediction mechanism for neural machine translation that separately generates stems and suffixes during decoding, dramatically reducing out‑of‑vocabulary errors and morphological mistakes in English‑Russian translation, achieving consistent improvements across RNN and Transformer models on large‑scale news and e‑commerce datasets.

English-Russian · Morphologically Rich Languages · Neural Machine Translation

0 likes · 10 min read
