Tagged articles

multitask learning

17 articles

Jul 2, 2026 · Artificial Intelligence

Quantifying Robot Data Value: ATHENA Scales Influence Functions to Billion‑Parameter VLA with 313× Speedup

ATHENA introduces a data‑curation framework for billion‑parameter multi‑task Vision‑Language‑Action models that extends influence functions via Kronecker gradient compression and a multitask influence interaction scheme, achieving a 313× reduction in compute (from 8054.6 to 25.7 GPU‑hours) and improving task success rates while using fewer, higher‑value demonstrations.

data curation · influence functions · large-scale models

0 likes · 9 min read

Machine Learning Algorithms & Natural Language Processing

Feb 9, 2026 · Artificial Intelligence

Time‑o1: Overcoming Time‑Series Forecasting Bottlenecks with a Novel Loss Function

The paper identifies two fundamental issues in time‑series forecasting—label autocorrelation bias and task‑scale explosion caused by the standard TMSE loss—and proposes Time‑o1, a PCA‑based orthogonal label transformation that eliminates bias, reduces optimization complexity, and yields consistent performance gains across multiple models and datasets.

NeurIPS 2025 · PCA · Time‑o1

0 likes · 12 min read

Bighead's Algorithm Notes

Aug 26, 2025 · Artificial Intelligence

SSPT: Custom Pre‑training Tasks for Stock Data Boost Stock Selection Performance

This article reviews the SSPT paper, which introduces three stock‑specific pre‑training tasks—stock code classification, sector classification, and moving‑average prediction—built on a two‑layer Transformer, and demonstrates through extensive experiments across five market datasets that these tasks consistently improve cumulative return and Sharpe ratio over baselines.

Transformer · financial AI · multitask learning

0 likes · 11 min read

AI Frontier Lectures

Jun 10, 2025 · Artificial Intelligence

Can One Model Master All Remote Sensing Tasks? Introducing the TSSUN Framework

This paper presents the Temporal‑Spectral‑Spatial Unified Network (TSSUN), a flexible deep‑learning architecture that simultaneously handles semantic segmentation, semantic change detection, and binary change detection across heterogeneous remote‑sensing inputs, achieving state‑of‑the‑art performance without task‑specific retraining.

Attention Mechanism · TSSUN · deep learning

0 likes · 15 min read

AntTech

Feb 26, 2025 · Artificial Intelligence

Ant Group’s 18 Accepted Papers at AAAI 2025: Summaries and Highlights

This article presents concise English summaries of the 18 Ant Group papers accepted at AAAI 2025, covering topics such as privacy‑preserving large‑model tuning, knowledge‑graph integration, AI‑generated image detection, multi‑task learning, generative retrieval, role‑playing evaluation, and video hallucination mitigation.

AAAI 2025 · AI evaluation · Generative Retrieval

0 likes · 29 min read

NewBeeNLP

May 24, 2024 · Artificial Intelligence

How NoteLLM Boosts Cold‑Start Recommendation with Generative Contrastive Learning

This article reviews the NoteLLM paper, which leverages Llama 2 to create richer text embeddings and automatically generate tags and categories for note recommendation, addressing cold‑start issues through a multitask prompt design, generative‑contrastive learning, and collaborative supervised fine‑tuning, and demonstrates strong offline and online gains.

Embedding · Generative Contrastive Learning · LLM

0 likes · 14 min read

DataFunTalk

Jan 30, 2023 · Artificial Intelligence

Domain Knowledge Enhanced Pretrained Language Model for Medicinal Product Vertical Search

This article presents a domain‑knowledge‑enhanced pretrained language model that combines ELECTRA‑based token‑level masking with a novel product‑attribute prediction (PAP) task to improve query understanding, intent classification, and relevance matching in vertical drug e‑commerce search, and validates its effectiveness through extensive experiments on public and proprietary datasets.

ELECTRA · domain knowledge · medical NLP

0 likes · 13 min read

DataFunTalk

Aug 22, 2022 · Artificial Intelligence

Live‑Streaming Recommendation System: Interaction Scenarios, User Cold‑Start, Prior Modeling, and Scene Modeling

The article presents a comprehensive technical overview of a live‑streaming recommendation system, covering common and specific characteristics, user cold‑start strategies using unbiased clustering, prior knowledge integration, multi‑task modeling, and scene‑aware routing to improve relevance and engagement in interactive environments.

Clustering · Live Streaming · Recommendation Systems

0 likes · 19 min read

DataFunTalk

Dec 14, 2021 · Artificial Intelligence

Speech Translation: Enterprise Applications and Research

This article presents an overview of speech translation, discusses its motivations and applications at ByteDance, compares cascade and end‑to‑end modeling approaches, introduces advanced encoder and decoder designs such as LUT, Chimera, and COSTT, outlines progressive multi‑task training and data‑augmentation strategies, and shares experimental results and Q&A.

AI · Audio Processing · end-to-end models

0 likes · 16 min read

DataFunSummit

Mar 30, 2021 · Artificial Intelligence

Chinese Short‑Text Entity Linking: Model Design, Multitask Learning, and Experimental Results on the Qianyan Dataset

This article presents a comprehensive approach to Chinese short‑text entity linking, describing the Qianyan dataset, pipeline and end‑to‑end task formulations, sample construction, a multitask model that jointly performs entity ranking and NIL classification, various optimization techniques including confidence learning and adversarial training, and detailed experimental analysis showing state‑of‑the‑art performance.

Chinese NLP · adversarial training · confidence learning

0 likes · 13 min read

DataFunTalk

Feb 26, 2021 · Artificial Intelligence

Fine‑Grained Sentiment Analysis and Opinion Quadruple Extraction: Methods, Tasks, and Applications

This article introduces the concepts, tasks, and recent advances in text sentiment analysis, focusing on attribute‑level sentiment (TG‑ABSA) and opinion‑quadruple extraction, describing unsupervised, reading‑comprehension, and multi‑task deep‑learning approaches, their implementation on Huawei Cloud, experimental results, and future research directions.

NLP · Sentiment Analysis · aspect‑based sentiment

0 likes · 20 min read

DataFunTalk

Oct 23, 2020 · Artificial Intelligence

Feedback‑Aware Deep Matching Model for Music Recommendation in Tmall Genie

This article presents DeepMatch, a behavior‑sequence based deep learning recall model enhanced with play‑rate and intent‑type embeddings, describes its self‑attention architecture, factorized embedding parameterization, multitask loss design, distributed TensorFlow training tricks, and demonstrates significant offline and online improvements in music recommendation performance.

Self-Attention · TensorFlow · deep learning

0 likes · 15 min read

Alibaba Cloud Developer

May 21, 2020 · Artificial Intelligence

How DeepMatch Boosts Music Recommendations with Play Rate and Intent Signals

This article examines the DeepMatch retrieval model for Tmall Genie music recommendation, detailing how incorporating user feedback such as play‑rate and query intent signals via multi‑task learning and feedback‑aware self‑attention improves recall accuracy and reduces negative recommendations, while also discussing embedding factorization, loss functions, and distributed training optimizations.

Recommendation Systems · Self-Attention · deep learning

0 likes · 18 min read

JD Tech Talk

Nov 5, 2019 · Artificial Intelligence

GeoBERT: A Multi‑Task Pre‑trained Language Model for Chinese Address Text

This article introduces GeoBERT, a novel pre‑training method for Chinese address strings that leverages seven jointly constrained tasks to capture spatial semantics, administrative hierarchy, and similarity relationships, enabling downstream address classification, segmentation, POI extraction, similarity comparison, and authenticity verification with reduced annotation dependence.

Chinese Language · GeoBERT · Geocoding

0 likes · 15 min read

High Availability Architecture

May 27, 2019 · Artificial Intelligence

A Survey of Transfer Learning and Model Pre‑training Techniques for Natural Language Processing

This article reviews the taxonomy of transfer learning in NLP, summarizes representative pre‑training models such as ELMo, ULMFiT, BERT, GPT, MASS and UNILM, discusses their strengths and limitations, and provides practical recommendations for applying these techniques in real‑world projects.

BERT · ELMo · NLP

0 likes · 34 min read

Alibaba Cloud Developer

Dec 25, 2018 · Artificial Intelligence

How a Composite Framework Boosts Speech Emotion Recognition in Noisy Environments

This paper presents a multi‑subsystem ensemble for voice‑based emotion recognition that leverages low‑level descriptors, high‑level iVector features, attention‑based RNNs, and text SVMs, achieving superior robustness and accuracy on the noisy MEC 2017 dataset.

attention model · deep neural network · multitask learning

0 likes · 8 min read

JD Tech

Jun 29, 2018 · Artificial Intelligence

JD AI's JDAI-Face: Real-Time Multi-Task Facial Attribute Recognition System

The article introduces JD AI's JDAI-Face system, a deep‑learning‑based platform for real‑time multi‑task facial attribute recognition that detects gender, age, ethnicity, expression, and attractiveness. It outlines the technical pipeline, showcases retail applications, and cites recent academic publications and expert contributors.

deep learning · face recognition · multitask learning

0 likes · 12 min read
