Tag

data augmentation

Amap Tech
May 27, 2025 · Artificial Intelligence

Gaode Map Custom Voice Pack: End‑to‑End TTS Model Architecture and Deployment

This article explains how Gaode Map leverages lightweight edge TTS models, dual‑autoregressive large‑model data augmentation, and a configurable audio‑processing DAG to enable users to create highly realistic personalized voice packs from just three recorded sentences.

Gaode Maps · TTS · data augmentation
0 likes · 8 min read
Sohu Tech Products
Apr 16, 2025 · Artificial Intelligence

Comprehensive Guide to Building AI Datasets: From Source Collection to Data Augmentation and Validation

This guide walks readers through every stage of building high‑quality AI training datasets—from locating open‑source data and defining goals, through collection, annotation, cleaning, large‑scale processing, optional augmentation, and splitting, to validation—using a medical QA example for fine‑tuning DeepSeek‑R1.

AI fine‑tuning · Python · data augmentation
0 likes · 18 min read
DataFunTalk
Jan 1, 2025 · Artificial Intelligence

Applying Large Language Models to Financial Risk Control at Akulaku

This article details Akulaku’s deployment of large language models across multimodal financial risk‑control scenarios—covering business background, a three‑module intelligent‑agent architecture, concrete tool‑ and planning‑enhancement case studies, and future outlook—demonstrating how LLMs boost efficiency, reduce labeling effort, and enable copilot‑style assistance.

KYC verification · agent architecture · data augmentation
0 likes · 15 min read
Rare Earth Juejin Tech Community
Jun 16, 2024 · Artificial Intelligence

HRNet Source Code Walkthrough: Keypoint Dataset Construction, Online Data Augmentation, and Training Pipeline

This article provides a detailed, English-language walkthrough of the HRNet source code, covering how the COCO keypoint dataset is built, the online data‑augmentation techniques applied during training, and the end‑to‑end training and inference procedures for human pose estimation.

HRNet · PyTorch · computer vision
0 likes · 36 min read
DaTaobao Tech
May 17, 2024 · Artificial Intelligence

Understanding Convolutional Neural Networks: Theory, Architecture, and Practical Techniques

The article explains CNN fundamentals—convolution, pooling, and fully‑connected layers—illustrates their implementation for American Sign Language letter recognition, details parameter calculations, demonstrates data augmentation and transfer learning techniques, and highlights how these methods boost image‑classification accuracy to around 92%.

CNN · Image Recognition · data augmentation
0 likes · 19 min read
php中文网 Courses
Oct 13, 2023 · Artificial Intelligence

Top 10 Python Libraries for Data Augmentation in Machine Learning

This article introduces ten popular Python libraries—Augmentor, imgaug, albumentations, nlpaug, textaugment, pytorch‑geometric, audiomentations, nlpaugment, keras‑augment, and OpenCV—that provide powerful image, text, audio, and graph data augmentation techniques to improve model generalization and robustness.

Python · audio augmentation · data augmentation
0 likes · 8 min read
DataFunSummit
May 23, 2023 · Artificial Intelligence

Continuous Semantic Enhancement for Neural Machine Translation: Methodology, Experiments, and Community Deployment

This article introduces a continuous semantic enhancement approach for neural machine translation that overcomes the limitations of discrete data‑augmentation techniques, details the vicinal (neighborhood) risk minimization training objective, presents the benchmark improvements reported in the ACL 2022 paper, and describes practical deployment and fine‑tuning workflows in the ModelScope (魔搭) community.

continuous semantic augmentation · contrastive learning · data augmentation
0 likes · 19 min read
Sohu Tech Products
Mar 16, 2023 · Artificial Intelligence

ChatGPT Data Augmentation Methods for NLP

This article introduces various ChatGPT‑based data‑augmentation techniques for natural language processing, explains how to use prompts for synonym, antonym, homophone, random insertion, deletion, and swapping transformations, and provides concrete example prompts and outputs to illustrate each method.

Artificial Intelligence · ChatGPT · NLP
0 likes · 15 min read
AntTech
Nov 6, 2022 · Artificial Intelligence

Advanced Rule Learning, Constraint‑Adaptive Frameworks, and Semi‑Supervised Data Augmentation for Fraud Detection and Imbalanced Ranking

This article surveys recent Ant Group research on explainable fraud detection, including constraint‑adaptive rule‑set learning (CRSL), meta‑path guided rule generation (MetaRule), biased sampling for imbalanced ranking, and a semi‑supervised data‑augmentation framework (SDAT) for tabular data, highlighting their motivations, methodologies, deployments, and experimental results.

AI research · Graph Neural Networks · Semi-supervised Learning
0 likes · 18 min read
DaTaobao Tech
Oct 17, 2022 · Artificial Intelligence

AI Live Stream: Causal Representation Learning and Real-time Color Enhancement

In this AI Live Stream, two Taobao Technology engineers present how causal representation learning enables unbiased data augmentation and factor‑controllable generation to boost fine‑grained image classification, while also unveiling a real‑time color‑enhancement technique that merges cascaded lookup tables with dynamic neural networks, illustrating modern AI trends and practical deployment strategies.

AI algorithms · Model Generalization · Real-time Processing
0 likes · 4 min read
DataFunSummit
Jun 26, 2022 · Artificial Intelligence

Applying Knowledge Graphs to Recruitment: Construction, Tag Mining, and Recommendation at 58.com

A senior NLP engineer at 58.com explains how a recruitment knowledge graph is built through multi‑dimensional tag systems, tag mining, and relation extraction, and how it improves two‑way matching and recommendation efficiency while addressing challenges such as weak expression, cold start, and supply‑demand imbalance.

AI · NLP · Recruitment
0 likes · 17 min read
DataFunSummit
Jun 23, 2022 · Artificial Intelligence

Unlocking Data Potential: Automatic Data Augmentation, Denoising, Active Learning, and Data Splitting

The talk explains how to maximize the value of training data by exploring background on model generalization, automatic data augmentation techniques, denoising strategies, active learning for selecting unlabeled samples, and robust data splitting methods, offering practical guidelines for AI practitioners.

AI · active learning · automatic denoise
0 likes · 16 min read
DaTaobao Tech
Jun 13, 2022 · Artificial Intelligence

Robust Neural Radiance Field Representation for Extrapolating Novel Views (RapNeRF)

RapNeRF enhances Neural Radiance Fields for extreme view extrapolation by introducing Random Ray Casting and a Ray Atlas, which together augment training data and store view‑dependent surface features, enabling robust, high‑quality novel‑view synthesis from sparse images and outperforming prior methods on synthetic and real datasets.

Neural Rendering · View Synthesis · computer vision
0 likes · 8 min read
NetEase LeiHuo Testing Center
Apr 1, 2022 · Artificial Intelligence

Learning OCR for Game Text Recognition: From Data Preparation to CRNN Model Training

This article documents the author’s step‑by‑step journey of building an OCR system for recognizing Chinese characters in a card‑game UI, covering game selection, technical background, data generation, deep‑learning model training with CRNN, real‑image data collection, optimization attempts, and final performance evaluation.

CRNN · EasyOCR · Game Text Recognition
0 likes · 15 min read
DataFunSummit
Feb 12, 2022 · Artificial Intelligence

Advances and Challenges in Post‑BERT Semantic Matching: Negative Sampling, Data Augmentation, and Applications

This article reviews the limitations of pre‑trained language models for semantic matching in the post‑BERT era and discusses negative sampling, data‑augmentation techniques, contrastive learning methods such as ConSERT and SimCSE, and practical deployment considerations in vector‑based retrieval systems.

Vector Retrieval · contrastive learning · data augmentation
0 likes · 20 min read
58 Tech
Aug 19, 2021 · Artificial Intelligence

Practical NER Techniques for Business Chatbots on the 58.com Service Platform

This article presents a comprehensive case study of applying named‑entity‑recognition (NER) techniques to the smart chat assistant of 58.com’s yellow‑page service, covering business background, model selection (BiLSTM‑CRF, IDCNN‑CRF, BERT), data‑augmentation, focal loss, fusion of rule‑based and neural methods, context modeling, online performance, and future research directions.

BERT · CRF · Entity Recognition
0 likes · 16 min read
Beike Product & Technology
Jul 1, 2021 · Artificial Intelligence

Semantic Data Augmentation and GigaSpeech: Highlights of Two INTERSPEECH 2021 Papers from the Beike Voice Team

The article summarizes two INTERSPEECH 2021 papers from Beike's voice technology team, detailing a grammar‑based semantic data augmentation method that improves end‑to‑end Chinese speech recognition and introducing GigaSpeech, a 10,000‑hour multi‑domain English speech corpus for robust ASR research.

GigaSpeech · Interspeech · Speech Recognition
0 likes · 7 min read
DataFunTalk
May 9, 2021 · Artificial Intelligence

Few-Shot Learning, Data Augmentation, and Multi‑Task Learning for Safety Modeling in Ride‑Hailing Platforms

This article presents Didi's exploration of few‑shot learning, data augmentation, semi‑supervised self‑training, and multi‑task learning techniques to address the scarcity of labeled samples in safety and governance scenarios, demonstrating practical solutions and performance gains across various risk‑detection tasks.

AI · Semi-supervised Learning · data augmentation
0 likes · 15 min read
Didi Tech
Apr 20, 2021 · Artificial Intelligence

Few-Shot Learning, Data Augmentation, and Semi‑Supervised Methods for Improving Safety and Governance Models at Didi

To overcome scarce labeled data for safety and governance, Didi combines few‑shot learning with systematic data augmentation, self‑training semi‑supervised labeling, and multi‑task neural architectures, cutting labeling costs and reducing log‑loss by over 20% while boosting ROC‑AUC and PR‑AUC across harassment detection, expense‑complaint, and route‑intercept use cases.

AI safety · Didi · Semi-supervised Learning
0 likes · 15 min read