Tagged articles

45 articles

Page 1 of 1

May 18, 2026 · Artificial Intelligence

Composer 2.5 Delivers Opus‑level Performance at One‑Tenth the Cost

Composer 2.5, Cursor’s latest LLM, matches Claude Opus 4.7‑level capabilities while costing roughly one‑tenth as much, thanks to larger training scale, precise text‑feedback reinforcement learning, 25× more synthetic tasks, and a new Muon‑HSDP optimizer that boosts efficiency up to ten‑fold.

Composer 2.5LLMMuon optimizer

0 likes · 9 min read

Composer 2.5 Delivers Opus‑level Performance at One‑Tenth the Cost

Machine Heart

May 18, 2026 · Artificial Intelligence

How DeepCybo’s Z‑WM Dominated WorldArena Track 2 with a 30.5‑Point Lead

DeepCybo celebrated its first anniversary by showing that its human‑first‑perspective data pipeline and the PhysBrain 1.0 base model can generate physically consistent synthetic videos that boost robot task success, earning Z‑WM an 88.5‑point score and a 30.5‑point lead to win WorldArena Track 2, while also ranking eighth in Track 1 with language‑only input.

DeepCyboEmbodied AIPhysBrain

0 likes · 14 min read

How DeepCybo’s Z‑WM Dominated WorldArena Track 2 with a 30.5‑Point Lead

Weekly Large Model Application

May 5, 2026 · Artificial Intelligence

Why More GPUs and Data Aren’t Enough: Defining Scenarios and Data for Speech Model Training

The article argues that successful speech model training starts with understanding user scenarios, then selecting appropriate data, and finally choosing metrics, detailing six key questions, data sourcing strategies, evaluation criteria, and compliance considerations to avoid the misconception that sheer data volume guarantees performance.

AI trainingModel Evaluationdata collection

0 likes · 6 min read

Why More GPUs and Data Aren’t Enough: Defining Scenarios and Data for Speech Model Training

Woodpecker Software Testing

Apr 29, 2026 · R&D Management

Test Data Generation Teams Must Evolve: From Data Movers to Data Engineering Experts

With CI/CD pipelines maturing, automated test coverage is no longer the bottleneck; the real constraint has shifted to producing accurate, fast, and secure test data, prompting teams to upgrade from simple data mocking to full‑stack data engineering, AI‑driven synthesis, and verifiable data contracts.

AIci/cddata engineering

0 likes · 8 min read

Test Data Generation Teams Must Evolve: From Data Movers to Data Engineering Experts

Woodpecker Software Testing

Apr 18, 2026 · Operations

Why 83% of Test Teams Suffer Data Shortage and How Next‑Gen Test Data Generation Overcomes It

The article examines the growing data shortage in software testing, explains why traditional manual and script‑based data generation fails, and presents four pillars of next‑generation test data generation—data contracts, privacy‑enhanced synthetic techniques, scenario‑aware dynamic supply, and observability—backed by a real e‑commerce case study.

Test Data Generationdata-contractsprivacy-preserving

0 likes · 8 min read

Why 83% of Test Teams Suffer Data Shortage and How Next‑Gen Test Data Generation Overcomes It

AI Info Trend

Apr 15, 2026 · Industry Insights

2026 AI Index: China‑US Model Race, Compute Surge & Data Trends

Based on Stanford HAI’s AI Index 2026, this analysis highlights how the US‑China model performance gap has vanished, global AI compute has exploded 3.3‑fold, data bottlenecks are easing through synthetic data and curation, while transparency, supply‑chain concentration, and environmental impact raise new challenges.

AI Index 2026AI computeAI trends

0 likes · 8 min read

2026 AI Index: China‑US Model Race, Compute Surge & Data Trends

Test Development Learning Exchange

Apr 10, 2026 · Industry Insights

What AI-Powered Testing Stack Will Dominate Enterprises in 2026?

The article outlines seven essential AI-driven technologies—ranging from intelligent test generation engines to unified test data platforms—that enterprises must adopt by 2026 to achieve human‑machine collaboration, reduce script fragility, and boost testing efficiency.

AI testingHuman-in-the-LoopRoot Cause Analysis

0 likes · 8 min read

What AI-Powered Testing Stack Will Dominate Enterprises in 2026?

Woodpecker Software Testing

Apr 10, 2026 · Artificial Intelligence

2026 Model Evaluation Reaches the Cost‑Benefit Threshold

In 2026, model evaluation has become the pivotal bottleneck in AI engineering, with exploding compute, data‑compliance, and tooling costs forcing a shift from labor‑intensive testing to quantifiable business value, and three levers—dynamic granularity, synthetic data loops, and evaluation‑as‑a‑service—offering a path to a cost‑benefit inflection point.

AI complianceDynamic GranularityEvaluation as a Service

0 likes · 7 min read

2026 Model Evaluation Reaches the Cost‑Benefit Threshold

HyperAI Super Neural

Mar 26, 2026 · Artificial Intelligence

MIT’s Wave‑Former Reconstructs Fully Occluded Objects with 85% Precision, Boosting Recall to 72%

MIT researchers introduce Wave‑Former, a physics‑aware, generative‑AI framework for mmWave sensing that achieves high‑precision 3D reconstruction of completely hidden objects, raising recall from 54% to 72% while maintaining 85% precision and outperforming existing baselines on real‑world datasets.

3D reconstructionbenchmarkgenerative AI

0 likes · 15 min read

MIT’s Wave‑Former Reconstructs Fully Occluded Objects with 85% Precision, Boosting Recall to 72%

AIWalker

Mar 22, 2026 · Artificial Intelligence

How SAP Cuts 90% Compute and Boosts 4K Panorama Segmentation Accuracy by 17.2%

The SAP framework transforms a static 4K equirectangular panorama into a pseudo‑video, fine‑tunes SAM2 with synthetic data and a column‑first scanning trajectory, slashing GPU memory use by 90% while raising zero‑shot mIoU by an average of 17.2% across multiple benchmarks.

Deep LearningSAM2panorama segmentation

0 likes · 15 min read

How SAP Cuts 90% Compute and Boosts 4K Panorama Segmentation Accuracy by 17.2%

AI Engineering

Mar 16, 2026 · Artificial Intelligence

Does Synthetic Data Have a Future? Evidence‑Based Conclusions

A detailed investigation of two public programming‑training datasets shows that AI‑only synthetic data suffers from severe quality issues, and even AI‑plus‑expert review yields only about ten percent usable examples, proving that high‑quality training data still requires domain experts and rigorous quality‑control processes.

AI trainingModel Evaluationdata labeling

0 likes · 16 min read

Does Synthetic Data Have a Future? Evidence‑Based Conclusions

Model Perspective

Mar 16, 2026 · Artificial Intelligence

Can AI‑Generated “Silicon Samples” Replace Real Survey Respondents?

The article explains how large language models can simulate virtual respondents—called silicon samples—to generate synthetic survey data, outlines the four fidelity criteria for evaluating their credibility, and demonstrates practical workflows with the open‑source EDSL Python library.

Artificial IntelligenceEDSLLLM

0 likes · 14 min read

Can AI‑Generated “Silicon Samples” Replace Real Survey Respondents?

Machine Learning Algorithms & Natural Language Processing

Mar 14, 2026 · Artificial Intelligence

Can Large Language Models Get Stronger Without Human Language Training? A New Pre‑Pre‑Training Path

A recent study shows that pre‑training Transformers on synthetic, non‑language data generated by Neural Cellular Automata can boost language‑model performance by up to 6%, accelerate convergence by 40%, and improve downstream reasoning, even outperforming models trained on massive natural‑text corpora.

In-Context LearningNeural Cellular AutomataPre‑training

0 likes · 12 min read

Can Large Language Models Get Stronger Without Human Language Training? A New Pre‑Pre‑Training Path

Machine Learning Algorithms & Natural Language Processing

Feb 28, 2026 · Artificial Intelligence

From Prompt Learning to SIPDO: The Closed‑Loop Evolution Driving Continuous Innovation

The article traces how prompt optimization has mirrored the historical evolution of parameter learning, outlines four development phases—from evolutionary search to beyond‑first‑order methods—and explains how SIPDO’s synthetic‑data feedback and difficulty‑progression create a closed‑loop system that yields consistent performance gains across LLM benchmarks.

AIClosed Loop LearningLLM

0 likes · 18 min read

From Prompt Learning to SIPDO: The Closed‑Loop Evolution Driving Continuous Innovation

Data Party THU

Oct 30, 2025 · Artificial Intelligence

How to Generate Realistic Synthetic Data with Histograms and GMMs

This article explains two practical techniques—histogram‑based per‑column synthesis and Gaussian‑Mixture‑Model generation—for creating large, privacy‑preserving synthetic datasets that retain the statistical distributions and inter‑column relationships of the original data, and shows how to evaluate their quality.

Data GenerationGaussian mixture modelPython

0 likes · 27 min read

How to Generate Realistic Synthetic Data with Histograms and GMMs

Code Mala Tang

Oct 28, 2025 · Artificial Intelligence

Unlocking AI Creativity with Just Eight Words: The Verbalized Sampling Breakthrough

A recent Stanford and West Virginia University study reveals that a simple eight‑word prompt technique, called Verbalized Sampling, can double the creative output of large language models without costly retraining, by exposing hidden diversity suppressed by conventional alignment methods.

AI creativityLLM sampling techniquesPrompt engineering

0 likes · 9 min read

Unlocking AI Creativity with Just Eight Words: The Verbalized Sampling Breakthrough

DataFunTalk

Sep 18, 2025 · Artificial Intelligence

How Tongyi DeepResearch Turns Chatty AI into a Research Powerhouse

Tongyi DeepResearch, an open‑source AI model and framework, achieves SOTA on multiple Deep Research benchmarks by combining fully open‑source models, frameworks, and data pipelines, and introduces novel agentic pre‑training, fine‑tuning, and reinforcement‑learning methods to enable complex multi‑step reasoning and real‑world applications.

AI researchOpen sourceagentic reinforcement learning

0 likes · 14 min read

How Tongyi DeepResearch Turns Chatty AI into a Research Powerhouse

AntTech

Sep 13, 2025 · Artificial Intelligence

Why High‑Quality Data Is the New Breakthrough for Large‑Scale AI Models

At the 2025 Inclusion·Bund Conference forum, leading scholars and industry experts revealed how high‑quality data and AI form a dual‑engine that reshapes model training, improves performance, and drives the next evolution of intelligent systems.

AI training dataData Qualitydata infrastructure

0 likes · 7 min read

Why High‑Quality Data Is the New Breakthrough for Large‑Scale AI Models

Tencent Technical Engineering

Sep 12, 2025 · Artificial Intelligence

How POINTS-Reader Achieves State‑of‑the‑Art PDF Extraction Without Teacher Models

The POINTS-Reader paper, accepted at EMNLP 2025, introduces a two‑stage, fully automated data generation pipeline that enables a lightweight visual‑language model to extract text, tables, and LaTeX formulas from diverse PDF layouts with superior performance and high throughput, all without relying on costly teacher‑model distillation.

AIDocument ParsingOCR

0 likes · 12 min read

How POINTS-Reader Achieves State‑of‑the‑Art PDF Extraction Without Teacher Models

Data Party THU

Aug 20, 2025 · Artificial Intelligence

How Large-Scale Corpus Rewriting is Shaping LLM Training: A Deep Dive into K2, WRAP, and Beyond

This article surveys recent large‑scale corpus rewriting techniques for LLM pre‑training, covering K2’s token‑utilization strategies, domain‑specific methods like SwallowMath/Code, reStructured pretraining, the WRAP pipeline, Nemotron‑CC filtering, Pro‑X noise removal, and the MAGA multi‑style expansion, while highlighting challenges, experimental findings, and open research questions.

LLMcorpus rewritingdata synthesis

0 likes · 20 min read

How Large-Scale Corpus Rewriting is Shaping LLM Training: A Deep Dive into K2, WRAP, and Beyond

AsiaInfo Technology: New Tech Exploration

Jun 23, 2025 · Artificial Intelligence

How Generative Data‑Driven Model Distillation Boosts Large‑Model Performance and Cuts Compute

This article examines generative data‑driven model distillation as a technique that not only compresses large language models but also improves their accuracy, addresses data‑privacy constraints, and reduces computational costs, offering a practical roadmap and real‑world results from a corporate AI platform.

AI OptimizationKnowledge TransferMaaS platform

0 likes · 22 min read

How Generative Data‑Driven Model Distillation Boosts Large‑Model Performance and Cuts Compute

AIWalker

Jun 18, 2025 · Artificial Intelligence

Six New Directions for Large Language Models

Large language models are booming, and this article highlights six cutting‑edge research directions—LLM‑plus synthetic data, reward modeling, inference techniques, LLM‑as‑a‑Judge, safety alignment, and long‑context handling—each illustrated with recent papers, experimental results, and links to code repositories.

InferenceLLMReward Modeling

0 likes · 9 min read

Six New Directions for Large Language Models

Volcano Engine Developer Services

Jun 18, 2025 · Artificial Intelligence

ChatTS: A Synthetic Data‑Driven Multimodal LLM that Natively Understands Time Series

ChatTS is a time‑series‑native multimodal large language model trained on purely synthetic data, offering superior understanding and reasoning over both real and synthetic time‑series datasets, and outperforming existing LLM baselines across alignment and inference tasks.

AILLM alignmentTS‑MLLM

0 likes · 18 min read

ChatTS: A Synthetic Data‑Driven Multimodal LLM that Natively Understands Time Series

Fighter's World

Jun 14, 2025 · Artificial Intelligence

How Can LLMs Learn to “Think” in Complex Industry Scenarios?

The article analyzes how large language models can acquire true reasoning abilities for hard‑to‑score industry tasks by combining Chain‑of‑Thought prompting with reinforcement learning, addressing vague reward signals, reward hacking, and loyalty, and proposing a toolbox of reward engineering, synthetic data, hierarchical RL and multi‑agent collaboration.

LLMReward Modelingchain-of-thought

0 likes · 22 min read

How Can LLMs Learn to “Think” in Complex Industry Scenarios?

AsiaInfo Technology: New Tech Exploration

May 19, 2025 · Artificial Intelligence

How WASP Generates High‑Quality DP Synthetic Data with Multi‑Model Collaboration

WASP is a privacy‑preserving framework that fuses multiple pretrained language models through a weighted Top‑Q voting scheme to synthesize differential‑private data, dramatically improving downstream task performance even when only a few private samples are available, and it scales to federated settings.

Federated LearningMulti-Model Fusiondifferential privacy

0 likes · 19 min read

How WASP Generates High‑Quality DP Synthetic Data with Multi‑Model Collaboration

Architects' Tech Alliance

Feb 12, 2025 · Artificial Intelligence

DeepSeek‑V3 Training Efficiency, Knowledge Distillation, and the Risks of Synthetic Data

The article examines DeepSeek‑V3’s low‑cost training using 2048 H800 GPUs, explains how knowledge distillation and high‑quality data improve efficiency, discusses expert concerns about training on AI‑generated content, and outlines the limitations and ceiling effect of distillation techniques.

AI SafetyAI Training EfficiencyDeepSeek-V3

0 likes · 7 min read

DeepSeek‑V3 Training Efficiency, Knowledge Distillation, and the Risks of Synthetic Data

Baobao Algorithm Notes

Jan 11, 2025 · Artificial Intelligence

Why Phi‑4’s 14B Model Outperforms GPT‑4 on STEM and Reasoning Tasks

Microsoft Research’s Phi‑4 model, a 14‑billion‑parameter LLM, leverages extensive synthetic data, advanced tokenization, and a two‑stage training pipeline to achieve superior performance on STEM question answering, long‑context reasoning, and safety benchmarks, rivaling larger models like GPT‑4.

AI SafetyBenchmarkingPhi-4

0 likes · 15 min read

Why Phi‑4’s 14B Model Outperforms GPT‑4 on STEM and Reasoning Tasks

Fighter's World

Nov 1, 2024 · Artificial Intelligence

How Fiercely Competitive Is the Large‑Model Landscape? Insights from the State of AI Report 2024

The State of AI Report 2024 reveals converging capabilities among open and closed LLMs, a shift toward inference compute, benchmark and data contamination challenges, rising synthetic‑data risks, booming robotics research, Nvidia's hardware dominance, and a mix of accurate and missed predictions for the coming year.

AI hardwareAI industryinference compute

0 likes · 15 min read

How Fiercely Competitive Is the Large‑Model Landscape? Insights from the State of AI Report 2024

Baobao Algorithm Notes

Oct 25, 2024 · Artificial Intelligence

Why Calibration Data Outperforms Pruning Algorithms in LLM Compression

This study investigates how the choice of calibration data, rather than the pruning algorithm itself, dominates post‑training pruning performance for large language models, revealing that data similarity to the original training set and synthetic data generation can significantly boost compression results.

Artificial IntelligenceLLM pruningcalibration data

0 likes · 14 min read

Why Calibration Data Outperforms Pruning Algorithms in LLM Compression

NewBeeNLP

Sep 23, 2024 · Artificial Intelligence

Why Post‑Training Is Redefining LLMs: DPO vs PPO, Synthetic Data, and Scaling Strategies

This article analyzes recent post‑training trends in large language models, comparing DPO and PPO, examining the scarcity of open‑source preference data, the iterative training process, the rise of synthetic data pipelines, and emerging methods for improving math and reasoning capabilities.

DPOLLMPPO

0 likes · 12 min read

Why Post‑Training Is Redefining LLMs: DPO vs PPO, Synthetic Data, and Scaling Strategies

AntTech

Sep 21, 2024 · Artificial Intelligence

Insights from the 2024 Inclusion·Bund Conference: From Data for AI to AI for Data

The 2024 Inclusion·Bund conference brought together academia and industry leaders to discuss how data technologies are evolving and aligning with AI, covering trends in large‑model storage, synthetic data generation, AI‑enhanced databases, and Ant Group's emerging AI‑centric data ecosystem.

AIAI Alignmentdata strategy

0 likes · 7 min read

Insights from the 2024 Inclusion·Bund Conference: From Data for AI to AI for Data

AntData

Sep 6, 2024 · Artificial Intelligence

Insights from the 2024 Inclusion·Bund Conference: From Data for AI to AI for Data

The 2024 Inclusion·Bund Conference forum brought together leading academics and industry experts to examine how data value is shifting in the AI era, covering large‑model storage challenges, the rise of synthetic data, AI‑enhanced databases, and Ant Group’s next‑generation intelligent data architecture.

AIIntelligent Data Systemsdata strategy

0 likes · 6 min read

NewBeeNLP

Sep 2, 2024 · Artificial Intelligence

Boosting Large Language Model Math Reasoning: Mixed Instructions, Synthetic Data, and Training Optimizations

This article presents a comprehensive technical walkthrough on enhancing large language model mathematical reasoning by reviewing model architectures, introducing mixed CoT‑PoT instructions, generating and filtering synthetic data, and applying multi‑stage training optimizations such as RFT, PPO, and DPO, with detailed experimental results and Q&A insights.

AIReward modelTraining Optimization

0 likes · 17 min read

Boosting Large Language Model Math Reasoning: Mixed Instructions, Synthetic Data, and Training Optimizations

DataFunTalk

Aug 24, 2024 · Artificial Intelligence

Improving the Mathematical Reasoning Ability of Large Language Models: Overview, Mixed Instructions, Synthetic Data, and Training Optimization

This article presents a comprehensive approach to enhancing large language models' mathematical reasoning by reviewing model architectures, introducing mixed CoT‑PoT instructions, generating and filtering synthetic data, and applying multi‑stage training optimizations such as RFT, PPO, and DPO, with detailed experimental results and Q&A.

AIReward modellarge language models

0 likes · 16 min read

Improving the Mathematical Reasoning Ability of Large Language Models: Overview, Mixed Instructions, Synthetic Data, and Training Optimization

NewBeeNLP

Jul 31, 2024 · Artificial Intelligence

How Continual Pre‑Training Boosts Llama‑3’s Chinese and Scientific Reasoning

This report presents a continual pre‑training approach that significantly enhances Llama‑3 (8B)’s Chinese language proficiency and scientific reasoning by using a carefully mixed corpus of existing and synthetic data, detailing the bilingual adaptation and synthetic‑enhancement stages, data‑mixing and curriculum strategies, and demonstrating strong results across multilingual and scientific benchmarks without sacrificing original capabilities.

BenchmarkingLLMLlama-3

0 likes · 9 min read

How Continual Pre‑Training Boosts Llama‑3’s Chinese and Scientific Reasoning

Baobao Algorithm Notes

Jul 25, 2024 · Artificial Intelligence

Why LLaMA 3 405B Matches GPT‑4o: Architecture, Training, and Industry Impact

The article provides an in‑depth analysis of LLaMA 3 405B, covering its dense Transformer architecture, three‑stage pre‑training (initial, long‑context, annealing), iterative post‑training with RM‑guided rejection sampling, the decision against MOE, and the broader implications for both large and small model development.

405BModel architecturemodel distillation

0 likes · 17 min read

Why LLaMA 3 405B Matches GPT‑4o: Architecture, Training, and Industry Impact

IT Services Circle

Jul 9, 2024 · Artificial Intelligence

Comparative Study of Classification Algorithms and Calibration Using Synthetic Data

This article presents a comprehensive case study that explains classification principles, shows the key formulas for logistic regression and SVM, and provides a full Python implementation that generates synthetic data, trains multiple classifiers, calibrates them, and visualizes calibration curves and probability histograms.

CalibrationPythonclassification

0 likes · 6 min read

Comparative Study of Classification Algorithms and Calibration Using Synthetic Data

Meituan Technology Team

Jun 13, 2024 · Artificial Intelligence

Overview of Meituan's Selected CVPR 2024 Papers and Online Sharing Event

Meituan's tech team highlights seven CVPR 2024 papers—spanning OCR pre‑training, long‑tail semi‑supervised learning, visual AIGC, audio‑visual segmentation and synthetic‑data detection—provides detailed abstracts and experimental results, and announces an online author‑talk session on June 27.

Audio-Visual SegmentationCVPR 2024Computer Vision

0 likes · 18 min read

Overview of Meituan's Selected CVPR 2024 Papers and Online Sharing Event

NewBeeNLP

Apr 22, 2024 · Artificial Intelligence

Why LLAMA‑3’s Scaling Laws Signal the Next AI Frontier

The article analyzes LLAMA‑3’s architectural tweaks, massive data expansion, scaling‑law implications, open‑source versus closed‑source dynamics, and the critical role of synthetic data in sustaining large‑model progress beyond 2025.

LLAMA-3large language modelsopen-source AI

0 likes · 10 min read

Why LLAMA‑3’s Scaling Laws Signal the Next AI Frontier

DataFunSummit

Nov 29, 2023 · Artificial Intelligence

AIGC and Causal Inference: Mutual Empowerment and Applications with YLearn

This article explores how generative AI (AIGC) can be used to synthesize structured data, how synthetic data supports causal inference, and how agent‑based modeling and the YLearn framework together enable advanced causal discovery, effect estimation, and scenario simulation for enterprise AI applications.

AIGCAgent-Based ModelingArtificial Intelligence

0 likes · 16 min read

AIGC and Causal Inference: Mutual Empowerment and Applications with YLearn

Model Perspective

Oct 9, 2023 · Fundamentals

Unpacking Gender Wage Gaps: Oaxaca‑Blinder, Regression & Simulated Data

This article reviews Claudia Goldin’s Nobel‑winning research on gender wage disparities, explaining the Oaxaca‑Blinder decomposition, multiple linear regression, and mean‑difference models, and demonstrates their application with a synthetic dataset and Python code to illustrate how education, experience, and gender affect wages.

Oaxaca-Blindergender wage gaplabor economics

0 likes · 10 min read

Unpacking Gender Wage Gaps: Oaxaca‑Blinder, Regression & Simulated Data

DataFunSummit

Sep 4, 2023 · Artificial Intelligence

AIGC and Causal Inference: Mutual Empowerment and Applications with YLearn

This article explores how generative AI (AIGC) can be used to synthesize structured data, how synthetic data and agent‑based modeling support causal inference, and introduces the YLearn framework for end‑to‑end causal learning, highlighting practical use cases and research directions.

AIGCAgent-Based ModelingYLearn

0 likes · 15 min read

DataFunTalk

Nov 22, 2022 · Artificial Intelligence

NVIDIA's Advances in Multi‑Role Generative Dialogue Modeling and Synthetic Data‑Driven QA

This article reviews NVIDIA's recent work on multi‑role generative dialogue modeling using GPT‑2‑based architectures and on enhancing question‑answering systems with synthetic data pipelines, covering model design, data preparation from Reddit, extensive experiments, scaling effects, and practical Q&A insights.

GPT-2Generative DialogueModel Scaling

0 likes · 17 min read

NVIDIA's Advances in Multi‑Role Generative Dialogue Modeling and Synthetic Data‑Driven QA

Code DAO

Dec 11, 2021 · Artificial Intelligence

Using DCGAN to Generate Synthetic Marine Plastic Images

This article explains how to apply a Deep Convolutional GAN in PyTorch to create realistic synthetic images of marine plastic, addressing dataset scarcity, detailing the network architecture, training procedure, and showing loss curves and generated samples.

DCGANGANMarine Plastic

0 likes · 13 min read

Using DCGAN to Generate Synthetic Marine Plastic Images

Kuaishou Large Model

Dec 17, 2020 · Artificial Intelligence

How KAIFX Generates High‑Quality Virtual Data for AI Training

This article explains how KAIFX, a synthetic data platform built on computer graphics and AI techniques, tackles challenges of data scarcity, realism, labeling bias, and management to boost AR and 3D face reconstruction model performance.

3D face reconstructionAIAR

0 likes · 12 min read

How KAIFX Generates High‑Quality Virtual Data for AI Training