Tagged articles
98 articles
Machine Heart
May 14, 2026 · Artificial Intelligence

How PsiBot Uses 100,000 Hours of Human Data to Power Embodied Intelligence

PsiBot demonstrates that, with a 100,000‑hour human‑operation dataset captured via exoskeleton gloves and ego‑vision, a world‑model (W0) and reinforcement‑learning policy (R2) can bridge the gap to robot control, offering a scalable alternative to costly teleoperation pipelines.

Embodied AI · Robotics · data collection
0 likes · 12 min read
Weekly Large Model Application
May 5, 2026 · Artificial Intelligence

Why More GPUs and Data Aren’t Enough: Defining Scenarios and Data for Speech Model Training

The article argues that successful speech model training starts with understanding user scenarios, then selecting appropriate data, and finally choosing metrics, detailing six key questions, data sourcing strategies, evaluation criteria, and compliance considerations to avoid the misconception that sheer data volume guarantees performance.

AI training · Model Evaluation · data collection
0 likes · 6 min read
Machine Heart
Apr 13, 2026 · Artificial Intelligence

How Six‑Dimensional Force Data Powers China’s First Full‑Perception VTLA Model

The article analyzes how Kepler Robotics’ dual‑path, six‑degree‑of‑freedom force‑tactile data collection system overcomes the scaling bottleneck of embodied AI, enabling a VTLA model that integrates vision, language, action and tactile feedback to achieve near‑perfect industrial assembly performance.

Embodied AI · Kepler Robotics · VTLA model
0 likes · 14 min read
Machine Heart
Apr 7, 2026 · Artificial Intelligence

How Qianxun Raised ¥3 B in 30 Days: AI‑Powered Robotics Secrets

Qianxun Intelligent secured ¥3 billion in funding within a month, leveraged a scaling‑law data engine and the Spirit v1.5 VLA model to achieve breakthrough robot performance, and demonstrated the commercial loop through deployments at JD.com retail and CATL battery lines.

Embodied AI · Qianxun Intelligent · Robotics
0 likes · 12 min read
Machine Heart
Apr 6, 2026 · Artificial Intelligence

How HumDex Overcomes Humanoid Robot Data Bottlenecks with Low‑Cost Full‑Body Dexterous Control

HumDex combines wireless inertial motion capture, a learning‑based hand‑redirection network, and a two‑stage pre‑training/fine‑tuning pipeline to deliver portable, high‑precision teleoperation for humanoid robots, cutting data‑collection time by 26% and raising remote‑operation success to 91.7% while enabling zero‑shot generalization across unseen objects and scenes.

data collection · humanoid robot · inertial motion capture
0 likes · 9 min read
Machine Heart
Apr 3, 2026 · Artificial Intelligence

How Foundation Models Are Transforming Embodied Navigation from Task‑Specific to General Intelligence

This survey systematically reviews how foundation models reshape embodied navigation, covering problem definition, taxonomy of tasks and robot forms, system architecture from perception to control, data sources and training strategies, edge deployment techniques, benchmark metrics, and future research directions.

Benchmark · Multimodal AI · data collection
0 likes · 11 min read
Alibaba Cloud Native
Feb 5, 2026 · Cloud Native

How to Cut Cross‑Cloud Data Transfer Costs with CDN and LoongCollector

In multi‑cloud environments, enterprises face high outbound traffic fees for unified observability, but by routing logs through a CDN and using the high‑performance LoongCollector agent, they can reduce cross‑cloud transfer costs by up to 70%, improve throughput, and simplify deployment.

CDN · cloud-native · cost-optimization
0 likes · 10 min read
DataFunSummit
Jan 29, 2026 · Big Data

How to Slash Web Scraping Costs by 60%: Proven Strategies from a Bright Data Expert

In the era of massive AI model training, this article presents a step‑by‑step technical guide—covering the full data‑collection pipeline, three acquisition modes, IP‑type choices, bandwidth savings, path and mixed‑request optimizations, and business‑level cost controls—to reduce web‑scraping expenses by more than 60% while maintaining data quality.

AI · Automation · data collection
0 likes · 24 min read
DataFunSummit
Jan 17, 2026 · Artificial Intelligence

How UnrealZoo Accelerates Embodied AI Research with High‑Fidelity Simulation

This article outlines the evolution from traditional AI to embodied intelligence, explains the Vision‑Language‑Action (VLA) paradigm, highlights data‑collection bottlenecks, introduces the UnrealZoo simulation platform built on Unreal Engine, and showcases real‑world case studies and future challenges for embodied AI research.

Embodied AI · Robotics · Unreal Engine
0 likes · 16 min read
Old Meng AI Explorer
Dec 10, 2025 · Operations

How Spider_XHS Turns Xiaohongshu Data Collection into a 10× Efficiency Boost

Spider_XHS is an open‑source Xiaohongshu crawler that automates note, user, comment, and message extraction, offers watermark‑free media downloads, exports structured Excel/JSON data, integrates with the creator platform, and includes proxy and anti‑ban features, enabling marketers and researchers to cut weeks of manual work to hours.

Automation · Web Scraping · Xiaohongshu
0 likes · 10 min read
Alibaba Cloud Developer
Oct 27, 2025 · Artificial Intelligence

How to Build a Quantifiable AI Coding Efficiency Metric System

This article explains how, amid the rapid rise of AI‑assisted programming, a scientific and actionable R&D efficiency metric framework was designed, detailing core indicators such as AI code adoption rate, data collection methods, platform architecture, and practical insights from a large‑scale implementation.

AI · MCP · coding
0 likes · 18 min read
Alibaba Cloud Observability
Oct 20, 2025 · Mobile Development

Accelerate iOS Issue Diagnosis with Cloud‑Native Data Collection SDK

Mobile developers often struggle with unreproducible crashes and lag reported by users, spending days sifting through logs and isolated stack traces; this article explains how a cloud‑native iOS SDK links performance metrics, error logs, and user behavior through systematic data collection to dramatically speed up issue diagnosis.

Method Swizzling · Mobile Development · Performance Monitoring
0 likes · 9 min read
Alibaba Cloud Native
Jul 29, 2025 · Cloud Native

How LoongCollector Redefines Cloud‑Native Observability for AI Workloads

LoongCollector, the core component of Alibaba Cloud's LoongSuite, delivers zero‑intrusion, multi‑tenant, high‑performance data collection and processing for AI services, integrating logs, metrics, traces, events, and profiles into a unified, programmable pipeline that scales elastically across heterogeneous GPU clusters.

AI · cloud-native · data collection
0 likes · 17 min read
Ctrip Technology
Jul 10, 2025 · Frontend Development

How Visual Event Tracking Transforms Front‑End Data Collection: Inside Ctrip’s Hetu System

This article examines Ctrip's Hetu visual event‑tracking system, comparing it with traditional code‑based tracking, detailing its architecture, SDK modes, configuration workflow, and best‑practice implementations that improve efficiency, reduce coupling, and enhance data reliability across web, app, and mini‑program platforms.

SDK · data collection · event tracking
0 likes · 22 min read
DataFunSummit
Feb 25, 2025 · Artificial Intelligence

Collecting High-Quality LLM Training Data and Custom Model Training Guide

This article explains what constitutes high‑quality LLM training data, why large datasets are essential, outlines the step‑by‑step process for collecting, preprocessing, and fine‑tuning models, and highlights the best data sources—including web content, books, code repositories, and news—while noting available free datasets.

AI · LLM · Web Scraping
0 likes · 9 min read
Baobao Algorithm Notes
Dec 23, 2024 · Artificial Intelligence

From Zero to One: A Practical Guide to Pretraining Large Language Models

This comprehensive guide walks through every stage of building a large‑language‑model pretraining pipeline—from data sourcing, cleaning, and deduplication, to tokenizer design, model architecture choices, training framework selection, optimization tricks, and evaluation methods—providing actionable tips and pitfalls to avoid for both newcomers and seasoned practitioners.

LLM Pretraining · data collection · scaling laws
0 likes · 33 min read
Python Programming Learning Circle
Dec 10, 2024 · Big Data

23 Python Web Scraping Projects with GitHub Links

This article compiles twenty‑three Python web‑scraping projects, each described with its purpose, key features, and a direct GitHub repository link, offering developers a ready‑made toolbox for data collection across platforms such as WeChat, Douban, Zhihu, Bilibili, and more.

GitHub · Python · Scrapy
0 likes · 9 min read
High Availability Architecture
Nov 4, 2024 · Operations

Ctrip's Weak Network Identification Model: Design, Implementation, and Practice

This article details Ctrip's approach to weak network detection, covering background, data collection, processing, dynamic weighting algorithms, result output, deployment effects, and future plans, and provides practical code examples and threshold settings for improving mobile network performance.

Weak Network Detection · data collection · dynamic weighting
0 likes · 26 min read
JavaEdge
Oct 7, 2024 · Big Data

Master Data Analysis: From Collection to Visualization

This guide explains why data analysis is essential, breaks it into three core stages—data collection, data mining, and data visualization—offers practical tool recommendations, and presents principles for efficient learning and skill development.

Big Data · Data visualization · Python
0 likes · 10 min read
NewBeeNLP
Sep 25, 2024 · Artificial Intelligence

From Zero to One: A Practical Guide to Pretraining Large Language Models

This comprehensive guide walks through every stage of LLM pretraining—from data sourcing, cleaning, and deduplication, to tokenizer design, model architecture choices, training framework selection, optimization tricks, and evaluation methods—offering actionable tips and pitfalls to avoid.

LLM Pretraining · Training Framework · data collection
0 likes · 32 min read
Zhuanzhuan Tech
Aug 28, 2024 · Big Data

Quality Inspection Data Collection: Design, Architecture, and Applications

This article outlines the design, architecture, and practical applications of a quality inspection data collection system, covering data point structures, reporting mechanisms, compliance analysis, intelligent strategy iteration, and BI dashboards, illustrating how big‑data techniques enable digital transformation of inspection processes.

BI · Big Data · compliance
0 likes · 10 min read
Wukong Talks Architecture
Aug 5, 2024 · Operations

Comprehensive Case Study of Large‑Scale Desktop IT Management and Automated Fault Detection at Ctrip

This article presents a detailed case study of Ctrip's large‑scale desktop IT management solution, describing the challenges of handling tens of thousands of office PCs, the full‑link architecture built with Rust, Tauri, SpringBoot and Django, automated health monitoring, fault detection, remediation workflows, security measures, performance optimizations, and the measurable operational improvements achieved.

Automation · Desktop Management · IT Operations
0 likes · 16 min read
Python Programming Learning Circle
Jun 5, 2024 · Backend Development

Various Python Methods for E‑commerce Data Collection and Web Scraping

This article introduces ten practical Python techniques—including requests, Selenium, Scrapy, Crawley, PySpider, aiohttp, asks, vibora, Pyppeteer, and Fiddler‑based reverse engineering—to efficiently collect e‑commerce and app data while addressing common challenges such as IP blocking, captchas, and authentication.

Scrapy · Selenium · aiohttp
0 likes · 8 min read
Alibaba Cloud Developer
May 31, 2024 · Frontend Development

Unlocking Browser Extension Power: Real‑World Scenarios and Implementation Guides

This article explores browser extension use cases—from product capabilities and data‑collection tools to daily utilities—provides detailed implementation ideas for each scenario, and walks through concrete code examples such as traffic recording, behavior analysis, and proxy plugins, helping developers build practical, production‑ready Chrome extensions.

Performance Testing · Web Development · browser extensions
0 likes · 20 min read
Model Perspective
May 21, 2024 · Fundamentals

How to Turn Mathematical Modeling from Theory into Real‑World Solutions

This article outlines practical steps—understanding problem background, gathering quality data, selecting appropriate models, solving and analyzing them, and applying results—to ensure mathematical modeling moves beyond theory and effectively addresses real-world issues.

Case Study · Model Selection · Project Management
0 likes · 9 min read
Huolala Tech
Mar 26, 2024 · Backend Development

How We Built a Real‑Time Road‑Testing Platform to Boost Navigation Accuracy

This article details the design and implementation of a road‑testing platform that improves navigation accuracy by integrating data collection, real‑time video streaming, WebSocket communication, and automated reporting, resulting in faster issue detection, reduced manual effort, and measurable efficiency gains across multiple cities.

OBS · WebSocket · data collection
0 likes · 16 min read
DeWu Technology
Mar 6, 2024 · Frontend Development

Visual Event Tracking Solution: Architecture, Implementation, and Practices

The visual event tracking solution replaces costly manual code instrumentation with a Data‑Trackid and relative XPath system, a VS Code plugin for automatic ID generation, and an SDK that captures clicks and exposures, dynamically loads third‑party analytics, and provides validation, monitoring, and future decoupling for scalable, real‑time product analytics.

Event Analytics · SDK · data collection
0 likes · 9 min read
ByteFE
Jan 26, 2024 · Frontend Development

A Comprehensive Guide to Frontend Event Tracking (埋点)

This article explains what frontend event tracking (埋点) is, why it is essential for product analytics, when and how to implement it, the different tracking models and reporting methods, as well as practical tips, iteration processes, and common pitfalls for developers and product teams.

Analytics · Web · data collection
0 likes · 18 min read
Rare Earth Juejin Tech Community
Jan 25, 2024 · Frontend Development

Front-End Event Tracking (埋点) – Fundamentals, Types, and Best Practices

This article provides a comprehensive guide to front‑end event tracking, covering its definition, motivations, scenarios, various tracking types, data models, reporting mechanisms, implementation steps, security considerations, and practical tips for ensuring accurate and non‑blocking data collection in web applications.

Analytics · data collection · event tracking
0 likes · 23 min read
DataFunSummit
Oct 19, 2023 · Big Data

Design and Evolution of Zhihu's Event Tracking (埋点) System

This article presents a comprehensive overview of Zhihu's event‑tracking system, covering its motivation, toolset, demand‑management platform, verification workflow, data‑collection pipeline, query service architecture, cloud‑native data service design, and practical Q&A on best practices and optimization strategies.

Cloud Native · Software Engineering · data collection
0 likes · 12 min read
FunTester
Sep 1, 2023 · Operations

Observability in the Cloud‑Native Era: Data Collection Strategies and Sampling Techniques

The article explains how cloud‑native observability systems gather massive telemetry from infrastructure, containers, middleware and services, compares direct push and file‑based collection approaches, and details head, tail and local sampling methods to optimize data completeness and performance.

Distributed Tracing · Observability · Performance Optimization
0 likes · 10 min read
DevOps Cloud Academy
Aug 29, 2023 · Cloud Native

Observability and Data Collection Strategies in Cloud‑Native Environments

The article explains that while observability is not new, cloud‑native systems have driven rapid development of observable platforms, detailing data collection architectures, direct push versus file‑based approaches, and various sampling techniques (head, tail, and local sampling) to balance completeness, real‑time reporting, and performance impact.

Sampling · cloud-native · data collection
0 likes · 11 min read
vivo Internet Technology
Apr 26, 2023 · Backend Development

Design and Evolution of Vivo's Points Task System

Vivo’s Points Task System evolved from a simple configuration‑driven task model into a scalable, multi‑source behavior incentive platform that uses an AviatorScript engine, unified SDK, and three isolated services—event collection, computation, and task handling—to deliver configurable tasks, real‑time rewards, and flexible user notifications while ensuring stability and extensibility.

Points System · behavior SDK · data collection
0 likes · 14 min read
DataFunSummit
Jan 12, 2023 · Big Data

Industrial IoT Data Collection Platform: Neuron v2.0 Architecture, Design, and Case Studies

This article presents a comprehensive overview of EMQ's Neuron industrial IoT data collection platform, detailing the lessons learned from version 1.x, the redesigned v2.0 architecture, core modules, plugin mechanisms, data‑tag management, eKuiper integration, and two real‑world case studies in oil‑field and smart‑factory environments.

Big Data · IoT · data collection
0 likes · 16 min read
Alibaba Terminal Technology
Jan 5, 2023 · Mobile Development

Why Mobile Trace Is Hard and How OpenTelemetry Solves It

This article explores the challenges of end‑to‑end tracing on mobile apps, explains why issues are hard to reproduce, and presents a four‑step solution using a unified OpenTelemetry standard, automated data linking, performance optimizations, and machine‑learning‑driven root‑cause analysis.

Android · Observability · OpenTelemetry
0 likes · 20 min read
Dada Group Technology
Dec 30, 2022 · Fundamentals

Ensuring Trustworthy A/B Experiments: Architecture, Balance Checks, Log Consistency, Automated Significance Testing, and Result Interpretation

This article discusses how to improve the reliability of online A/B experiments by designing robust architecture, evaluating group balance with orthogonal testing, ensuring consistent front‑end/back‑end logging, automating statistical significance checks, reducing group imbalance, and interpreting results using causal trees.

A/B testing · causal trees · data collection
0 likes · 12 min read
DataFunTalk
Nov 11, 2022 · Product Management

Data Tracking (埋点) Application Scenarios, Workflow, and the Seven‑Word Guideline

This article explains the concept of data tracking (埋点), outlines its key application scenarios such as exposure, click, and page‑event tracking, describes the end‑to‑end workflow from requirement gathering to deployment and post‑analysis, and summarizes the practical “seven‑word” checklist for successful implementation.

Data Tracking · data collection · product analytics
0 likes · 12 min read
Alibaba Cloud Developer
Sep 9, 2022 · Information Security

How to Build a Comprehensive Cloud‑Native Kubernetes Security Monitoring System

This article examines the evolving security risks of cloud‑native architectures, explains why traditional perimeter defenses are insufficient, introduces zero‑trust principles for Kubernetes, outlines common K8s threat vectors, and presents a complete data‑collection and monitoring solution based on the open‑source iLogtail agent.

Kubernetes · Observability · Zero Trust
0 likes · 30 min read
Efficient Ops
Aug 31, 2022 · Operations

How Intelligent Operations and Observability Transform Cloud‑Native Environments

In this talk, Wu Yakun from Guance Cloud explains the shortcomings of traditional operations, introduces intelligent, data‑driven approaches for the cloud‑native era, and outlines how unified data collection, observability, and SLO‑based monitoring can dramatically improve fault detection and system reliability.

Intelligent Operations · Observability · SLO
0 likes · 16 min read
Model Perspective
Aug 31, 2022 · Fundamentals

How to Build a Watermelon Sweetness Dataset: From Field to Features

This article describes how the author collected a watermelon dataset, defined measurable features such as size, color, sugar content, seed count, and texture, and documented the process with photos, tables, and a brief discussion of data characteristics for future machine‑learning analysis.

data analysis · data collection · feature engineering
0 likes · 12 min read
HomeTech
Aug 30, 2022 · Big Data

Real‑time Data Collection SDK Visualization: Architecture, Implementation and Usage Guide

This article introduces a data‑collection SDK with a real‑time visualization feature, explains the shortcomings of traditional packet‑capture and log‑based methods, describes the underlying architecture—including a new SDK entry, encrypted reporting, WebSocket communication and Elasticsearch storage—and provides step‑by‑step usage instructions for developers.

Elasticsearch · WebSocket · data collection
0 likes · 8 min read
Python Crawling & Data Mining
Aug 12, 2022 · Big Data

Master the Big Data Ecosystem: 9 Core Technology Frameworks Explained

This article provides a comprehensive overview of the big data ecosystem, detailing nine essential technology categories—including data collection, storage, computation, analysis, resource management, retrieval, underlying infrastructure, and cluster installation—while comparing popular tools and illustrating their typical use‑cases with diagrams.

Cluster Management · data collection · data storage
0 likes · 11 min read
政采云技术
Jun 28, 2022 · Frontend Development

Frontend Data Collection and Monitoring: Implementation and Evolution

This article explores frontend data collection and monitoring systems, covering why they are needed, what they can achieve, existing implementation schemes, and future evolution directions including performance data reporting and interface data monitoring.

Data visualization · Exception Monitoring · Performance Tracking
0 likes · 9 min read
Architecture Digest
May 23, 2022 · Big Data

Overview of Core Technologies in a Big Data Platform Architecture

This article explains the main layers of a typical big data platform—data collection, storage and analysis, sharing, and application—detailing common tools such as Flume, DataX, Hive, Spark, SparkSQL, Impala, and Spark Streaming, and discusses task scheduling and monitoring in the ecosystem.

Data Platform · DataX · Hadoop
0 likes · 10 min read
Meituan Technology Team
May 5, 2022 · Databases

Database Autonomy Service (DAS): Architecture, Design, and Implementation

The Database Autonomy Service (DAS) is a platform that uses big‑data, machine‑learning, and expert knowledge to automatically collect, compress, and analyze MySQL metrics, providing self‑service fault detection, root‑cause diagnosis, and security management, thereby reducing manual effort, shortening MTTR, and supporting Meituan’s rapid database growth.

AI-driven ops · Database Autonomy · Performance Monitoring
0 likes · 20 min read
YunZhu Net Technology Team
Feb 24, 2022 · Big Data

Design and Implementation of a Comprehensive Monitoring System for a Big Data Platform

This article describes the end‑to‑end design, metric hierarchy, data collection methods, visualization dashboards, and alerting mechanisms used to build a robust monitoring system for a large‑scale big‑data platform, covering physical hosts, Hadoop components, business services, and data layers with tools such as Telegraf, Prometheus, and Grafana.

Alerting · Grafana · Prometheus
0 likes · 14 min read
政采云技术
Dec 16, 2021 · Big Data

What Is Event Tracking (埋点) and Its Implementation in a Data Analysis System

This article explains the concept of event tracking (埋点), its importance for capturing user behavior, outlines the four‑module architecture of a tracking system, compares code‑based, visual and full tracking methods, describes data models, storage, management, and presents a practical case study with analysis techniques.

Analytics · Backend · Big Data
0 likes · 15 min read
Python Programming Learning Circle
Oct 22, 2021 · Backend Development

Python Project for Simulating Login and Web Scraping Across Multiple Websites

This article introduces a Python-based project that demonstrates how to log into and scrape data from 18 major websites—including Facebook, Twitter, Zhihu, and Bilibili—using methods such as Selenium, direct HTTP requests, and cookie management, providing code examples and future improvement plans.

Login Automation · Selenium · data collection
0 likes · 4 min read
Amap Tech
Mar 12, 2021 · Artificial Intelligence

High‑Precision Maps for Autonomous Driving: Production System and Technical Insights

Gaode GM Xiang Zhe describes the company’s high‑precision map platform, covering its three‑stage production pipeline, multi‑layer map architecture, and tiered data‑collection strategy, which together address city‑road challenges, ensure map freshness, advance positioning and perception algorithms, and support commercial Level‑4 autonomous‑driving deployments.

autonomous driving · data collection · high-precision map
0 likes · 11 min read
Youzan Coder
Dec 25, 2020 · Big Data

Metadata Governance and Collection in a Data Asset Platform

The platform implements comprehensive metadata governance by extracting, standardizing, and ingesting basic, trend, resource, lineage, and task metadata from offline and real‑time systems via a Kafka‑based SDK, enabling unified storage, monitoring, alerts, and future automation to improve data asset visibility and quality.

Big Data · Data Governance · SDK
0 likes · 18 min read
Zhengtong Technical Team
Oct 27, 2020 · Mobile Development

Implementing Mobile Data Collection and Analytics with Countly: Architecture, Customization, and Insights

This article outlines how to design and implement a comprehensive mobile data collection and analysis system using the open‑source Countly platform, covering background requirements, solution selection, architecture, customizations for client, server and dashboard, SDK integration for Android and H5, and practical data mining insights.

Android SDK · Countly · H5 SDK
0 likes · 11 min read
Programmer DD
Sep 24, 2020 · Databases

Build a One-Day Survey & Lottery System with SeaTable Scripts

Learn how to quickly create a questionnaire using SeaTable, automatically collect responses, and run a custom script to randomly select winners, providing a fast, low‑code solution for urgent business needs without a full development cycle.

Automation · Lottery · SeaTable
0 likes · 7 min read
JD.com Experience Design Center
Jul 31, 2020 · Frontend Development

Scaling JD’s 福礼 Platform: Frontend Architecture, Component Library & Cross‑Platform Lessons

This article chronicles the rapid evolution of JD’s 福礼 employee‑benefits platform, detailing its Vue‑based frontend architecture, custom build tools, NutUI component library adoption, data‑collection strategies, multi‑device integration, development‑efficiency hacks, and collaborative processes that together drove a 265% YoY active‑user growth.

Component Library · Cross‑Platform Integration · Frontend Architecture
0 likes · 24 min read
Aikesheng Open Source Community
Jun 22, 2020 · Operations

Introduction to the Prometheus Data Collection Process

This article explains the complete Prometheus data collection workflow, covering key concepts such as targets, samples, and meta labels, detailing the relabeling steps, configuration options, example use‑cases, and the final scrape and storage phases for effective monitoring.

Configuration · Prometheus · data collection
0 likes · 8 min read
58 Tech
May 25, 2020 · Backend Development

ZLog: A Comprehensive Data Collection, Reporting, and Analysis Service for Mobile and PC Platforms

The article introduces ZLog, a unified data‑service solution that offers end‑to‑end collection, reporting, processing, and analysis for mobile (Android/iOS) and PC applications, explains its layered architecture, SDK design, verification of data accuracy, and showcases real‑world applications that improve debugging, security, and user‑behavior insights.

Data Analytics · Mobile Development · data collection
0 likes · 11 min read
Tencent Cloud Developer
May 19, 2020 · Cloud Computing

Design and Architecture of a Distributed Atmospheric Monitoring System

Tencent’s rapid‑deployment project creates a volunteer‑hosted, low‑cost atmospheric monitoring network using LoRa, NB‑IoT and Wi‑Fi, with edge‑agnostic data collection, cloud‑based device normalization, and business processing modules that manage devices, analyze PM2.5 sensor data, and visualize real‑time air quality across five community sites.

Atmospheric Monitoring · IoT · LoRa
0 likes · 10 min read
政采云技术
May 17, 2020 · Frontend Development

Building a User Behavior Data Collection and Analysis System (Hunyi) – Frontend Team Experience

This article describes how the frontend team designed and implemented a comprehensive user behavior data collection and analysis platform, covering its business value, overall architecture, SDK-based data gathering, event interception, processing pipelines, analytics dashboards, and practical insights for product and operations teams.

Analytics · SDK · data collection
0 likes · 15 min read
Taobao Frontend Technology
May 17, 2020 · Frontend Development

How to Build a Scalable Frontend A/B Testing Framework

This article explains the design of a standardized, simple, and efficient front‑end A/B testing pipeline, covering experiment configuration, data models, platform architecture, runtime JSSDK, traffic‑splitting strategies, and data back‑flow to enable reliable, data‑driven product decisions.

A/B testing · Experiment Platform · JSSDK
0 likes · 16 min read
Alibaba Terminal Technology
Apr 27, 2020 · Frontend Development

Designing a Scalable Frontend AB Testing Framework: From Config to Runtime

This article outlines a comprehensive, standardized front‑end AB testing architecture that separates experiment configuration and data chains, introduces a JSSDK with Core and Coupler packages, and explains traffic‑splitting models, data back‑flow, and extensibility across multiple front‑end DSLs.

AB testing · Frontend Architecture · JSSDK
0 likes · 16 min read
Xianyu Technology
Mar 26, 2020 · Big Data

Scalable User Behavior Data Collection and Auto-Generated Datasets for Xianyu

Xianyu created a highly extensible user‑behavior collection framework that standardizes data into a common ODPS schema, uses JavaScript Proxy to intercept navigation and API calls, maps business metrics via JSON, and aggregates reports, cutting dataset‑creation effort from days to minutes while avoiding heavy full‑tracking overhead.

Analytics · Big Data · JavaScript
0 likes · 9 min read
Programmer DD
Jan 27, 2020 · Product Management

How the Wuhan COVID‑19 Data Platform Streamlines Resource Coordination

The Wuhan New Coronavirus Prevention Information Collection Platform aggregates hospital, hotel, factory, logistics, donation, and treatment data through manual reporting and automated processing, enabling real‑time information sharing and efficient allocation of social resources during the pandemic.

COVID-19 · Resource Coordination · Web Platform
0 likes · 3 min read
Youzan Coder
Aug 14, 2019 · Big Data

Comprehensive Guide to Data Collection, Event Modeling, and Tracking in Big Data Platforms

The guide explains how comprehensive data collection in big‑data platforms relies on a standardized event model, passive and code‑based embedding, multi‑platform SDKs, a log‑middleware layer, precise location tracking, and an embedding management platform that supports workflow, testing, quality monitoring, and scalable infrastructure for future enhancements.

Analytics · Big Data · Log Processing
0 likes · 19 min read
转转QA
Jul 31, 2019 · Mobile Development

Automating Mobile App Packaging, Testing, and Release Management

The article outlines how to automate the end‑to‑end mobile app packaging workflow—from code submission and continuous integration to data collection, automated testing, and release management—highlighting the benefits of reducing manual effort, improving reliability, and enabling comprehensive historical package tracking.

Automation · Mobile · app-packaging
0 likes · 6 min read
Architects' Tech Alliance
Jul 30, 2019 · R&D Management

Information Collection Techniques for Industry Research

The article outlines systematic approaches and practical tips for conducting industry research, covering the overall research framework, step‑by‑step information‑gathering methods, categorisation of data sources, efficiency‑boosting tactics, and best practices for deep interviews to derive actionable business insights.

Consulting · Information Gathering · Market analysis
0 likes · 15 min read
DataFunTalk
Jul 1, 2019 · Artificial Intelligence

Data-Driven Foundations for Building Recommendation Systems

The article explains how data serves as a critical asset for recommendation systems, outlining the necessary steps from understanding business problems and data dimensions to collection, cleaning, integration, and analysis, while distinguishing explicit and implicit user feedback and emphasizing data quality, timeliness, and relevance.

Data Quality · ETL · Recommendation Systems
0 likes · 11 min read
Mafengwo Technology
Apr 18, 2019 · Frontend Development

How to Build an Efficient Front‑End Monitoring Data Collection System

This article explains why front‑end monitoring is essential for user experience, outlines the key data types to collect, and provides practical AOP‑based implementations for route changes, JavaScript errors, performance metrics, resource failures, API calls, and reliable log reporting.

JavaScript · aop · data collection
0 likes · 14 min read
MaGe Linux Operations
Apr 4, 2019 · Backend Development

Build a Python Crawler to Auto‑Collect TV Drama Download Links

This article describes how the author built a Python web crawler to automatically generate numeric URLs, fetch TV drama pages from the 天天美剧 site, extract ed2k download links using regular expressions, and save them into organized text files, streamlining the download process with Thunder.

Crawler · data collection · multithreading
0 likes · 6 min read
Architect's Tech Stack
Oct 23, 2018 · Fundamentals

Common Data Collection Challenges in Startups and Practical Solutions

The article examines three typical data collection problems faced by startups—unclear collection methods, chaotic tracking points, and poor collaboration between data and engineering teams—and offers practical strategies such as adopting full‑event models, appointing data architects, and securing top‑down support to achieve reliable, comprehensive analytics.

Analytics · Data Governance · data collection
0 likes · 10 min read
Efficient Ops
Sep 29, 2018 · Operations

Agent vs Wire Data: Which APM Method Delivers Real‑Time Insight?

This article compares probe‑based agents and wire‑data collection for application performance monitoring, detailing their architectures, advantages, drawbacks, and cost implications, and concludes which approach best supports modern, real‑time operational intelligence.

APM · Wire Data · agent monitoring
0 likes · 17 min read
Baidu Intelligent Testing
Mar 2, 2018 · Mobile Development

Automated Performance Testing Solutions for Android and iOS Apps

The article outlines comprehensive automated performance testing approaches for Android and iOS applications, covering challenges of data accuracy, reliability and volume, and describing configurable UI automation, remote device management, data collection, and reporting mechanisms to enable scalable, low‑effort mobile testing.

Android · Performance Automation · UI automation
0 likes · 13 min read
Efficient Ops
Feb 5, 2018 · Operations

How WeChat Scales Massive Real-Time Monitoring: Design & Practices

This article details the architecture and practical techniques behind WeChat's large‑scale monitoring system, covering lightweight data collection, classification of real‑time, non‑real‑time and user‑specific metrics, anomaly detection algorithms, automated configuration, and high‑performance storage solutions for billions of events per minute.

Operations · Real-Time · data collection
0 likes · 14 min read
Meitu Technology
Dec 19, 2017 · Big Data

Meitu Internet Technology Salon Session 7: Practices in Recommendation Algorithms, Big Data, and Personalized Recommendation

At Meitu’s seventh Internet Technology Salon in Xiamen, over a hundred experts discussed recommendation algorithms and big‑data solutions, with talks on the Arachnia log‑collection system, the Naix distributed bitmap service, Meitu’s personalized recommendation pipeline challenges, and novel data‑missing‑theory models for improved performance.

Big Data · data collection · distributed bitmap
0 likes · 8 min read
58 Tech
Dec 15, 2017 · Big Data

Design and Architecture of WMDA: A Comprehensive User Behavior Analysis Platform

The article details WMDA, a no‑code and manual‑code data collection platform for PC, mobile and app that supports real‑time and offline user behavior analysis, describing its functional model, behavior taxonomy, five‑layer architecture, tracking techniques, circle‑selection, data services, streaming and batch processing pipelines, and related technologies such as Storm, Spark, Druid and Roaring Bitmap.

Big Data · Druid · Real-time Streaming
0 likes · 18 min read
Baidu Intelligent Testing
Oct 9, 2017 · Big Data

User Behavior Analysis: From Data Acquisition to Funnel Insights

The article explains how to move beyond macro app metrics by collecting offline and real‑time user data, storing it in HDFS, processing it with Spark, visualizing behavior paths as state‑machine trees, and performing branch‑funnel analysis to uncover conversion bottlenecks and improve product quality.

Analytics · Big Data · Funnel Analysis
0 likes · 5 min read
Qunar Tech Salon
Aug 18, 2017 · Operations

Hardware Automation Operations System at Qunar: Design, Implementation, and Lessons Learned

This article details Qunar's hardware automation operations platform, covering the hardware scope, pain points of manual processes, a five‑stage lifecycle, automated testing, data collection, fault handling, and the underlying Mesos‑Marathon‑Docker infrastructure that together improve efficiency, reliability, and cost control.

data collection · fault handling · hardware automation
0 likes · 21 min read
Ctrip Technology
May 18, 2017 · Backend Development

Design and Implementation of Ctrip's Real-Time User Data Collection System

This article details the design, technology selection, architecture, encryption, compression, and performance evaluation of Ctrip's real-time user data collection system, which leverages Java, Netty, Kafka, and Avro to achieve high throughput, low latency, and robust fault tolerance for mobile and web applications.

Backend Development · Netty · Performance Testing
0 likes · 17 min read
Liulishuo Tech Team
Aug 6, 2016 · Product Management

Structuring and Managing Data Collection Requirements with JSON and Git

By defining data collection (event tracking) requirements in a structured JSON format and storing them in Git with a web interface that abstracts version control, teams can standardize identifiers, validate data formats automatically, track changes via commit logs, and streamline collaboration between product managers, developers, and testers.

Git · JSON · data collection
0 likes · 7 min read
Baidu Intelligent Testing
Apr 12, 2016 · Product Management

User Feedback Analysis: Methods, Process, and Core Metrics

This article explains what user feedback is, why it should be analyzed, and provides a step‑by‑step methodology—including channel setup, data collection, coding, categorization, and statistical analysis—along with key performance indicators for monitoring feedback handling in product management.

categorization · data collection · feedback analysis
0 likes · 8 min read