Tagged articles
25 articles
Machine Heart
Apr 18, 2026 · Artificial Intelligence

Why Embodied Data Is the Biggest Gold Mine: Inside the World’s First Hundred‑Billion‑Scale Multimodal Data Cloud Mall

Paxini, together with JD Cloud, Tencent Cloud, and Baidu Intelligent Cloud, launches the world’s first hundred‑billion‑scale, full‑modal, high‑degree‑of‑freedom embodied AI data cloud mall, offering instant online data procurement, end‑to‑end model training pipelines, and validated performance gains in both lab and real‑world robot tasks.

Embodied AI · Model Training · Multimodal Data
0 likes · 13 min read
Tencent Advertising Technology
Sep 3, 2025 · Artificial Intelligence

Boosting Ads Revenue: LFM4Ads’ Full‑Representation Multi‑Granular Transfer Raises GMV 2.45%

Tencent's LFM4Ads introduces a full‑representation, multi‑granular knowledge transfer framework that moves user, item, and cross representations from a large foundation model to downstream tasks, achieving up to 2.45% platform GMV uplift across more than ten advertising scenarios.

Knowledge Transfer · ads recommendation · foundation model
0 likes · 12 min read
Zhuanzhuan Tech
Apr 3, 2024 · Backend Development

Design and Implementation of an Elasticsearch Data Synchronization Service (ECP) for Large‑Scale Order Data

This article describes the challenges and technical solutions for synchronizing billions of order records from a relational database to Elasticsearch, including multi‑source data reading, dynamic rate limiting, retry strategies, SPI‑based service integration, environment isolation, health‑checking, smooth migration, and structured logging, all implemented in a backend service called ECP.

Java · SPI · backend service
0 likes · 21 min read
dbaplus Community
Nov 15, 2023 · Databases

Scaling Bloom Filter for 800 Million OpenIDs in Redis

This article explains how to use a Bloom filter backed by Redis bitmap and Roaring Bitmap sharding to efficiently filter 800 million OpenID queries, covering memory planning, hash function selection, code implementation, and performance‑tuned batch write strategies.

Bitmap · Roaring Bitmap · backend optimization
0 likes · 13 min read
ITPUB
Oct 1, 2023 · Backend Development

Scaling Schema‑Free Classified Ads Platforms: Storage & Search for Billions

This article explains how to design a scalable architecture for classified‑ads platforms that handle billions of rows, on the order of ten thousand attributes, and on the order of a hundred thousand QPS, using vertical partitioning, unified post, category, and search services, compressed JSON extension fields, and external indexing.

Vertical Partitioning · large-scale data · scalable architecture
0 likes · 12 min read
Zhuanzhuan Tech
May 30, 2023 · Backend Development

Design and Architecture of a Checkout System: Scenarios, Features, Third‑Party Integration, and Large‑Scale Data Solutions

This article explains the background, key scenarios, functional components, third‑party payment capabilities, implementation logic, rule‑engine usage, and large‑scale data handling strategies of a checkout system, providing a comprehensive view of its backend architecture and operational considerations.

Backend · Payment Architecture · checkout
0 likes · 14 min read
DataFunTalk
Dec 17, 2022 · Artificial Intelligence

Multimodal Pre‑training Techniques and Applications – Overview, OPPOVL Dataset, Architecture, and Performance

This article presents a comprehensive overview of multimodal pre‑training, describing its motivation, architecture choices, large‑scale Chinese image‑text dataset construction, training optimizations, performance benchmarks, downstream applications, and a Q&A session that highlights practical deployment considerations.

Computer Vision · Deep Learning · Model architecture
0 likes · 16 min read
AntTech
Nov 28, 2022 · Information Security

Ant Group Anti‑Intrusion Platform: Architecture, Trillion‑Scale Detection, Risk Assessment, and Automated Response

This article details the evolution, architecture, and key technologies of Ant Group's anti‑intrusion platform, explaining how it handles trillion‑level data streams for intrusion detection, performs multi‑dimensional risk assessment and attribution, and enables rapid, automated security incident response across massive enterprise environments.

anti-intrusion · information security · intrusion detection
0 likes · 15 min read
DataFunTalk
Oct 28, 2022 · Big Data

Angel Graph: A High‑Performance Distributed Graph Computing Framework for Intelligent Risk Control

Angel Graph is a high‑performance, fault‑tolerant distributed graph computing framework developed by Tencent, featuring scalable node‑metric, community‑detection, and graph‑neural‑network algorithms optimized for billion‑node, trillion‑edge datasets, and demonstrated through practical applications in intelligent financial risk control.

Distributed Systems · community-detection · graph computing
0 likes · 20 min read
Xingsheng Youxuan Technology Community
Oct 28, 2022 · Backend Development

How We Processed 1 Million Images in Sub-Second: Backend Optimization Secrets

Facing a challenge of managing roughly one million server-side images and 180 client images, the TOOSIMPLE team built a high-performance backend using fingerprinting, parallel processing, mmap-SSE2 acceleration, and sparsemap indexing, achieving sub-second response times while ensuring correct ordered display.

Golang · Hashing · large-scale data
0 likes · 12 min read
ITPUB
Jun 9, 2022 · Artificial Intelligence

How 58’s Multi‑Label Image Recognition Boosts Semantic Search and Recommendations

This article details the design, data pipeline, model architecture, loss functions, and evaluation metrics of a large‑scale multi‑label image classification system built for 58.com, showing how it improves semantic similarity detection, recommendation, and content moderation across diverse business domains.

Computer Vision · Deep Learning · asymmetric loss
0 likes · 18 min read
Architecture Digest
Jun 7, 2022 · Big Data

Design and Optimization Strategies for Querying 100K Records from Tens of Millions Using ClickHouse, Elasticsearch, HBase, and RediSearch

This article examines a business requirement to filter up to 100,000 items from a pool of tens of millions, presenting and evaluating four technical solutions—multithreaded ClickHouse pagination, Elasticsearch scroll‑scan, an ES‑HBase hybrid, and RediSearch + RedisJSON—along with performance data and implementation details.

HBase · RediSearch · RedisJSON
0 likes · 10 min read
DataFunTalk
Feb 1, 2022 · Big Data

Kafka at Meituan: Practices, Challenges, and Optimizations for Large‑Scale Data Platforms

This article presents Meituan's large‑scale Kafka deployment, describing the current state and challenges of massive data ingestion, detailing latency‑reduction techniques, cluster‑level optimizations, SSD‑based caching, isolation strategies, full‑link monitoring, lifecycle management, and future directions for high availability.

Cluster Management · Kafka · Meituan
0 likes · 22 min read
Java Interview Crash Guide
Dec 2, 2021 · Databases

How Zhihu Scaled to Trillions of Rows with TiDB – Real‑Time Query Performance Insights

Zhihu’s Moneta service stores over a trillion rows and faces massive write and read loads; this article explains why TiDB was chosen, how its architecture and features such as HTAP, Raft, Titan and table partitioning enable millisecond‑level query latency, high availability, and seamless scaling.

HTAP · Performance Optimization · Scalability
0 likes · 15 min read
21CTO
May 18, 2021 · Big Data

How Baidu Scales Multimodal Image Search with the Imazon Platform

This article explains Baidu's multimodal retrieval system, detailing the offline and online pipelines, the image processing and indexing platform (Imazon), its architecture, key technologies such as ANN and GPU models, and the optimization practices that enable massive daily image ingestion and real‑time search at billion‑scale.

Baidu · Image Processing · large-scale data
0 likes · 13 min read
High Availability Architecture
May 18, 2021 · Big Data

Design and Optimization of Baidu's Image Processing and Multimodal Retrieval Platform (Imazon)

This article details Baidu's large‑scale image processing and multimodal retrieval system, describing its offline‑online architecture, massive data ingestion pipeline, ANN search techniques, performance metrics, infrastructure components, and a series of optimizations for throughput, cost, and reliability in a high‑volume streaming environment.

Baidu · Image Processing · Imazon
0 likes · 12 min read
Java Backend Technology
Mar 21, 2020 · Databases

How Zhihu Scaled to Trillions of Rows with TiDB: Lessons from Moneta

Zhihu’s Moneta service, handling over 1.3 trillion rows and billions of daily writes, migrated from MySQL sharding to TiDB, achieving millisecond query latency, high availability, and horizontal scalability, while sharing architectural choices, performance metrics, migration challenges, and future expectations for TiDB 3.0.

HTAP · MySQL Migration · Performance Optimization
0 likes · 16 min read
DataFunTalk
Feb 26, 2020 · Databases

ByteGraph: ByteDance’s Distributed Graph Database and Graph Computing System – Architecture, Data Model, and Practices

This article presents an in‑depth technical overview of ByteGraph, ByteDance’s self‑built distributed graph database and its accompanying graph‑computing engine, covering graph data characteristics, the directed‑property graph model, API design, three‑tier system architecture, storage strategies using KV stores and B‑Trees, hotspot handling, indexing, and future research directions.

B+Tree · ByteGraph · Gremlin
0 likes · 33 min read
Alibaba Cloud Developer
Aug 23, 2018 · Artificial Intelligence

How Alibaba’s “Cangjingge” Knowledge Engine Powers AI with Massive Graphs

Alibaba, together with top Chinese universities and research institutes, unveiled the Cangjingge Knowledge Engine project, detailing its massive data assets, five‑module architecture, large‑scale knowledge construction techniques, and initial deployments in safety and tourism knowledge graphs to boost AI applications.

AI · Alibaba · Knowledge Graph
0 likes · 9 min read
21CTO
Mar 22, 2017 · Artificial Intelligence

How Youku Tudou Revamped Its Video Recommendation Engine for Real‑Time Ranking

The Youku Tudou data team overhauled its video recommendation system by moving ranking from offline to online, detailing architectural changes, advantages, challenges, feature handling, offline evaluation, and model weight fusion to improve scalability and user experience.

AB testing · AI · System Architecture
0 likes · 7 min read