ByteDance Data Platform
Author

The ByteDance Data Platform team empowers all ByteDance business lines by lowering data‑application barriers, aiming to build data‑driven intelligent enterprises, enable digital transformation across industries, and create greater social value. Internally it supports most ByteDance units; externally it delivers data‑intelligence products under the Volcano Engine brand to enterprise customers.

78 articles · 0 likes · 187 views · 0 comments
Recent Articles

Latest from ByteDance Data Platform

78 recent articles
ByteDance Data Platform
Apr 23, 2026 · Artificial Intelligence

How LanceDB Powers Enterprise‑Scale Memory in OpenClaw Agents

This article details the technical evaluation and deep integration of LanceDB as a memory plugin for the OpenClaw‑based ArkClaw agent platform, covering plugin selection, core enhancements such as hybrid retrieval, hierarchical memory, Autodream processing, Context Engine optimizations, Git‑style version control, and the vision of a unified edge‑cloud memory lake.

AI agents · ArkClaw · LLM Memory
0 likes · 12 min read
ByteDance Data Platform
Apr 16, 2026 · Artificial Intelligence

Unlock AI Data Lake Power with LAS CLI: 9 Operators in One Command

The LAS CLI, a command‑line interface for Volcano Engine's AI Data Lake, lets developers and AI agents invoke nine multimodal operators—covering audio, video, image, and document processing—through simple terminal commands, enabling token‑efficient, structured, and pipeline‑friendly data workflows.

AI data lake · LAS CLI · agent integration
0 likes · 14 min read
ByteDance Data Platform
Mar 13, 2026 · Artificial Intelligence

Beyond Parameters: How ClawLake Turns Agent Memory into Enterprise‑Level AI Infrastructure

The article explains why an AI agent's capabilities are limited by memory depth rather than model size, reviews three historical memory architectures, highlights their structural shortcomings, and details how the ClawLake solution provides a multi‑layer, multimodal, enterprise‑grade memory infrastructure for OpenClaw agents.

AI · Agent · Enterprise
0 likes · 17 min read
ByteDance Data Platform
Feb 11, 2026 · Databases

How ByteHouse Redefines Real‑Time Multimodal Analytics with a Cloud‑Native Data Warehouse

ByteHouse, ByteDance's cloud‑native data warehouse, evolves from a traditional warehouse to a next‑generation AI‑ready platform that handles 800+ PB of data, supports 25,000 nodes, and delivers real‑time, multimodal analytics through a decoupled storage‑compute architecture, AI‑driven query optimization, and native vector search integration.

AI Optimization · Real-time analytics · cloud-native
0 likes · 9 min read
ByteDance Data Platform
Feb 2, 2026 · Big Data

How StreamShield Powers Production‑Grade Resilience for Apache Flink at Massive Scale

ByteDance’s StreamShield delivers a three‑layer resiliency framework—engine self‑healing, hybrid replication at the cluster level, and chaos‑tested releases—that enables over 70,000 concurrent Flink jobs on 11 million CPU cores to meet strict SLAs with second‑level startup and robust fault tolerance.

Apache Flink · ByteDance · Real‑Time Computing
0 likes · 6 min read
ByteDance Data Platform
Jan 15, 2026 · Artificial Intelligence

Why Model Evaluation Can Be Cool: Innovative Automated Testing for Data‑Driven LLM Agents

In the era of rapidly advancing large‑model technology, the article outlines the challenges of evaluating data‑centric LLM agents, proposes a three‑layer evaluation framework covering basic capabilities, component‑level checks, and end‑to‑end business impact, and shares practical innovations such as semantic‑equivalence SQL matching, agent‑as‑judge pipelines, and a unified assessment platform.

Agent as judge · Data Agent · LLM evaluation
0 likes · 22 min read
ByteDance Data Platform
Dec 23, 2025 · Artificial Intelligence

How Daft and Ray Supercharge Million‑Hour Video Processing for AI‑Powered Robotics

This article details a scalable, distributed pipeline that uses LAS AI Data Lake, Daft on Ray, and advanced video‑processing techniques—scene detection, splitting, frame sampling, filtering, and caption generation—to transform tens of millions of hours of robot‑captured video into high‑quality, searchable semantic data while dramatically boosting CPU and GPU utilization.

AI pipeline · Daft · Distributed Computing
0 likes · 21 min read
ByteDance Data Platform
Oct 29, 2025 · Big Data

How Volcano Engine’s Multimodal Data Lake Tackles AI Agent Challenges

The article explores how Volcano Engine’s multimodal data lake architecture addresses the storage, compute, and management challenges of AI agents by introducing new formats like Lance, upgrading engines such as Spark and Daft, and providing unified tools for processing, versioning, and querying massive multimodal datasets.

Daft engine · Lance format · big data
0 likes · 13 min read
ByteDance Data Platform
Sep 24, 2025 · Artificial Intelligence

Why Data Agents Are the Next AI Frontier: Insights from Volcano Engine’s Journey

In this talk, Volcano Engine’s technical expert Chen Shuo explains the evolution of the Data Agent platform, the four‑quadrant framework for AI‑driven analytics, real‑world deployment challenges, architectural upgrades from pipeline to intelligent scheduling, and key lessons for building reliable, enterprise‑grade AI agents.

AI · Agent architecture · Data Agent
0 likes · 17 min read