Tagged articles
9 articles
Alibaba Cloud Big Data AI Platform
Sep 16, 2024 · Artificial Intelligence

How TAG Makes LLM Inference Fully Asynchronous for Higher Throughput

As LLM architectures such as GQA, MLA, and MoE grow more complex, runtime overhead has become a bottleneck. This article analyzes Python performance, communication costs, and synchronous execution in current inference frameworks, introduces the fully asynchronous TAG architecture, and reports benchmark results showing higher throughput and lower latency.

GPU utilization · LLM inference · Runtime Optimization
12 min read
DataFunSummit
Jul 25, 2023 · Artificial Intelligence

Real‑Time Deep Learning Training with PAI‑ODL: Architecture, Pipeline, and Key Technologies

This article introduces PAI‑ODL, a real‑time deep‑learning training platform that supports online model updates for search, advertising, and recommendation scenarios, detailing its pipeline modules, system architecture, large‑scale sparse model techniques, incremental model export, embedding store design, and performance optimizations that together enable low‑latency, high‑throughput serving.

PAI ODL · Real-time Training · Runtime Optimization
19 min read
Bilibili Tech
Nov 29, 2022 · Big Data

How Bilibili Supercharged Flink: Checkpoint, HA, and Runtime Optimizations

This article details Bilibili's extensive enhancements to Flink's runtime—including checkpoint recoverability, operator ID stability, state processor extensions, hybrid high‑availability, regional checkpointing, and load‑based channel selection—to improve scalability, reliability, and operational efficiency of large‑scale streaming jobs.

Big Data · Checkpoint · Flink
32 min read
DataFunTalk
Apr 17, 2022 · Artificial Intelligence

DeepRec: Alibaba’s Sparse Model Training Engine – Architecture, Features, and Open‑Source Status

DeepRec, developed by Alibaba since 2016, is a specialized sparse‑model training engine that addresses feature elasticity, training performance, and deployment challenges through dynamic elastic features, optimized runtimes, distributed training frameworks, incremental model export, and multi‑level storage. It is now being open‑sourced for broader industry collaboration.

AI Infrastructure · DeepRec · Runtime Optimization
15 min read
TikTok Frontend Technology Team
Oct 15, 2021 · Frontend Development

React Runtime Optimization: From React 15 to 18 – Architecture, Scheduling, and New Features

This article provides a comprehensive overview of React's runtime optimization strategies across versions 15 to 18, explaining the evolution of its architecture, the introduction of Fiber, concurrent rendering, scheduling, priority lanes, and new APIs such as Suspense, startTransition, and useDeferredValue, while including detailed code excerpts and practical insights for developers.

Concurrent Mode · Fiber · React
35 min read
ByteFE
Sep 22, 2021 · Frontend Development

How React’s Runtime Optimizations Evolved from 15 to 18 – A Deep Technical Dive

This article walks through the evolution of React’s runtime architecture from version 15 to 18, explaining key concepts such as Fiber, Scheduler, priority lanes, concurrent mode, and new APIs like startTransition and useDeferredValue, while providing concrete code examples and visual diagrams.

Concurrent Mode · Fiber · React
36 min read
Node Underground
Sep 13, 2019 · Backend Development

Deep Dive into Node.js Runtime: Optimization, V8 Tweaks, and Serverless Insights

This article recaps a hardcore Node.js underground salon where experts explored runtime optimization, V8 engine enhancements, Alinode monitoring, fork(2) performance tricks, and core startup processes, highlighting practical insights for Serverless, IoT, and high‑performance backend development.

Node.js · Runtime Optimization · Serverless
5 min read
Big Data Technology & Architecture
Aug 5, 2019 · Big Data

Apache Spark Latest Technological Developments and Outlook for Spark 3.0+

The article provides a comprehensive overview of recent Apache Spark advancements—including Delta Lake, Data Source V2, runtime optimizations, relational cache, cloud‑native challenges, AI integration via Project Hydrogen, and the anticipated features of Spark 3.0—highlighting how these innovations address modern data‑warehouse, cloud, and machine‑learning workloads.

Apache Spark · Big Data · Data Warehouse
17 min read