Alibaba Cloud Infrastructure
Author

For uninterrupted computing services

353 articles · 0 likes · 936 views · 0 comments
Recent Articles

Latest from Alibaba Cloud Infrastructure

100 recent articles max
Mar 17, 2025 · Cloud Native

Boost LLM Inference with ACK Gateway AI Extension: A Step‑by‑Step Guide

This guide demonstrates how to deploy the QwQ‑32B large language model on an Alibaba Cloud ACK cluster, configure OSS storage, enable the ACK Gateway with AI Extension, set up InferencePool and InferenceModel resources, and benchmark intelligent routing versus standard gateway routing, revealing latency and throughput improvements.

ACK Gateway · AI Extension · Kubernetes
0 likes · 16 min read
Mar 12, 2025 · Cloud Computing

FinOps Case Study: Building a Cloud Cost Center at Qimai Technology

This case study describes how Qimai Technology, a SaaS provider for offline stores, tackled rapid cloud cost growth by establishing a cost center that uses CMDB mapping, Alibaba Cloud ACK Cost V2 API, and static and dynamic allocation rules to achieve fine‑grained resource cost distribution and improve financial transparency.

ACK · CMDB · Cloud Cost Management
0 likes · 7 min read
Mar 11, 2025 · Cloud Native

Implementing Per‑User Rate Limiting with Alibaba Cloud Service Mesh (ASM) Traffic Scheduling Suite

This article explains how to use the Alibaba Cloud Service Mesh (ASM) traffic scheduling suite to implement traffic‑control scenarios such as per‑user rate limiting, request queuing, and priority scheduling in a Kubernetes environment, with step‑by‑step deployment, configuration, and verification instructions.

ASM · Kubernetes · Traffic Scheduling
0 likes · 14 min read
Mar 9, 2025 · Cloud Computing

Deploy QwQ-32B LLM Inference on Alibaba Cloud ACS with vLLM: Step‑by‑Step Guide

This guide walks you through using Alibaba Cloud Container Compute Service (ACS) to provision GPU resources, prepare the QwQ-32B model, configure persistent storage, deploy the model with vLLM, set up OpenWebUI, verify the service, and optionally benchmark its performance, all with detailed commands and YAML examples.

ACS · Alibaba Cloud · GPU
0 likes · 17 min read
Mar 6, 2025 · Big Data

Leveraging Apache Iceberg and AutoMQ for Real-Time Data Lake Ingestion: Architecture, Best Practices, and Cost Optimization

This article examines how Apache Iceberg’s snapshot‑based ACID transactions, logical‑physical partition evolution, and COW/MOR update modes enable efficient real‑time data lake ingestion, and demonstrates AutoMQ’s Kafka‑to‑Iceberg Table Topic solution that simplifies schema management, reduces latency, and cuts operational costs.

Apache Iceberg · AutoMQ · Streaming
0 likes · 14 min read
Mar 5, 2025 · Cloud Native

Using Fluid Cloud‑Native Data Caching to Boost Performance and Elasticity of a Quantitative Research Platform on Alibaba Cloud

This article describes how JoinQuant built a cloud‑native quantitative research platform on Alibaba Cloud, identified performance, cost, data‑management, and security challenges, and solved them with Fluid’s JindoRuntime data‑caching, elastic scaling, and Python‑driven workflows, achieving dramatic speed and cost improvements.

Data Caching · Fluid · Kubernetes
0 likes · 18 min read
Mar 4, 2025 · Cloud Native

Koordinator v1.6 Release: Advanced Heterogeneous Device Scheduling and GPU Management Features

The Koordinator v1.6 release introduces GPU topology‑aware scheduling, end‑to‑end GPU and RDMA joint allocation, strong GPU isolation, differentiated GPU scoring, fine‑grained resource reservation, mixed‑workload QoS, and extensive scheduler and rescheduler optimizations, all aimed at managing heterogeneous resources efficiently in Kubernetes clusters that run AI and high‑performance computing workloads.

GPU Scheduling · Heterogeneous Resources · Koordinator
0 likes · 24 min read
Feb 28, 2025 · Cloud Computing

Accelerating Java Application Startup with CRaC and Flexible Compute on Alibaba Cloud Container Service

This article explains how Alibaba Cloud Container Service leverages flexible compute and the CRaC (Coordinated Restore at Checkpoint) mechanism to dramatically reduce Java application startup latency, details integration steps, presents experimental performance results, and discusses future applicability in cloud‑native environments.

CRaC · Container Service · Performance Testing
0 likes · 11 min read
Feb 21, 2025 · Artificial Intelligence

Deploying DeepSeek R1 Model Inference on ACK Edge with Virtual Nodes and Serverless GPU

This article explains how to use Alibaba Cloud ACK Edge to manage on‑premise GPU resources and seamlessly fall back to cloud‑based ACS Serverless GPU via virtual nodes for deploying DeepSeek R1 inference, covering environment preparation, model download, storage setup, custom scheduling, and scaling strategies.

ACK@Edge · DeepSeek · GPU
0 likes · 16 min read