Tagged articles
7 articles
Bilibili Tech
Mar 28, 2023 · Operations

Bilibili's Capacity Management Platform: Design, Implementation, and S12 Event Support

Bilibili's capacity management platform combines foundational data, VPA/HPA scaling, quota control, and visual dashboards to streamline resource usage, cut costs, and improve stability. It also provides event‑specific support, such as for S12, which cut release issues by 80% and online failures by 90%; predictive scaling and risk control are planned.

Bilibili · Resource Optimization · SRE
13 min read
DataFunTalk
Jun 27, 2021 · Big Data

Practical Experience in Operating NetEase's Big Data Platform: Architecture, EasyOps, Monitoring, and Optimization

This presentation by NetEase senior SRE Jin Chuan details the current state of NetEase's big data platform, introduces the internally built EasyOps management system, explains a generic Ansible‑based operation framework, describes Prometheus/Grafana monitoring and alerting, and shares practical lessons on network, storage, and cloud migration for large‑scale Hadoop services.

Ansible · Prometheus · SRE
10 min read
Big Data Technology Architecture
Jun 2, 2021 · Big Data

Practical Operations of NetEase Big Data Platform: Architecture, EasyOps, Monitoring, and Experience Sharing

The presentation details NetEase's big data platform operations, covering current usage, the internally built EasyOps control system, a generic service‑operation framework based on Ansible, Prometheus‑Grafana monitoring, configuration management, network and storage optimizations, and lessons learned from cloud migration.

Ansible · Big Data · EasyOps
9 min read
Huolala Tech
Jun 30, 2020 · Operations

How Robust Risk Admission Strategies Power Platform Safety and Growth

This article explains the design and implementation of a comprehensive risk‑control admission framework covering authentication, eligibility rules, pipeline processing, and AI‑driven verification. The framework safeguards platform operations while enabling scalable growth and efficient order management.

AI verification · System Architecture · admission strategy
10 min read
Youzan Coder
Nov 29, 2019 · Operations

Recap of the 3rd Hangzhou Testing Salon – Continuous Delivery and Efficient Testing Environments

The 3rd Hangzhou Testing Salon, part of Youzan’s eighth technical series, featured Youzan leaders discussing continuous‑delivery product evolution, platform design, and cost‑effective testing environments. Videos and slides are available, and future sessions with Alibaba and KuJia will be announced soon.

Continuous Delivery · DevOps · platform operations
4 min read