How YY Interactive Built a Scalable PaaS Ops Platform: Lessons & Best Practices
YY Interactive’s operations team shares a comprehensive overview of their private PaaS platform, detailing the evolution from IaaS to automated DevOps-driven services, the value framework of quality, efficiency, cost, and security, and future plans for scalability, multi‑language support, and VDC integration.
Preface
Cloud computing has matured steadily since AWS launched EC2 in 2006, and today enterprises of all sizes adopt it to accelerate delivery, reduce cost, and improve competitiveness. Its primary goal is to abstract hardware and expose compute, storage, and network resources as services.
The core benefit is higher IT delivery efficiency, enabling businesses to do more with less while meeting quality and security requirements. In the cloud era, IT departments must decide which IaaS services to offer and how to build a PaaS layer on top of them that satisfies quality, efficiency, cost, and security requirements.
Since 2013, YY Interactive’s operations team, following DevOps and ITIL best practices, has launched its own IaaS and gradually built a private PaaS platform, continuously improving it.
1. Operations Value System
Operations value is measured along four dimensions: quality, efficiency, cost, and security. All work, from platform construction to individual tasks, should be evaluated against these metrics and delivered through automation, service orientation, data‑driven decisions, and visualized outputs.
2. Platform‑Centric Approaches
Three main models are identified:
2.1 Process‑Oriented
Independent tool subsystems expose APIs for integration. Example: a web service requests a server via CMDB, submits a ticket, passes through multiple approval steps, and finally obtains resources. While this provides good control, it slows down business‑critical changes.
2.2 Service‑Oriented
Resources are exposed as IaaS services behind APIs, enabling self‑service provisioning of servers, databases, CDN, etc. Deployment becomes fully automated, and developers can release without ops involvement.
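To contrast this with the ticket-and-approval flow of the process‑oriented model, here is a minimal sketch of self‑service provisioning. It is not YY Interactive's API: the `ResourcePool` class, its method names, and the resource kinds are hypothetical, and an in-memory dictionary stands in for the real IaaS backend.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ResourcePool:
    """Hypothetical in-memory stand-in for an IaaS resource API."""
    capacity: dict                       # resource kind -> units still available
    allocations: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def provision(self, project: str, kind: str, units: int = 1) -> str:
        """Allocate resources immediately, with no ticket or approval step."""
        if units <= 0:
            raise ValueError("units must be positive")
        available = self.capacity.get(kind, 0)
        if units > available:
            raise RuntimeError(f"not enough {kind} capacity: {available} left")
        self.capacity[kind] = available - units
        resource_id = f"{kind}-{next(self._ids)}"
        self.allocations[resource_id] = (project, kind, units)
        return resource_id

    def release(self, resource_id: str) -> None:
        """Return an allocation to the pool."""
        _project, kind, units = self.allocations.pop(resource_id)
        self.capacity[kind] += units
```

The only gate in this sketch is remaining capacity, which is why self‑service models depend on good capacity tracking rather than on human approval.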
2.3 “Bring‑Your‑Own” (Adopt Public Cloud or ITSM tools)
Public cloud platforms provide complete resources and APIs; suitable for startups but may require multi‑cloud or hybrid solutions as the company grows.
Commercial ITSM software offers integrated operation management suites, helpful for traditional enterprises.
3. YY Interactive PaaS Philosophy and Practice
Business Scenarios
Rapid experimentation, insufficient manpower, and cost pressure drive the need for a high‑efficiency, low‑cost PaaS platform.
Platform Philosophy
The platform turns operations capabilities into services that raise developer productivity. It provides high‑availability, high‑performance infrastructure that developers can use through self‑service, aiming for a NoOps experience.
Implementation Highlights
1. Overall Architecture
The architecture is presented from two views: a business‑centric view (shown in blue in the original diagram) and an operations view (shown in gray). The stack consists of hardware, IaaS, PaaS, and business layers, plus global resource, monitoring, data, reporting, and security centers.
2. Standardization
Automation is built on standards for base software (Nginx, Tomcat, etc.), packaging (Java, PHP), deployment, monitoring, and other aspects.
Base application software standards
Application packaging standards
Deployment standards
Monitoring standards
Other standards
3. IaaS Services
Compute virtualization: VM as the smallest unit, using OpenStack and KVM, with horizontal scaling.
Storage virtualization: local storage for VMs; object storage via internal service.
Network virtualization: Neutron Provider Network, no VPC isolation yet.
Data sources: MySQL, Redis, Memcached provided as plugins, automatically monitored.
4. Continuous Delivery
Delivery model based on application packaging; supports project, module, and resource management, continuous integration, testing, deployment, rollback, and feedback.
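The deploy, feedback, and rollback steps can be sketched as follows. This is an illustrative model only: the `Deployer` class and the `health_check` callback are hypothetical names, and a real pipeline would push packages to servers rather than append to a list.

```python
from typing import Callable


class Deployer:
    """Tracks released package versions per module and rolls back on failure."""

    def __init__(self, health_check: Callable[[str, str], bool]):
        self.health_check = health_check   # hypothetical post-deploy check
        self.history = {}                  # module -> list of deployed versions

    def current(self, module: str):
        """Return the version currently live for a module, or None."""
        versions = self.history.get(module, [])
        return versions[-1] if versions else None

    def deploy(self, module: str, version: str) -> bool:
        """Deploy a packaged version; keep it only if the health check passes."""
        self.history.setdefault(module, []).append(version)
        if self.health_check(module, version):
            return True
        # Feedback says the release is unhealthy: revert to the previous package.
        self.history[module].pop()
        return False
```

Because every release is a versioned package, rolling back amounts to re-selecting the previous package instead of rebuilding anything.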
5. High‑Availability Architecture
Design includes DDoS protection, GSLB, OSPF‑LVS load balancing, application routing (Nginx), container layer, cache layer, and database layer with master‑slave failover.
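As an illustration of the master‑slave failover at the database layer, the sketch below promotes a replica after several consecutive failed health checks. The class name, the threshold of three failures, and the promotion order are assumptions; the source does not describe the actual mechanism, and a production setup would also need to handle replication lag and fencing of the old master.

```python
class FailoverMonitor:
    """Promotes a replica once the master fails enough consecutive checks."""

    def __init__(self, master, replicas, max_failures=3):
        self.master = master
        self.replicas = list(replicas)
        self.max_failures = max_failures   # assumed threshold
        self.failures = 0

    def report(self, master_healthy: bool) -> str:
        """Feed one health-check result; return the node that is master afterwards."""
        if master_healthy:
            self.failures = 0
            return self.master
        self.failures += 1
        if self.failures >= self.max_failures and self.replicas:
            # Promote the first replica; the failed master is not re-added here.
            self.master = self.replicas.pop(0)
            self.failures = 0
        return self.master
```

Requiring consecutive failures avoids promoting a replica because of a single dropped health check.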
6. Elastic Scaling
Elasticity provides performance, cost efficiency, high availability, and smooth deployment. It is implemented with CloudRouter, CloudMonitor, and a pre‑provisioned VM pool, and scaling policies are based on the average load of an application's VMs.
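A policy based on average VM load, backed by a pre‑provisioned pool, might look like the sketch below. The thresholds, the minimum group size, and the function name are assumptions chosen for illustration; the source does not give the real policy values or the CloudRouter and CloudMonitor interfaces.

```python
import math
from statistics import mean


def scaling_decision(loads, pool_size, high=0.75, low=0.25, min_active=2):
    """Return how many VMs to add (positive) or remove (negative).

    loads: recent load of each active VM, each value between 0 and 1.
    pool_size: idle VMs waiting in the pre-provisioned pool.
    """
    if not loads:
        raise ValueError("at least one active VM is required")
    avg = mean(loads)
    active = len(loads)
    if avg > high:
        # Size the group so the same total load would average out at `high`.
        target = math.ceil(active * avg / high)
        return min(target - active, pool_size)
    if avg < low and active > min_active:
        # Scale in one VM at a time to avoid oscillation.
        return -1
    return 0
```

Scaling out is capped by the pool size because only pre‑provisioned VMs can join quickly; scaling in is deliberately slower than scaling out.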
7. NoOps
Self‑service tools cover log management, monitoring (Zabbix‑based CloudMonitor), and operational utilities, reducing the need for manual server access.
8. Security Auditing
All operations are recorded and traceable; critical data changes require approval.
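The two rules above, record everything and require approval for critical changes, can be sketched in a few lines. The action names in `CRITICAL`, the class name, and the rule that a user cannot approve their own change are assumptions for illustration, not details taken from the platform.

```python
import time


class ApprovalRequired(Exception):
    """Raised when a critical action is attempted without valid approval."""


class AuditedOps:
    CRITICAL = {"drop_table", "change_dns"}   # hypothetical critical actions

    def __init__(self):
        self.log = []   # append-only audit trail

    def execute(self, user, action, target, approved_by=None):
        """Record the attempt, then run it only if it is allowed."""
        critical = action in self.CRITICAL
        # Assumption: approval must come from someone other than the requester.
        allowed = not critical or (approved_by is not None and approved_by != user)
        self.log.append({
            "time": time.time(),
            "user": user,
            "action": action,
            "target": target,
            "approved_by": approved_by,
            "allowed": allowed,
        })
        if not allowed:
            raise ApprovalRequired(f"{action} on {target} needs approval")
        return f"{action} on {target} done"
```

Denied attempts are logged as well, so the trail shows who tried to make a critical change, not only who succeeded.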
9. Platform Operations
Two‑way feedback loops and user experience optimization ensure the platform meets developer needs and avoids wasted effort.
Benefits
Quality: High‑availability components and automated availability management.
Efficiency: DevOps‑driven automation accelerates delivery.
Security: Integrated network and system security, role‑based access, and audit trails.
Cost: Resource pooling, elasticity, and automation reduce hardware and labor expenses.
Risks
Capacity Management: Self‑service resource allocation requires robust capacity forecasting and alerts.
Isolation: Lack of VPC limits network isolation; multi‑tenant resource contention must be monitored.
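For the capacity management risk, a simple forecast can estimate when a resource pool will run out and raise an alert early enough to buy hardware. The sketch below uses average daily growth, which is a deliberately naive model; the function names and the 30‑day lead time are assumptions, not the platform's actual method.

```python
def days_until_full(daily_usage, capacity):
    """Estimate days until capacity is exhausted from average daily growth.

    Returns None when usage is flat or shrinking.
    """
    if len(daily_usage) < 2:
        raise ValueError("need at least two daily samples")
    growth = (daily_usage[-1] - daily_usage[0]) / (len(daily_usage) - 1)
    if growth <= 0:
        return None
    remaining = max(capacity - daily_usage[-1], 0)
    return remaining / growth


def capacity_alert(daily_usage, capacity, lead_time_days=30):
    """Alert when the pool is forecast to fill up within the lead time."""
    days = days_until_full(daily_usage, capacity)
    return days is not None and days <= lead_time_days
```

For example, usage samples of 100, 110, 120, and 130 units against a capacity of 190 give a growth of 10 units per day and a forecast of 6 days, which triggers the alert.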
Future Roadmap
Plans include a one‑stop platform for business and operations, multi‑language support (Task, Node.js, Python), deeper automation, data‑driven visualization, productization, and migration to a VDC based on SDN/VPC.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
