Tagged articles
360 articles
Alibaba Cloud Infrastructure
Aug 17, 2017 · Cloud Computing

Alibaba Tech Open-Day in Silicon Valley Showcases Global Infrastructure and Cloud Computing Innovations

The Alibaba Tech Open-Day held in Silicon Valley highlighted the company's global data‑center network, energy‑efficient designs, high‑speed networking, custom hardware, advanced system software, middleware solutions, and its ambitious NASA research program, while also recruiting top engineering talent for both US and China operations.

Alibaba · Data Centers · Infrastructure
0 likes · 12 min read
Efficient Ops
Aug 13, 2017 · Operations

22 Essential Ops Manager Tips for Building Resilient Web Infrastructure

This article compiles 22 practical recommendations from an operations manager covering domain management, CDN usage, image servers, data center selection, monitoring, security, redundancy, high‑availability architecture, disaster‑recovery planning, and team coordination to help ensure stable and secure online services.

Infrastructure · Operations · disaster recovery
0 likes · 12 min read
ITPUB
Jul 25, 2017 · Operations

How to Accurately Plan Data Center Power with Dell’s Enterprise Infrastructure Planning Tool

This guide explains why precise power usage assessment is crucial for data‑center safety and efficiency, introduces Dell’s free online Enterprise Infrastructure Planning Tool, provides the web and download links, and walks through step‑by‑step configuration of voltage, devices, PSU selection, summary view, and exporting results to PDF or Excel.

Configuration · Data center · Dell
0 likes · 6 min read
Ctrip Technology
Jul 20, 2017 · Operations

Ctrip's Fourth‑Generation Architecture: Elastic Routing (SLB) and the TARS Release System

This article reviews Ctrip's two‑year architecture transformation, describing how the company replaced hardware load balancers with a software‑defined SLB, introduced application‑level grouping, multi‑update mechanisms, health‑check sharing, monitoring, and the TARS release platform to achieve faster, more reliable deployments.

Ctrip · Infrastructure · Operations
0 likes · 16 min read
Efficient Ops
Jul 12, 2017 · Operations

How Alibaba Built a Scalable DevOps Platform: Lessons for Modern Operations

This article, based on a DevOpsDays Beijing talk, details Alibaba's post‑DevOps transformation, outlining the three evolution stages of operations, the four pillars of automated ops, the importance of CMDB, CI/CD pipelines, and the design of the ATOM platform that enables rapid, data‑driven, and resilient service delivery.

CMDB · DevOps · Infrastructure
0 likes · 15 min read
MaGe Linux Operations
Jun 29, 2017 · Operations

Mastering Internet Operations: Roles, Responsibilities, and Evolution

This article outlines the service‑centric approach of internet operations, detailing how stability, security, and efficiency are achieved through infrastructure management, system and application maintenance, database administration, and security practices, and traces the evolution of operational roles from manual handling to automated, self‑scheduling platforms.

Infrastructure
0 likes · 20 min read
Efficient Ops
Jun 12, 2017 · Operations

Mastering DevOps in Complex Business Systems: Theory, Culture, Architecture & Case Studies

This article presents a comprehensive overview of a GOPS 2017 Shenzhen talk on DevOps theory and practice in complex business environments, covering the fundamentals of DevOps, cultural transformation, technical architecture, and real‑world case studies that illustrate automation, deployment pipelines, and value‑stream delivery.

Continuous Delivery · DevOps · Infrastructure
0 likes · 17 min read
Efficient Ops
Jun 10, 2017 · Operations

What Google’s SRE Book Reveals About Modern Operations

This article introduces the Chinese translation of Google’s SRE book, shares behind‑the‑scenes stories of its creation, and distills key concepts such as the AAA model, Borg architecture, SLOs, toil reduction, and the cultural shift required for reliable large‑scale services.

DevOps · Google · Infrastructure
0 likes · 20 min read
Efficient Ops
Jun 6, 2017 · Operations

How SF Express Transformed Its Infrastructure: From Chaos to Automated DevOps

This article details SF Express's journey from a fragmented, manual infrastructure operation to a standardized, automated DevOps environment, covering organizational restructuring, open‑source adoption, change management, capacity forecasting, and the vision for a self‑service "WeiX" platform.

DevOps · Infrastructure · standardization
0 likes · 18 min read
Efficient Ops
May 31, 2017 · Operations

How a Veteran Ops Leader Transforms DevOps into Full‑Chain Automation

This article shares a veteran operations leader’s insights on DevOps fundamentals, the comprehensive ops knowledge system and career paths, the evolution of small‑business web architectures, and the step‑by‑step development of a full‑chain automation platform, emphasizing both technical and soft‑skill growth.

Career Development · DevOps · Infrastructure
0 likes · 17 min read
MaGe Linux Operations
May 2, 2017 · Operations

What Is Zabbix? A Deep Dive into Its Features, Architecture, and Deployment

Zabbix is an open‑source, web‑based enterprise monitoring platform that tracks Windows/Linux hosts, network devices, and hardware/software metrics, provides alerting, visualizes data via a customizable PHP web UI, and comprises components such as server, agents, proxies, Java gateway, and API, with flexible templates, discovery, and storage options.

Alerting · IT Operations · Infrastructure
0 likes · 6 min read
21CTO
Apr 30, 2017 · Backend Development

Essential Backend Infrastructure for Scalable Internet Services

This article outlines the critical backend components and services—such as API gateways, MVC/IOC/ORM frameworks, caching, databases, search engines, message queues, unified authentication, configuration management, service governance, scheduling, logging, and data processing pipelines—that together enable stable, high‑availability, and maintainable online applications.

Backend · Infrastructure · Microservices
0 likes · 29 min read
Meituan Technology Team
Apr 7, 2017 · Information Security

Insights on Google Infrastructure Security Design

Google’s new security white paper reveals how its deeply integrated, principle‑driven architecture—spanning physical data‑center safeguards, mutual‑authenticated multi‑tenant services, pervasive encryption, and a comprehensive DevSecOps process—enables massive‑scale protection, but replicating this model demands substantial custom hardware, unified tooling, and large‑scale engineering expertise.

Data Protection · Google · Infrastructure
0 likes · 22 min read
Efficient Ops
Mar 21, 2017 · Operations

Rethinking Operations: The “Third Kind” of SRE at Lianjia

The article shares the author’s experience transitioning from private to public and hybrid clouds at Lianjia, introduces a “third kind” of operations that blends traditional and internet‑based practices, and discusses containers, DNS‑based naming, and automation tools to build adaptable, cost‑effective infrastructure.

Infrastructure · Naming Service · SRE
0 likes · 21 min read
Alibaba Cloud Infrastructure
Mar 15, 2017 · Operations

Alibaba IDC and Network Monitoring System Architecture and Practices

The article details Alibaba's globally distributed IDC and network monitoring systems, describing their fully distributed data collection, centralized computation, storage strategies, alarm mechanisms, and frontend visualization that together enable real‑time infrastructure and network health management for large‑scale operations.

Distributed Systems · IDC · Infrastructure
0 likes · 13 min read
DevOps
Feb 23, 2017 · Cloud Native

JD's Migration from OpenStack to Kubernetes: Lessons and Architecture of JDOS 2.0

Since the end of 2016, JD has been transitioning its infrastructure from OpenStack to Kubernetes, completing 20% of the migration and aiming for full conversion by Q2, and shares detailed experiences, architectural evolution, operational practices, and future directions for large‑scale container platforms.

Cloud Native · Infrastructure · JDOS
0 likes · 16 min read
Meituan Technology Team
Jan 23, 2017 · Cloud Native

Meituan-Dianping Docker Container Platform: Architecture and Practices

Meituan‑Dianping’s Docker Container Platform, built on a four‑layer architecture that integrates API orchestration, host‑side management, a hybrid image registry, OVS‑DPDK networking, LVM‑backed storage, and low‑overhead monitoring, enables seconds‑level scaling, live resource adjustments, and major cost savings across dozens of business units by combining containers with traditional VMs.

Cloud Native · Docker · Infrastructure
0 likes · 23 min read
Alibaba Cloud Developer
Jan 5, 2017 · Cloud Native

How Alibaba Unified T4 and Docker into AliDocker for Double‑11 Scale

This article details Alibaba's large‑scale migration of core transaction services from traditional VM and proprietary T4 containers to a unified Docker‑based platform called AliDocker, covering integration challenges, image‑based deployment, Swarm customizations, and middleware Dockerization that enabled seamless double‑11 operations.

AliDocker · Container · Docker
0 likes · 18 min read
Alibaba Cloud Developer
Dec 13, 2016 · Operations

How Alibaba Evolved Its Application Operations: From Scripts to DevOps

Alibaba’s application operations journey, detailed by researcher Lin Hao, traces the shift from early script‑based practices through tool‑centric phases to a full DevOps transformation, highlighting challenges, automation efforts, and the emerging push toward intelligent, data‑driven operations.

Alibaba · DevOps · Infrastructure
0 likes · 19 min read
Architects' Tech Alliance
Nov 16, 2016 · Cloud Computing

How OpenStack Ironic Enables Bare-Metal Provisioning in the Cloud

OpenStack Ironic is a dedicated bare‑metal service that replaces Nova’s original driver, using PXE and IPMI to automate physical server deployment, power management, and resource discovery, integrating with Keystone, Nova, Neutron, Glance, and Cinder to provide cloud‑like provisioning for real hardware.

Bare Metal · Infrastructure · Ironic
0 likes · 6 min read
Architects' Tech Alliance
Nov 4, 2016 · Big Data

The Seven Camps of the Global Big Data Ecosystem

The article outlines how mobile Internet merges the data‑driven society with the physical world to create a new big‑data architecture and describes the seven distinct camps—Infrastructure, Analytics, Applications, Cross‑Domain Architecture, Open‑Source, Data Sources & APIs, and Incubator & Training—that together form a comprehensive end‑to‑end big‑data solution ecosystem.

API · Analytics · Applications
0 likes · 3 min read
Efficient Ops
Oct 31, 2016 · Operations

What Are DevOps’ Eight Honors and Shames? Insights from Heroku’s 12‑Factor Manifesto

This article presents a seasoned DevOps expert’s eight “honors and shames” principles, explains why configuration, redundancy, restartability, whole‑delivery, statelessness, standardization, automation, and unattended operation matter, and connects them to Heroku’s twelve‑factor app guidelines for building resilient cloud services.

DevOps · Infrastructure
0 likes · 21 min read
360 Zhihui Cloud Developer
Oct 19, 2016 · Operations

Wonder Monitoring: Scaling Ops with Open‑Falcon‑Powered Automation

This article explains how the internally built Wonder monitoring system, based on Open‑Falcon, tackles large‑scale operational challenges by offering automated agent updates, customizable metrics, log and port monitoring, persistent alarm storage, enhanced alert content, and comprehensive dashboards for thousands of devices.

Alerting · Automation · Infrastructure
0 likes · 7 min read
Qunar Tech Salon
Oct 11, 2016 · Operations

Design and Implementation of Qunar Network Device Operations Platform

Facing growing network device counts and limited netops staff, Qunar built a network device operations platform that integrates command automation, permission-controlled tasks, monitoring, and dynamic scaling using Docker, Marathon, and Celery, thereby improving efficiency, reducing risk, and enabling comprehensive auditability.

Infrastructure · network operations · permission control
0 likes · 8 min read
Architects' Tech Alliance
Sep 30, 2016 · Cloud Computing

Understanding Hyper‑Converged Infrastructure: Nutanix Overview and Market Landscape

The article provides a comprehensive overview of converged and hyper‑converged infrastructure, discusses typical use cases such as VDI and database acceleration, compares major vendor solutions, and details Nutanix’s product lines, architecture, performance considerations, cloud integration, and micro‑service capabilities.

Hyper-Converged · Infrastructure · Nutanix
0 likes · 9 min read
Efficient Ops
Sep 19, 2016 · Operations

How Ctrip Revolutionized IDC Management with Visual Automation

Ctrip’s rapid internet growth forced a massive data‑center expansion, prompting the company to evolve from self‑built facilities to hybrid vendor‑leased IDC, and ultimately to a visual management platform that automates monitoring, space planning, device intake, and operational workflows, dramatically improving efficiency and reducing manual effort.

CMDB · Infrastructure · visualization
0 likes · 13 min read
Qunar Tech Salon
Sep 14, 2016 · Cloud Computing

Design and Implementation of Ctrip's Virtual Cloud Desktop System Based on OpenStack

This article presents Ctrip's deployment of a virtual cloud desktop system for its call center, detailing the OpenStack‑based architecture, advantages over traditional PCs, challenges encountered, the evolution to a decoupled design, resource over‑commit strategies, networking issues, and the operational tools and automated testing that ensure stability.

Infrastructure · OpenStack · cloud computing
0 likes · 13 min read
Efficient Ops
Sep 5, 2016 · Operations

Inside Google’s Data Centers: How SRE Manages Hardware, Borg, and Global Services

This article explains how Google’s Site Reliability Engineering team designs and operates uniform hardware in its data centers, uses the Borg cluster manager, implements storage layers, SDN networking, monitoring, and a sample Shakespeare search service to achieve high‑availability, scalable production services.

Borg · Distributed Systems · Google SRE
0 likes · 21 min read
High Availability Architecture
Aug 30, 2016 · Operations

Evolution of Meizu Flyme Operations Architecture and High‑Availability Practices

The article details Meizu's Flyme operations platform evolution—from a single‑cabinet setup in 2011 to a multi‑IDC, 6000‑server infrastructure—highlighting challenges, architectural upgrades, monitoring, cost control, automation, and future high‑availability directions for large‑scale internet services.

Infrastructure · cost control · high availability
0 likes · 13 min read
Ctrip Technology
Aug 26, 2016 · Information Security

Automated Firewall Operations and Management System at Ctrip

The article describes how Ctrip’s network security team built an automated, centralized firewall management platform that handles multi‑brand firewalls, streamlines policy queries, generation, and deployment, integrates with change‑ticket workflows, and dramatically improves operational efficiency while reducing human error.

Ctrip · Infrastructure · Operations
0 likes · 14 min read
Efficient Ops
Aug 25, 2016 · Operations

How Tencent Scales Ops Automation for Hundreds of Thousands of Servers

This article explains how Tencent transformed massive operational pressure from billions of users and half a million servers into an automated, standardized workflow by defining clear goals, building a layered CMDB, integrating Dev and Ops, and implementing a six‑step deployment pipeline that balances efficiency with safety.

CMDB · DevOps · Infrastructure
0 likes · 21 min read
21CTO
Apr 20, 2016 · Operations

How Spotify Scaled Machine Management: From Ops Chaos to Cloud Automation

This article chronicles Spotify's evolution in server operations—from a manual Ops team and ad‑hoc tools in the early years, through automated DNS, provisioning, and self‑service platforms, to a hybrid cloud strategy that reduced resource‑request turnaround from weeks to minutes.

Automation · DevOps · Infrastructure
0 likes · 14 min read
Huawei Cloud Developer Alliance
Apr 11, 2016 · Cloud Computing

Explore OpenStack's Core Services: From Nova to Ceilometer

This article introduces the key OpenStack services—Nova, Neutron, Keystone, Glance, Horizon, Cinder, Swift, Heat, and Ceilometer—explaining each component’s role, functionality, and how they collectively enable scalable compute, networking, identity, image, dashboard, block storage, object storage, orchestration, and telemetry in cloud environments.

Infrastructure · OpenStack · Service Architecture
0 likes · 9 min read
Efficient Ops
Mar 21, 2016 · Operations

How to Build a High‑Performance Unified Monitoring & Alerting Platform

This article outlines a comprehensive design for a high‑performance, unified operations monitoring platform, detailing a six‑layer architecture, the roles of data collection (using Ganglia), data extraction, and alerting modules (with Centreon), and provides practical integration tips, deployment diagrams, and Q&A for large‑scale environments.

Alerting · Centreon · Ganglia
0 likes · 24 min read
Architect
Mar 18, 2016 · Backend Development

Sogou Business Platform Infrastructure Evolution: From Horizontal Scaling to Stream Computing

This article outlines Sogou's infrastructure evolution under rapid business iteration, detailing stages of compute and storage horizontal scaling, serviceization, and stream computing, while sharing the practices, principles, lessons learned, and reflections that guided the platform's architectural transformation.

Infrastructure · Scalability · Service Architecture
0 likes · 4 min read
21CTO
Mar 11, 2016 · Operations

Scaling DevOps at Mogujie: How a Young Ops Team Tackled Massive Traffic and Double‑11

Facing explosive traffic and high‑concurrency demands, Mogujie's newly formed operations team adopted DevOps practices, built CMDB, CI/CD pipelines, and monitoring platforms, and successfully supported the massive Double‑11 and Double‑12 sales events, sharing key technologies and lessons learned in their rapid‑pace environment.

CMDB · DevOps · Infrastructure
0 likes · 3 min read
Architecture Digest
Mar 5, 2016 · Operations

Dianping Operations Architecture Overview and Best Practices

This article presents a comprehensive overview of Dianping's operations architecture, detailing team organization, multi‑data‑center infrastructure, monitoring layers, automation tools, configuration management systems, incident analysis, lessons learned, and future directions such as Docker and PaaS adoption.

Automation · DevOps · Docker
0 likes · 16 min read
Efficient Ops
Feb 24, 2016 · Operations

Is Operations Automation Overhyped? A Pragmatic Look at Real‑World Practices

The article critiques the hype around operations automation, arguing that many tasks can be handled with simple shell scripts, that automation should solve error‑prone manual work rather than replace thoughtful architecture, and that choosing the most convenient tool is more valuable than chasing trendy solutions.

Automation · Infrastructure · Operations
0 likes · 13 min read
21CTO
Jan 25, 2016 · Cloud Native

Why Docker Still Dominates: 2016 Tech Awards Highlights & Key Container Projects

InfoWorld’s 2016 Technology of the Year Awards spotlight Docker’s dominance, listing top container‑related projects such as Docker, Kubernetes, CoreOS, Mesos and others, while also covering a broad range of languages, tools, cloud services and big‑data platforms that shaped the tech landscape.

Cloud Native · Containers · DevOps
0 likes · 6 min read
Efficient Ops
Dec 14, 2015 · Operations

Top Ops Security Pitfalls and How to Safeguard Your Infrastructure

This article examines the most common operational security vulnerabilities—such as unpatched Struts, server‑status leaks, backup file exposure, SVN leaks, and weak default credentials—explains why they are critical, and offers practical recommendations for enterprises to improve their ops‑security posture.

Infrastructure · Patch management · Vulnerability Management
0 likes · 15 min read
Qunar Tech Salon
Dec 14, 2015 · Cloud Native

Building Scalable Development Environments with Docker, Mesos, and Kubernetes: Lessons Learned

This article details a year‑long journey of designing, deploying, and operating container‑based development environments using Docker, Apache Mesos, and Kubernetes, covering the challenges of version consistency, rapid environment switching, resource isolation, and the practical solutions and lessons gathered from real‑world production use.

DevOps · Docker · Infrastructure
0 likes · 16 min read
Architect
Nov 25, 2015 · Cloud Native

Kubernetes Architecture Overview and Practical Insights

This article introduces Kubernetes, explains why it is used, outlines its core goals, describes the main components and their functions, discusses the architectural improvements it enables, and shares practical deployment experiences and common issues encountered during real‑world usage.

DevOps · Infrastructure · Kubernetes
0 likes · 15 min read
Efficient Ops
Sep 20, 2015 · Operations

From Internet Ops to Banking: Lessons on Data Center Challenges

In a candid Q&A, industry veterans discuss the fundamental differences between internet and traditional banking operations, share experiences transitioning between sectors, and outline strategies to eliminate difficult data‑center maintenance, highlighting risk‑focused versus growth‑driven approaches.

Data center · IT ops · Infrastructure
0 likes · 14 min read
Architects' Tech Alliance
Sep 13, 2015 · Cloud Computing

Infrastructure Convergence: Hardware Fusion and Hyper‑Converged Systems Overview

The article explains the evolution of enterprise IT infrastructure toward both custom, small‑scale distributed designs driven by cloud computing and integrated converged and hyper‑converged architectures, detailing their design principles, differences, major vendor solutions, and the role of software‑defined storage.

Hyper-Converged · Infrastructure · Software-Defined Storage
0 likes · 12 min read
Efficient Ops
Aug 3, 2015 · Cloud Computing

How 1hao Store Uses Hybrid Cloud to Balance Cost and Performance

This article explains how an e‑commerce platform leverages a hybrid cloud architecture to handle massive traffic spikes from marketing events while optimizing costs, and outlines six key considerations for successful implementation.

Cost Optimization · E‑commerce · Infrastructure
0 likes · 10 min read
Efficient Ops
Jul 27, 2015 · Operations

What Google SREs Do: Inside the Role that Powers Reliable Services

This article explains the responsibilities, requirements, and daily work of Google Site Reliability Engineers, contrasts them with Software Engineers, outlines key internal infrastructure components, and discusses the future direction of operations engineering in the cloud era.

Google · Infrastructure · Operations
0 likes · 11 min read
Efficient Ops
Jul 23, 2015 · Operations

How Project Scorpio Reshaped China’s Data Center Rack Standards

This article chronicles the birth and evolution of China’s Project Scorpio—from its 2011 launch through Scorpio 1.0 and 2.0 specifications—highlighting its collaboration with Intel, its technical trade‑offs with Open Rack, and its impact on data‑center operations and standards.

Data center · Infrastructure · Open Compute Project
0 likes · 17 min read
MaGe Linux Operations
Jun 16, 2015 · Operations

Inside Dianping’s Ops: Building Scalable Monitoring, Automation, and Self‑Service Platforms

This article details how Dianping’s sub‑40‑person operations team structures its groups, designs a dual‑datacenter architecture, and creates comprehensive monitoring, automation, configuration, and analysis systems—including Zabbix, Cat, workflow, Button, and a custom radar platform—to achieve high‑availability, self‑service, and continuous improvement.

Automation · DevOps · Infrastructure
0 likes · 18 min read

Stack Overflow Architecture and Operations: Scaling, Performance, and Infrastructure Overview

This article provides a comprehensive overview of Stack Overflow's infrastructure, detailing its vertically‑scaled hardware, use of Microsoft and Linux technologies, high‑availability design, caching layers, database strategies, deployment processes, monitoring, and the performance‑first philosophy that drives its efficient operation.

Infrastructure · performance · scaling
0 likes · 17 min read
Efficient Ops
May 27, 2015 · Operations

How NoOps Transforms Operations: Automating Service Management

The article outlines the NoOps philosophy of automating routine operational tasks, describes how a tech‑learning team builds self‑service platforms, leverages open‑source tools, and invests in research to boost efficiency, stability, and innovation in modern internet services.

Infrastructure · NoOps · open-source tools
0 likes · 11 min read
Efficient Ops
May 22, 2015 · Operations

Mastering Puppet: From Basics to Advanced Ops Automation and Docker Integration

This article summarizes a comprehensive talk on Puppet covering its evolution, core concepts, architecture, ecosystem, practical use cases such as building a CMDB, automated deployment pipelines, OpenStack deployment, and the interplay with Docker, highlighting how Puppet drives modern operations automation.

Configuration Management · Infrastructure · Operations
0 likes · 13 min read
MaGe Linux Operations
Mar 26, 2015 · Operations

Essential Open‑Source Tools for Backup, Cloud, DevOps, and IT Operations

This article compiles a comprehensive list of open‑source tools covering backup, cloning, cloud platforms, cloud workflows, distributed file systems, cloud storage, code review, collaboration suites, CMDB, configuration management, continuous integration/deployment, DNS, hosting control panels, IT asset management, and LDAP, providing a valuable resource for IT professionals.

Backup · DevOps · Infrastructure
0 likes · 11 min read
MaGe Linux Operations
Sep 13, 2014 · Operations

How to Build a Scalable Small Website: From Thousands to Millions of Daily Visits

This article systematically outlines the essential steps and considerations—ranging from language choice, version control, hardware and data center selection, to architecture, software, database, storage, and code optimization—to help a small website scale from a few thousand daily visits to millions while avoiding costly pitfalls.

Backend · DevOps · Infrastructure
0 likes · 13 min read