Highlights from SRECon17 Americas in San Francisco

The article reports on the SRECon17 Americas conference in San Francisco, summarizing keynote talks, panel sessions, and practical insights from industry leaders such as Stripe, Netflix, Google, and IBM on topics ranging from traffic control and container management to on‑call practices and cost considerations for Site Reliability Engineering.

High Availability Architecture

The annual SRECon conference, organized by USENIX and titled SRECon17 Americas, was held on March 13-14 in San Francisco. It gathered professionals interested in site reliability, systems engineering, and complex distributed systems, and featured speakers from Google, Facebook, LinkedIn, Netflix, Pivotal, Pinterest, Uber, Twitter, and Baidu.

One of the opening talks, "So You Want to Be a Wizard" by an SRE from Stripe, described her transition from DevOps to SRE and was illustrated with hand-drawn, comic-style slides. She detailed how she used tools like tcpdump and Wireshark to troubleshoot slow HTTP requests, and emphasized documentation and the importance of understanding the "why" behind problems.

The conference then split into three parallel tracks covering core SRE topics such as traffic control, automated debugging, rapid deployments, large‑scale container operations, monitoring and alerting, online profiling, and 24/7 on‑call practices.

A recommended session from Netflix, "Ten Persistent SRE Antipatterns," highlighted common pitfalls in building successful SRE programs, using humorous graphics to discuss reliability metrics and cost implications.

Another session, "I’m an SRE Lead! Now What?" presented by IBM Bluemix, focused on how to bootstrap and organize an SRE team, collaborate with development groups, balance development and operational duties, define clear responsibilities, integrate agile processes, and design robust incident‑response workflows.

The author reflects that, while the conference may not have delivered the dense technical depth typical of developer‑focused events, it offered valuable insights into SRE challenges such as the shift from "pets" to "cattle" (and even "poultry") in container management, effective alerting strategies, on‑call fatigue, and the cost of hiring SREs.

Overall, the conference provided a rich mix of practical experiences, cultural observations, and strategic guidance for anyone interested in adopting or improving SRE practices.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

High Availability Architecture
