Scaling Productivity on Microservices at Lyft – Development and Testing Environment History

This article chronicles Lyft's evolution from a PHP monolith to a Python and Go microservice architecture. It details the development and testing environments used along the way, including Devbox, Onebox, integration testing in CI, the pre-release staging environment, and the migration to Kubernetes, and it highlights the scalability and productivity challenges that arose as the service count grew into the hundreds.

Architecture Digest

At the end of 2018, Lyft's engineering team finished splitting its original PHP monolith into a set of Python and Go microservices. The split enabled faster experiments, hundreds of deployments per day, and per-service language choices. However, rapid growth in the number of engineers, services, and test cases soon outpaced the development tooling.

This is the first part of a four-part series describing how Lyft's development environments scaled from about 100 engineers and a few services to over 1,000 engineers and hundreds of services. The series covers the scaling challenges those environments faced and the heavy integration-testing approach that was used to approximate end-to-end validation.

Part 1 – History of Development and Testing Environments

In 2015, with about 100 engineers, Lyft still relied on a single PHP monolith, while a few microservices had emerged for specific use cases. Anticipating continued growth, the team began building a Docker-based container orchestration environment, first for testing services and later for production, to gain the benefits of multi-tenant workloads, lower cost, and faster scaling.

In early 2016, Lyft launched Devbox, a boxed development environment that managed local virtual machines and handled package installation, service startup, and shared folder configuration. Developers could start a VM with a single command, which pulled the latest image, created and populated databases, launched an Envoy sidecar, and performed other setup steps.

As the need for longer-running shared environments grew, Lyft introduced Onebox, essentially a Devbox running on an EC2 instance (an r3.4xlarge with 16 vCPUs and 122 GiB of RAM). Onebox could run multiple services faster thanks to the bandwidth available inside AWS, and it avoided the overhead of running VirtualBox on laptops.

Onebox also proved suitable for CI integration testing: a service defines its dependencies in a manifest.yaml file, CI spins up a temporary Onebox, runs the services, and executes tests on each pull request.

name: api
type: service
groups:
  - name: integration
    members:
      - driver_onboarding
      - users
tests:
  - name: integration
    group: integration
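
To make the manifest concrete, here is a minimal sketch of how a CI job might work out which services to start for a test group. It is not Lyft's implementation: the in-memory `MANIFESTS` dictionary, the helper names `group_members` and `services_to_start`, the manifests for every service other than `api`, and the assumption that dependencies are resolved transitively are all hypothetical.

```python
from collections import deque

# Hypothetical in-memory manifests keyed by service name. The "api" entry
# mirrors the manifest.yaml above; the other entries are invented purely
# for illustration.
MANIFESTS = {
    "api": {
        "groups": [
            {"name": "integration", "members": ["driver_onboarding", "users"]}
        ]
    },
    "driver_onboarding": {
        "groups": [
            {"name": "integration", "members": ["users", "dmv_checks"]}
        ]
    },
    "users": {"groups": []},
    "dmv_checks": {"groups": []},
}


def group_members(manifest, group_name):
    """Return the members of the named group, or an empty list if absent."""
    for group in manifest.get("groups", []):
        if group["name"] == group_name:
            return list(group["members"])
    return []


def services_to_start(root, group_name, manifests):
    """Walk the dependency graph breadth-first, starting at `root`.

    Assumes (hypothetically) that each dependency's own manifest is
    consulted, so the result is the transitive closure of the group.
    """
    seen = [root]
    queue = deque([root])
    while queue:
        service = queue.popleft()
        for dep in group_members(manifests.get(service, {}), group_name):
            if dep not in seen:
                seen.append(dep)
                queue.append(dep)
    return seen


if __name__ == "__main__":
    print(services_to_start("api", "integration", MANIFESTS))
```

Under these assumptions, the environment for `api` has to include every service reachable through the `integration` groups, not just the two listed in its own manifest. That hints at why environment startup slows down as the dependency graph deepens.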

The pre-release (staging) environment mirrors production, minus production data. It became increasingly important for end-to-end testing because it enabled realistic traffic simulations and provided feedback on deployments.

By 2020, after four years of using Devbox and Onebox, the “Lyft‑in‑a‑box” approach struggled to keep up with a ten‑fold increase in engineers and a growing dependency graph of services, causing CI environment startup and test execution to become painfully slow.

Scalability issues emerged: Onebox could no longer efficiently run the same observability stack across hundreds of environments, making root‑cause analysis difficult. The pre‑release environment, while more scalable, introduced challenges such as experiment interference, single‑change testing limits, and longer deployment times.

Maintenance overhead grew as the stack migrated to Kubernetes for container orchestration, and the heavy multi‑process images slowed build and download times.

To address these problems, Lyft identified three critical workflows that must be supported:

Local development – fast, simple unit testing or web server startup for any service.

Manual end‑to‑end testing – isolated testing of changes in a larger system.

Automated end‑to‑end testing – a small set of valuable acceptance tests run during production deployment.

Future posts in the series will dive deeper into each workflow, discussing problem domains, solutions, and lessons learned.

Example integration test code illustrating the evolution of test complexity across 2013, 2015, and 2018:

# 2013 (monolith), duration: 1 minute
def test_driver_approval():
    """
    Requires:
      - api
    """
    user = get_user()
    approve_driver(user)
    assert user.is_approved

# 2015 (mostly monolithic, a few services), duration: 3 minutes
def test_driver_approval():
    """
    Requires:
      - api (monolith)
      - users
      - mongodb
      - driver_onboarding
      - redis
    """
    user = user_service.create_user()
    user = driver_onboarding_service.approve_driver(user)
    assert user.is_approved

# 2018 (post‑decomp, microservices), duration: 20 minutes
def test_driver_approval__california():
    """
    Requires:
      - users
      - redis
      - experimentation
      - fraud
      - dynamodb
      - messaging
      - mongodb
      - driver_onboarding
      - dmv_checks
      - vehicles
      - payments
    """
    user = user_service.create_user()
    user = driver_onboarding_service.approve_driver(user)
    assert user.is_approved

These examples show how test dependencies expanded as Lyft moved toward a microservice architecture, underscoring the need for more efficient, scalable development and testing environments.
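
The growth can be quantified with a small helper that reads the `Requires:` block from each docstring. This is an illustrative sketch only: the `required_services` function and the stub functions (renamed with a year suffix so that they can coexist in one module) are assumptions, not part of Lyft's tooling.

```python
import inspect


def required_services(test_func):
    """Extract the service names listed under 'Requires:' in a docstring."""
    doc = inspect.getdoc(test_func) or ""
    services = []
    in_requires = False
    for line in doc.splitlines():
        stripped = line.strip()
        if stripped == "Requires:":
            in_requires = True
        elif in_requires and stripped.startswith("- "):
            # Keep only the service name, dropping notes such as "(monolith)".
            services.append(stripped[2:].split()[0])
        elif in_requires and stripped:
            # Any other non-empty line ends the Requires block.
            in_requires = False
    return services


# Stubs that carry only the docstrings from the examples above.
def driver_approval_2013():
    """
    Requires:
      - api
    """


def driver_approval_2015():
    """
    Requires:
      - api (monolith)
      - users
      - mongodb
      - driver_onboarding
      - redis
    """


def driver_approval_2018():
    """
    Requires:
      - users
      - redis
      - experimentation
      - fraud
      - dynamodb
      - messaging
      - mongodb
      - driver_onboarding
      - dmv_checks
      - vehicles
      - payments
    """


if __name__ == "__main__":
    for func in (driver_approval_2013, driver_approval_2015, driver_approval_2018):
        print(func.__name__, len(required_services(func)))
```

Run against the three docstrings, the helper counts 1, 5, and 11 required services respectively, while the assertion in each test body stays essentially the same.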

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
