How Discord Boosted Engineer Productivity by Moving to Cloud Development Environments

Discord’s engineering team migrated its backend and infrastructure development to Linux‑based cloud development environments using Coder, gaining immutable, reproducible, and secure workspaces while navigating challenges like latency, network reliability, and the need for seamless developer onboarding.

Radish, Keep Going!

Introduction

If you have followed our previous engineering blog posts, you know that building and maintaining Discord is complex. Our software development happens in a multi-language monorepo whose actively used languages include Python, TypeScript, Rust, Elixir, and C/C++. We also develop and ship products for Android, iOS, macOS, Windows, and Linux.

The internal developer experience team owns roughly the first third of the software development lifecycle, handling IDE experience, development environment management, build and test tooling, CI infrastructure, and change‑management processes. This post focuses on how we transitioned all backend and infrastructure development to Linux‑based cloud development environments with the help of the Coder team.

Background

Discord’s engineering organization has more than doubled in size over the past few years and operates primarily remotely, with offices in San Francisco and the Netherlands.

Most developers use MacBooks. Before we adopted remote development machines, we made sure engineers could run Discord on both Mac and Ubuntu, and we used Homebrew to configure laptops. That setup had problems: a brew upgrade, for example, could break a developer's environment and block their work. We mitigated this by hard-pinning packages and their transitive dependencies. Eventually we migrated from Homebrew to Nix for system dependencies, leaving Homebrew as an optional extra.

Our local service orchestration evolved from Makefiles and procfiles to Docker and docker-compose, but performance problems and day-to-day friction led us to a supervisor-based system with tooling for defining and running services and their dependencies.
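As an illustration of what "defining and running services and dependencies" can involve, here is a minimal Python sketch that works out a start order in which every service comes after the services it depends on. The service names and the SERVICES mapping are hypothetical, and this is not Discord's tooling; it only shows the general technique (a topological sort over a dependency graph) using the standard-library graphlib module.

```python
from graphlib import TopologicalSorter

# Hypothetical service graph: each key maps to the services it depends on.
SERVICES = {
    "api": {"postgres", "redis"},
    "gateway": {"api"},
    "postgres": set(),
    "redis": set(),
}


def start_order(services, targets):
    """Return the targets plus their transitive dependencies, dependencies first."""
    needed = set()
    stack = list(targets)
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(services[name])
    graph = {name: services[name] for name in needed}
    # static_order() raises graphlib.CycleError if the graph has a cycle.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for name in start_order(SERVICES, ["gateway"]):
        print(f"starting {name}")
```

A real supervisor would also need health checks, restarts, and log handling; the sketch covers only the ordering step.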

Maintaining two environments that could not be reproduced reliably became a heavy burden. That prompted us to focus on a single Linux-based development environment and to evaluate cloud development environments (CDEs), and we ultimately selected Coder.

Moving development to cloud‑hosted VMs offers immutability, reproducibility, configurability, enhanced security, and built‑in IAM, along with broader automation options.

CDEs require a solid editor and development loop experience; VS Code’s remote development extension provides a stable, powerful experience that most Discord engineers already use.

We mount and preserve the /home directory across restarts, allowing developers to resume work where they left off, balancing immutability with practicality.

While CDEs bring many benefits, they cannot match localhost performance, and SSH latency can be significant, especially on unstable networks.

Transferring large HTML/JS bundles over the network lengthens the save-and-rebuild loop, so many engineers do frontend work locally and backend work remotely. Switching between the two setups adds cognitive load.

Diagram of developer laptops accessing remote services in the cloud

Despite these trade‑offs, we believe the benefits outweigh the drawbacks.

Coder

We first engaged with Coder at the end of 2020. Their early product was Kubernetes‑native, aligning with Discord’s heavy Kubernetes usage. Building a similar solution in‑house would have required significant effort, so evaluating Coder was a clear decision.

Coder provides the features we expect from a CDE. They also recently rebuilt their product from scratch, which addressed many of the issues we had run into early on with Kubernetes and container-based development environments, such as noisy neighbors, latency spikes, and higher-than-expected baseline latency.

In 2023 we migrated to Coder’s V2 product, delivering VMs to developers, simplifying architecture, and improving stability, security, and performance via a rewritten network stack using Tailscale and WireGuard. Post‑migration feedback highlighted faster, smoother development and a drop in latency‑related support tickets.

Collage of Discord messages showing satisfaction with Coder V2

How Did the Migration Go?

Our transition from local MacBook development to using Coder involved learning and adaptation. Below is a detailed account of the migration, lessons learned, and what we would do differently.

Migration Plan

Consolidate experience – ensure the default experience “just works.”

Broaden adoption – conduct small‑scale testing, gather representative feedback, then move to public testing.

Hard deadline – finalize documentation, expand training and support channels, and fully retire backend development on MacBooks.

The migration began with the “simple” tasks: creating development containers, installing system dependencies, and setting up user accounts, permissions, and pre-installed software, while adding automation to speed up feedback loops.

We realized that a migration is as much a people problem as a technical one. It requires understanding what engineers will miss from their old setup, how steep the learning curve is, and where the pain points are. Interviews and early feedback helped shape the process.

We recruited “champions” from various teams to test the new environment and provide regular feedback, ensuring diverse daily workflows were represented and issues identified early.

We believed that delivering tools that enhance the engineering experience would naturally motivate adoption, and a hard cut‑over date helped overcome resistance.

With Apple’s M1 ARM-based chips arriving, we accelerated the timeline, deprecating macOS‑based backend development and pushing the migration forward.

Lessons Learned

Running development machines inside Kubernetes containers is challenging. Because we could not run privileged containers, changing kernel parameters required workarounds, such as running a privileged daemon on each node to support applications like Scylla.
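To make the node-daemon idea concrete, the following Python sketch shows one way a privileged process could raise a kernel parameter on behalf of unprivileged containers. It is an assumption-laden illustration, not Discord's implementation: the allowlist, the function names, and the choice of fs.aio-max-nr (a limit that databases relying on Linux AIO often need raised) are all hypothetical, and the source does not say which parameters were changed. It relies only on the fact that a sysctl key maps to a file under /proc/sys, with dots becoming path separators.

```python
from pathlib import Path

# Hypothetical allowlist: the only kernel parameters this daemon may change.
ALLOWED_SYSCTLS = {"fs.aio-max-nr"}


def sysctl_path(key, root="/proc/sys"):
    """Map a sysctl key such as 'fs.aio-max-nr' to its file under /proc/sys."""
    if key not in ALLOWED_SYSCTLS:
        raise PermissionError(f"sysctl {key!r} is not on the allowlist")
    return Path(root).joinpath(*key.split("."))


def ensure_minimum(key, minimum, root="/proc/sys"):
    """Raise a numeric sysctl to at least `minimum`; return True if it changed."""
    path = sysctl_path(key, root)
    current = int(path.read_text().strip())
    if current >= minimum:
        return False
    # Writing here requires root privileges on a real /proc/sys.
    path.write_text(f"{minimum}\n")
    return True
```

The root parameter exists so the logic can be exercised against a temporary directory. Restricting writes to an allowlist keeps the privileged surface small, which is the main reason to isolate this work in a dedicated daemon.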

A responsive environment is critical for developers; slow UI rendering disrupts their workflow. Smooth onboarding also matters, especially for engineers unfamiliar with the command line, and it required extensive documentation, training videos, and default dotfiles.

What We Would Do Differently

Post‑migration, network latency and connection interruptions were major issues. We would invest earlier in monitoring and diagnostic tools, and conduct broader regional testing to ensure performance across varied network conditions.
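A diagnostic of the kind described above does not have to be elaborate. The sketch below, in Python, times repeated TCP connections to a workspace and summarizes the results, counting failed attempts separately so that connection interruptions show up alongside latency. The function names and the idea of probing a workspace's SSH port are assumptions made for illustration; the source does not describe the tooling Discord built or would build.

```python
import socket
import statistics
import time


def probe_once(host, port, timeout=2.0):
    """Return the TCP connect time in milliseconds, or None if it failed."""
    started = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - started) * 1000.0
    except OSError:
        return None


def summarize(samples):
    """Summarize probe results; None entries count as failed connections."""
    ok = sorted(s for s in samples if s is not None)
    summary = {"count": len(samples), "failed": len(samples) - len(ok)}
    if ok:
        summary["median_ms"] = statistics.median(ok)
        summary["max_ms"] = ok[-1]
    return summary


def collect(host, port, attempts=10, interval=1.0):
    """Probe `host:port` repeatedly and return a summary of the samples."""
    results = []
    for _ in range(attempts):
        results.append(probe_once(host, port, timeout=2.0))
        time.sleep(interval)
    return summarize(results)
```

Calling collect with a workspace hostname and port 22 from engineers' machines in different regions would give a rough picture of connect latency and failure rate. TCP connect time is only a proxy for interactive SSH latency, so a production tool would likely measure round trips over an established session as well.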

Overall, despite challenges, the migration to cloud development environments was positive, yielding productivity gains, higher developer satisfaction, and valuable insights for future technical decisions.
