
Migrating LinkedIn’s Who Viewed Your Profile System from Lambda Architecture to a Lambda‑less Architecture

This article describes how LinkedIn’s Who Viewed Your Profile feature was originally built on a Lambda architecture, the operational challenges it caused, and the step‑by‑step migration to a streamlined, Samza‑driven, Lambda‑less design that improves performance, reduces maintenance overhead, and retains essential batch capabilities.

Top Architect

LinkedIn’s Who Viewed Your Profile (WVYP) feature originally used a Lambda architecture that combined real‑time Kafka‑based processing with offline Hadoop MapReduce jobs, feeding results into Pinot for both real‑time and batch queries.
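The serving side of a Lambda setup like this one can be sketched as a query that stitches the two layers together at a time boundary. This is a minimal illustration, not LinkedIn's or Pinot's implementation: the `ProfileView` record, the `serve_views` function, and the `time_boundary` parameter are all hypothetical names, and the rule "offline rows up to the boundary, real-time rows after it" is an assumed simplification of how a hybrid store avoids counting overlapping data twice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileView:
    """Hypothetical record shape; field names are assumptions."""
    viewer_id: int
    viewee_id: int
    viewed_at: int  # epoch seconds


def serve_views(viewee_id, offline_rows, realtime_rows, time_boundary):
    """Answer one query from both layers, split at a time boundary."""
    # The batch (offline) layer is treated as authoritative up to the boundary.
    older = [r for r in offline_rows
             if r.viewee_id == viewee_id and r.viewed_at <= time_boundary]
    # The real-time layer fills in everything newer than the boundary.
    newer = [r for r in realtime_rows
             if r.viewee_id == viewee_id and r.viewed_at > time_boundary]
    # Most recent views first.
    return sorted(older + newer, key=lambda r: r.viewed_at, reverse=True)
```

Because both layers hold overlapping time ranges, the boundary check is what keeps a view that exists in both from appearing twice in the result.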

The Lambda approach delivered low‑latency results from the real‑time path and accurate results from the batch path, but it introduced significant operational complexity, duplicated pipelines, and higher development and maintenance costs.

Two challenges stood out: developers had to build and maintain two parallel pipelines that produced largely identical data, and the business logic had to be kept synchronized across both pipelines.

To address these issues, the team removed the offline batch processing jobs and introduced a new Samza job to handle the ProfileViewEvent and NavigationEvent streams. A single lightweight offline job was retained, solely to copy real‑time data into Pinot’s offline tables for performance and retention benefits.
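As a rough illustration of what a single streaming job over these two inputs might do, the sketch below joins a profile view with its matching navigation event to attach a view source. The article does not describe the job's logic, so everything here is assumed: the `tracking_id` join key, the `referrer` field, the `EnrichedView` output, and the `ViewJoiner` class are hypothetical, and the code is plain Python rather than Samza's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EnrichedView:
    """Hypothetical output record; field names are assumptions."""
    viewer_id: int
    viewee_id: int
    view_source: str


class ViewJoiner:
    """Buffers whichever event arrives first and emits once both are seen."""

    def __init__(self):
        self._pending_views = {}  # tracking_id -> profile view event
        self._pending_navs = {}   # tracking_id -> navigation event

    def on_profile_view(self, event: dict) -> Optional[EnrichedView]:
        key = event["tracking_id"]
        nav = self._pending_navs.pop(key, None)
        if nav is None:
            # Navigation event not seen yet; wait for it.
            self._pending_views[key] = event
            return None
        return self._emit(event, nav)

    def on_navigation(self, event: dict) -> Optional[EnrichedView]:
        key = event["tracking_id"]
        view = self._pending_views.pop(key, None)
        if view is None:
            # Profile view not seen yet; wait for it.
            self._pending_navs[key] = event
            return None
        return self._emit(view, event)

    @staticmethod
    def _emit(view: dict, nav: dict) -> EnrichedView:
        # Fall back to a placeholder when the navigation event has no referrer.
        source = nav.get("referrer", "UNKNOWN")
        return EnrichedView(view["viewer_id"], view["viewee_id"], source)
```

A production job would also need to keep this buffered state in a durable store and expire entries whose counterpart never arrives; both are left out here for brevity.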

Samza was chosen because it supports multiple programming models (including Beam), integrates well with LinkedIn’s YARN clusters, and simplifies deployment and scaling of streaming jobs.

The new architecture eliminates duplicated processing, halves development time, reduces maintenance overhead, and improves user experience with faster, more reliable real‑time calculations such as view source attribution.

Without a batch layer to clean up after the stream, deduplication and reprocessing needed their own mechanisms. Deduplication is applied in two places: at the service layer, when reading from Pinot, and at the notification layer, to avoid duplicate alerts. Offline reprocessing jobs are used for corrective data fixes.
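The two deduplication points can be sketched as follows. The article does not say how duplicates are identified, so the keys used here are assumptions: read-time deduplication collapses rows sharing a viewer, a viewee, and a time bucket, and the notification gate allows one alert per viewer, viewee, and day. `dedupe_views`, `NotificationGate`, and `bucket_seconds` are hypothetical names.

```python
def dedupe_views(rows, bucket_seconds=3600):
    """Read-time dedup: keep the earliest row per (viewer, viewee, time bucket)."""
    seen = set()
    unique = []
    for row in sorted(rows, key=lambda r: r["viewed_at"]):
        key = (row["viewer_id"], row["viewee_id"],
               row["viewed_at"] // bucket_seconds)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


class NotificationGate:
    """Notification-side dedup: at most one alert per (viewer, viewee, day)."""

    def __init__(self):
        self._sent = set()

    def should_notify(self, viewer_id, viewee_id, day):
        key = (viewer_id, viewee_id, day)
        if key in self._sent:
            return False  # an alert for this pair already went out today
        self._sent.add(key)
        return True
```

Deduplicating at read time means a replayed or double-delivered event can still land in storage without changing what the user sees, which suits a pipeline that no longer has a batch pass to correct the stored data.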

The migration demonstrates that moving away from a Lambda architecture can yield substantial productivity gains, lower operational costs, and better real‑time capabilities while still preserving necessary batch functionality.

Tags: data pipeline, batch processing, streaming, Lambda architecture, LinkedIn, Samza, Pinot