
How to Build a Scalable Friend Recommendation System with MaxCompute

This article explains how to leverage Alibaba Cloud's MaxCompute and MapReduce to design, model, and deploy a large‑scale social friend recommendation system, covering data requirements, analysis models, cloud architecture, and practical development steps.

21CTO

Apr 2, 2018


1. Big Data in Friend Recommendation Systems

The article introduces a social friend recommendation system built on MaxCompute, emphasizing three essential elements for big‑data applications: massive data volume, high‑speed data processing capability, and a concrete commercial scenario for monetization.

2. Analysis Model of the Recommendation System

The recommendation logic counts the mutual friends of users who are not yet connected. Each such pair is a candidate, and candidates are ranked by that count, so users who share more friends appear higher in the recommendation list.
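The ranking rule can be illustrated on a single machine before it is distributed. The following is a minimal sketch, not the article's implementation: the function name `rank_candidates` and the sample friendship data are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def rank_candidates(friends):
    """Rank non-connected user pairs by their number of mutual friends.

    `friends` maps each user to the set of that user's direct friends
    (hypothetical in-memory data; friendships are assumed symmetric).
    """
    common = Counter()
    for user, direct in friends.items():
        # Every pair of this user's friends has `user` as a mutual friend.
        for a, b in combinations(sorted(direct), 2):
            if b not in friends.get(a, set()):  # skip pairs already connected
                common[(a, b)] += 1
    # Most mutual friends first; ties broken by pair name for stable output.
    return sorted(common.items(), key=lambda kv: (-kv[1], kv[0]))


friends = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"A", "C", "E"},
    "E": {"D"},
}
print(rank_candidates(friends))
```

Here B and D are not friends but share A and C, so that pair ranks first with a count of 2, ahead of the pairs that share only one friend.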

3. MapReduce‑Based Computation

The workflow uses the MapReduce programming model, consisting of three stages:

Map : Transform raw relationship records into (key, value) pairs where the key is a user pair and the value is a flag: 0 if the two users are already friends, 1 if they are not known to be friends but share a common friend.

Combine : Perform local aggregation on each mapper's output, merging records that share a key to reduce the data volume before the shuffle.

Reduce : Aggregate the values for each key. Pairs that received a 0 (already friends) are discarded; for the remaining pairs, the positive total is the number of mutual friends, and those pairs become potential friend recommendations.
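The three stages above can be simulated in plain Python to check the logic before writing the actual MaxCompute job. This is a sketch under stated assumptions: the input is taken to be one record per user with that user's friend list, and the names `map_record`, `combine`, and `reduce_pairs` are hypothetical, not part of any MaxCompute API. The combiner here keeps a 0 marker whenever one is seen, so that summing locally cannot hide an existing friendship.

```python
from collections import defaultdict
from itertools import combinations


def map_record(user, friend_list):
    """Map: emit ((u, v), flag) pairs for one user's friend list."""
    for f in friend_list:
        yield tuple(sorted((user, f))), 0  # user and f are already friends
    for a, b in combinations(sorted(friend_list), 2):
        yield (a, b), 1  # a and b share `user` as a common friend


def combine(pairs):
    """Combine: aggregate one mapper's output locally.

    A key that saw a 0 is emitted once as 0; otherwise its 1s are summed.
    """
    zero_seen = set()
    partial = defaultdict(int)
    for key, value in pairs:
        if value == 0:
            zero_seen.add(key)
        else:
            partial[key] += value
    for key in zero_seen:
        yield key, 0
    for key, total in partial.items():
        if key not in zero_seen:
            yield key, total


def reduce_pairs(pairs):
    """Reduce: drop pairs flagged 0, sum the rest into mutual-friend counts."""
    grouped = defaultdict(list)
    for key, value in pairs:  # grouping by key stands in for the shuffle
        grouped[key].append(value)
    return {k: sum(v) for k, v in grouped.items() if 0 not in v}


# Hypothetical input: (user, friend list) records.
records = [
    ("A", ["B", "C", "D"]),
    ("B", ["A", "C"]),
    ("C", ["A", "B", "D"]),
    ("D", ["A", "C", "E"]),
    ("E", ["D"]),
]
# Each record is treated as its own map task for illustration.
shuffled = [kv for user, fl in records for kv in combine(map_record(user, fl))]
recommendations = reduce_pairs(shuffled)
print(recommendations)
```

On this sample, only the pairs (B, D), (A, E), and (C, E) survive the reduce step, with (B, D) having two mutual friends.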

4. Implementation on Alibaba Cloud

The system architecture integrates a front‑end application hosted on ECS, relational data stored in RDS, and data extraction to MaxCompute for analysis via the Big Data Development Platform. Results are then served back to the front‑end for display.
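Before results are served back to the front-end, the pair-level counts usually need to be reshaped into a per-user list. The sketch below shows one way to do that; the function name `top_n_per_user`, the cutoff `n`, and the sample scores are all assumptions for illustration, and the article does not specify how this step is implemented.

```python
from collections import defaultdict


def top_n_per_user(pair_scores, n=3):
    """Turn {(user_a, user_b): mutual_friend_count} into per-user lists.

    Each pair is a recommendation in both directions; candidates are
    ordered by count (highest first), then by name for stable output.
    """
    per_user = defaultdict(list)
    for (a, b), score in pair_scores.items():
        per_user[a].append((b, score))
        per_user[b].append((a, score))
    return {
        user: [c for c, _ in sorted(cands, key=lambda cs: (-cs[1], cs[0]))[:n]]
        for user, cands in per_user.items()
    }


# Hypothetical analysis output: mutual-friend counts per candidate pair.
pair_scores = {("B", "D"): 2, ("A", "E"): 1, ("C", "E"): 1, ("B", "E"): 1}
print(top_n_per_user(pair_scores, n=2))
```

The front-end can then look up a user's list directly instead of scanning all candidate pairs on each request.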

5. Technical Features of MaxCompute

Distributed architecture with flexible scaling across clusters.

Robust security including automatic error correction, sandboxing, and multi‑copy backup.

Ease of use with standard APIs, full SQL support, and upload/download tools.

Fine‑grained permission control for multi‑tenant management and data access policies.

6. Typical Use Cases of MaxCompute

MaxCompute can serve as a data warehouse, support graph computations, schedule periodic data‑analysis workflows, and provide processed data for machine‑learning platforms.

7. DataIDE (DataWorks) for Development

DataIDE offers a visual, secure environment for offline data development, simplifying task workflow creation while the underlying processing still runs on MaxCompute.

8. Development Process for MaxCompute Applications

Install and configure the MaxCompute client.

Develop MapReduce programs in Java (for example, using Eclipse).

Test the program locally.

Package the program into a JAR.

Upload the JAR to a MaxCompute project space.

Execute the MapReduce job in MaxCompute.

The article walks through a concrete friend‑recommendation example, detailing client setup, code testing, resource uploading, and job execution.

Source: Cloud Community
