Explore World Cup Analytics on EMR Serverless StarRocks – Free Trial Guide

This guide walks you through creating a fully managed EMR Serverless StarRocks instance, loading historical World Cup data, and running OLAP SQL queries to analyze championship counts and host‑nation performance, all using a free trial of compute and storage resources.

Alibaba Cloud Big Data AI Platform

What is EMR Serverless StarRocks?

EMR Serverless StarRocks is a fully managed, serverless version of the open‑source StarRocks OLAP engine on Alibaba Cloud. It supports MySQL‑compatible queries and offers high‑performance multi‑dimensional analysis, data‑lake queries, high concurrency, and real‑time analytics.

Free Trial Offer

New users can claim 5000 CU·h of compute resources and 48000 GB·h of storage for free.
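
The storage figure is a capacity-time product: GB·h is gigabytes held multiplied by hours held, so 48000 GB·h matches the 100 GB for 20 days quoted for the resource package in Step 1. The Python sketch below only works through that arithmetic. The function names and the 8 CU instance size are illustrative assumptions, and it assumes the compute quota is consumed as CUs multiplied by running hours.

```python
HOURS_PER_DAY = 24


def storage_gb_hours(capacity_gb: float, days: float) -> float:
    """Capacity-time product: gigabytes held multiplied by hours held."""
    return capacity_gb * days * HOURS_PER_DAY


def compute_hours(quota_cu_hours: float, instance_cu: float) -> float:
    """Hours a quota lasts if an instance of `instance_cu` CUs runs non-stop."""
    return quota_cu_hours / instance_cu


if __name__ == "__main__":
    # 100 GB kept for 20 days, as listed for the resource package
    print(storage_gb_hours(100, 20))
    # Hypothetical 8 CU instance running continuously on the 5000 CU·h quota
    print(compute_hours(5000, 8))
```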

Key Advantages

Fully managed, no operations overhead.

Visual instance management, monitoring and alerting.

StarRocks Manager for metadata, diagnostics, optimization, and user/role management.

Tutorial Overview

The tutorial covers creating a StarRocks instance, a database, and a table, then loading historical World Cup data and running simple OLAP queries.

Step 1 – Prepare Environment

Log in to Alibaba Cloud and navigate to Big Data Computing > Data Lake, then click “Try Now” on the EMR Serverless StarRocks card.

Complete any required RAM role authorization.

Configure the instance parameters as shown in the table below (region, VPC, subnet, resource package, etc.).

| Configuration Item | Description |
| --- | --- |
| Region | North China 2 (Beijing) |
| VPC | Select an existing VPC or create a new one. |
| Subnet | Select an existing subnet or create a new one. |
| Resource Package | 5000 CU·h of compute and 100 GB of storage for 20 days (available in CN‑North‑2, CN‑East‑2, CN‑South‑1, CN‑East‑1). |
| Instance Name | 1‑64 characters; letters, numbers, hyphens, and underscores. |
| Password | Custom admin password. |
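
The instance-name rule can be checked before submitting the form. This is a minimal sketch that assumes the rule is exactly as stated above (1 to 64 characters drawn from ASCII letters, digits, hyphens, and underscores). The console may accept a wider character set, and `is_valid_instance_name` is a hypothetical helper, not part of any Alibaba Cloud SDK.

```python
import re

# Assumption: "letters" means ASCII letters; the real console rule may be broader.
INSTANCE_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_instance_name(name: str) -> bool:
    """Return True if the whole name satisfies the 1-64 character rule."""
    return INSTANCE_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("world-cup_demo", "world cup", ""):
        print(repr(candidate), is_valid_instance_name(candidate))
```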

Step 2 – Connect via SQL Editor

In the EMR console, select EMR Serverless > StarRocks, then click StarRocks Manager.

In the “New Connection” dialog, use the default admin user and the password set earlier.

Open the SQL Editor.

Step 3 – Create Database and Table

-- Create the tutorial database (no-op if it already exists).
create database if not exists sr_db;

-- One row per tournament: final standings plus aggregate statistics.
create table if not exists sr_db.world_cup_summary(
  year varchar(20),
  HostCountry varchar(20),
  Winner varchar(50),
  Second varchar(50),
  Third varchar(50),
  Fourth varchar(50),
  GoalsScored bigint,
  QualifiedTeams bigint,
  MatchesPlayed bigint,
  Attendance bigint,
  HostContinent varchar(50),
  WinnerContinent varchar(50)
) distributed by hash(Attendance) buckets 2  -- hash rows into 2 buckets
properties("replication_num"="1");           -- keep a single replica

The table stores summary information for all 21 World Cup editions (1930‑2018).

Step 4 – Load Data

insert into sr_db.world_cup_summary values ('1938','France','Italy','Hungary','Brazil','Sweden',84,15,18,375700,'Europe','Europe');
... (additional INSERT statements for each tournament) ...
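
Each INSERT statement is typically executed as its own load, so it may be preferable to send the tournaments as one multi-row statement instead of one statement per row. The sketch below renders such a statement from Python tuples. `sql_literal` and `build_insert` are hypothetical helpers with deliberately simple escaping, intended only for trusted, hand-typed data like this table. Only the 1938 row from the article is included.

```python
def sql_literal(value) -> str:
    """Render an int or str as a SQL literal (simplified escaping, trusted data only)."""
    if isinstance(value, int):
        return str(value)
    escaped = str(value).replace("\\", "\\\\").replace("'", "''")
    return f"'{escaped}'"


def build_insert(table: str, rows) -> str:
    """Combine all rows into a single multi-row INSERT ... VALUES statement."""
    tuples = ["(" + ", ".join(sql_literal(v) for v in row) + ")" for row in rows]
    return f"insert into {table} values\n  " + ",\n  ".join(tuples) + ";"


# The one row shown in the article; the remaining tournaments are not reproduced here.
ROW_1938 = ("1938", "France", "Italy", "Hungary", "Brazil", "Sweden",
            84, 15, 18, 375700, "Europe", "Europe")

if __name__ == "__main__":
    print(build_insert("sr_db.world_cup_summary", [ROW_1938]))
```

The generated text can be pasted into the SQL Editor. For larger or untrusted datasets, a proper bulk-load path or a parameterized client would be a safer choice than string building.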

Step 5 – Perform OLAP Queries

Top 5 countries by championship count:

select Winner, count(*) as Winner_count
from sr_db.world_cup_summary
group by Winner
order by Winner_count desc
limit 5;
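
The grouping logic can be sanity-checked locally before running it on the instance. The sketch below uses Python's built-in sqlite3 module as a stand-in for StarRocks, with a reduced six-column table and three tournaments. The 1938 row is the one from the article; the 1930 and 1934 rows are well-known results added here for illustration. The query is written with count(*), and the secondary sort on Winner is an added assumption to make ties deterministic.

```python
import sqlite3

# Reduced stand-in for sr_db.world_cup_summary (standings columns only).
SAMPLE_ROWS = [
    ("1930", "Uruguay", "Uruguay", "Argentina", "USA", "Yugoslavia"),
    ("1934", "Italy", "Italy", "Czechoslovakia", "Germany", "Austria"),
    ("1938", "France", "Italy", "Hungary", "Brazil", "Sweden"),
]

TOP_WINNERS_SQL = """
select Winner, count(*) as Winner_count
from world_cup_summary
group by Winner
order by Winner_count desc, Winner
limit 5
"""


def make_sample_db() -> sqlite3.Connection:
    """Create an in-memory table holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table world_cup_summary ("
        "year text, HostCountry text, Winner text, "
        "Second text, Third text, Fourth text)"
    )
    conn.executemany(
        "insert into world_cup_summary values (?, ?, ?, ?, ?, ?)", SAMPLE_ROWS
    )
    return conn


def top_winners(conn: sqlite3.Connection):
    """Return (Winner, title count) pairs, most titles first."""
    return conn.execute(TOP_WINNERS_SQL).fetchall()


if __name__ == "__main__":
    print(top_winners(make_sample_db()))
```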

Host nation reaching semifinals:

select 'Reached Semifinals' as host_stage, count(1) as cnt
from (
  select year, HostCountry, Winner, Second, Third, Fourth
  from sr_db.world_cup_summary
  where Winner=HostCountry or Second=HostCountry or Third=HostCountry or Fourth=HostCountry
) a
union all
select 'Did Not Reach Semifinals', count(1)
from (
  select year, HostCountry, Winner, Second, Third, Fourth
  from sr_db.world_cup_summary
  where Winner!=HostCountry and Second!=HostCountry and Third!=HostCountry and Fourth!=HostCountry
) b;
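
The same split can be computed in a single scan by labeling each tournament with a CASE expression and grouping on the label. The sketch below is an alternative formulation, not the article's query. It runs against SQLite through Python's sqlite3 module as a local stand-in, on a reduced table whose 1930 and 1934 rows are well-known results added for illustration.

```python
import sqlite3

SAMPLE_ROWS = [
    ("1930", "Uruguay", "Uruguay", "Argentina", "USA", "Yugoslavia"),
    ("1934", "Italy", "Italy", "Czechoslovakia", "Germany", "Austria"),
    ("1938", "France", "Italy", "Hungary", "Brazil", "Sweden"),
]

HOST_STAGE_SQL = """
select case when HostCountry in (Winner, Second, Third, Fourth)
            then 'Reached Semifinals'
            else 'Did Not Reach Semifinals'
       end as host_stage,
       count(*) as cnt
from world_cup_summary
group by host_stage
order by host_stage
"""


def host_stage_counts():
    """Count tournaments by whether the host finished in the top four."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table world_cup_summary ("
        "year text, HostCountry text, Winner text, "
        "Second text, Third text, Fourth text)"
    )
    conn.executemany(
        "insert into world_cup_summary values (?, ?, ?, ?, ?, ?)", SAMPLE_ROWS
    )
    return conn.execute(HOST_STAGE_SQL).fetchall()


if __name__ == "__main__":
    for stage, cnt in host_stage_counts():
        print(stage, cnt)
```

One difference from the UNION ALL form: a category with no matching tournaments produces no row here, whereas the two-branch query always returns both rows, one of them with a count of 0.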

Queries for the host nation reaching the final and for the host nation winning the championship follow the same pattern and are omitted for brevity.
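
For readers who want a starting point, one possible shape for those two checks is sketched below. This is not the omitted original: it is an assumed formulation using conditional sums, run against SQLite through Python's sqlite3 module as a local stand-in. The reduced table contains the article's 1938 row plus the well-known 1930 and 1934 results, added for illustration.

```python
import sqlite3

SAMPLE_ROWS = [
    ("1930", "Uruguay", "Uruguay", "Argentina", "USA", "Yugoslavia"),
    ("1934", "Italy", "Italy", "Czechoslovakia", "Germany", "Austria"),
    ("1938", "France", "Italy", "Hungary", "Brazil", "Sweden"),
]

HOST_RESULTS_SQL = """
select sum(case when HostCountry in (Winner, Second) then 1 else 0 end) as reached_final,
       sum(case when HostCountry = Winner then 1 else 0 end) as won_title,
       count(*) as tournaments
from world_cup_summary
"""


def host_final_and_title_counts():
    """Return (hosts in the final, hosts that won, total tournaments)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table world_cup_summary ("
        "year text, HostCountry text, Winner text, "
        "Second text, Third text, Fourth text)"
    )
    conn.executemany(
        "insert into world_cup_summary values (?, ?, ?, ?, ?, ?)", SAMPLE_ROWS
    )
    return conn.execute(HOST_RESULTS_SQL).fetchone()


if __name__ == "__main__":
    print(host_final_and_title_counts())
```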

Result

Running the queries produces visualizations that show the distribution of championships, host‑nation performance, and other insights.

Conclusion

This hands‑on example shows how to create a StarRocks instance on EMR Serverless, load a dataset, and execute OLAP analyses, enabling rapid exploration of big‑data workloads.
