Explore World Cup Analytics on EMR Serverless StarRocks – Free Trial Guide
This guide walks you through creating a fully managed EMR Serverless StarRocks instance, loading historical World Cup data, and running OLAP SQL queries to analyze championship counts and host‑nation performance, all using a free trial of compute and storage resources.
What is EMR Serverless StarRocks?
EMR Serverless StarRocks is a fully managed, serverless version of the open‑source StarRocks OLAP engine on Alibaba Cloud. It supports MySQL‑compatible queries and offers high‑performance multi‑dimensional analysis, data‑lake queries, high concurrency, and real‑time analytics.
Free Trial Offer
New users can claim 5000 CU·h of compute resources and 48000 GB·h of storage for free.
Key Advantages
Fully managed, no operations overhead.
Visual instance management, monitoring and alerting.
StarRocks Manager for metadata, diagnostics, optimization, and user/role management.
Tutorial Overview
The tutorial demonstrates creating a StarRocks instance, a database, a table, loading historical World Cup data, and performing simple OLAP queries.
Step 1 – Prepare Environment
Log in to Alibaba Cloud and navigate to Big Data Computing > Data Lake, then click “Try Now” on the EMR Serverless StarRocks card.
Complete any required RAM role authorization.
Configure the instance parameters as shown in the table below (region, VPC, subnet, resource package, etc.).
Configuration Item: Description
Region: North China 2 (Beijing)
VPC: Select an existing VPC or create a new one.
Subnet: Select an existing subnet or create a new one.
Resource Package: 5000 CU·h of compute and 100 GB of storage for 20 days, which is the 48000 GB·h listed above (available in CN‑North‑2, CN‑East‑2, CN‑South‑1, CN‑East‑1).
Instance Name: 1‑64 characters; letters, numbers, hyphens, and underscores are allowed.
Password: Custom password for the admin user.
Step 2 – Connect via SQL Editor
In the EMR console, select EMR Serverless > StarRocks, then click StarRocks Manager.
In the “New Connection” dialog, use the default admin user and the password set earlier.
Open the SQL Editor.
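Once the editor opens, a quick check can confirm that the session is authenticated and can see the instance's databases. The statements below are a minimal sketch using standard MySQL-compatible commands; the databases actually listed depend on the instance.

```sql
-- Show which account the current session is running as (expected: the admin user).
select current_user();

-- List the databases visible to this account.
-- On a new instance, sr_db will not appear until it is created in Step 3.
show databases;
```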
Step 3 – Create Database and Table
create database sr_db;
create table if not exists sr_db.world_cup_summary(
year varchar(20),
HostCountry varchar(20),
Winner varchar(50),
Second varchar(50),
Third varchar(50),
Fourth varchar(50),
GoalsScored bigint,
QualifiedTeams bigint,
MatchesPlayed bigint,
Attendance bigint,
HostContinent varchar(50),
WinnerContinent varchar(50)
) distributed by hash(Attendance) buckets 2
properties("replication_num"="1");
The table stores summary information for all 21 World Cup editions (1930‑2018).
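To confirm that the table was created with the intended columns and distribution settings, the definition can be inspected from the same SQL Editor. This is a small sketch that assumes the database and table names used above.

```sql
-- Print the column names and types of the new table.
desc sr_db.world_cup_summary;

-- Print the full DDL, including the distribution clause and table properties.
show create table sr_db.world_cup_summary;
```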
Step 4 – Load Data
insert into sr_db.world_cup_summary values ('1938','France','Italy','Hungary','Brazil','Sweden',84,15,18,375700,'Europe','Europe');
... (additional INSERT statements for each tournament) ...
Step 5 – Perform OLAP Queries
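Before analyzing the data, it may be worth confirming that the load from Step 4 completed as expected. The checks below are a sketch under two assumptions: that one row was inserted per tournament (so 21 rows in total), and that the table uses StarRocks' default duplicate-key model, in which re-running an INSERT adds a second copy of the row rather than overwriting it.

```sql
-- Total rows loaded; the table is described as covering 21 editions (1930-2018).
select count(*) as loaded_rows
from sr_db.world_cup_summary;

-- Any year returned here was inserted more than once.
select year, count(*) as copies
from sr_db.world_cup_summary
group by year
having count(*) > 1;
```

If the second query returns rows, the duplicates would inflate the counts in the analyses that follow.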
Top 5 countries by championship count:
select Winner, count(*) as Winner_count
from sr_db.world_cup_summary
group by Winner
order by Winner_count desc
limit 5;
Host nation reaching semifinals:
select 'Reached Semifinals' as host_stage, count(1) as cnt
from (
select year, HostCountry, Winner, Second, Third, Fourth
from sr_db.world_cup_summary
where Winner=HostCountry or Second=HostCountry or Third=HostCountry or Fourth=HostCountry
) a
union all
select 'Did Not Reach Semifinals', count(1)
from (
select year, HostCountry, Winner, Second, Third, Fourth
from sr_db.world_cup_summary
where Winner!=HostCountry and Second!=HostCountry and Third!=HostCountry and Fourth!=HostCountry
) b;
Queries for the host nation reaching the final or winning the championship follow the same pattern and are omitted here for brevity.
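As one possible way to cover the omitted cases, the host nation's best finish can be classified with a CASE expression and counted in a single pass over the table. This is an alternative formulation, not the original article's queries, and the stage labels are illustrative.

```sql
-- Classify each tournament by how far the host nation advanced,
-- then count the tournaments in each category.
select host_stage, count(*) as cnt
from (
    select
        case
            when Winner = HostCountry then 'Champion'
            when Second = HostCountry then 'Runner-up'
            when Third = HostCountry or Fourth = HostCountry then 'Semifinalist'
            else 'Did Not Reach Semifinals'
        end as host_stage
    from sr_db.world_cup_summary
) s
group by host_stage
order by cnt desc;
```

Adding the first two counts gives the number of hosts that reached the final, and adding the first three gives the semifinal count from the query above, provided the host name is spelled identically in HostCountry and in the placement columns.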
Result
Running the queries produces visualizations that show the distribution of championships, host‑nation performance, and other insights.
Conclusion
This hands‑on example shows how to create a StarRocks instance on EMR Serverless, load a dataset, and execute OLAP analyses, enabling rapid exploration of big‑data workloads.
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.
