
Lessons Learned from Hosting PostgreSQL at Scale and Using It as a Job Queue

This article shares a detailed case study of how a startup managed a massive cinema‑ticket dataset with PostgreSQL, evaluated cloud and self‑hosted options, refined materialized‑view strategies, built a reliable job queue, and distilled practical takeaways for large‑scale database operations.

Architects Research Society

Background

The author, co‑founder of Applaudience, needed to store and process a cinema‑ticket dataset that records over a billion rows per month across 3,200+ theatres in Europe and the US. The PostgreSQL database became the core of the business, serving both as the source of truth and as a job queue.

Common Mistakes

Many teams over‑engineer data pipelines with unnecessary message queues, caching layers, and complex ETL processes. The author argues that a well‑designed single database, with atomic transactions and minimal external services, avoids most of these pitfalls.

Goals

The aim was to keep the architecture simple: no separate message‑queue service, no cache, no dedicated data‑warehouse replica. Instead, the team focused on low latency, appropriate hardware, and careful schema design to eliminate bottlenecks.

Dataset Overview

The dataset combines showtimes, ticket prices, and seat‑level data with auxiliary information from YouTube, Twitter, and weather feeds, resulting in a comprehensive time‑series view of movie releases for forecasting purposes.

Where to Host the Database

Google Cloud SQL for PostgreSQL

Amazon RDS for PostgreSQL

Aiven.io (managed PostgreSQL)

Self‑hosted on rented hardware

Google Cloud SQL

The team received $100k in credits, but the service lagged behind the latest PostgreSQL version and contained a known bug that caused data corruption. Because the fix shipped only in newer releases that Cloud SQL did not yet offer, Google Cloud SQL was a risky choice.

Amazon RDS

Provided up‑to‑date PostgreSQL versions, but lacked support for the TimescaleDB extension needed for partitioning, and Amazon’s own time‑series service (Timestream) did not meet the project’s requirements.

Aiven.io

Managed PostgreSQL with all required extensions, including TimescaleDB, and good support, but no super‑user access, which led to operational issues such as being unable to enable auto_explain and problems with logical replication.

Update 2019‑02‑05: Aiven added auto_explain support in version 1.2.

Self‑Hosted

By renting dedicated hardware, the team avoided vendor lock‑in, achieved lower costs (about 30 % cheaper than cloud), and could hire a freelance DBA for daily health checks.

Takeaways

Cloud providers prioritize proprietary solutions (BigQuery, Redshift). For a simple database that will not grow to billions of rows, a managed instance with near‑real‑time scaling, cross‑region migration, point‑in‑time recovery, and built‑in monitoring is often sufficient.

Performance Bonus

Benchmarking showed GCP’s PostgreSQL outperformed AWS by roughly 2× in the author’s tests.

Materialized Views

Initially, the team used a single large materialized view (MV) for venue data. As the dataset grew, refresh times ballooned from seconds to hours, prompting a series of refinements.

First Attempt

Created a single MV venue_view that duplicated the base table and added indexes.

CREATE MATERIALIZED VIEW venue_view AS
WITH
  auditorium_with_future_events AS (
    SELECT e1.venue_id, e1.auditorium_id
    FROM event e1
    WHERE e1.start_time > now() - INTERVAL '30 day'
      AND e1.auditorium_id IS NOT NULL
    GROUP BY e1.venue_id, e1.auditorium_id
  ),
  ...
WITH NO DATA;
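Note that WITH NO DATA creates only the view definition; the view cannot be queried until it is populated by a first refresh:

-- Populate the view for the first time. Until this runs, queries against
-- venue_view fail with "materialized view ... has not been populated".
REFRESH MATERIALIZED VIEW venue_view;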

Second Attempt – Divide and Conquer

Split the large MV into several smaller MVs (e.g., auditorium_with_future_events_view, venue_auditorium_seat_count_view) so each could be refreshed independently.

CREATE MATERIALIZED VIEW auditorium_with_future_events_view AS
SELECT venue_id, auditorium_id
FROM event
WHERE start_time > now() - INTERVAL '30 day'
  AND auditorium_id IS NOT NULL
GROUP BY venue_id, auditorium_id
WITH NO DATA;

This approach reduced transaction size and allowed index‑driven joins, but increased the number of MVs to manage.
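With the data split up, an index on each small view's join key lets the planner use index scans when the pieces are joined back together, and a unique index additionally permits non-blocking refreshes. A minimal sketch, assuming the view shown above (the index name is illustrative):

-- A unique index on the grouping keys supports index-driven joins and is
-- required for REFRESH ... CONCURRENTLY.
CREATE UNIQUE INDEX auditorium_with_future_events_view_idx
ON auditorium_with_future_events_view (venue_id, auditorium_id);

-- Each small view can now be refreshed on its own schedule without
-- blocking readers.
REFRESH MATERIALIZED VIEW CONCURRENTLY auditorium_with_future_events_view;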

Third Attempt – Abstract Sub‑sets

Created domain‑specific MVs (e.g., last_week_event , future_event ) to avoid scanning billions of rows in a single view.
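A minimal sketch of such a subset view, reusing the event table from the earlier examples (the view name last_week_event comes from the text; the column list is illustrative):

-- A narrow, time-bounded slice of the event table, so downstream views
-- join against thousands of rows instead of billions.
CREATE MATERIALIZED VIEW last_week_event AS
SELECT id, venue_id, auditorium_id, start_time
FROM event
WHERE start_time > now() - INTERVAL '7 day'
WITH NO DATA;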

Fourth Attempt – Materialized Columns

Added a materialized_at timestamp column directly to the base tables, eliminating the need for separate MVs for many use‑cases. A background job populates these columns when they are NULL.

CREATE OR REPLACE FUNCTION materialize_event_seat_state_change()
RETURNS void AS $$
BEGIN
  WITH event_seat_state_count AS (
    SELECT essc1.id,
           count(*)::smallint AS seat_count,
           ...
    FROM event e1
    LEFT JOIN event_seat_state_change essc1 ON essc1.event_id = e1.id
    ...
    GROUP BY essc1.id
  )
  UPDATE event_seat_state_change essc1
  SET materialized_at = now(),
      seat_count = essc2.seat_count,
      ...
  FROM event_seat_state_count essc2
  WHERE essc1.id = essc2.id;
END
$$ LANGUAGE plpgsql;
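The background job needs a cheap way to find rows whose materialized_at is still NULL. A partial index keeps that lookup fast regardless of total table size; a sketch (the index name is illustrative):

-- Only rows still awaiting materialization are indexed, so the index
-- stays tiny even as the table grows.
CREATE INDEX event_seat_state_change_unmaterialized_idx
ON event_seat_state_change (id)
WHERE materialized_at IS NULL;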

Using the Database as a Job Queue

By wrapping a SELECT … FOR UPDATE SKIP LOCKED subquery inside an UPDATE, the team built a simple, reliable, and high‑throughput concurrent work queue without external services.

CREATE OR REPLACE FUNCTION schedule_cinema_data_task()
RETURNS table(cinema_data_task_id int) AS $$
DECLARE scheduled_id int;
BEGIN
  -- Claim the oldest unattempted task. SKIP LOCKED makes concurrent
  -- workers pass over rows already locked by another transaction
  -- instead of blocking on them.
  UPDATE cinema_data_task_queue
  SET attempted_at = now()
  WHERE id = (
    SELECT id FROM cinema_data_task_queue
    WHERE attempted_at IS NULL
    ORDER BY id ASC
    LIMIT 1
    FOR UPDATE SKIP LOCKED
  )
  -- Qualify the column: the bare name would be ambiguous with the
  -- function's output column of the same name in plpgsql.
  RETURNING cinema_data_task_queue.cinema_data_task_id INTO scheduled_id;

  -- Record the attempt on the task itself, then hand the id to the caller.
  UPDATE cinema_data_task SET last_attempted_at = now() WHERE id = scheduled_id;
  RETURN QUERY SELECT scheduled_id;
END
$$ LANGUAGE plpgsql;

This pattern kept queue scans fast, allowed scaling to over 2,000 concurrent aggregation agents, and ensured that the scheduling query executed in a few milliseconds.
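In use, a worker simply calls the function and, once the work succeeds, clears the queue row so the queue table stays small. The DELETE below is an illustrative sketch not shown in the original, and 123 is a placeholder task id:

-- Claim the next task; returns a NULL id once the queue is drained.
SELECT cinema_data_task_id FROM schedule_cinema_data_task();

-- On success, remove the claimed row. Failed tasks keep their
-- attempted_at timestamp and can be re-queued by a separate sweep.
DELETE FROM cinema_data_task_queue WHERE cinema_data_task_id = 123;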

Key Lessons

PostgreSQL materialized views are excellent for small to medium datasets but require careful planning as data grows.

Fine‑grained MV refreshes and materialized columns provide near‑real‑time data enrichment without heavy data‑warehouse pipelines.

Using the database itself as a job queue works well when the queue table stays reasonably sized and the scheduling query continues to execute in a few milliseconds.

Latency between application and database matters; colocating them can multiply throughput.

Proper index ordering, vacuum settings, and monitoring (pg_stat_statements, pg_stat_user_tables, pg_stat_activity) are essential for large‑scale PostgreSQL deployments.
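As one example, pg_stat_statements can rank queries by cumulative execution time. The extension must be preloaded via shared_preload_libraries, and on PostgreSQL 13+ the columns are named total_exec_time and mean_exec_time instead:

-- Top 10 statements by total time spent executing them.
SELECT query, calls, total_time, mean_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;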

Miscellaneous Tips

Adjust from_collapse_limit and join_collapse_limit from the default of 8 to higher values to avoid sub‑optimal plans.
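These settings can be raised per session or globally in postgresql.conf; the values below are illustrative:

-- Let the planner reorder larger join trees before giving up, trading
-- extra planning time for potentially much better plans.
SET from_collapse_limit = 16;
SET join_collapse_limit = 16;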

Consider pg_repack or pg_squeeze for table bloat.

Continuously monitor query performance and index usage.

Bonus: Slonik PostgreSQL Client

The team switched from node‑postgres to Slonik, a type‑safe, logged, and assertion‑rich client that prevents SQL injection and integrates with auto_explain.

For more details, see the original article at http://jiagoushi.pro/node/1177.

Tags: PostgreSQL, Materialized Views, Database Scaling, Cloud Hosting, Job Queue
Written by

Architects Research Society

A daily treasure trove for architects, expanding your view and depth. We share enterprise, business, application, data, technology, and security architecture, discuss frameworks, planning, governance, standards, and implementation, and explore emerging styles such as microservices, event‑driven, micro‑frontend, big data, data warehousing, IoT, and AI architecture.
