Maintaining Wide Tables: Resource Impact, Evaluation, Granularity, Timeliness, and Automatic Expansion

The article explains how wide tables are maintained without excessive resource consumption, outlines criteria for deciding which metrics belong in a wide table, describes their granularity and timeliness considerations, and clarifies that they do not automatically expand when tracking points change.

DataFunTalk

Dec 25, 2022

Q1: How are wide tables maintained, and do they consume excessive resources? Wide tables may appear to use more resources than star-schema tables, but in practice they do not, because only the necessary metrics are included; low-frequency, highly personalized metrics are left out. A wide table is a coarse-grained collection of atomic metrics: it is flexible, costs less to maintain, and suits fast-changing internet businesses, but it is unsuitable for domains such as finance.

Q2: How do you evaluate whether a metric should be placed in a wide table? The decision rests on how often and how widely the metric is used. Metrics with few users, low usage frequency, or strong personalization are not suitable, while common metrics such as daily active users (DAU), homepage PV, and content exposure PV are appropriate. A niche metric such as a floating-ball click PV is excluded.
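The screening rule above could be sketched roughly as follows. This is only an illustration: the article gives no numeric cut-offs, so the thresholds, the `MetricUsage` fields, and the usage figures are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the article does not specify any numbers.
MIN_WEEKLY_QUERIES = 50
MIN_CONSUMER_TEAMS = 3


@dataclass(frozen=True)
class MetricUsage:
    name: str
    weekly_queries: int   # how often the metric is read (usage frequency)
    consumer_teams: int   # how many teams read it (usage scope)
    personalized: bool    # True if it serves one team's special-purpose need


def belongs_in_wide_table(metric: MetricUsage) -> bool:
    """Keep only metrics that are used often, used widely, and not personalized."""
    if metric.personalized:
        return False
    return (
        metric.weekly_queries >= MIN_WEEKLY_QUERIES
        and metric.consumer_teams >= MIN_CONSUMER_TEAMS
    )


# Made-up usage figures, for illustration only.
candidates = [
    MetricUsage("dau", 900, 12, False),
    MetricUsage("homepage_pv", 400, 8, False),
    MetricUsage("floating_ball_click_pv", 5, 1, True),
]

selected = [m.name for m in candidates if belongs_in_wide_table(m)]
print(selected)
```

In practice the usage figures would come from query logs or a metadata catalog, and the thresholds would be agreed with the teams that consume the table.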

Q3: What is the granularity of a user wide table? A wide table has no primary key and acts as a “bucket” of common metrics; the user-ID primary key is provided by the user master table. Core metrics (e.g., DAU, retention, revenue, consumption) are aggregated in the wide table, whereas finer-grained metrics must be retrieved from the detailed user-behavior tables.

Q4: How do you ensure the timeliness of a wide table? Timeliness depends on how efficiently source data is ingested and on SQL performance. Computationally complex data can be processed in parallel instead of being chained through upstream/downstream dependencies, and big-data platforms can typically assign such tasks a higher priority to speed up processing.
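As a minimal sketch of the parallel approach, the independent metric computations below run side by side and are then merged into one wide row, rather than each waiting on the previous one. The sample events, function names, and use of a thread pool are assumptions for illustration; a real pipeline would express this through its scheduler and SQL jobs.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Event = Dict[str, str]

# Tiny made-up sample standing in for a day's source data.
SAMPLE_EVENTS: List[Event] = [
    {"user": "u1", "event": "homepage_view"},
    {"user": "u2", "event": "homepage_view"},
    {"user": "u1", "event": "content_exposure"},
]


def compute_dau(events: List[Event]) -> Dict[str, int]:
    # Distinct users seen in the sample.
    return {"dau": len({e["user"] for e in events})}


def compute_homepage_pv(events: List[Event]) -> Dict[str, int]:
    return {"homepage_pv": sum(1 for e in events if e["event"] == "homepage_view")}


def compute_content_exposure_pv(events: List[Event]) -> Dict[str, int]:
    return {"content_exposure_pv": sum(1 for e in events if e["event"] == "content_exposure")}


def build_wide_row(events: List[Event]) -> Dict[str, int]:
    """Run independent metric jobs in parallel, then merge their results."""
    jobs: List[Callable[[List[Event]], Dict[str, int]]] = [
        compute_dau,
        compute_homepage_pv,
        compute_content_exposure_pv,
    ]
    row: Dict[str, int] = {}
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        # No job depends on another, so none has to wait for an upstream result.
        for partial in pool.map(lambda job: job(events), jobs):
            row.update(partial)
    return row


row = build_wide_row(SAMPLE_EVENTS)
print(row)
```

The point of the sketch is the shape of the dependency graph: the three jobs fan out from the source data and meet only at the final merge, so the slowest job, not the sum of all jobs, bounds the completion time.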

Q5: When tracking points change, does the data-warehouse wide table expand automatically? No. A wide table is built from metric definitions; tracking events are merely inputs to those metrics, and a single metric may draw on multiple events, so adding or changing an event does not by itself extend the table.
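One way to picture this: the table's columns come from the metric definitions, and each definition lists the events that feed it. In the sketch below, all metric and event names are hypothetical. A new tracking event appears in the data, but because no metric definition references it, the columns and values stay the same.

```python
from typing import Dict, List, Set

# Hypothetical metric definitions: column name -> tracking events that feed it.
METRIC_DEFINITIONS: Dict[str, Set[str]] = {
    "homepage_pv": {"homepage_view"},
    # A single metric may draw on several events.
    "content_exposure_pv": {"feed_exposure", "detail_exposure"},
}


def wide_table_columns(definitions: Dict[str, Set[str]]) -> List[str]:
    """Columns are determined by metric definitions, not by observed events."""
    return sorted(definitions)


def compute_metrics(events: List[str], definitions: Dict[str, Set[str]]) -> Dict[str, int]:
    """Count, per metric, the events that its definition references."""
    return {
        metric: sum(1 for e in events if e in sources)
        for metric, sources in definitions.items()
    }


events_before = ["homepage_view", "feed_exposure", "detail_exposure"]
# A new tracking point starts reporting, but no metric definition uses it.
events_after = events_before + ["floating_ball_click"]

metrics_before = compute_metrics(events_before, METRIC_DEFINITIONS)
metrics_after = compute_metrics(events_after, METRIC_DEFINITIONS)
print(wide_table_columns(METRIC_DEFINITIONS), metrics_before, metrics_after)
```

Under this model, the table only grows when someone deliberately adds a new entry to the metric definitions.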
