Unlock Inceptor Stage Metrics: A Guide to Faster Data Processing
This article explains how to navigate Inceptor's Stage information page, interpret Summary Metrics, Aggregated Metrics by Executor, and Task details, and use DAG Visualization and Event Timeline charts to diagnose performance issues and optimize big‑data workloads.
Stage Information Page Overview
The Stage page in Inceptor provides three main data tables—Summary Metrics, Aggregated Metrics by Executor, and Tasks—plus two visual charts: DAG Visualization and Event Timeline.
Summary Metrics for Completed Tasks
This table aggregates statistics for all completed tasks in a Stage, breaking down task overhead into five metrics: result serialization time, total execution time, result fetch time, latency, and Shuffle write size. For each metric, the table shows Min, 25th percentile, Median, 75th percentile, and Max, which together describe how task cost is distributed. If Max sits far above the 75th percentile, a small number of tasks cost much more than the rest.
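To make the five columns concrete, here is a minimal sketch of how such a summary could be computed from per-task values exported from the page. The function name and the sample durations are hypothetical, and the quartile method used here (Python's inclusive method) may not match the one Inceptor uses internally.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (min, p25, median, p75, max) for a list of per-task values."""
    if not values:
        raise ValueError("need at least one task value")
    ordered = sorted(values)
    if len(ordered) == 1:
        only = ordered[0]
        return (only, only, only, only, only)
    # method="inclusive" keeps the quartiles between the observed min and max.
    p25, median, p75 = quantiles(ordered, n=4, method="inclusive")
    return (ordered[0], p25, median, p75, ordered[-1])


# Hypothetical task durations in milliseconds; one task is far slower.
durations_ms = [100, 120, 110, 130, 900]
summary = five_number_summary(durations_ms)
print(summary)
```

In this made-up sample the Max is several times the 75th percentile, which is the pattern to look for when scanning the table for outlier tasks.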
Aggregated Metrics by Executor
This table lists executor‑level metrics; most columns are self‑explanatory. The two Shuffle Spill columns (Memory and Disk) can usually be ignored on a first pass, although consistently large spill values suggest that tasks are short of memory.
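The executor-level view is essentially a group-by over the task rows. The sketch below illustrates that idea on hypothetical task records; the field names (`executor`, `duration_ms`, `shuffle_write_bytes`) are assumptions for illustration, not the column names Inceptor exposes.

```python
from collections import defaultdict


def aggregate_by_executor(tasks):
    """Sum task count, task time, and shuffle write size per executor."""
    totals = defaultdict(
        lambda: {"tasks": 0, "task_time_ms": 0, "shuffle_write_bytes": 0}
    )
    for task in tasks:
        row = totals[task["executor"]]
        row["tasks"] += 1
        row["task_time_ms"] += task["duration_ms"]
        row["shuffle_write_bytes"] += task["shuffle_write_bytes"]
    return dict(totals)


# Hypothetical task rows copied by hand from the Tasks table.
sample_tasks = [
    {"executor": "exec-1", "duration_ms": 100, "shuffle_write_bytes": 2048},
    {"executor": "exec-1", "duration_ms": 120, "shuffle_write_bytes": 1024},
    {"executor": "exec-2", "duration_ms": 900, "shuffle_write_bytes": 8192},
]
per_executor = aggregate_by_executor(sample_tasks)
for name, row in sorted(per_executor.items()):
    print(name, row)
```

Comparing the per-executor totals side by side makes it easier to see whether one executor carries a disproportionate share of the work.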
Tasks
The Tasks table shows 14 columns such as ID, Status, Executor, Launch Time, Duration, Input, Write Time, Shuffle Write, Attempt, and Locality Level. "Attempt" indicates retry count; a high value may signal resource shortage or data skew. "Locality Level" describes the relationship between computation and data location.
PROCESS_LOCAL : computation and data reside in the same JVM on the same machine, offering the fastest access.
NODE_LOCAL : computation and data are on the same machine but not in the same JVM, for example data held by another executor or stored on a local disk.
RACK_LOCAL : computation must fetch data from a different machine in the same rack, incurring network latency.
Ideally, tasks should be PROCESS_LOCAL to minimize data transfer. When a few tasks have much higher Duration values than the rest, it may indicate data skew; if the long‑running tasks cluster on a single executor or machine, that node itself may have a problem.
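The checks described above can be sketched in a few lines. This is only an illustration on hypothetical task records: the field names, the sample values, and the threshold of three times the median duration are all assumptions, not anything defined by Inceptor.

```python
from collections import Counter
from statistics import median


def find_stragglers(tasks, factor=3.0):
    """Return tasks whose duration exceeds `factor` times the median duration."""
    typical = median(task["duration_ms"] for task in tasks)
    return [task for task in tasks if task["duration_ms"] > factor * typical]


def stragglers_per_executor(tasks, factor=3.0):
    """Count slow tasks per executor to see whether they cluster on one node."""
    return Counter(task["executor"] for task in find_stragglers(tasks, factor))


def locality_counts(tasks):
    """Count tasks per Locality Level."""
    return Counter(task["locality"] for task in tasks)


# Hypothetical rows from the Tasks table.
tasks = [
    {"id": 0, "executor": "exec-1", "duration_ms": 100, "locality": "PROCESS_LOCAL"},
    {"id": 1, "executor": "exec-1", "duration_ms": 110, "locality": "PROCESS_LOCAL"},
    {"id": 2, "executor": "exec-3", "duration_ms": 120, "locality": "NODE_LOCAL"},
    {"id": 3, "executor": "exec-3", "duration_ms": 130, "locality": "PROCESS_LOCAL"},
    {"id": 4, "executor": "exec-2", "duration_ms": 900, "locality": "RACK_LOCAL"},
    {"id": 5, "executor": "exec-2", "duration_ms": 950, "locality": "RACK_LOCAL"},
]
slow_ids = [task["id"] for task in find_stragglers(tasks)]
slow_by_executor = stragglers_per_executor(tasks)
by_locality = locality_counts(tasks)
print(slow_ids, dict(slow_by_executor), dict(by_locality))
```

In this sample both slow tasks run on the same executor and both are RACK_LOCAL, so the node and its data placement would be worth checking before concluding that the data itself is skewed.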
DAG Visualization
Clicking this button displays a directed acyclic graph of the selected Stage, showing operator execution order and relationships.
Event Timeline
This chart visualizes executor runtime distribution; each horizontal bar represents a task’s execution time, with colors indicating different task phases. Typically, the computation time (green) dominates.
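What a single bar in the timeline encodes can be approximated as a share-per-phase calculation. The sketch below assumes hypothetical phase names and timings; the actual phases and colors shown by Inceptor may differ.

```python
def phase_shares(phases_ms):
    """Return each phase's fraction of the task's total time."""
    total = sum(phases_ms.values())
    if total == 0:
        return {name: 0.0 for name in phases_ms}
    return {name: ms / total for name, ms in phases_ms.items()}


# Hypothetical phase timings for one task, in milliseconds.
task_phases = {
    "scheduler_delay": 10,
    "deserialization": 5,
    "computing": 80,
    "result_serialization": 5,
}
shares = phase_shares(task_phases)
dominant = max(shares, key=shares.get)
print(dominant, shares[dominant])
```

A task whose dominant phase is something other than computing, such as scheduling or result fetching, is a candidate for closer inspection.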
Summary
The article clarifies the meaning of each column in the Summary Metrics, Aggregated Metrics by Executor, and Tasks tables, and explains how to use DAG Visualization and Event Timeline to gain intuitive insight into Stage execution and cost distribution. By combining these views, users can detect anomalies such as data skew, uneven task distribution, or problematic executors, and take targeted actions to improve performance.
StarRing Big Data Open Lab
Focused on big data technology research, exploring the Big Data era
