Tagged articles
35 articles
Data STUDIO
Nov 21, 2025 · Big Data

How a One‑Line Pandas Change Cuts GroupBy Time from 40 Minutes to 4 Seconds

The article shows why a naïve Pandas groupby on a 25‑million‑row DataFrame can take 40 minutes, identifies common performance killers, and demonstrates that converting the grouping column to the categorical dtype with observed=True and sort=False reduces runtime to about 4 seconds while also cutting memory usage dramatically.
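The summary above names three ingredients: a categorical grouping column, `observed=True`, and `sort=False`. A minimal sketch of that combination follows. The column names (`store`, `sales`), the synthetic data, and the row count are assumptions for illustration, not the article's dataset, and no timing is claimed for this small example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a low-cardinality string key and a numeric value column.
rng = np.random.default_rng(0)
n_rows = 100_000
df = pd.DataFrame({
    "store": rng.choice(["north", "south", "east", "west"], size=n_rows),
    "sales": rng.random(n_rows),
})

# Baseline: group directly on the string column.
baseline = df.groupby("store")["sales"].sum()

# Convert the key to the categorical dtype, then group with
# observed=True (only categories present in the data) and
# sort=False (skip sorting the group keys).
df["store_cat"] = df["store"].astype("category")
fast = df.groupby("store_cat", observed=True, sort=False)["sales"].sum()

# Compare the memory footprint of the two key columns.
string_bytes = df["store"].memory_usage(deep=True)
category_bytes = df["store_cat"].memory_usage(deep=True)
print(f"string key: {string_bytes} bytes, categorical key: {category_bytes} bytes")
```

Both groupbys produce the same sums; only the key representation and group ordering differ. How much time is saved depends on the data, so it is worth measuring on the real DataFrame before and after the change.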

Python · category dtype · data-processing
7 min read
Python Crawling & Data Mining
Apr 12, 2025 · Fundamentals

How to Group and Map Data in Pandas: 5 Practical Methods

This article walks through a common Python data‑processing challenge—grouping numeric identifiers with corresponding strings—by presenting five distinct Pandas‑based solutions, complete with code snippets and visual results, enabling readers to efficiently transform raw lists into organized dictionaries.
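The five methods themselves are not reproduced in this summary. As one plausible illustration of the task it describes (pairing numeric identifiers with strings and collecting them into a dictionary), here is a sketch using `groupby` with a list aggregation. The input lists and the column names `id` and `label` are hypothetical.

```python
import pandas as pd

# Hypothetical raw input: parallel lists of identifiers and strings.
ids = [1, 1, 2, 3, 3]
labels = ["apple", "avocado", "banana", "cherry", "cranberry"]

df = pd.DataFrame({"id": ids, "label": labels})

# Collect every label that shares an identifier into a list,
# then convert the resulting Series into a plain dictionary.
grouped = df.groupby("id")["label"].agg(list).to_dict()
print(grouped)
```

Each key in `grouped` is an identifier and each value is the list of strings that appeared with it, in their original order.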

Code Examples · Python · data-processing
8 min read
Python Crawling & Data Mining
Jan 28, 2025 · Fundamentals

Master Pandas: From Data Import to Advanced Manipulation in Python

This tutorial walks you through pandas fundamentals—including reading CSV/Excel files, creating Series and DataFrames, performing basic operations, cleaning data, using loc/iloc indexing, grouping, concatenating, merging, and handling time series—providing code examples and visual outputs for each step.

Time Series · data cleaning · groupby
14 min read
Alibaba Cloud Observability
Sep 5, 2024 · Databases

How SLS Achieves 8× Faster High‑Cardinality GroupBy Queries

This article explains the challenges of high‑cardinality GroupBy operations, describes SLS's underlying implementation and session‑based optimizations, and presents three real‑world test cases that demonstrate up to an eight‑fold speed improvement for massive data aggregations.

Performance Testing · SLS · SQL Optimization
10 min read
Alibaba Cloud Native
Sep 4, 2024 · Big Data

How to Speed Up High‑Cardinality GroupBy Queries by Up to 8× in SLS

This article explains why high‑cardinality GroupBy queries are slow, describes SLS's underlying aggregation pipeline, and shows how adjusting session parameters and enabling high‑cardinality optimizations can reduce query times from dozens of seconds to just a few seconds across three real‑world test scenarios.

SLS · SQL · big-data
11 min read
Python Crawling & Data Mining
Aug 15, 2024 · Fundamentals

Master Pandas: Essential Data Manipulation Techniques for Beginners

This comprehensive tutorial walks you through pandas basics, including reading CSV and Excel files, creating Series and DataFrames, performing data inspection, cleaning, indexing, hierarchical indexing, time‑series handling, grouping, aggregation, concatenation, merging, and practical code examples with visual outputs.

Time Series · data cleaning · groupby
12 min read
21CTO
Jul 30, 2024 · Frontend Development

What’s New in ECMAScript 2024? Key Features and Their Impact on JavaScript Development

The article reviews ECMAScript 2024, highlighting new small‑scale features such as improved WebAssembly interop, enhanced Promise utilities, group‑by methods, better Unicode handling, async locking with Atomics.waitAsync, and resizable ArrayBuffers, while also discussing upcoming proposals for 2025.

Async · ECMAScript 2024 · JavaScript
17 min read
Code Ape Tech Column
Jun 21, 2023 · Big Data

From Java Streams to Spark: Basic Big Data Operations Explained

This article demonstrates how developers familiar with Java Stream APIs can quickly grasp fundamental Spark operations—including map, flatMap, groupBy, and reduce—by translating stream examples into Spark code, providing complete code snippets, explanations of transformations versus actions, and practical tips for handling exceptions in distributed processing.

Big Data · Java Stream · MAP
24 min read