Tagged articles

Approximate Distinct Count

1 article · Page 1 of 1

Jan 7, 2020 · Big Data

Using HyperLogLog for High-Performance Pre-Aggregation in Big Data with Spark-Alchemy

The article explains how pre‑aggregation combined with the HyperLogLog algorithm and Spark‑Alchemy's native HLL functions can dramatically accelerate distinct‑count calculations in big‑data workloads while maintaining low error rates and cross‑system compatibility.

Approximate Distinct Count · Big Data · HyperLogLog

7 min read
