Tag

SparkR

Big Data Technology Architecture
Aug 8, 2020 · Big Data

Performance Comparison of SparkR with Vectorized Execution Using Apache Arrow

This article compares SparkR’s performance with that of the native Spark APIs, shows the slowdown caused by serialization between the JVM and R, and demonstrates how enabling Apache Arrow’s vectorized execution in Spark 3.0 can make SparkR operations up to dozens of times faster.

Apache Arrow · SparkR · Vectorized Execution
0 likes · 7 min read
Qunar Tech Salon
Aug 18, 2015 · Big Data

Overview of Spark Big Data Analytics Framework Components

Spark’s big‑data analytics ecosystem comprises several core components: the in‑memory RDD data structure, Streaming for real‑time processing, GraphX for graph analytics, MLlib for machine learning, Spark SQL for querying, the Tachyon file system, and SparkR. Each enables scalable, distributed computation.

GraphX · MLlib · RDD
0 likes · 5 min read