Tag

Scala


Bitu Technology
Mar 21, 2025 · Backend Development

Optimizing Redis Latency for an Online Feature Store: A Batch Query Case Study

This article describes how Tubi improved the latency of its Redis‑backed online feature store for machine‑learning inference by analyzing query patterns, measuring client‑side bottlenecks, and applying optimizations such as binary Avro encoding, MGET usage, virtual partitioning, and parallel deserialization to meet a sub‑10 ms SLA.

Batch Query · Feature Store · Redis
0 likes · 9 min read
DataFunSummit
Nov 11, 2024 · Big Data

Understanding Spark SQL Parsing Layer and Its Optimizations

This talk, the third in a Spark series, introduces the Spark SQL parsing layer, explains its architecture and integration with ANTLR4, details core implementation classes, and presents a real‑world optimization case that reduces code complexity and improves maintainability.

ANTLR4 · Big Data · Optimization
0 likes · 15 min read
Bitu Technology
Dec 8, 2023 · Backend Development

Why Every Java Developer Should Learn Scala – Key Advantages and Insights from the Scala Meetup

The article reviews a Scala meetup where experts compare Java and Scala, highlighting Scala's stronger expressiveness, type inference, pattern matching, safety, and concurrency features, and discusses real‑world adoption, developer experiences, and a recruitment opportunity for a Scala‑focused big‑data team.

Big Data · Functional Programming · Java
0 likes · 13 min read
Bitu Technology
Dec 8, 2023 · Fundamentals

Exploring Partial POJOs, Meta‑Information, and Scala 3 Mirror in a Scala Meetup

The article recaps the ninth Scala Meetup where speaker Pei Qi (Li Guobin) introduced GraphQL‑driven partial POJOs, Scala 3's Mirror feature, Magnolia as an alternative, type‑class concepts, and compared Scala 2 and Scala 3, while also promoting the Tubi data‑platform lead role and past meetup resources.

GraphQL · Magnolia · PartialPOJO
0 likes · 5 min read
Rare Earth Juejin Tech Community
May 8, 2023 · Artificial Intelligence

Review of Alibaba's Tongyi Qianwen AI Model with Sample Code, Recipe, and SWOT Analysis

This article reviews Alibaba's Tongyi Qianwen large language model, shares personal impressions, provides a fish‑flavored pork recipe, conducts a SWOT analysis, and includes Scala Spark and Java code examples illustrating its capabilities and usage scenarios.

Artificial Intelligence · Java · SWOT Analysis
0 likes · 12 min read
Rare Earth Juejin Tech Community
Apr 28, 2023 · Artificial Intelligence

Exploring Alibaba’s Tongyi Qianwen AI Model, SWOT, Recipe Demo, and Code Samples for Spark Same‑Period Analysis and Java Bubble Sort

The article reviews Alibaba’s Tongyi Qianwen large‑language model, shares a cooking recipe generated by the AI, presents a SWOT analysis, and provides code examples—including a Spark Scala script for same‑period month‑over‑month calculations and a Java bubble‑sort implementation.

AI · Java · SWOT
0 likes · 12 min read
Sohu Tech Products
Sep 7, 2022 · Big Data

Introducing the Fire Framework: Annotation‑Driven Development for Spark and Flink

The Fire framework, open‑sourced by ZTO Express, provides a unified annotation‑based programming model for real‑time Spark and Flink jobs, dramatically reducing boilerplate, simplifying configuration, and enabling rapid development of large‑scale data processing tasks with concise Scala code examples.

Annotations · Big Data · Fire Framework
0 likes · 12 min read
政采云技术
Sep 6, 2022 · Big Data

Compiling and Deploying Spark 3.3.0 on CDH 6.3.2 (Cloudera) – Step‑by‑Step Guide

This guide explains how to download JDK, Maven, Scala and Spark 3.3.0, modify the Spark pom and configuration files for CDH 6.3.2, compile Spark with Maven, deploy the binaries to a client node, set up spark‑sql and spark‑submit scripts, and address common runtime issues.

Big Data · CDH · Compilation
0 likes · 13 min read
Bitu Technology
Jun 29, 2022 · Backend Development

Recap of Scala Meetup #7: Tubi Recommendation System Architecture, The Nature of Computation, and Reactive Streams in Large-Scale Scenarios

The seventh Scala Meetup gathered over 1400 online participants to share three technical talks covering Tubi's content recommendation system architecture, philosophical insights into the nature of computation, and practical experiences with reactive streams in large‑scale JVM environments, followed by a round‑table discussion and audience feedback.

Functional Programming · Reactive Streams · Scala
0 likes · 15 min read
Bitu Technology
Jun 14, 2022 · Backend Development

Why Tubi Chose Scala: A Backend Transformation Story

The article explains how Tubi, an ad‑supported streaming platform with tens of millions of users, migrated from a Node.js‑Spark‑Redis stack to a Scala‑based backend to achieve sub‑10‑millisecond recommendation latency, improve fault tolerance, and simplify domain modeling using Akka and functional programming.

Akka · Backend · Scala
0 likes · 7 min read
DataFunTalk
Apr 30, 2022 · Artificial Intelligence

Insights into BIDMach: An Unusual Machine Learning Framework and Thoughts on Building Industrial‑Grade ML Systems

The article introduces BIDMach, a compact Scala‑based machine‑learning framework built with JNI‑driven CUDA/MKL, explains its three‑layer architecture, and discusses broader considerations for designing usable, high‑performance, and extensible industrial AI frameworks, emphasizing co‑design, algorithm‑framework co‑evolution, and ecosystem factors.

BIDMach · Co-design · Scala
0 likes · 8 min read
Big Data Technology Architecture
Nov 2, 2021 · Big Data

Comprehensive Guide to FlinkSQL and Table API: Background, Dependencies, Planners, and Usage

This article provides a detailed introduction to FlinkSQL, covering its background, the Table API, required dependencies, differences between the old and Blink planners, various API usage patterns, connector configurations for CSV, Kafka, Elasticsearch, and MySQL, and how to convert between DataStream and Table in Flink's unified batch‑stream processing model.

Connector · DataStream · FlinkSQL
0 likes · 23 min read
Big Data Technology Architecture
Jul 15, 2021 · Big Data

Resolving Spark Task Not Serializable Errors: Causes, Code Examples, and Best Practices

This article analyzes why Spark tasks fail with a "Task not serializable" exception when closures reference class members, demonstrates the issue with Scala code examples, and provides practical solutions such as using @transient annotations, moving functions to objects, and ensuring proper class serialization.

Big Data · Scala · Serialization
0 likes · 12 min read
Big Data Technology Architecture
Jun 29, 2021 · Big Data

Implementing and Registering a Custom SparkListener in Apache Spark

This article explains how to create a custom SparkListener in Apache Spark, provides Scala code examples for the listener and a main application, and details two registration approaches—via Spark configuration or SparkContext—along with a comprehensive list of listener event methods.

Apache Spark · Big Data · Event Listener
0 likes · 5 min read
GrowingIO Tech Team
Jun 25, 2021 · Backend Development

How Scala Macros Simplify Protobuf‑Java ↔ Scala Case Class Conversions

This article explains a Scala‑based DSL that uses macros and implicit parameters to generate type‑safe, boilerplate‑free conversion code between protobuf‑java messages and Scala case classes, showing examples, builder customisation, and upcoming Scala 3 support.

DSL · Macro · Protobuf
0 likes · 11 min read
Qunar Tech Salon
Jun 1, 2021 · Big Data

Integrating TensorFlow for Java with Spark‑Scala for Distributed Machine Learning Prediction

This article shares practical experience of building a high‑performance distributed prediction service by combining TensorFlow for Java with Spark‑Scala, covering framework selection, performance comparison, model training, loading, inference, deployment, and optimization techniques for large‑scale data processing.

Big Data · Java · Scala
0 likes · 16 min read
Architect
Apr 3, 2021 · Big Data

Advanced Spark Performance Optimization: Data Skew and Shuffle Tuning

This article explains advanced Spark performance tuning techniques, focusing on diagnosing and resolving data skew and shuffle bottlenecks through stage analysis, key distribution inspection, and a variety of practical solutions such as Hive pre‑processing, key filtering, parallelism increase, two‑stage aggregation, map‑join, and combined strategies, while also covering ShuffleManager internals and related configuration parameters.

Big Data · Performance Tuning · Scala
0 likes · 47 min read
DataFunTalk
Mar 18, 2021 · Fundamentals

Building Popper: Tubi’s Scalable Experimentation Platform

Tubi’s Popper platform combines a Scala‑based experiment engine, reproducible JSON‑stored configurations, a React UI, and data pipelines using Spark and Akka to enable fast, cross‑team A/B testing, automated analysis, health checks, and data‑driven decision making across mobile and OTT services.

A/B testing · Akka · Experimentation platform
0 likes · 15 min read
Big Data Technology Architecture
Mar 13, 2021 · Big Data

Understanding mapPartitions vs map in Apache Spark: Performance, Pitfalls, and Proper Usage

This article examines why many developers favor Spark's mapPartitions over map, analyzes the underlying source code, highlights common pitfalls such as complexity and OOM risks, and provides practical guidelines and code examples for correctly using mapPartitions in both simple and advanced scenarios.

Big Data · Iterator · Scala
0 likes · 9 min read