Tag

Apache Seatunnel


DataFunSummit
Feb 24, 2025 · Big Data

Building Real-Time Data Synchronization Pipelines with Apache SeaTunnel

Apache SeaTunnel is an open‑source, distributed data integration platform for real‑time data synchronization across diverse sources and destinations, supporting both streaming and batch processing. This article covers its architecture, connector plugins, CDC handling, transform capabilities, and deployment strategies for large‑scale data pipelines.

Apache Seatunnel · CDC · data pipelines
34 min read
DataFunTalk
Jul 10, 2024 · Big Data

Apache SeaTunnel: A Next‑Generation Data Integration Platform for ETL/ELT and OLAP

This article introduces Apache SeaTunnel, a modern data integration platform designed for the EtLT era, detailing its architecture, core connector APIs, checkpoint mechanism, model inference, multi‑table synchronization, the high‑performance SeaTunnel Zeta engine, OLAP use cases, community roadmap, and the commercial WhaleTunnel product.

Apache Seatunnel · Big Data · ELT
22 min read
Inke Technology
Jun 28, 2023 · Big Data

Extending Apache Seatunnel for Trino and Kyuubi Integration: A Practical Guide

This article outlines the challenges of scaling data integration platforms, proposes a comprehensive solution built on Apache SeaTunnel and Dinky, details the implementation of Trino and Kyuubi JDBC support, and describes the platform's architecture, task publishing workflow, logging, monitoring, resource management, and planned enhancements.

Apache Seatunnel · Big Data · Kyuubi
16 min read
DataFunTalk
Jul 17, 2022 · Big Data

Redesigning Apache SeaTunnel: Decoupling Source and Sink APIs for Multi‑Engine Support

The presentation details the motivations, goals, and architectural redesign of Apache SeaTunnel (Incubating) to decouple its Source and Sink APIs from underlying engines, introducing unified APIs, version‑agnostic connectors, and enhanced support for Spark and Flink in both batch and streaming scenarios.

Apache Seatunnel · Big Data · Flink
12 min read
Big Data Technology Architecture
Jul 15, 2022 · Big Data

Using and Designing the Apache SeaTunnel Examples Module

This article introduces Apache SeaTunnel's Examples module, compares SeaTunnel with DataX, explains its multi‑engine design, demonstrates Flink and Spark example implementations, and shares the speaker's experiences contributing to the open‑source community, providing practical guidance for big‑data integration projects.

Apache Seatunnel · Big Data · Flink
10 min read
DataFunTalk
Mar 15, 2022 · Big Data

Bilibili's Billion‑Scale Data Synchronization Using Apache SeaTunnel

This article details Bilibili's implementation of a hundred‑terabyte‑per‑day data synchronization pipeline, covering tool selection between DataX‑based Rider and SeaTunnel‑based AlterEgo, architecture design, performance tuning, logging optimization, rate‑limiting strategies, and comprehensive monitoring for large‑scale offline data ingestion and export.

Apache Seatunnel · Big Data · ClickHouse
13 min read