Tagged articles
560 articles
Big Data Technology & Architecture
Apr 24, 2026 · Artificial Intelligence

A Deep Dive into Flink Agents: Architecture, Roadmap, and Upcoming Features

The article explains Flink Agents' current 0.3 preview, detailing its layered architecture—from Agent definition to execution plan and runtime operators—while outlining the roadmap for Skills integration, Mem0 long‑term memory, durable execution, and observability enhancements aimed at production readiness.

AI Agents · AgentPlan · Flink
7 min read
ITPUB
Apr 17, 2026 · Industry Insights

Why LinkedIn Dumped Kafka for Its Own ‘Northguard’ Streaming Engine

LinkedIn, the original home of Apache Kafka, abandoned the platform for a home‑grown system called Northguard, redesigning log storage, decentralizing metadata, and adding a virtualized Xinfra layer to handle trillions of daily events, while still acknowledging Kafka’s relevance for most companies.

Distributed Systems · Infrastructure · Kafka
7 min read
JD Tech
Apr 16, 2026 · Industry Insights

How JD Revolutionized Coupon Search with a Stream‑Batch Unified Architecture

This article analyzes JD's end‑to‑end upgrade of its retail coupon search infrastructure, detailing the business drivers, data‑skew challenges, the shift from dual KV and batch pipelines to a unified stream‑batch model built on Apache Doris, and the resulting performance, resource and stability gains across multiple scenarios.

Apache Doris · Batch Processing · Coupon Search
12 min read
Coder Circle
Apr 14, 2026 · Backend Development

Spring AI Hands‑On for Java Developers: Connecting ChatClient to the MiniMax LLM

This tutorial shows Java engineers how to set up a Spring Boot 4 project, configure Spring AI for the MiniMax large‑language model, call it via simple and streaming endpoints, use prompt templates with dynamic parameters, add metadata and advisors, and switch between different LLM providers with minimal code changes.

Java · LLM · MiniMax
8 min read
James' Growth Diary
Apr 9, 2026 · Artificial Intelligence

Building a Tool-Calling Agent from Scratch with LangChain.js

This tutorial walks through creating a fully functional Tool-Calling Agent using LangChain.js, covering tool definition, model binding, manual execution loops, the high‑level createReactAgent API, streaming responses, state management with thread IDs, common pitfalls, and a complete runnable example.

JavaScript · LangChain.js · OpenAI
20 min read
Su San Talks Tech
Apr 8, 2026 · Artificial Intelligence

Master Claude API: From Setup to Advanced RAG, Prompts, and Streaming

This comprehensive guide walks you through Claude Code model selection, API authentication, request construction, multi‑turn conversation handling, system prompts, temperature tuning, streaming responses, and clean JSON extraction, providing practical Python examples for building robust AI‑powered applications.

AI Development · Anthropic · Claude API
28 min read
Java One
Apr 8, 2026 · Artificial Intelligence

Master Claude API: From Model Selection to Streaming Responses

This guide walks you through Claude Code model choices, secure API key handling, Python SDK setup, request construction, multi‑turn conversation management, system prompts, temperature tuning, response streaming, and extracting clean structured data such as JSON, all with practical code examples and diagrams.

Claude API · Multi-turn Conversation · Prompt Engineering
31 min read
James' Growth Diary
Apr 6, 2026 · Artificial Intelligence

10 Practical LangChain Performance Hacks to Speed Up and Cut Costs

This article presents ten concrete techniques—including in‑memory and Redis caching, semantic caching, parallel execution, batch processing, prompt compression, model routing, streaming output, and connection‑pool reuse—to dramatically reduce latency and token costs in production LangChain applications.

LangChain · Node.js · Performance Optimization
14 min read
Weekly Large Model Application
Mar 22, 2026 · Artificial Intelligence

Inside MiMo-Audio: Dissecting the Large-Scale Audio Model

The article breaks down MiMo-Audio, a next‑token‑prediction‑style large‑scale audio model built on Qwen2, detailing its acoustic front‑end, RVQ tokenizer, patch‑based transformer architecture, streaming capabilities, performance advantages, engineering constraints, and recommended application scenarios.

Audio Modeling · Few-Shot · Qwen2
9 min read
Deepin Linux
Mar 17, 2026 · Backend Development

How to Build a High‑Performance gRPC File Transfer Service from Scratch

This step‑by‑step tutorial shows how to configure gRPC, define protobuf service contracts, and implement streaming upload and download in C++, covering environment setup, code generation, server and client logic, testing, and performance tuning for efficient file transfer.

C++ · Microservices · Protocol Buffers
44 min read
Code Mala Tang
Feb 22, 2026 · Backend Development

Why FastAPI Slows Down with Millions of Rows—and How to Keep It Fast

FastAPI feels lightning‑fast on small datasets, but returning millions of rows can exhaust memory, block the event loop, and cripple the database; this article explains why that happens and provides concrete design rules—selective fields, pagination, cursor‑based queries, streaming, and chunked processing—to keep APIs stable at scale.

FastAPI · Streaming · large datasets
9 min read
Code Mala Tang
Feb 21, 2026 · Backend Development

10 Common Pitfalls When Streaming JSON in Node.js and Safer Patterns

This guide enumerates ten frequent traps encountered when streaming JSON in Node.js—such as assuming one chunk per object, UTF‑8 split issues, missing newline delimiters, back‑pressure overload, and handling of large numbers—and presents reliable patterns like using NDJSON framing, StringDecoder, pipeline, and proper error handling to avoid data loss and memory spikes.

JSON · NDJSON · Node.js
13 min read
IT Services Circle
Feb 13, 2026 · Fundamentals

5 Proven Python Memory‑Optimization Patterns to Slash RAM Usage

Learn five practical Python techniques—streaming large files, using generator pipelines, leveraging __slots__, avoiding temporary objects in loops, and reusing buffers—that together can reduce memory consumption by up to 70% and dramatically improve performance when processing gigabyte‑scale datasets.

Generators · Memory Optimization · Python
9 min read
Code Wrench
Feb 11, 2026 · Backend Development

Why gRPC Is More Than an RPC Framework: It’s a New Connection Primitive

This article reveals that gRPC should be seen not merely as a high‑performance RPC framework but as a novel connection primitive that treats network links as living, stateful entities, reshaping how developers design, monitor, and reason about distributed systems.

Streaming · connection · gRPC
12 min read
DeWu Technology
Feb 9, 2026 · Big Data

How to Build a Production‑Ready Flink ClickHouse Sink with Dynamic Sharding, Batch‑by‑Size, and Robust Retry

This article presents a production‑grade Flink ClickHouse sink that solves common pain points such as lack of size‑based batching, static table schemas, and distributed‑table latency by introducing data‑size batching, dynamic table routing, local‑table writes, load‑balanced node discovery, back‑pressure queues, dual‑trigger flush, and recursive retry with node exclusion, all integrated with Flink checkpoint semantics for at‑least‑once guarantees.

Batching · Checkpoint · ClickHouse
25 min read
Data STUDIO
Feb 9, 2026 · Fundamentals

5 Python Memory‑Optimization Patterns That Cut Usage by 70%

The article walks through five concrete Python techniques—streaming file reads, generator expressions, __slots__, avoiding temporary objects in loops, and reusing buffers—showing code examples and measured memory reductions that together lowered overall RAM consumption by about 70%.

Generators · Memory Optimization · Profiling
9 min read
DataFunSummit
Feb 8, 2026 · Big Data

Kuaishou’s Data Lake Upgrade with Hudi: Solving AI & BI Challenges

The article explains how Kuaishou modernized its data lake by partnering with Apache Hudi to address latency, storage cost, and consistency issues in both AI and BI pipelines, detailing architectural changes, new ingestion tools, partitioning strategies, compaction mechanisms, performance gains and future plans.

AI · BI · Big Data
20 min read
DataFunSummit
Feb 7, 2026 · Big Data

How Flink Enables Real‑Time AI Inference and Agent Construction

This article explains Apache Flink’s stream processing fundamentals, introduces the open‑source Flink Agents framework for building event‑driven AI agents, details Alibaba Cloud’s Flink AI Function for real‑time LLM inference, and showcases demos, architecture, integration patterns, and practical use cases such as VOC analysis, live‑stream analytics, and intelligent operations.

Apache Flink · Big Data · Real-time inference
24 min read
Java Companion
Feb 1, 2026 · Databases

How I Completed a 100+ Table Migration in One Month Using Navicat Tricks

In under a month the author migrated more than 100 heterogeneous tables from MySQL and MongoDB to PostgreSQL across isolated networks by automating Navicat's import/export logic, streaming data, handling special characters, configuring flexible sync strategies, and using OSS as a bridge to avoid OOM and lock‑contention.

Data Migration · MongoDB · Navicat
20 min read
Baidu Tech Salon
Jan 14, 2026 · Cloud Native

How to Build a Cloud‑Native Streaming Compute PaaS on Kubernetes

This article examines the growing demand for real‑time data processing, outlines the high development, operational, and scalability challenges of traditional streaming systems, and presents a Kubernetes‑based cloud‑native PaaS solution that automates resource management, provides configuration‑driven development, and delivers observable, elastic, and service‑oriented streaming capabilities.

Kubernetes · PaaS · Streaming
25 min read
Java Companion
Dec 23, 2025 · Backend Development

Three Spring Async Streaming APIs to Eliminate Timeout Issues

This article explains how to handle long‑running Spring endpoints by using three asynchronous streaming tools—ResponseBodyEmitter, SseEmitter, and StreamingResponseBody—showing their appropriate scenarios, configuration details, and complete code examples that keep the servlet thread free and avoid timeout problems.

Async · ResponseBodyEmitter · SseEmitter
9 min read
Full-Stack Cultivation Path
Dec 11, 2025 · Frontend Development

Why TanStack AI SDK Is a Game‑Changer for Vue and React Developers

TanStack AI (Alpha) introduces a framework‑agnostic, type‑safe, and visual‑debuggable AI SDK that works across Vue, React, Solid, and vanilla JavaScript, supporting multiple model providers and languages, with step‑by‑step guides that let developers quickly build streaming chat, actions, and RAG applications.

AI SDK · DevTools · React
7 min read
Senior Tony
Dec 1, 2025 · Databases

How to Efficiently Import 100 Million Excel Rows into MySQL

This article explains how to import a hundred‑million‑row Excel dataset into MySQL by using CSV format, streaming parsers like EasyExcel, batch inserts, asynchronous processing, and partial‑transaction strategies to ensure feasibility, data integrity, and high performance.

Batch Insert · Excel · Streaming
5 min read
Big Data Technology & Architecture
Nov 28, 2025 · Big Data

What’s New in Apache Paimon 2025? Core Performance, AI Integration & Real‑Time Lakehouse Updates

The 2025 Apache Paimon release brings major performance boosts, AI‑centric multimodal storage, deeper streaming‑batch integration, and broader engine compatibility, detailing query and write optimizations, memory management tweaks, and a unified lake format for structured and unstructured data.

AI integration · Apache Paimon · Big Data
6 min read
Ray's Galactic Tech
Nov 18, 2025 · Big Data

From Zero to Mastery: A Complete Roadmap to Learn Apache Spark

This guide outlines a step‑by‑step learning path for Apache Spark, covering core concepts, environment setup, hands‑on WordCount code, API mastery, ecosystem extensions like Structured Streaming and MLlib, deployment options, performance tuning, and practical project advice.

Apache Spark · PySpark · Streaming
7 min read
Code Mala Tang
Nov 5, 2025 · Backend Development

How to Build a Production-Ready Async LLM API with FastAPI

Learn how to design and deploy a high‑performance, production‑grade LLM API using FastAPI, covering async routing, type‑safe Pydantic models, streaming via SSE/WebSockets, middleware, caching, rate limiting, observability, retries, and cost‑control strategies for robust AI services.

Async · FastAPI · LLM
12 min read
Big Data Technology & Architecture
Oct 30, 2025 · Backend Development

What’s New in Apache Kafka 4.1? Core Features and Architecture Changes Explained

Apache Kafka 4.1.0 introduces native queue semantics, a new Streams rebalancing protocol, multi‑version Connect plugins, a revamped consumer‑group protocol, enhanced transaction safety, and numerous client, monitoring, and security improvements, offering a comprehensive upgrade over the 4.0 release.

Kafka · Streaming · distributed-systems
6 min read
Architect Chen
Oct 22, 2025 · Big Data

How to Eliminate Kafka Message Backlog with Practical Optimizations

This guide presents concrete techniques for improving Kafka consumer and producer performance, scaling clusters, tuning broker settings, and designing asynchronous buffering layers to prevent message accumulation and boost overall throughput.

Big Data · Kafka · Performance Optimization
5 min read
Raymond Ops
Oct 21, 2025 · Big Data

Deep Dive into Kafka Architecture: Topics, Partitions, and Reliable Data Pipelines

This article explains Kafka’s core concepts—including topics, partitions, log segmentation, indexing, and acknowledgment mechanisms—then provides a step‑by‑step guide to deploy a Zookeeper‑Kafka cluster integrated with Filebeat, Logstash, and the ELK stack for reliable log collection and analysis.

Big Data · ELK · Filebeat
11 min read
360 Smart Cloud
Sep 26, 2025 · Artificial Intelligence

How to Turn OpenAPI Specs into AI Agent Tools with MCP: A Multi‑Language Guide

This article explains how the Model Context Protocol (MCP) bridges large language models and external services by converting OpenAPI specifications into callable tools, covering generation with openapi‑generator, mapping rules, three runtime modes (stdio, streamable, SSE), and implementation details in Java, Python, and Go.

AI Agents · Go · Java
23 min read
Architect's Must-Have
Sep 15, 2025 · Big Data

Mastering Spark Streaming Rate Control: A Deep Dive into Backpressure

This article explains Spark Streaming's rate control mechanisms, covering static limits, the dynamic back‑pressure feature introduced in Spark 1.5, the PID‑based estimator, RPC communication, and how Guava's token‑bucket RateLimiter enforces the calculated thresholds to ensure stability and optimal throughput.

RateControl · Spark · Streaming
13 min read
360 Zhihui Cloud Developer
Sep 11, 2025 · Big Data

How Paimon Transforms Membership Data Warehousing: From Legacy Lambda to Real‑Time Lakehouse

This article examines the challenges of a legacy Lambda‑based membership data warehouse, introduces Apache Paimon’s lakehouse architecture and its key features, and showcases three real‑world implementations—partial‑update order wide tables, Bitmap‑based UV counting, and branch‑based data correction—while discussing benefits, remaining challenges, and future directions.

Big Data · Data Lake · Data Warehouse
29 min read
Tencent Cloud Developer
Aug 13, 2025 · Backend Development

How to Build a Streaming Reconciliation Engine for Billions of Transactions per Hour

This article explains the fundamentals of financial reconciliation, outlines the two‑step reconciliation process, and details a streaming reconciliation architecture capable of handling 30 billion USD per hour (≈27 777 TPS) with high concurrency, distributed parsing, and efficient result aggregation.

Financial · High Throughput · Reconciliation
8 min read
58 Tech
Aug 7, 2025 · Big Data

Transform Real‑Time Data Warehousing with Paimon: From Flink ROW_NUMBER to Streaming Lakehouse

This article details how a real‑time data warehouse built on Flink, Kafka, HBase and MySQL was redesigned using Paimon to eliminate costly deduplication, handle out‑of‑order events, enable streaming reads, simplify aggregation, replace multiple lookup sources, and achieve faster, more reliable batch repairs, resulting in major resource and operational gains.

Data Warehouse · Flink · Lakehouse
24 min read
Open Source Tech Hub
Aug 2, 2025 · Backend Development

How to Build a PHP SDK for Coze API When Official Support Is Missing

This guide explains why the official Coze SDK lacks PHP support, shows how to install the community‑made pfinalclub/coze_sdk package via Composer, and demonstrates chat, bot management, and both basic and advanced streaming features with complete code examples and future improvement plans.

Backend · Composer · Coze API
6 min read
360 Zhihui Cloud Developer
Jul 22, 2025 · Big Data

How Apache SeaTunnel Revolutionizes Heterogeneous Data Integration with Decoupled Connectors

This article explores how Apache SeaTunnel addresses modern data integration challenges by providing a high‑performance, distributed, plugin‑based platform that decouples connectors from execution engines, enabling seamless batch and streaming synchronization across heterogeneous sources such as databases, message queues, and data lakes.

Apache SeaTunnel · Batch Processing · Connector Architecture
24 min read
DataFunSummit
Jul 12, 2025 · Big Data

How Fluss Unifies Stream and Lake to Power AI Data Pipelines

In the era of rapid AI growth, Fluss offers a unified lake‑stream architecture that tackles data quality, timeliness, scale, and multimodal challenges by tightly integrating Flink streaming with a high‑performance data lake, enabling seamless real‑time and batch analytics for AI workloads.

AI · Data Lake · Flink
12 min read
Big Data Technology & Architecture
Jul 8, 2025 · Big Data

Flink’s AI Agents and Disaggregated State: Transforming Big Data

The article reviews key topics from the FFA2025 Singapore conference, highlighting Flink’s new AI‑focused Agents framework, the breakthrough Flink 2.0 disaggregated state architecture, emerging lake storage solutions like Paimon, and the Fluss streaming table store, illustrating how big‑data platforms are evolving for AI workloads.

AI Agents · Big Data · Data Lake
6 min read
FunTester
Jul 5, 2025 · Big Data

Master Kafka: Core Concepts and Performance Testing Strategies

This article explains Kafka’s high‑performance distributed streaming architecture, key components such as topics, partitions, producers, consumers, brokers, offsets, and ZooKeeper, and provides step‑by‑step workflows for producers and consumers along with performance‑testing tips and Maven setup.

Big Data · Java · Kafka
9 min read
Big Data Tech Team
Jul 3, 2025 · Big Data

Master Kafka: A Complete Learning Roadmap from Basics to Advanced Projects

This guide presents a step‑by‑step Kafka learning roadmap covering core concepts, architecture, configuration, monitoring tools, practical project ideas, advanced components like Streams and KSQL, plus code samples and resource recommendations to help beginners become proficient in real‑time data streaming.

Code Examples · Kafka · Streaming
14 min read
FunTester
Jul 2, 2025 · Backend Development

Master Non‑Blocking & Streaming gRPC Clients in Java for High‑Performance Testing

Learn how to create and use non‑blocking (asynchronous) and streaming gRPC clients in Java, covering stub creation, request handling with ListenableFuture, blocking vs listener approaches, and stream observer implementation, enabling high‑concurrency performance testing and real‑time data processing scenarios.

Asynchronous · Java · Performance Testing
7 min read
Architecture & Thinking
Jun 11, 2025 · Artificial Intelligence

Accelerate LLM App Development with Eino: A Go Framework Walkthrough

Eino is an open‑source Golang framework for building large‑model applications, offering reusable components, robust orchestration, clean APIs, best‑practice templates, and full‑cycle DevOps tools, with code examples for both Ollama and OpenAI modes, plus streaming and normal output options.

AI Development · Framework · Go
10 min read
Linux Cloud Computing Practice
May 29, 2025 · Big Data

Why Learn Kafka? Core Benefits, Use Cases, and Key Interview Topics

This article explains why Kafka is essential for modern data engineering, highlighting its widespread adoption, high throughput, scalability, durability, integration with streaming ecosystems, and common real‑time use cases, while also providing a concise list of interview topics for aspiring engineers.

Real-time Processing · Streaming · data pipelines
6 min read
Code Mala Tang
May 21, 2025 · Backend Development

Master FastAPI Responses: JSON, HTML, Files, Streaming & Custom Classes

Learn how FastAPI handles various response types—including default JSON, custom status codes and headers, HTMLResponse, FileResponse, StreamingResponse, and user‑defined response classes—while covering best practices, common pitfalls, and code examples for building robust APIs.

API responses · FastAPI · Python
8 min read
Huolala Tech
May 14, 2025 · Big Data

How Lalamove Scaled Real‑Time Data Warehousing with Flink and Paimon

Lalamove’s international logistics platform transformed its real‑time data warehouse by leveraging Apache Flink and the Paimon lakehouse, addressing challenges of multi‑region data centers, time‑zone diversity, frequent upstream changes, and high costs, while improving scalability, latency, and operational efficiency across global markets.

Big Data · Flink · Paimon
13 min read
DataFunSummit
May 4, 2025 · Big Data

Iceberg Table Format Practice in Huawei Terminal Cloud

This article explains how Huawei's terminal cloud adopts the Apache Iceberg table format to efficiently manage large-scale datasets, detailing its architecture, feature engineering, merge operations, LSM-based storage, schema versioning, AB testing support, catalog enhancements, and future roadmap for full lifecycle data governance.

Big Data · Data Lake · Huawei Cloud
13 min read
Su San Talks Tech
May 4, 2025 · Backend Development

How to Export Millions of Excel Rows in Seconds: High‑Performance Java Strategies

Learn how to overcome memory and speed bottlenecks when exporting massive datasets to Excel by using streaming APIs like SXSSFWorkbook and EasyExcel, optimizing database pagination, tuning JVM and connection pools, and applying asynchronous shard processing to achieve stable sub‑200 MB memory usage for millions of rows.

Database pagination · Java · Streaming
10 min read
Java Architect Essentials
Apr 14, 2025 · Backend Development

Using Spring's ResponseBodyEmitter for Real-Time Log Streaming

This article introduces Spring Framework's ResponseBodyEmitter, explains its role in asynchronous HTTP responses, outlines typical use cases such as long polling and real‑time log streaming, provides a complete Spring Boot example with code, and compares it with SSE and raw OutputStream approaches.

Java · ResponseBodyEmitter · Streaming
11 min read
DataFunTalk
Apr 9, 2025 · Big Data

Highlights of the Apache Hudi Asia Technical Salon Hosted by Kuaishou – Practices and Innovations from Leading Companies

The Kuaishou‑hosted Apache Hudi Asia technical salon gathered over 230 attendees and featured seven experts from Kuaishou, Meituan, TikTok, Huawei, JD and others, who shared best practices, architecture designs, and performance optimizations for large‑scale data lake applications across AI, BI, and real‑time workloads.

AI · Apache Hudi · Batch Processing
14 min read
Architecture Digest
Apr 3, 2025 · Backend Development

Real‑Time Streaming with Spring’s ResponseBodyEmitter: Concepts, Use Cases, and Code Example

This article explains the purpose, core methods, and practical scenarios of Spring Framework’s ResponseBodyEmitter, compares it with SSE and raw streaming, and provides a complete Spring Boot controller example that demonstrates how to implement real‑time log streaming and other asynchronous HTTP responses.

Backend · Real-Time · ResponseBodyEmitter
9 min read
DataFunSummit
Apr 3, 2025 · Big Data

Apache Hudi Asia Technical Salon Highlights: Practices and Innovations from Kuaishou, Meituan, Douyin, Huawei, and JD

The Apache Hudi Asia technical salon held in Beijing on March 29 gathered over 230 on‑site participants and 16,000 online viewers, featuring expert talks from leading Chinese tech companies that showcased real‑world Hudi implementations, performance optimizations, and future roadmap for data‑lake technologies.

Apache Hudi · Big Data · Data Lake
13 min read
DataFunSummit
Apr 1, 2025 · Big Data

Understanding Flink CDC 3.3: Features, Improvements, and Future Plans

This article provides a comprehensive overview of Flink CDC 3.3, detailing its CDC fundamentals, new connectors, Transform module enhancements, asynchronous snapshot splitting, community adoption, and upcoming roadmap for broader ecosystem support and batch‑mode execution.

Big Data · CDC · Change Data Capture
15 min read
iQIYI Technical Product Team
Mar 27, 2025 · Big Data

Cost‑Effective Real‑Time Data Warehouse 2.0: Migrating from Kafka to Iceberg

iQIYI transformed its real‑time data warehouse by replacing a costly Kafka‑based Lambda stack with a unified stream‑batch Iceberg lake, cutting storage expenses by 90%, halving compute costs, extending data retention, and delivering minute‑level freshness for 90% of use cases while preserving second‑level processing where needed.

Cost Optimization · Flink · Iceberg
11 min read
Airbnb Technology Team
Mar 24, 2025 · Artificial Intelligence

Chronon: Open‑Source Feature Platform for Machine Learning – Architecture, Workflow, and Code Examples

Chronon is an open‑source ML feature platform that lets engineers declaratively define, compute, and serve both batch and real‑time features with built‑in observability, data‑quality checks, and a low‑latency retrieval API, ensuring online‑offline consistency while simplifying pipeline management and enabling future automation.

Chronon · Observability · Streaming
13 min read
Top Architecture Tech Stack
Mar 24, 2025 · Backend Development

Using Spring's ResponseBodyEmitter for Real‑Time Log Streaming

This article explains how Spring Framework's ResponseBodyEmitter enables real‑time, chunked HTTP responses for use cases such as log streaming, progress updates, chat, and AI output, detailing its advantages over SSE, usage scenarios, core methods, a complete controller example, and best‑practice considerations.

Backend · Java · ResponseBodyEmitter
0 likes · 9 min read
Big Data Technology & Architecture
Mar 24, 2025 · Big Data

Apache Kafka 4.0: Major New Features – KRaft Architecture, Consumer Group Protocol, Queue Mode, Java Upgrade, API Simplifications and More

Apache Kafka 4.0 runs exclusively in KRaft mode, removing ZooKeeper entirely, and adds a revamped consumer‑group protocol that speeds up rebalancing, a new queue mode for point‑to‑point messaging, upgraded Java requirements, streamlined APIs, and numerous performance and security enhancements, reshaping both development and operations for large‑scale streaming workloads.

Java 11 · KRaft · Kafka
0 likes · 12 min read
Laravel Tech Community
Mar 23, 2025 · Big Data

Apache Kafka 4.0 Released: First Version Without ZooKeeper and New Features

Apache Kafka 4.0 has been officially released as the first major version that runs entirely without Apache ZooKeeper, operating solely in KRaft mode and introducing a new consumer group protocol (KIP‑848), early‑access queue support (KIP‑932), updated Java requirements, and other enhancements aimed at improving scalability, operability, and messaging versatility.

Apache Kafka · KIP-848 · KIP-932
0 likes · 3 min read
Alimama Tech
Mar 12, 2025 · Big Data

Design and Evolution of Alibaba Advertising Real-Time Data Warehouse

Alimama’s advertising platform migrated from a monolithic Flink‑Kafka pipeline to a layered Paimon lakehouse, adding DWS upsert support and multi‑layer storage. The new design delivers minute‑level data freshness, cuts latency by 2.5 hours, reduces resource use by more than 40%, halves development effort, and achieves ≥99.9% availability.

Advertising · Alibaba · Data Lake
0 likes · 18 min read
Radish, Keep Going!
Mar 12, 2025 · Backend Development

Why gRPC Beats REST: Performance, Contracts, and Streaming Explained

gRPC offers superior performance, efficient Protobuf encoding, strong API contracts, seamless streaming, cross‑language support, and HTTP/2 advantages, making it a compelling alternative to REST for modern web services, while tools like gRPC‑Gateway, ConnectRPC, and Buf streamline migration and ecosystem integration.

API contracts · Streaming · gRPC
0 likes · 10 min read
Java Architect Essentials
Mar 7, 2025 · Artificial Intelligence

Introducing DeepSeek4j 1.4: A Java Spring Boot Integration for DeepSeek AI with Chain‑of‑Thought and Streaming Support

The article introduces DeepSeek4j 1.4, a Java Spring Boot library that overcomes existing framework limitations by preserving DeepSeek's chain‑of‑thought capabilities, adding full reactive streaming, and providing a simple one‑line API along with quick‑start instructions and code examples.

AI integration · DeepSeek · Java
0 likes · 5 min read
Code Ape Tech Column
Mar 7, 2025 · Backend Development

Using Spring's ResponseBodyEmitter for Real‑Time Streaming Responses

This article introduces Spring Framework's ResponseBodyEmitter, explains its advantages over SSE for asynchronous HTTP responses, demonstrates practical usage with a real‑time log‑streaming example, outlines core methods, working principles, best‑practice considerations, and compares it with traditional Streaming and Server‑Sent Events.

Backend · HTTP · Java
0 likes · 10 min read
Alibaba Cloud Infrastructure
Mar 6, 2025 · Big Data

Leveraging Apache Iceberg and AutoMQ for Real-Time Data Lake Ingestion: Architecture, Best Practices, and Cost Optimization

This article examines how Apache Iceberg’s snapshot‑based ACID transactions, logical‑physical partition evolution, and COW/MOR update modes enable efficient real‑time data lake ingestion, and demonstrates AutoMQ’s Kafka‑to‑Iceberg Table Topic solution that simplifies schema management, reduces latency, and cuts operational costs.

Apache Iceberg · AutoMQ · Big Data
0 likes · 14 min read
Python Programming Learning Circle
Feb 28, 2025 · Fundamentals

Techniques for Efficient Large File Processing in Python

Processing large files efficiently in Python requires techniques such as line-by-line iteration, chunked reads, generators, buffered I/O, and streaming, which help avoid memory errors, improve speed, and optimize resources for tasks like log analysis, data scraping, and real-time API handling.

Streaming · file I/O · large files
0 likes · 5 min read
Cognitive Technology Team
Feb 28, 2025 · Artificial Intelligence

Design and High‑Availability Architecture of Alibaba LangEngine AI Application Framework

This article introduces Alibaba's LangEngine, a pure Java AI application framework, detailing its high‑availability gateway architecture, communication protocols, streaming and non‑streaming output, multi‑level metadata caching, asynchronous and serverless designs, and future open‑source roadmap, offering practical guidance for building robust AI services.

AI Framework · LLM · LangEngine
0 likes · 11 min read
Top Architect
Feb 21, 2025 · Artificial Intelligence

DeepSeek4j 1.4: Java Integration Framework for DeepSeek with Full Chain‑of‑Thought and Streaming Support

The article introduces DeepSeek4j 1.4, a Java‑based framework that overcomes Spring AI’s limitations by fully preserving DeepSeek’s chain‑of‑thought and billing features, adding reactive streaming, providing Spring Boot starter integration, and offering quick‑start code samples and configuration guidance.

AI · DeepSeek · Java
0 likes · 8 min read
Java Architecture Diary
Feb 10, 2025 · Artificial Intelligence

deepseek4j 1.3: Java SDK adds web search, streaming & multi‑channel AI

deepseek4j 1.3 introduces web‑search capability, streaming responses, system prompts, expanded multi‑platform support, enhanced SSE debugging, and upcoming features like API‑key rotation and resilience, enabling Java developers to integrate DeepSeek models effortlessly while focusing on business logic.

AI · DeepSeek · SDK
0 likes · 8 min read
macrozheng
Jan 24, 2025 · Backend Development

Boost Java Excel Performance with FastExcel: Features, Usage, and Comparison

This article introduces FastExcel, an upgraded Java library for high‑performance Excel read/write, outlines its key features, provides step‑by‑step code examples for entity creation, event listeners, writing, reading, PDF conversion, compares it with EasyExcel, and concludes with its suitability for large‑scale data processing.

Excel · FastExcel · PDF
0 likes · 8 min read
Alibaba Cloud Big Data AI Platform
Jan 14, 2025 · Big Data

How Fluss Unifies Lake and Stream for Real‑Time Analytics: Architecture, Benefits, and Future Roadmap

This article summarizes a talk by Alibaba Cloud senior engineer and Flink Committer Luo Yuxia on the challenges of separating lake and stream storage, introduces the Fluss lake‑stream unified architecture, explains its technical benefits such as second‑level data freshness, unified metadata, efficient changelog generation, and outlines future plans for broader ecosystem integration.

Data Lake · Flink · Fluss
0 likes · 13 min read
Ctrip Technology
Jan 3, 2025 · Big Data

Design and Implementation of a Kafka Gatekeeper for FinOps Billing Data Quality Governance

This article describes the challenges of data quality in Ctrip’s hybrid‑cloud FinOps billing system and presents the design, implementation, and high‑availability deployment of a custom Kafka Gatekeeper proxy that performs pre‑validation, configurable rules, self‑service dashboards, and automated alerts to improve coverage, timeliness, and responsibility attribution.

Big Data · Cloud Native · Data Quality
0 likes · 17 min read
Big Data Technology & Architecture
Jan 2, 2025 · Big Data

Apache Paimon: Core Capabilities, Table Types, LSM Tree, Buckets, Merge Engines, and Operational Details

This article provides a comprehensive overview of Apache Paimon, covering its real‑time lake ingestion, unified stream‑batch processing, table types (primary‑key and append‑only), LSM‑tree storage, bucket mechanisms, merge‑engine options, compaction strategies, concurrency control, consumption methods, tag management, data cleanup, and system tables for big‑data workloads.

Apache Paimon · Big Data · Flink
0 likes · 25 min read
Zhihu Tech Column
Dec 31, 2024 · Cloud Native

Cloud Native Innovation Forum: AutoMQ Table Topic, OceanBase Integrated Database, and Observability Practices

The article recaps Zhihu's Cloud Native Innovation Forum where experts from AutoMQ, OceanBase, and Flashcat shared practical solutions on streaming data ingestion, unified database architectures, and AI‑driven observability, highlighting real‑world deployments, performance optimizations, and cost‑saving strategies.

AI · AutoMQ · Cloud Native
0 likes · 10 min read
Architect
Dec 15, 2024 · Databases

Efficient MySQL Queries for Millions of Rows: Regular, Stream, and Cursor

When processing massive MySQL result sets, loading all rows into JVM memory can cause OOM errors and slow performance. This guide compares three approaches: regular pagination, streaming queries that return rows one at a time, and cursor queries that fetch rows in fetchSize batches. It details their implementations, MyBatis configurations, and trade‑offs.

Cursor · Database Query · Large Data
0 likes · 10 min read
MaGe Linux Operations
Dec 14, 2024 · Big Data

Master Kafka: From Core Concepts to Real-World Deployment

This comprehensive guide explains Kafka’s architecture, core APIs, topics and partitions, deployment steps, multi‑broker clustering, and practical use cases such as messaging, log aggregation, stream processing, and data import/export with Kafka Connect, providing a hands‑on tutorial for developers and engineers.

Distributed Systems · Installation · Kafka
0 likes · 30 min read