DataFunSummit
Author

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.

1.6k articles · 0 likes · 4.7k views · 0 comments
Recent Articles

Latest from DataFunSummit

100 recent articles max
Feb 24, 2026 · Artificial Intelligence

How Large Language Models Are Redefining Search Ranking at Tencent

This article details Tencent Search's exploration of large‑model‑driven ranking, covering the evolution from traditional keyword retrieval to RAG‑based AI search, the multi‑stage AI ranking architecture (L0‑L5), model training pipelines, distillation, synthetic data generation, and future research directions.

LLM · RAG · ranking architecture
0 likes · 21 min read
Feb 8, 2026 · Big Data

Kuaishou’s Data Lake Upgrade with Hudi: Solving AI & BI Challenges

The article explains how Kuaishou modernized its data lake by partnering with Apache Hudi to address latency, storage cost, and consistency issues in both AI and BI pipelines, detailing architectural changes, new ingestion tools, partitioning strategies, compaction mechanisms, performance gains and future plans.

AI · Architecture · BI
0 likes · 20 min read
Feb 7, 2026 · Big Data

How Flink Enables Real‑Time AI Inference and Agent Construction

This article explains Apache Flink’s stream processing fundamentals, introduces the open‑source Flink Agents framework for building event‑driven AI agents, details Alibaba Cloud’s Flink AI Function for real‑time LLM inference, and showcases demos, architecture, integration patterns, and practical use cases such as VOC analysis, live‑stream analytics, and intelligent operations.

Apache Flink · Real-time Inference · Streaming
0 likes · 24 min read
Feb 1, 2026 · Artificial Intelligence

How AI Agents Are Redefining Data Engineering: Expert Insights and Real‑World Practices

In a deep‑dive roundtable, three data‑engineering veterans discuss the rise of AI agents, the importance of data context, memory mechanisms, workflow versus agent trade‑offs, and the future of database intelligence, offering practical strategies and architectural philosophies for building smarter data pipelines.

Context Engineering · Database Intelligence · Immersive Analytics
0 likes · 24 min read
Jan 29, 2026 · Big Data

How to Slash Web Scraping Costs by 60%: Proven Strategies from a Bright Data Expert

In the era of massive AI model training, this article presents a step‑by‑step technical guide—covering the full data‑collection pipeline, three acquisition modes, IP‑type choices, bandwidth savings, path and mixed‑request optimizations, and business‑level cost controls—to reduce web‑scraping expenses by more than 60% while maintaining data quality.

AI · Automation · data collection
0 likes · 24 min read
Jan 18, 2026 · Big Data

How Ray Reinvents AI Data Pipelines for Massive Multimodal Inference

This article examines the shortcomings of traditional big‑data engines for AI workloads and presents a Ray‑based heterogeneous fusion architecture that unifies CPU/GPU scheduling, Python ecosystems, and streaming‑batch processing. It also details improvements in fault tolerance, checkpointing, compute‑storage separation, resource utilization, scalability, and observability that let the system scale to thousands of nodes with dramatically higher GPU efficiency.

Distributed Computing · Ray · Resource Optimization
0 likes · 31 min read
Jan 17, 2026 · Artificial Intelligence

How UnrealZoo Accelerates Embodied AI Research with High‑Fidelity Simulation

This article outlines the evolution from traditional AI to embodied intelligence, explains the Vision‑Language‑Action (VLA) paradigm, highlights data‑collection bottlenecks, introduces the UnrealZoo simulation platform built on Unreal Engine, and showcases real‑world case studies and future challenges for embodied AI research.

Robotics · Unreal Engine · VLA
0 likes · 16 min read
Jan 11, 2026 · Artificial Intelligence

How Healthpeak Turned Manual Real Estate Ops into an AI‑Driven System with Palantir AIP

Healthpeak’s commercial‑real‑estate workflow, plagued by data silos and manual meter‑reading, was transformed by deploying Palantir’s AI Platform, which introduced an ontology‑based four‑layer architecture that automates billing, detects anomalies, and enables mobile‑first, AI‑driven decision making.

AI · Automation · Real Estate
0 likes · 17 min read
Jan 10, 2026 · Artificial Intelligence

How Agentic Workflows Transform International Logistics: A Deep Dive into the WOL‑APL‑EVAL Architecture

This article explores the challenges of international logistics and presents the WOL‑APL‑EVAL three‑layer architecture—workflow governance, adaptive planning, and continuous evaluation—demonstrating how AI agents, rule engines, and dynamic planning can automate customs clearance, reduce manual effort, and improve compliance and efficiency.

AI · Agentic AI · Architecture
0 likes · 28 min read
Jan 10, 2026 · Artificial Intelligence

How Healthpeak Turned Property Management into an AI‑Driven Operating System

This article examines how Healthpeak, a large healthcare REIT, replaced manual spreadsheet‑based processes with Palantir’s AI Platform (AIP), using an ontology‑driven architecture to automate billing, detect anomalies, and orchestrate workflows, delivering faster operations, higher accuracy, and scalable growth.

AI · Automation · Property Management
0 likes · 17 min read