Tagged articles
7 articles
Data Party THU
Apr 5, 2026 · Artificial Intelligence

How to Beat Shortcut Learning for Better OOD Generalization in Vision Models

Vision and vision-language models excel on IID benchmarks but often fail on out-of-distribution (OOD) data because of shortcut learning. This article examines the problem, explains its causes, and proposes data-level and model-level interventions, including StillMix, FLASH, and SPARCL, to improve OOD robustness.

AI research · Model Design · OOD generalization
7 min read
Data STUDIO
Mar 16, 2026 · Backend Development

11 Essential Pydantic v2 Practices to Avoid Common Pitfalls

This article explains why rigorous data validation is crucial and presents eleven practical Pydantic v2 techniques that make Python code more robust and maintainable: strong typing, boundary validation, separating validation from conversion, composing small models, using Annotated and RootModel, enforcing immutability, handling circular references, writing clear errors, keeping business logic out of models, and validating all external data.

Annotated · FastAPI · Model Design
12 min read
Big Data Tech Team
Nov 2, 2025 · Big Data

Data Governance Blueprint: Naming Rules, Lifecycle Levels, and Layered Architecture

This guide to data governance covers naming conventions, data lifecycle classifications, layered architecture standards, inter-layer calling rules, and model design principles, with practical standards and best practices for building robust, maintainable data warehouses and analytics platforms.

Data Lifecycle · Model Design · data-warehouse
9 min read
Alibaba Cloud Developer
Jun 26, 2025 · Artificial Intelligence

How to Build a Multi‑Dimensional Evaluation Framework for AI‑Powered Data Analysis Platforms

This article outlines the design of a scientific, quantifiable, multi‑dimensional evaluation system for the DataV‑Note intelligent analysis platform, addressing the lack of unified standards and accuracy challenges in AI‑driven data reporting, and proposes concrete metrics, model architecture, and future automation plans.

AI Evaluation · Metrics · Model Design
13 min read
Big Data Tech Team
Mar 17, 2025 · Big Data

How to Design and Review a Data Warehouse Model: A Complete Guide

This document outlines a complete process for designing and reviewing a data warehouse model. It covers revision records, project overview, business requirements, conceptual and logical modeling, ETL workflow, exception handling, and acceptance criteria, with practical examples and templates.

ETL · Model Design · data modeling
6 min read
Sohu Tech Products
Aug 4, 2021 · Artificial Intelligence

Technical Summary of the 2021 Sohu Campus Text Matching Algorithm Competition

This article summarizes the technical approach taken in the 2021 Sohu Campus Text Matching Algorithm Competition: data characteristics, preprocessing strategies, tokenization choices, positional encoding methods, model architectures that use relative position encodings (such as WoBERT and RoFormer), experimental results, and reflections on future improvements.

Model Design · NLP · competition
9 min read