Tagged articles
73 articles
Su San Talks Tech
May 15, 2026 · Artificial Intelligence

Understanding Rerank in Retrieval‑Augmented Generation (RAG)

The article explains why a reranking step is essential in RAG pipelines, describes how it refines the initial vector‑search results, compares mainstream rerank techniques, discusses practical engineering choices such as candidate set size and model selection, and outlines how to evaluate and tune rerank performance.

Cross-Encoder · Evaluation Metrics · LLM
0 likes · 15 min read
Machine Heart
May 10, 2026 · Artificial Intelligence

The First Industry Survey of Vision World Models: Toward a Higher‑Intelligence Visual Paradigm

This survey introduces vision world models as a central driver for AI to learn physical and causal dynamics directly from visual data, presents a unified "representation‑learning‑simulation" framework, categorises four major technical routes, outlines evaluation metrics and datasets, and proposes a 3R roadmap for the next generation of world models.

Evaluation Metrics · Future Directions · Generative Modeling
0 likes · 15 min read
Machine Heart
Apr 28, 2026 · Artificial Intelligence

LangFlow Demonstrates Continuous Diffusion Matching Discrete Models via Better Training

LangFlow revisits continuous diffusion for language modeling, showing that earlier performance gaps were due to suboptimal training and evaluation, and through embedding‑space diffusion, a log‑NSR noise schedule, and a Gumbel‑based information schedule it matches or exceeds discrete diffusion and autoregressive baselines on standard and zero‑shot benchmarks.

Evaluation Metrics · Gumbel distribution · Langflow
0 likes · 16 min read
Architect's Tech Stack
Apr 27, 2026 · Artificial Intelligence

Can Your RAG System Pass the Demo and Remain Accurate Across 5,000 Documents?

The article dissects a tough interview question about building a production‑grade Retrieval‑Augmented Generation (RAG) system that not only works in a demo but also delivers stable, correct answers over a knowledge base of 5,000 documents, covering chunking, hybrid retrieval, intent routing, constrained generation, evaluation metrics, and operational safeguards.

Evaluation Metrics · Hybrid Retrieval · Intent Routing
0 likes · 15 min read
Woodpecker Software Testing
Apr 25, 2026 · Artificial Intelligence

5 Common Pitfalls in Prompt Testing and Practical Ways to Fix Them

The article analyzes five frequent mistakes teams make when testing LLM prompts—confusing pass with robustness, ignoring implicit assumptions, relying on subjective judgments, lacking version‑aware CI/CD, and missing a human‑AI feedback loop—while offering concrete, data‑backed remedies.

AI quality assurance · Evaluation Metrics · LLM testing
0 likes · 8 min read
SuanNi
Apr 21, 2026 · Artificial Intelligence

Why AI Video Generation Is Leaving the Silent Era: Architecture, Alignment, and Evaluation Insights

This article analyzes the rapid evolution of multimodal video generation models from separated visual‑audio pipelines to unified diffusion Transformers, detailing VAE compression, MoE scaling, cross‑modal alignment techniques, comprehensive evaluation metrics, real‑world applications, and the remaining technical challenges.

Evaluation Metrics · Multimodal AI · Video Generation
0 likes · 15 min read
Wu Shixiong's Large Model Academy
Apr 9, 2026 · Artificial Intelligence

How to Jump‑Start a RAG System Without Any Labeled Data

Building a Retrieval‑Augmented Generation (RAG) system from scratch without existing QA pairs requires a systematic cold‑start approach that creates synthetic QA data, establishes baseline metrics, iteratively improves via expert labeling and real user feedback, and ensures document quality for reliable evaluation.

Evaluation Metrics · LLM · RAG
0 likes · 17 min read
AgentGuide
Apr 6, 2026 · Artificial Intelligence

How to Optimize RAG System Performance: From Evaluation Metrics to Tuning Strategies

The article explains how to improve Retrieval‑Augmented Generation (RAG) systems by interpreting three key metrics—context recall, context precision, and answer correctness—and provides concrete step‑by‑step actions such as checking the knowledge base, upgrading embedding models, rewriting queries, adding a rerank model, and refining prompts and generation parameters.

Evaluation Metrics · RAG · Rerank
0 likes · 7 min read
PaperAgent
Mar 6, 2026 · Artificial Intelligence

Unlocking AI Memory: A Comprehensive Survey of Theory, Architectures, and Future Trends

This extensive survey presents a panoramic view of AI memory, introducing a novel 4W classification, detailing single‑agent and multi‑agent memory architectures, outlining evaluation metrics, showcasing real‑world applications, and highlighting open challenges and emerging research directions.

4W Taxonomy · AI memory · Evaluation Metrics
0 likes · 12 min read
PMTalk Product Manager Community
Mar 3, 2026 · Product Management

Why Data Thinking Is the Key to Evaluating AI Agents for Product Managers

Product managers transitioning to AI must shift from feature‑centric thinking to a data‑driven mindset, treating models as probabilistic systems, defining ground truth, analyzing bad cases, and building multi‑dimensional evaluation metrics such as safety, consistency, and usefulness to ensure reliable, user‑focused AI outputs.

AI product management · Evaluation Metrics · bad case analysis
0 likes · 9 min read
HyperAI Super Neural
Feb 14, 2026 · Artificial Intelligence

Beyond Visual Realism: WorldArena Benchmark Reveals the Capability Gap in Embodied World Models

WorldArena introduces a unified benchmark that evaluates generated videos not only for visual fidelity but also for embodied task functionality across six dimensions, exposing a stark gap between visual realism and practical usefulness and providing a composite EWMScore to compare models.

Benchmark · Embodied AI · Evaluation Metrics
0 likes · 9 min read
PMTalk Product Manager Community
Dec 8, 2025 · Product Management

How AI Product Managers Build Technical Insight to Shift from Execution to Strategy

The article outlines how AI product managers can develop technical insight—shifting from a purely execution role to a strategic one—through four steps of mindset change, capability building, hands‑on practice, and strategic scaling, using concrete frameworks, data‑driven decision tools, and real‑world case studies.

AI product management · Evaluation Metrics · POC methodology
0 likes · 23 min read
Sohu Tech Products
Oct 29, 2025 · Information Security

Why a New Multimodal AI Security Dataset Is Essential for Detecting Deepfakes

As multimodal AI models become capable of generating realistic images, videos, and audio, the OpenMMSec benchmark provides a comprehensive, open‑source dataset and evaluation metrics that help researchers and developers detect and localize AI‑generated forgeries across all three modalities, addressing emerging security challenges.

AI security · Evaluation Metrics · OpenMMSec
0 likes · 18 min read
Data Party THU
Sep 5, 2025 · Artificial Intelligence

What a PRISMA Review Uncovers About Retrieval‑Augmented Generation (RAG)

This systematic PRISMA review analyzes 128 highly‑cited RAG papers, covering five major databases, 343 datasets, a detailed technical roadmap, evaluation metrics from EM to LLM‑as‑Judge, and future research directions, showing that RAG has evolved into a complex, programmable, and auditable distributed system.

AI · Datasets · Evaluation Metrics
0 likes · 5 min read
Didi Tech
Jul 17, 2025 · Artificial Intelligence

How RAS‑AUCC Eliminates Offline‑Online Gaps in Multi‑Treatment Uplift Modeling

This article explains the challenges of evaluating uplift models for intelligent marketing with multiple discount treatments, reviews existing metrics such as AUUC, Qini, and AUCC, and introduces the RAS‑AUCC metric that aligns offline evaluation with online ROI by sorting samples by marginal ROI and using RCT data.

Evaluation Metrics · Marketing Optimization · Uplift Modeling
0 likes · 13 min read
dbaplus Community
May 31, 2025 · Artificial Intelligence

How RAG is Shaping the Future of AI-Powered User Experience

Amid the rapid rise of large language models, this article examines RAG’s development, technical hurdles, core strategies, and future outlook, illustrating how Alibaba’s Chatbot and Copilot projects boost retrieval accuracy to 90% and generation precision to 85% while tackling data quality, heterogeneous retrieval, and evaluation challenges.

AI search · Evaluation Metrics · RAG
0 likes · 27 min read
Instant Consumer Technology Team
May 16, 2025 · Artificial Intelligence

Smart AI‑Powered Push Copy: From Templates to Sampling Strategies

This article explores how high‑quality content assets—text, images, and video—drive SEO and user engagement, then delves into the challenges of crafting push‑notification copy and presents an intelligent copy system that uses template and keyword generation, transformer models, BLEU and semantic similarity evaluation, and various sampling strategies to improve relevance and diversity.

AI · Evaluation Metrics · NLP
0 likes · 30 min read
Model Perspective
Apr 3, 2025 · Artificial Intelligence

Turning Metrics into Music: A Sensitivity & Specificity Song Explained

This article showcases an AI‑generated song that teaches the four core classification metrics—sensitivity, specificity, precision, and recall—by presenting lyrical explanations, a confusion‑matrix overview, Python code for MIDI creation, and a step‑by‑step guide to producing the final video.

AI music · Evaluation Metrics · MIDI
0 likes · 8 min read
Alibaba Cloud Developer
Mar 6, 2025 · Artificial Intelligence

From Linear Regression to Transformers: Mastering Machine Learning Foundations

This comprehensive guide walks readers through the evolution of machine learning, starting with basic linear models and feature engineering, progressing through logistic regression, decision trees, and deep learning architectures like MLPs, CNNs, RNNs, and transformers, and demonstrates practical implementations with code examples and evaluation metrics.

Deep Learning · Evaluation Metrics · Recommendation Systems
0 likes · 64 min read
Bilibili Tech
Jan 14, 2025 · Artificial Intelligence

Technical Practices and Productization of Intelligent Advertising Title Generation for Bilibili

We built an LLM‑powered system for Bilibili that automatically creates ad titles from user keywords, employing fluency, style, and quality classifiers, mixed domain data cleaning, and alignment methods such as SFT, DPO and KTO, resulting in a product that now generates about ten percent of daily titles and drives significant ad spend.

AI Alignment · Ad Title Generation · Bilibili
0 likes · 24 min read
Bilibili Tech
Oct 8, 2024 · Artificial Intelligence

ICDAR 2024 Historical Map Text Recognition Competition: DNTextSpotter Methodology and Results

The ICDAR 2024 Historical Map Text Recognition competition was won by Bilibili’s DNTextSpotter, a Transformer‑based model built on DeepSolo and ViTAE‑v2 that uses deformable self‑attention, dual‑query decoding and denoising training, combined with mixed‑vocabulary fine‑tuning, advanced loss functions and strict PDQ/PWQ/PCQ metrics to achieve state‑of‑the‑art dense, rotated, arbitrary‑shaped text detection and recognition on historical maps and real‑world multimedia.

DNTextSpotter · Deep Learning · Evaluation Metrics
0 likes · 17 min read
Xiaohongshu Tech REDtech
Sep 23, 2024 · Artificial Intelligence

AlignRec: A Joint Training Framework for Aligning Multimodal Representations with Personalized Recommendation

AlignRec is a joint‑training framework that synchronizes multimodal encoders with personalized recommendation models through a staged alignment strategy and three specialized loss functions, preserving both content and ID signals, and achieving state‑of‑the‑art performance on multiple datasets while releasing superior Amazon multimodal features.

AI · Evaluation Metrics · joint training
0 likes · 11 min read
NewBeeNLP
Jun 14, 2024 · Artificial Intelligence

Why Coarse Ranking Matters: Goals, Metrics, and Model Design in Search Systems

The article explains the purpose of coarse ranking in industrial search pipelines, outlines key evaluation metrics, discusses sample construction and model architecture choices, and highlights trade‑offs between consistency with downstream ranking and overall system performance.

Evaluation Metrics · coarse ranking · search ranking
0 likes · 11 min read
DataFunSummit
Jun 5, 2024 · Fundamentals

User Portrait Tagging: Construction, Feature Processing, and Evaluation

This article provides a comprehensive guide on building user portrait tags—from basic attribute tags to business and strategy tags—detailing data collection methods, feature engineering techniques such as cleaning, time decay, and smoothing, and evaluation metrics for cohesion and stability, aimed at data product managers and analysts.

Evaluation Metrics · feature engineering · product-management
0 likes · 12 min read
AI Large Model Application Practice
Apr 10, 2024 · Artificial Intelligence

What Is Self‑RAG? A Simple Guide to Self‑Reflective Retrieval‑Augmented Generation

This article explains the motivation behind Self‑RAG, describes its core workflow—including conditional retrieval, enhanced generation, and self‑evaluation tokens—details the four evaluation metrics (Retrieve, IsRel, IsSup, IsUse), and provides a Python scoring example using log‑probabilities.

Evaluation Metrics · LLM · Logprobs
0 likes · 13 min read
Baobao Algorithm Notes
Mar 18, 2024 · Industry Insights

Inside the 2024 KDD Cup ShopBench Challenge: Tasks, Data, and Evaluation Metrics

The 2024 KDD Cup introduces the ShopBench benchmark, a large‑scale LLM competition that simulates real‑world online shopping with 57 tasks, over 20,000 questions, and multiple tracks covering concept understanding, knowledge reasoning, user‑behavior alignment, multilingual ability, and an all‑round track, all evaluated with task‑specific metrics and a hidden test set.

Benchmark · Dataset · Evaluation Metrics
0 likes · 11 min read
DaTaobao Tech
Feb 28, 2024 · Artificial Intelligence

A Survey of Image Quality Evaluation Metrics for Text-to-Image Generation

The survey traces the evolution of image‑quality evaluation for text‑to‑image generation—from early handcrafted edge and color cues, through GAN‑era similarity scores such as IS, FID and KID, to modern perceptual and CLIP‑based metrics like LPIPS, CLIPScore, TRIQ, IQT and human‑preference models—highlighting a shift toward semantic, aesthetic, and text‑image alignment measures and forecasting domain‑specific metrics for future diffusion models.

Evaluation Metrics · GAN · Generative Models
0 likes · 18 min read
DataFunTalk
Feb 5, 2024 · Fundamentals

User Portrait Tagging: Construction, Feature Processing, and Evaluation

This article explains how to build user portrait tags—from basic attribute tags to business and strategy tags—covers methods for data collection, anomaly handling, time decay, smoothing, and evaluates tag quality using cohesion, stability, and AUC-related metrics to support data‑driven product decisions.

Data Science · Evaluation Metrics · feature engineering
0 likes · 12 min read
DataFunSummit
Dec 28, 2023 · Artificial Intelligence

Problem Analysis and User Value Estimation in Advertising Scenarios

This article analyzes challenges in advertising placement, introduces user value modeling practices such as CLTV estimation, discusses data sparsity, multi‑distribution issues, evaluation metrics, and presents future work on budget allocation and iterative model improvement for growth optimization.

CLTV · Evaluation Metrics · User Value Modeling
0 likes · 11 min read
DeWu Technology
Dec 20, 2023 · Artificial Intelligence

Coarse Ranking in Recommenders: Key Strategies, Metrics & Optimizations

This article systematically reviews the coarse‑ranking stage of recommendation systems, comparing it with recall and fine‑ranking, defining evaluation metrics, detailing sample design, presenting two technical routes, and exploring optimization directions such as dual‑tower models, knowledge distillation, lightweight fully‑connected layers, multi‑objective and multi‑scenario modeling, followed by practical case studies and results.

Evaluation Metrics · coarse ranking · dual-tower
0 likes · 22 min read
ZhongAn Tech Team
Sep 4, 2023 · Artificial Intelligence

Embedding Technology for FAQ Retrieval: Cases, Evaluation Metrics, and Model Comparison

This article introduces the evolution of embedding techniques, presents real‑world case studies of embedding‑based FAQ retrieval, explains evaluation metrics such as Recall and MRR, and compares the performance of a proprietary ZhongAn embedding model with OpenAI and Sentence‑BERT models on Chinese FAQ datasets.

Embedding · Evaluation Metrics · FAQ Retrieval
0 likes · 18 min read
TAL Education Technology
Aug 31, 2023 · Artificial Intelligence

Research on Content-Based Image Retrieval Techniques

This article reviews the fundamentals, feature extraction methods, evaluation metrics, and common datasets of content‑based image retrieval (CBIR), discussing traditional low‑level features, local descriptors, unsupervised and supervised learning approaches, and recent deep‑learning models for improving retrieval performance.

CBIR · Datasets · Deep Learning
0 likes · 13 min read
Model Perspective
Aug 26, 2023 · Artificial Intelligence

Why Accuracy Isn’t Enough: Mastering MCC for Imbalanced Classification

This article reviews common classification evaluation metrics—accuracy, precision, recall, and F1—explains their limitations on imbalanced data, and introduces the Matthews Correlation Coefficient (MCC) with Python implementations to provide a more reliable performance measure.

Evaluation Metrics · MCC · Python
0 likes · 5 min read
DataFunTalk
May 8, 2023 · Artificial Intelligence

Comprehensive Overview of Modern Recommendation System Technologies

This article presents a detailed survey of recent advances in recommendation system technology, covering system architecture, user understanding layers, various recall methods, ranking techniques, auxiliary algorithms such as cold-start and bias modeling, and evaluation metrics, with references to industry practices and academic research.

AI · Evaluation Metrics · Recommendation Systems
0 likes · 13 min read
DataFunSummit
Feb 19, 2023 · Artificial Intelligence

Intelligent Writing Assistant: TexSmart and Effidit Systems, Multi‑Level Unsupervised Text Rewriting, and the New ParaScore Evaluation Metric

This article presents Tencent AI Lab's intelligent writing assistant, detailing the TexSmart text‑understanding platform, the Effidit writing‑assistant features, a multi‑level controllable unsupervised text‑rewriting method, and a novel ParaScore metric that jointly measures semantic similarity and diversity for paraphrase evaluation.

AI writing · Evaluation Metrics · NLP
0 likes · 14 min read
Meituan Technology Team
Feb 16, 2023 · Artificial Intelligence

Interactive Recommendation System for Food Delivery Feed

This article details Meituan Waimai's end‑to‑end interactive recommendation system for the food‑delivery homepage feed, explaining its architecture, trigger strategies, recall and ranking pipelines, evaluation metrics, experimental results, and future optimization directions.

Evaluation Metrics · Meituan · food delivery
0 likes · 24 min read
DaTaobao Tech
Feb 13, 2023 · Artificial Intelligence

Why Recommendation Systems Matter: From Basics to Advanced Strategies

This article explains what recommendation systems are, their core tasks, evaluation metrics, popular algorithms such as collaborative filtering and latent factor models, how to handle cold‑start and contextual challenges, the role of social networks, and typical system architecture, providing a comprehensive overview for beginners and practitioners.

Evaluation Metrics · Recommendation Systems · cold start
0 likes · 21 min read
DataFunSummit
Jan 25, 2023 · Artificial Intelligence

Expert Insights on Recommendation System Architecture, Data, Features, Recall, Ranking and Evaluation

This interview compiles expert opinions on the end‑to‑end recommendation system pipeline—including architecture, data collection, user profiling, content structuring, feature engineering, recall strategies, ranking algorithms, multi‑objective optimization, multi‑modal fusion, re‑ranking, cold‑start solutions, evaluation metrics and real‑world applications—highlighting the technical challenges and practical solutions.

Evaluation Metrics · cold start · feature engineering
0 likes · 15 min read
DataFunTalk
Jan 21, 2023 · Artificial Intelligence

Challenges and Best Practices in Recommendation Systems – Expert Interview

This interview with three recommendation‑system experts explores the technical architecture, data sources, feature engineering, recall and ranking strategies, evaluation metrics, cold‑start solutions, and practical difficulties, offering actionable insights to avoid common pitfalls in real‑world recommender deployments.

Evaluation Metrics · Recommendation Systems · cold start
0 likes · 15 min read
DataFunTalk
Jan 18, 2023 · Artificial Intelligence

Search Relevance System Architecture and Practices in QQ Browser

This article presents the QQ Browser search relevance team's experience integrating QQ Browser and Sogou search systems, detailing business overview, relevance system evolution, algorithm architecture, evaluation metrics, deep semantic matching, relevance calibration, and model distillation techniques to improve search relevance performance.

Evaluation Metrics · information retrieval · model distillation
0 likes · 31 min read
Model Perspective
Jan 15, 2023 · Artificial Intelligence

Mastering Model Evaluation: Key Metrics, Validation Techniques, and Diagnostics

This guide explains essential evaluation metrics for classification and regression models—including confusion matrix, ROC/AUC, R², and main performance indicators—covers model selection strategies such as train‑validation‑test splits, k‑fold cross‑validation, and regularization techniques, and discusses bias‑variance trade‑offs and diagnostic tools.

Evaluation Metrics · Model Selection · Regularization
0 likes · 6 min read
Tencent Cloud Developer
Jan 9, 2023 · Artificial Intelligence

Search Relevance Architecture and Practices in QQ Browser

The QQ Browser search relevance team describes a unified, billion‑scale architecture that combines a main and vertical subsystem, a pyramid‑shaped ranking pipeline (recall, coarse, fine), a dedicated GPU‑accelerated relevance service, and hybrid semantic‑matching models (dual‑tower, BERT, matrix fusion) evaluated with offline and online metrics to deliver accurate, fresh, and authoritative results for diverse content and long‑tail queries.

Deep Learning · Evaluation Metrics · System Architecture
0 likes · 28 min read
Model Perspective
Aug 7, 2022 · Artificial Intelligence

Mastering Core ML Evaluation Metrics: From Bias‑Variance to ROC Curves

This article explains essential machine‑learning evaluation concepts—including the bias‑variance trade‑off, Gini impurity versus entropy, precision‑recall curves, ROC and AUC, the elbow method for K‑means, PCA scree plots, linear and logistic regression, SVM geometry, normal‑distribution rules, and Student’s t‑distribution—providing clear visual illustrations for each.

Evaluation Metrics · PCA · ROC
0 likes · 7 min read
DataFunTalk
Jul 8, 2022 · Artificial Intelligence

Civil Aviation QA Competition (CCL2022‑DQAB): Task Description, Data, Evaluation Metrics, and Prizes

The CCL2022‑DQAB competition, organized by Beihang University and AVIC Mobile Technology, invites participants to develop reading‑comprehension models for extracting accurate question‑answer pairs from civil aviation texts, offering detailed task definitions, evaluation criteria, dataset statistics, a prize structure, and a competition schedule.

AI · Civil Aviation · Dataset
0 likes · 5 min read
DataFunSummit
Jul 7, 2022 · Artificial Intelligence

Discovering and Enhancing Robustness in Low‑Resource Information Extraction

This article examines the robustness challenges of information extraction tasks such as NER and relation extraction, introduces the Entity Coverage Ratio metric, analyzes why pretrained models like BERT may “take shortcuts,” and proposes evaluation tools and training strategies—including mutual‑information‑based methods, negative‑training, and flooding—to improve model robustness across diverse scenarios.

BERT · Evaluation Metrics · Robustness
0 likes · 12 min read
Baidu Geek Talk
Jun 15, 2022 · Artificial Intelligence

CCL2022 Video Highlight Extraction Challenge Overview

The article describes the CCL2022 Video Highlight Extraction Challenge, a competition at the 21st China Conference on Computational Linguistics organized by Baidu, inviting participants worldwide to generate timestamped concise summaries of video segments, with registration details, eligibility, task description, example inputs/outputs, and evaluation metrics based on timing accuracy and ROUGE-L.

CCL2022 · Evaluation Metrics · NLP
0 likes · 6 min read
Model Perspective
May 15, 2022 · Fundamentals

Standardizing Evaluation Indicators: Convert Small, Central, and Interval Metrics to Large Scores

This article explains the concept of indicator standardization, describing why aligning metric directions is essential, and provides step-by-step transformations for small, central, and interval-type indicators—such as reciprocal, translation, and scaling methods—to convert them into large-scale metrics where higher values indicate better performance.

Evaluation Metrics · data normalization · indicator standardization
0 likes · 3 min read
DeWu Technology
Mar 11, 2022 · Artificial Intelligence

Deep Learning in Face Recognition

The article surveys deep‑learning‑based face‑recognition systems, detailing detection, preprocessing, and recognition pipelines, describing evaluation metrics such as TAR, FAR, and Rank‑K, reviewing major datasets like LFW, MS‑Celeb‑1M and VGGFace2, and comparing leading architectures—including FaceNet, CenterLoss, SphereFace and InsightFace—while highlighting their strengths, limitations, real‑world applications, and seminal research references.

AI · Datasets · Deep Learning
0 likes · 14 min read
DataFunSummit
Apr 8, 2021 · Artificial Intelligence

Evaluation Metrics and Methods for Recommendation Systems

This article explains the purpose, dimensions, and specific quantitative metrics—such as accuracy, surprise, diversity, RMSE, MAE, R‑squared, MAP, MRR, ROC and AUC—used to evaluate recommendation systems, covering user, platform, item, and system perspectives for practical AI deployments.

Evaluation Metrics · information retrieval
0 likes · 13 min read
58UXD
Sep 15, 2020 · Artificial Intelligence

How to Evaluate Recommendation Systems: Metrics, Case Study, and Insights

This article explores the fundamentals and evaluation of recommendation systems, detailing their definition, key performance dimensions such as accuracy, diversity, novelty, serendipity, trust, and real‑time utility, and presents a practical case study from 58.com with reflections on methodology and future improvements.

Evaluation Metrics · Recommendation Systems · User experience
0 likes · 12 min read
Yanxuan Tech Team
Sep 1, 2020 · Big Data

Yanxuan’s Data Warehouse Blueprint: Architecture, Standards, and Evaluation

This article introduces Yanxuan’s data warehouse concept, platform layers, development standards, and a comprehensive evaluation framework, detailing its multi‑layer architecture (ODS, DWD, DWS, DIM, DM), supporting offline and real‑time platforms, and six key assessment dimensions such as data quality, security, and development efficiency.

Big Data Architecture · Evaluation Metrics
0 likes · 12 min read
Alibaba Cloud Developer
Jul 3, 2020 · Artificial Intelligence

Unlocking Visual Object Tracking: Principles, Algorithms, and Evaluation

This comprehensive review explains visual object tracking in computer vision, covering its definition, core sub‑problems of candidate generation, feature extraction, and decision making, system architecture, motion, feature and observation models, algorithm classifications, evaluation metrics, datasets, and recent research trends.

Computer Vision · Deep Learning · Evaluation Metrics
0 likes · 30 min read
360 Tech Engineering
Aug 28, 2019 · Artificial Intelligence

Deep Collaborative Filtering Models and Their Implementation in Recommender Systems

This article surveys traditional and deep learning based collaborative filtering techniques—including similarity methods, matrix factorization, explicit and implicit feedback handling, various loss functions, evaluation metrics, and TensorFlow implementations of GMF, MLP, NeuMF, DMF, and ConvMF models—providing practical guidance for building large‑scale recommender systems.

Evaluation Metrics · TensorFlow · collaborative filtering
0 likes · 21 min read
DataFunTalk
Jun 13, 2019 · Artificial Intelligence

What Makes a Good Recommendation System?

This article explores the multifaceted criteria for evaluating a good recommendation system, covering macro and micro perspectives, product domain considerations, information retrieval, algorithmic accuracy, user experience, and business impact, and outlines a systematic iteration process for continuous improvement.

AI · Evaluation Metrics · User experience
0 likes · 13 min read
JD Tech Talk
Mar 29, 2019 · Artificial Intelligence

Understanding Confusion Matrix, ROC Curve, and Evaluation Metrics for Binary Classification Models

This article explains the essential tools for evaluating a binary classification model once it has been built: the confusion matrix, the metrics derived from it (accuracy, precision, recall, and F1 score), and the ROC curve. It illustrates their definitions and visualizations and discusses practical considerations for different business scenarios.

Evaluation Metrics · F1 score · ROC curve
0 likes · 6 min read
Efficient Ops
Mar 26, 2019 · Artificial Intelligence

How Live-Streaming Platforms Build Scalable Recommendation Systems

This article explains the design of a live‑streaming recommendation system, covering its overall architecture, ranking, content‑based and collaborative‑filtering methods, similarity calculations, multi‑algorithm fusion, sorting, user profiling, and evaluation metrics with practical examples and diagrams.

Evaluation Metrics · collaborative filtering · content-based
0 likes · 17 min read
Meituan Technology Team
Dec 20, 2018 · Artificial Intelligence

Demystifying Learning to Rank: From Core Algorithms to Scalable Online Sorting Architecture

This article provides a comprehensive, system‑engineer‑focused guide to Learning to Rank, covering fundamental machine‑learning concepts, evaluation metrics such as Precision, nDCG and ERR, training‑testing‑inference stages, pointwise/pairwise/listwise methods, and a detailed multi‑layer online ranking architecture with feature, model and recall governance.

A/B testing · Domain-Driven Design · Evaluation Metrics
0 likes · 29 min read
Manbang Technology Team
Sep 15, 2018 · Artificial Intelligence

YMM-TECH Algorithm Competition Final: Problem Background, Evaluation Methodology, and Scoring Details

The YMM-TECH algorithm competition final, held at Nanjing University of Posts and Telecommunications, presented a logistics recommendation problem that draws on driver behavior data: each solution recommends 20 cargo items per driver and is scored with position‑weighted ranking‑accuracy metrics. The article gives the detailed formulas and worked examples.

AI · Evaluation Metrics · algorithm competition
0 likes · 5 min read
360 Quality & Efficiency
Jun 4, 2018 · Artificial Intelligence

Common Engineering Algorithms and Their Testing Methods

This article introduces the most commonly used algorithms in engineering—recommendation, optimization, estimation, and classification—describes their typical application scenarios, and explores various testing methods and evaluation metrics such as offline experiments, user surveys, A/B testing, and performance indicators like accuracy, coverage, and robustness.

Evaluation Metrics · Recommendation Systems · algorithm testing
0 likes · 12 min read
Baidu Waimai Technology Team
Aug 3, 2017 · Artificial Intelligence

Model Testing and Evaluation Metrics for Strategy Projects in the AI Era

This article explains the challenges of testing machine‑learning models for strategy projects, outlines the overall testing workflow, describes key offline and online evaluation metrics such as AUC and AB‑testing, and summarizes best‑practice procedures for assessing model performance, user experience, and effect differences.

AB testing · AI · AUC
0 likes · 8 min read
Architect
Jun 4, 2016 · Artificial Intelligence

Collaborative Filtering Recommendation Systems: Evaluation Metrics, User‑Based and Item‑Based CF with Python Implementations

This article reviews recommendation system evaluation metrics such as precision, recall, coverage and novelty, explains the principles of user‑based and item‑based collaborative filtering, provides complete Python code for each method, and compares their characteristics and suitable application scenarios.

Evaluation Metrics · Python · collaborative filtering
0 likes · 14 min read
Qunar Tech Salon
Oct 23, 2015 · Artificial Intelligence

Critical Examination of Face Recognition Benchmarks and Overstated Accuracy Claims

The article critiques the rapid rise of face‑recognition research by highlighting unfair comparisons, lack of statistical validation, misleading accuracy metrics versus real‑world verification rates, and the hype surrounding deep neural networks, urging a more rigorous and application‑focused evaluation of AI systems.

Biometrics · Deep Learning · Evaluation Metrics
0 likes · 8 min read
21CTO
Sep 7, 2015 · Artificial Intelligence

Top 10 Open Challenges Shaping the Future of Personalized Recommendation Systems

This article surveys the fundamental misconceptions about personalized recommendation, distinguishes it from market segmentation and collaborative filtering, and then systematically presents ten critical research challenges—including data sparsity, cold‑start, scalability, diversity‑accuracy trade‑offs, system robustness, user behavior modeling, evaluation metrics, UI/UX, cross‑dimensional data integration, and social recommendation—each illustrated with examples and recent literature.

Evaluation Metrics · cold start · data sparsity
0 likes · 31 min read