Tagged articles
41 articles
MaGe Linux Operations
Dec 24, 2025 · Backend Development

Mastering OpenTelemetry: From Setup to Advanced Sampling and Production‑Ready Practices

This guide walks through the fundamentals of OpenTelemetry, covering component architecture, environment setup, SDK and Collector configuration for Java, Go, and Kubernetes, and dives into common pitfalls, performance tuning, security hardening, high‑availability deployment, and advanced tail‑based sampling strategies.

Collector · Distributed Tracing · Kubernetes
0 likes · 27 min read
Model Perspective
Mar 20, 2025 · Big Data

How to Sample Effectively in the Big Data Era: Methods and Best Practices

This article explores essential sampling strategies for big‑data environments—including simple random, reservoir, stratified, oversampling, undersampling, and weighted sampling—detailing their principles, algorithmic steps, advantages, drawbacks, and suitable application scenarios to help analysts choose the right method.

Big Data · Sampling · oversampling
0 likes · 8 min read
AI Algorithm Path
Mar 4, 2025 · Artificial Intelligence

How to Control LLM Output Using Temperature, Top‑K, and Top‑P

The article explains how sampling parameters—Temperature, Top‑k, and Top‑p—shape the output of large language models, comparing greedy and beam search, illustrating probability changes with concrete examples, and offering practical guidance on adjusting these settings for different tasks.

Beam Search · Greedy Search · LLM
0 likes · 9 min read
AI Algorithm Path
Feb 19, 2025 · Artificial Intelligence

How Temperature Shapes Output in Large Language Models

The article explains the Temperature hyper‑parameter in large language models, shows how it modifies the softmax distribution, provides a Python visualisation script, and demonstrates through experiments that higher values increase creativity while lower values make outputs more deterministic.

Python · Sampling · Softmax
0 likes · 5 min read
JD Tech Talk
Dec 27, 2024 · Backend Development

Log Sampling and Cross‑Thread Propagation in High‑Throughput Java Services

The article examines the performance impact of excessive logging in large‑scale Java systems and proposes request‑level sampling with cross‑thread identifier propagation, offering practical component‑based solutions, implementation considerations, and a concrete code example for backend developers.

Backend · Java · Sampling
0 likes · 7 min read
DataFunTalk
Jul 22, 2024 · Fundamentals

A/B Testing and Causal Inference: Evolution of Sampling, Metric Evaluation, and Statistical Inference

The article reviews the development of online A/B testing, covering sampling and traffic‑splitting techniques, metric computation improvements, statistical inference advances, and current challenges such as interference, real‑time inference, and large‑scale metric computation, while referencing recent research papers.

A/B testing · Metric Evaluation · Sampling
0 likes · 10 min read
NewBeeNLP
Jun 28, 2024 · Artificial Intelligence

Why Large Language Models Aren’t Magic: Understanding Compression and Prompt Engineering

This article demystifies large language models by comparing them to classic compression algorithms, explains how they compress massive data into compact parameters, explores their ability to learn abstract patterns, and provides practical insights into prompt engineering, sampling strategies, and multi‑step agent architectures for real‑world applications.

Agent Architecture · LLM · Sampling
0 likes · 19 min read
Huawei Cloud Developer Alliance
Nov 30, 2023 · Artificial Intelligence

Mastering LLM Text Generation: Decoding Methods Explained

This review of the recent MindSpore NLP public class walks through the fundamentals of large language model text generation, detailing deterministic decoding such as greedy and beam search, stochastic sampling techniques like temperature, top‑k and top‑p, and advanced methods including constrained beam, contrastive, and assisted search, with illustrative examples.

Beam Search · Greedy Search · LLM
0 likes · 5 min read
FunTester
Sep 1, 2023 · Operations

Observability in the Cloud‑Native Era: Data Collection Strategies and Sampling Techniques

The article explains how cloud‑native observability systems gather massive telemetry from infrastructure, containers, middleware and services, compares direct push and file‑based collection approaches, and details head, tail and local sampling methods to optimize data completeness and performance.

Distributed Tracing · Observability · Performance Optimization
0 likes · 10 min read
DevOps Cloud Academy
Aug 29, 2023 · Cloud Native

Observability and Data Collection Strategies in Cloud‑Native Environments

The article explains that while observability is not new, cloud‑native systems have driven rapid development of observability platforms. It details data collection architectures, direct push versus file‑based approaches, and head, tail, and local sampling techniques that balance completeness, real‑time reporting, and performance impact.

Sampling · cloud-native · data collection
0 likes · 11 min read
Baidu Geek Talk
Aug 21, 2023 · Artificial Intelligence

Decoding Strategies for Generative Models: Top‑k, Top‑p, Contrastive Search, Beam Search, and Sampling

The article explains how generative models use deterministic methods like greedy and beam search and stochastic techniques such as top‑k, top‑p, contrastive search and sampling, describing their mechanisms, temperature control, repetition penalties, and practical trade‑offs for balancing fluency, diversity and coherence.

AI · Beam Search · Sampling
0 likes · 9 min read
vivo Internet Technology
Aug 2, 2023 · Game Development

Pre‑Experiment User Stratification Model for Improving AB Test Uniformity in Vivo Game Center

The paper introduces a pre‑experiment user stratification model that uses covariate‑balancing algorithms to create separate strata for distribution and revenue metrics, ensuring equal user allocation in Vivo game‑center AB tests, which reduces metric variance, improves gray‑release effectiveness, and saves significant investigation effort.

AB testing · Game Analytics · Sampling
0 likes · 14 min read
Architect
Jul 1, 2023 · Artificial Intelligence

Comprehensive Guide to Text Generation Decoding Strategies with HuggingFace Transformers

This tutorial explores various text generation decoding methods—including greedy search, beam search, top‑k/top‑p sampling, sample‑and‑rank, and group beam search—explaining their principles, providing detailed Python code examples, and comparing their use in modern large language models.

Beam Search · Greedy Search · Sampling
0 likes · 59 min read
Tencent Cloud Developer
Jun 1, 2023 · Artificial Intelligence

A Comprehensive Guide to Decoding Strategies for Text Generation with HuggingFace Transformers

This guide thoroughly explains the major decoding strategies for neural text generation in HuggingFace Transformers—including greedy, beam, diverse beam, sampling, top‑k, top‑p, sample‑and‑rank, beam sampling, and group beam search—detailing their principles, Python implementations with LogitsProcessor components, workflow diagrams, comparative analysis, and references to original research.

Beam Search · Sampling · Text Generation
0 likes · 60 min read
DataFunTalk
Jan 8, 2023 · Big Data

ByteDance Event‑Tracking Data Cost Governance Practices

This article describes ByteDance's comprehensive approach to managing the massive volume of event‑tracking data, detailing the background, cost‑reduction strategies, experience review, future plans, and a Q&A session that together illustrate how systematic data governance can dramatically cut storage and processing expenses.

Big Data · ByteDance · Data Governance
0 likes · 18 min read
ByteDance SYS Tech
Jan 6, 2023 · Fundamentals

How ByteDance Scaled Profile‑Guided Optimization to Boost CPU Efficiency

This article explains ByteDance's large‑scale adoption of profile‑guided optimization (PGO), covering its principles, instrumentation and sampling methods, the automated platform built for data collection and compilation, and the resulting performance gains across dozens of critical services.

ByteDance · Compiler Optimization · Instrumentation
0 likes · 12 min read
Model Perspective
Oct 4, 2022 · Artificial Intelligence

How Metropolis-Hastings Improves MCMC Sampling Efficiency

This article explains the detailed‑balance condition for Markov chains, shows why finding a transition matrix for a given stationary distribution is hard, and demonstrates how Metropolis‑Hastings modifies MCMC to achieve higher acceptance rates with a concrete Python example.

MCMC · Markov chain · Metropolis-Hastings
0 likes · 9 min read
Model Perspective
Sep 28, 2022 · Artificial Intelligence

How Monte Carlo Sampling Powers AI: From Basics to Acceptance-Rejection

This article introduces Monte Carlo methods, explains how random sampling approximates integrals, discusses uniform and non‑uniform probability distributions, and details acceptance‑rejection sampling as a technique for generating samples from complex distributions, laying the groundwork for understanding Markov Chain Monte Carlo in AI.

Acceptance-Rejection · MCMC · Monte Carlo
0 likes · 8 min read
Model Perspective
Sep 23, 2022 · Fundamentals

Mastering Monte Carlo: From Acceptance-Rejection to Gibbs Sampling in Python

This article explains the motivations behind Monte Carlo methods, introduces acceptance-rejection sampling, details Markov Chain Monte Carlo concepts, and walks through Metropolis-Hastings and Gibbs sampling algorithms with Python implementations, highlighting their use in high‑dimensional probability distribution sampling.

MCMC · Monte Carlo · Python
0 likes · 18 min read
Model Perspective
Sep 21, 2022 · Fundamentals

Unlocking Monte Carlo Sampling: From Basics to Acceptance‑Rejection in AI

Monte Carlo methods, originally a gambling-inspired random simulation technique, provide a versatile way to approximate integrals and sums, and by using acceptance‑rejection sampling they enable drawing samples from complex probability distributions, a key step toward effective Markov Chain Monte Carlo algorithms in machine learning and AI.

Acceptance-Rejection · MCMC · Monte Carlo
0 likes · 7 min read
Bilibili Tech
Sep 20, 2022 · Fundamentals

Common Color Representation Methods and Image/Video Fundamentals

The article explains common color models such as grayscale, RGB and YUV, describes image fundamentals like resolution and aspect ratio, outlines typical storage formats (RGB, YUV420P, NV12/NV21) and their bit‑depth considerations, and introduces video basics including frame rate, compression stages and HDR mapping.

Image Processing · RGB · Sampling
0 likes · 21 min read
Model Perspective
Jun 1, 2022 · Fundamentals

How the Central Limit Theorem Powers Confidence Intervals and Sample Estimates

This article explains the Central Limit Theorem, distinguishes standard deviation from standard error, illustrates the 3‑σ rule, and shows how confidence levels, significance levels, and interval estimation combine to derive reliable confidence intervals for large‑sample population mean estimates.

Sampling · central limit theorem · confidence interval
0 likes · 9 min read
Tencent Cloud Developer
Dec 1, 2021 · Backend Development

From Dapper to Modern Distributed Tracing: Concepts, Algorithms, and Practices

The article traces the evolution of distributed tracing from Google’s Dapper paper through early research, Pinpoint and X‑Trace, to modern open‑source tools like Zipkin, Jaeger and SkyWalking, explaining metadata propagation, asynchronous reporting, classic nested and convolution algorithms, and practical implementation details for non‑intrusive, scalable tracing.

Dapper · Distributed Tracing · Sampling
0 likes · 14 min read
Didi Tech
May 19, 2021 · Artificial Intelligence

Applying Epsilon‑Greedy Bandit Algorithm for Content Delivery Optimization at DiDi

DiDi applied the epsilon‑greedy bandit algorithm integrated with its CMS to optimize ad placement across 600 slots, using quality scores, traffic sampling, and a drag‑and‑drop UI, which boosted CTR from 1.35% to 13.43% and unique visitors by 686%, demonstrating data‑driven growth beyond simple A/B testing.

Content Optimization · Data-driven · Epsilon-Greedy
0 likes · 10 min read
dbaplus Community
Jul 8, 2019 · Big Data

How to Use ClickHouse Sampling and Materialized Views for Real‑Time Monitoring of Billion‑Scale Ad Traffic

This article explains how to handle high‑volume advertising monitoring by storing raw request logs in ClickHouse, enabling sampling and materialized views, and using TP999 metrics, aggregating tables, and Grafana queries to achieve fast, flexible, and low‑impact real‑time analytics on billions of events.

ClickHouse · Sampling · big-data
0 likes · 10 min read
DataFunTalk
Jul 5, 2019 · Artificial Intelligence

Lead Quality Prediction for Real Estate: Data, Modeling, and Interpretability

This article presents a case study on building and deploying a lead‑quality classification model for a high‑value, low‑frequency real‑estate platform, covering business context, data challenges, sampling strategies, feature engineering, model selection, tuning, evaluation metrics, interpretability analysis, and observed performance improvements.

Real Estate · Sampling · classification
0 likes · 14 min read
Hulu Beijing
Mar 8, 2018 · Artificial Intelligence

Master Common Sampling Techniques: Inverse Transform, Rejection, Importance & MCMC

This article explains the core ideas and step-by-step procedures of widely used sampling methods—including inverse transform, rejection, importance, and Markov Chain Monte Carlo techniques such as Metropolis‑Hastings and Gibbs—highlighting their mathematical foundations, practical implementations, and when each method is appropriate.

Importance Sampling · MCMC · Monte Carlo
0 likes · 11 min read
Hulu Beijing
Dec 26, 2017 · Fundamentals

How to Sample a Gaussian Distribution: Methods, Algorithms, and Performance

This article explains why Gaussian (normal) distribution sampling is essential, describes the mathematical transformation from a standard normal, and compares several practical algorithms—including inverse transform, Box‑Muller, Marsaglia polar, rejection sampling, and Ziggurat—highlighting their implementation steps and efficiency considerations.

Box-Muller · Gaussian · Marsaglia
0 likes · 8 min read
Hulu Beijing
Nov 21, 2017 · Artificial Intelligence

How to Tackle Imbalanced Datasets with Sampling Techniques

Sampling transforms complex distributions into manageable data points, and mastering methods like random oversampling, undersampling, SMOTE, and its variants is essential for handling imbalanced binary classification problems in machine learning, ensuring models achieve balanced accuracy and recall across classes.

SMOTE · Sampling · imbalanced data
0 likes · 8 min read
Beike Product & Technology
Jul 16, 2017 · Industry Insights

How Lianjia Built LTrace: A Low‑Overhead, Scalable Distributed Tracing Platform

This article explains how Lianjia designed and implemented LTrace, a zero‑intrusion, high‑performance distributed tracing system that captures full request chains across heterogeneous services, supports multi‑language environments, offers flexible sampling, and enables rapid fault isolation and performance optimization.

Distributed Tracing · Observability · Sampling
0 likes · 12 min read
Didi Tech
Jul 10, 2017 · Fundamentals

Statistical Foundations for A/B Testing: Populations, Samples, Confidence Intervals, and the Central Limit Theorem

This article explains the essential statistical concepts—populations, samples, sampling error, confidence intervals, the Central Limit Theorem, and normal distribution—that underpin A/B testing, showing how they enable reliable hypothesis evaluation, accurate impact prediction, and data‑driven decision making for product experiments.

A/B testing · Sampling · central limit theorem
0 likes · 14 min read
ITFLY8 Architecture Home
Dec 8, 2016 · Operations

Designing Effective End-to-End Tracing Systems for Distributed Services

This article surveys the design of end‑to‑end tracing systems for large distributed services, explaining core use cases, tracing approaches, metadata propagation, sampling strategies, visualization techniques, and recommended design choices to improve debugging, performance analysis, and resource attribution.

Distributed Tracing · Sampling · System Design
0 likes · 44 min read
360 Quality & Efficiency
Sep 18, 2016 · Mobile Development

Performance Testing Metrics and Sampling Strategy for Android Apps

The article outlines a comprehensive set of Android app performance metrics, device coverage, a non‑root sampling strategy using dumpsys commands, shell‑based data collection, and Python‑driven HTML reporting, providing practical guidance and reference implementations for mobile developers.

Android · Performance Testing · Sampling
0 likes · 4 min read
Art of Distributed System Architecture Design
Aug 2, 2015 · Artificial Intelligence

Designing Machine Learning Models for Fraud Detection: Sampling, Feature Engineering, and Evaluation

This article explains how Airbnb's Trust & Safety team builds machine‑learning models to detect fraudulent behavior, covering problem definition, role‑based sampling, feature design techniques such as normalization and CP‑coding, and the trade‑offs between precision and recall in model evaluation.

AI · Model Evaluation · Sampling
0 likes · 10 min read