Tagged articles

Warmup

4 articles · Page 1 of 1

Feb 10, 2024 · Backend Development

How to Warm Up Your Cache to Boost High‑Concurrency System Performance

Cache warming preloads frequently accessed data into memory before traffic spikes, which improves hit rates, reduces cold‑start latency, prevents cache breakdown, and lessens backend load. The article covers startup loading, scheduled jobs, manual triggers, Redis tools, and Caffeine loaders, each demonstrated with Spring Boot code examples.

Cache · Caffeine · Spring Boot

0 likes · 10 min read

Baobao Algorithm Notes

Jan 14, 2022 · Artificial Intelligence

BERT Interview Q&A: Decoding CLS, Masks, Complexity, and More

An in‑depth Q&A breaks down core BERT concepts, with a concise explanation of each: the purpose of the [CLS] token, masking strategies, self‑attention complexity, sparse attention tricks, subword handling of OOV words, warm‑up learning rates, GPT’s unidirectional nature, and ALBERT’s parameter sharing.

BERT · Masking · Self-Attention

0 likes · 7 min read

iQIYI Technical Product Team

Nov 27, 2020 · Artificial Intelligence

Optimizing TensorFlow Serving Model Hot‑Update to Eliminate Latency Spikes in CTR Recommendation Systems

By adding model warm‑up files, separating load/unload threads, switching to the Jemalloc allocator, and isolating TensorFlow’s parameter memory from RPC request buffers, iQIYI’s engineers reduced TensorFlow Serving hot‑update latency spikes in high‑throughput CTR recommendation services from over 120 ms to about 2 ms, eliminating jitter.

Model Hot Update · TensorFlow Serving · Warmup

0 likes · 11 min read

360 Tech Engineering

Aug 17, 2020 · Artificial Intelligence

Deploying TensorFlow 2.x Models with TensorFlow Serving: Concepts, Setup, and Usage

This guide explains the core concepts of TensorFlow Serving and shows how to prepare Docker images, save TensorFlow 2.x models in various formats, configure version policies, warm up models, start the service, and invoke it via gRPC or HTTP, with complete code examples.

Docker · HTTP · Model Deployment

0 likes · 11 min read
