Qunhe Technology Quality Tech
Dec 18, 2025 · Artificial Intelligence

How We Slashed AI Token Costs by Up to 90% with Smart Pipeline Optimizations

This report details a systematic analysis of AI token consumption in a multilingual UI‑automation workflow and presents four concrete optimization techniques—prompt trimming, duplicate‑call avoidance, text deduplication, and placeholder‑based knowledge‑base integration—that together reduced monthly token usage by over 90% without harming detection accuracy.

AI token optimization · Text Deduplication · automation pipeline
14 min read
Sohu Smart Platform Tech Team
Aug 9, 2025 · Artificial Intelligence

How SimHash and Cosine Similarity Accelerate Large-Scale Text Deduplication

This article explains why traditional pairwise text comparison is impractical for massive news corpora and introduces cosine similarity and SimHash as efficient deduplication techniques. It walks through their mathematical foundations, step‑by‑step implementation details, and code examples, then discusses trade‑offs such as accuracy versus speed.

SimHash · Text Deduplication · algorithm
12 min read
Sohu Tech Products
Feb 28, 2024 · Big Data

How SimHash and Cosine Similarity Accelerate Large‑Scale Text Deduplication

This article explains why massive news feeds need efficient deduplication, compares cosine similarity and SimHash for measuring text similarity, walks through a step‑by‑step implementation with Java code, and shows how a space‑for‑time indexing strategy can reduce duplicate‑detection complexity from O(n²) to near O(1).

Near-Duplicate Detection · SimHash · Text Deduplication
14 min read