Tagged articles
11 articles
AI Tech Publishing
Apr 29, 2026 · Artificial Intelligence

Why Do AI Agents Forget and Hallucinate? A Complete Guide to KV‑Cache Memory Mechanisms

The article argues that AI agents forget and hallucinate because token‑level attention scores cause key‑value (KV) cache entries to be evicted before they can be retrieved. It then surveys KV‑cache basics, naive cache growth, StreamingLLM windowing, SnapKV’s attention‑guided compression, token‑retention studies, and Memory Sparse Attention, compares these methods, and discusses practical system pitfalls and design implications.

AI agents · KV cache · Memory Sparse Attention
20 min read
SuanNi
Apr 3, 2026 · Artificial Intelligence

How GEMS Lets a 6B Open‑Source Model Beat Top Closed‑Source Image Generators

The article presents the GEMS (Agent‑Native Multimodal Generation with Memory and Skills) framework, detailing its multi‑agent loop, hierarchical memory compression, on‑demand skill modules, and extensive benchmark results that show a lightweight 6B model surpassing larger proprietary systems on complex image‑generation tasks.

GEMS · Image Generation · Multimodal AI
14 min read
PaperAgent
Mar 26, 2026 · Artificial Intelligence

TurboQuant: How Google’s New Vector Quantization Cuts KV Memory 6× and Boosts Speed

TurboQuant, presented at ICLR 2026, is a theoretically grounded vector quantization technique that reportedly shrinks the key‑value cache of large language models by at least 6×, delivers speedups of up to 8×, and incurs no accuracy loss. It combines PolarQuant’s polar‑coordinate compression with a 1‑bit QJL error‑correction step and is evaluated on benchmarks such as LongBench and GloVe.

AI inference · Benchmarking · TurboQuant
10 min read
Xiaolei Talks DB
Feb 25, 2026 · Databases

Engula: Redis‑Compatible In‑Memory Database Cutting Memory Use by 50%

Engula is a Redis‑compatible, high‑performance in‑memory database that cuts memory usage by up to 50% through compression and metadata optimization, at a cost of only about 10% performance overhead. This article details its architecture, testing methodology, and benchmark results.

Compatibility · In-Memory Database · memory compression
7 min read
Deepin Linux
Apr 11, 2025 · Fundamentals

Understanding ZRAM: Linux Memory Compression and Swap Optimization

This article explains the ZRAM technology in Linux, covering its principles, configuration steps, kernel integration, performance optimizations, and practical use cases for improving memory utilization on embedded devices, Android, and legacy PCs.

Swap · kernel · memory compression
24 min read
Open Source Linux
May 26, 2023 · Operations

Boost Linux Performance with zSwap, zRAM, and Zstandard Compression

This article explains how Linux memory compression techniques such as zSwap, zRAM, and the Zstandard algorithm reduce I/O pressure, extend flash lifespan, and improve overall system performance, while also covering their drawbacks and step‑by‑step activation procedures.

Linux · memory compression · performance optimization
6 min read
MaGe Linux Operations
May 18, 2023 · Operations

Boost Linux Performance with zSwap, zRAM, and zstd Compression

Linux memory compression techniques such as zSwap and zRAM, together with the zstd algorithm, reduce I/O latency and extend effective RAM capacity by compressing swap pages. They deliver performance gains at the cost of CPU overhead and added configuration complexity. This guide explains their principles, advantages, drawbacks, and activation steps.

Linux · memory compression · system performance
6 min read
Coolpad Technology Team
Nov 6, 2021 · Mobile Development

Analysis of Intermittent Unresponsive Touch Events in Feishu Caused by Process D State and Memory Compression

The article investigates why the Feishu app sometimes fails to respond to swipe gestures after a hot start, tracing the issue to the app entering a D (uninterruptible) state during memory compression, and demonstrates how adjusting CPU priority for compression threads can reduce the problem's occurrence.

Android · Input Events · Process D State
8 min read
Programmer DD
Jul 19, 2021 · Backend Development

How Redis Ziplist Compresses Memory and When to Use It

This article explains Redis's ziplist compressed list structure, its internal fields, lookup algorithm, performance characteristics, configuration thresholds for Hash and List types, and demonstrates a real‑world use case with memory‑saving calculations and experimental results.

Data Structures · memory compression · redis
11 min read
OPPO Kernel Craftsman
Feb 21, 2020 · Fundamentals

Overview of Linux Memory Compression Technologies: zSwap, zRAM, and zCache

Linux reduces RAM pressure through three main compression mechanisms—zSwap, which caches compressed pages before writing to swap; zRAM, a RAM‑backed compressed block device; and zCache, a file‑page compressor—each paired with specialized allocators (zsmalloc, zbud, z3fold) and configurable algorithms, offering trade‑offs in speed, ratio, CPU load, and fragmentation.

Linux · memory compression · performance
12 min read