
Five Practical API Call Rate Monitoring Solutions: Full Comparison of Performance, Cost, and Complexity

This article walks through five concrete implementations for per‑minute API call counting—fixed window, lazy sliding window, Spring AOP, Redis time‑series, and Micrometer + Prometheus—detailing their design, code, trade‑offs, benchmark results, memory usage, and real‑world deployment tips.

Programmer XiaoFu

Jun 4, 2025


Why monitor API call frequency

Accurate per‑minute call counts help identify performance bottlenecks, plan capacity, detect abuse spikes, support usage‑based billing, and narrow down problems when the system misbehaves.

Key factors for a monitoring design

The solution must balance accuracy, memory footprint, thread‑safety, clock‑drift handling, and integration effort.

Solution 1: Fixed‑window counter

Store a ConcurrentHashMap<String, AtomicLong> where the key is the API name and the value is the call count. A scheduled task runs every 60 seconds, prints the map and resets all counters.

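import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

// Must be a Spring bean, with scheduling enabled (@EnableScheduling), for @Scheduled to fire.
@Component
public class SimpleCounter {
    private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    public void increment(String apiName) {
        counters.computeIfAbsent(apiName, k -> new AtomicLong(0)).incrementAndGet();
    }

    public long getCount(String apiName) {
        AtomicLong count = counters.get(apiName);
        return count == null ? 0 : count.get();
    }

    // Runs every 60 s measured from startup, so windows are not aligned to wall-clock minutes.
    @Scheduled(fixedRate = 60000)
    public void printAndReset() {
        System.out.println("=== API minute statistics ===");
        // getAndSet(0) reads and resets atomically, so no increment is lost between the two steps.
        counters.forEach((api, count) -> System.out.println(api + ": " + count.getAndSet(0)));
    }
}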

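Drawback: the windows start whenever the application started, not on wall-clock minute boundaries. A reset may fire at, say, 8:59:59, so a call at 9:00:01 lands in a window that runs from 8:59:59 to 9:00:59. Reported counts are therefore offset from calendar minutes, and a burst that straddles a reset is split across two windows.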
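One way to avoid the offset is to key each bucket by the wall-clock minute instead of relying on when a timer happens to fire. The following is a minimal sketch under assumptions: the class name AlignedMinuteCounter, the "api@minute" key layout, and the injected LongSupplier clock are illustrative and not part of the original design.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.LongSupplier;

// Hypothetical sketch: a fixed-window counter whose windows are calendar minutes.
final class AlignedMinuteCounter {
    private final LongSupplier nowMillis; // injected clock, e.g. System::currentTimeMillis
    private final ConcurrentHashMap<String, LongAdder> buckets = new ConcurrentHashMap<>();

    AlignedMinuteCounter(LongSupplier nowMillis) {
        this.nowMillis = nowMillis;
    }

    // Minutes since the Unix epoch; all calls within one wall-clock minute map to the same value.
    static long epochMinute(long epochMillis) {
        return Math.floorDiv(epochMillis, 60_000L);
    }

    private static String key(String apiName, long minute) {
        return apiName + "@" + minute;
    }

    void increment(String apiName) {
        long minute = epochMinute(nowMillis.getAsLong());
        buckets.computeIfAbsent(key(apiName, minute), k -> new LongAdder()).increment();
    }

    long getCount(String apiName, long minute) {
        LongAdder adder = buckets.get(key(apiName, minute));
        return adder == null ? 0 : adder.sum();
    }

    // Drop buckets older than the given minute; call periodically to bound memory.
    void evictBefore(long oldestMinuteToKeep) {
        buckets.keySet().removeIf(k ->
                Long.parseLong(k.substring(k.lastIndexOf('@') + 1)) < oldestMinuteToKeep);
    }
}
```

With this layout a call at 8:59:59 and a call at 9:00:01 always fall into different buckets, regardless of when the process started. The cost is one map entry per API per minute until evictBefore is called.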

Solution 2: Sliding‑window counter (lazy‑load)

Divide a minute into six 10‑second slices. Each API has an array of six AtomicLong counters. The current slice index is derived from the system clock, and only accessed slices are updated.

public class SlidingWindowCounter {
    private final ConcurrentHashMap<String, CounterEntry> apiCounters = new ConcurrentHashMap<>();
    private final int WINDOW_SIZE_SECONDS = 10;
    private final int WINDOW_COUNT = 6;
    private volatile int currentTimeSlice;

    public SlidingWindowCounter() {
        currentTimeSlice = (int) (System.currentTimeMillis() / 1000 / WINDOW_SIZE_SECONDS);
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        long initialDelay = WINDOW_SIZE_SECONDS - (System.currentTimeMillis() / 1000 % WINDOW_SIZE_SECONDS);
        scheduler.scheduleAtFixedRate(this::slideWindow, initialDelay, WINDOW_SIZE_SECONDS, TimeUnit.SECONDS);
    }

    public void increment(String apiName) {
        int timeSlice = currentTimeSlice;
        CounterEntry entry = apiCounters.computeIfAbsent(apiName, k -> new CounterEntry());
        entry.increment(timeSlice);
    }

    public long getMinuteCount(String apiName) {
        CounterEntry entry = apiCounters.get(apiName);
        return entry == null ? 0 : entry.getTotal(currentTimeSlice);
    }

    private void slideWindow() {
        try {
            int newSlice = (int) (System.currentTimeMillis() / 1000 / WINDOW_SIZE_SECONDS);
            if (newSlice <= currentTimeSlice) {
                System.err.println("Clock skew detected: " + newSlice + " <= " + currentTimeSlice);
                return;
            }
            currentTimeSlice = newSlice;
            cleanupIdleCounters();
        } catch (Exception e) {
            System.err.println("Error in slideWindow: " + e.getMessage());
        }
    }

    private class CounterEntry {
        private final AtomicLong[] counters = new AtomicLong[WINDOW_COUNT];
        private volatile int lastAccessedSlice;
        private volatile long lastUpdateTime;

        CounterEntry() {
            for (int i = 0; i < WINDOW_COUNT; i++) counters[i] = new AtomicLong(0);
            lastAccessedSlice = currentTimeSlice;
            lastUpdateTime = System.currentTimeMillis();
        }

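        void increment(int timeSlice) {
            updateWindowsIfNeeded(timeSlice);
            int index = timeSlice % WINDOW_COUNT;
            counters[index].incrementAndGet();
            lastUpdateTime = System.currentTimeMillis();
        }

        long getTotal(int currentSlice) {
            updateWindowsIfNeeded(currentSlice);
            long total = 0;
            for (AtomicLong c : counters) total += c.get();
            return total;
        }

        // Zero the slots of slices that have passed since the last access. The lock, together
        // with advancing lastAccessedSlice inside this method, ensures each stale slot is
        // cleared exactly once, so a concurrent caller cannot wipe a slot that has already
        // received increments for the current slice.
        private void updateWindowsIfNeeded(int currentSlice) {
            if (currentSlice <= lastAccessedSlice) return; // fast path, no locking
            synchronized (this) {
                int sliceDiff = currentSlice - lastAccessedSlice;
                if (sliceDiff <= 0) return; // another thread already advanced the window
                if (sliceDiff >= WINDOW_COUNT) {
                    for (AtomicLong c : counters) c.set(0);
                } else {
                    for (int i = 1; i <= sliceDiff; i++) {
                        int idx = (lastAccessedSlice + i) % WINDOW_COUNT;
                        counters[idx].set(0);
                    }
                }
                lastAccessedSlice = currentSlice;
            }
        }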
    }

    private void cleanupIdleCounters() {
        final long IDLE_THRESHOLD_MS = 300_000; // 5 min
        long now = System.currentTimeMillis();
        Iterator<Map.Entry<String, CounterEntry>> it = apiCounters.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, CounterEntry> e = it.next();
            if (now - e.getValue().lastUpdateTime > IDLE_THRESHOLD_MS) {
                it.remove();
            }
        }
    }
}

Advantages: finer time resolution than the fixed window (10-second slices), smoother behavior at window boundaries, and low overhead because stale slices are cleared only when a counter is next touched.
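The lazy refresh comes down to index arithmetic on a ring of six slots. As an illustration only, the sketch below isolates that arithmetic in a hypothetical helper class SliceMath (not part of the original code) that reports which slots would be cleared when a counter last touched in one slice is accessed again in a later one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that mirrors the slot arithmetic of the sliding-window counter.
final class SliceMath {
    static final int WINDOW_SIZE_SECONDS = 10;
    static final int WINDOW_COUNT = 6;

    // Absolute slice number: how many 10-second slices have elapsed since the epoch.
    static long sliceOf(long epochMillis) {
        return Math.floorDiv(epochMillis, WINDOW_SIZE_SECONDS * 1000L);
    }

    // Position of a slice in the ring of WINDOW_COUNT slots.
    static int slotOf(long slice) {
        return (int) Math.floorMod(slice, (long) WINDOW_COUNT);
    }

    // Slots that hold stale data when moving from lastSlice to currentSlice.
    static List<Integer> slotsToClear(long lastSlice, long currentSlice) {
        List<Integer> slots = new ArrayList<>();
        long diff = currentSlice - lastSlice;
        if (diff <= 0) {
            return slots; // same slice (or clock went backwards): nothing is stale
        }
        if (diff >= WINDOW_COUNT) {
            // Idle for a full minute or more: every slot is stale.
            for (int i = 0; i < WINDOW_COUNT; i++) slots.add(i);
            return slots;
        }
        // Only the slots of the slices entered since the last access are stale.
        for (long s = lastSlice + 1; s <= currentSlice; s++) slots.add(slotOf(s));
        return slots;
    }
}
```

For example, a counter last touched in slice 10 and read again in slice 12 needs slots 5 and 0 cleared (11 mod 6 and 12 mod 6); the other four slots still belong to the current one-minute window and are left alone.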

Solution 3: AOP‑based asynchronous statistics

Spring AOP weaves a counter into every controller method without changing business code. The aspect records total calls, success/failure, and latency category (fast/medium/slow) in a background thread pool.

@Aspect
@Component
public class ApiMonitorAspect {
    private final Logger logger = LoggerFactory.getLogger(ApiMonitorAspect.class);
    @Autowired private SlidingWindowCounter counter;
    private final ThreadPoolExecutor asyncExecutor = new ThreadPoolExecutor(
            2, 5, 60, TimeUnit.SECONDS,
            new LinkedBlockingQueue<>(1000),
            Executors.defaultThreadFactory(),
            new ThreadPoolExecutor.CallerRunsPolicy());

    @Pointcut("@within(org.springframework.web.bind.annotation.RestController) || @within(org.springframework.stereotype.Controller)")
    public void apiPointcut() {}

    @Around("apiPointcut()")
    public Object around(ProceedingJoinPoint joinPoint) throws Throwable {
        long startTime = System.currentTimeMillis();
        boolean success = false;
        MethodSignature signature = (MethodSignature) joinPoint.getSignature();
        String methodName = signature.getDeclaringType().getName() + "." + signature.getName();
        try {
            Object result = joinPoint.proceed();
            success = true;
            return result;
        } finally {
            long executionTime = System.currentTimeMillis() - startTime;
            // A lambda may only capture effectively final locals, so copy the mutable flag.
            final boolean succeeded = success;
            // Record metrics off the request thread; CallerRunsPolicy applies back-pressure if the queue fills.
            asyncExecutor.execute(() -> {
                try {
                    counter.increment(methodName);
                    counter.increment(methodName + ":" + (succeeded ? "success" : "failure"));
                    String speedCategory;
                    if (executionTime < 100) speedCategory = "fast";
                    else if (executionTime < 1000) speedCategory = "medium";
                    else speedCategory = "slow";
                    counter.increment(methodName + ":" + speedCategory);
                } catch (Exception ex) {
                    logger.error("Failed to record API metrics", ex);
                }
            });
        }
    }

    public long getApiCallCount(String apiName) {
        return counter.getMinuteCount(apiName);
    }
}

The aspect works like an invisible camera on every controller: each call is counted without any change to business code.

Solution 4: Distributed counting with Redis (time‑series)

In a multi-instance deployment each node maintains its own counters, so no single node sees the full picture. Storing per-minute counts in a Redis sorted set (ZSET) gives a global view: each member is a minute timestamp and its score is the call count for that minute.

@Service
public class RedisTimeSeriesCounter {
    private static final Logger logger = LoggerFactory.getLogger(RedisTimeSeriesCounter.class);
    @Autowired private StringRedisTemplate redisTemplate;
    private final int MAX_RETRIES = 3;
    private final long[] RETRY_DELAYS = {10L, 50L, 200L}; // ms

    public void increment(String apiName) {
        long timestamp = System.currentTimeMillis();
        String key = getBaseKey(apiName);
        String script = "local minute = math.floor(ARGV[1]/60000)*60000; " +
                        "redis.call('ZINCRBY', KEYS[1], 1, minute); " +
                        "redis.call('EXPIRE', KEYS[1], 86400); " +
                        "return 1;";
        Exception lastException = null;
        for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
            try {
                redisTemplate.execute(new DefaultRedisScript<>(script, Long.class),
                        Collections.singletonList(key), String.valueOf(timestamp));
                return;
            } catch (Exception e) {
                lastException = e;
                if (attempt < MAX_RETRIES - 1) {
                    try { Thread.sleep(RETRY_DELAYS[attempt]); } catch (InterruptedException ie) { Thread.currentThread().interrupt(); break; }
                }
            }
        }
        // fallback to basic ops
        try {
            logger.warn("Failed to execute Redis script after {} retries, falling back", MAX_RETRIES, lastException);
            long minuteKey = Math.floorDiv(timestamp, 60000) * 60000;
            redisTemplate.opsForZSet().incrementScore(key, String.valueOf(minuteKey), 1);
            redisTemplate.expire(key, 1, TimeUnit.DAYS);
        } catch (Exception e) {
            logger.error("Failed to increment API counter for {}", apiName, e);
        }
    }

    public long getCurrentMinuteCount(String apiName) {
        long currentMinute = Math.floorDiv(System.currentTimeMillis(), 60000) * 60000;
        return getCountByMinute(apiName, currentMinute);
    }

    public long getCountByMinute(String apiName, long minuteTimestamp) {
        String key = getBaseKey(apiName);
        Double score = redisTemplate.opsForZSet().score(key, String.valueOf(minuteTimestamp));
        return score == null ? 0 : score.longValue();
    }

    public Map<Long, Long> getCountTrend(String apiName, long startTime, long endTime) {
        String key = getBaseKey(apiName);
        long startMinute = Math.floorDiv(startTime, 60000) * 60000;
        long endMinute = Math.floorDiv(endTime, 60000) * 60000;
        // Scores hold call counts and members hold minute timestamps, so a score range
        // cannot select by time. Read all members and filter on the timestamp instead.
        Set<ZSetOperations.TypedTuple<String>> results = redisTemplate.opsForZSet()
                .rangeWithScores(key, 0, -1);
        Map<Long, Long> trend = new TreeMap<>();
        if (results != null) {
            for (ZSetOperations.TypedTuple<String> t : results) {
                if (t.getValue() == null || t.getScore() == null) continue;
                long minute = Long.parseLong(t.getValue());
                if (minute >= startMinute && minute <= endMinute) {
                    trend.put(minute, t.getScore().longValue());
                }
            }
        }
        return trend;
    }

    private String getBaseKey(String apiName) {
        return "api:timeseries:" + apiName;
    }
}

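Each minute bucket is a ZSET member whose score is the call count, so a single minute can be read with one score lookup. Note that the set is ordered by score (the count), not by time, so a time-range query has to select members by their timestamp value. The implementation retries the Lua script up to three times with increasing delays and falls back to plain incrementScore if the script keeps failing.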

Solution 5: Micrometer + Prometheus for multidimensional visualisation

Instrument the service with Micrometer and expose the metrics to Prometheus for long‑term trends, percentile latency, and per‑status breakdowns.

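// Class names assume Micrometer 1.12 or earlier (io.micrometer.prometheus.*, io.prometheus.client.CollectorRegistry).
@Configuration
public class MetricsConfig {
    @Bean
    public MeterRegistry meterRegistry() {
        PrometheusMeterRegistry registry = new PrometheusMeterRegistry(
                PrometheusConfig.DEFAULT, new CollectorRegistry(), Clock.SYSTEM);
        // Common tags are configured on the registry; the constructor does not accept them.
        registry.config().commonTags("application", "my-app", "env", "prod");
        return registry;
    }

    // Allow at most 100 distinct "uri" values on api.calls; deny meters beyond that.
    @Bean
    public MeterFilter dimensionFilter() {
        return MeterFilter.maximumAllowableTags("api.calls", "uri", 100, MeterFilter.deny());
    }

    // Once 5000 distinct "name" values have been seen, fold further ones into "other".
    // The size check is not atomic, so the cap is approximate under concurrency.
    @Bean
    public MeterFilter cardinalityLimiter() {
        Set<String> seenNames = ConcurrentHashMap.newKeySet();
        return new MeterFilter() {
            @Override
            public Meter.Id map(Meter.Id id) {
                String name = id.getTag("name");
                if (!id.getName().equals("api.calls") || name == null) return id;
                if (seenNames.contains(name)) return id;
                if (seenNames.size() < 5000) {
                    seenNames.add(name);
                    return id;
                }
                return id.withTag(Tag.of("name", "other"));
            }
        };
    }
}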

@Component
public class ApiMetricsInterceptor implements HandlerInterceptor {
    private final MeterRegistry meterRegistry;
    private final ThreadLocal<Long> startTimeHolder = new ThreadLocal<>();
    private final PathParameterResolver pathResolver = new PathParameterResolver();

    public ApiMetricsInterceptor(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
    }

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
        startTimeHolder.set(System.currentTimeMillis());
        if (handler instanceof HandlerMethod) {
            HandlerMethod hm = (HandlerMethod) handler;
            String apiName = hm.getBeanType().getName() + "." + hm.getMethod().getName();
            String uri = pathResolver.standardizePath(request.getRequestURI());
            meterRegistry.counter("api.calls", "name", apiName, "method", request.getMethod(), "uri", uri).increment();
        }
        return true;
    }

    @Override
    public void afterCompletion(HttpServletRequest request, HttpServletResponse response, Object handler, Exception ex) {
        Long startTime = startTimeHolder.get();
        // Always clear the ThreadLocal; otherwise a pooled thread keeps a stale start time
        // whenever the handler is not a HandlerMethod (for example, static resources).
        startTimeHolder.remove();
        if (handler instanceof HandlerMethod && startTime != null) {
            HandlerMethod hm = (HandlerMethod) handler;
            String apiName = hm.getBeanType().getName() + "." + hm.getMethod().getName();
            String status = String.valueOf(response.getStatus());
            long executionTime = System.currentTimeMillis() - startTime;
            meterRegistry.timer("api.latency", "name", apiName, "status", status)
                    .record(executionTime, TimeUnit.MILLISECONDS);
        }
    }

    private static class PathParameterResolver {
        // The lookahead keeps the trailing slash out of the match, so consecutive numeric
        // segments such as /orders/12/items/34 are all replaced.
        private static final Pattern NUMERIC_SEGMENT = Pattern.compile("/\\d+(?=/|$)");

        public String standardizePath(String uri) {
            // Segments such as /v1, /2fa, or /oauth2 contain non-digits and never match,
            // so they need no special casing and versioned paths are still normalized.
            return NUMERIC_SEGMENT.matcher(uri).replaceAll("/{id}");
        }
    }
}

Prometheus scrapes the counter as a monotonically increasing value. When the service restarts and the counter drops back to zero, rate() detects the reset and still computes a correct per-second rate. Example queries for per-minute QPS, 95th-percentile latency, and status-code breakdown are provided in the original article.
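To see why a restart does not break the rate calculation, the following sketch reproduces the reset handling in plain Java. It is a simplified illustration, not Prometheus code: the class name CounterIncrease is hypothetical, and the extrapolation to window boundaries that Prometheus applies is deliberately left out.

```java
// Hypothetical sketch of counter-reset handling when computing an increase over scraped samples.
final class CounterIncrease {

    // Sums the growth between consecutive samples. A sample lower than its predecessor
    // is treated as a restart from zero, so the whole new value counts as growth.
    static double increase(double[] samples) {
        double total = 0;
        for (int i = 1; i < samples.length; i++) {
            double previous = samples[i - 1];
            double current = samples[i];
            total += (current >= previous) ? current - previous : current;
        }
        return total;
    }

    // Average per-second rate over a window of the given length.
    static double perSecondRate(double[] samples, double windowSeconds) {
        return increase(samples) / windowSeconds;
    }
}
```

For the samples 100, 160, 20, 80 the increase is 60 + 20 + 60 = 140: the drop from 160 to 20 is read as a restart, and the 20 calls counted since the restart are kept rather than producing a negative delta.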

Performance test and memory consumption

JMeter generated 10 million calls across 100 endpoints on a 4‑core, 8 GB VM (100 threads, 10 min). A sudden 500 % traffic spike for 30 seconds was also injected. Results:

Fixed window: low CPU, memory grows linearly; OOM at 1 million distinct APIs.

Standard sliding window: higher accuracy, memory roughly double fixed window; OOM around 1 million APIs.

Lazy sliding window: similar accuracy, slightly lower memory (clears idle entries after 5 min).

Redis time‑series: offloads memory to the cluster, suitable for >10 k APIs.

Prometheus: best for long‑term trends, but requires careful cardinality limiting.

Approximate memory usage:

1 k APIs – Fixed: 1 MB, Sliding: 2 MB, Lazy sliding: 2 MB

10 k APIs – Fixed: 8 MB, Sliding: 20 MB, Lazy sliding: 15 MB

100 k APIs – Fixed: 70 MB, Sliding: 200 MB, Lazy sliding: 140 MB

1 M APIs – Fixed & Sliding: OOM, Lazy sliding: 1.3 GB (still high, consider Redis)

Typical problems and remedies

Memory explosion – caused by high‑cardinality URL parameters. Mitigation: cap the map size with Guava CacheBuilder.maximumSize(10000) and expire idle entries after 30 min.

Clock drift in distributed pods – leads to inconsistent window boundaries. Solutions compared: NTP on hosts, Redis‑based time service, or Kubernetes‑level PTP synchronization.

Redis write saturation under 100k+ QPS – batch local increments and flush to Redis once per second using pipelining.
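The batching remedy for Redis write saturation can be sketched as follows. This is a minimal illustration under assumptions: the class name BatchingCounter and the Consumer-based sink are hypothetical stand-ins for whatever code would send the accumulated deltas to Redis in one pipelined round trip, and flush() is expected to be driven by a once-per-second scheduler.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical sketch: accumulate increments locally and hand them over in batches.
final class BatchingCounter {
    private final ConcurrentHashMap<String, AtomicLong> pending = new ConcurrentHashMap<>();
    private final Consumer<Map<String, Long>> sink; // assumed to write the batch to Redis

    BatchingCounter(Consumer<Map<String, Long>> sink) {
        this.sink = sink;
    }

    // Hot path: an in-memory increment only, no network call.
    void increment(String apiName) {
        pending.computeIfAbsent(apiName, k -> new AtomicLong()).incrementAndGet();
    }

    // Drains the accumulated deltas and passes them to the sink as one batch.
    void flush() {
        Map<String, Long> batch = new HashMap<>();
        pending.forEach((api, count) -> {
            // getAndSet(0) is atomic, so increments racing with the flush land in the next batch.
            long delta = count.getAndSet(0);
            if (delta > 0) {
                batch.put(api, delta);
            }
        });
        if (!batch.isEmpty()) {
            sink.accept(batch);
        }
    }
}
```

The trade-off is that up to one flush interval of counts is lost if the process dies before flushing, and the global view in Redis lags by the same interval.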

Capacity planning recommendations

Fixed/Sliding window memory ≈ 15-20 MB per 10 k APIs.

JVM heap suggestions:

≤ 10 k APIs → ≥ 512 MB heap

≤ 100 k APIs → ≥ 2 GB heap

Above 100 k APIs → prefer Redis-based solution

Redis storage estimate: 100 bytes per API-minute. 1 k APIs for 7 days ≈ 1 GB; 10 k APIs ≈ 10 GB. Recommended cluster: 3 master + 3 replica nodes, 16 GB RAM each.

Prometheus: disk ≈ samples × sample-size × retention. 1 k APIs, 15 s scrape, 30 day retention → ~50 GB. Keep label cardinality ≤ 5 000 per metric.
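The Redis storage figures follow from simple arithmetic: APIs × minutes retained × bytes per API-minute. The sketch below works through it, using the 100-byte figure quoted above; the class and method names are illustrative only.

```java
// Hypothetical helper that reproduces the Redis storage estimate.
final class RedisCapacityEstimate {
    static final long BYTES_PER_API_MINUTE = 100; // figure quoted in the text

    // One entry per API per minute, kept for the whole retention period.
    static long estimateBytes(long apiCount, int retentionDays) {
        long minutes = retentionDays * 24L * 60L;
        return apiCount * minutes * BYTES_PER_API_MINUTE;
    }

    public static void main(String[] args) {
        for (long apis : new long[] {1_000L, 10_000L}) {
            double gigabytes = estimateBytes(apis, 7) / 1e9;
            System.out.printf("%d APIs, 7 days: %.2f GB%n", apis, gigabytes);
        }
    }
}
```

Seven days is 10 080 minutes, so 1 k APIs come to 1 000 × 10 080 × 100 bytes, or about 1 GB, and 10 k APIs to about 10 GB, matching the estimates above.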

Final thoughts

No single technique satisfies every requirement. A hybrid architecture—local lazy sliding window for millisecond‑level QPS, Redis time‑series for minute‑level aggregation, and Micrometer + Prometheus for long‑term multidimensional analysis—covers real‑time alerting, short‑term capacity planning, and historical trend visualisation while keeping memory and network overhead under control.

Source: juejin.cn/post/7496453510318866467
