
Why Does My Upgraded Chumby 8 Show 100% CPU Usage? Uncovering a Hidden Kernel Timer Bug

After a PXA166‑based Chumby 8 was upgraded from Linux 2.6.28 to 6.x, the top command constantly reported 100% CPU usage. The author worked through profiling, kernel source analysis, and procfs inspection before tracing the problem to a timer‑register sequencing bug, which was finally fixed by adjusting the delay in the timer_read function.

dbaplus Community

1. Confirming the bug is not introduced by Linux 6.x

The author first rolled back to an older 3.13 kernel, which reproduced the same 100% CPU usage, proving the issue was not specific to the newest kernel version.

2. Understanding how top calculates CPU usage

By enabling CONFIG_PROFILING and using readprofile, the author observed that default_idle_call consumed most of the time, indicating the CPU was actually idle.

Investigation of /proc/stat showed that top reads several fields (user, nice, system, idle, iowait, irq, softirq, etc.) and computes percentages from the differences between successive reads.
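
The delta-based calculation can be sketched as follows. This is a minimal illustration, not top's actual source: the names parse_cpu_line and cpu_percentages are hypothetical, the sample counter values are made up, and the field order is assumed to follow the aggregate cpu line described in proc(5).

```python
# Counters on the aggregate "cpu" line, in the order given by proc(5).
# The trailing guest columns are skipped because they are already
# accounted for in user and nice.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as a dict."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            # Older kernels expose fewer columns; treat missing ones as 0.
            values += [0] * (len(FIELDS) - len(values))
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(prev, curr):
    """Share of elapsed ticks spent in each state between two samples."""
    deltas = {name: curr[name] - prev[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("counters did not advance between samples")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Two made-up samples taken some time apart.
before = parse_cpu_line("cpu  100 0 50 1000 0 0 0 0 0 0\n")
after = parse_cpu_line("cpu  110 0 55 1985 0 0 0 0 0 0\n")
usage = cpu_percentages(before, after)
print(round(usage["idle"], 1))  # prints 98.5
```

Because every percentage is a share of the total delta, an idle counter that barely advances leaves the few user and system ticks making up almost the whole total.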

Experiments on a desktop PC confirmed that the idle counter increases by roughly 1000 units per 10 seconds for each idle CPU (given a USER_HZ of 100), matching the expected behavior.

Running the same test on the Chumby showed almost no increase in the idle counter, explaining why top displayed 100% usage.
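
A comparison of this kind could be scripted roughly as below. This is a sketch under stated assumptions, not the author's actual test: the helper names idle_ticks, expected_idle_ticks, idle_ratio, and sample_live are hypothetical, and the counter values in the example are invented.

```python
import os
import time


def idle_ticks(stat_text):
    """Idle counter (fourth value) of the aggregate 'cpu' line in /proc/stat."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return int(parts[4])
    raise ValueError("no aggregate 'cpu' line found")


def expected_idle_ticks(seconds, user_hz=100, cpus=1):
    """Ticks a fully idle system should add: seconds * USER_HZ per CPU."""
    return int(seconds * user_hz * cpus)


def idle_ratio(before_text, after_text, seconds, user_hz=100, cpus=1):
    """Measured idle increase as a fraction of the fully idle expectation."""
    measured = idle_ticks(after_text) - idle_ticks(before_text)
    return measured / expected_idle_ticks(seconds, user_hz, cpus)


def sample_live(seconds=10):
    """Take two real samples on a Linux machine (not called automatically)."""
    with open("/proc/stat") as f:
        before = f.read()
    time.sleep(seconds)
    with open("/proc/stat") as f:
        after = f.read()
    user_hz = os.sysconf("SC_CLK_TCK")  # USER_HZ as seen from user space
    return idle_ratio(before, after, seconds, user_hz, os.cpu_count() or 1)


# Invented single-CPU samples taken 10 seconds apart.
healthy = idle_ratio("cpu 10 0 5 5000 0 0 0 0\n", "cpu 15 0 10 5990 0 0 0 0\n", 10)
broken = idle_ratio("cpu 10 0 5 5000 0 0 0 0\n", "cpu 15 0 10 5003 0 0 0 0\n", 10)
print(healthy, broken)  # prints 0.99 0.003
```

A ratio close to 1 on an idle machine matches the desktop result, while a ratio close to 0 matches what was seen on the Chumby.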

3. Applying the OLPC workaround

Research revealed that disabling CONFIG_NO_HZ (or adding nohz=off to the kernel command line) fixed the problem on OLPC devices. Applying the same option to the Chumby kernel immediately brought the reported CPU usage back in line with the actual idle percentage.
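
To confirm that the command-line form of the workaround is active on a running system, the kernel command line can be inspected. The helper below is a hypothetical sketch, and the example command line is made up apart from the nohz=off parameter itself.

```python
def nohz_setting(cmdline):
    """Return the value of the last nohz= parameter, or None if absent."""
    setting = None
    for token in cmdline.split():
        if token.startswith("nohz="):
            setting = token.split("=", 1)[1]
    return setting


# Example command line; the console and root values are invented.
example = "console=ttyS0,115200 root=/dev/mmcblk0p2 rootwait nohz=off"
print(nohz_setting(example))  # prints off
```

On a live system the same function could be fed the contents of /proc/cmdline.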

4. Tracing the root cause to a timer‑register read sequencing issue

The author traced the idle‑time calculation to get_cpu_idle_time_us, which ultimately calls timer_read in arch/arm/mach-mmp/time.c. The original implementation writes 1 to the CVWR register, loops for a fixed delay (100 iterations), then reads the register.

Documentation indicated that the timer value can be metastable, requiring either a double‑read verification or a CVWR capture with sufficient delay.
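
The double-read option can be illustrated with a small simulation. This is only a sketch in Python rather than kernel C: read_stable and the scripted register values are hypothetical, and it assumes the counter advances slowly enough relative to the reads that two consecutive reads of a settled value will match.

```python
def read_stable(read_reg, max_tries=8):
    """Read a register until two consecutive reads return the same value.

    read_reg is any zero-argument callable returning the raw register value
    (a stand-in for a real MMIO read; hypothetical for this sketch).
    """
    previous = read_reg()
    for _ in range(max_tries):
        current = read_reg()
        if current == previous:
            return current  # two matching reads: treat the value as settled
        previous = current
    raise RuntimeError("register value never stabilised")


# Scripted reads: a good value, one glitched read, then a settled value.
scripted = iter([100, 0xFFFF, 101, 101])
value = read_stable(lambda: next(scripted))
print(value)  # prints 101
```

The timer_read function discussed here takes the other route, a CVWR capture followed by a delay, so this sketch shows the alternative rather than the code that was patched.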

Replacing the capture-and-delay read with a direct read of the counter register (__raw_readl(mmp_timer_base + TMR_CR(1))) made top report the correct idle time. Increasing the delay to about 300–500 iterations also restored correct behavior, confirming that the original 100-iteration delay was too short.

Further comparison with the 2.6.28 kernel showed it used timer 0 and performed a more robust delay, explaining why the older kernel behaved correctly.

5. The bug has existed since 2009

Historical Git analysis revealed the same buggy code was introduced when MMP support was added in 2009, with a FIXME comment noting the need for a longer delay.

The author submitted a patch in September 2022, which was eventually merged into Linux 6.2 and back‑ported to several 4.x/5.x kernels, eliminating the erroneous CPU‑usage reporting on the Chumby.

Overall, the article demonstrates a systematic approach to kernel debugging: reproducing the issue on older versions, profiling, reading procfs, tracing through kernel call stacks, and finally fixing a subtle hardware‑timer sequencing bug.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
