Boost Embedded Performance: 10 Proven C Code Optimization Tricks
This article collects practical optimization techniques for embedded C on resource‑constrained devices. It covers time efficiency (avoiding floating‑point arithmetic, reducing function calls, inlining, loop unrolling and loop ordering), space efficiency (choosing appropriate data types, unions, flexible array members, bit‑fields and structure layout), and trading space for time with lookup tables.
1. Time Efficiency Optimization
Avoid Floating‑Point Arithmetic
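A minimal, self-contained sketch of the fixed-point approach this subsection describes. The 3300 mV reference and the 12-bit resolution come from the constants used in this section; the name `adc_to_millivolts`, the rounding term and the explicit 32-bit intermediate are illustrative assumptions. The result is expressed in millivolts rather than volts, which is what makes the fractional part unnecessary.

```c
#include <assert.h>
#include <stdint.h>

#define VREF_MV   3300u  /* assumed reference voltage in millivolts */
#define ADC_BITS  12u    /* assumed ADC resolution in bits */

/* Convert a raw ADC reading to millivolts using integer math only.
 * The multiply is done in 32 bits so it cannot overflow on targets
 * where int is 16 bits wide; adding half the divisor rounds to nearest. */
static uint32_t adc_to_millivolts(uint16_t adc_value)
{
    assert(adc_value < (1u << ADC_BITS));  /* reading must fit in 12 bits */
    uint32_t scaled = (uint32_t)adc_value * VREF_MV;
    return (scaled + (1u << (ADC_BITS - 1u))) >> ADC_BITS;
}
```

With these assumptions a mid-scale reading of 2048 converts to 1650 mV. If the application needs volts, the conversion to a decimal string can usually be deferred to the point where the value is displayed.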
// Slow version: floating‑point calculation (result in volts)
float calculate_voltage(int adc_value) {
    return adc_value * 3.3f / 4096.0f;
}
// Fast version: fixed‑point calculation (result in millivolts)
// The 32-bit intermediate avoids overflow on targets where int is 16 bits wide
uint32_t calculate_voltage_fast(uint16_t adc_value) {
    return ((uint32_t)adc_value * 3300u) >> 12; // replace division by 4096 with right shift
}
Reduce Function Calls
// Slow version: frequent function calls
for (int i = 0; i < 1000; i++) {
    set_led_state(i % 2);
}
// Fast version: inline expansion
for (int i = 0; i < 1000; i++) {
    if (i % 2) {
        GPIO_SetBits(GPIOA, GPIO_Pin_5);
    } else {
        GPIO_ResetBits(GPIOA, GPIO_Pin_5);
    }
}
2. Space Efficiency Optimization
Choose Appropriate Data Types
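Narrow types only help if out-of-range values are handled before they are stored. The sketch below is one hedged way to do that: it clamps each reading to the ranges quoted in this section and checks the struct size at compile time. The names `sensor_sample_t`, `clamp_int` and `make_sample` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int8_t   temperature_c;  /* degrees Celsius, -40..125 */
    uint8_t  humidity_pct;   /* percent, 0..100 */
    uint16_t pressure_hpa;   /* hectopascals, 300..1100 */
} sensor_sample_t;

/* Layout check at compile time: 1 + 1 + 2 bytes, no padding expected. */
static_assert(sizeof(sensor_sample_t) == 4, "unexpected padding in sensor_sample_t");

/* Clamp a wide value into [lo, hi] before narrowing it. */
static int clamp_int(int v, int lo, int hi)
{
    return (v < lo) ? lo : (v > hi) ? hi : v;
}

/* Build a compact sample from wide readings, saturating at the sensor limits. */
static sensor_sample_t make_sample(int temp_c, int hum_pct, int press_hpa)
{
    sensor_sample_t s;
    s.temperature_c = (int8_t)clamp_int(temp_c, -40, 125);
    s.humidity_pct  = (uint8_t)clamp_int(hum_pct, 0, 100);
    s.pressure_hpa  = (uint16_t)clamp_int(press_hpa, 300, 1100);
    return s;
}
```

Saturating is a design choice; an application that must detect sensor faults might prefer to return an error instead of silently clamping.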
// Wasteful version
struct sensor_data {
    int temperature; // -40~125°C fits in int8_t
    int humidity;    // 0~100% fits in uint8_t
    int pressure;    // 300~1100 hPa fits in uint16_t
};
// Optimized version (fixed-width types come from <stdint.h>)
struct sensor_data_optimized {
    int8_t temperature; // -128~127
    uint8_t humidity;   // 0~255
    uint16_t pressure;  // 0~65535
}; // Memory reduced from 12 bytes to 4 bytes, assuming a 32-bit int
Use Unions to Save Space
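One hedged sketch of how the two views of a union can cooperate: fields are filled through the structured view, and a checksum is computed over the raw bytes. The type name `frame_t` and the checksum algorithm (an 8-bit sum of every byte before the checksum field) are assumptions for illustration, not a protocol the article specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef union {
    struct {
        uint8_t header[4];
        uint8_t cmd;
        uint8_t data[32];
        uint8_t checksum;
    } packet;
    uint8_t raw[38];  /* 4 + 1 + 32 + 1 bytes */
} frame_t;

/* All members are uint8_t, so no padding is expected; verify it anyway. */
static_assert(sizeof(frame_t) == 38, "frame_t must be exactly 38 bytes");

/* Hypothetical checksum: 8-bit sum of every byte before the checksum field. */
static uint8_t frame_checksum(const frame_t *f)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < sizeof(f->raw) - 1u; i++) {
        sum = (uint8_t)(sum + f->raw[i]);
    }
    return sum;
}
```

Because both views share the same storage, no copy is needed between building the packet and transmitting it.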
// Communication packet using a union
typedef union {
    struct {
        uint8_t header[4];
        uint8_t cmd;
        uint8_t data[32];
        uint8_t checksum;
    } packet;
    uint8_t raw_data[38]; // Access as raw bytes (4 + 1 + 32 + 1)
} comm_frame_t;
comm_frame_t frame;
frame.packet.cmd = 0x01; // Structured access
send_data(frame.raw_data, sizeof(frame.raw_data)); // Byte‑array access
3. Use Space to Trade for Time
Bit‑Counting Example
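The 16-entry table in this section covers a single 4-bit value. As a hedged extension, the sketch below counts the set bits of a full byte with two lookups into the same kind of table and cross-checks the result against a plain loop. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of set bits for every 4-bit value; const so it can live in read-only memory. */
static const uint8_t nibble_ones[16] = {
    0, 1, 1, 2, 1, 2, 2, 3,
    1, 2, 2, 3, 2, 3, 3, 4
};

/* Count set bits in a full byte with two table lookups. */
static unsigned count_ones_byte(uint8_t data)
{
    return (unsigned)nibble_ones[data & 0x0Fu] + nibble_ones[data >> 4];
}

/* Reference implementation used only to cross-check the table. */
static unsigned count_ones_loop(uint8_t data)
{
    unsigned cnt = 0;
    while (data != 0) {
        cnt += data & 1u;
        data >>= 1;
    }
    return cnt;
}

/* Exhaustive self-check: all 256 inputs must agree. */
static void check_count_ones(void)
{
    for (unsigned v = 0; v < 256; v++) {
        assert(count_ones_byte((uint8_t)v) == count_ones_loop((uint8_t)v));
    }
}
```

A 256-entry table would need only one lookup per byte at the cost of 240 more bytes of read-only memory, which is the space-for-time trade this section is about.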
// Slow method: loop over the low 4 bits
int count_ones_slow(unsigned char data) {
    int cnt = 0;
    unsigned char temp = data & 0xf;
    for (int i = 0; i < 4; i++) {
        if (temp & 0x01) cnt++;
        temp >>= 1;
    }
    return cnt;
}
// Fast method: lookup table indexed by the low 4 bits
// const lets most toolchains place the table in flash instead of RAM
static const unsigned char ones_table[16] = {0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4};
int count_ones_fast(unsigned char data) {
    return ones_table[data & 0xf];
}
4. Flexible Array Members
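A hedged sketch of a complete allocation helper for a struct with a flexible array member. It checks the `malloc` result and copies the payload into the trailing array. The names `message_t` and `message_create` and the `0xAA55` marker are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint16_t head;
    uint8_t  id;
    uint8_t  type;
    uint8_t  length;   /* number of valid bytes in value[] */
    uint8_t  value[];  /* flexible array member, must be last */
} message_t;

/* Allocate header and payload in one block; returns NULL on failure.
 * The caller releases the message with a single free(). */
static message_t *message_create(uint8_t id, uint8_t type,
                                 const uint8_t *payload, uint8_t len)
{
    assert(payload != NULL || len == 0);
    message_t *m = malloc(sizeof *m + len);
    if (m == NULL) {
        return NULL;
    }
    m->head   = 0xAA55u;  /* hypothetical frame marker */
    m->id     = id;
    m->type   = type;
    m->length = len;
    if (len > 0) {
        memcpy(m->value, payload, len);
    }
    return m;
}
```

Compared with a separate pointer to the payload, this layout needs one allocation and one `free`, and the payload sits directly after the header in memory.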
typedef struct {
    uint16_t head;
    uint8_t id;
    uint8_t type;
    uint8_t length;
    uint8_t value[]; // Flexible array member (C99), must be the last member
} protocol_new_t;
// Allocate structure + payload in one block; check the result for NULL before use
protocol_new_t *p = malloc(sizeof(protocol_new_t) + data_len);
5. Bit Operations
Bit‑Fields for Compact Flags
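The order in which a compiler places bit-fields inside their storage unit is implementation-defined, so bit-fields are a poor fit when the bits must match a hardware register or a wire format. A common alternative, sketched here under assumed flag names, is to keep the flags in a plain `uint8_t` and use explicit masks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names; each one owns a single bit of a uint8_t. */
#define FLAG_READY   (1u << 0)
#define FLAG_ERROR   (1u << 1)
#define FLAG_TX_BUSY (1u << 2)

static_assert((FLAG_READY | FLAG_ERROR | FLAG_TX_BUSY) <= UINT8_MAX,
              "all flags must fit in one byte");

/* Set, clear and test individual flags through explicit masks. */
static inline void flag_set(uint8_t *flags, uint8_t mask)   { *flags |= mask; }
static inline void flag_clear(uint8_t *flags, uint8_t mask) { *flags &= (uint8_t)~mask; }
static inline bool flag_test(uint8_t flags, uint8_t mask)   { return (flags & mask) != 0; }
```

Both approaches store eight flags in one byte; the mask version fixes the bit positions in the source code, while the bit-field version reads more like ordinary member access.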
// Bit-field types other than int, unsigned int and _Bool are implementation-defined
struct flags_smart {
    unsigned char flag1 : 1; // 1 bit each
    unsigned char flag2 : 1;
    unsigned char flag3 : 1;
    unsigned char flag4 : 1;
    unsigned char flag5 : 1;
    unsigned char flag6 : 1;
    unsigned char flag7 : 1;
    unsigned char flag8 : 1; // Total 1 byte on compilers that accept unsigned char here
} flags;
Bitwise Arithmetic Instead of Multiplication/Division
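Optimizing compilers usually perform this substitution themselves when the operand is a constant power of two, so the manual form matters most at low optimization levels or on simple toolchains. It is only equivalent for unsigned values: right-shifting a negative signed value is implementation-defined and does not round the way division does. A hedged sketch, with `RING_SIZE` as an assumed buffer size:

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 16u                 /* hypothetical buffer size, must be a power of two */
#define RING_MASK (RING_SIZE - 1u)

static_assert((RING_SIZE & RING_MASK) == 0u, "RING_SIZE must be a power of two");

/* Advance an index with wrap-around; equivalent to (index + 1) % RING_SIZE. */
static inline uint32_t ring_next(uint32_t index)
{
    return (index + 1u) & RING_MASK;
}

/* Unsigned shifts: multiply or divide by 2^n (n must be below 32). */
static inline uint32_t mul_pow2(uint32_t v, unsigned n) { return v << n; }
static inline uint32_t div_pow2(uint32_t v, unsigned n) { return v >> n; }
```

The masked wrap-around is a frequent use of this idea in ring buffers, where it replaces a modulo operation on every index update.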
uint32_t val = 1024;
uint32_t doubled = val << 1; // Multiply by 2
uint32_t halved = val >> 1; // Divide by 2
6. Loop Unrolling – Reduce Branch Overhead
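Full unrolling only works when the iteration count is a small compile-time constant. For a length known only at run time, a hedged sketch of partial unrolling looks like this: the main loop handles four elements per pass and a second loop picks up the remainder. The function name `sum_unrolled` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum an array with the loop body unrolled four times.
 * The second loop handles the 0..3 elements left over when
 * len is not a multiple of four. */
static uint32_t sum_unrolled(const uint16_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t sum = 0;
    size_t i = 0;

    for (; i + 4 <= len; i += 4) {
        sum += data[i];
        sum += data[i + 1];
        sum += data[i + 2];
        sum += data[i + 3];
    }
    for (; i < len; i++) {
        sum += data[i];
    }
    return sum;
}
```

Unrolling increases code size, so on flash-limited parts it is worth confirming with a measurement that the saved branches actually pay for the extra instructions.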
// Traditional loop (jump each iteration)
for (int i = 0; i < 4; i++) {
    process(array[i]);
}
// Unrolled version (no jump overhead)
process(array[0]);
process(array[1]);
process(array[2]);
process(array[3]);
7. Inline Functions – Eliminate Call Overhead
static inline void toggle_led(uint8_t pin) {
    PORT ^= (1u << pin);
}
// inline is a hint: when the compiler honors it, the body is emitted at the call site
toggle_led(LED_PIN);
8. Data‑Type Optimization
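// Inefficient: a char counter is narrower than a register on 32-bit cores, so the compiler
// may add truncation or sign-extension instructions; it also overflows if N exceeds CHAR_MAX
char i;
for (i = 0; i < N; i++) { /* ... */ }
// Efficient: int matches the native register width on most 16- and 32-bit targets
int i;
for (i = 0; i < N; i++) { /* ... */ }
9. Loop Optimization Strategies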
Put the loop with the smallest iteration count outermost and the one with the largest count innermost, so that the inner loop is set up and exited fewer times (5 times instead of 100 in the example). On targets with a data cache, also check that the reordering does not turn sequential memory accesses into strided ones. Example:
// Inefficient nesting (large outer loop)
for (int row = 0; row < 100; row++) {
    for (int col = 0; col < 5; col++) {
        sum += a[row][col];
    }
}
// Efficient nesting (small outer loop)
for (int col = 0; col < 5; col++) {
    for (int row = 0; row < 100; row++) {
        sum += a[row][col];
    }
}
Use early exit (break) when a condition is satisfied to avoid unnecessary iterations.
bool found = false;
for (int i = 0; i < 10000; i++) {
    if (list[i] == target) {
        found = true;
        break; // exit immediately
    }
}
10. Structure Memory‑Alignment Optimization
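Padding depends on the target ABI, so it is safer to let the compiler confirm a layout than to count bytes by hand. The following sketch uses `offsetof` and `static_assert` for that purpose; the struct names are hypothetical, and the asserted values assume natural alignment (2-byte `uint16_t`, 4-byte `uint32_t`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Members declared without regard to alignment. */
struct layout_padded {
    uint8_t  a;
    uint16_t b;
    uint8_t  c;
    uint32_t d;
    uint8_t  e;
};

/* Same members, grouped so that the small fields share an aligned slot. */
struct layout_reordered {
    uint8_t  a;
    uint8_t  c;
    uint16_t b;
    uint32_t d;
    uint8_t  e;
};

/* A failing assertion flags a target where the alignment assumption is wrong. */
static_assert(offsetof(struct layout_padded, b) == 2, "1 padding byte after a");
static_assert(offsetof(struct layout_padded, d) == 8, "3 padding bytes after c");
static_assert(sizeof(struct layout_padded) == 16, "padded layout is 16 bytes");
static_assert(offsetof(struct layout_reordered, d) == 4, "no padding before d");
static_assert(sizeof(struct layout_reordered) == 12, "reordered layout is 12 bytes");
```

Both structs still end with three bytes of tail padding so that `d` stays aligned in arrays of the struct; reordering removes only the interior padding.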
// Unoptimized layout (16 bytes due to padding, assuming 2-byte short and 4-byte int)
struct waste_memory {
    char a;  // 1 byte + 1 byte padding so that b is 2-byte aligned
    short b; // 2 bytes
    char c;  // 1 byte + 3 bytes padding so that d is 4-byte aligned
    int d;   // 4 bytes
    char e;  // 1 byte + 3 bytes padding at end
};
// Optimized layout (12 bytes)
struct save_memory {
    char a;
    char c;
    short b; // placed after two chars for natural alignment
    int d;
    char e;  // 1 byte + 3 bytes padding at end
};
Key Takeaways
Measure first, then optimize – use profiling tools to locate real bottlenecks.
Balance performance with readability and maintainability.
Prioritize hotspot code for incremental improvements.
Validate correctness after each optimization.
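As a hedged illustration of the first and last takeaways, the sketch below times a candidate function with the standard `clock()` call. On a microcontroller a hardware timer or cycle counter would normally replace `clock()`; the names `time_calls`, `workload` and `sink` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Run fn() `iterations` times and return the elapsed processor time in clock ticks. */
static clock_t time_calls(void (*fn)(void), uint32_t iterations)
{
    assert(fn != NULL);
    clock_t start = clock();
    for (uint32_t i = 0; i < iterations; i++) {
        fn();
    }
    return clock() - start;
}

/* Hypothetical workload; volatile keeps the compiler from removing the loop. */
static volatile uint32_t sink;

static void workload(void)
{
    for (uint32_t i = 0; i < 100u; i++) {
        sink += i;
    }
}
```

Timing the original and the optimized variant with the same harness, and comparing their outputs on the same inputs, covers both "measure first" and "validate correctness" in one step.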
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge: fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc.
