
Unlocking the Secrets of the eBPF Virtual Machine: Registers, Bytecode, and Kernel Calls

This article delves into the eBPF virtual machine's architecture, detailing its register set, instruction classes, kernel helper invocation, and step‑by‑step analysis of sample bytecode that counts network packets, providing a solid foundation for advanced eBPF tooling.

Qingyun Technology Community

This article continues the eBPF overview series, focusing on the virtual machine, its registers, instruction set, kernel helper functions, and a detailed walkthrough of the bytecode used in the first part's example program.

Virtual Machine

eBPF is a RISC-style register machine with eleven 64-bit registers (r0 to r10), an implicit program counter, and a fixed 512-byte stack. Ten registers (r0 to r9) are general-purpose and read-write, while r10 is a read-only frame pointer for the stack. The program counter cannot be read or written directly; a program can only jump to an offset relative to it. All registers are 64 bits wide, but the lower 32 bits of each can be addressed as a sub-register, which is useful when cross-compiling for 32-bit embedded devices.
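To make the sub-register behaviour concrete, the sketch below models a 64-bit add and a 32-bit add on plain C integers. It assumes the usual eBPF rule that a 32-bit ALU operation zeroes the upper half of the destination register. The function names alu64_add and alu32_add are illustrative only and are not kernel APIs.

```c
#include <stdint.h>

/* 64-bit add: operates on the full register width. */
static uint64_t alu64_add(uint64_t dst, uint64_t src)
{
    return dst + src;
}

/* 32-bit add on the lower sub-register: only the low 32 bits take part,
 * and the upper 32 bits of the result are zero (assumed eBPF semantics). */
static uint64_t alu32_add(uint64_t dst, uint64_t src)
{
    uint32_t low = (uint32_t)dst + (uint32_t)src; /* wraps modulo 2^32 */
    return (uint64_t)low;                         /* zero-extended */
}
```

With these definitions, adding 1 to 0xffffffff carries into bit 32 in the 64-bit version, while the 32-bit version wraps around to 0.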

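The registers are:

r0: return value of helper calls and of the program itself
r1 to r5: arguments for function calls
r6 to r9: preserved across helper calls
r10: read-only frame pointer to the 512-byte stack

[Figure: eBPF registers diagram]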

The eBPF program type, supplied at load time, determines which kernel helper functions may be called and the meaning of the return value stored in r0.

Function calls take up to five arguments, passed in registers r1 to r5. Each argument must be either a plain value or a pointer to the eBPF stack; direct pointers to arbitrary memory are not allowed. Data therefore has to be placed on the eBPF stack before it is used, a restriction that simplifies the memory model and makes verification easier.

Kernel helper functions are defined in the kernel core using the BPF_CALL_* macros, and the header bpf.h documents the available helpers, for example bpf_trace_printk. The verifier checks that the contents of the argument registers match the parameter types each helper expects.

eBPF instructions are fixed‑size 64‑bit encodings grouped into eight classes covering loads/stores (1–8 bytes), conditional/unconditional jumps, arithmetic/logic operations, and function calls. Detailed opcode formats are documented in the Cilium instruction set guide and the IOVisor specification.
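As a rough illustration of the class grouping, the instruction class occupies the three least significant bits of the opcode byte. The sketch below extracts it; the enum and function names are made up for this example, while the numeric values are assumed to follow recent kernel UAPI headers (older headers name 0x06 differently).

```c
#include <stdint.h>

/* Instruction classes, stored in the low 3 bits of the opcode.
 * Names are local to this sketch; values assumed from recent kernel headers. */
enum insn_class {
    CLS_LD    = 0x00, /* non-standard loads, e.g. packet access */
    CLS_LDX   = 0x01, /* load from memory into a register */
    CLS_ST    = 0x02, /* store an immediate to memory */
    CLS_STX   = 0x03, /* store a register to memory */
    CLS_ALU   = 0x04, /* 32-bit arithmetic and logic */
    CLS_JMP   = 0x05, /* 64-bit jumps, calls, exit */
    CLS_JMP32 = 0x06, /* 32-bit jumps */
    CLS_ALU64 = 0x07  /* 64-bit arithmetic and logic */
};

/* Return the class of an opcode byte. */
static enum insn_class insn_class_of(uint8_t opcode)
{
    return (enum insn_class)(opcode & 0x07);
}

/* Return a printable name for the class of an opcode byte. */
static const char *insn_class_name(uint8_t opcode)
{
    static const char *const names[8] = {
        "LD", "LDX", "ST", "STX", "ALU", "JMP", "JMP32", "ALU64"
    };
    return names[opcode & 0x07];
}
```

For instance, opcode 0x15 (a jump-if-equal on an immediate) falls in the JMP class, and 0x63 (a 32-bit register store) falls in STX.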

Bytecode Example

This section revisits the bytecode from the first part, taken from the sample program sock_example.c, which counts TCP, UDP, and ICMP packets received on the loopback interface.

At a high level, the program reads the protocol field from the packet, uses it as a key for a map lookup, increments the corresponding counter, and exits.
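The same logic can be sketched in ordinary user-space C. This is not the eBPF program itself: the function count_packet and the plain counters array are hypothetical stand-ins for the program and its map, and the frame is assumed to be an Ethernet frame carrying an IPv4 header without VLAN tags.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN        14 /* length of an Ethernet header */
#define IP_PROTO_OFFSET 9  /* offsetof(struct iphdr, protocol) */

/* Read the IP protocol byte from a raw frame and bump its counter.
 * Returns 0 on success, -1 if the frame is too short to hold the field. */
static int count_packet(const uint8_t *frame, size_t len,
                        uint64_t counters[256])
{
    size_t pos = ETH_HLEN + IP_PROTO_OFFSET;

    if (len <= pos)
        return -1;

    /* The protocol number (6 = TCP, 17 = UDP, 1 = ICMP) is the key. */
    counters[frame[pos]] += 1;
    return 0;
}
```

Because the key is a single byte, 256 counters cover every possible protocol value in this sketch.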

The bytecode is an array of 64-bit instructions, each described by struct bpf_insn:

struct bpf_insn {
    __u8  code;      /* opcode */
    __u8  dst_reg:4; /* destination register */
    __u8  src_reg:4; /* source register */
    __s16 off;       /* signed offset */
    __s32 imm;       /* signed immediate constant */
};

msb                                                        lsb
+------------------------+----------------+----+----+--------+
| immediate              | offset         |src |dst | opcode |
+------------------------+----------------+----+----+--------+
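Reading the layout above from the least significant bit upward, the fields of an instruction can be pulled out of a 64-bit word with shifts and masks. The following is a minimal sketch that assumes the instruction has been read as a little-endian 64-bit integer; struct decoded_insn and decode_insn are names invented for this example.

```c
#include <stdint.h>

/* Unpacked view of one instruction (illustrative type, not a kernel one). */
struct decoded_insn {
    uint8_t code;    /* opcode, bits 0..7 */
    uint8_t dst_reg; /* destination register, bits 8..11 */
    uint8_t src_reg; /* source register, bits 12..15 */
    int16_t off;     /* signed offset, bits 16..31 */
    int32_t imm;     /* signed immediate, bits 32..63 */
};

/* Split a 64-bit instruction word into its five fields. */
static struct decoded_insn decode_insn(uint64_t word)
{
    struct decoded_insn d;

    d.code    = (uint8_t)(word & 0xff);
    d.dst_reg = (uint8_t)((word >> 8) & 0x0f);
    d.src_reg = (uint8_t)((word >> 12) & 0x0f);
    d.off     = (int16_t)(uint16_t)(word >> 16);
    d.imm     = (int32_t)(uint32_t)(word >> 32);
    return d;
}
```

As an example, the word 0x00000000fffc0a63 decodes to opcode 0x63, destination register 10, source register 0, offset -4, and immediate 0, which corresponds to a 32-bit store of r0 at r10 - 4.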

The macro BPF_JMP_IMM encodes a conditional jump that compares a register against an immediate value. Its definition shows how the instruction class (BPF_JMP), the operation (masked with BPF_OP), and the source-operand flag (BPF_K, meaning the operand is the immediate) are combined into the opcode.

#define BPF_OP(code)   ((code) & 0xf0)
#define BPF_K          0x00
/* conditional jump on immediate, if (dst_reg 'op' imm32) goto pc+off16 */
#define BPF_JMP_IMM(OP, DST, IMM, OFF) \
    ((struct bpf_insn) { \
        .code = BPF_JMP | BPF_OP(OP) | BPF_K, \
        .dst_reg = DST, \
        .src_reg = 0, \
        .off = OFF, .imm = IMM })

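For example, BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2) assembles to the 64-bit value 0x020015: opcode 0x15 (BPF_JMP | BPF_JEQ | BPF_K), both register fields 0, offset 2, and immediate 0. In the sample program it skips the next two instructions when r0 is 0, that is, when the map lookup returned NULL.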
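The encoding can be checked by packing the fields by hand. The sketch below defines the three opcode constants locally, with values assumed to match the kernel headers, and uses a hypothetical helper encode_insn that builds the 64-bit word without relying on compiler-specific bitfield layout.

```c
#include <assert.h>
#include <stdint.h>

#define BPF_JMP 0x05 /* instruction class: jump */
#define BPF_JEQ 0x10 /* operation: jump if equal */
#define BPF_K   0x00 /* source operand is the immediate */

/* Pack the five instruction fields into one little-endian 64-bit word. */
static uint64_t encode_insn(uint8_t code, uint8_t dst, uint8_t src,
                            int16_t off, int32_t imm)
{
    assert(dst <= 10 && src <= 10); /* only r0..r10 exist */

    return ((uint64_t)(uint32_t)imm << 32) |
           ((uint64_t)(uint16_t)off << 16) |
           ((uint64_t)src << 12) |
           ((uint64_t)dst << 8) |
           (uint64_t)code;
}
```

Calling encode_insn(BPF_JMP | BPF_JEQ | BPF_K, 0, 0, 2, 0) yields 0x020015, the same value the macro produces for the jump in the sample program.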

Step‑by‑step walkthrough of the sample bytecode (relevant macros shown):

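BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),                            /* r6 = r1 (context pointer) */
BPF_LD_ABS(BPF_B, ETH_HLEN + offsetof(struct iphdr, protocol)), /* r0 = IP protocol byte (uses r6 implicitly) */
BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_0, -4),                  /* *(u32 *)(r10 - 4) = r0 */
BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),                           /* r2 = r10 (frame pointer) */
BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -4),                          /* r2 = r10 - 4, pointer to the key */
BPF_LD_MAP_FD(BPF_REG_1, map_fd),                               /* r1 = map (occupies two instruction slots) */
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem), /* r0 = map_lookup_elem(r1, r2) */
BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),                          /* if (r0 == 0) skip the next two instructions */
BPF_MOV64_IMM(BPF_REG_1, 1),                                    /* r1 = 1 */
BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_DW, BPF_REG_0, BPF_REG_1, 0, 0), /* atomic: *(u64 *)(r0 + 0) += r1 */
BPF_MOV64_IMM(BPF_REG_0, 0),                                    /* r0 = 0 */
BPF_EXIT_INSN()                                                 /* return r0 */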

These instructions save the context pointer in r6, load the protocol byte into r0, store it at r10 - 4 on the stack, and pass a pointer to that slot in r2 as the map key. After the lookup, the program skips the increment if the helper returned NULL; otherwise it atomically adds 1 to the counter. In both cases it then sets r0 to 0 and exits.
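A hand translation of these steps into plain C may help connect the registers to familiar constructs. Everything here is a user-space model under stated assumptions: struct toy_map and toy_map_lookup are hypothetical stand-ins for the kernel map and the lookup helper, the protocol byte is passed in directly instead of being read from a packet, and the increment is a normal addition rather than the atomic add the bytecode performs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 512 /* size of the eBPF stack */

/* Hypothetical map model: a counter per key, plus a presence flag. */
struct toy_map {
    uint64_t counters[256];
    uint8_t  present[256];
};

/* Stand-in for the lookup helper: returns a pointer to the value, or NULL. */
static uint64_t *toy_map_lookup(struct toy_map *map, const void *key_ptr)
{
    uint32_t key;

    memcpy(&key, key_ptr, sizeof(key));
    if (key >= 256 || !map->present[key])
        return NULL;
    return &map->counters[key];
}

/* Mirrors the sample bytecode step by step; returns the final r0. */
static uint64_t run_sample(struct toy_map *map, uint8_t protocol)
{
    uint8_t stack[STACK_SIZE];
    uint8_t *r10 = stack + STACK_SIZE; /* frame pointer: top of the stack */
    uint32_t r0_key = protocol;        /* r0 = protocol byte */
    uint64_t *r0_value;

    memcpy(r10 - 4, &r0_key, sizeof(r0_key)); /* *(u32 *)(r10 - 4) = r0 */
    r0_value = toy_map_lookup(map, r10 - 4);  /* r1 = map, r2 = r10 - 4, call */

    if (r0_value != NULL)                     /* if (r0 == 0) skip the add */
        *r0_value += 1;                       /* r1 = 1; *(u64 *)r0 += r1 */

    return 0;                                 /* r0 = 0; exit */
}
```

Running run_sample with a key that is present increments its counter, while a missing key takes the skip path and leaves the map unchanged; both paths return 0.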

Summary

We examined the eBPF virtual machine's registers, its instruction classes, and the way kernel helper functions are invoked from raw bytecode, which is enough to fully understand the example from part 1. Future articles will explore compiling higher-level languages to eBPF and more advanced use cases.
