
Understanding Linux Netlink: Mechanism, User‑Space API, and Kernel Implementation

This article explains the Linux Netlink mechanism, its protocol families, advantages over ioctl and /proc, detailed user‑space socket usage, and the kernel‑side APIs required to create, send, receive, and manage Netlink messages, including code examples.

Deepin Linux

Linux Netlink is a communication mechanism between the kernel and user space that allows user programs to send requests or receive events and data via a socket interface. It is primarily used for inter‑process communication (IPC) and network‑related operations such as network configuration, routing table management, and socket state monitoring.

The Netlink protocol family supports multiple protocols, each handling a different type of information exchange; for example, NETLINK_ROUTE deals with routing tables, interfaces, and addresses, while NETLINK_SOCK_DIAG provides socket status and statistics.

Netlink is a special socket family unique to Linux, similar to BSD's AF_ROUTE but far more powerful. Many kernel subsystems, and the user‑space tools that talk to them, rely on Netlink, including routing daemons, the 1‑wire subsystem, user‑mode socket protocols, firewalls, socket monitoring, netfilter logging, IPsec policies, SELinux notifications, the iSCSI subsystem, audit, forwarding information base lookup, the connector, the netfilter subsystem, the IPv6 firewall, DECnet routing, kernel event notifications, and the generic Netlink protocol.

Netlink offers a convenient bidirectional data transfer method: user‑space applications use standard socket APIs, while kernel‑space code uses dedicated kernel APIs.

Compared with system calls, ioctl, and the /proc filesystem, Netlink has several advantages:

Adding a new Netlink protocol only requires defining a new protocol type in include/linux/netlink.h (e.g., #define NETLINK_MYTEST 17), whereas a system call needs a new syscall number, ioctl needs a new device file, and /proc requires a new entry.

Netlink is asynchronous; messages are queued in the socket buffer, so the sender does not wait for the receiver, unlike the synchronous nature of system calls and ioctl.

The kernel part of Netlink can be implemented as a loadable module without compile‑time dependencies, unlike system calls that must be statically linked into the kernel.

Netlink supports multicast groups, allowing a single send to reach multiple listeners.

The kernel can initiate a Netlink session, while system calls and ioctl can only be invoked from user space.

Because Netlink uses the standard socket API, it is easy to use without specialized training.

User‑Space Use of Netlink

User‑space programs use the standard socket functions socket(), bind(), sendmsg(), recvmsg(), and close(). The required header files are linux/netlink.h and sys/socket.h.

To create a Netlink socket:

socket(AF_NETLINK, SOCK_RAW, netlink_type)

The first argument must be AF_NETLINK (or PF_NETLINK), the second argument is SOCK_RAW or SOCK_DGRAM, and the third argument specifies the Netlink protocol (e.g., a custom NETLINK_MYTEST or the generic NETLINK_GENERIC). Predefined protocol numbers include:

#define NETLINK_ROUTE           0       /* Routing/device hook */
#define NETLINK_W1              1       /* 1‑wire subsystem */
#define NETLINK_USERSOCK        2       /* Reserved for user‑mode socket protocols */
#define NETLINK_FIREWALL        3       /* Firewalling hook */
#define NETLINK_INET_DIAG       4       /* INET socket monitoring */
#define NETLINK_NFLOG           5       /* netfilter/iptables ULOG */
#define NETLINK_XFRM            6       /* ipsec */
#define NETLINK_SELINUX         7       /* SELinux event notifications */
#define NETLINK_ISCSI           8       /* Open‑iSCSI */
#define NETLINK_AUDIT           9       /* auditing */
#define NETLINK_FIB_LOOKUP      10
#define NETLINK_CONNECTOR       11
#define NETLINK_NETFILTER       12      /* netfilter subsystem */
#define NETLINK_IP6_FW          13
#define NETLINK_DNRTMSG         14      /* DECnet routing messages */
#define NETLINK_KOBJECT_UEVENT  15      /* Kernel messages to userspace */
#define NETLINK_GENERIC         16

Each Netlink protocol can have up to 32 multicast groups, each represented by a bit. Multicast reduces the number of system calls needed for broadcasting messages.
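The bit-per-group encoding can be sketched as a small conversion from a 1-based group number to the mask stored in nl_groups. The helper name nl_group_mask is hypothetical and not part of any kernel or libc API.

```c
#include <stdint.h>

/* Convert a 1-based multicast group number (1..32) into the bit used in
 * sockaddr_nl.nl_groups. Returns 0 for numbers outside that range.
 * nl_group_mask is a hypothetical helper name, not a kernel API. */
uint32_t nl_group_mask(unsigned int group)
{
    if (group < 1 || group > 32)
        return 0;
    return (uint32_t)1 << (group - 1);
}
```

A socket that wants groups 1 and 5 would bind with nl_groups set to nl_group_mask(1) | nl_group_mask(5).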

The bind() call associates a Netlink socket with a local address defined by struct sockaddr_nl:

struct sockaddr_nl {
  sa_family_t    nl_family;
  unsigned short nl_pad;
  __u32          nl_pid;
  __u32          nl_groups;
};

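nl_family must be AF_NETLINK (PF_NETLINK has the same value), nl_pad is always zero, nl_pid is the unicast address of the socket, conventionally the process ID of its owner (zero addresses the kernel), and nl_groups is a bitmask of the multicast groups to join (zero means no group). nl_pid only has to be unique among the sockets of one Netlink protocol; passing zero to bind() asks the kernel to choose a unique value.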
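As a minimal sketch of the field rules above, the following helper fills a struct sockaddr_nl. The name nl_make_addr is an assumption made for illustration.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Build a Netlink address. port_id 0 means "the kernel" when used as a
 * destination; groups is the multicast bitmask (0 for none).
 * nl_make_addr is a hypothetical helper name. */
struct sockaddr_nl nl_make_addr(uint32_t port_id, uint32_t groups)
{
    struct sockaddr_nl addr;

    memset(&addr, 0, sizeof(addr));   /* also zeroes nl_pad */
    addr.nl_family = AF_NETLINK;
    addr.nl_pid = port_id;
    addr.nl_groups = groups;
    return addr;
}
```

For example, nl_make_addr(0, 0) yields the destination address of the kernel, and nl_make_addr(getpid(), 1) a local address that also joins group 1.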

When several threads of one process each open their own Netlink socket for the same protocol, the process ID alone is no longer unique, so nl_pid can be set to a custom value, for example:

pthread_self() << 16 | getpid();

Binding is performed as:

bind(fd, (struct sockaddr*)&nladdr, sizeof(struct sockaddr_nl));
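Putting socket() and bind() together might look like the sketch below. It assumes the caller is content to let the kernel pick the address (nl_pid left at zero) and reads the chosen value back with getsockname(). The function name nl_open_bound is hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Open a Netlink socket for `protocol`, bind it, and report the address
 * the kernel assigned. Returns the fd, or -1 on failure (errno is set).
 * nl_open_bound is a hypothetical helper name. */
int nl_open_bound(int protocol, uint32_t groups, uint32_t *port_id)
{
    struct sockaddr_nl addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_NETLINK, SOCK_RAW, protocol);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_pid = 0;            /* 0: let the kernel choose a unique value */
    addr.nl_groups = groups;

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        int saved = errno;      /* keep the original error across close() */
        close(fd);
        errno = saved;
        return -1;
    }
    if (port_id)
        *port_id = addr.nl_pid;
    return fd;
}
```

A caller could use nl_open_bound(NETLINK_ROUTE, 0, &port) and later compare port against the nlmsg_pid of replies.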

To send a Netlink message, the sendmsg() call uses a struct msghdr that references the destination address (nladdr) and the message header struct nlmsghdr:

struct msghdr msg;
memset(&msg, 0, sizeof(msg));
msg.msg_name = (void *)&(nladdr);
msg.msg_namelen = sizeof(nladdr);

The Netlink message header is defined as:

struct nlmsghdr {
  __u32 nlmsg_len;   /* Length of message */
  __u16 nlmsg_type;  /* Message type */
  __u16 nlmsg_flags; /* Additional flags */
  __u32 nlmsg_seq;   /* Sequence number */
  __u32 nlmsg_pid;   /* Sending process PID */
};

Common flag values include:

#define NLM_F_REQUEST   1       /* Request message */
#define NLM_F_MULTI     2       /* Multipart message */
#define NLM_F_ACK       4       /* Request acknowledgment */
#define NLM_F_ECHO      8       /* Echo request */
/* Modifiers for GET requests */
#define NLM_F_ROOT      0x100   /* Return whole table */
#define NLM_F_MATCH     0x200   /* Return matching subset */
#define NLM_F_ATOMIC    0x400   /* Atomic GET */
#define NLM_F_DUMP      (NLM_F_ROOT|NLM_F_MATCH)
/* Modifiers for NEW requests (same bit values, different meaning) */
#define NLM_F_REPLACE   0x100   /* Override existing */
#define NLM_F_EXCL      0x200   /* Fail if exists */
#define NLM_F_CREATE    0x400   /* Create if not exists */
#define NLM_F_APPEND    0x800   /* Append to list */

For most applications the flags can be set to zero; advanced users (e.g., netfilter or routing daemons) may need specific flags. The sequence number and PID help correlate requests and replies.
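One way to use the sequence number and PID for correlation is sketched below: the requester remembers the sequence number it sent and the address its socket is bound to, then checks each incoming header against both. The helper name nl_reply_matches is hypothetical, and the sketch assumes the responder echoes both fields, which is a convention rather than something the socket layer enforces.

```c
#include <stdint.h>
#include <linux/netlink.h>

/* Return 1 if `reply` answers the request sent with `expected_seq` from
 * the socket bound to `my_port_id`, 0 otherwise.
 * nl_reply_matches is a hypothetical helper name. */
int nl_reply_matches(const struct nlmsghdr *reply,
                     uint32_t expected_seq, uint32_t my_port_id)
{
    return reply->nlmsg_seq == expected_seq &&
           reply->nlmsg_pid == my_port_id;
}
```

Messages that fail the check can be treated as unrelated traffic, such as multicast notifications, instead of being parsed as the awaited reply.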

Example of constructing a message:

#define MAX_MSGSIZE 1024
char buffer[] = "An example message";
struct nlmsghdr *nlhdr;  /* pointer: header and payload share one allocation */
nlhdr = (struct nlmsghdr *)malloc(NLMSG_SPACE(MAX_MSGSIZE));
memset(nlhdr, 0, NLMSG_SPACE(MAX_MSGSIZE));
strcpy(NLMSG_DATA(nlhdr), buffer);
nlhdr->nlmsg_len = NLMSG_LENGTH(strlen(buffer) + 1);  /* include the terminating NUL */
nlhdr->nlmsg_pid = getpid();  /* self pid */
nlhdr->nlmsg_flags = 0;
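The same construction can be wrapped in a function that sizes the allocation from the payload instead of a fixed maximum. This is a sketch: nl_build_text is a hypothetical name, and using NLMSG_MIN_TYPE as the message type is an assumption, since the type values of a custom protocol are up to its author.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <linux/netlink.h>

/* Allocate a Netlink message carrying `text`, including its NUL byte.
 * The caller frees the result. nl_build_text is a hypothetical helper. */
struct nlmsghdr *nl_build_text(const char *text, uint32_t seq)
{
    size_t payload = strlen(text) + 1;
    /* NLMSG_SPACE rounds header + payload up to a 4-byte boundary */
    struct nlmsghdr *nlh = calloc(1, NLMSG_SPACE(payload));

    if (!nlh)
        return NULL;
    nlh->nlmsg_len = NLMSG_LENGTH(payload);  /* header + unpadded payload */
    nlh->nlmsg_type = NLMSG_MIN_TYPE;        /* assumption: first non-reserved type */
    nlh->nlmsg_flags = 0;
    nlh->nlmsg_seq = seq;
    nlh->nlmsg_pid = (uint32_t)getpid();
    memcpy(NLMSG_DATA(nlh), text, payload);
    return nlh;
}
```

With a 16-byte header, the 6-byte payload "hello" gives nlmsg_len 22, while the allocation is 24 bytes because of alignment padding.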

The struct iovec allows sending multiple buffers in a single system call:

struct iovec iov;
iov.iov_base = (void *)nlhdr;
iov.iov_len = nlhdr->nlmsg_len;
msg.msg_iov = &iov;
msg.msg_iovlen = 1;

Sending is performed with:

sendmsg(fd, &msg, 0);
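To see the whole send path against a protocol that exists on a stock kernel, the sketch below builds an RTM_GETLINK dump request for NETLINK_ROUTE and sends it to the kernel. The struct link_dump_req and both function names are hypothetical; RTM_GETLINK and struct rtgenmsg come from linux/rtnetlink.h.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical request layout: Netlink header plus rtnetlink generic header. */
struct link_dump_req {
    struct nlmsghdr nlh;
    struct rtgenmsg gen;
};

/* Fill a request asking the kernel to dump all network interfaces. */
void nl_fill_link_dump(struct link_dump_req *req, uint32_t seq)
{
    memset(req, 0, sizeof(*req));
    req->nlh.nlmsg_len = NLMSG_LENGTH(sizeof(req->gen));
    req->nlh.nlmsg_type = RTM_GETLINK;
    req->nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req->nlh.nlmsg_seq = seq;
    req->gen.rtgen_family = AF_UNSPEC;
}

/* Send `len` bytes from `buf` to the kernel (destination nl_pid 0). */
ssize_t nl_send_to_kernel(int fd, void *buf, size_t len)
{
    struct sockaddr_nl kernel;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;

    memset(&kernel, 0, sizeof(kernel));
    kernel.nl_family = AF_NETLINK;      /* nl_pid stays 0: the kernel */

    memset(&msg, 0, sizeof(msg));
    msg.msg_name = &kernel;
    msg.msg_namelen = sizeof(kernel);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    return sendmsg(fd, &msg, 0);
}
```

A caller would fill the request with nl_fill_link_dump(&req, seq) and pass &req with req.nlh.nlmsg_len to nl_send_to_kernel() on a NETLINK_ROUTE socket.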

Receiving uses a similar setup with a large buffer, then recvmsg() :

#define MAX_NL_MSG_LEN 1024
struct sockaddr_nl nladdr;
struct msghdr msg;
struct iovec iov;
struct nlmsghdr *nlhdr;
nlhdr = (struct nlmsghdr *)malloc(MAX_NL_MSG_LEN);
iov.iov_base = (void *)nlhdr;
iov.iov_len = MAX_NL_MSG_LEN;
memset(&msg, 0, sizeof(msg));  /* no stale control-data or flag fields */
msg.msg_name = (void *)&(nladdr);
msg.msg_namelen = sizeof(nladdr);
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
recvmsg(fd, &msg, 0);

After reception, nlhdr points to the message header, nladdr holds the source address, and the macro NLMSG_DATA(nlhdr) returns a pointer to the payload.
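Before trusting a received buffer it can help to check where it came from and whether it fit. The sketch below assumes the program only expects messages from the kernel (source nl_pid 0) and rejects truncated datagrams; both function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/netlink.h>

/* Return 1 if the datagram described by `msg` is complete and was sent
 * by the kernel, 0 otherwise. */
int nl_datagram_ok(const struct msghdr *msg)
{
    const struct sockaddr_nl *src = msg->msg_name;

    if (msg->msg_flags & MSG_TRUNC)
        return 0;                   /* receive buffer was too small */
    if (src == NULL || msg->msg_namelen != sizeof(*src))
        return 0;                   /* no usable source address */
    return src->nl_family == AF_NETLINK && src->nl_pid == 0;
}

/* Receive one datagram into `buf`; returns its length, or -1 on error
 * or when the checks above fail. */
ssize_t nl_recv_checked(int fd, void *buf, size_t cap)
{
    struct sockaddr_nl src;
    struct iovec iov = { .iov_base = buf, .iov_len = cap };
    struct msghdr msg;
    ssize_t n;

    memset(&src, 0, sizeof(src));
    memset(&msg, 0, sizeof(msg));
    msg.msg_name = &src;
    msg.msg_namelen = sizeof(src);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    n = recvmsg(fd, &msg, 0);
    if (n < 0 || !nl_datagram_ok(&msg))
        return -1;
    return n;
}
```

Programs that exchange messages with other user-space processes would relax the nl_pid test and compare against the expected peer address instead.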

Convenient macros for handling Netlink messages are defined in linux/netlink.h:

#define NLMSG_ALIGNTO   4
#define NLMSG_ALIGN(len) ( ((len)+NLMSG_ALIGNTO-1) & ~(NLMSG_ALIGNTO-1) )
#define NLMSG_LENGTH(len) ((len)+NLMSG_ALIGN(sizeof(struct nlmsghdr)))
#define NLMSG_SPACE(len) NLMSG_ALIGN(NLMSG_LENGTH(len))
#define NLMSG_DATA(nlh)  ((void*)(((char*)nlh) + NLMSG_LENGTH(0)))
#define NLMSG_NEXT(nlh,len)      ((len) -= NLMSG_ALIGN((nlh)->nlmsg_len), \
                      (struct nlmsghdr*)(((char*)(nlh)) + NLMSG_ALIGN((nlh)->nlmsg_len)))
#define NLMSG_OK(nlh,len) ((len) >= (int)sizeof(struct nlmsghdr) && \
                           (nlh)->nlmsg_len >= sizeof(struct nlmsghdr) && \
                           (nlh)->nlmsg_len <= (len))
#define NLMSG_PAYLOAD(nlh,len) ((nlh)->nlmsg_len - NLMSG_SPACE((len)))
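A single recvmsg() can return several messages back to back, which is what NLMSG_OK and NLMSG_NEXT are for. The sketch below walks such a buffer and counts the data messages, stopping at the standard NLMSG_DONE or NLMSG_ERROR types; nl_count_messages is a hypothetical name, and a real program would process each payload where this one increments a counter.

```c
#include <stddef.h>
#include <linux/netlink.h>

/* Count the messages in a receive buffer holding `len` valid bytes.
 * Stops at the first NLMSG_DONE or NLMSG_ERROR header, or as soon as the
 * remaining bytes no longer hold a complete message. */
int nl_count_messages(void *buf, size_t len)
{
    struct nlmsghdr *nlh = buf;
    int remaining = (int)len;   /* NLMSG_NEXT decrements this in place */
    int count = 0;

    for (; NLMSG_OK(nlh, remaining); nlh = NLMSG_NEXT(nlh, remaining)) {
        if (nlh->nlmsg_type == NLMSG_DONE ||
            nlh->nlmsg_type == NLMSG_ERROR)
            break;
        count++;                /* NLMSG_DATA(nlh) is the payload here */
    }
    return count;
}
```

The length passed in should be the return value of recvmsg(), not the capacity of the buffer.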

Netlink Kernel API

The kernel implementation resides in net/netlink/af_netlink.c. Kernel modules must include linux/netlink.h and use dedicated kernel APIs, which differ from the user‑space socket calls.

To add a new protocol, add a definition to include/linux/netlink.h (e.g., #define NETLINK_MYTEST 17); kernel code can then reference it anywhere.

Creating a kernel Netlink socket is done with:

struct sock *
netlink_kernel_create(int unit, void (*input)(struct sock *sk, int len));

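The unit argument is the protocol type (e.g., NETLINK_MYTEST), and input is a callback invoked when a message arrives. This is the prototype from early 2.6 kernels. Later kernels changed it: current ones take a struct net *, the unit, and a struct netlink_kernel_cfg * whose input callback receives the struct sk_buff * directly, so the code in this section needs adapting before it builds against a recent tree.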

Example input function:

void input (struct sock *sk, int len)
{
 struct sk_buff *skb;
 struct nlmsghdr *nlh = NULL;
 u8 *data = NULL;
 /* drain every message queued on the kernel socket */
 while ((skb = skb_dequeue(&sk->sk_receive_queue)) != NULL) {
   nlh = (struct nlmsghdr *)skb->data;
   data = NLMSG_DATA(nlh);
   /* process the message */
   kfree_skb(skb);  /* release the buffer once it has been handled */
 }
}

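The input function runs in the context of the sendmsg() system call issued by the sending process, so it should return quickly. If handling a message takes a long time, it is preferable to leave the message queued and wake a kernel thread to process it instead of doing the work directly in input.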

Messages are stored in struct sk_buff; the macro NETLINK_CB(skb) provides convenient access to the control block:

#define NETLINK_CB(skb) (*(struct netlink_skb_parms*)&((skb)->cb))

Setting address fields for a kernel‑generated message:

NETLINK_CB(skb).pid = 0;
NETLINK_CB(skb).dst_pid = 0;
NETLINK_CB(skb).dst_group = 1;

Sending a unicast message from the kernel uses:

int netlink_unicast(struct sock *sk, struct sk_buff *skb, u32 pid, int nonblock);

Sending a broadcast uses:

void netlink_broadcast(struct sock *sk, struct sk_buff *skb, u32 pid, u32 group, int allocation);

After use, the socket is released with:

void sock_release(struct socket *sock);

/* Example */
sock_release(sk->sk_socket);

A complete example package includes a kernel module (netlink-exam-kern.c) and two user‑space programs (netlink-exam-user-recv.c and netlink-exam-user-send.c). The module is inserted, the receiver is run in one terminal, and the sender in another. The sender reads a file and sends its contents as a Netlink message to the kernel, which stores it in a proc entry (/proc/netlink_exam_buffer) and forwards it to the user‑space receiver for display.
