Unlocking High-Performance Multithreading: Lock-Free Techniques in iLogtail

This article explores the fundamentals of multithreaded programming, comparing traditional lock-based synchronization with advanced lock-free techniques, and demonstrates how iLogtail implements thread models, memory barriers, atomic operations, spin locks, double-buffering, and deferred reclamation to achieve scalable, high-performance concurrency.

Alibaba Cloud Developer
Background

Multithreaded programming is essential on multi‑core processors to utilize CPU resources efficiently. Traditional locks solve synchronization problems, but lock‑free programming can offer better performance and scalability when applied correctly.

Thread Models

Threads can be user‑level or kernel‑level. Common models include:

1:1 model – each user thread maps to a kernel thread (e.g., pthread, std::thread).

N:1 model – many user threads share a single kernel thread.

M:N model – multiple user threads map to multiple kernel threads (e.g., Go goroutines).

Choosing the appropriate model impacts performance, scalability, and simplicity.

Ensuring Correct Execution

Synchronization guarantees consistency of shared data. Modern compilers may reorder instructions, and CPUs may execute out‑of‑order, which can break naive assumptions.

volatile Keyword

In C++, volatile tells the compiler that a variable may change outside the program, preventing certain optimizations, but it does not guarantee atomicity or memory‑visibility across threads.
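
To make the atomicity point concrete, here is a minimal sketch. The helper name count_with_atomic is invented for this illustration. With std::atomic<int>, increments from several threads are never lost; swapping in a volatile int would turn the same loop into a data race with undefined behavior.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: `threads` workers each increment a shared counter
// `per_thread` times, and the final value is returned.
int count_with_atomic(int threads, int per_thread) {
    std::atomic<int> counter{0};   // a `volatile int` here would be a data race
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                // Atomic read-modify-write. Relaxed is enough for a pure count
                // because join() below synchronizes before the final read.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter.load(std::memory_order_relaxed);
}
```

Calling count_with_atomic(4, 10000) always yields 40000, which a volatile counter cannot promise.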

Memory Barriers

Memory barriers enforce ordering of memory operations:

Full barrier – all prior reads and writes complete before any subsequent read or write.

Read barrier – prior reads complete before subsequent reads.

Write barrier – prior writes complete before subsequent writes.
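
In portable C++ a barrier can be placed with std::atomic_thread_fence. The sketch below is an illustration rather than iLogtail code: the function name fence_demo and its variables are assumptions. It pairs a release fence in the producer with an acquire fence in the consumer so that a plain int can be handed over through a relaxed flag.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical demo: publish `payload` through a relaxed flag plus explicit fences.
int fence_demo() {
    int payload = 0;                    // plain, non-atomic data
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        // Release fence: the write to payload cannot move below the flag store.
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Acquire fence: the read of payload cannot move above the flag load.
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = payload;
    });
    producer.join();
    consumer.join();
    return seen;
}
```

Passing std::memory_order_seq_cst to std::atomic_thread_fence gives the full-barrier behavior described above.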

Memory Order (C++11)

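C++11 defines six memory_order values that control how operations around an atomic access may be reordered:

memory_order_relaxed – guarantees atomicity only; imposes no ordering on surrounding operations.

memory_order_consume – orders only operations that carry a data dependency on the loaded value; its use has been discouraged since C++17, and compilers generally treat it as acquire.

memory_order_acquire – for loads; reads and writes after the atomic load cannot be reordered before it.

memory_order_release – for stores; reads and writes before the atomic store cannot be reordered after it.

memory_order_acq_rel – for read‑modify‑write operations; combines acquire and release.

memory_order_seq_cst – the strongest guarantee and the default for every std::atomic operation; adds a single total order over all seq_cst operations.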

iLogtail’s spin lock uses memory_order_acquire and memory_order_release.

Suppressing Compiler Reordering

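Techniques include declaring variables volatile (the compiler does not reorder volatile accesses with respect to each other), inserting explicit memory‑barrier instructions, or using std::atomic types. With the default memory_order_seq_cst, atomic operations are not reordered with respect to other atomic accesses; memory_order_relaxed operations carry no such guarantee.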

Suppressing Compiler Optimizations

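Marking a flag as volatile or std::atomic prevents the compiler from eliminating necessary reads. For example, when a loop waits on a bool data_ready flag, the compiler may otherwise read the flag once and hoist the check out of the loop. Of the two, only std::atomic also makes the access safe across threads.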
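
A minimal sketch of such a wait loop, using std::atomic<bool> rather than volatile. Only the flag name data_ready comes from the text; shared_value and wait_for_data are assumptions made for the illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> data_ready{false};  // flag name taken from the text
int shared_value = 0;                 // hypothetical payload guarded by the flag

// Hypothetical demo: the caller loops until the writer thread sets data_ready.
int wait_for_data() {
    data_ready.store(false, std::memory_order_relaxed);
    shared_value = 0;
    std::thread writer([] {
        shared_value = 7;
        data_ready.store(true, std::memory_order_release);   // publish
    });
    // Every iteration performs a real load; the compiler may not hoist it.
    while (!data_ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    int result = shared_value;   // safe: the acquire load saw the release store
    writer.join();
    return result;
}
```

The release store and acquire load also order the write to shared_value, which a volatile flag alone would not do.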

Suppressing CPU Out‑of‑Order Execution

Relaxed atomics let the compiler and CPU reorder operations on different variables, which can produce surprising results. In the classic case, two threads each store 1 to their own variable and then load the other's, and both loads return 0. Using memory_order_seq_cst for these operations rules that outcome out.
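
The two-thread case can be written as a small litmus test. This is a sketch with invented names (both_saw_zero, count_violations_seq_cst). Under memory_order_seq_cst the "both read 0" outcome is forbidden, so the violation count stays at zero. Under memory_order_relaxed the same outcome is permitted, although how often it shows up depends on the hardware.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical litmus test: returns true if both threads read 0 in one run.
// store_order must be valid for a store and load_order valid for a load.
bool both_saw_zero(std::memory_order store_order, std::memory_order load_order) {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;
    std::thread a([&] { x.store(1, store_order); r1 = y.load(load_order); });
    std::thread b([&] { y.store(1, store_order); r2 = x.load(load_order); });
    a.join();
    b.join();
    return r1 == 0 && r2 == 0;
}

// Counts forbidden outcomes over `runs` trials with sequential consistency.
int count_violations_seq_cst(int runs) {
    int violations = 0;
    for (int i = 0; i < runs; ++i) {
        if (both_saw_zero(std::memory_order_seq_cst, std::memory_order_seq_cst)) {
            ++violations;
        }
    }
    return violations;
}
```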

Lock Types in iLogtail

Mutex

Standard std::mutex provides acquire/release barriers automatically. iLogtail uses it for protecting global maps.
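
As an illustration of the pattern, not iLogtail's actual code, the sketch below guards a map with std::mutex and std::lock_guard. The Registry class and registry_demo function are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical registry standing in for a global map.
class Registry {
    std::mutex mMutex;
    std::map<std::string, int> mValues;
public:
    void Increment(const std::string& key) {
        std::lock_guard<std::mutex> lock(mMutex);  // unlocks on scope exit
        ++mValues[key];
    }
    int Get(const std::string& key) {
        std::lock_guard<std::mutex> lock(mMutex);
        auto it = mValues.find(key);
        return it == mValues.end() ? 0 : it->second;
    }
};

// Four threads update the same key; the mutex keeps every increment.
int registry_demo() {
    Registry registry;
    std::vector<std::thread> pool;
    for (int t = 0; t < 4; ++t) {
        pool.emplace_back([&registry] {
            for (int i = 0; i < 1000; ++i) registry.Increment("lines");
        });
    }
    for (auto& th : pool) th.join();
    return registry.Get("lines");
}
```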

Condition Variable (Semaphore)

iLogtail combines std::condition_variable with a mutex to control thread lifecycles without busy‑waiting.
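
One common way to apply this is a background thread that sleeps on the condition variable between rounds of work and wakes at once when asked to stop. The Worker class below is a hypothetical sketch of that idea; its names and the 10 ms interval are assumptions.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical background worker. Stop() must be called before destruction.
class Worker {
    std::mutex mMutex;
    std::condition_variable mCond;
    bool mStop = false;
    bool mExited = false;
    std::thread mThread;

    void Run() {
        std::unique_lock<std::mutex> lock(mMutex);
        while (!mStop) {
            // Periodic work would go here. wait_for releases the lock while
            // sleeping and returns early as soon as Stop() notifies.
            mCond.wait_for(lock, std::chrono::milliseconds(10),
                           [this] { return mStop; });
        }
        mExited = true;
    }
public:
    void Start() { mThread = std::thread(&Worker::Run, this); }
    // Requests shutdown, waits for the thread, and reports a clean exit.
    bool Stop() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mStop = true;
        }
        mCond.notify_one();
        mThread.join();
        return mExited;
    }
};

bool worker_demo() {
    Worker worker;
    worker.Start();
    return worker.Stop();
}
```

Because the thread blocks inside wait_for, it consumes no CPU while idle, and shutdown does not have to wait for the full interval.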

Recursive Mutex

std::recursive_mutex allows the same thread to lock repeatedly, which is useful for recursive functions.
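
A small sketch of the recursive case follows. Node, TreeSummer, and recursive_demo are hypothetical names chosen for the example. Sum() calls itself while already holding the lock, which is well defined with std::recursive_mutex but undefined behavior (typically a deadlock) with std::mutex.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical tree node used only for this illustration.
struct Node {
    int value;
    std::vector<Node> children;
};

class TreeSummer {
    std::recursive_mutex mMutex;
public:
    // Re-enters itself on the same thread while mMutex is already held.
    int Sum(const Node& node) {
        std::lock_guard<std::recursive_mutex> lock(mMutex);
        int total = node.value;
        for (const Node& child : node.children) total += Sum(child);
        return total;
    }
};

// Builds a four-node tree (1 -> {2, 3 -> {4}}) and sums it.
int recursive_demo() {
    Node root{1, {Node{2, {}}, Node{3, {Node{4, {}}}}}};
    TreeSummer summer;
    return summer.Sum(root);
}
```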

Read‑Write Lock

Read‑write locks enable multiple concurrent readers and a single writer, improving performance for read‑heavy workloads such as iLogtail’s metric module.
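
The text does not show which read-write lock iLogtail uses. As a sketch under that caveat, the example below uses C++17 std::shared_mutex. LabelTable and rwlock_demo are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical read-mostly table; names are illustrative only.
class LabelTable {
    mutable std::shared_mutex mLock;
    std::map<std::string, std::string> mLabels;
public:
    void Set(const std::string& key, const std::string& value) {
        std::unique_lock<std::shared_mutex> lock(mLock);   // exclusive: one writer
        mLabels[key] = value;
    }
    std::string Get(const std::string& key) const {
        std::shared_lock<std::shared_mutex> lock(mLock);   // shared: many readers
        auto it = mLabels.find(key);
        return it == mLabels.end() ? std::string() : it->second;
    }
};

// Four readers run concurrently while one writer adds an unrelated key.
int rwlock_demo() {
    LabelTable table;
    table.Set("plugin", "input_file");
    std::atomic<int> hits{0};
    std::vector<std::thread> readers;
    for (int t = 0; t < 4; ++t) {
        readers.emplace_back([&table, &hits] {
            for (int i = 0; i < 1000; ++i) {
                if (table.Get("plugin") == "input_file") ++hits;
            }
        });
    }
    table.Set("owner", "demo");   // the writer briefly excludes all readers
    for (auto& r : readers) r.join();
    return hits.load();
}
```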

Spin Lock

Spin locks repeatedly attempt to acquire a lock without blocking the thread. iLogtail’s SpinLock implementation uses std::atomic_flag with acquire/release semantics.

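#include <atomic>
#include <mutex>
#include <boost/smart_ptr/detail/yield_k.hpp>  // boost::detail::yield

class SpinLock {
    std::atomic_flag v_ = ATOMIC_FLAG_INIT;
public:
    SpinLock() {}
    // Acquire: accesses in the critical section cannot move above the lock.
    bool try_lock() { return !v_.test_and_set(std::memory_order_acquire); }
    void lock() {
        // Back off progressively as the retry count k grows.
        for (unsigned k = 0; !try_lock(); ++k) {
            boost::detail::yield(k);
        }
    }
    // Release: accesses in the critical section cannot move below the unlock.
    void unlock() { v_.clear(std::memory_order_release); }
};
using ScopedSpinLock = std::lock_guard<SpinLock>;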
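
A usage sketch for the lock above. To keep it free of Boost, this version substitutes std::this_thread::yield() for boost::detail::yield(k); that substitution and the spin_demo function are assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Same shape as the SpinLock above, with std::this_thread::yield() standing
// in for boost::detail::yield(k) so the sketch needs no Boost headers.
class SpinLock {
    std::atomic_flag v_ = ATOMIC_FLAG_INIT;
public:
    bool try_lock() { return !v_.test_and_set(std::memory_order_acquire); }
    void lock() {
        while (!try_lock()) std::this_thread::yield();
    }
    void unlock() { v_.clear(std::memory_order_release); }
};
using ScopedSpinLock = std::lock_guard<SpinLock>;

// Four threads bump a plain int that is protected only by the spin lock.
int spin_demo() {
    SpinLock lock;
    int total = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < 4; ++t) {
        pool.emplace_back([&lock, &total] {
            for (int i = 0; i < 10000; ++i) {
                ScopedSpinLock guard(lock);   // unlocks at end of scope
                ++total;
            }
        });
    }
    for (auto& th : pool) th.join();
    return total;
}
```

Because SpinLock provides lock(), unlock(), and try_lock(), it works with std::lock_guard like any standard mutex. Spin locks suit very short critical sections such as this one; longer ones are usually better served by std::mutex.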

Lock‑Free Practices in iLogtail

iLogtail replaces large coarse‑grained locks with atomic counters and per‑plugin metric objects, so that updating a metric on the hot path takes no lock.

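#include <atomic>
#include <cstdint>
#include <string>

class Counter {
    std::string mName;
    std::atomic_long mVal;   // updated atomically, so no lock is needed
public:
    Counter(const std::string& name, uint64_t val);
    uint64_t GetValue() const;
    const std::string& GetName() const;
    void Add(uint64_t val);      // accumulate into the counter
    Counter* CopyAndReset();     // return a copy holding the current value and reset this counter
};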
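
The declaration above omits the method bodies. One plausible way to fill them in is sketched below; these bodies are assumptions, not iLogtail's implementation. The sketch uses std::atomic<uint64_t> so the stored type matches the interface, and counter_demo is a hypothetical driver.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <thread>
#include <vector>

class Counter {
    std::string mName;
    std::atomic<uint64_t> mVal;
public:
    Counter(const std::string& name, uint64_t val) : mName(name), mVal(val) {}
    uint64_t GetValue() const { return mVal.load(std::memory_order_relaxed); }
    const std::string& GetName() const { return mName; }
    void Add(uint64_t val) { mVal.fetch_add(val, std::memory_order_relaxed); }
    // exchange() reads and zeroes in one atomic step, so an Add() racing with
    // it lands either in the copy or in the reset counter, never nowhere.
    Counter* CopyAndReset() {
        return new Counter(mName, mVal.exchange(0, std::memory_order_relaxed));
    }
};

// Four threads add concurrently; the value collected afterwards is returned.
uint64_t counter_demo() {
    Counter counter("lines_total", 0);
    std::vector<std::thread> pool;
    for (int t = 0; t < 4; ++t) {
        pool.emplace_back([&counter] {
            for (int i = 0; i < 1000; ++i) counter.Add(1);
        });
    }
    for (auto& th : pool) th.join();
    std::unique_ptr<Counter> copy(counter.CopyAndReset());  // caller owns the copy
    assert(counter.GetValue() == 0);
    return copy->GetValue();
}
```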

Double‑Buffering for Read‑Write Separation

Two buffers hold data; one is written while the other is read. After updates, a pointer swap makes the new buffer visible without locking readers.
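
The idea can be sketched as follows. This is a simplified illustration, not iLogtail's code: it assumes a single writer, and it assumes no reader is still using the back buffer when the writer starts refilling it. The text does not show how iLogtail guarantees the second condition. DoubleBuffer and double_buffer_demo are hypothetical names.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical single-writer double buffer (see the assumptions above).
class DoubleBuffer {
    std::array<std::vector<int>, 2> mBuffers;
    std::atomic<int> mFront{0};   // index of the buffer readers may use
public:
    // Writer side: fill the back buffer, then publish it with one atomic store.
    void Publish(const std::vector<int>& data) {
        const int back = 1 - mFront.load(std::memory_order_relaxed);
        mBuffers[back] = data;
        mFront.store(back, std::memory_order_release);
    }
    // Reader side: no lock, just an acquire load of the front index.
    const std::vector<int>& Front() const {
        return mBuffers[mFront.load(std::memory_order_acquire)];
    }
};

bool double_buffer_demo() {
    DoubleBuffer buffer;
    buffer.Publish({1, 1, 1});   // version 1 is visible before readers start
    std::atomic<bool> consistent{true};
    std::vector<std::thread> readers;
    for (int t = 0; t < 4; ++t) {
        readers.emplace_back([&] {
            for (int i = 0; i < 1000; ++i) {
                const std::vector<int>& snap = buffer.Front();
                // A reader must see all 1s or all 2s, never a mix.
                if (snap.size() != 3 || snap[0] != snap[1] || snap[1] != snap[2]) {
                    consistent = false;
                }
            }
        });
    }
    buffer.Publish({2, 2, 2});   // fills the idle buffer while readers run
    for (auto& r : readers) r.join();
    return consistent.load() && buffer.Front()[0] == 2;
}
```

The demo publishes only once while readers are active, so the writer never touches a buffer that is being read. A production version needs an explicit rule for when the back buffer may be reused.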

Deferred Reclamation

Nodes are marked for deletion and removed in a later batch, reducing lock contention during plugin destruction.

MetricsRecordRef::~MetricsRecordRef() {
    if (mMetrics) {
        mMetrics->MarkDeleted();
    }
}
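
A minimal sketch of the mark-then-sweep idea. Only the name MarkDeleted() comes from the text; Record, RecordList, CollectGarbage, and reclamation_demo are hypothetical, and the container choice is an assumption.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical record type; only MarkDeleted() mirrors a name from the text.
class Record {
    std::atomic<bool> mDeleted{false};
public:
    void MarkDeleted() { mDeleted.store(true, std::memory_order_release); }
    bool IsDeleted() const { return mDeleted.load(std::memory_order_acquire); }
};

// Hypothetical owner that frees marked records in one batch.
class RecordList {
    std::mutex mMutex;
    std::vector<std::unique_ptr<Record>> mRecords;
public:
    Record* Create() {
        std::lock_guard<std::mutex> lock(mMutex);
        mRecords.push_back(std::make_unique<Record>());
        return mRecords.back().get();
    }
    // Called periodically; returns how many records were reclaimed.
    std::size_t CollectGarbage() {
        std::lock_guard<std::mutex> lock(mMutex);
        const std::size_t before = mRecords.size();
        mRecords.erase(
            std::remove_if(mRecords.begin(), mRecords.end(),
                           [](const std::unique_ptr<Record>& r) { return r->IsDeleted(); }),
            mRecords.end());
        return before - mRecords.size();
    }
};

std::size_t reclamation_demo() {
    RecordList list;
    Record* a = list.Create();
    list.Create();                  // stays alive
    Record* c = list.Create();
    a->MarkDeleted();               // one atomic store, no list lock taken
    c->MarkDeleted();
    return list.CollectGarbage();   // the batch sweep frees both marked records
}
```

Marking costs a single atomic store, so a destructor never contends for the list lock; the lock is taken once per sweep instead of once per deleted record.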

Snapshot Mechanism

WriteMetrics creates a snapshot of the current metric list, then ReadMetrics swaps the head pointer under a short lock, allowing the old list to be deleted without further synchronization.

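void ReadMetrics::UpdateMetrics() {
    // Build the new list before taking the lock.
    MetricsRecord* snapshot = WriteMetrics::GetInstance()->DoSnapshot();
    MetricsRecord* toDelete;
    {
        // Hold the write lock only for the pointer swap.
        WriteLock lock(mReadWriteLock);
        toDelete = mHead;
        mHead = snapshot;
    }
    // The old list is no longer reachable through mHead; free it outside the lock.
    while (toDelete) {
        MetricsRecord* obj = toDelete;
        toDelete = toDelete->GetNext();
        delete obj;
    }
}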

Illustrations

[Figure: iLogtail architecture]
[Figure: New metric module]
[Figure: Double buffer diagram]