
Implementing Raft in Go: Persistence, Optimizations, and Crash Tolerance

This article, the fourth in a series on building a Raft consensus module in Go, explains how to add persistent state, improve command delivery semantics, optimize AppendEntries handling, and achieve crash tolerance, with concrete Go code examples and practical testing tips.

360 Tech Engineering

This article is the fourth part of a series on implementing the Raft distributed consensus algorithm in Go, focusing on adding persistence and several optimizations to complete a functional Raft implementation.

1. Persistence – Raft must store three pieces of state (currentTerm, votedFor, and the log) on stable storage. A simple Storage interface is introduced:

type Storage interface {
  Set(key string, value []byte)
  Get(key string) ([]byte, bool)
  // HasData returns true iff any Sets were made on this Storage.
  HasData() bool
}

The interface can be viewed as a map from string keys to arbitrary byte slices backed by any persistent medium.
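For experimenting and testing, a minimal in-memory implementation suffices; the sketch below (the name MapStorage and the mutex-per-map design are our choices, not part of the article's interface) satisfies the Storage contract, while a production system would back Set with an fsync'd file or similar durable medium:

```go
package main

import (
	"fmt"
	"sync"
)

// MapStorage is a simple in-memory implementation of the Storage interface,
// suitable for tests. It is not durable: a real deployment would write to
// stable storage instead of a map.
type MapStorage struct {
	mu sync.Mutex
	m  map[string][]byte
}

func NewMapStorage() *MapStorage {
	return &MapStorage{m: make(map[string][]byte)}
}

func (ms *MapStorage) Set(key string, value []byte) {
	ms.mu.Lock()
	defer ms.mu.Unlock()
	ms.m[key] = value
}

func (ms *MapStorage) Get(key string) ([]byte, bool) {
	ms.mu.Lock()
	defer ms.mu.Unlock()
	v, found := ms.m[key]
	return v, found
}

// HasData reports whether any Sets were made on this Storage.
func (ms *MapStorage) HasData() bool {
	ms.mu.Lock()
	defer ms.mu.Unlock()
	return len(ms.m) > 0
}

func main() {
	s := NewMapStorage()
	fmt.Println(s.HasData()) // false before any Set
	s.Set("currentTerm", []byte{1})
	fmt.Println(s.HasData()) // true afterwards
}
```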

2. Restoring and Saving State – The consensus module now accepts a Storage instance and restores persisted variables on start:

if cm.storage.HasData() {
  cm.restoreFromStorage(cm.storage)
}

The restoreFromStorage function reads currentTerm, votedFor, and log using encoding/gob and aborts on missing data. The counterpart persistToStorage encodes each variable and writes it back via cm.storage.Set.

func (cm *ConsensusModule) persistToStorage() {
  var termData bytes.Buffer
  if err := gob.NewEncoder(&termData).Encode(cm.currentTerm); err != nil { log.Fatal(err) }
  cm.storage.Set("currentTerm", termData.Bytes())
  // similar for votedFor and log
}
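The gob round trip at the heart of persistToStorage and restoreFromStorage can be exercised in isolation. The sketch below is a simplified stand-in, not the series' exact code: it uses a plain map in place of a Storage implementation, and the helper names encodeGob/decodeGob are ours:

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// LogEntry mirrors the shape used in the series: a command plus the term
// in which the entry was appended.
type LogEntry struct {
	Command interface{}
	Term    int
}

// encodeGob serializes one persistent variable into a byte slice, the form
// Storage.Set expects.
func encodeGob(v interface{}) []byte {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(v); err != nil {
		log.Fatal(err)
	}
	return buf.Bytes()
}

// decodeGob restores one persistent variable from its serialized form.
func decodeGob(data []byte, out interface{}) {
	if err := gob.NewDecoder(bytes.NewBuffer(data)).Decode(out); err != nil {
		log.Fatal(err)
	}
}

func main() {
	store := map[string][]byte{} // stand-in for a Storage implementation

	// Persist the three variables Raft must keep on stable storage.
	store["currentTerm"] = encodeGob(3)
	store["votedFor"] = encodeGob(1)
	store["log"] = encodeGob([]LogEntry{{Command: 42, Term: 3}})

	// Restore them, as a restarting server would in restoreFromStorage.
	var currentTerm, votedFor int
	var logEntries []LogEntry
	decodeGob(store["currentTerm"], &currentTerm)
	decodeGob(store["votedFor"], &votedFor)
	decodeGob(store["log"], &logEntries)

	fmt.Println(currentTerm, votedFor, len(logEntries))
}
```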

3. Crash Tolerance – With persistence in place, a crashed server can restart, reload currentTerm, votedFor, and its log from stable storage, and rejoin the cluster. A cluster of 2N+1 servers tolerates up to N failed servers, as long as the remaining N+1 servers stay up and connected to each other.

4. Unreliable RPC Delivery – The implementation uses an RPCProxy to simulate network latency (1‑5 ms) and, when the RAFT_UNRELIABLE_RPC environment variable is set, occasional long delays (75 ms) or drops, mimicking real‑world network faults.
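The proxy's fault injection can be sketched as follows; the helper name maybeDelay and the exact probabilities are illustrative assumptions modeled on the behavior described above, not the series' exact API:

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"os"
	"time"
)

// maybeDelay simulates network faults before forwarding an RPC. Normally it
// injects 1-5 ms of latency; when RAFT_UNRELIABLE_RPC is set, it
// occasionally drops the call or delays it by 75 ms.
func maybeDelay() error {
	if os.Getenv("RAFT_UNRELIABLE_RPC") != "" {
		switch rand.Intn(10) {
		case 9: // ~10% of calls: drop the RPC entirely
			return errors.New("RPC failed (simulated drop)")
		case 8: // ~10% of calls: inject a long delay
			time.Sleep(75 * time.Millisecond)
		}
	} else {
		// Normal path: small random latency.
		time.Sleep(time.Duration(1+rand.Intn(5)) * time.Millisecond)
	}
	return nil
}

func main() {
	start := time.Now()
	err := maybeDelay()
	fmt.Println(err == nil, time.Since(start) >= time.Millisecond)
}
```

A wrapper like this sits between the consensus module and the real RPC client, so tests can flip one environment variable to stress the retry and reelection paths.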

5. Optimizing AppendEntries – The leader now sends AppendEntries (AE) immediately when a new command is submitted, rather than waiting for the next 50 ms heartbeat. A goroutine runs a loop that triggers AE sends either on a timer or when cm.triggerAEChan receives a signal.

func (cm *ConsensusModule) startLeader() {
  cm.state = Leader
  // initialize nextIndex and matchIndex
  go func(heartbeatTimeout time.Duration) {
    cm.leaderSendAEs() // immediate send
    t := time.NewTimer(heartbeatTimeout)
    for {
      doSend := false
      select {
      case <-t.C:
        doSend = true
        t.Reset(heartbeatTimeout)
      case _, ok := <-cm.triggerAEChan:
        if !ok { return }
        doSend = true
        if !t.Stop() { <-t.C }
        t.Reset(heartbeatTimeout)
      }
      if doSend {
        cm.mu.Lock()
        if cm.state != Leader { cm.mu.Unlock(); return }
        cm.mu.Unlock()
        cm.leaderSendAEs()
      }
    }
  }(50 * time.Millisecond)
}

func (cm *ConsensusModule) Submit(command interface{}) bool {
  cm.mu.Lock()
  if cm.state == Leader {
    cm.log = append(cm.log, LogEntry{Command: command, Term: cm.currentTerm})
    cm.persistToStorage()
    cm.mu.Unlock()
    cm.triggerAEChan <- struct{}{}
    return true
  }
  cm.mu.Unlock()
  return false
}

When the leader's commit index advances, it signals triggerAEChan again, so the next AppendEntries round promptly carries the new commit index to followers.

6. Batch Command Submission – Each call to Submit triggers a burst of RPCs. This is safe, because Raft's RPCs are idempotent, but wasteful; for high-throughput scenarios, a batch submission API could amortize that cost and reduce network load.
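Such an API could look like the sketch below. SubmitBatch is a hypothetical extension, not part of the series: it appends all commands under a single lock acquisition, persists once, and wakes the AE loop once instead of once per command. The stand-in ConsensusModule here is trimmed to the fields the method touches, with persistToStorage stubbed out:

```go
package main

import (
	"fmt"
	"sync"
)

type CMState int

const (
	Follower CMState = iota
	Leader
)

type LogEntry struct {
	Command interface{}
	Term    int
}

// Minimal stand-in for the series' ConsensusModule, enough to illustrate
// batching.
type ConsensusModule struct {
	mu            sync.Mutex
	state         CMState
	currentTerm   int
	log           []LogEntry
	triggerAEChan chan struct{}
}

func (cm *ConsensusModule) persistToStorage() {} // stub for the sketch

// SubmitBatch appends several commands at once: one lock acquisition, one
// persistToStorage call, and one AE trigger for the whole batch.
func (cm *ConsensusModule) SubmitBatch(commands []interface{}) bool {
	cm.mu.Lock()
	if cm.state != Leader {
		cm.mu.Unlock()
		return false
	}
	for _, command := range commands {
		cm.log = append(cm.log, LogEntry{Command: command, Term: cm.currentTerm})
	}
	cm.persistToStorage()
	cm.mu.Unlock()
	cm.triggerAEChan <- struct{}{} // single trigger for the whole batch
	return true
}

func main() {
	cm := &ConsensusModule{state: Leader, currentTerm: 1,
		triggerAEChan: make(chan struct{}, 1)}
	ok := cm.SubmitBatch([]interface{}{10, 20, 30})
	fmt.Println(ok, len(cm.log), len(cm.triggerAEChan))
}
```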

7. Summary – The article completes the Raft series, showing how persistence, crash tolerance, unreliable RPC simulation, and AppendEntries optimizations improve the algorithm’s practicality. It also points readers to mature Go Raft implementations such as etcd’s raft package and HashiCorp’s Raft library.

Tags: Optimization, Go, Persistence, Raft, Distributed Consensus