Implementing a High‑Performance Search Service with Go and the Riot Engine
This guide explains how to set up a Go environment and install the open‑source Riot search engine, describes Riot's goroutine‑based architecture for indexing and ranking, and gives a concise code example of document indexing and query execution for content‑driven websites.
Requirement
Content‑driven sites such as forums, knowledge bases, or document portals need to search and retrieve matching documents based on user input.
We can build a high‑performance search service to provide this functionality for the main site.
The open‑source Go search engine riot (derived from the now‑stagnant wukong) attracted our attention because of its high performance, scalability, and Chinese language support.
Implementation
Environment Preparation
First, install Go, version 1.8 or higher. Ubuntu currently recommends 1.6, which is too old, so download Go from the official site and set the environment variables.
Install Riot’s dependency packages:
go get "github.com/shirou/gopsutil"
go get "github.com/go-ego/gpy"
Install Riot itself:
go get -u github.com/go-ego/riot
go get -u github.com/go-ego/re
Riot’s Working Principle
The engine processes user requests, tokenization, indexing, and ranking using separate goroutines.
1. Main goroutine – handles receiving and sending user requests
2. Segmenter goroutine – performs tokenization
3. Indexer goroutine – builds and looks up the inverted index
4. Ranker goroutine – scores and sorts documents
Indexing flow:
When a request to add a document arrives, the main goroutine sends the raw text through a channel to a segmenter goroutine, which tokenizes it and forwards the tokens to an indexer goroutine. The indexer creates an in‑memory inverted index mapping search keywords to document IDs for fast lookup.
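The indexing flow above can be pictured with a small, self-contained sketch. It is not Riot's code: the names doc, tokens, segment, and buildIndex are hypothetical, and a whitespace split stands in for the dictionary-based segmenter that Riot loads. It only shows the shape of the pipeline, with raw text passed over a channel to a segmenter goroutine and the resulting tokens collected into an in-memory inverted index.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// doc is a hypothetical indexing request: a document ID plus raw text.
type doc struct {
	id   uint64
	text string
}

// tokens is what the segmenter stage hands to the indexer stage.
type tokens struct {
	id    uint64
	words []string
}

// segment stands in for the segmenter goroutine. It lowercases the text and
// splits on whitespace, which is much simpler than a real segmenter.
func segment(in <-chan doc, out chan<- tokens) {
	for d := range in {
		out <- tokens{id: d.id, words: strings.Fields(strings.ToLower(d.text))}
	}
	close(out)
}

// buildIndex sends docs through the segmenter stage and plays the indexer
// role itself, returning an inverted index: keyword -> sorted document IDs.
func buildIndex(docs []doc) map[string][]uint64 {
	// Buffered channels hold intermediate results so the sender does not block.
	raw := make(chan doc, len(docs))
	segmented := make(chan tokens, len(docs))

	go segment(raw, segmented)
	for _, d := range docs {
		raw <- d
	}
	close(raw)

	index := make(map[string][]uint64)
	for t := range segmented {
		seen := make(map[string]bool) // list each document once per keyword
		for _, w := range t.words {
			if !seen[w] {
				seen[w] = true
				index[w] = append(index[w], t.id)
			}
		}
	}
	for _, ids := range index {
		sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	}
	return index
}

func main() {
	index := buildIndex([]doc{
		{1, "Google is experimenting with virtual reality"},
		{2, "Google is testing a new layout"},
	})
	fmt.Println(index["google"], index["testing"])
}
```

Keeping each posting list sorted by document ID is a design choice of this sketch; it makes merging lists at query time a single linear pass.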
Search flow:
The main goroutine receives a query, tokenizes it locally, and passes the tokens to the indexer. The indexer retrieves candidate documents for each token, merges the posting lists to obtain a reduced document set, and sends this set to the ranker. The ranker scores, filters, and orders the documents, then returns the sorted results to the main goroutine, which finally replies to the user.
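The search flow can be sketched in the same spirit. The code below is an illustration under stated assumptions, not Riot's implementation: intersect, search, and scored are hypothetical names, the index in main is written out by hand, and the score is a simple count of query-token occurrences rather than whatever scoring Riot applies.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// intersect merges two sorted posting lists, keeping IDs present in both.
func intersect(a, b []uint64) []uint64 {
	var out []uint64
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] == b[j]:
			out = append(out, a[i])
			i++
			j++
		case a[i] < b[j]:
			i++
		default:
			j++
		}
	}
	return out
}

// scored pairs a document ID with its (assumed) relevance score.
type scored struct {
	id    uint64
	score int
}

// search tokenizes the query, intersects the posting list of every token to
// get the candidate set, then scores and sorts the candidates.
func search(index map[string][]uint64, texts map[uint64]string, query string) []scored {
	words := strings.Fields(strings.ToLower(query))
	if len(words) == 0 {
		return nil
	}
	candidates := index[words[0]]
	for _, w := range words[1:] {
		candidates = intersect(candidates, index[w])
	}

	results := make([]scored, 0, len(candidates))
	for _, id := range candidates {
		s := 0
		for _, tok := range strings.Fields(strings.ToLower(texts[id])) {
			for _, w := range words {
				if tok == w {
					s++ // stand-in score: occurrences of query tokens
				}
			}
		}
		results = append(results, scored{id, s})
	}
	// Highest score first; ties are broken by document ID.
	sort.Slice(results, func(i, j int) bool {
		if results[i].score != results[j].score {
			return results[i].score > results[j].score
		}
		return results[i].id < results[j].id
	})
	return results
}

func main() {
	texts := map[uint64]string{
		1: "google is experimenting with virtual reality",
		2: "google pushed an update early",
		3: "google is testing another layout and testing colors",
	}
	// Hand-built inverted index for the two query tokens used below.
	index := map[string][]uint64{
		"google":  {1, 2, 3},
		"testing": {3},
	}
	fmt.Println(search(index, texts, "google testing"))
}
```

Only document 3 contains both tokens, so it is the only candidate that reaches the ranking step.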
Multiple goroutines handle tokenization, indexing, and ranking, with intermediate results stored in buffered channels to avoid blocking. To increase concurrency and reduce latency, Riot splits documents into shards (the number of shards can be user‑specified); indexing and ranking requests are processed in parallel across shards, and the main goroutine merges the results.
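A rough sketch of the sharding idea follows. The article does not say how Riot assigns documents to shards, so the modulo rule in shardFor is an assumption, and shard, newShards, and searchAll are hypothetical names. Each shard is searched in its own goroutine, results are collected in a buffered channel, and the caller merges them.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// shard holds the documents assigned to one partition.
type shard struct {
	docs map[uint64]string
}

// shardFor picks a shard by document ID. Modulo is an assumed placement rule.
func shardFor(id uint64, n int) int { return int(id % uint64(n)) }

// newShards distributes docs across n shards.
func newShards(n int, docs map[uint64]string) []*shard {
	shards := make([]*shard, n)
	for i := range shards {
		shards[i] = &shard{docs: make(map[uint64]string)}
	}
	for id, text := range docs {
		shards[shardFor(id, n)].docs[id] = text
	}
	return shards
}

// find returns the IDs of documents in this shard that contain word.
func (s *shard) find(word string) []uint64 {
	var ids []uint64
	for id, text := range s.docs {
		for _, tok := range strings.Fields(strings.ToLower(text)) {
			if tok == word {
				ids = append(ids, id)
				break
			}
		}
	}
	return ids
}

// searchAll queries every shard in parallel and merges the partial results.
func searchAll(shards []*shard, word string) []uint64 {
	word = strings.ToLower(word)
	// One buffer slot per shard, so no worker blocks on send.
	results := make(chan []uint64, len(shards))
	var wg sync.WaitGroup
	for _, s := range shards {
		wg.Add(1)
		go func(s *shard) {
			defer wg.Done()
			results <- s.find(word)
		}(s)
	}
	wg.Wait()
	close(results)

	var merged []uint64
	for ids := range results {
		merged = append(merged, ids...)
	}
	sort.Slice(merged, func(i, j int) bool { return merged[i] < merged[j] })
	return merged
}

func main() {
	shards := newShards(4, map[uint64]string{
		1: "Google is experimenting with virtual reality",
		2: "Bluetooth update for Home speaker",
		3: "Google is testing another layout",
	})
	fmt.Println(searchAll(shards, "google"))
}
```

The merge step here only concatenates and sorts by ID; a real engine would merge ranked results by score instead.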
In summary, a complete search system consists of four parts: document crawling, indexing, searching, and result display.
Simple Usage
package main

import (
	"log"

	"github.com/go-ego/riot/engine"
	"github.com/go-ego/riot/types"
)

var (
	// searcher is safe for concurrent use by multiple goroutines
	searcher = engine.Engine{}
)

func main() {
	// Initialize the searcher and load the segmenter dictionary
	searcher.Init(types.EngineInitOptions{Using: 4, SegmenterDict: "./dict/dictionary.txt"})
	defer searcher.Close()

	// Add documents to the index (docId starts at 1)
	searcher.IndexDocument(1, types.DocIndexData{Content: "Google Is Experimenting With Virtual Reality Advertising"}, false)
	searcher.IndexDocument(2, types.DocIndexData{Content: "Google accidentally pushed Bluetooth update for Home speaker early"}, false)
	searcher.IndexDocument(3, types.DocIndexData{Content: "Google is testing another Search results layout with rounded cards, new colors, and the 4 mysterious colored dots again"}, false)

	// Wait for the index to refresh
	searcher.FlushIndex()

	// Perform a search; the result format is defined in types.SearchResponse
	log.Print(searcher.Search(types.SearchRequest{Text: "google testing"}))
}
// Sample output:
// 2017/10/24 15:14:29 Loaded gse dictionary ./dict/dictionary.txt
// 2017/10/24 15:14:31 gse dictionary loaded
// 2017/10/24 15:14:31 Check virtualMemory...Total: 12539645952, Free:7705378816, UsedPercent:11.799818%
// 2017/10/24 15:14:31 {[google testing] [{3 Google is testing another Search results layout with rounded cards, new colors, and the 4 mysterious colored dots again <nil> <nil> [4.7] [] []}] false 1}
To use the engine, format your data, store it in Riot, and query through its API.
Performance and Scalability
Performance benchmarks are available at the Riot benchmarking page.
Riot supports persistent storage. For distributed search, you need to partition the index and data yourself and do additional development on top of Riot.
360 Quality & Efficiency
360 Quality & Efficiency focuses on seamlessly integrating quality and efficiency in R&D, sharing 360’s internal best practices with industry peers to foster collaboration among Chinese enterprises and drive greater efficiency value.
