How to Build a Global Code Search System from Scratch

This article introduces how to build a global code search system called 'Qianxun' from scratch, covering its background, architecture, core technologies, and future prospects.

政采云技术 (ZCY Technology Team)

Background: The front-end team maintained hundreds of projects, and finding a specific API, NPM package, or keyword across all of them was time-consuming. The global code search system 'Qianxun' was built to solve this.

What is Qianxun?: Qianxun is a global code search system. Enter an interface path, an NPM package name, or any other keyword, and it returns the list of projects that contain it.

Features of Qianxun: Qianxun has two main functions, code search and the project list. Code search takes a keyword and displays the matching project files along with related information. The project list covers initializing project list information in the Elasticsearch service and synchronizing project code to Elasticsearch.

Elasticsearch: Elasticsearch (ES) is a distributed, scalable, real-time search and analytics engine built on the open-source Apache Lucene library. It exposes a RESTful HTTP API, so CRUD operations can be performed with plain HTTP requests.
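To make the CRUD mapping concrete, here is a minimal sketch that builds the HTTP method and path for each operation on a single document. It assumes the endpoint shapes of Elasticsearch 7.x or later (`_doc`, `_update`). The function name `crudRequest`, the index name `code-files`, the document ID, and the field names are hypothetical and do not come from Qianxun.

```javascript
// Maps each CRUD operation to the Elasticsearch REST call for one document.
// Endpoint shapes follow Elasticsearch 7.x+; index and field names are made up.
function crudRequest(operation, index, id, doc) {
  const docPath = `/${encodeURIComponent(index)}/_doc/${encodeURIComponent(id)}`;
  switch (operation) {
    case 'create': // index a document under a known ID (replaces it if it exists)
      return { method: 'PUT', path: docPath, body: doc };
    case 'read':
      return { method: 'GET', path: docPath };
    case 'update': // partial update: only the given fields are merged in
      return {
        method: 'POST',
        path: `/${encodeURIComponent(index)}/_update/${encodeURIComponent(id)}`,
        body: { doc },
      };
    case 'delete':
      return { method: 'DELETE', path: docPath };
    default:
      throw new Error(`Unknown operation: ${operation}`);
  }
}

const request = crudRequest('update', 'code-files', '42', { branch: 'master' });
console.log(request.method, request.path, JSON.stringify(request.body));
```

Sending one of these is then an ordinary HTTP request to the ES host with a JSON body and a `Content-Type: application/json` header.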

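Comparison between relational databases and Elasticsearch:

| Relational database | Elasticsearch |
| --- | --- |
| Database | Index |
| Table | Type |
| Row | Document |
| Column | Field |

Note that mapping types were deprecated in Elasticsearch 7.x and removed in 8.0, so the Table/Type row applies only to older versions.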

Design Architecture: The client is built with Vue 3.0 and the Element Plus UI component library. The server side is built mainly on Node.js and Koa 2.0.

Design Process: The main service runs scaffolding commands that pull the current project files from the GitLab service, convert them into JSON, and synchronize that JSON to ES. Along the way, some file and project information is also persisted. Finally, the main service calls the ES search API to search the project file data.
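The two ES calls at either end of this flow can be sketched as follows: a payload for the `_bulk` API, which expects newline-delimited JSON, and a body for the `_search` API. This is an illustration under assumptions, not Qianxun's actual code. The document shape (`project`, `path`, `content`), the index name `code-files`, the sample files, and the choice of a `match_phrase` query with highlighting are all hypothetical.

```javascript
// Builds the newline-delimited JSON (NDJSON) payload for Elasticsearch's _bulk API.
// Each document contributes two lines: an action line and the document itself.
function toBulkBody(index, files) {
  const lines = [];
  for (const file of files) {
    lines.push(JSON.stringify({ index: { _index: index, _id: file.id } }));
    lines.push(JSON.stringify({ project: file.project, path: file.path, content: file.content }));
  }
  // The _bulk API requires the payload to end with a newline.
  return lines.join('\n') + '\n';
}

// Builds a _search request body that looks for a phrase in file contents
// and asks Elasticsearch to return highlighted snippets around each hit.
function buildSearchBody(keyword, size = 20) {
  return {
    size,
    query: { match_phrase: { content: keyword } },
    highlight: { fields: { content: {} } },
  };
}

// Hypothetical sample of what the crawler might produce for two files.
const files = [
  { id: 'shop-web:src/api/order.js', project: 'shop-web', path: 'src/api/order.js', content: "request('/api/order/list')" },
  { id: 'shop-web:package.json', project: 'shop-web', path: 'package.json', content: '{"dependencies":{"axios":"^0.21.0"}}' },
];

const bulkBody = toBulkBody('code-files', files);
console.log(bulkBody);
console.log(JSON.stringify(buildSearchBody('/api/order/list')));
```

The bulk payload would be sent as `POST /_bulk` with `Content-Type: application/x-ndjson`, and the search body as `POST /code-files/_search`.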

Core Technologies:

1. Node-server Service: The main service acts as a middle layer. It connects to the ES, MySQL, and GitLab services, and runs Node-fscrawler scaffolding commands in a child process started with Node.js's spawn.

2. Message Center Design: Asynchronous tasks, such as downloading files and starting child processes, are handled in a message center module. The main purpose is to keep this work out of the controller layer and reduce coupling.

3. Node-fscrawler Scaffolding: Its commands convert the downloaded project files into a JSON file, then call the ES bulk import API to load that data into ES.

4. GitLab Service: Provides the GitLab RESTful API for fetching or downloading project file data. gitbeaker, a Node.js client library that wraps the GitLab API, is recommended here.

5. Elasticsearch Service: The core engine of global code search. Qianxun uses elasticsearch.js, Elasticsearch's JavaScript client library. Once the ES service is running, instantiate the library's Client class with the required connection parameters, and the Node.js service can then talk to ES.

6. MySQL Service: Persists four kinds of data, one table each: project groups (GitLab groups), projects, project files, and search results.
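Items 1 and 2 above can be sketched together: a small message center that owns the asynchronous work, with a handler that starts a child process through spawn. This is a minimal sketch, not Qianxun's implementation. The names `MessageCenter`, `runCommand`, and the task name `project:sync` are hypothetical, and the current Node binary stands in for the real Node-fscrawler command, whose arguments the article does not give.

```javascript
const { EventEmitter } = require('node:events');
const { spawn } = require('node:child_process');

// Runs a command in a child process and resolves with its exit code and stdout.
function runCommand(command, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    let stdout = '';
    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk) => { stdout += chunk; });
    child.on('error', reject); // e.g. the command does not exist
    child.on('close', (code) => resolve({ code, stdout }));
  });
}

// A tiny message center: controllers dispatch task names, handlers live elsewhere.
class MessageCenter extends EventEmitter {
  register(task, handler) {
    this.on(task, (payload, done) => {
      handler(payload).then((result) => done(null, result), (err) => done(err));
    });
  }

  dispatch(task, payload) {
    return new Promise((resolve, reject) => {
      if (this.listenerCount(task) === 0) {
        reject(new Error(`No handler registered for task: ${task}`));
        return;
      }
      this.emit(task, payload, (err, result) => (err ? reject(err) : resolve(result)));
    });
  }
}

const center = new MessageCenter();

// Stand-in for the crawler command: the current Node binary echoing the project name.
center.register('project:sync', ({ project }) =>
  runCommand(process.execPath, ['-e', `console.log('synced ' + ${JSON.stringify(project)})`]));

// What a controller would do: dispatch the task without knowing about spawn.
const pending = center.dispatch('project:sync', { project: 'shop-web' });
pending.then(({ code, stdout }) => console.log(code, stdout.trim()));
```

The controller only knows the task name and payload, so the way a sync actually runs (child process, queue, or something else) can change without touching controller code.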

Project Information Initialization: First obtain the access token of the front-end team's shared GitLab account. Then call gitbeaker's method for fetching all projects and store the results in the projects table in MySQL.
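As a rough illustration of this step, the sketch below describes one page of the underlying GitLab REST call for listing a group's projects and maps a returned project object to a row for the projects table. It shows the raw REST request rather than gitbeaker, to avoid guessing at the method names of a particular gitbeaker version. The host `gitlab.example.com`, the group ID, the token placeholder, and the column names are hypothetical; the GitLab response fields (`id`, `name`, `path_with_namespace`, `default_branch`, `web_url`) are real.

```javascript
// Describes one page of GitLab's "list a group's projects" REST call.
function listGroupProjectsRequest(baseUrl, groupId, page, token) {
  const url = new URL(`/api/v4/groups/${encodeURIComponent(groupId)}/projects`, baseUrl);
  url.searchParams.set('per_page', '100'); // GitLab's maximum page size
  url.searchParams.set('page', String(page));
  return { url: url.toString(), headers: { 'PRIVATE-TOKEN': token } };
}

// Picks the fields worth persisting from one GitLab project object.
// Column names on the left are hypothetical; GitLab fields on the right are real.
function toProjectRow(project) {
  return {
    gitlab_id: project.id,
    name: project.name,
    full_path: project.path_with_namespace,
    default_branch: project.default_branch,
    web_url: project.web_url,
  };
}

const firstPage = listGroupProjectsRequest('https://gitlab.example.com', 12, 1, '<token>');
console.log(firstPage.url);

// Hypothetical project object, trimmed to the fields used above.
const row = toProjectRow({
  id: 101,
  name: 'shop-web',
  path_with_namespace: 'fe/shop-web',
  default_branch: 'master',
  web_url: 'https://gitlab.example.com/fe/shop-web',
});
console.log(row);
```

Fetching every project means requesting pages 1, 2, 3, and so on until a page comes back with fewer than 100 entries, then inserting the mapped rows into MySQL.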

Project Information Entry: Projects that were missed during initialization, or that were added later, can be entered manually.

Future Prospects: 'Qianxun' was designed to make global code search more efficient and to reduce manual effort. Planned improvements include a higher search hit rate, support for exact-match search, and real-time synchronization of project files to ES.
