How StarRocks MCP Server Enables LLMs to Query Databases Without Custom Plugins

StarRocks MCP Server provides a universal adapter that lets large language models (LLMs) such as Claude, GPT, and Gemini execute SQL queries directly against StarRocks. By eliminating the need for bespoke plugins and complex prompt engineering, it simplifies data Q&A, intelligent analysis, and automated reporting.


Model Context Protocol (MCP)

Model Context Protocol (MCP) is a standardized communication protocol that defines how large language models (LLMs) exchange data, metadata, and context with external services. It specifies request/response formats, state tracking, multi‑turn dialogue identifiers, and supports dynamic context updates, enabling LLMs to interact with real‑world systems in a uniform way.
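To make the request/response format concrete: MCP messages are JSON-RPC 2.0 payloads. The sketch below builds a `tools/call` request such as an MCP client might send; the tool name and arguments are illustrative, not an exact transcript of the StarRocks server's traffic.

```python
import json

def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request for an MCP tools/call invocation."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

msg = build_tool_call(1, "read_query", {"query": "SELECT 1"})
print(msg)
```

The `id` field is what lets the protocol correlate responses with requests across a multi-turn dialogue.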

MCP Core Components

MCP Host: An LLM-based application (e.g., Claude Desktop) that receives user queries and presents model responses.

MCP Client: Handles connection logic, establishes a session with a specific MCP Server, and forwards requests and responses.

MCP Server: Exposes concrete capabilities such as file access, weather APIs, or database operations through a uniform interface.

StarRocks MCP Server Capabilities

Tools (executable functions the model can invoke):

db_overview: Returns an overview of all tables in a specified database.

table_overview: Provides column definitions, row count, and a few sample rows for a given table; supports cache refresh via the parameter refresh=true.

read_query: Executes read-only SQL statements (e.g., SELECT) and returns results in CSV format, including column headers.

write_query: Executes DDL/DML statements such as CREATE TABLE, INSERT, UPDATE, and DELETE, and returns execution metrics (affected rows, execution time, etc.).

query_and_plotly_chart: Runs a SQL query, feeds the result to a Plotly Express expression, and returns a Base64-encoded chart image.
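To illustrate the read_query contract (read-only statements in, CSV with a header row out), here is a minimal sketch that uses SQLite as a stand-in for StarRocks. The function name mirrors the tool described above, but this is not the server's actual implementation.

```python
import csv
import io
import sqlite3

def read_query(conn: sqlite3.Connection, sql: str) -> str:
    """Run a read-only query and return the result as CSV, headers included."""
    if not sql.lstrip().upper().startswith(("SELECT", "SHOW", "DESCRIBE")):
        raise ValueError("read_query only accepts read-only statements")
    cur = conn.execute(sql)
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(col[0] for col in cur.description)  # header row first
    writer.writerows(cur.fetchall())                    # then the data rows
    return buf.getvalue()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crashdata (BOROUGH TEXT, COLLISION_ID INT)")
conn.execute("INSERT INTO crashdata VALUES ('BROOKLYN', 1), ('QUEENS', 2)")
print(read_query(conn, "SELECT COUNT(*) AS n FROM crashdata"))
```

Returning CSV rather than a binary result set keeps the payload directly readable by the model, which is why the header row matters: it carries the column names the LLM needs to interpret the values.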

Resources (read-only contextual data):

starrocks:///databases: Lists all database names in the StarRocks cluster.

starrocks:///{db}/tables: Lists all tables in the specified database.

starrocks:///{db}/{table}/schema: Returns the SHOW CREATE TABLE statement for the given table.

proc:///{+path}: Exposes internal /proc paths (e.g., /proc/backends) to retrieve runtime metadata such as node status, transaction info, and job details.
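The resource URIs above follow a simple scheme://path layout, so a server can dispatch on them with ordinary URL parsing. The sketch below is a hypothetical router for the four URI shapes listed; the handler names are illustrative, not taken from the project's source.

```python
from urllib.parse import urlsplit

def route_resource(uri: str) -> tuple[str, dict]:
    """Map an MCP resource URI to a handler name plus extracted parameters."""
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]
    if parts.scheme == "proc":
        # proc:///{+path} passes the whole remaining path through
        return "proc_path", {"path": "/" + "/".join(segments)}
    if parts.scheme == "starrocks":
        if segments == ["databases"]:
            return "list_databases", {}
        if len(segments) == 2 and segments[1] == "tables":
            return "list_tables", {"db": segments[0]}
        if len(segments) == 3 and segments[2] == "schema":
            return "show_create_table", {"db": segments[0], "table": segments[1]}
    raise ValueError(f"unrecognized resource URI: {uri}")

print(route_resource("starrocks:///quickstart/crashdata/schema"))
```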

Prompts: None are currently defined for the StarRocks MCP Server; the protocol allows custom prompt templates to be added later to guide model behavior.

Typical Application Scenarios

Real‑time data analysis and AI‑enhanced decision making (e.g., financial risk monitoring, e‑commerce personalization).

Automated reporting and BI augmentation (LLM generates SQL, performs aggregations, and returns visual dashboards).

Complex query optimization (AI assists in rewriting multi‑table joins and heavy aggregations).

Federated data lake queries (MCP enables combined queries across StarRocks and external sources such as Hive or Elasticsearch).

Quick Start Guide

1. Deploy a StarRocks cluster (e.g., on localhost) and create a sample table.

CREATE TABLE IF NOT EXISTS crashdata (
    CRASH_DATE DATETIME,
    BOROUGH STRING,
    ZIP_CODE STRING,
    LATITUDE INT,
    LONGITUDE INT,
    LOCATION STRING,
    ON_STREET_NAME STRING,
    CROSS_STREET_NAME STRING,
    OFF_STREET_NAME STRING,
    CONTRIBUTING_FACTOR_VEHICLE_1 STRING,
    CONTRIBUTING_FACTOR_VEHICLE_2 STRING,
    COLLISION_ID INT,
    VEHICLE_TYPE_CODE_1 STRING,
    VEHICLE_TYPE_CODE_2 STRING
);

2. Load data into the table using the StarRocks stream‑load API.

curl --location-trusted -u root \
    -T ./NYPD_Crash_Data.csv \
    -H "label:crashdata-0" \
    -H "column_separator:," \
    -H "skip_header:1" \
    -H "enclose:\"" \
    -H "max_filter_ratio:1" \
    -H "columns:tmp_CRASH_DATE, tmp_CRASH_TIME, CRASH_DATE=str_to_date(concat_ws(' ', tmp_CRASH_DATE, tmp_CRASH_TIME), '%m/%d/%Y %H:%i'),BOROUGH,ZIP_CODE,LATITUDE,LONGITUDE,LOCATION,ON_STREET_NAME,CROSS_STREET_NAME,OFF_STREET_NAME,NUMBER_OF_PERSONS_INJURED,NUMBER_OF_PERSONS_KILLED,NUMBER_OF_P..." \
    -XPUT http://localhost:8030/api/quickstart/crashdata/_stream_load

3. Start an MCP client (e.g., DeepChat, Claude Desktop, or Cline). Install the mcp-server-starrocks plugin/extension and configure it with the StarRocks connection details: host, MySQL-protocol query port (9030 by default), user, and password. Note that this is not the HTTP port (8030) used for stream load above.

4. Issue MCP tool calls from the client, for example:

mcpserver db_overview db_name=mydb
mcpserver read_query "SELECT COUNT(*) FROM crashdata"

The model will receive the result in CSV format, which can be further processed or visualized.
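Because the result arrives as CSV with a header row, the client side can turn it into structured records with the standard csv module. The sample payload below is illustrative, not actual query output.

```python
import csv
import io

def parse_result(csv_text: str) -> list[dict]:
    """Parse a CSV tool result (header row included) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))

rows = parse_result("BOROUGH,collisions\nBROOKLYN,2\nQUEENS,1\n")
print(rows)
```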

Note: Ensure the chosen client supports returning tables and images, as some tools may only display plain text.

Project Repository

Source code and releases are available at https://github.com/StarRocks/mcp-server-starrocks

Tags: SQL, AI agents, LLM, MCP, StarRocks, data analytics
Written by StarRocks

StarRocks is an open‑source project under the Linux Foundation, focused on building a high‑performance, scalable analytical database that enables enterprises to create an efficient, unified lake‑house paradigm. It is widely used across many industries worldwide, helping numerous companies enhance their data analytics capabilities.
