How Spring Boot 3 + Apache Calcite Eliminates Cross‑Database Chaos with a Unified Query Layer
This article explains why multi‑source data fragmentation hurts service code, introduces Apache Calcite as a query‑abstraction layer that decouples data location from query logic, and provides step‑by‑step Spring Boot 3 integration, practical code samples, common pitfalls, performance tips, and real‑world scenarios.
Problem Statement
When an application stores orders in MySQL, inventory in PostgreSQL, user profiles in MongoDB, and logs in CSV/Hive/Kafka, developers must write a separate DAO layer for each source, merge results by hand, and maintain duplicated logic across the service layer.
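To make the cost concrete, here is a minimal sketch of the hand-written merge the service layer ends up owning when orders and users come from different DAOs. The `Order`, `User`, and `UserOrder` types and the sample data are hypothetical stand-ins, not part of the original project.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ManualMergeSketch {
    // Hypothetical row types standing in for what each DAO would return.
    record Order(String orderId, String userId, BigDecimal amount) {}
    record User(String userId, String userName) {}
    record UserOrder(String orderId, String userName, BigDecimal amount) {}

    /** Hand-written equivalent of an inner join between orders and users on userId. */
    static List<UserOrder> merge(List<Order> orders, List<User> users) {
        // Index users by id so each order needs a single lookup.
        Map<String, User> usersById = new HashMap<>();
        for (User u : users) {
            usersById.put(u.userId(), u);
        }
        List<UserOrder> result = new ArrayList<>();
        for (Order o : orders) {
            User u = usersById.get(o.userId());
            if (u != null) { // orders without a matching user are dropped, as in INNER JOIN
                result.add(new UserOrder(o.orderId(), u.userName(), o.amount()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order("o-1", "u-1", new BigDecimal("19.90")),
                new Order("o-2", "u-2", new BigDecimal("5.00")),
                new Order("o-3", "u-9", new BigDecimal("1.00"))); // no such user
        List<User> users = List.of(new User("u-1", "alice"), new User("u-2", "bob"));
        merge(orders, users).forEach(System.out::println);
    }
}
```

Every new pairing of sources needs another method like `merge`, which is the duplication a unified query layer is meant to remove.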
Why Apache Calcite?
Calcite is a query‑abstraction layer that sits above the query engine. Its core value is to completely decouple “where the data lives” from “how to query it”. It can treat any tabular source—relational databases, NoSQL stores, files, or streams—as a logical schema and query them with a single SQL dialect.
Key Capabilities
Acts as a logical database, unified SQL gateway, and cross‑source query compiler.
Supports standard SQL for all sources.
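Built‑in rule‑based and cost‑based optimizers; filters and projections are pushed down to sources whose adapters support it, and cross‑source joins are executed by Calcite itself.
Adapter plug‑in mechanism: the JDBC adapter covers MySQL, PostgreSQL, and other databases with a JDBC driver (including Hive), dedicated adapters exist for MongoDB, Kafka, and files, and custom adapters can be written for internal sources.
Lightweight and easy to embed; it imposes nothing Spring‑specific and can run inside Spring Boot 3 or as a standalone service.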
Integration Steps with Spring Boot 3
Maven Dependencies (version alignment is critical)
<!-- Calcite core (includes the JDBC adapter used to reach MySQL) -->
<dependency>
    <groupId>org.apache.calcite</groupId>
    <artifactId>calcite-core</artifactId>
    <version>1.36.0</version>
</dependency>
<!-- MySQL JDBC driver; Calcite ships no separate calcite-mysql artifact -->
<dependency>
    <groupId>com.mysql</groupId>
    <artifactId>mysql-connector-j</artifactId>
    <!-- version omitted on the assumption that Spring Boot's dependency management supplies it -->
</dependency>
<!-- MongoDB adapter (keep the version identical to calcite-core) -->
<dependency>
    <groupId>org.apache.calcite</groupId>
    <artifactId>calcite-mongodb</artifactId>
    <version>1.36.0</version>
</dependency>
<!-- MyBatis‑Plus starter for Spring Boot 3 -->
<dependency>
    <groupId>com.baomidou</groupId>
    <artifactId>mybatis-plus-spring-boot3-starter</artifactId>
    <version>3.5.5</version>
</dependency>
<!-- Druid connection pool -->
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>druid-spring-boot-starter</artifactId>
    <version>1.2.20</version>
</dependency>
Calcite Model File (the "blueprint" for unified data sources)
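Two details matter here. The JDBC schema factory reads the operands jdbcUrl, jdbcUser, jdbcPassword, and jdbcDriver. The MongoDB factory reads host and database only (the default port 27017 is assumed), and it exposes each collection as a table with a single _MAP column, so the model adds a user_mongo schema whose user_info view casts the needed fields into named columns. The field names come from the query used later; the VARCHAR lengths are assumptions.
{
  "version": "1.0",
  "defaultSchema": "ecommerce",
  "schemas": [
    {
      "name": "ecommerce",
      "type": "custom",
      "factory": "org.apache.calcite.adapter.jdbc.JdbcSchema$Factory",
      "operand": {
        "jdbcUrl": "jdbc:mysql://localhost:3306/ecommerce_order?useSSL=false&serverTimezone=UTC",
        "jdbcUser": "root",
        "jdbcPassword": "123456",
        "jdbcDriver": "com.mysql.cj.jdbc.Driver"
      }
    },
    {
      "name": "user_mongo_raw",
      "type": "custom",
      "factory": "org.apache.calcite.adapter.mongodb.MongoSchemaFactory",
      "operand": {
        "host": "localhost",
        "database": "user_db"
      }
    },
    {
      "name": "user_mongo",
      "tables": [
        {
          "name": "user_info",
          "type": "view",
          "sql": "select cast(_MAP['user_id'] as varchar(64)) as \"user_id\", cast(_MAP['user_name'] as varchar(128)) as \"user_name\", cast(_MAP['phone'] as varchar(32)) as \"phone\" from \"user_mongo_raw\".\"user_info\""
        }
      ]
    }
  ]
}
Spring Configuration (Calcite datasource + MyBatis‑Plus)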
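package com.icoderoad.config;

import com.baomidou.mybatisplus.annotation.DbType;
import com.baomidou.mybatisplus.extension.plugins.MybatisPlusInterceptor;
import com.baomidou.mybatisplus.extension.plugins.inner.PaginationInnerInterceptor;
import com.baomidou.mybatisplus.extension.spring.MybatisSqlSessionFactoryBean;
import org.mybatis.spring.annotation.MapperScan;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
import org.springframework.core.io.support.PathMatchingResourcePatternResolver;
import org.springframework.jdbc.datasource.DriverManagerDataSource;

import javax.sql.DataSource;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

@Configuration
@MapperScan("com.icoderoad.mapper")
public class CalciteMybatisConfig {

    @Bean
    public DataSource calciteDataSource() throws Exception {
        // Calcite's "model" property accepts a file path or an "inline:" document,
        // not a Spring "classpath:" URL, so read the JSON and pass it inline.
        String model;
        try (InputStream in = new ClassPathResource("calcite-model.json").getInputStream()) {
            model = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
        Properties props = new Properties();
        props.setProperty("model", "inline:" + model);
        // Keep identifier case as written and match names case-insensitively;
        // identifiers are quoted with double quotes.
        props.setProperty("lex", "MYSQL_ANSI");

        // CalciteConnection has no getDataSource(); wrap the JDBC URL instead.
        // This opens a new, unpooled Calcite connection per request for one.
        DriverManagerDataSource dataSource = new DriverManagerDataSource("jdbc:calcite:", props);
        dataSource.setDriverClassName("org.apache.calcite.jdbc.Driver");
        return dataSource;
    }

    @Bean
    public MybatisSqlSessionFactoryBean sqlSessionFactory(DataSource calciteDataSource,
                                                          MybatisPlusInterceptor interceptor) throws Exception {
        // BaseMapper support requires the MyBatis-Plus factory bean, not the plain MyBatis one.
        MybatisSqlSessionFactoryBean factory = new MybatisSqlSessionFactoryBean();
        factory.setDataSource(calciteDataSource);
        factory.setMapperLocations(new PathMatchingResourcePatternResolver()
                .getResources("classpath:mapper/*.xml"));
        factory.setPlugins(interceptor);
        return factory;
    }

    @Bean
    public MybatisPlusInterceptor mybatisPlusInterceptor() {
        MybatisPlusInterceptor interceptor = new MybatisPlusInterceptor();
        // The MySQL dialect emits "LIMIT offset, count", which Calcite's default parser rejects;
        // the PostgreSQL dialect emits "LIMIT count OFFSET offset", which it accepts.
        interceptor.addInnerInterceptor(new PaginationInnerInterceptor(DbType.POSTGRE_SQL));
        return interceptor;
    }
}
Data Transfer Object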
package com.icoderoad.model;

import lombok.Data;

import java.math.BigDecimal;

@Data
public class UserOrderVO {
    private String orderId;
    private String orderTime;
    private BigDecimal amount;
    private String userName;
    private String phone;
    private String userId;
}
Mapper (MyBatis‑Plus style)
package com.icoderoad.mapper;

import com.baomidou.mybatisplus.core.mapper.BaseMapper;
import com.icoderoad.model.UserOrderVO;
import org.apache.ibatis.annotations.Param;
import org.apache.ibatis.annotations.Select;

import java.util.List;

public interface UserOrderMapper extends BaseMapper<UserOrderVO> {

    // ORDER is a reserved word in SQL, so the table name is double-quoted for Calcite's parser.
    @Select("""
            SELECT
                o.order_id   AS orderId,
                o.order_time AS orderTime,
                o.amount,
                u.user_name  AS userName,
                u.phone,
                o.user_id    AS userId
            FROM ecommerce."order" o
            JOIN user_mongo.user_info u ON o.user_id = u.user_id
            WHERE o.user_id = #{userId}
            """)
    List<UserOrderVO> queryByUserId(@Param("userId") String userId);
}
Calcite handles "how to query" while MyBatis‑Plus handles "how to write code".
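Before wiring the query into MyBatis‑Plus, it can help to run it over plain JDBC. The sketch below assumes the Calcite and MySQL driver jars are on the classpath, that the model file path is passed as the first argument, and that the model exposes `ecommerce."order"` and `user_mongo.user_info` with the columns used here. The class name, the `lex` setting, and the double-quoted table name are choices made for this example, not something the original article specifies.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Properties;

class CalciteJdbcCheck {
    // Same join as the mapper, with a JDBC placeholder instead of #{userId}.
    static final String SQL = """
            SELECT o.order_id, o.amount, u.user_name
            FROM ecommerce."order" o
            JOIN user_mongo.user_info u ON o.user_id = u.user_id
            WHERE o.user_id = ?
            """;

    /** Builds Calcite connection properties from the model JSON text. */
    static Properties connectionProps(String modelJson) {
        Properties props = new Properties();
        props.setProperty("model", "inline:" + modelJson); // pass the model document inline
        props.setProperty("lex", "MYSQL_ANSI");            // assumption: keep case, quote with "
        return props;
    }

    /** Runs the federated query and prints one line per matching order. */
    static void printOrders(String modelJson, String userId) throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:calcite:", connectionProps(modelJson));
             PreparedStatement ps = conn.prepareStatement(SQL)) {
            ps.setString(1, userId);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.printf("%s %s %s%n",
                            rs.getString(1), rs.getBigDecimal(2), rs.getString(3));
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the model JSON, args[1]: user id to look up
        printOrders(Files.readString(Path.of(args[0])), args[1]);
    }
}
```

If this runs but the mapper does not, the problem is in the Spring or MyBatis wiring rather than in the Calcite model.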
Key Takeaways from the Example
SQL remains standard; no need to learn MongoDB query language.
The query is unaware of the underlying MongoDB source.
No aggregation code is required in the service layer.
Existing development habits stay unchanged.
High‑Frequency Scenarios for Calcite
Scenario 1 – Enterprise Data‑Center Integration
Orders, users, inventory are isolated in different systems.
Calcite provides a single query entry point.
Service‑layer complexity drops by more than 50%.
Scenario 2 – Real‑time + Batch Joint Analysis
Kafka supplies real‑time streams.
Hive stores historical data.
One SQL statement can drive both monitoring and decision‑making.
Scenario 3 – Files as Data Sources
CSV and JSON files can be queried directly through Calcite's file adapter; formats such as Excel or Parquet need a custom adapter.
No need to import files into a database.
Ideal for ad‑hoc analysis.
Common Pitfalls and How to Avoid Them
Adapter versions must match Calcite core version.
Maintain clear schema naming to prevent confusion.
Identify and handle the slowest data source to avoid bottlenecks.
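One simple way to find the slowest source is to time a small probe against each one. The sketch below uses only the JDK; the `SourceLatencyProbe` class, the source names, and the `Thread.sleep` calls standing in for real probe queries are all hypothetical. In practice each probe would run a cheap query against a single schema.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;

class SourceLatencyProbe {
    /** Runs each probe once, in order, and returns elapsed milliseconds per source name. */
    static Map<String, Long> measure(Map<String, Callable<?>> probes) throws Exception {
        Map<String, Long> elapsedMillis = new LinkedHashMap<>();
        for (Map.Entry<String, Callable<?>> probe : probes.entrySet()) {
            long start = System.nanoTime();
            probe.getValue().call();
            elapsedMillis.put(probe.getKey(), (System.nanoTime() - start) / 1_000_000);
        }
        return elapsedMillis;
    }

    /** Returns the name of the source with the largest elapsed time. */
    static String slowest(Map<String, Long> elapsedMillis) {
        return elapsedMillis.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElseThrow();
    }

    public static void main(String[] args) throws Exception {
        Map<String, Callable<?>> probes = new LinkedHashMap<>();
        // Stand-ins for real probe queries against each schema.
        probes.put("ecommerce", () -> { Thread.sleep(20); return null; });
        probes.put("user_mongo", () -> { Thread.sleep(120); return null; });

        Map<String, Long> elapsed = measure(probes);
        System.out.println(elapsed + " slowest=" + slowest(elapsed));
    }
}
```

A single measurement is noisy, so a real check would repeat each probe several times before deciding where to add caching or a tighter filter.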
Performance‑Optimization Suggestions
Enable metadata caching in Calcite.
Push down filters as much as possible.
For advanced use cases, implement a custom planner rule (a RelOptRule subclass in Calcite).
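The effect of filter pushdown can be illustrated without Calcite at all. The sketch below is a toy model, not Calcite's API: `CountingSource` is a hypothetical source that counts how many rows it hands to the caller, first when the caller filters and then when the source filters.

```java
import java.util.List;
import java.util.function.Predicate;

class PushdownSketch {
    record Row(String userId, int amount) {}

    /** Hypothetical remote source that counts how many rows it ships to the caller. */
    static class CountingSource {
        private final List<Row> rows;
        int rowsShipped = 0;

        CountingSource(List<Row> rows) {
            this.rows = rows;
        }

        /** No pushdown: every row is shipped and the caller filters afterwards. */
        List<Row> scanAll() {
            rowsShipped += rows.size();
            return rows;
        }

        /** Pushdown: the source applies the filter and ships only matching rows. */
        List<Row> scan(Predicate<Row> filter) {
            List<Row> hits = rows.stream().filter(filter).toList();
            rowsShipped += hits.size();
            return hits;
        }
    }

    public static void main(String[] args) {
        List<Row> data = List.of(new Row("u-1", 10), new Row("u-2", 20), new Row("u-1", 30));
        Predicate<Row> forUser = r -> r.userId().equals("u-1");

        CountingSource noPushdown = new CountingSource(data);
        List<Row> filteredByCaller = noPushdown.scanAll().stream().filter(forUser).toList();

        CountingSource withPushdown = new CountingSource(data);
        List<Row> filteredBySource = withPushdown.scan(forUser);

        System.out.println("same result: " + filteredByCaller.equals(filteredBySource));
        System.out.println("rows shipped without pushdown: " + noPushdown.rowsShipped);
        System.out.println("rows shipped with pushdown: " + withPushdown.rowsShipped);
    }
}
```

Both paths return the same rows, but the source ships three rows in the first case and two in the second. On real tables that gap is what makes a selective WHERE clause on the largest source so valuable.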
Conclusion
When a system is small, cross‑database queries are merely an inconvenience. As the system grows into a data platform with multiple sources, real‑time analysis, and a data lake, lacking a unified query layer becomes technical debt. Apache Calcite offers a clean, controllable architectural choice rather than a flashy new framework, and is worth a serious trial for anyone building a data‑centric backend.
LuTiao Programming