Essential Backend Infrastructure for Scalable Java Applications
This article outlines the critical backend components required for building robust Java services, covering API gateways, MVC/IOC/ORM frameworks, caching, databases, search engines, message queues, file storage, unified authentication, configuration, service governance, scheduling, logging, data pipelines, and monitoring strategies.
1.1 Backend Infrastructure
The purpose of using Java backend technologies is to build business applications that provide online or offline services. The essential backend technologies and infrastructure needed are illustrated in the diagram below.
The backend infrastructure refers to the key components or services required for stable online operation. While the described components can support long‑term business needs, other invisible system services such as load balancing, automated deployment, and security are not covered here.
1.1.1 Unified Request Entry – API Gateway
Mobile app backends typically need load balancing, API access control, and user authentication. A common approach uses Nginx for load balancing and implements access control and authentication within each service, but a more maintainable solution is to provide these as a shared library or, better yet, as a dedicated API gateway service (e.g., Kong, Netflix Zuul).
Because every request passes through the gateway, it can become a performance bottleneck. An alternative is to remove the gateway and let each business service directly consult a unified authentication center, caching authentication results to reduce load.
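The caching idea above can be sketched in plain Java. This is a minimal illustration rather than a production design: the class name `CachedTokenVerifier`, the `authCenter` function that stands in for the remote call to the authentication center, and the TTL value are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Caches token verification results so that not every request reaches the authentication center. */
class CachedTokenVerifier {
    private static final class Entry {
        final boolean valid;
        final long expiresAtMillis;

        Entry(boolean valid, long expiresAtMillis) {
            this.valid = valid;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Function<String, Boolean> authCenter; // hypothetical remote verification call
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so the expiry logic can be tested

    CachedTokenVerifier(Function<String, Boolean> authCenter, long ttlMillis, LongSupplier clock) {
        this.authCenter = authCenter;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    boolean isValid(String token) {
        long now = clock.getAsLong();
        Entry cached = cache.get(token);
        if (cached != null && cached.expiresAtMillis > now) {
            return cached.valid; // fresh entry: no remote call
        }
        boolean valid = authCenter.apply(token); // miss or expired: ask the authentication center
        cache.put(token, new Entry(valid, now + ttlMillis));
        return valid;
    }
}
```

The trade-off is staleness: a revoked token keeps passing until its cached entry expires, so the TTL should stay short.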
1.1.2 Business Applications and Backend Frameworks
Business applications are divided into online (high traffic, low tolerance for failure) and internal (lower traffic, higher data confidentiality). Typical Java backend frameworks include:
MVC frameworks : Spring MVC, Jersey, JFinal, WebX – provide a unified development process and hide low‑level details.
IOC frameworks : Spring – implements dependency injection.
ORM frameworks : MyBatis, Spring JDBC Template – abstract database access and give a single layer at which sharding, master-slave routing, etc. can be applied.
Cache frameworks : RedisTemplate, Jedis – unified access to Redis (Memcached needs its own client library).
JavaEE performance monitoring : JWebap (or custom extensions) – instrument request latency, JDBC, Redis calls.
These frameworks together form a basic backend application skeleton.
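To make the dependency injection point concrete, the sketch below wires two classes by hand through a constructor, which is the step an IoC container such as Spring automates. `UserRepository`, `InMemoryUserRepository`, and `UserService` are hypothetical names chosen for the example and do not come from any framework.

```java
import java.util.HashMap;
import java.util.Map;

/** The service depends on this abstraction, not on a concrete data source. */
interface UserRepository {
    String findName(int id);
}

/** One possible implementation; a real one would sit on top of the ORM layer. */
class InMemoryUserRepository implements UserRepository {
    private final Map<Integer, String> names = new HashMap<>();

    InMemoryUserRepository() {
        names.put(1, "alice");
    }

    @Override
    public String findName(int id) {
        return names.getOrDefault(id, "unknown");
    }
}

/** Receives its dependency from outside instead of constructing it itself. */
class UserService {
    private final UserRepository repository;

    UserService(UserRepository repository) {
        this.repository = repository;
    }

    String greet(int id) {
        return "Hello, " + repository.findName(id);
    }
}
```

Because `UserService` only sees the interface, a test or a different environment can pass in another implementation without touching the service code.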
1.1.3 Cache, Database, Search Engine, Message Queue
These four foundational services directly affect overall application performance.
Cache : Local memory cache, Memcached, Redis (most popular) – isolate hot data from the database.
Database : Relational (MySQL, PostgreSQL) and NoSQL (MongoDB, HBase) – primary persistence layer.
Search Engine : Solr, Elasticsearch (based on Lucene) – full‑text and multidimensional queries.
Message Queue : Kafka (high‑throughput, log‑oriented) or RabbitMQ (transactional reliability).
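How a cache keeps hot data off the database is often implemented with the cache-aside pattern: read the cache first, fall back to the database on a miss, then populate the cache. The sketch below assumes an in-process map in place of Redis and a hypothetical `database` lookup function; `CacheAsideReader` is an illustrative name.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Cache-aside reads: cache first, database on a miss, then fill the cache. */
class CacheAsideReader<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>(); // stand-in for Redis or Memcached
    private final Function<K, Optional<V>> database;           // hypothetical database lookup

    CacheAsideReader(Function<K, Optional<V>> database) {
        this.database = database;
    }

    Optional<V> get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return Optional.of(cached); // hit: the database is not touched
        }
        Optional<V> loaded = database.apply(key);
        loaded.ifPresent(value -> cache.put(key, value)); // only existing rows are cached
        return loaded;
    }

    /** Call after the row is updated so the next read reloads it. */
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

Since this sketch does not cache missing rows, repeated lookups of a nonexistent key still reach the database each time.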
1.1.4 File Storage
All services ultimately rely on reliable, fault-tolerant file storage. Solutions range from traditional RAID on a single machine, to network file sharing over NFS or Samba, to distributed file systems such as HDFS. When storage I/O becomes the bottleneck, moving to SSDs is usually the simplest upgrade.
1.1.5 Unified Authentication Center
Provides registration, login, token verification, internal user management, and app secret handling. Centralizing authentication simplifies user data sharing across services and enables single sign‑on for mobile apps.
1.1.6 Single Sign‑On System
Allows a user to log in once and access multiple applications. Open‑source solutions such as Apereo CAS can be customized for this purpose.
1.1.7 Unified Configuration Center
Manages configuration files (Properties, YAML, HOCON) centrally, supporting dynamic online updates, environment separation, and injection via annotations or XML. Open-source options include Baidu's Disconf and Ctrip's Apollo; systems of this kind often rely on ZooKeeper to notify clients of changes.
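Dynamic online updates usually come down to a client-side holder that accepts pushed changes and notifies interested code. The following is a minimal sketch of that idea only; `DynamicConfig` and `onRemoteChange` are hypothetical names and do not reflect the client API of Disconf or Apollo.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

/** Holds the current configuration and tells listeners when a value changes. */
class DynamicConfig {
    private final Map<String, String> values = new ConcurrentHashMap<>();
    private final List<BiConsumer<String, String>> listeners = new CopyOnWriteArrayList<>();

    void addListener(BiConsumer<String, String> listener) {
        listeners.add(listener);
    }

    String get(String key, String defaultValue) {
        return values.getOrDefault(key, defaultValue);
    }

    /** Assumed to be called by the config-center client when a change is pushed; newValue must not be null. */
    void onRemoteChange(String key, String newValue) {
        String previous = values.put(key, newValue);
        if (!newValue.equals(previous)) { // skip notifications when nothing actually changed
            for (BiConsumer<String, String> listener : listeners) {
                listener.accept(key, newValue);
            }
        }
    }
}
```

A service would register a listener for values it cannot simply re-read on each use, such as a connection pool size that has to be applied explicitly.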
1.1.8 Service Governance Framework
Internal service calls typically use RPC (RMI, Hessian, Thrift, Dubbo). A governance framework handles service registration, versioning, load balancing, traffic control, fault tolerance, and circuit breaking. Apache Dubbo and Netflix Eureka (service registry) combined with Ribbon (client-side load balancing) are popular implementations.
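Of the governance features listed, circuit breaking is the easiest to show in isolation. The sketch below is a deliberately simplified breaker, assuming a consecutive-failure threshold and a fixed open period; `SimpleCircuitBreaker` is an illustrative class and is not how Dubbo or any Netflix library implements it.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Opens after N consecutive failures and fails fast until the open period has passed. */
class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // injectable so the timing logic can be tested
    private int consecutiveFailures = 0;
    private long openedAt = -1; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    // Synchronized for simplicity; this serializes calls, which a real breaker would avoid.
    synchronized <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        long now = clock.getAsLong();
        if (openedAt >= 0 && now - openedAt < openMillis) {
            return fallback.get(); // open: do not touch the failing service
        }
        try {
            T result = remoteCall.get(); // closed, or a trial call after the open period
            consecutiveFailures = 0;
            openedAt = -1;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = now; // open, or re-open after a failed trial call
            }
            return fallback.get();
        }
    }
}
```

The point of failing fast is to stop callers from piling up on a service that is already struggling, which gives it room to recover.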
1.1.9 Unified Scheduling Center
Manages periodic tasks across the cluster, supporting Cron expressions, dynamic modification, sharding, workflow chaining, multiple task types (script, code, URL), logging, and alerting. Quartz is the common choice, usually through Spring's Quartz integration; it runs standalone or clustered, with clustering coordinated through its JDBC job store. Elastic-Job coordinates instances through ZooKeeper and adds job sharding and elastic resource utilization.
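Sharding a scheduled task means splitting it into numbered pieces and giving each running instance a subset. The sketch below uses a plain round-robin assignment to show the idea; `ShardAssigner` is a hypothetical helper and is not the strategy any particular scheduler ships with.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Distributes shard numbers 0..shardCount-1 across the live instances in round-robin order. */
class ShardAssigner {
    static Map<String, List<Integer>> assign(List<String> instances, int shardCount) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("at least one instance is required");
        }
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (String instance : instances) {
            result.put(instance, new ArrayList<>());
        }
        for (int shard = 0; shard < shardCount; shard++) {
            String owner = instances.get(shard % instances.size());
            result.get(owner).add(shard);
        }
        return result;
    }
}
```

Each instance then processes only the data belonging to its shards, for example rows where `id % shardCount` matches one of its numbers, and the assignment is recomputed when an instance joins or leaves.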
1.1.10 Unified Logging Service
Centralizes logs from all services via a dedicated log server. Implementations can extend Log4j or Logback with custom appenders and transmit logs via RPC.
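The custom-appender approach can be sketched without external dependencies by using `java.util.logging` as a stand-in for Log4j or Logback; the shape is the same, a hook that receives each log event and hands it to a transport. `ForwardingHandler`, the service name prefix, and the outbox queue are assumptions for illustration, and the sender that would drain the queue and ship lines to the log server is left out.

```java
import java.util.concurrent.BlockingQueue;
import java.util.logging.Handler;
import java.util.logging.LogRecord;

/** Queues each log line for a background sender instead of writing it locally. */
class ForwardingHandler extends Handler {
    private final String serviceName;
    private final BlockingQueue<String> outbox; // drained by a sender thread (not shown)

    ForwardingHandler(String serviceName, BlockingQueue<String> outbox) {
        this.serviceName = serviceName;
        this.outbox = outbox;
    }

    @Override
    public void publish(LogRecord record) {
        if (!isLoggable(record)) {
            return; // respects the handler's level and filter
        }
        String line = serviceName + " " + record.getLevel().getName() + " " + record.getMessage();
        outbox.offer(line); // offer() never blocks; a full bounded queue drops the line
    }

    @Override
    public void flush() {
        // nothing is buffered inside the handler itself
    }

    @Override
    public void close() {
        // the queue is owned by the caller
    }
}
```

Dropping lines when the queue is full is a deliberate choice here: application threads should not stall because the log server is slow.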
1.1.11 Data Infrastructure
Data has become a core asset. When data volume exceeds single‑machine capacity, big‑data technologies (Hadoop, Spark) become necessary. However, many workloads can be handled with MySQL plus occasional Hadoop resources (e.g., xx on Yarn).
Data Highway
Logs are collected (Scribe, Chukwa, Kafka, Flume, Logstash) and transmitted via a message queue (typically Kafka) to downstream processing. Sqoop or Alibaba’s Canal can synchronize database changes to data warehouses like Hive.
Offline Data Analysis
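Batch processing runs on Hadoop MapReduce or on Spark (deployed on YARN or Mesos). Hive and Spark SQL provide SQL-style interfaces. Data skew, where a few keys carry most of the records, must be addressed, because the tasks handling those keys otherwise dominate job run time.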
Real‑time Data Analysis
Storm, Spark Streaming, and Flink handle low‑latency requirements. Combining offline and real‑time pipelines (Lambda architecture) is common.
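In a Lambda architecture, a query is answered by combining the batch result with the real-time increment. The sketch below shows that merge for a page-view counter, assuming the real-time view only holds events that arrived after the last batch run; `LambdaServingLayer` and both maps are hypothetical stand-ins for the actual stores.

```java
import java.util.Map;

/** Answers a query by adding the batch view and the real-time view. */
class LambdaServingLayer {
    static long pageViews(String page, Map<String, Long> batchView, Map<String, Long> realtimeView) {
        long fromBatch = batchView.getOrDefault(page, 0L);       // complete up to the last batch run
        long fromRealtime = realtimeView.getOrDefault(page, 0L); // events since that run
        return fromBatch + fromRealtime;
    }
}
```

When the next batch run finishes, the real-time view is reset for the period the batch now covers, otherwise those events would be counted twice.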
Ad‑hoc Data Analysis
SQL‑based tools (Presto, Impala, Hive) enable analysts to query data directly; UI layers such as Hue can be added.
1.1.12 Fault Monitoring
Monitoring covers system metrics (CPU, memory, disk), collected with tools such as Nagios, Cacti, or Open-Falcon, and business metrics (PV, UV, transaction failures). Alerts should record the machine ID, be aggregated and prioritized, and can be delivered via email, IM, SMS, or WeChat. Effective incident response requires rapid log-driven diagnosis; centralized log analysis platforms (ELK) and distributed tracing systems (Zipkin, SkyWalking, Pinpoint, Spring Cloud Sleuth) are essential.
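Aggregating and prioritizing alerts can be sketched as grouping by machine and metric, counting repeats, and sorting by urgency. The classes below are hypothetical, and the convention that a lower priority number is more urgent is an assumption of the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** A single raw alert as emitted by a monitoring agent. */
class Alert {
    final String machineId;
    final String metric;
    final int priority; // assumption: 1 is the most urgent

    Alert(String machineId, String metric, int priority) {
        this.machineId = machineId;
        this.metric = metric;
        this.priority = priority;
    }
}

/** All alerts for one machine and metric, collapsed into one entry. */
class AlertGroup {
    final String machineId;
    final String metric;
    int priority;
    int count;

    AlertGroup(String machineId, String metric, int priority) {
        this.machineId = machineId;
        this.metric = metric;
        this.priority = priority;
    }
}

class AlertAggregator {
    static List<AlertGroup> aggregate(List<Alert> alerts) {
        Map<String, AlertGroup> groups = new LinkedHashMap<>();
        for (Alert alert : alerts) {
            String key = alert.machineId + "|" + alert.metric;
            AlertGroup group = groups.computeIfAbsent(
                    key, k -> new AlertGroup(alert.machineId, alert.metric, alert.priority));
            group.count++;
            group.priority = Math.min(group.priority, alert.priority); // keep the most urgent level
        }
        List<AlertGroup> result = new ArrayList<>(groups.values());
        result.sort(Comparator.comparingInt((AlertGroup g) -> g.priority)); // stable: ties keep arrival order
        return result;
    }
}
```

Sending one message per group, with the repeat count attached, keeps a flapping disk alert from burying a single critical database alert.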
