Essential Backend Infrastructure for Scalable Internet Services

This article outlines the critical backend components and services—such as API gateways, MVC/IOC/ORM frameworks, caching, databases, search engines, message queues, unified authentication, configuration management, service governance, scheduling, logging, and data processing pipelines—that together enable stable, high‑availability, and maintainable online applications.

21CTO

For an internet company, backend services are indispensable. Beyond business logic, the underlying infrastructure must ensure stability, reliability, easy maintenance, and high availability.

The essential backend building blocks include:

API Gateway

Mobile apps need load balancing, API access control, and user authentication. Typically Nginx handles load balancing, while each service implements its own access control and authentication. A unified API gateway (e.g., Kong) can centralize these functions, but because every request passes through it, it can become a performance bottleneck. Some architectures therefore bypass the gateway and rely on a unified authentication center backed by a cache.
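The three gateway responsibilities named above (authentication, access control, load balancing) can be sketched as a single request pipeline. This is a minimal in-memory illustration, not how Kong or Nginx work internally; the `Gateway` class, the token table, the ACL layout, and the backend names are all hypothetical.

```python
import itertools

class Gateway:
    """Toy API gateway: authenticate, authorize, then route round-robin."""

    def __init__(self, tokens, acl, backends):
        self.tokens = tokens  # token -> user id (hypothetical token store)
        self.acl = acl        # user id -> set of allowed route prefixes
        # One endless round-robin iterator per route prefix.
        self.backends = {
            prefix: itertools.cycle(hosts) for prefix, hosts in backends.items()
        }

    def handle(self, token, path):
        """Return (status code, chosen backend or None)."""
        user = self.tokens.get(token)
        if user is None:
            return 401, None  # unknown token: not authenticated
        prefix = next((p for p in self.backends if path.startswith(p)), None)
        if prefix is None:
            return 404, None  # no route registered for this path
        if prefix not in self.acl.get(user, set()):
            return 403, None  # authenticated but not allowed here
        return 200, next(self.backends[prefix])

gateway = Gateway(
    tokens={"t-alice": "alice"},
    acl={"alice": {"/orders"}},
    backends={"/orders": ["orders-1", "orders-2"], "/admin": ["admin-1"]},
)
```

Successive calls such as `gateway.handle("t-alice", "/orders/1")` alternate between `orders-1` and `orders-2`, which is the simplest form of load balancing; a production gateway would add health checks and weighted routing.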

Business Applications and Backend Frameworks

Business apps are divided into online services (high traffic, high availability) and internal systems (lower traffic, stricter confidentiality). For Java backends, common building blocks include MVC frameworks (Spring MVC, Jersey, JFinal), IoC containers (Spring), persistence layers (MyBatis, JdbcTemplate), cache stores (Redis, Memcached), and performance monitoring (jwebap).

Cache, Database, Search Engine, Message Queue

Cache: local (Guava, ConcurrentHashMap) or distributed (Redis, optionally sharded through a proxy such as Codis or Twemproxy), each with eviction and expiration strategies.
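To make "eviction and expiration" concrete, here is a minimal sketch of a local cache that combines least-recently-used eviction with a per-entry time-to-live. It is an illustration only and does not mirror the Guava or Redis APIs; the `LruTtlCache` name and the injectable `clock` parameter are assumptions made for the example.

```python
import time
from collections import OrderedDict

class LruTtlCache:
    """Local cache with LRU eviction and a fixed TTL per entry."""

    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock           # injectable so expiry can be tested
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._items[key]     # expiration: drop stale entries lazily
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (self.clock() + self.ttl, value)
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # eviction: drop the LRU entry
```

Expiration bounds how stale a value can get, while eviction bounds memory use; the two are independent knobs, which is why both appear in most cache configurations.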

Database: in-memory (Redis, H2) and disk-based (MySQL, PostgreSQL, HBase) stores, covering both relational and NoSQL models (key-value, document, column-family).

Search Engine: Solr or Elasticsearch for keyword‑based content search.

Message Queue: asynchronous communication via ActiveMQ, RabbitMQ, Kafka, ZeroMQ, etc., providing decoupling, eventual consistency, broadcasting, and flow control.
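The decoupling, broadcasting, and flow-control properties listed for message queues can be shown with a small in-process stand-in. This sketch uses Python's standard `queue` module rather than a real broker such as Kafka or RabbitMQ; the `Broker` class, the topic name, and the subscriber names are hypothetical.

```python
import queue

class Broker:
    """In-process stand-in for a message broker with bounded subscriber queues."""

    def __init__(self, max_pending):
        self.max_pending = max_pending
        self._subscribers = {}  # topic -> {subscriber name: Queue}

    def subscribe(self, topic, name):
        q = queue.Queue(maxsize=self.max_pending)
        self._subscribers.setdefault(topic, {})[name] = q
        return q

    def publish(self, topic, message):
        """Broadcast to every subscriber; return names whose queue was full."""
        rejected = []
        for name, q in self._subscribers.get(topic, {}).items():
            try:
                q.put_nowait(message)   # the producer never waits on a consumer
            except queue.Full:
                rejected.append(name)   # flow control: signal backpressure
        return rejected

broker = Broker(max_pending=2)
billing = broker.subscribe("order.created", "billing")
email = broker.subscribe("order.created", "email")
```

The publisher only knows the topic, not who consumes it, which is the decoupling; the bounded queue is what keeps a slow consumer from exhausting memory. What to do with a rejected message (retry, drop, or spill to disk) is a policy decision this sketch leaves open.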

File Storage

Reliable, fault-tolerant storage is needed; solutions range from RAID on a single machine to distributed file systems like HDFS, or shared network storage over NFS or Samba. SSDs can alleviate I/O performance bottlenecks.

Unified Authentication Center

Provides registration, login, token validation, internal user management, and app secret handling, enabling single sign‑on across multiple applications.
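Token validation with an app secret can be sketched as an HMAC-signed token that any service holding the secret can verify without calling back to the authentication center. This is one common approach, assumed here for illustration; the article does not say which token scheme is used. The token layout and the `issue_token` and `validate_token` functions are hypothetical.

```python
import base64
import hashlib
import hmac
import time

def issue_token(user_id, secret, ttl_seconds, now=None):
    """Sign 'user_id:expiry' with the app secret (hypothetical token layout)."""
    now = time.time() if now is None else now
    payload = f"{user_id}:{int(now + ttl_seconds)}".encode()
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "."
            + base64.urlsafe_b64encode(signature).decode())

def validate_token(token, secret, now=None):
    """Return the user id if the token is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(signature, expected):
        return None
    user_id, expires_at = payload.decode().rsplit(":", 1)
    if now >= int(expires_at):
        return None  # expired
    return user_id
```

Because the signature is checked before the payload is parsed, a forged or tampered token is rejected without trusting any of its contents. Revocation before expiry would still need a server-side lookup, which is where the cached authentication center comes in.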

Single Sign‑On System

CAS or Kisso can be used to implement SSO for web and internal systems.

Unified Configuration Center

Manages configuration files (properties, YAML) centrally, supporting dynamic updates, per-environment separation, and easy integration via annotations or XML (e.g., Disconf or ZooKeeper-based solutions).
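Dynamic updates and environment separation come down to two ideas: keys are scoped by environment, and clients register callbacks that fire when a value changes. The sketch below shows both in memory; it does not reproduce the Disconf or ZooKeeper APIs, and the `ConfigCenter` class and key names are hypothetical.

```python
class ConfigCenter:
    """In-memory stand-in for a configuration center with change listeners."""

    def __init__(self):
        self._values = {}     # (environment, key) -> value
        self._listeners = {}  # (environment, key) -> list of callbacks

    def get(self, env, key, default=None):
        return self._values.get((env, key), default)

    def watch(self, env, key, callback):
        """Register a callback invoked with the new value on each change."""
        self._listeners.setdefault((env, key), []).append(callback)

    def set(self, env, key, value):
        old = self._values.get((env, key))
        self._values[(env, key)] = value
        if old != value:  # notify only on a real change
            for callback in self._listeners.get((env, key), []):
                callback(value)
```

A service would call `watch("prod", "db.pool.size", resize_pool)` once at startup and then react to pushes, instead of re-reading a file or restarting. In a real deployment the notification travels over the network (for example via ZooKeeper watches), but the contract seen by application code is similar.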

Service Governance Framework

Handles service registration, consumer management, versioning, load balancing, traffic control, fault tolerance, and circuit breaking (e.g., Dubbo, Dubbox).
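Of the governance features listed, circuit breaking is the easiest to misread, so a minimal sketch may help. After a number of consecutive failures the breaker "opens" and rejects calls immediately; once a cool-down has passed it lets one trial call through. This is the general pattern only, not Dubbo's or Hystrix's implementation, and the class name, thresholds, and injectable `clock` are assumptions.

```python
import time

class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""

class CircuitBreaker:
    def __init__(self, failure_threshold, reset_seconds, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_seconds = reset_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_seconds:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: half-open, let this one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Failing fast protects the caller's threads from piling up behind a dead dependency, and gives the dependency room to recover instead of being hammered with retries.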

Unified Scheduling Center

Manages cron-based tasks, dynamic schedule changes, workflows, execution of scripts, code, or URLs, plus logging and alerting (e.g., Azkaban, Oozie, Quartz, Elastic-Job).
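At the core of any cron-based scheduler is a check of whether an expression matches the current time. The sketch below assumes the standard five-field Unix cron layout (minute, hour, day of month, month, day of week) and supports only `*`, `*/n`, single numbers, and comma lists. Quartz uses a different, longer expression format, and real cron treats the two day fields with OR semantics when both are restricted; both are out of scope here. The job names are hypothetical.

```python
from datetime import datetime

def _field_matches(spec, value, minimum=0):
    """Match one cron field: '*', '*/n', a number, or a comma-separated mix."""
    for part in spec.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            # Steps count from the field's minimum (0 for minutes, 1 for days).
            if (value - minimum) % int(part[2:]) == 0:
                return True
        elif int(part) == value:
            return True
    return False

def cron_matches(expression, when):
    """Check a five-field cron expression against a datetime."""
    minute, hour, day, month, weekday = expression.split()
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(day, when.day, minimum=1)
        and _field_matches(month, when.month, minimum=1)
        # Cron counts Sunday as 0; isoweekday() counts it as 7.
        and _field_matches(weekday, when.isoweekday() % 7)
    )

def due_jobs(jobs, when):
    """Return the sorted names of jobs whose schedule matches `when`."""
    return sorted(name for name, expr in jobs.items() if cron_matches(expr, when))
```

Keeping the schedule as data (a name mapped to an expression) is what makes dynamic modification possible: the scheduling center can change an entry at runtime without redeploying the job.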

Unified Logging Service

Collects logs from all services via log4j/logback appenders and forwards them to a central log server.
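The appender idea carries over to other logging libraries: a pluggable component receives each log record and ships it elsewhere. As an analogue of a log4j/logback appender, here is a sketch using Python's standard `logging.Handler`. The `CentralLogHandler` class, the record layout, and the `send_batch` callable (standing in for the network hop to the log server) are hypothetical.

```python
import logging

class CentralLogHandler(logging.Handler):
    """Buffers records and ships them in batches to a central collector."""

    def __init__(self, service, send_batch, batch_size=100):
        super().__init__()
        self.service = service
        self.send_batch = send_batch  # callable standing in for the network send
        self.batch_size = batch_size
        self._buffer = []

    def emit(self, record):
        self._buffer.append({
            "service": self.service,  # lets the server tell sources apart
            "level": record.levelname,
            "message": record.getMessage(),
        })
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self.send_batch(batch)
```

Batching trades a little latency for far fewer network round trips. Application code keeps calling `logger.info(...)` as usual, so switching from local files to central collection is a configuration change rather than a code change.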

Data Infrastructure

Data pipelines include log collection (Scribe, Flume, Kafka), transport, and storage. Synchronization between databases and data warehouses can use Sqoop or Canal.

Data Highway

Logs flow through collection, transport, and storage, forming the basis for downstream analytics.

Offline Data Analysis

Batch processing with Hadoop or Spark, queried through Hive or Spark SQL; data skew and job performance are the main tuning concerns.

Real‑Time Data Analysis

Streaming with Storm or Spark Streaming, handling concurrency and windowed writes; often combined with offline processing (Lambda architecture).
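"Windowed writes" means accumulating results in memory and writing once per time window instead of once per event, which keeps the write load on the downstream store manageable. The sketch below shows a tumbling-window counter in plain Python; it is not the Storm or Spark Streaming API, it assumes events arrive roughly in timestamp order, and the `WindowedWriter` class and `sink` callable are hypothetical.

```python
class WindowedWriter:
    """Counts events per key and writes one aggregate per tumbling window."""

    def __init__(self, window_seconds, sink):
        self.window_seconds = window_seconds
        self.sink = sink  # callable(window_start, counts), e.g. a DB write
        self.current_window = None
        self.counts = {}

    def add(self, timestamp, key):
        window = int(timestamp // self.window_seconds) * self.window_seconds
        if self.current_window is not None and window > self.current_window:
            self.flush()  # the previous window has closed: write it out once
        if self.current_window is None:
            self.current_window = window
        # Simplification: a late event is counted in the current window.
        self.counts[key] = self.counts.get(key, 0) + 1

    def flush(self):
        if self.counts:
            self.sink(self.current_window, self.counts)
        self.current_window = None
        self.counts = {}
```

With a 60-second window, a thousand page-view events in one minute become a single write. Handling late or out-of-order events properly is what stream frameworks add on top of this basic idea.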

Ad‑Hoc Data Analysis

Provides SQL‑based query tools (Presto, Impala, Hive) and UI (Hue) for analysts and product managers.

Fault Monitoring

System monitoring (CPU, memory, I/O) via Nagios, Cacti, or Open-Falcon.

Business monitoring (PV, UV, transaction failures) via custom instrumentation.

Alerting via email, IM, SMS, or WeChat with aggregation and severity levels.

Log aggregation platforms (ELK) and distributed tracing (Dapper, Mercury, EagleEye) aid rapid fault diagnosis.
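The alert aggregation and severity levels mentioned above can be sketched as a grouping step followed by channel routing, so that a burst of identical alerts produces one notification rather than many. The alert fields, the `SEVERITY_CHANNELS` mapping, and the summary format are hypothetical choices for illustration.

```python
from collections import defaultdict

# Hypothetical routing: more severe alerts use more intrusive channels.
SEVERITY_CHANNELS = {
    "critical": ["sms", "im"],
    "warning": ["im"],
    "info": ["email"],
}

def aggregate_alerts(alerts):
    """Group raw alerts by (service, check, severity); one notification each."""
    groups = defaultdict(list)
    for alert in alerts:
        key = (alert["service"], alert["check"], alert["severity"])
        groups[key].append(alert)
    notifications = []
    for (service, check, severity), items in groups.items():
        notifications.append({
            "summary": f"[{severity}] {service}/{check} x{len(items)}",
            "channels": SEVERITY_CHANNELS[severity],
        })
    return notifications
```

Aggregation matters most during an outage, when a single root cause can fire hundreds of alerts; collapsing them keeps the on-call engineer's phone usable.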

Netflix Components

Open‑source tools such as Zuul (API gateway), Eureka (service discovery), and Hystrix (circuit breaking) form the basis of many Spring Cloud solutions, providing authentication, routing, load balancing, rate limiting, and resilience.

These choices reflect the author's practical experience; feedback is welcome.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

21CTO
