
Website Performance Metrics and Optimization Strategies

This article explains key website performance metrics such as response time, concurrency, and throughput, presents typical values for various operations, and outlines practical optimization strategies for front‑end, application‑server, and storage layers, including caching, CDN, reverse proxy, clustering, and code improvements.

Architecture Digest

Website performance is an objective metric that can be expressed through technical indicators such as response time, throughput, concurrency, and performance counters.

1. Performance Test Metrics

1.1 Response Time

Response time refers to the time required for an application to complete an operation, i.e., the time from sending a request to receiving the response data. The table below lists common operation response times.

| Operation | Response Time |
| --- | --- |
| Open a website | Several seconds |
| Database query (indexed) | Tens of milliseconds |
| Mechanical disk single-seek positioning | 4 ms |
| Sequential read of 1 MB from a mechanical disk | 2 ms |
| Sequential read of 1 MB from an SSD | 0.3 ms |
| Read a value from a remote Redis cluster | 0.5 ms |
| Read 1 MB from memory | Tens of microseconds |
| Java native method call | A few microseconds |
| Network transfer of 2 KB | 1 microsecond |

In practice, response time is usually calculated as the average of multiple measurements.
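A practical way to obtain that average is to repeat the operation many times, measure the total elapsed time, and divide by the iteration count. A minimal sketch in Java (the timed operation here is just a placeholder):

```java
public class ResponseTimer {
    // Runs the operation `iterations` times and returns the mean latency per call
    // in milliseconds. Repeating amortizes timer overhead and smooths out jitter.
    public static double averageMillis(Runnable operation, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            operation.run();
        }
        long elapsed = System.nanoTime() - start;
        return elapsed / 1_000_000.0 / iterations; // nanoseconds -> ms per call
    }

    public static void main(String[] args) {
        // Placeholder workload; in a real test this would issue the actual request.
        double avg = averageMillis(() -> Math.sqrt(12345.678), 10_000);
        System.out.println("average response time: " + avg + " ms");
    }
}
```

In a real load test the `Runnable` would send the HTTP request or database query being measured, and warm-up iterations would typically be discarded first.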

1.2 Concurrency

Concurrency indicates the number of requests a system can handle simultaneously, reflecting its load performance. For a website, concurrency is the number of users submitting requests at the same time.

In general: total system users > online users > concurrent users.

1.3 Throughput

Throughput measures the number of requests processed by the system per unit of time, reflecting overall processing capability. For a website, it can be expressed as requests/second, pages/second, visitors/day, or transactions/hour.

TPS (transactions per second) is a common throughput metric. Other terms include HPS (HTTP requests per second) and QPS (queries per second).
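The arithmetic behind these metrics is the same in each case: requests processed divided by the elapsed time window. A small sketch:

```java
public class Throughput {
    // Throughput = processed requests / elapsed time, expressed here as
    // requests per second (the same formula yields TPS, HPS, or QPS depending
    // on what is being counted).
    public static double requestsPerSecond(long requestCount, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsed time must be positive");
        }
        return requestCount * 1000.0 / elapsedMillis;
    }

    public static void main(String[] args) {
        // 4,500 requests handled in a 30-second window -> 150 requests/second.
        System.out.println(requestsPerSecond(4_500, 30_000)); // prints 150.0
    }
}
```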

1.4 Performance Counters

Performance counters are OS‑level metrics such as system load, CPU usage, memory usage, and disk utilization.

2. Performance Optimization Strategies

Based on the layered architecture of a website, optimization can be divided into Web front‑end optimization, application‑server optimization, and storage‑server optimization.

2.1 Web Front‑End Optimization

2.1.1 Browser Access Optimization

Reduce the number of HTTP requests by merging CSS, JavaScript, and images.

Leverage browser caching; when static resources change, rename the files to force updates.

Enable page compression; text files can be compressed by over 80%.

Place CSS at the top of the page and JavaScript at the bottom.

Minimize Cookie transmission; consider using a separate domain for static assets.
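One common way to implement the "rename files to force updates" rule above is to embed a hash of the file's content in its name, so the name changes exactly when the content does. A sketch, assuming an MD5-based scheme (the class and method names are illustrative, not from any particular build tool):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class CacheBuster {
    // Derives a versioned filename from the resource's content,
    // e.g. app.css -> app.1a2b3c4d.css. Browsers can then cache the file
    // indefinitely; a content change produces a new name and a fresh fetch.
    public static String versionedName(String filename, byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(content);
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 4; i++) { // 8 hex chars is plenty for versioning
                hex.append(String.format("%02x", digest[i]));
            }
            int dot = filename.lastIndexOf('.');
            return filename.substring(0, dot) + "." + hex + filename.substring(dot);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(versionedName("app.css",
                "body{margin:0}".getBytes(StandardCharsets.UTF_8)));
    }
}
```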

2.1.2 CDN Acceleration

A CDN is essentially a cache deployed on servers close to users, typically caching static resources.

2.1.3 Reverse Proxy

In addition to security and load‑balancing functions, a reverse proxy can also cache dynamic resources.

2.2 Application Server Performance Optimization

The application server handles business logic. Common optimization techniques include caching, clustering, and asynchronous processing.

2.2.1 Distributed Cache

Caching stores data that is read frequently but changes rarely. A distributed cache deploys cache nodes across multiple servers to provide a unified cache service.

Two typical architectures are:

JBoss Cache – a synchronized distributed cache where updates are propagated to all nodes.

Memcached – a non‑communicating distributed cache where each node stores independent data.

JBoss Cache keeps identical data on all servers; when one server updates the cache, it notifies the others. This offers fast local reads but can become costly at large cluster sizes.

Large‑scale websites may need terabytes of cache memory; in such cases Memcached is preferred because each node can store different data without inter‑node communication.
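The Memcached model works because each key is routed to exactly one node, so nodes never exchange data and total capacity scales with the node count. A minimal routing sketch using simple modulo hashing (real clients typically use consistent hashing instead, so that adding or removing a node remaps only a fraction of the keys):

```java
import java.util.List;

public class CacheRouter {
    // Memcached-style routing: hash the key, pick one node. Every client
    // computes the same mapping, so no inter-node communication is needed.
    public static String nodeFor(String key, List<String> nodes) {
        int index = Math.floorMod(key.hashCode(), nodes.size());
        return nodes.get(index);
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("cache-0:11211", "cache-1:11211", "cache-2:11211");
        System.out.println("user:42 -> " + nodeFor("user:42", nodes));
    }
}
```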

2.2.2 Asynchronous Operations

To improve scalability, use message queues to make calls asynchronous.
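The essence of the pattern is that the caller enqueues the work and returns immediately, while a background consumer drains the queue. A minimal in-process sketch using a `BlockingQueue` (a production system would use an external message broker, and the mail-sending scenario here is just an illustration):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class AsyncMailer {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    public AsyncMailer() {
        // Background worker: drains the queue and performs the slow operation,
        // decoupling request latency from the work itself.
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String message = queue.take(); // blocks until work arrives
                    System.out.println("sending: " + message);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // Enqueues the task and returns immediately; the caller never waits
    // on the slow send.
    public boolean send(String message) {
        return queue.offer(message);
    }
}
```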

2.2.3 Using Clusters

Under high concurrency, employ load‑balancing to build a cluster of multiple servers, distributing requests across them.
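The simplest distribution strategy is round-robin: each incoming request goes to the next server in turn. A thread-safe sketch (server names are illustrative; real load balancers add health checks and weighting):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger counter = new AtomicInteger();

    public RoundRobinBalancer(List<String> servers) {
        this.servers = List.copyOf(servers);
    }

    // Picks the next server in rotation; AtomicInteger keeps the rotation
    // correct even when many request threads call next() concurrently.
    public String next() {
        int index = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        RoundRobinBalancer lb = new RoundRobinBalancer(List.of("web-1", "web-2", "web-3"));
        for (int i = 0; i < 4; i++) {
            System.out.println("request " + i + " -> " + lb.next());
        }
    }
}
```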

2.2.4 Code Optimization

Code optimization involves multithreading, resource reuse (object pools or singletons), data structures, and garbage‑collection tuning.
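As one example of the resource-reuse idea, an object pool hands out pre-built instances instead of allocating a new one per request, which also reduces garbage-collection pressure. A minimal generic sketch (non-blocking for simplicity; real pools usually block or grow when exhausted):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

public class ObjectPool<T> {
    private final BlockingQueue<T> pool;

    // Pre-creates `size` instances up front using the supplied factory.
    public ObjectPool(int size, Supplier<T> factory) {
        pool = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            pool.offer(factory.get());
        }
    }

    // Returns a pooled object, or null if the pool is currently exhausted.
    public T borrow() {
        return pool.poll();
    }

    // Returns an object to the pool so another caller can reuse it.
    public void release(T object) {
        pool.offer(object);
    }
}
```

Database connection pools and thread pools follow the same borrow/release pattern, with the added concern of validating resources before reuse.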

2.3 Storage Performance Optimization

Options for improving storage performance include distributed storage, OpenFiler, RAID arrays, and HDFS (the Hadoop Distributed File System).

Source: http://blog.csdn.net/chaofanwei/article/details/27168603

Copyright statement: Content is sourced from the internet; copyright belongs to the original author. We will indicate the author and source unless it cannot be confirmed. If there is any infringement, please let us know and we will delete it promptly.

-END-

Tags: Performance, Caching, Throughput, Web Optimization, Response Time
Written by Architecture Digest

Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.