Xiaohongshu’s Explosive Salaries and a Complete Backend Interview Guide

The article reports Xiaohongshu’s unusually high 2023 campus recruitment packages, with total annual compensation above 510,000 CNY (51w), and provides an extensive backend interview preparation guide covering TCP vs UDP differences, design patterns, workflow versus rule engines, message-queue selection, Redis data structures and eviction policies, HyperLogLog, and Bloom filters.

JavaGuide

TCP vs UDP

TCP is connection-oriented: it requires a three-way handshake before data transfer and a four-way FIN/ACK exchange to close the connection. UDP is connectionless and sends datagrams without any setup.

Reliability – TCP provides reliable delivery using sequence numbers, ACKs, retransmission, flow control and congestion control. UDP offers best‑effort delivery with no guarantees.

State – TCP maintains connection state (sequence numbers, windows). UDP is stateless, incurring lower overhead.

Transmission efficiency – TCP’s handshakes, ACKs and retransmissions add overhead, reducing efficiency. UDP has none of these mechanisms and a smaller header, so its per-packet overhead and latency are lower.

Transmission form – TCP treats data as a byte stream; UDP treats data as discrete messages (datagrams).

Header size – TCP header 20–60 bytes; UDP header fixed at 8 bytes.

Communication mode – TCP supports only point‑to‑point unicast. UDP supports unicast, multicast and broadcast.
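The header sizes listed above follow directly from the wire formats. As a minimal sketch (the function names are illustrative and not part of any library), the snippet below packs the four 16-bit UDP header fields with Python's struct module and derives the TCP header length from its 4-bit data-offset field, which counts 32-bit words:

```python
import struct


def build_udp_header(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Pack the four 16-bit UDP header fields in network byte order."""
    length = 8 + len(payload)  # header plus payload, in bytes
    checksum = 0               # 0 means "not computed" (permitted over IPv4)
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)


def tcp_header_length(data_offset_words: int) -> int:
    """Convert TCP's data-offset field (valid range 5..15) to bytes."""
    if not 5 <= data_offset_words <= 15:
        raise ValueError("data offset must be between 5 and 15")
    return data_offset_words * 4
```

A data offset of 5 corresponds to a header with no options, and 15 to the maximum amount of options, which is where the 20 to 60 byte range comes from.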

Why TCP is reliable

The application byte stream is split into segments, each sized to fit the connection’s maximum segment size (MSS), before being handed to the network layer.

Each segment carries a sequence number; the receiver can reorder out‑of‑order packets and discard duplicates.

TCP computes a checksum over the header and payload; a mismatched checksum causes the segment to be dropped without acknowledgment.

Retransmission mechanisms: timeout‑based retransmission, fast retransmit triggered by duplicate ACKs, selective acknowledgment (SACK) to indicate received ranges, and duplicate SACK (D‑SACK) to report duplicate receptions.

Flow control uses a sliding window; the receiver advertises how much buffer space is available, and the sender limits its sending rate accordingly.

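Congestion control adds a congestion window (cwnd) that the sender adjusts based on network feedback such as packet loss and duplicate ACKs. The amount of unacknowledged data in flight is limited to the smaller of cwnd and the receiver’s advertised window, so the sender overloads neither the network nor the receiver.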
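The receiver-side bookkeeping described above can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: segments never partially overlap, sequence numbers never wrap around, and the class and function names are invented for illustration. It buffers out-of-order segments, drops duplicates, delivers data in order, and returns a cumulative ACK.

```python
class InOrderReceiver:
    """Toy model of TCP receive-side reordering (not a real TCP stack)."""

    def __init__(self, initial_seq: int = 0):
        self.next_seq = initial_seq   # next byte we expect to deliver
        self.buffer = {}              # seq -> bytes waiting for a gap to fill
        self.delivered = bytearray()  # bytes handed to the application

    def on_segment(self, seq: int, data: bytes) -> int:
        # Anything below next_seq or already buffered is a duplicate.
        if seq >= self.next_seq and seq not in self.buffer:
            self.buffer[seq] = data
        # Deliver every segment that is now contiguous with the stream.
        while self.next_seq in self.buffer:
            chunk = self.buffer.pop(self.next_seq)
            self.delivered += chunk
            self.next_seq += len(chunk)
        return self.next_seq          # cumulative ACK number


def effective_send_window(rwnd: int, cwnd: int) -> int:
    """The sender may have at most min(rwnd, cwnd) unacknowledged bytes."""
    return min(rwnd, cwnd)
```

If a segment arrives ahead of a gap, the returned ACK does not advance; repeated ACKs of this kind are what a sender interprets as duplicate ACKs for fast retransmit.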

Design pattern: Chain of Responsibility

In an order‑processing scenario, the request passes through a chain of handlers: inventory check → risk control → payment validation. Each handler performs its check and either forwards the request to the next handler or aborts the chain with an error, decoupling the request sender from concrete processing logic.
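A minimal sketch of that order-processing chain might look like the following. The Order fields, the handler classes and the OrderRejected exception are all hypothetical names chosen for illustration; a real system would call inventory, risk and payment services instead of checking local values.

```python
from dataclasses import dataclass


@dataclass
class Order:
    item: str
    quantity: int
    amount: float
    flagged_by_risk: bool = False


class OrderRejected(Exception):
    """Raised by a handler to abort the chain."""


class Handler:
    def __init__(self):
        self._next = None

    def set_next(self, handler: "Handler") -> "Handler":
        self._next = handler
        return handler  # returning the argument allows fluent chaining

    def handle(self, order: Order) -> None:
        self.check(order)             # raises OrderRejected on failure
        if self._next is not None:
            self._next.handle(order)  # forward to the next handler

    def check(self, order: Order) -> None:
        raise NotImplementedError


class InventoryCheck(Handler):
    def __init__(self, stock: dict):
        super().__init__()
        self.stock = stock

    def check(self, order: Order) -> None:
        if self.stock.get(order.item, 0) < order.quantity:
            raise OrderRejected("insufficient stock")


class RiskControl(Handler):
    def check(self, order: Order) -> None:
        if order.flagged_by_risk:
            raise OrderRejected("blocked by risk control")


class PaymentValidation(Handler):
    def check(self, order: Order) -> None:
        if order.amount <= 0:
            raise OrderRejected("invalid payment amount")


def build_chain(stock: dict) -> Handler:
    head = InventoryCheck(stock)
    head.set_next(RiskControl()).set_next(PaymentValidation())
    return head
```

The caller only holds the head of the chain, so handlers can be added, removed or reordered in build_chain without touching the code that submits orders.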

Workflow vs. Rule Engine

A workflow abstracts a business process and its steps, defining the ordered execution of tasks and the rules governing transitions between them.

Example: a leave‑approval process defines nodes such as employee submission, manager review, and HR approval. A workflow engine executes these nodes, tracks state, and provides visual design tools, simplifying development and maintenance.

Rule engines extract decision logic from application code. Rules are expressed declaratively (e.g., decision tables or trees). Example: a loan‑approval rule engine evaluates credit score, income and other factors to decide approval, reducing code coupling and supporting frequent rule changes.

In practice the two are complementary: a workflow node can invoke a rule engine to decide the next path, combining process control with dynamic decision logic.
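To make the combination concrete, here is a toy sketch of the leave-approval example in which one workflow node delegates its routing decision to a tiny rule table. The three-day threshold, the node names and both functions are assumptions made for illustration; real workflow and rule engines offer far richer models.

```python
# Rule table: first matching condition wins. The 3-day threshold is a
# made-up business rule used only for this example.
LEAVE_RULES = [
    (lambda req: req["days"] > 3, "hr_approval"),
    (lambda req: True, "done"),
]


def decide_next(request: dict, rules=LEAVE_RULES) -> str:
    """Minimal 'rule engine': evaluate rules in order, return the outcome."""
    for condition, outcome in rules:
        if condition(request):
            return outcome
    raise LookupError("no rule matched")


# Workflow definition: each node maps to a function returning the next node.
WORKFLOW = {
    "employee_submission": lambda req: "manager_review",
    "manager_review": decide_next,  # routing is delegated to the rules
    "hr_approval": lambda req: "done",
}


def run_workflow(request: dict, start: str = "employee_submission") -> list:
    """Minimal 'workflow engine': walk nodes and record the path taken."""
    path, node = [], start
    while node != "done":
        path.append(node)
        node = WORKFLOW[node](request)
    path.append("done")
    return path
```

Changing the threshold only touches LEAVE_RULES, while adding an approval step only touches WORKFLOW, which is the separation of concerns the two engines aim for.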

Message Queues (MQ)

Using MQ provides three main benefits:

Asynchronous processing improves system performance by reducing response latency.

Traffic shaping / rate limiting smooths bursty traffic.

Decoupling reduces system coupling, making services easier to evolve.
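The asynchronous and decoupling benefits can be illustrated without a real broker. The sketch below uses Python's in-process queue.Queue as a stand-in for an MQ, which is an assumption for demonstration only: it has no persistence, no network and no delivery guarantees. The function name and message texts are likewise invented.

```python
import queue
import threading


def place_orders(order_ids):
    """Producer enqueues and returns quickly; a consumer works in the background."""
    mq = queue.Queue(maxsize=100)  # bounded queue absorbs short bursts
    notifications = []

    def consumer():
        while True:
            msg = mq.get()
            if msg is None:        # sentinel: no more messages
                break
            notifications.append(f"notified {msg}")  # slow side effect goes here

    worker = threading.Thread(target=consumer)
    worker.start()

    accepted = []
    for order_id in order_ids:
        mq.put(order_id)           # producer does not wait for the side effect
        accepted.append(order_id)

    mq.put(None)
    worker.join()                  # only so this demo can return the results
    return accepted, notifications
```

The producer knows nothing about what the consumer does with a message, so the notification logic can change or be scaled out independently.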

RocketMQ delay levels

RocketMQ 4.x supports 18 predefined delay levels. The mapping is:

Level 1 – 1 s

Level 2 – 5 s

Level 3 – 10 s

Level 4 – 30 s

Level 5 – 1 min

Level 6 – 2 min

Level 7 – 3 min

Level 8 – 4 min

Level 9 – 5 min

Level 10 – 6 min

Level 11 – 7 min

Level 12 – 8 min

Level 13 – 9 min

Level 14 – 10 min

Level 15 – 20 min

Level 16 – 30 min

Level 17 – 1 h

Level 18 – 2 h
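Because only these fixed levels are available in 4.x, application code usually has to translate a desired delay into a level. The helper below is a hypothetical sketch of that translation: the level string mirrors the table above, and the function names are not part of the RocketMQ client.

```python
# Mirrors the 18 levels listed above, in order.
DEFAULT_DELAY_LEVELS = "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h"
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_delay_levels(spec: str = DEFAULT_DELAY_LEVELS) -> dict:
    """Return {level: delay_in_seconds}, with levels numbered from 1."""
    return {
        level: int(token[:-1]) * _UNIT_SECONDS[token[-1]]
        for level, token in enumerate(spec.split(), start=1)
    }


def smallest_level_covering(seconds: int) -> int:
    """Pick the lowest level whose delay is at least the requested one."""
    levels = parse_delay_levels()
    for level in sorted(levels):
        if levels[level] >= seconds:
            return level
    raise ValueError("requested delay exceeds the largest level (2 h)")
```

For example, a 45-second delay cannot be expressed exactly and rounds up to level 5 (1 min), which illustrates the limitation that arbitrary scheduling removes.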

RocketMQ 5.0 adds a timer based on a time wheel, so scheduled messages can use arbitrary delays instead of being limited to the fixed levels.
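The general time-wheel idea can be sketched as a ring of slots that a pointer sweeps one tick at a time. The code below is a generic single-level wheel written for illustration; it does not reproduce RocketMQ's actual implementation, and the class and method names are assumptions.

```python
class TimingWheel:
    """Single-level timing wheel; one tick is one unit of time."""

    def __init__(self, num_slots: int = 60):
        self.slots = [[] for _ in range(num_slots)]
        self.current = 0  # slot the pointer is currently on

    def schedule(self, delay_ticks: int, message: str) -> None:
        if delay_ticks < 1:
            raise ValueError("delay must be at least one tick")
        n = len(self.slots)
        slot = (self.current + delay_ticks) % n
        rounds = (delay_ticks - 1) // n  # full revolutions to wait first
        self.slots[slot].append([rounds, message])

    def tick(self) -> list:
        """Advance one slot and return the messages that are now due."""
        self.current = (self.current + 1) % len(self.slots)
        due, waiting = [], []
        for entry in self.slots[self.current]:
            if entry[0] == 0:
                due.append(entry[1])
            else:
                entry[0] -= 1            # one revolution has passed
                waiting.append(entry)
        self.slots[self.current] = waiting
        return due
```

Scheduling and firing cost only depends on the number of entries in one slot, not on the total number of pending timers, which is why the structure suits large volumes of delayed messages.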

Other common message queues

Kafka – Distributed streaming platform (originally built for log processing). Supports publish/subscribe, durable storage, stream processing APIs, and the Raft-based KRaft mode, which removes the ZooKeeper dependency (KRaft is production-ready as of Kafka 3.3.1). Official site: http://kafka.apache.org/

RocketMQ – Cloud‑native messaging platform from Alibaba, Apache top‑level project. Features cloud‑native deployment, trillion‑level throughput, stream processing, financial‑grade reliability, minimal external dependencies, and a rich ecosystem. Site: https://rocketmq.apache.org/ Release notes: https://github.com/apache/rocketmq/releases

RabbitMQ – AMQP‑based broker written in Erlang. Provides persistence, acknowledgments, flexible routing via exchanges, clustering, mirrored queues for high availability, multi‑protocol support, and a user‑friendly management UI. Site: https://www.rabbitmq.com/ Release notes: https://www.rabbitmq.com/news.html

Pulsar – Cloud‑native distributed messaging system (originated at Yahoo). Offers multi‑tenant architecture, strong consistency, high throughput, low latency, separate compute and storage, seamless geo‑replication, built‑in functions (Pulsar Functions) and connectors (Pulsar IO). Site: https://pulsar.apache.org/ Release notes: https://github.com/apache/pulsar/releases

Redis

Reasons to use Redis:

In‑memory access makes reads dozens to hundreds of times faster than disk‑based databases.

A single Redis node can typically serve more than 50,000 QPS, whereas MySQL on comparable hardware handles roughly 4,000 QPS; a Redis cluster scales higher still.

Beyond caching, Redis provides distributed locks, rate limiting, message queues, delayed queues, etc.

Data structures

Redis offers five basic types – String, List, Set, Hash, Zset – and three special types – HyperLogLog, Bitmap, Geospatial. Underlying implementations include SDS, LinkedList, Dict, SkipList, Intset, ZipList, QuickList, and ListPack (ListPack replaces ZipList from Redis 7.0).

Mapping of types to internal structures:

String → SDS

List → LinkedList / ZipList / QuickList (Redis 3.2+ uses QuickList; Redis 7.0 replaces ZipList with ListPack)

Hash → Dict, ZipList

Set → Dict, Intset

Zset → ZipList, SkipList

Eviction policies

volatile-lru – Evicts the least recently used key among keys that have an expiration set.

volatile-ttl – Evicts the key closest to expiration among keys that have an expiration set.

volatile-random – Evicts a random key among keys that have an expiration set.

allkeys-lru – Evicts the least recently used key among all keys.

allkeys-random – Evicts a random key among all keys.

noeviction – Evicts nothing; write commands that need more memory return an error once the limit is reached.

volatile-lfu (Redis 4.0+) – Evicts the least frequently used key among keys that have an expiration set.

allkeys-lfu (Redis 4.0+) – Evicts the least frequently used key among all keys.
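To show what "least recently used" means in practice, here is a minimal exact-LRU cache built on collections.OrderedDict. It is a conceptual sketch only: Redis approximates LRU by sampling a few keys instead of maintaining an exact ordering, and the class below is not how Redis is implemented.

```python
from collections import OrderedDict


class LRUCache:
    """Exact LRU: the least recently accessed key is evicted first."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._data = OrderedDict()  # oldest access first, newest last

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def set(self, key, value) -> None:
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used
```

With capacity 2, writing a and b, reading a, and then writing c evicts b, because a was touched more recently.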

Configuration commands:

config get maxmemory
config get maxmemory-policy
config set maxmemory-policy <policy>

For detailed policy explanations see the Redis documentation at https://redis.io/docs/reference/eviction/.

HyperLogLog

HyperLogLog (HLL) is a probabilistic data structure that estimates the cardinality of a large set in a small, bounded amount of memory. In Redis each HLL key uses at most 12 KB: a compact sparse encoding for small cardinalities and a fixed 12 KB dense encoding for larger ones. The count is approximate, with a standard error of about 0.81 % in Redis.

Typical scenarios:

Website or app UV (unique visitor) counting.

Counting the distinct users who searched for a given keyword.

Social‑network interaction counts (e.g., number of distinct users who retweeted a post).

When the exact count is not required and memory efficiency is critical, HLL is the preferred solution.
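The estimation idea can be sketched in a few lines. The simplified version below uses 2^14 registers, the count that gives Redis its 12 KB dense encoding at 6 bits per register, but it stores them in a plain Python list, has no sparse encoding, and applies only the basic small-range correction. The class name and the choice of SHA-1 as the hash are assumptions for illustration.

```python
import hashlib
import math


class HyperLogLog:
    """Simplified HyperLogLog; each register keeps the maximum rank seen."""

    def __init__(self, p: int = 14):
        self.p = p            # number of index bits
        self.m = 1 << p       # number of registers
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")     # 64-bit hash value
        index = h >> (64 - self.p)                # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank is the 1-based position of the leftmost 1 bit in `rest`.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)     # bias constant for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw
```

Adding the same element twice never changes a register, which is why duplicates do not inflate the estimate and why memory stays fixed regardless of how many elements are added.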

Bloom Filter

A Bloom filter tests set membership with no false negatives and a configurable false‑positive rate. Elements are hashed by multiple functions; the corresponding bits in a bit array are set to 1. To query, the same hashes are computed; if any bit is 0 the element is definitely absent, otherwise it may be present.

Typical use case: quickly checking whether an element is absent from a massive collection (e.g., cache‑miss filtering, spam detection) before performing a more expensive lookup.
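The add and query steps described above can be sketched as follows. This is a minimal illustration, assuming a fixed-size bit array and deriving the k positions from one SHA-256 digest via double hashing; the class name, default sizes and hashing scheme are choices made for the example rather than a recommended configuration.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, item: str) -> list:
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step for spread
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

A typical caller checks might_contain first and only performs the expensive lookup when it returns True, accepting that a small fraction of those lookups will find nothing.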
