Xiaohongshu’s Explosive Salaries and a Complete Backend Interview Guide
The article reveals Xiaohongshu's unusually high 2023 campus recruitment packages, with total annual compensation above 510,000 RMB ("51w" in Chinese shorthand, where w stands for wan, i.e. 10,000). It also provides an extensive backend interview preparation guide covering TCP vs UDP differences, design patterns, workflow versus rule engines, message-queue selection, and Redis data structures and eviction policies.
TCP vs UDP
Connection – TCP is connection-oriented, requiring a three-way handshake before data transfer and a four-way handshake to close. UDP is connectionless and sends datagrams without any setup.
Reliability – TCP provides reliable delivery using sequence numbers, ACKs, retransmission, flow control and congestion control. UDP offers best‑effort delivery with no guarantees.
State – TCP maintains connection state (sequence numbers, windows). UDP is stateless, incurring lower overhead.
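Transmission efficiency – TCP's handshakes, ACKs and retransmissions add latency and per-connection overhead. UDP skips these mechanisms and carries only a small fixed header, so its overhead and latency are lower, at the cost of leaving loss handling to the application.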
Transmission form – TCP treats data as a byte stream; UDP treats data as discrete messages (datagrams).
Header size – TCP header 20–60 bytes; UDP header fixed at 8 bytes.
Communication mode – TCP supports only point‑to‑point unicast. UDP supports unicast, multicast and broadcast.
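The byte-stream versus datagram distinction can be observed on the loopback interface. The following is a minimal sketch using Python's standard socket module; the function names udp_messages and tcp_stream are illustrative, and the UDP half assumes loopback delivers datagrams in order without loss, which UDP itself does not guarantee.

```python
import socket


def udp_messages(payloads):
    """Send each payload as one datagram; return what each recvfrom() yields."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        rx.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
        rx.settimeout(2.0)
        for payload in payloads:
            tx.sendto(payload, rx.getsockname())   # no handshake, no connection
        # Each recvfrom() returns exactly one datagram: boundaries are preserved.
        return [rx.recvfrom(1024)[0] for _ in payloads]
    finally:
        tx.close()
        rx.close()


def tcp_stream(payloads):
    """Send payloads over one TCP connection; return everything read until EOF."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = None
    conn = None
    try:
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        server.settimeout(2.0)
        # connect() performs the three-way handshake before any data is sent.
        client = socket.create_connection(server.getsockname(), timeout=2.0)
        conn, _ = server.accept()
        conn.settimeout(2.0)
        for payload in payloads:
            client.sendall(payload)
        client.shutdown(socket.SHUT_WR)   # signal end of stream to the reader
        chunks = []
        while True:
            data = conn.recv(1024)
            if not data:                  # empty read means the peer closed
                break
            chunks.append(data)
        # TCP is a byte stream: the original message boundaries are gone.
        return b"".join(chunks)
    finally:
        for sock in (conn, client, server):
            if sock is not None:
                sock.close()
```

With payloads b"ab" and b"cd", the UDP receiver sees two separate messages, while the TCP receiver sees one continuous run of four bytes. An application that needs message boundaries over TCP has to add its own framing, for example a length prefix.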
Why TCP is reliable
Application data is split into segments sized to fit the maximum segment size (MSS) before being handed to the network layer.
Each segment carries a sequence number; the receiver can reorder out‑of‑order packets and discard duplicates.
TCP computes a checksum over the header and payload; a mismatched checksum causes the segment to be dropped without acknowledgment.
Retransmission mechanisms: timeout‑based retransmission, fast retransmit triggered by duplicate ACKs, selective acknowledgment (SACK) to indicate received ranges, and duplicate SACK (D‑SACK) to report duplicate receptions.
Flow control uses a sliding window; the receiver advertises how much buffer space is available, and the sender limits its sending rate accordingly.
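Congestion control adds a congestion window (cwnd) that the sender adjusts based on network feedback such as timeouts and duplicate ACKs. The amount of unacknowledged data in flight is capped by the smaller of cwnd and the receiver's advertised window, so the sender overloads neither the network nor the receiver.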
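Two of the mechanisms above can be illustrated with a small simulation. This is a simplified sketch, not how a real TCP stack is written: reassemble and effective_window are hypothetical helper names, segments are assumed not to overlap partially, and sequence-number wraparound is ignored.

```python
def reassemble(segments, initial_seq=0):
    """Rebuild the byte stream from (seq, data) pairs that may arrive
    out of order or more than once. Returns (stream, next_expected_seq)."""
    buffered = {}              # out-of-order segments waiting for a gap to fill
    next_seq = initial_seq     # next byte the receiver expects (the ACK value)
    stream = bytearray()
    for seq, data in segments:
        if seq < next_seq or seq in buffered:
            continue           # duplicate: already delivered or already buffered
        buffered[seq] = data
        # Deliver every segment that is now contiguous with the stream.
        while next_seq in buffered:
            chunk = buffered.pop(next_seq)
            stream += chunk
            next_seq += len(chunk)
    return bytes(stream), next_seq


def effective_window(rwnd, cwnd, bytes_in_flight):
    """Bytes the sender may still transmit: bounded by both the receiver's
    advertised window (flow control) and cwnd (congestion control)."""
    return max(0, min(rwnd, cwnd) - bytes_in_flight)
```

For example, feeding the segments (3, b"lo "), (0, b"hel"), (0, b"hel"), (6, b"wor") into reassemble yields the in-order stream b"hello wor" and a next expected sequence number of 9, with the duplicate silently dropped.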
Design pattern: Chain of Responsibility
In an order‑processing scenario, the request passes through a chain of handlers: inventory check → risk control → payment validation. Each handler performs its check and either forwards the request to the next handler or aborts the chain with an error, decoupling the request sender from concrete processing logic.
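The order-processing chain described above might be sketched as follows. All class names, the Order fields, the in-memory stock table, and the risk threshold of 80 are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Order:
    sku: str
    quantity: int
    amount: float
    user_risk_score: int       # hypothetical field: higher means riskier


class Handler:
    """Base handler: run its own check, then delegate to the next handler."""

    def __init__(self, next_handler: Optional["Handler"] = None):
        self.next_handler = next_handler

    def check(self, order: Order) -> Optional[str]:
        """Return an error message to abort the chain, or None to continue."""
        raise NotImplementedError

    def handle(self, order: Order) -> str:
        error = self.check(order)
        if error is not None:
            return f"rejected: {error}"        # abort the chain here
        if self.next_handler is None:
            return "accepted"                  # end of chain, all checks passed
        return self.next_handler.handle(order)


class InventoryCheck(Handler):
    STOCK = {"book": 3}                        # hypothetical stock table

    def check(self, order):
        if self.STOCK.get(order.sku, 0) < order.quantity:
            return "insufficient inventory"
        return None


class RiskControl(Handler):
    def check(self, order):
        return "risk score too high" if order.user_risk_score > 80 else None


class PaymentValidation(Handler):
    def check(self, order):
        return "invalid amount" if order.amount <= 0 else None


def build_chain() -> Handler:
    # inventory check -> risk control -> payment validation
    return InventoryCheck(RiskControl(PaymentValidation()))
```

The caller only sees build_chain().handle(order). Adding, removing, or reordering checks changes build_chain alone, which is the decoupling the pattern is meant to provide.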
Workflow vs. Rule Engine
A workflow abstracts a business process and its steps, defining the order in which tasks execute and the rules governing transitions between them.
Example: a leave‑approval process defines nodes such as employee submission, manager review, and HR approval. A workflow engine executes these nodes, tracks state, and provides visual design tools, simplifying development and maintenance.
Rule engines extract decision logic from application code. Rules are expressed declaratively (e.g., decision tables or trees). Example: a loan‑approval rule engine evaluates credit score, income and other factors to decide approval, reducing code coupling and supporting frequent rule changes.
In practice the two are complementary: a workflow node can invoke a rule engine to decide the next path, combining process control with dynamic decision logic.
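A toy illustration of that combination, assuming a loan-approval flow: the rules live in a declarative table, and one workflow node asks the rule evaluator which node comes next. The rule thresholds, node names, and the assumption that manual review always approves are all invented for the sketch; real workflow and rule engines are far richer.

```python
# Rule engine part: a decision table evaluated top to bottom, first match wins.
LOAN_RULES = [
    ("low credit",  lambda app: app["credit_score"] < 600, "reject"),
    ("high income", lambda app: app["income"] >= 100_000,  "approve"),
    ("default",     lambda app: True,                      "manual_review"),
]


def evaluate(rules, facts):
    """Return the outcome of the first rule whose predicate matches the facts."""
    for _name, predicate, outcome in rules:
        if predicate(facts):
            return outcome
    raise ValueError("no rule matched")


# Workflow part: each node maps to a function returning the next node.
# Terminal nodes map to None.
WORKFLOW = {
    "submit":        lambda app: "decide",
    "decide":        lambda app: evaluate(LOAN_RULES, app),   # rule engine call
    "manual_review": lambda app: "approve",   # assumption: reviewer approves
    "approve":       None,
    "reject":        None,
}


def run_workflow(app, start="submit"):
    """Walk the workflow from start to a terminal node; return the path taken."""
    node, path = start, [start]
    while WORKFLOW[node] is not None:
        node = WORKFLOW[node](app)
        path.append(node)
    return path
```

Changing a threshold or adding a rule touches only LOAN_RULES, while the process structure in WORKFLOW stays the same.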
Message Queues (MQ)
Using MQ provides three main benefits:
Asynchronous processing improves system performance by reducing response latency.
Traffic shaping / rate limiting smooths bursty traffic.
Decoupling reduces system coupling, making services easier to evolve.
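The asynchronous and decoupling benefits can be shown in miniature with an in-process queue standing in for a broker. This is only an analogy under stated assumptions: queue.Queue is not a message queue product, and the order and email names are made up for the example.

```python
import queue
import threading


def run_demo(order_ids):
    """Producer accepts orders immediately; a consumer thread does the slow
    follow-up work (here: pretending to send an email) asynchronously."""
    mq = queue.Queue()         # stands in for a message broker
    sent_emails = []

    def consumer():
        while True:
            msg = mq.get()
            if msg is None:    # sentinel value: no more messages
                mq.task_done()
                break
            sent_emails.append(f"email for order {msg}")
            mq.task_done()

    worker = threading.Thread(target=consumer, daemon=True)
    worker.start()

    accepted = []
    for order_id in order_ids:
        mq.put(order_id)           # enqueue and return without waiting
        accepted.append(order_id)  # the order path never calls the email code

    mq.put(None)
    mq.join()                  # wait until every message has been processed
    worker.join(timeout=2.0)
    return accepted, sent_emails
```

The producer knows only the queue, not the consumer, so the email logic can be changed or scaled out independently. A burst of orders simply lengthens the queue instead of overloading the downstream step.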
RocketMQ delay levels
RocketMQ 4.x supports 18 predefined delay levels. The mapping is:
Level 1 – 1 s
Level 2 – 5 s
Level 3 – 10 s
Level 4 – 30 s
Level 5 – 1 min
Level 6 – 2 min
Level 7 – 3 min
Level 8 – 4 min
Level 9 – 5 min
Level 10 – 6 min
Level 11 – 7 min
Level 12 – 8 min
Level 13 – 9 min
Level 14 – 10 min
Level 15 – 20 min
Level 16 – 30 min
Level 17 – 1 h
Level 18 – 2 h
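Because 4.x only offers these fixed steps, an application that wants a delay of, say, 45 seconds has to pick a level. One possible policy, sketched below, is to round up to the smallest level that covers the requested delay. The helper delay_level_for is hypothetical and not part of any RocketMQ client.

```python
# Delay in seconds for levels 1..18, taken from the table above.
DELAY_LEVEL_SECONDS = [
    1, 5, 10, 30,                                     # levels 1-4
    60, 120, 180, 240, 300, 360, 420, 480, 540, 600,  # levels 5-14 (1-10 min)
    1200, 1800,                                       # levels 15-16 (20, 30 min)
    3600, 7200,                                       # levels 17-18 (1 h, 2 h)
]


def delay_level_for(seconds):
    """Return the smallest delay level whose duration is >= the requested delay."""
    if seconds <= 0:
        raise ValueError("delay must be positive")
    for level, level_seconds in enumerate(DELAY_LEVEL_SECONDS, start=1):
        if level_seconds >= seconds:
            return level
    raise ValueError("delay exceeds the maximum level of 2 hours")
```

Under this policy a 45-second request maps to level 5 (1 min), so the message is delivered later than asked rather than earlier. Whether rounding up or down is appropriate depends on the business case.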
RocketMQ 5.0 replaces the fixed levels with a time‑wheel based timer, enabling arbitrary scheduled messages.
Other common message queues
Kafka – Distributed streaming platform (originally for log processing). Supports publish/subscribe, durable storage, stream processing APIs, and the Raft-based KRaft mode, which removes the ZooKeeper dependency and has been marked production-ready since Kafka 3.3.1. Official site: http://kafka.apache.org/
RocketMQ – Cloud‑native messaging platform from Alibaba, Apache top‑level project. Features cloud‑native deployment, trillion‑level throughput, stream processing, financial‑grade reliability, minimal external dependencies, and a rich ecosystem. Site: https://rocketmq.apache.org/ Release notes: https://github.com/apache/rocketmq/releases
RabbitMQ – AMQP‑based broker written in Erlang. Provides persistence, acknowledgments, flexible routing via exchanges, clustering, mirrored queues for high availability, multi‑protocol support, and a user‑friendly management UI. Site: https://www.rabbitmq.com/ Release notes: https://www.rabbitmq.com/news.html
Pulsar – Cloud‑native distributed messaging system (originated at Yahoo). Offers multi‑tenant architecture, strong consistency, high throughput, low latency, separate compute and storage, seamless geo‑replication, built‑in functions (Pulsar Functions) and connectors (Pulsar IO). Site: https://pulsar.apache.org/ Release notes: https://github.com/apache/pulsar/releases
Redis
Reasons to use Redis:
In‑memory access makes reads dozens to hundreds of times faster than disk‑based databases.
Single‑node Redis can handle >50 k QPS (MySQL on comparable hardware ~4 k QPS); clusters scale even higher.
Beyond caching, Redis provides distributed locks, rate limiting, message queues, delayed queues, etc.
Data structures
Redis offers five basic types – String, List, Set, Hash, Zset – and three special types – HyperLogLog, Bitmap, Geospatial. Underlying implementations include SDS, LinkedList, Dict, SkipList, Intset, ZipList, QuickList, and ListPack (ListPack replaces ZipList from Redis 7.0).
Mapping of types to internal structures:
String → SDS
List → LinkedList / ZipList / QuickList (Redis 3.2+ uses QuickList; Redis 7.0 replaces ZipList with ListPack)
Hash → Dict, ZipList (ListPack from Redis 7.0)
Set → Dict, Intset
Zset → ZipList (ListPack from Redis 7.0), SkipList
Eviction policies
volatile-lru – LRU eviction among keys with an expiration.
volatile-ttl – Evicts keys that are closest to expiration.
volatile-random – Random eviction among keys with an expiration.
allkeys-lru – LRU eviction among all keys.
allkeys-random – Random eviction among all keys.
noeviction – Disallows eviction; writes fail with an error when memory is exhausted.
volatile-lfu (Redis 4.0+) – LFU eviction among keys with an expiration.
allkeys-lfu (Redis 4.0+) – LFU eviction among all keys.
Configuration commands:
config get maxmemory
config get maxmemory-policy
config set maxmemory-policy <policy>
For detailed policy explanations see the Redis documentation at https://redis.io/docs/reference/eviction/.
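To make the LRU idea behind allkeys-lru concrete, here is a minimal exact-LRU cache. It is an illustration only: the class LRUCache and its key-count limit are assumptions of the sketch, whereas Redis limits memory in bytes via maxmemory and approximates LRU by sampling keys rather than keeping a strict ordering.

```python
from collections import OrderedDict


class LRUCache:
    """Exact LRU over all keys, bounded by number of keys (not bytes)."""

    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.data = OrderedDict()      # oldest access first, newest last

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)     # a read counts as a recent use
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.max_keys:
            self.data.popitem(last=False)   # evict the least recently used key
```

With a capacity of two keys, writing a and b, reading a, and then writing c evicts b, because a was touched more recently.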
HyperLogLog
HyperLogLog (HLL) is a probabilistic data structure that estimates the cardinality of a large set. In Redis each HLL key uses at most about 12 KB: a compact sparse representation is used for small cardinalities and a dense 12 KB representation for larger ones. It provides an approximate count with a small standard error (≈0.81 % in Redis).
Typical scenarios:
Website or app UV (unique visitor) counting.
Counting the distinct users who searched for a given keyword in a search engine.
Social‑network interaction counts (e.g., number of distinct users who retweeted a post).
When the exact count is not required and memory efficiency is critical, HLL is the preferred solution.
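The estimation idea can be sketched in a few dozen lines. This is a simplified, textbook-style HyperLogLog and not Redis's implementation: the class name, the use of SHA-1 as the hash, and the single small-range correction are choices made for the sketch. It uses 2^14 = 16384 registers, which at 6 bits per register would come to 12 KB and gives a theoretical standard error of about 1.04 / sqrt(16384) ≈ 0.81 %.

```python
import hashlib
import math


class HyperLogLog:
    def __init__(self, p=14):
        self.p = p                     # number of index bits
        self.m = 1 << p                # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash of the item (SHA-1 truncated; an arbitrary choice here).
        h = int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")
        index = h >> (64 - self.p)                   # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)        # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)        # bias constant for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw
```

Adding the same element again never changes a register, so repeated visits by one user do not inflate the estimate. Memory stays fixed at m registers regardless of how many elements are added.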
Bloom Filter
A Bloom filter tests set membership with no false negatives and a configurable false‑positive rate. Elements are hashed by multiple functions; the corresponding bits in a bit array are set to 1. To query, the same hashes are computed; if any bit is 0 the element is definitely absent, otherwise it may be present.
Typical use case: quickly checking whether an element is absent from a massive collection (e.g., cache‑miss filtering, spam detection) before performing a more expensive lookup.
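The add and query steps described above can be sketched as follows. The class BloomFilter, the default sizes, and the derivation of the k bit positions from one SHA-256 digest (double hashing) are assumptions of this example; production systems typically use a tuned library or a Redis module instead.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)   # bit array, all zeros

    def _positions(self, item):
        """Derive num_hashes bit positions from two 64-bit hash values."""
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1   # force odd, never zero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)      # set the bit to 1

    def might_contain(self, item):
        # Any 0 bit proves absence; all 1 bits only mean "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

A False result from might_contain can be trusted and lets the caller skip the expensive lookup. A True result still needs confirmation against the real store. Enlarging num_bits relative to the number of stored elements lowers the false-positive rate.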
JavaGuide
Backend tech guide and AI engineering practice covering fundamentals, databases, distributed systems, high concurrency, system design, plus AI agents and large-model engineering.
