ByteDance Backend Interview Secrets: Salary, Design Patterns, HashMap Issues & More
The article shares eye‑opening ByteDance campus salary data, breaks down salary tiers, and then provides detailed interview preparation covering design patterns, HashMap concurrency pitfalls, MySQL indexing rules, query optimization, handling large JSON fields, stock‑decrement bottlenecks, RabbitMQ vs RocketMQ differences, and distributed‑transaction strategies.
Author Xiao Lin reports that ByteDance's 2023 campus recruitment for backend roles offered base salaries of up to 40k CNY per month, with signing bonuses that push first-year total compensation past 700k CNY, among the highest campus offers he has seen.
Salary tiers (self‑estimated) are:
SSP+ – 40k × 15 + 10w signing bonus ≈ 70w total
SSP – 36k × 15 + 9w ≈ 63w
SSP – 38k × 15 + 5w ≈ 62w
SSP – 35k × 15 + 8w ≈ 60.5w
SSP – 34k × 15 + 9w ≈ 60w
SSP – 32k × 15 + 5w ≈ 53w
SP – 30k × 15 + 1w ≈ 46w
SP – 29k × 15 ≈ 43.5w
SP – 28k × 15 ≈ 42w
Standard tier (“白菜档”, lit. “cabbage tier”) – 26k × 15 ≈ 39w
These figures are personal estimates, not official data, but they give candidates a realistic salary benchmark.
Negotiation tips include asking for a higher signing bonus when the base salary cannot be increased, and leveraging competing offers to improve terms.
ByteDance Backend First‑Round Interview Topics
1. Strategy Pattern & Chain‑of‑Responsibility
Both are behavioral design patterns that replace complex if‑else or switch‑case logic with interchangeable strategy objects or a linked chain of handlers, improving extensibility.
Strategy Pattern
Encapsulates a family of algorithms (e.g., payment methods) behind a common interface, allowing the client to select an implementation at runtime.
Use when the system must dynamically choose an algorithm.
Use when an object has many behaviors and you want to avoid massive if‑else / switch blocks.
Use when algorithm changes should not affect the client.
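The payment-method example above can be sketched in Java roughly as follows. This is a minimal illustration, not code from the article; the names (PayStrategy, Alipay, WechatPay, checkout) are hypothetical.

```java
import java.util.Map;

// Minimal Strategy pattern sketch: a family of payment algorithms behind
// one interface, selected at runtime by key instead of an if-else ladder.
interface PayStrategy {
    String pay(int amountFen);
}

class Alipay implements PayStrategy {
    public String pay(int amountFen) { return "alipay:" + amountFen; }
}

class WechatPay implements PayStrategy {
    public String pay(int amountFen) { return "wechat:" + amountFen; }
}

public class StrategyDemo {
    // Registry maps a channel name to its strategy; adding a new payment
    // method means adding one entry, with no change to client code.
    private static final Map<String, PayStrategy> STRATEGIES = Map.of(
            "alipay", new Alipay(),
            "wechat", new WechatPay());

    static String checkout(String channel, int amountFen) {
        PayStrategy s = STRATEGIES.get(channel);
        if (s == null) throw new IllegalArgumentException("unknown channel: " + channel);
        return s.pay(amountFen);
    }

    public static void main(String[] args) {
        System.out.println(checkout("alipay", 9900)); // implementation chosen at runtime
        System.out.println(checkout("wechat", 100));
    }
}
```

The client (checkout) never branches on the concrete type, which is the point the interviewer is probing: new algorithms extend the registry rather than the caller.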
Chain‑of‑Responsibility
Creates a chain of handler objects; a request traverses the chain until a handler processes it, decoupling sender and receiver.
Use when multiple objects can handle a request and the concrete handler is decided at runtime.
Use to decouple sender and receiver.
Use when the set of handlers may change dynamically.
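A chain of request-validation handlers is one common way to sketch this in Java; the handler names (AuthHandler, RateLimitHandler, BusinessHandler) are illustrative, not from the article.

```java
// Minimal Chain-of-Responsibility sketch: each handler either answers the
// request itself or passes it to the next link, so sender and receiver
// stay decoupled and the chain can be rewired at runtime.
abstract class Handler {
    protected Handler next;
    Handler setNext(Handler next) { this.next = next; return next; }
    abstract String handle(String request);
    protected String pass(String request) {
        return next != null ? next.handle(request) : "unhandled";
    }
}

class AuthHandler extends Handler {
    String handle(String request) {
        if (!request.contains("token")) return "rejected: no token";
        return pass(request);
    }
}

class RateLimitHandler extends Handler {
    private int remaining = 2; // toy quota for the demo
    String handle(String request) {
        if (remaining-- <= 0) return "rejected: rate limited";
        return pass(request);
    }
}

class BusinessHandler extends Handler {
    String handle(String request) { return "ok"; }
}

public class ChainDemo {
    public static void main(String[] args) {
        Handler chain = new AuthHandler();
        chain.setNext(new RateLimitHandler()).setNext(new BusinessHandler());
        System.out.println(chain.handle("token=abc")); // ok
        System.out.println(chain.handle("anonymous")); // rejected: no token
    }
}
```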
2. Why is HashMap Concurrently Unsafe?
HashMap trades thread safety for speed. In JDK 1.7, concurrent resizing could link a bucket's entries into a cycle, so a subsequent get would spin forever and peg the CPU. JDK 1.8 fixed the resize loop, but concurrent put is still racy: two threads can overwrite each other's entries, and the size++ update can be lost.
Because size++ is not atomic (read‑modify‑write), two threads may both read the same old size, increment, and write back the same new value, so two inserts increase the size by only one.
For multithreaded scenarios, prefer ConcurrentHashMap or wrap a regular map with Collections.synchronizedMap.
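The ConcurrentHashMap recommendation can be demonstrated with a small sketch: many threads insert distinct keys, and the final size is exact because ConcurrentHashMap maintains its count with atomic striped counters rather than a plain size++ (the thread counts here are illustrative).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: concurrent inserts of distinct keys. With ConcurrentHashMap the
// size is exact; with a plain HashMap the same workload could lose entries
// and undercount, because size++ is an unsynchronized read-modify-write.
public class SizeRaceDemo {
    static int concurrentInsert(Map<Integer, Integer> map, int threads, int perThread) {
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread; // each thread gets a disjoint key range
            ts[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) map.put(base + i, i);
            });
            ts[t].start();
        }
        for (Thread t : ts) {
            try { t.join(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }
        return map.size();
    }

    public static void main(String[] args) {
        int size = concurrentInsert(new ConcurrentHashMap<>(), 8, 10_000);
        System.out.println(size); // 80000: no lost updates
    }
}
```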
3. Why Does HashMap Use Chaining?
Chaining degrades more gracefully than open addressing: colliding keys simply share a bucket, so there is no primary-clustering slowdown, deletion is a plain list removal, and JDK 1.8 can convert a long chain into a red-black tree, bounding worst-case lookup within that bucket to O(log n).
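A toy separate-chaining map makes the mechanics concrete; this is a fixed-size teaching sketch, not the JDK implementation, and it omits resizing and the JDK 1.8 treeification step described above.

```java
import java.util.LinkedList;

// Toy separate-chaining hash map: colliding keys coexist in one bucket's
// list, and deletion is just a list removal. (The real JDK 8 HashMap also
// resizes and converts long buckets to red-black trees; omitted here.)
public class ChainedMap<K, V> {
    private static class Entry<K, V> {
        final K key; V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    @SuppressWarnings("unchecked")
    private final LinkedList<Entry<K, V>>[] buckets = new LinkedList[16];

    private LinkedList<Entry<K, V>> bucket(K key) {
        int i = (key.hashCode() & 0x7fffffff) % buckets.length;
        if (buckets[i] == null) buckets[i] = new LinkedList<>();
        return buckets[i];
    }

    public void put(K key, V value) {
        for (Entry<K, V> e : bucket(key))
            if (e.key.equals(key)) { e.value = value; return; } // update in place
        bucket(key).add(new Entry<>(key, value));               // append on collision
    }

    public V get(K key) {
        for (Entry<K, V> e : bucket(key))
            if (e.key.equals(key)) return e.value;
        return null;
    }

    public boolean remove(K key) {
        return bucket(key).removeIf(e -> e.key.equals(key));
    }
}
```

With 16 buckets, keys 1 and 17 collide yet both remain retrievable, and removing one leaves the other intact, which is the isolation property the answer refers to.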
4. MySQL Indexing Requirements
An index should be created on columns that have high cardinality, are frequently used in WHERE, JOIN, ORDER BY, or GROUP BY, have short length, and are not updated excessively.
High cardinality ensures the index filters most rows.
High query frequency ensures the index is actually used.
Short column length maximizes entries per B‑tree page; for long VARCHAR, use prefix indexes.
Low update frequency avoids write‑amplification from index maintenance.
5. Optimizing a Query on Column a
First, create an index on a to replace full‑table scans with B‑tree range lookups.
Second, if the query selects only column b (e.g., SELECT b FROM table WHERE a = ?), a composite index (a, b) turns it into a covering index: the answer comes entirely from the index, eliminating the extra lookup back into the clustered index (回表).
6. Querying Large JSON/Text Columns
Three strategies:
Use MySQL 5.7+ generated (virtual) columns to extract searchable fields from JSON and index those columns.
Create a hash column (e.g., a_hash) storing MD5/CRC32 of the JSON text, index the hash, and then verify the full text after hash filtering.
For fuzzy or keyword searches, use MySQL full‑text indexes for modest data volumes or offload to Elasticsearch for large‑scale text search.
The core idea is “transform non‑indexable data into indexable data”.
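The hash-column strategy hinges on the application computing the same hash on write and on read; a minimal sketch using the JDK's CRC32 is shown below (the column name a_hash and the JSON payload are illustrative).

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Sketch of the "hash column" trick: store crc32(json_text) in an indexed
// integer column (e.g. a_hash) and query WHERE a_hash = ? AND a = ?, so
// the index filters first and the full-text comparison resolves collisions.
public class HashColumnDemo {
    static long hashColumn(String jsonText) {
        CRC32 crc = new CRC32();
        crc.update(jsonText.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) {
        String json = "{\"sku\":42,\"tags\":[\"a\",\"b\"]}";
        // Computed once when inserting the row, and again to build the
        // WHERE clause at query time; both sides must match byte-for-byte.
        System.out.println(hashColumn(json));
        // CRC32 is not collision-free, hence the trailing AND a = ? check.
    }
}
```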
7. Breaking the Stock‑Decrement Bottleneck
Row‑level lock contention on a single inventory row limits throughput. Solutions:
Shard the inventory into multiple rows (e.g., 10 rows each holding 100 units) to distribute lock contention.
Move the decrement logic to Redis, using an atomic Lua script to check stock and decrement in‑memory, then forward successful requests to the database.
Use asynchronous messaging (e.g., send a message to an MQ after a successful Redis decrement) to eventually sync back to MySQL.
Batch multiple decrement requests in the application layer (e.g., every 100 requests or 100 ms) into a single UPDATE stock = stock - N statement to reduce IO.
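The application-layer batching idea can be sketched as a small accumulator that flushes one combined UPDATE once a threshold is reached; the SQL execution is stubbed and the batch size, table, and sku_id are illustrative (production code would also flush on a timer).

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of application-layer batching: accumulate decrement requests and
// flush them as a single UPDATE stock = stock - N, cutting row-lock
// acquisitions and IO from N statements down to one per batch.
public class DecrementBatcher {
    private final int maxBatch;
    private int pending = 0;
    private final List<String> issuedSql = new ArrayList<>(); // stands in for a DB call

    DecrementBatcher(int maxBatch) { this.maxBatch = maxBatch; }

    // Called once per request; returns true when this call triggered a flush.
    synchronized boolean decrement(int n) {
        pending += n;
        if (pending >= maxBatch) { flush(); return true; }
        return false;
    }

    synchronized void flush() {
        if (pending == 0) return;
        // A real system would execute this against MySQL here.
        issuedSql.add("UPDATE stock SET stock = stock - " + pending + " WHERE sku_id = 1");
        pending = 0;
    }

    List<String> issued() { return issuedSql; }

    public static void main(String[] args) {
        DecrementBatcher b = new DecrementBatcher(100);
        for (int i = 0; i < 250; i++) b.decrement(1);
        b.flush(); // a periodic timer would cover the remainder in production
        b.issued().forEach(System.out::println); // three UPDATEs: -100, -100, -50
    }
}
```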
8. RabbitMQ vs RocketMQ
RabbitMQ focuses on flexibility and AMQP standard compliance; it’s built in Erlang, suitable for low‑latency, small‑message scenarios but has higher per‑message overhead.
RocketMQ is Java‑based, uses a custom TCP protocol and a NameServer for metadata, stores messages in memory‑mapped sequential CommitLog files, and excels at high‑throughput, large‑scale traffic with features like transactional messages, ordered messages, and delayed delivery.
9. Implementing Local Transactional Messages
The “outbox” pattern stores both business changes and the intent to send a message in the same database transaction, guaranteeing atomicity.
Within a local DB transaction, update business data and insert a row into an Outbox table.
A separate relay service polls the Outbox table (or uses Canal to capture binlog changes) and sends the message to the MQ.
After successful send, the relay updates the Outbox row status to “sent” or deletes it. Consumers must be idempotent because duplicate sends are possible.
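The three steps above can be simulated end to end with in-memory stand-ins; this is a conceptual sketch only, with maps in place of real tables and a list in place of the broker, and the table and status names are illustrative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual outbox/local-message-table sketch: the business write and the
// outbox insert commit together, then a relay publishes NEW rows to the
// broker and marks them SENT. Retries can duplicate sends, so consumers
// must be idempotent.
public class OutboxDemo {
    static final Map<Long, String> orders = new HashMap<>();       // business table
    static final Map<Long, String> outbox = new LinkedHashMap<>(); // outbox: id -> status
    static final List<Long> broker = new ArrayList<>();            // stand-in for the MQ

    // Step 1: one "transaction" writes business data plus the outbox row.
    static synchronized void placeOrder(long id) {
        orders.put(id, "CREATED");
        outbox.put(id, "NEW"); // same DB transaction in a real system
    }

    // Step 2: the relay polls the outbox and publishes NEW rows.
    static synchronized void relayOnce() {
        for (Map.Entry<Long, String> e : outbox.entrySet()) {
            if ("NEW".equals(e.getValue())) {
                broker.add(e.getKey()); // a crash here, before the next line,
                e.setValue("SENT");     // is why duplicates are possible (Step 3)
            }
        }
    }

    public static void main(String[] args) {
        placeOrder(1001);
        placeOrder(1002);
        relayOnce();
        System.out.println(broker); // [1001, 1002]
        System.out.println(outbox); // {1001=SENT, 1002=SENT}
    }
}
```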
10. Understanding Distributed Transactions
Five common approaches are compared in terms of consistency, performance, complexity, and suitable scenarios:
2PC – strong consistency, low performance, medium complexity; suitable for traditional DBs.
3PC – strong consistency, medium‑low performance, high complexity; for scenarios needing reduced blocking.
TCC – eventual consistency, high performance, high complexity; fits high‑concurrency e‑commerce.
Saga – eventual consistency, medium performance, high complexity; for long business flows.
Message‑queue based and local‑message‑table patterns – eventual consistency, high performance, low‑medium complexity; ideal for event‑driven architectures.
11. Hand‑Written Single‑Linked List QuickSort
Implementation details are omitted here, but the article links to a GIF illustrating the algorithm.
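Since the article omits the implementation, here is one hedged sketch of quicksort on a singly linked list: take the head as pivot, partition the remaining nodes into smaller and greater-or-equal sublists, sort each recursively, then splice the pieces together. This is one common approach, not necessarily the one the article's GIF depicts.

```java
// Quicksort on a singly linked list, head-as-pivot partition variant.
public class ListQuickSort {
    static class Node {
        int val; Node next;
        Node(int val) { this.val = val; }
    }

    static Node sort(Node head) {
        if (head == null || head.next == null) return head;
        Node pivot = head;
        Node lessHead = null, geHead = null;
        // Partition everything after the pivot into two sublists.
        for (Node cur = head.next; cur != null; ) {
            Node next = cur.next;
            if (cur.val < pivot.val) { cur.next = lessHead; lessHead = cur; }
            else { cur.next = geHead; geHead = cur; }
            cur = next;
        }
        pivot.next = sort(geHead);      // pivot leads the sorted >= part
        Node sortedLess = sort(lessHead);
        if (sortedLess == null) return pivot;
        Node tail = sortedLess;         // splice: sortedLess -> pivot -> sortedGe
        while (tail.next != null) tail = tail.next;
        tail.next = pivot;
        return sortedLess;
    }

    static Node fromArray(int... vals) {
        Node dummy = new Node(0), t = dummy;
        for (int v : vals) { t.next = new Node(v); t = t.next; }
        return dummy.next;
    }

    static String toString(Node head) {
        StringBuilder sb = new StringBuilder();
        for (; head != null; head = head.next) sb.append(head.val).append(' ');
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        Node head = sort(fromArray(4, 1, 5, 3, 2));
        System.out.println(toString(head)); // 1 2 3 4 5
    }
}
```

Each recursive call excludes its pivot, so the recursion always terminates; like array quicksort, the worst case (e.g., already-sorted or all-equal input with this pivot choice) degrades to O(n²).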
Overall, the piece combines lucrative salary insights with a comprehensive technical guide for ByteDance backend interview preparation.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
IT Services Circle