Interview Questions and Reflections on Java, JVM, Spark, and System Design
This article records an interview experience, presenting core questions on Java memory allocation, JVM parameters, Spark and MapReduce execution models, data skew causes and mitigation, real‑time framework scheduling, and system design for massive task scheduling, followed by analysis and learning recommendations.
The author conducted this interview as the interviewer. The candidate did not pass, and the key technical questions asked are summarized below.
Core questions: 1) How is memory allocated for a Java object? 2) What are the main JVM parameters for a production cluster, and why choose G1 over CMS? 3) How would you create a thread pool for multithreaded business logic, and what are its core parameters? Is Spark a multi‑process or a multithreaded model? What about MapReduce? Briefly describe the processes and threads created when a Spark job is submitted.
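For the thread pool question, a minimal sketch using the standard java.util.concurrent.ThreadPoolExecutor constructor may help to anchor the discussion of core parameters. This is not the author's reference answer. The class name ThreadPoolSketch, the thread name prefix, and every numeric value are assumptions chosen for illustration; real values depend on the workload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadPoolSketch {

    // Builds a bounded pool. All sizes below are illustrative assumptions.
    static ThreadPoolExecutor newPool() {
        AtomicInteger seq = new AtomicInteger();
        return new ThreadPoolExecutor(
                4,                                  // corePoolSize: threads kept alive when idle
                8,                                  // maximumPoolSize: upper bound once the queue is full
                60L, TimeUnit.SECONDS,              // keepAliveTime for threads above the core size
                new ArrayBlockingQueue<>(100),      // workQueue: bounded, so overload becomes visible
                r -> new Thread(r, "biz-worker-" + seq.incrementAndGet()), // threadFactory
                new ThreadPoolExecutor.CallerRunsPolicy()); // handler: the submitter runs rejected tasks
    }

    // Submits n small tasks and sums their results (1^2 + 2^2 + ... + n^2).
    static int sumOfSquares(int n) {
        ThreadPoolExecutor pool = newPool();
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final int x = i;
                futures.add(pool.submit(() -> x * x));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get(); // blocks until that task has finished
            }
            return total;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown(); // stop accepting work; already submitted tasks still complete
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10));
    }
}
```

The bounded queue and the explicit rejection handler are the design choices an interviewer would likely probe: with an unbounded queue the pool never grows beyond corePoolSize, and backlog accumulates in memory instead of being pushed back to the caller.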
Data component questions: 1) What fundamentally causes data skew, how to detect it, and how to resolve it? 2) How does Spark manage memory, what types of memory does it use, and under what circumstances is off‑heap memory employed? 3) How does a real‑time computation framework schedule tasks?
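One commonly cited mitigation for skew in aggregations is key salting with two‑stage aggregation. The sketch below illustrates the idea in plain Java collections rather than the Spark API, so it shows the technique only and not how any framework implements it. The class name SaltedAggregationSketch, the "#" separator, and the round‑robin salt are assumptions made for this example; keys are assumed not to need the separator preserved in any special way.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SaltedAggregationSketch {

    // Stage 1: append a salt so one hot key is spread over `buckets` partial groups.
    // In a distributed job, each salted key could land on a different task.
    static Map<String, Long> partialCounts(List<String> keys, int buckets) {
        Map<String, Long> partial = new HashMap<>();
        int i = 0;
        for (String key : keys) {
            String salted = key + "#" + (i++ % buckets); // round-robin salt; a random salt also works
            partial.merge(salted, 1L, Long::sum);
        }
        return partial;
    }

    // Stage 2: strip the salt and merge the much smaller set of partial results.
    static Map<String, Long> finalCounts(Map<String, Long> partial) {
        Map<String, Long> result = new HashMap<>();
        for (Map.Entry<String, Long> e : partial.entrySet()) {
            String salted = e.getKey();
            String original = salted.substring(0, salted.lastIndexOf('#'));
            result.merge(original, e.getValue(), Long::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("hot", "hot", "hot", "hot", "hot", "hot", "cold");
        Map<String, Long> partial = partialCounts(keys, 3);
        System.out.println(partial);              // the hot key is split across three salted keys
        System.out.println(finalCounts(partial)); // totals per original key
    }
}
```

The trade‑off is an extra aggregation stage: the first stage balances the load of the hot key, and the second stage handles at most `buckets` records per original key.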
Other questions: 1) Design a system that can schedule on the order of a million tasks per day. 2) How is Spark's back‑pressure throttling implemented, and could you design your own throttler? Explain your approach.
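For the "design your own throttler" part, one possible starting point is a token bucket. The sketch below is a generic rate limiter under stated assumptions, not a description of how Spark implements back pressure. The class name TokenBucketSketch and its parameters are hypothetical, and the current time is passed in explicitly so that the behavior is deterministic and easy to test.

```java
public class TokenBucketSketch {
    private final long capacity;          // maximum burst size, in tokens
    private final double tokensPerSecond; // steady-state refill rate
    private double tokens;                // tokens currently available
    private long lastRefillNanos;         // timestamp of the last refill

    TokenBucketSketch(long capacity, double tokensPerSecond, long nowNanos) {
        this.capacity = capacity;
        this.tokensPerSecond = tokensPerSecond;
        this.tokens = capacity;           // start full so an initial burst is allowed
        this.lastRefillNanos = nowNanos;
    }

    // Returns true if one unit of work may proceed at time nowNanos.
    synchronized boolean tryAcquire(long nowNanos) {
        if (nowNanos > lastRefillNanos) {
            long elapsed = nowNanos - lastRefillNanos;
            double refill = elapsed * tokensPerSecond / 1_000_000_000.0;
            tokens = Math.min(capacity, tokens + refill); // never exceed the burst capacity
            lastRefillNanos = nowNanos;
        }
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false; // caller decides whether to drop, queue, or retry later
    }

    public static void main(String[] args) {
        // Assumed settings: burst of 5, refill of 2 tokens per second.
        TokenBucketSketch bucket = new TokenBucketSketch(5, 2.0, System.nanoTime());
        int admitted = 0;
        for (int i = 0; i < 10; i++) {
            if (bucket.tryAcquire(System.nanoTime())) {
                admitted++;
            }
        }
        System.out.println("admitted " + admitted + " of 10 immediate requests");
    }
}
```

A back‑pressure design would add a feedback loop on top of this: measure how fast the downstream stage actually processes data and adjust the refill rate accordingly, instead of keeping it fixed.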
The author notes that answering six out of the eight questions would be sufficient, but the candidate fell short, showing a tendency to stay at the usage level of components without deeper understanding.
The overall impression is that the interviewee can develop under simple business scenarios or mature platforms but may struggle with complex problems due to fragmented knowledge and lack of a holistic view of components.
The author suggests a learning path: understand the background, become familiar with common features, build a simple project, study source code and modules, then explore issues encountered in practice, seeking community resources when needed.
Additionally, the author plans to launch a source‑code reading project focused on a specific framework module, organized as a small‑team study with bi‑weekly sessions, quizzes, and a requirement to produce a learning note or blog; participants are encouraged to join via the provided WeChat ID.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
