From Student to NVIDIA: 11 Years of AI Research Lessons
This reflective essay chronicles Zhaocheng Zhu’s eleven‑year journey from undergraduate AI curiosity through doctoral struggles, industry internships, and finally landing a research role at NVIDIA, offering candid insights on publishing, engineering, mentorship, and the evolving realities of academic and corporate AI work.
2015‑2017: What Did I Want to Do?
In 2015, while still an undergraduate, I was captivated by the rapid rise of deep learning—CNNs and RNNs were shattering benchmarks, and the idea of machines learning from raw data felt far more exciting than traditional programming. I spent months dissecting the C/C++ implementation of word2vec to understand what representations the model actually learned.
A breakthrough came during an RNN experiment in which pinyin tokens outperformed conventional Chinese word segmentation, but the project never reached a peer‑reviewed venue, leaving only a modest arXiv preprint on my résumé.
A summer internship at Mitsubishi Electric in Japan exposed me to two researcher archetypes: idea generators and execution experts. I identified more with the former, realizing I needed a Ph.D. to deepen my understanding.
When I later prepared Ph.D. applications, I faced the harsh reality of having zero publications while peers already had multiple first‑author papers.
2017‑2019: What Does Publishing a Paper Require?
During my senior year I interned at Microsoft Research Asia (MSRA) in the object‑detection group, witnessing a highly competitive, weekly‑arXiv‑tracking research culture. My mentor emphasized that strong engineering skills are the foundation of any research idea.
I learned to write CUDA kernels and adopt rigorous engineering habits, which later enabled me to build a multi‑GPU system for large‑scale graph embedding during my Ph.D. in Canada.
My first major paper (WWW) emerged after months of battling compiler bugs and GPU errors, followed by a chaotic writing process where my advisor rewrote the entire introduction and I mistakenly overwrote his edits, forcing us to stay up nights to align the narrative.
2019‑2020: Surviving in a Sparse‑Feedback System
After that first publication, I struggled to convince reviewers of the value of a new task I had proposed, and the paper was rejected. The COVID‑19 pandemic then pushed me into drug‑combination research, which was also rejected repeatedly, driving home the psychological toll of sustained failure.
To maintain motivation, I built a "positive feedback loop"—a 10k‑line software library that provided tangible progress (new features, faster runs, clean refactoring) even when research signals were sparse.
Living with a Ukrainian roommate taught me resilience through cultural exchange and everyday joys, which became an unexpected source of emotional support.
2021‑2023: Finding My Research Direction
Returning to China for New Year reunited me with collaborators, and we revived a failed drug‑research idea called "unidirectional propagation." After months of deep mathematical study, we realized the problem reduced to a graph‑path formulation that existing GNNs could not solve, leading to a NeurIPS paper.
The experience reinforced that the most valuable insights often stem from timeless principles rather than the latest hype.
Collaborating with a blogger on graph‑foundation models sparked the realization that classic algorithms can inspire elegant neural architectures.
2023‑2024: Where Does My Real‑World Home Lie?
After multiple rejected internships, I finally received an offer from Google, only to face a competitive Bay Area environment that contrasted sharply with the calm of Canadian academia.
My mentor taught me to always leave meetings with concrete action items, emphasizing that only executed ideas have value in industry.
Applying for faculty positions proved exhausting; the administrative burden of grant writing and teaching felt misaligned with my desire for curiosity‑driven research.
Repeated interview rejections and visa hurdles made me question whether I should continue pursuing research roles, but a series of NVIDIA interview opportunities—offering visa sponsorship—ultimately led to an offer on my 28th birthday.
2025 and Beyond: What Does Research Mean Today?
Working as an LLM post‑training engineer at a large AI company revealed a stark contrast: academia thrives on influence and funding, while industry depends on productization and revenue.
GPU costs now dominate AI budgets, making idle GPUs unacceptable waste and reshaping researchers into operators of massive, always‑on machines.
AI agents like Cursor and Claude Code are eroding the value of junior talent; companies now favor Ph.D.‑level expertise, and many tasks I once mastered are being automated.
Two classic examples I used to demonstrate LLM generalization limits were broken within a year, underscoring the fragility of our knowledge and hinting at a future where AI handles much of the research process.
Ultimately, I believe the enduring purpose of a researcher is to maintain curiosity, cultivate personal taste, stay grounded in reality, and shoulder responsibility.
Reflection: The path to becoming a researcher is never linear; each setback, curiosity, and small victory builds the foundation for future success.
Original source: https://loud-phalange-7f5.notion.site/Eleven-years-in-AI-What-does-it-actually-mean-to-be-a-researcher-2d56d9bccef780038ae9c27ffab59404

Data Party THU: the official platform of the Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.