Comprehensive Interview Question Cheat Sheet for Top Tech Companies

This article compiles interview question topics reported from leading tech companies, for roles spanning search, algorithm engineering, NLP, multimodal LLMs, advertising, recommendation, risk control, and big data. The topics cover algorithms, system design, machine-learning concepts, and practical coding problems.

NewBeeNLP

Feb 25, 2024

Feizhu – Search Algorithms

Hash addressing algorithm

Overview of shortest‑path algorithms

Detecting cycles in graphs

Probability problem: given a predictor with 99% accuracy and a 0.3% base rate of actual positives, compute P(actually positive | predicted positive)

Scenario 1: Modeling the entity of a current query using current and historical queries with their entities

Scenario 2: Language identification for similar languages (e.g., Malay vs. English)

Scenario 3: Query rewriting baseline evaluation and its impact (e.g., matching “Beida” vs. “Beijing University” to hotels)

Scenario 4: Error correction and similar‑word modeling

A further scenario question (details unclear), though the interview discussion was pleasant
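The probability problem in this list is a standard application of Bayes' rule. The sketch below assumes one particular reading of it: 0.3% of cases are actually positive, and the predictor is right 99% of the time on both positive and negative cases. The function name `posterior_positive` and its parameters are illustrative, not from the original question.

```python
from fractions import Fraction


def posterior_positive(prevalence, sensitivity, specificity):
    """Return P(actually positive | predicted positive) via Bayes' rule."""
    true_pos = sensitivity * prevalence                # predicted positive, truly positive
    false_pos = (1 - specificity) * (1 - prevalence)   # predicted positive, truly negative
    return true_pos / (true_pos + false_pos)


# Assumed reading: 0.3% base rate, 99% accuracy on both classes.
p = posterior_positive(Fraction(3, 1000), Fraction(99, 100), Fraction(99, 100))
print(p, round(float(p), 4))  # 297/1294 0.2295
```

Under these assumptions, fewer than one in four positive predictions is correct, because the rare positive class is swamped by false alarms from the much larger negative class.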

Baidu – Algorithm Engineering

C++ smart pointers

Python multiprocessing and multithreading

Garbage‑collection mechanisms

SQL transactions

Principles of LoRA

Explanation of Gradient‑Boosted Decision Trees (GBDT)

Typical architecture for translation tasks

Differences among encoder‑only, decoder‑only, and encoder‑decoder models

Transformer architecture overview

FlashAttention explanation

Differences between FP32 and FP16; fundamentals of mixed‑precision training

Beam search principle vs. direct sampling

Improvements for large models

Common frameworks and hardware used in practice

Python coroutines

Resource sharing between processes and threads

Program memory space and stack

Why Docker is useful and how to create containers

Linux process monitoring, termination, and real‑time file viewing

C++ virtual functions

Python Flask basics

Python Global Interpreter Lock (GIL)

Further details on FlashAttention

When large models require pre‑training

Differences among mainstream large models

Probability problem: two shooters with equal hit probability (0.5), shooter B has one extra attempt; compute chance B scores higher
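The shooter problem can be checked by direct enumeration. The sketch below assumes shooter A takes `n` independent shots and shooter B takes `n + 1`, each hitting with probability 0.5; the number of shots is not given in the question, so `n` is a hypothetical parameter.

```python
from fractions import Fraction
from math import comb


def prob_b_scores_higher(n_a, n_b, p=Fraction(1, 2)):
    """Exact P(B hits more targets than A) for independent shots."""

    def pmf(n, k):
        # Binomial probability of exactly k hits in n shots.
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    return sum(
        pmf(n_a, a) * pmf(n_b, b)
        for a in range(n_a + 1)
        for b in range(a + 1, n_b + 1)
    )


for n in range(1, 6):
    print(n, prob_b_scores_higher(n, n + 1))  # each line shows 1/2
```

The result is 1/2 for every `n`. The symmetry argument: after both have taken `n` shots, B is ahead or behind with equal probability; if they are tied, B's extra shot decides it with probability 0.5.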

TAL Education – NLP

How to initialize LoRA matrices and why zero‑initialization is used

Purpose of the past_key_value cache in GPT source code

Layer-by-layer walkthrough of the inputs and outputs in a GPT implementation

Handling sparse output distributions with spikes

Decision‑tree fundamentals and how to perform regression with trees

Meaning of top‑p (nucleus) sampling in GPT

KL‑divergence formula and its difference from cross‑entropy

Typical inputs for reinforcement learning

Three‑stage construction of ChatGPT’s reward model

CART tree splitting criteria

Problem: Find a duplicate number in an array

Similarity measures beyond cosine similarity

Text embedding techniques

TF‑IDF formula

Scenario 1: Multi‑turn teacher‑student dialogue (audio transcription) – removing irrelevant utterances such as greetings

Scenario 2: Recommending practice questions to students while avoiding previously solved similar items
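For the KL-divergence item in this list, the relationship to cross-entropy is H(p, q) = H(p) + KL(p || q): cross-entropy is KL divergence plus the entropy of the true distribution, which is constant when p is fixed. A minimal sketch for discrete distributions follows; the function names and the example distributions are illustrative, and it assumes q is non-zero wherever p is non-zero.

```python
import math


def entropy(p):
    """H(p) = -sum_i p_i * log(p_i), in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log(q_i)."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.7, 0.2, 0.1]  # example "true" distribution
q = [0.5, 0.3, 0.2]  # example model distribution

# Cross-entropy decomposes into entropy plus KL divergence.
print(math.isclose(cross_entropy(p, q), entropy(p) + kl_divergence(p, q)))  # True
```

This is why minimizing cross-entropy against fixed labels is equivalent to minimizing KL divergence. Note that KL divergence is not symmetric: `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ.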

Hikvision – Multimodal LLMs

Tokenization handling for large models and vocabulary expansion

Design rationale behind Python's multiprocessing versus multithreading (under the GIL, threads do not execute Python bytecode in parallel)

New PyTorch parallel batch‑normalization

Verbal algorithm for generating perfect squares

Choosing and combining models for ten different modalities

Various CLIP variants

Useful tricks that are not widely known

Techniques for handling imbalanced data

Differences between separate modality encoders and CLIP‑style joint encoding
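The exact wording of the perfect-squares question is not recorded, so the following is only one plausible answer. It uses the fact that the k-th perfect square is the sum of the first k odd numbers, so squares can be generated with additions only. The generator name `perfect_squares` is illustrative.

```python
from itertools import islice


def perfect_squares():
    """Yield 1, 4, 9, ... by adding successive odd numbers."""
    square, odd = 0, 1
    while True:
        square += odd  # k^2 = (k - 1)^2 + (2k - 1)
        yield square
        odd += 2


print(list(islice(perfect_squares(), 5)))  # [1, 4, 9, 16, 25]
```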

Tencent – Advertising Algorithms

Problem: Compute the intersection of two lists with minimal time complexity, without using maps or sets

Problem: Find the maximum number in a list

NER models beyond GlobalPointer (GP) and advantages of GP over conventional sequence-labeling NER

Addressing NER prediction errors, e.g., mislabeling “BMW 3‑Series” as B‑I‑B‑I

Definition of linear separability; is logistic regression linear or non‑linear?

Common click‑through‑rate (CTR) models

Structure of the FM component in DeepFM

Binary classification with a single feature whose values are positive and unbounded above

Overview of typical NLP tasks
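For the list-intersection problem in this section, one common approach that avoids maps and sets is to sort both lists and walk them with two pointers, for O(n log n + m log m) time overall. The sketch below assumes that each common value should be reported once; the question does not specify how to treat duplicates, and the function name is illustrative.

```python
def intersect_sorted(a, b):
    """Intersection of two lists using sorting and two pointers (no hashing)."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    result = []
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1
        elif a[i] > b[j]:
            j += 1
        else:
            # Common value: record it once, skipping repeats (assumed behavior).
            if not result or result[-1] != a[i]:
                result.append(a[i])
            i += 1
            j += 1
    return result


print(intersect_sorted([4, 9, 5, 4], [9, 4, 9, 8, 4]))  # [4, 9]
```

If the inputs are already sorted, the sorting step can be dropped and the merge alone runs in O(n + m).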

Zhihu – Search Algorithms

Project topics

Career‑planning considerations

Challenges encountered in projects

Problem: Find the minimum value in a rotated array

BERT attention mechanism

Optimizers used in deep learning

Common loss functions

Feasibility of starting an internship immediately
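The rotated-array problem in this section is usually answered with binary search. The sketch below assumes the array was sorted in ascending order before rotation and contains no duplicates; neither condition is stated in the question. The function name is illustrative.

```python
def find_min_rotated(nums):
    """Minimum of an ascending array rotated at an unknown pivot, in O(log n)."""
    if not nums:
        raise ValueError("nums must be non-empty")
    lo, hi = 0, len(nums) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if nums[mid] > nums[hi]:
            lo = mid + 1  # minimum lies strictly to the right of mid
        else:
            hi = mid      # minimum is at mid or to its left
    return nums[lo]


print(find_min_rotated([4, 5, 6, 7, 0, 1, 2]))  # 0
```

Comparing against the right end rather than the left handles the unrotated case without a special branch.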

Xiaopi – NLP

Find start and end indices of a target substring within a source string, ignoring spaces in the target but counting them in the source (KMP‑style problem)

Awareness of state‑of‑the‑art multimodal multi‑stream models

BERT architecture and associated loss

GPT architecture overview

Understanding of various NER models

Prompt engineering for large models across different tasks

Clustering iPhone Pro Max products without any labels
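The substring problem in this section is ambiguous as recorded, so the sketch below fixes one interpretation as an assumption: spaces are skipped on both sides when matching, but the returned indices refer to positions in the original source string, spaces included. It uses `str.find` for brevity; a hand-written KMP search could replace that call if the interviewer asks for it. The function name is illustrative.

```python
def find_ignoring_spaces(source, target):
    """Return (start, end) inclusive indices in `source`, or None if no match."""
    pattern = target.replace(" ", "")
    if not pattern:
        return None
    # Positions in `source` of every non-space character.
    positions = [i for i, ch in enumerate(source) if ch != " "]
    compact = "".join(source[i] for i in positions)
    k = compact.find(pattern)
    if k == -1:
        return None
    # Map the match in the compacted string back to original indices.
    return positions[k], positions[k + len(pattern) - 1]


print(find_ignoring_spaces("the quick  brown fox", "k b r"))  # (8, 12)
```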

CITIC Bank HQ – Big Data

Summarize personal technology stack

Difference between DELETE and TRUNCATE in SQL

SQL transaction concepts

Architecture of recommendation systems

Differences between classification and regression

Purpose of activation functions and why non‑linearity is needed

Definition of loss

Basic knowledge of Hadoop

Explanation of RDD (Resilient Distributed Dataset)
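For the SQL transaction item in this section, atomicity is the property most often asked about: either every statement in a transaction takes effect or none does. A minimal sketch using Python's built-in `sqlite3` module follows; the `accounts` table, the `transfer` helper, and the sample balances are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK constraint failed; the credit to dst was rolled back


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok1 = transfer(conn, "alice", "bob", 30)   # succeeds
ok2 = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so it is rolled back
print(ok1, ok2, balances(conn))
```

The second transfer credits bob before the debit fails, yet bob's balance is unchanged afterward, which is the rollback at work.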

Dewu – Recommendation (possibly)

O(n log n) sorting algorithms

Heap sort explanation

Dynamic programming fundamentals

Differences between XGBoost and GBDT

Pros and cons of LoRA

Batch normalization overview

Differences between Random Forest and GBDT

Game-theory problem: 100 coins; players A and B alternate turns, each taking 1 or 2 coins per turn; determine A's winning strategy
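The coin game as recorded omits the winning condition. The sketch below assumes that A moves first and that whoever takes the last coin wins. Under that assumption a position is losing for the player to move exactly when the coin count is a multiple of 3, so A should take 1 coin first (leaving 99) and then always take 3 minus whatever B just took. The function names are illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def first_player_wins(coins):
    """True if the player to move can force taking the last coin."""
    # With 0 coins there is no legal move: the previous player took the last coin.
    return any(
        not first_player_wins(coins - take) for take in (1, 2) if take <= coins
    )


def winning_move(coins):
    """Coins to take to leave a multiple of 3, or 0 if no winning move exists."""
    return coins % 3


print(first_player_wins(100), winning_move(100))  # True 1
```

The brute-force recursion is there only to confirm the modulo-3 pattern; the closed-form rule is what an interviewer would expect as the answer.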

Tongcheng Travel – Risk Control

F‑score metric

Why AUC is not always used as an evaluation metric

Methods for handling class‑imbalance problems

Principles of batch normalization

Purpose of 1×1 convolution kernels

Characteristics, advantages, and disadvantages of ReLU and Sigmoid activations

When an internship can be started
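For the F-score item in this section, the general F-beta score is (1 + beta^2) * P * R / (beta^2 * P + R), where P is precision and R is recall; beta = 1 gives the familiar F1. A minimal sketch from confusion-matrix counts follows; the function name and the example counts are illustrative, and returning 0.0 when precision and recall are both zero is a convention chosen here.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta score from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0  # convention used in this sketch
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Example: precision = 8/10 = 0.8, recall = 8/16 = 0.5.
print(round(f_beta(8, 2, 8), 4))          # 0.6154
print(round(f_beta(8, 2, 8, beta=2), 4))  # 0.5405
```

Raising beta above 1 weights recall more heavily, which is why the F2 value here is lower than F1: recall is the weaker of the two quantities in this example.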
