Machine Learning Methods for Solving Combinatorial Optimization Problems
This article reviews recent advances in applying machine learning, in particular attention mechanisms, graph neural networks, and reinforcement learning, to combinatorial optimization. It outlines fundamental problem definitions, classic algorithms, modern ML-based approaches, experimental results, and future research directions.
Overview: Since 2015, machine learning methods for solving combinatorial optimization (ML+CO) have made great progress, with two main research lines: supervised learning and reinforcement learning. The goal is to design ML algorithmic frameworks that can solve multiple combinatorial optimization problems.
Today's presentation covers four points:
Combinatorial optimization problems
Machine learning methods
ML algorithms for solving combinatorial optimization
Future research directions
Guest speaker: Prof. Cai Qingqiong, Associate Professor, School of Computer Science, Nankai University.
01 Combinatorial Optimization Problems
Combinatorial Optimization Problems (COP) are discrete‑state optimization problems with many real‑world applications such as communication networks, chip design, flight routing, and data‑center management.
The Traveling Salesman Problem (TSP) is a classic example: given a set of cities, find a tour that visits each city exactly once and returns to the start with minimal total distance or cost.
1. Mathematical Model of COP
Combinatorial optimization can be expressed as a discrete optimization model. For example, the TSP on a complete graph G = (V, E) with vertex set V = {0,1,…,n‑1} and edge weights ω: E → ℚ seeks a permutation that minimizes the sum of adjacent edge weights.
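The TSP objective above can be sketched numerically. Assuming a symmetric distance matrix and a permutation of the vertices (the function name tour_cost and the unit-square example are illustrative, not from the talk):

```python
import numpy as np

def tour_cost(dist, perm):
    """Total weight of the closed tour that visits vertices in the
    order given by `perm` and returns to the start (the TSP objective)."""
    n = len(perm)
    return sum(dist[perm[i], perm[(i + 1) % n]] for i in range(n))

# Four cities on the corners of a unit square.
pts = np.array([[0, 0], [0, 1], [1, 1], [1, 0]], dtype=float)
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)

print(tour_cost(dist, [0, 1, 2, 3]))  # perimeter tour: 4.0
print(tour_cost(dist, [0, 2, 1, 3]))  # crossing tour is longer
```

Solving the TSP exactly means minimizing this quantity over all (n-1)!/2 distinct tours, which is what makes the problem hard.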
2. Classification of COP
According to computational complexity theory, problems are classified as P, NP, NP‑complete (NPC), and NP‑hard:
P: solvable in polynomial time by deterministic algorithms.
NP: solutions can be verified in polynomial time.
NPC: an NP problem to which all other NP problems can be reduced in polynomial time.
NP‑hard: at least as hard as NP problems, but not necessarily in NP.
Many COPs are NP‑hard, such as TSP, Minimum Vertex Cover (MVC), etc.
3. Basic COP Examples
Knapsack Problem (KP): select items with given weights and values to maximize total value without exceeding the capacity.
Traveling Salesman Problem (TSP): find the shortest Hamiltonian circuit visiting each city exactly once.
Vehicle Routing Problem (VRP): design routes for a fleet to satisfy customer demands while minimizing distance or cost.
Minimum Vertex Cover (MVC): find the smallest set of vertices covering all edges.
Maximum Independent Set (MIS): find the largest set of mutually non‑adjacent vertices.
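Of these, the knapsack problem has a well-known exact dynamic program when capacities are integral; a minimal sketch (the function name knapsack is illustrative):

```python
def knapsack(weights, values, capacity):
    """Classic 0/1 knapsack DP: dp[c] = best total value achievable
    with remaining capacity c."""
    dp = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # Iterate capacities downward so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[capacity]

# Capacity 6: taking the items of weight 2 and 4 gives value 3 + 6 = 9.
print(knapsack([2, 3, 4], [3, 4, 6], 6))  # 9
```

This runs in O(n * capacity) time, which is pseudo-polynomial: it is efficient for small integer capacities but does not contradict the NP-hardness of the problem.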
4. Traditional Algorithms for COP
Exact, approximation, and heuristic algorithms each have strengths and share three common challenges: (1) algorithms designed for one COP rarely transfer to others; (2) each instance requires re‑running the algorithm from scratch; (3) problem scales keep growing, making manual algorithm design increasingly difficult.
5. Machine Learning for Combinatorial Optimization (ML+CO)
Recent successes of machine learning and deep learning motivate designing ML frameworks to address the three challenges of traditional COP algorithms. Advantages of ML for COP include:
Automatic learning of patterns from large data, providing approximate solutions without extensive hand‑crafted heuristics.
Potential to discover useful properties that are hard for human designers to find.
Ability to generalize across multiple COPs (e.g., S2V‑DQN solves TSP, MVC, MaxCut), whereas classic algorithms are usually problem‑specific.
ML methods for COP date back to the 1980s (Hopfield networks). In the past five or six years, two main streams have emerged: supervised learning and reinforcement learning.
02 Machine Learning Methods
To apply ML to COP we need basic techniques such as attention mechanisms, graph neural networks (GNN), and reinforcement learning.
1. Attention Mechanism
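The slides for this part are not reproduced here. The core computation, scaled dot-product attention, can be sketched as follows (all variable names are illustrative):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                  # query/key compatibility
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)             # rows sum to 1
    return w @ V, w                                # weighted mix of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))   # 2 queries of dimension 4
K = rng.normal(size=(5, 4))   # 5 keys
V = rng.normal(size=(5, 3))   # 5 values of dimension 3
out, w = attention(Q, K, V)
print(out.shape)  # (2, 3)
```

In pointer-network-style models for COP, the attention weights themselves are used as a probability distribution over the next city or item to select.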
2. Graph Neural Networks
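A generic message-passing layer (a simplification, not structure2vec specifically; all names are illustrative) aggregates each vertex's neighbourhood via the adjacency matrix:

```python
import numpy as np

def gnn_layer(A, H, W_self, W_nbr):
    """One message-passing layer: each vertex sums its neighbours'
    features, mixes them with its own, then applies ReLU."""
    msgs = A @ H  # neighbour aggregation via the adjacency matrix
    return np.maximum(0.0, H @ W_self + msgs @ W_nbr)

# Triangle graph with one extra pendant vertex.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(1)
H = rng.normal(size=(4, 8))                  # initial vertex features
W_self = rng.normal(size=(8, 8))
W_nbr = rng.normal(size=(8, 8))
H2 = gnn_layer(A, H, W_self, W_nbr)
print(H2.shape)  # (4, 8)
```

Stacking k such layers lets each vertex's embedding depend on its k-hop neighbourhood, which is what allows GNN-based COP models to score vertices by graph structure.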
3. (Deep) Reinforcement Learning
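The policy-gradient idea used throughout the RL-based COP literature (REINFORCE) can be shown on a toy two-action problem; this is a sketch under simplified assumptions, not any specific algorithm from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)                  # policy logits over two actions
true_reward = np.array([0.2, 0.8])   # action 1 is better on average

for _ in range(2000):
    p = np.exp(theta - theta.max()); p /= p.sum()   # softmax policy
    a = rng.choice(2, p=p)                          # sample an action
    r = rng.normal(true_reward[a], 0.1)             # noisy reward
    grad = -p; grad[a] += 1.0                       # grad of log pi(a | theta)
    theta += 0.05 * r * grad                        # REINFORCE update

p = np.exp(theta - theta.max()); p /= p.sum()
print(p)  # probability mass shifts toward the better action
```

In ML+CO, the "action" is choosing the next city or vertex, the "reward" is the negative tour length or cover size, and a baseline (critic or rollout) is subtracted from r to reduce the variance of this update.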
03 ML Algorithms for Solving Combinatorial Optimization
End‑to‑End vs. Hybrid Approaches
End‑to‑end models directly map a COP instance to a solution, offering fast inference and strong generalization but no guarantee of optimality. Hybrid methods embed ML components (e.g., branch‑and‑bound node selection) within traditional solvers, achieving better solution quality at the cost of slower inference.
Algorithms Based on Attention
2015: Vinyals et al. introduced the Pointer Network (Algorithm A1) using supervised learning. It generalizes well and runs fast, but solution quality cannot exceed that of the training samples.
2017: Bello et al. proposed Algorithm A2, training the Pointer Network with reinforcement learning (REINFORCE) and a critic baseline to reduce variance.
2019: Kool et al. presented Algorithm A3, adding a rollout baseline that selects the best historical policy as a reference, yielding superior performance close to exact solvers.
Experimental results (reporting inference time only) show that Algorithm A3 outperforms A1 and A2 and approaches the solution quality of traditional solvers.
Algorithms Based on Graph Neural Networks
2017: Dai et al. introduced Algorithm B1 (S2V‑DQN, "Learning Combinatorial Optimization Algorithms over Graphs"), combining structure2vec graph embeddings with deep Q‑learning.
2018: Li et al. proposed Algorithm B2, combining graph convolutional networks with guided tree search.
Visualization of S2V‑DQN solving an MVC instance shows the model progressively selects high‑degree vertices while preserving graph connectivity, achieving better coverage than naive heuristics.
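The degree-driven behaviour described above can be mimicked by the classic greedy baseline that S2V-DQN is typically compared against; a sketch (the function name is illustrative):

```python
def greedy_vertex_cover(edges):
    """Greedy MVC baseline: repeatedly pick the vertex incident to the
    most uncovered edges. S2V-DQN learns a similar but instance-adaptive
    selection rule from its graph embeddings."""
    uncovered = set(map(frozenset, edges))
    cover = set()
    while uncovered:
        # Count uncovered edges incident to each vertex.
        deg = {}
        for e in uncovered:
            for v in e:
                deg[v] = deg.get(v, 0) + 1
        best = max(deg, key=deg.get)   # highest-degree vertex first
        cover.add(best)
        uncovered = {e for e in uncovered if best not in e}
    return cover

# Star centred at 0 plus the edge (1, 2): the greedy rule picks 0 first.
print(greedy_vertex_cover([(0, 1), (0, 2), (0, 3), (1, 2)]))
```

The learned policy differs from this fixed rule in that its vertex scores also reflect the evolving global state of the instance, which is where its quality advantage comes from.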
04 Future Research Directions
Improve model performance: solution quality, solving speed, and generalization; better integration of attention and GNN; tighter coupling of ML with traditional OR algorithms; develop more effective reinforcement‑learning strategies.
Solve more complex COPs: multi‑objective, multi‑constraint, and dynamically changing problems.
Enhance model interpretability.
Q&A
Q: How to choose a reinforcement‑learning strategy?
A: There is no single answer; it depends on the specific problem. Practitioners should try different strategies and adopt the one that works best for their case.
Thank you for attending.
About DataFun – A media platform focused on big data and AI applications, founded in 2017, with over 100 offline and online events, 2000+ expert speakers, and 800+ original articles.