AI for Discrete Optimization (IMEN891N, Fall 2025)
(♦: In-class presentation by students)
Part I: Basic AI techniques
Week 1: Introduction
- Motivation and trends: [Stanford MS&E236] Lecture “Introduction” (link), Survey (Bengio et al. 2021)
Week 1–3: Deep reinforcement learning
- Graph Neural Networks (GNN): [Stanford MS&E236] Lecture “GNNs” (link)
- Tutorials (Cappart et al. 2023, Angioni et al. 2025)
- [Stanford CS224W] Lecture 3–4 (link)
- Reinforcement Learning (RL): [Stanford MS&E236] Lecture “Reinforcement learning (Q-learning)” (link)
- Textbook (Sutton & Barto 2020) Chapter 3–6
- [Stanford CS234] Lecture 2–6 (link)
- ♦ End-to-end heuristics:
Routing (Dai et al. 2018, Kool et al. 2018, Deudon et al. 2018)
Scheduling (Kwon et al. 2020, Park et al. 2021, Kwon et al. 2021)
Week 4: Simulation-based learning
- Bandit models: Textbook (Sutton & Barto 2020) Chapter 2, Tutorial (Agrawal 2019)
- Monographs (Lattimore & Szepesvári 2020, Slivkins 2019)
- 2019 INFORMS TutORial: Recent Advances in Multiarmed Bandits (link)
- Bayesian optimization (Frazier 2018, Theodoridis 2015)
- 2018 INFORMS TutORial: Bayesian Optimization (link)
Part I — Reading list
Bengio, Y., Lodi, A., & Prouvost, A. (2021). Machine learning for combinatorial optimization: A methodological tour d’horizon. European Journal of Operational Research, 290(2), 405–421.
Cappart, Q., Chételat, D., Khalil, E. B., Lodi, A., Morris, C., & Veličković, P. (2023). Combinatorial Optimization and Reasoning with Graph Neural Networks. Journal of Machine Learning Research, 24(130), 1–61.
Angioni, D., Archetti, C., & Speranza, M. G. (2025). Neural combinatorial optimization: A tutorial. Computers & Operations Research, 182, 107102. (link)
Sutton, R. S., & Barto, A. G. (2020). Reinforcement Learning: An Introduction (2nd ed.). MIT Press.
Dai, H., Khalil, E. B., Zhang, Y., Dilkina, B., & Song, L. (2018). Learning Combinatorial Optimization Algorithms over Graphs.
Kool, W., van Hoof, H., & Welling, M. (2018). Attention, Learn to Solve Routing Problems! In ICLR.
Deudon, M., Cournut, P., Lacoste, A., Adulyasak, Y., & Rousseau, L.-M. (2018). Learning Heuristics for the TSP by Policy Gradient. In CPAIOR (LNCS), 170–181. (link)
Kwon, Y.-D., Choo, J., Kim, B., Yoon, I., Gwon, Y., & Min, S. (2020). POMO: Policy Optimization with Multiple Optima for Reinforcement Learning. In NeurIPS, 33, 21188–21198.
Park, J., Chun, J., Kim, S. H., Kim, Y., & Park, J. (2021). Learning to schedule job-shop problems: Representation and policy learning using graph neural network and reinforcement learning. International Journal of Production Research, 59(11), 3360–3377. (link)
Kwon, Y.-D., Choo, J., Yoon, I., Park, M., Park, D., & Gwon, Y. (2021). Matrix encoding networks for neural combinatorial optimization. In NeurIPS, 34, 5138–5149.
Agrawal, S. (2019). Recent Advances in Multiarmed Bandits for Sequential Decision Making. In INFORMS TutORials in Operations Research, 167–188. (link)
Lattimore, T., & Szepesvári, C. (2020). Bandit Algorithms. Cambridge University Press. (link)
Slivkins, A. (2019). Introduction to Multi-Armed Bandits. Foundations and Trends® in Machine Learning, 12(1–2), 1–286. (link)
Frazier, P. I. (2018). Bayesian Optimization. In INFORMS TutORials in Operations Research, 255–278. (link)
Theodoridis, S. (2015). Machine Learning: A Bayesian and Optimization Perspective. Academic Press.