Publication

Probabilistic methods for neural combinatorial optimization

Nikolaos Karalias
2023
EPFL thesis
Abstract

The monumental progress in the development of machine learning models has led to a plethora of applications with transformative effects in engineering and science. This has also turned the attention of the research community towards constructing artificial intelligence (AI) models with general reasoning capabilities. Yet, despite the staggering success of artificial neural networks in tasks involving language and image generation or object detection and recognition, tasks that involve discrete and combinatorial problem-solving remain a fundamental blind spot of those models and a longstanding obstacle on the road to general-purpose AI systems. Combinatorial optimization problems are prominent representatives of that category, as they present challenges that are hard to tackle within the standard machine learning paradigm. Two fundamental obstacles in this pursuit are i) the difficulty of navigating exponentially large discrete configuration spaces with continuous gradient-based optimization, and ii) our inability to procure large amounts of labeled data due to the high computational budget that this requires. The subject of this thesis is to develop a coherent approach to combinatorial optimization with neural networks that directly tackles those challenges. In the first half of the thesis, we present our proposal for neural combinatorial optimization without supervision. We demonstrate how to design continuous loss functions for constrained optimization problems in a way that enables training without access to labeled data. We leverage the celebrated probabilistic method from the field of combinatorics to argue about the existence of high-quality solutions within the learned representations of a neural network trained with our approach. We also show how to deterministically recover those solutions using derandomization techniques from the literature.
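The core idea behind the probabilistic method and derandomization can be illustrated on a toy problem. The sketch below (illustrative only, not the thesis's actual pipeline) treats maximum cut: a product distribution over vertex assignments induces a differentiable expected cut size, and the method of conditional expectations then fixes one variable at a time without letting the conditional expectation drop, so the final integral assignment achieves at least the expected value.

```python
def expected_cut(edges, p):
    """Expected cut size when each vertex v joins side 1 independently
    with probability p[v] (a product distribution over assignments)."""
    return sum(p[u] * (1 - p[v]) + p[v] * (1 - p[u]) for u, v in edges)

def derandomize(edges, p):
    """Method of conditional expectations: fix vertices one at a time to
    whichever value does not decrease the conditional expectation.
    The resulting cut is at least expected_cut(edges, p)."""
    q = dict(p)
    for v in q:
        q[v] = 1.0
        e1 = expected_cut(edges, q)
        q[v] = 0.0
        e0 = expected_cut(edges, q)
        q[v] = 1.0 if e1 >= e0 else 0.0
    return q

# Triangle graph: with uniform probabilities the expected cut is 1.5,
# so a cut of size >= 2 must exist, and derandomization finds one.
edges = [(0, 1), (1, 2), (0, 2)]
p = {0: 0.5, 1: 0.5, 2: 0.5}
print(expected_cut(edges, p))                # 1.5
assignment = derandomize(edges, p)
print(expected_cut(edges, assignment))       # 2.0 (an actual cut)
```

In the thesis setting the probabilities would come from a trained neural network rather than being fixed by hand; the existence argument and the greedy recovery step are the same.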
In the second half, we expand the scope of our inquiry and design a general framework for continuous extensions of set functions. This approach enables training neural networks for discrete problems even when the objective and the constraints are given as a black box. We develop extensions on domains such as the hypercube, as well as higher-dimensional domains such as the cone of positive semidefinite matrices. The framework lets us efficiently incorporate problem-specific priors into the pipeline, which leads to improved empirical results. Finally, we show that the versatility of this approach extends beyond combinatorial optimization, as it can be used to define a novel continuous surrogate of the discrete training error for classification problems. Overall, our proposed methods advance the state of the art in neural combinatorial optimization through principled loss function design. Furthermore, by enabling the use of discrete functions in end-to-end differentiable models, they pave the way for improved combinatorial and reasoning capabilities in machine learning algorithms.
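A classical instance of a continuous extension on the hypercube (one of the domains mentioned above; the thesis's framework is more general) is the multilinear extension F(p) = E_{S~p}[f(S)], where each element is included independently with probability p[i]. The sketch below evaluates it by brute-force enumeration, which is exponential in n and only meant to make the definition concrete; the key property is that F agrees with f on the hypercube's vertices while being differentiable in between.

```python
from itertools import combinations

def multilinear_extension(f, n, p):
    """Multilinear (expectation-based) extension of a set function f on
    ground set {0, ..., n-1}: F(p) = E_{S ~ p}[f(S)], with element i
    included in S independently with probability p[i]. Evaluated by
    brute-force enumeration over all 2^n subsets."""
    total = 0.0
    for k in range(n + 1):
        for S in combinations(range(n), k):
            prob = 1.0
            for i in range(n):
                prob *= p[i] if i in S else 1 - p[i]
            total += prob * f(set(S))
    return total

# Example: a coverage-style set function, f(S) = min(|S|, 2).
f = lambda S: min(len(S), 2)
print(multilinear_extension(f, 3, [0.5, 0.5, 0.5]))  # 1.375
# At a hypercube vertex the extension recovers f exactly:
print(multilinear_extension(f, 3, [1, 0, 1]))        # f({0, 2}) = 2.0
```

Because f is only queried on discrete sets, such an extension applies even when the objective is a black box, which is what makes this family of constructions suitable for end-to-end differentiable pipelines.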

Related concepts (36)
Artificial neural network
Artificial neural networks (ANNs, also shortened to neural networks (NNs) or neural nets) are a branch of machine learning models that are built using principles of neuronal organization discovered by connectionism in the biological neural networks constituting animal brains. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons.
Recurrent neural network
A recurrent neural network (RNN) is one of the two broad types of artificial neural network, characterized by the direction of the flow of information between its layers. In contrast to the uni-directional feedforward neural network, it is a bi-directional artificial neural network, meaning that it allows the output from some nodes to affect subsequent input to the same nodes. Their ability to use internal state (memory) to process arbitrary sequences of inputs makes them applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition.
Mathematical optimization
Mathematical optimization (alternatively spelled optimisation) or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfields: discrete optimization and continuous optimization. Optimization problems arise in all quantitative disciplines from computer science and engineering to operations research and economics, and the development of solution methods has been of interest in mathematics for centuries.
Related publications (145)

Infusing structured knowledge priors in neural models for sample-efficient symbolic reasoning

Mattia Atzeni

The ability to reason, plan and solve highly abstract problems is a hallmark of human intelligence. Recent advancements in artificial intelligence, propelled by deep neural networks, have revolutionized disciplines like computer vision and natural language ...
EPFL2024

Topics in statistical physics of high-dimensional machine learning

Hugo Chao Cui

In the past few years, Machine Learning (ML) techniques have ushered in a paradigm shift, allowing the harnessing of ever more abundant sources of data to automate complex tasks. The technical workhorse behind these important breakthroughs arguably lies in ...
EPFL2024

Deep Learning Theory Through the Lens of Diagonal Linear Networks

Scott William Pesme

In this PhD manuscript, we explore optimisation phenomena which occur in complex neural networks through the lens of 2-layer diagonal linear networks. This rudimentary architecture, which consists of a two-layer feedforward linear network with a diagonal ...
EPFL2024