**Are you an EPFL student looking for a semester project?**

Work with us on data science and visualisation projects, and deploy your project as an app on top of GraphSearch.

Publication# Learning from survey propagation: a neural network for MAX-E-3-SAT

Abstract

Many natural optimization problems are NP-hard, which implies that they are probably hard to solve exactly in the worst-case. However, it suffices to get reasonably good solutions for all (or even most) instances in practice. This paper presents a new algorithm for computing approximate solutions in Theta(N) for the maximum exact 3-satisfiability (MAX-E-3-SAT) problem by using supervised learning methodology. This methodology allows us to create a learning algorithm able to fix Boolean variables by using local information obtained by the Survey Propagation algorithm. By performing an accurate analysis, on random conjunctive normal form instances of the MAX-E-3-SAT with several Boolean variables, we show that this new algorithm, avoiding any decimation strategy, can build assignments better than a random one, even if the convergence of the messages is not found. Although this algorithm is not competitive with state-of-the-art maximum satisfiability solvers, it can solve substantially larger and more complicated problems than it ever saw during training.

Official source

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Related concepts (11)

Related MOOCs (39)

Related publications (14)

Learning

Learning is the process of acquiring new understanding, knowledge, behaviors, skills, values, attitudes, and preferences. The ability to learn is possessed by humans, animals, and some machines; there is also evidence for some kind of learning in certain plants. Some learning is immediate, induced by a single event (e.g. being burned by a hot stove), but much skill and knowledge accumulate from repeated experiences. The changes induced by learning often last a lifetime, and it is hard to distinguish learned material that seems to be "lost" from that which cannot be retrieved.

Boolean algebra

In mathematics and mathematical logic, Boolean algebra is a branch of algebra. It differs from elementary algebra in two ways. First, the values of the variables are the truth values true and false, usually denoted 1 and 0, whereas in elementary algebra the values of the variables are numbers. Second, Boolean algebra uses logical operators such as conjunction (and) denoted as ∧, disjunction (or) denoted as ∨, and the negation (not) denoted as ¬.

Neural network

A neural network can refer to a neural circuit of biological neurons (sometimes also called a biological neural network), a network of artificial neurons or nodes in the case of an artificial neural network. Artificial neural networks are used for solving artificial intelligence (AI) problems; they model connections of biological neurons as weights between nodes. A positive weight reflects an excitatory connection, while negative values mean inhibitory connections. All inputs are modified by a weight and summed.

The way our brain learns to disentangle complex signals into unambiguous concepts is fascinating but remains largely unknown. There is evidence, however, that hierarchical neural representations play a key role in the cortex. This thesis investigates biologically plausible models of unsupervised learning of hierarchical representations as found in the brain and modern computer vision models. We use computational modeling to address three main questions at the intersection of artificial intelligence (AI) and computational neuroscience.The first question is: What are useful neural representations and when are deep hierarchical representations needed? We approach this point with a systematic study of biologically plausible unsupervised feature learning in a shallow 2-layer networks on digit (MNIST) and object (CIFAR10) classification. Surprisingly, random features support high performance, especially for large hidden layers. When combined with localized receptive fields, random feature networks approach the performance of supervised backpropagation on MNIST, but not on CIFAR10. We suggest that future models of biologically plausible learning should outperform such random feature benchmarks on MNIST, or that such models should be evaluated in different ways.The second question is: How can hierarchical representations be learned with mechanisms supported by neuroscientific evidence? We cover this question by proposing a unifying Hebbian model, inspired by common models of V1 simple and complex cells based on unsupervised sparse coding and temporal invariance learning. In shallow 2-layer networks, our model reproduces learning of simple and complex cell receptive fields, as found in V1. In deeper networks, we stack multiple layers of Hebbian learning but find that it does not yield hierarchical representations of increasing usefulness. From this, we hypothesise that standard Hebbian rules are too constrained to build increasingly useful representations, as observed in higher areas of the visual cortex or deep artificial neural networks.The third question is: Can AI inspire learning models that build deep representations and are still biologically plausible? We address this question by proposing a learning rule that takes inspiration from neuroscience and recent advances in self-supervised deep learning. The proposed rule is Hebbian, i.e. only depends on pre- and post-synaptic neuronal activity, but includes additional local factors, namely predictive dendritic input and widely broadcasted modulation factors. Algorithmically, this rule applies self-supervised contrastive predictive learning to a causal, biological setting using saccades. We find that networks trained with this generalised Hebbian rule build deep hierarchical representations of images, speech and video.We see our modeling as a potential starting point for both, new hypotheses, that can be tested experimentally, and novel AI models that could benefit from added biological realism.

In the last decade, deep neural networks have achieved tremendous success in many fields of machine learning.However, they are shown vulnerable against adversarial attacks: well-designed, yet imperceptible, perturbations can make the state-of-the-art deep neural networks output incorrect results.Understanding adversarial attacks and designing algorithms to make deep neural networks robust against these attacks are key steps to building reliable artificial intelligence in real-life applications.In this thesis, we will first formulate the robust learning problem.Based on the notations of empirical robustness and verified robustness, we design new algorithms to achieve both of these types of robustness.Specifically, we investigate the robust learning problem from the optimization perspectives.Compared with classic empirical risk minimization, we show the slow convergence and large generalization gap in robust learning.Our theoretical and numerical analysis indicates that these challenges arise, respectively, from non-smooth loss landscapes and model's fitting hard adversarial instances.Our insights shed some light on designing algorithms for mitigating these challenges.Robust learning has other challenges, such as large model capacity requirements and high computational complexity.To solve the model capacity issue, we combine robust learning with model compression.We design an algorithm to obtain sparse and binary neural networks and make it robust.To decrease the computational complexity, we accelerate the existing adversarial training algorithm and preserve its performance stability.In addition to making models robust, our research provides other benefits.Our methods demonstrate that robust models, compared with non-robust ones, usually utilize input features in a way more similar to the way human beings use them, hence the robust models are more interpretable.To obtain verified robustness, our methods indicate the geometric similarity of the decision boundaries near data points.Our approaches towards reliable artificial intelligence can not only render deep neural networks more robust in safety-critical applications but also make us better aware of how they work.

"I choose this restaurant because they have vegan sandwiches" could be a typical explanation we would expect from a human. However, current Reinforcement Learning (RL) techniques are not able to provide such explanations, when trained on raw pixels. RL algorithms for state-of-the-art benchmark environments are based on neural networks, which lack interpretability, because of the very factor that makes them so versatile – they have many parameters and intermediate representations. Enforcing safety guarantees is important when deploying RL agents in the real world, and guarantees require interpretability of the agent. Humans use short explanations that capture only the essential parts and often contain few causes to explain an effect. In our thesis, we address the problem of making RL agents understandable by humans. In addition to the safety concerns, the quest to mimic human-like reasoning is of general scientific interest, as it sheds light on the easy problem of consciousness. The problem of providing interpretable and simple causal explanations of agent's behavior is connected to the problem of learning good state representations. If we lack such a representation, any reasoning algorithm's outputs would be useless for interpretability, since even the "referents" of the "thoughts" of such a system would be obscure to us. One way to define simplicity of causal explanations via the sparsity of the Causal Model that describes the environment: the causal graph has the fewest edges connecting causes to their effects. For example, a model for choosing the restaurant that only depends on the cause "vegan" is simpler and more interpretable than a model that looks at each pixel of a photo of the menu of a restaurant, and possibly relies as well on spurious correlations, such the style of the menu. In this thesis, we propose a framework "CauseOccam" for model-based Reinforcement Learning where the model is regularized for simplicity in terms of sparsity of the causal graph it corresponds to. The framework contains a learned mapping from observations to latent features, and a model predicting latent features at the next time-steps given ones from the current time-step. The latent features are regularized with the sparsity of the model, compared to a more traditional regularization on the features themselves, or via a hand-crafted interpretability loss. To achieve sparsity, we use discrete Bernoulli variables with gradient estimation, and to find the best parameters, we use the primal-dual constrained formulation to achieve a target model quality. The novelty of this work is in learning jointly a sparse causal graph and the representation taking pixels as the input on RL environments. We test this framework on benchmark environments with non-trivial high-dimensional dynamics and show that it can uncover the causal graph with the fewest edges in the latent space. We describe the implications of our work for developing priors enforcing interpretability.

2021