**Êtes-vous un étudiant de l'EPFL à la recherche d'un projet de semestre?**

Travaillez avec nous sur des projets en science des données et en visualisation, et déployez votre projet sous forme d'application sur GraphSearch.

Publication# Proactive Synthesis of Recursive Tree-to-String Functions from Examples

Résumé

Synthesis from examples enables non-expert users to generate programs by specifying examples of their behavior. A domain-specific form of such synthesis has been recently deployed in a widely used spreadsheet software product. In this paper we contribute to foundations of such techniques and present a complete algorithm for synthesis of a class of recursive functions defined by structural recursion over a given algebraic data type definition. The functions we consider map an algebraic data type to a string; they are useful for, e.g., pretty printing and serialization of programs and data. We formalize our problem as learning deterministic sequential top-down tree-to-string transducers with a single state (1STS). The first problem we consider is learning a tree-to-string transducer from any set of input/output examples provided by the user. We show that, given a set of input/output examples, checking whether there exists a 1STS consistent with these examples is NP-complete in general. In contrast, the problem can be solved in polynomial time under a (practically useful) closure condition that each subtree of a tree in the input/output example set is also part of the input/output examples. Because coming up with relevant input/output examples may be difficult for the user while creating hard constraint problems for the synthesizer, we also study a more automated active learning scenario in which the algorithm chooses the inputs for which the user provides the outputs. Our algorithm asks a worst-case linear number of queries as a function of the size of the algebraic data type definition to determine a unique transducer. To construct our algorithms we present two new results on formal languages. First, we define a class of word equations, called sequential word equations, for which we prove that satisfiability can be solved in deterministic polynomial time. This is in contrast to the general word equations for which the best known complexity upper bound is in linear space. Second, we close a long-standing open problem about the asymptotic size of test sets for context-free languages. A test set of a language of words L is a subset T of L such that any two word homomorphisms equivalent on T are also equivalent on L. We prove that it is possible to build test sets of cubic size for context-free languages, matching for the first time the lower bound found 20 years ago.

Official source

Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.

Concepts associés

Chargement

Publications associées

Chargement

Concepts associés (11)

Résolution de problème

vignette|Résolution d'un problème mathématique.
La résolution de problème est le processus d'identification puis de mise en œuvre d'une solution à un problème.
Méthodologie
Dans l'ind

Algorithme

thumb|Algorithme de découpe d'un polygone quelconque en triangles (triangulation).
Un algorithme est une suite finie et non ambiguë d'instructions et d’opérations permettant de résoudre une classe de

Problème NP-complet

En théorie de la complexité, un problème NP-complet ou problème NPC (c'est-à-dire un problème complet pour la classe NP) est un problème de décision vérifiant les propriétés suivantes :

- il est pos

Publications associées (8)

Chargement

Chargement

Chargement

Graph theory experienced a remarkable increase of interest among the scientific community during the last decades. The vertex coloring problem (Min Coloring) deserves a particular attention rince it has been able to capture a wide variety of applications. For mathematicians, it is interesting for an additional reason: it is extremely hard to solve it in an efficient way. In this thesis, we introduce several problems generalizing the usual vertex coloring problem, and hence, extending its application domain. We say that a graph is (p, k)-colorable if its vertex set can be partitioned into p cliques and k stable sets. Then, for a given p (respectively k), one may ask the following questions: how to choose p cliques (respectively k stable sets) to be removed from the graph such that the number of stable sets (respectively cliques) partitioning the remaining vertices is minimized? These are called (p, k)-coloring problems. We also introduce Min Split-coloring which is, given a graph G, the problem of minimizing k such that G is (k, k)-colorable. Along the saine line, given a graph G, the objective of the problem Min Cocoloring is to minimize p + k such that G is (p, k)-colorable. All these problems, called together generalized coloring problems, are obviously at least as difficult as Min Coloring. The purpose of this dissertation is to study generalized coloring problems in nome restricted classes of graphs in order to bring a new insight on the relative difficulties of these problems. To this end, we detect in a more precise way the limits between NP-hard and polynomially solvable problems. Chapter 1 introduces generalized coloring problems by emphasizing nome preliminary results which will guide the questions to handle in the following chapters. Chapter 2 exposes the first clans of graphs, namely cacti, where Min Split-coloring is shown to be polynomially solvable. We also observe that generalized coloring problems can be polynomially solved in triangulated graphs. The main result of Chapter 3 is a new characterization of cographs: it is equivalent to say that G is a cograph, and to state that, for every subgraph G' ⊆ G, G' is (p, k)-colorable if and only if G' [V \ K] is (p – 1, k)-colorable, where K induces a maximum clique of G'. This result implies simple combinatorial algorithme to solve all generalized coloring problems; the one for Min Cocoloring improves the best time complexity known so far. In Chapter 4, we handle the recognition of polar graphs which can be seen as a particular (p, k)-coloring, where p cliques are independent (i.e., not linked at all) and k stable sets form a complete k-partite graph. It is known that the recognition of polar graphs is NP-complete. Here, we determine the first clans of graphs, namely cographs, where the polar graphs can be recognized in polynomial time, more precisely in time O(n log n). We also give a characterization by forbidden subgraphs. In the came manner, we characterize monopolar cographs, i.e., cographs admitting a polar partition with at most one clique or at most one stable set. Chapter 5 is devoted to generalized coloring problems in line graphs. Here, we detect the first classes of graphs, namely line graphs of trees, line graphs of bipartite graphs and line graphs of line-perfect graphs, where generalized coloring problems diverge in terms of NP-hardness. In Chapter 6 we study the approximability of generalized coloring problems in line graphs, in comparability graphs and in general graphs. We derive approximation algorithms with a performance guarantee using both the standard approximation ratio and the differential approximation ratio. We show that both Min Split-coloring and Min Cocoloring are at least as hard as Min Coloring to approximate from the standard approximation ratio point of view, whereas, they admit a polynomial time differential approximation scheme and Min Coloring only a constant differential approximation ratio. We also show that Min Cocoloring reduces to Min Split-coloring in all classes of graphs closed under addition of disjoint cliques and under join of a complete k-partite graph. In Chapter 7, we handle two different applications of Min Split-coloring in permutation graphs. They give birth to a new problem, called Min Threshold-coloring, that we study in the came spirit as the other generalized coloring problems. In the last chapter, we present several open questions arising from this thesis.

Optimization is a fundamental tool in modern science. Numerous important tasks in biology, economy, physics and computer science can be cast as optimization problems. Consider the example of machine learning: recent advances have shown that even the most sophisticated tasks involving decision making, can be reduced to solving certain optimization problems. These advances however, bring several new challenges to the field of algorithm design. The first of them is related to the ever-growing size of instances, these optimization problems need to be solved for. In practice, this forces the algorithms for these problems to run in time linear or nearly linear in their input size. The second challenge is related to the emergence of new, harder and harder problems which need to be dealt with. These problems are in most cases considered computationally intractable because of complexity barriers such as NP completeness, or because of non-convexity. Therefore, efficiently computable relaxations for these problems are typically desired.
The material of this thesis is divided into two parts. In the first part we attempt to address the first challenge. The recent tremendous progress in developing fast algorithm for such fundamental problems as maximum flow or linear programming, demonstrate the power of continuous techniques and tools such as electrical flows, fast Laplacian solvers and interior point methods. In this thesis we study new algorithms of this type based on continuous dynamical systems inspired by the study of a slime mold Physarum polycephalum. We perform a rigorous mathematical analysis of these dynamical systems and extract from them new, fast algorithms for problems such as minimum cost flow, linear programming and basis pursuit.
In the second part of the thesis we develop new tools to approach the second challenge. Towards this, we study a very general form of discrete optimization problems and its extension to sampling and counting, capturing a host of important problems such as counting matchings in graphs, computing permanents of matrices or sampling from constrained determinantal point processes. We present a very general framework, based on polynomials, for dealing with these problems computationally. It is based, roughly, on encoding the problem structure in a multivariate polynomial and then recovering the solution by means of certain continuous relaxations. This leads to several questions on how to reason about such relaxations and how to compute them. We resolve them by relating certain analytic properties of the arising polynomials, such as the location of their roots or convexity, to the combinatorial structure of the underlying problem.
We believe that the ideas and mathematical techniques developed in this thesis are only a beginning and they will inspire more work on the use of dynamical systems and polynomials in the design of fast algorithms.

The common point between the different chapters of the present work is graph theory. We investigate some well known graph theory problems, and some which arise from more specific applications. In the first chapter, we deal with the maximum stable set problem, and provide some new graph classes, where it can be solved in polynomial time. Those classes are hereditary, i.e. characterized by a list of forbidden induced subgraphs. The algorithms proposed are purely combinatorial. The second chapter is devoted to the study of a problem linked to security purposes in mobile telecommunication networks. The particularity is that there is no central authority guaranteeing security, but it is actually managed by the users themselves. The network is modelled by an oriented graph, whose vertices represent the users, and whose arcs represent public key certificates. The problem is to associate to each vertex a subgraph with some requirements on the size of the subgraphs, the number of times a vertex is taken in a subgraph and the connectivity between any two users as they put their subgraphs together. Constructive heuristics are proposed, bounds on the optimal solution and a tabu search are described and tested. The third chapter is on the problem of reconstructing an image, given its projections in terms of the number of occurrences of each color in each row and each column. The case of two colors is known to be polynomially solvable, it is NP-complete with four or more colors, and the complexity status of the problem with three colors is open. An intermediate case between two and three colors is shown to be solvable in polynomial time. The last two chapters are about graph (vertex-)coloring. In the fourth, we prove a result which brings a large collection of NP-hard subcases, characterized by forbidden induced subgraphs. In the fifth chapter, we approach the problem with the use of linear programming. Links between different formulations are pointed out, and some families of facets are characterized. In the last section, we study a branch and bound algorithm, whose lower bounds are given by the optimal value of the linear relaxation of one of the exposed formulations. A preprocessing procedure is proposed and tested.