
# DynaProg for Scala

Abstract

Dynamic programming is an algorithmic technique for solving problems that follow Bellman's principle of optimality: optimal solutions depend on optimal sub-problem solutions. The core idea behind dynamic programming is to memoize intermediate results in matrices to avoid recomputation. Solving a dynamic programming problem consists of two phases: filling one or more matrices with intermediate solutions to sub-problems, and recomposing how the final result was constructed (backtracking). In textbooks, problems are usually described in terms of recurrence relations between matrix elements. Expressing dynamic programming problems as recursive formulae over matrix indices is difficult and often error-prone, and the notation does not capture the essence of the underlying problem (for example, aligning two sequences). Moreover, writing correct and efficient parallel implementations requires different competencies and often a significant amount of time. In this project, we present DynaProg, a language embedded in Scala (DSL) for addressing dynamic programming problems on heterogeneous platforms. DynaProg allows the programmer to write concise programs based on ADP [1], using a pair of a parsing grammar and an algebra; these programs can then be executed either on CPU or on GPU. We evaluate the performance of our implementation against existing work and against our own hand-optimized baseline implementations for both the CPU and GPU versions. Experimental results show that plain Scala has a large overhead and should be reserved for small sequences (≤1024 elements), whereas the generated GPU version is comparable with existing implementations: matrix chain multiplication matches our hand-optimized version (142% of the execution time of [2]) for a sequence of 4096 matrices, Smith-Waterman is twice as slow as [3] on a pair of sequences of 6144 elements, and RNA folding is on par with [4] (95% of its running time) for sequences of 4096 elements.
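The index-based recurrence style that the abstract argues against can be illustrated with matrix chain multiplication, one of the benchmarks above. The following is plain Scala that fills a cost matrix bottom-up; it is not the DynaProg DSL, and the object and method names are ours:

```scala
object MatrixChain {
  // dims(i) x dims(i + 1) is the shape of matrix i; there are dims.length - 1 matrices.
  def minCost(dims: Array[Int]): Int = {
    val n = dims.length - 1
    // cost(i)(j): minimal number of scalar multiplications to compute the product of matrices i..j
    val cost = Array.fill(n, n)(0)
    // Fill by increasing chain length, the classic textbook recurrence over matrix indices.
    for (len <- 2 to n; i <- 0 to n - len) {
      val j = i + len - 1
      cost(i)(j) = (i until j).map { k =>
        cost(i)(k) + cost(k + 1)(j) + dims(i) * dims(k + 1) * dims(j + 1)
      }.min
    }
    cost(0)(n - 1)
  }
}
```

For example, `MatrixChain.minCost(Array(10, 30, 5, 60))` yields 4500: multiplying the first two matrices first costs 10×30×5 + 10×5×60 = 4500, versus 27000 the other way. Note how the intent (optimal parenthesization) is buried in index arithmetic, which is exactly what a grammar/algebra formulation abstracts away.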
[1] Robert Giegerich and Carsten Meyer. Algebraic Dynamic Programming.
[2] Chao-Chin Wu, Jenn-Yang Ke, Heshan Lin, and Wu-chun Feng. Optimizing dynamic programming on graphics processing units via adaptive thread-level parallelism.
[3] Edans Flavius de O. Sandes and Alba Cristina M. A. de Melo. Smith-Waterman alignment of huge sequences with GPU in linear space.
[4] Guillaume Rizk and Dominique Lavenier. GPU accelerated RNA folding algorithm.


This page is generated automatically and may contain information that is not correct, complete, up to date, or relevant to your search. The same applies to all other pages on this site. Please verify the information against official EPFL sources.

Related concepts (22)

Dynamic programming

In computer science, dynamic programming is an algorithmic method for solving optimization problems. The concept was introduced in the early 1950s by Richard Bellman. […]

Implementation

Implementation is the act of putting a project into place.
Engineering and computer science
In engineering, and more specifically in computer science, implementation refers to the creation of a […]

Recursively defined sequence

In mathematics, a recursively defined sequence is a sequence defined by its first term(s) and by a recurrence relation, which defines each term from the preceding term or terms […]

Related publications (25)

Approximation algorithms are a commonly used tool for designing efficient algorithmic solutions for intractable problems, at the expense of the quality of the output solution. A prominent technique for designing such algorithms is the use of Linear Programming (LP) relaxations. An optimal solution to such a relaxation provides a bound on the objective value of the optimal integral solution, to which we compare the integral solution we return. In this context, when studying a specific problem, two natural questions often arise: What is a strong LP relaxation for this problem, and how can we exploit it? Over the course of the past few decades, a significant amount of effort has been expended by the research community in order to answer these questions for a variety of interesting intractable problems. Although there exist multiple problems for which we have designed LP relaxations that achieve best-possible guarantees, there still exist numerous problems for which we either have no strong LP relaxations, or do not know how to use them. The main focus of this thesis is extending our understanding of such strong relaxations. We focus on designing good approximation algorithms for certain allocation problems, by employing a class of strong LP relaxations called configuration-LPs. For many such allocation problems, the best-known results are derived by using simple and natural LP relaxations, whereas configuration-LPs have been used successfully on several occasions in order to break pre-existing barriers set by weaker relaxations. However, our understanding of configuration-LPs is far from complete for many problems. Therefore, understanding and exploiting these relaxations to the fullest extent possible is an intriguing question. Answering this question could result in improved approximation algorithms for a wide variety of allocation problems. The first problem we address in this thesis is the restricted max-min fair allocation problem.
Prior to our work, the best known result provided an $\Omega(1)$-approximation that ran in polynomial time. Also, it was known how to estimate the value of an optimal solution to the problem within a factor of $1/(4+c)$, for any $c>0$, by solving the corresponding configuration-LP. Our first contribution in this thesis is the design of a $1/13$-approximation algorithm for the problem, using the configuration-LP. Specifically, although our algorithm is fully combinatorial, it consists of a local-search procedure that is guaranteed to succeed only when the configuration-LP is feasible. In order to establish the correctness and running time of the algorithm, it is crucial to use the configuration-LP in our analysis. The second problem we study is the scheduling of jobs on unrelated machines in order to minimize the sum of weighted completion times. For this problem, the best known approximation algorithm achieves a ratio of $3/2-r$, for some small $r>0$. Our second contribution in this thesis is the improvement of this ratio to $(1+\sqrt{2})/2+c$, for any $c>0$, for the special case of the problem where the jobs have uniform Smith ratios. To achieve this ratio, we design a randomized rounding algorithm that rounds solutions to the corresponding configuration-LP. Through a careful examination of the distribution this randomized algorithm outputs, we identify the one that maximizes the approximation ratio, and we then upper bound the ratio this worst-case distribution exhibits by $(1+\sqrt{2})/2+c$.
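As background for the Smith ratios mentioned above: on a single machine, ordering jobs by non-increasing weight-to-processing-time ratio (Smith's rule) minimizes the sum of weighted completion times; the unrelated-machines problem studied in the abstract is the NP-hard generalization of this setting. A minimal sketch, with object and field names of our choosing:

```scala
object SmithRule {
  final case class Job(weight: Double, time: Double)

  // Sum of w_j * C_j when jobs run back-to-back in the given order.
  def weightedCompletionTime(order: Seq[Job]): Double = {
    var clock = 0.0
    order.map { j => clock += j.time; j.weight * clock }.sum
  }

  // Smith's rule: schedule by non-increasing ratio w_j / p_j.
  def schedule(jobs: Seq[Job]): Seq[Job] =
    jobs.sortBy(j => -j.weight / j.time)
}
```

For jobs (w=3, p=1) and (w=1, p=2), Smith's rule runs the first job first for a cost of 3·1 + 1·3 = 6, versus 1·2 + 3·3 = 11 in the opposite order. Jobs have "uniform Smith ratios", as in the special case above, when w_j / p_j is the same for all jobs, so this simple ordering no longer discriminates between schedules.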

Adam Teodor Polak, Lars Rohwedder

Knapsack and Subset Sum are fundamental NP-hard problems in combinatorial optimization. Recently there has been a growing interest in understanding the best possible pseudopolynomial running times for these problems with respect to various parameters. In this paper we focus on the maximum item size s and the maximum item value v. We give algorithms that run in time O(n + s³) and O(n + v³) for the Knapsack problem, and in time Õ(n + s^{5/3}) for the Subset Sum problem. Our algorithms work for the more general problem variants with multiplicities, where each input item comes with a (binary encoded) multiplicity, which succinctly describes how many times the item appears in the instance. In these variants n denotes the (possibly much smaller) number of distinct items. Our results follow from combining and optimizing several diverse lines of research, notably proximity arguments for integer programming due to Eisenbrand and Weismantel (TALG 2019), fast structured (min,+)-convolution by Kellerer and Pferschy (J. Comb. Optim. 2004), and additive combinatorics methods originating from Galil and Margalit (SICOMP 1991).
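For context, the classical pseudopolynomial dynamic program for 0/1 Knapsack runs in O(n·C) time for capacity C; the results above improve on such bounds in terms of the maximum item size s and value v using far more involved techniques (proximity arguments, structured (min,+)-convolution). A baseline sketch in Scala, with names of our choosing:

```scala
object Knapsack {
  // items: (size, value) pairs; returns the best total value within the capacity.
  def maxValue(items: Seq[(Int, Int)], capacity: Int): Int = {
    // best(c) = maximum value achievable with total size at most c
    val best = Array.fill(capacity + 1)(0)
    // Iterate capacities downward so each item is used at most once (0/1 variant).
    for ((size, value) <- items; c <- capacity to size by -1)
      best(c) = best(c) max (best(c - size) + value)
    best(capacity)
  }
}
```

For items (3, 4), (2, 3), (4, 5) and capacity 5, the optimum packs the first two items for value 7. The multiplicity variants in the paper would additionally take a (binary encoded) count per distinct item, which this sketch does not model.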

With the latest developments in video coding technology and the fast deployment of end-user broadband internet connections, real-time media applications are becoming increasingly interesting for both private users and businesses. However, the internet remains a best-effort network, unable to guarantee the stringent requirements of media applications in terms of high and constant bandwidth, low packet loss rate, and low transmission delay. Therefore, efficient adaptation mechanisms must be derived in order to bridge the application requirements with the transport medium characteristics. Lately, different network architectures, e.g., peer-to-peer networks, content distribution networks, and parallel wireless services, have emerged as potential solutions for reducing the cost of communication or infrastructure, and possibly improving application performance. In this thesis, we start from the path diversity offered by these architectures in order to build a new framework for media streaming in multipath networks. Within this framework we address important issues related to an efficient streaming process, namely path selection and rate allocation, forward error correction, and packet scheduling over multiple transmission paths. First, we consider a network graph between the streaming server and the client, offering multiple possible transmission paths to the media application. We are interested in finding the optimal subset of paths employed for data transmission, and the optimal rate allocation on these paths, in order to optimize a video distortion metric. Our in-depth analysis of the proposed scenario eventually leads to the derivation of three important theorems, which, in turn, form the basis for an optimal, linear-time algorithm that solves our optimization problem.
At the same time, we provide distributed protocols that compute the optimal solution, suitable for large-scale network graphs where a centralized solution is too expensive. Next, we address the problem of forward error correction for scalable media streaming over multiple network paths. We propose various algorithms for error protection in a multipath scenario, and we assess the opportunity of in-network error correction. Our analysis stresses the advantage of flexibility in the scheduling and error correction process on multiple network paths, and emphasizes the limitations of possible real-system implementations, where application choices are limited. Finally, we observe the improvements brought by in-network processing of transmitted media flows in the case of heterogeneous networks, when link parameters vary greatly. Once the rate allocation and error correction issues are addressed, we discuss the packet scheduling problem over multiple transmission paths. We rely on a scalable bitstream packet model inspired by the media coding process, where media packets have different priorities and dependencies. Based on the concept of data pre-fetch, and on a strict time analysis of the transmission process, we propose fast algorithms for efficient packet scheduling over multiple paths. We ensure graceful degradation of the media at the client in adverse network conditions by careful load balancing among transmission paths, and by conservative scheduling that transparently absorbs undetected network variations or network estimation errors. The final part of this thesis presents a possible system for media streaming in which our proposed mechanisms and protocols can be straightforwardly implemented. We describe a wireless setup where clients can access various applications over possibly multiple wireless services. In this setup, we solve the rate allocation problem with the final goal of maximizing overall system performance.
To this end, we propose a unifying quality metric which maps the individual performance of each application (including streaming) to a common value, later used in the optimization process. We propose a fast algorithm for computing a close-to-optimal solution to this problem, and we show that, compared to other traditional methods, we achieve fairer performance that adapts better to changing network environments.