Auto-vectorisation | EPFL Graph Search

Related courses (4)

CS-629: Constructive Computer Architecture

Beginning with a basic pipelined processor, students will learn to implement intriguing architectural techniques through a series of labs. The class will emphasize the implementation, debugging, and ana

CS-453: Concurrent computing

With the advent of modern architectures, it becomes crucial to master the underlying algorithmics of concurrency. The objective of this course is to study the foundations of concurrent algorithms and

DH-406: Machine learning for DH

This course aims to introduce the basic principles of machine learning in the context of the digital humanities. We will cover both supervised and unsupervised learning techniques, and study and imple


Related lectures (16)

Related publications (5)

An Optimizing Multi-platform Source-to-source Compiler Framework for the NEURON MODeling Language

Felix Schürmann, James Gonzalo King, Michael Lee Hines, Pramod Shivaji Kumbhar, Omar Awile, Jorge Blanco Alonso, Liam Roger George Keegan

Domain-specific languages (DSLs) play an increasingly important role in the generation of high performing software. They allow the user to exploit domain knowledge for the generation of more efficient code on target architectures. Here, we describe a new c ...

Springer, 2020

Exploiting Flow Graph of System of ODEs to Accelerate the Simulation of Biologically-Detailed Neural Networks

Felix Schürmann, Michael Lee Hines, Bruno Ricardo Da Cunha Magalhães

Exposing parallelism in scientific applications has become a core requirement for efficiently running on modern distributed multicore SIMD compute architectures. The granularity of parallelism that can be attained is a key determinant for the achievable ac ...

IEEE, 2019

Porting an MPEG-HEVC decoder to a low-power many-core platform

Marco Mattavelli, Simone Casale Brunet, Claudio Paolo Alberti, Damien Jack De Saint Jorre

After several generations of video coding standards, MPEG High Efficiency Video Coding (HEVC) is likely to emerge as the video coding standard for HD and Ultra-HD TV. HEVC decoding is expected to be less computationally demanding and to provide a higher le ...

2013


Related concepts (4)

Intrinsic function

In compiler theory, an intrinsic function is a function available in a given programming language whose implementation is handled by the compiler itself. Typically, the compiler replaces the original function call with an automatically generated sequence of instructions, much as it would for an inline function. Unlike an inline function, however, the compiler has detailed knowledge of the intrinsic function and can therefore integrate it better and optimize it for the situation at hand.
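As a minimal sketch of the idea, the C fragment below counts the set bits of a word in two ways: with an ordinary loop, and with `__builtin_popcount`, a built-in provided by GCC and Clang. The function names `popcount_loop` and `popcount_intrinsic` are hypothetical and chosen only for illustration, and the sketch assumes `unsigned int` is at least 32 bits wide. Because the compiler knows the built-in's meaning, it may replace the call with a single machine instruction on targets that have one.

```c
#include <stdint.h>

/* Portable version: clears one set bit per iteration. */
unsigned popcount_loop(uint32_t x)
{
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;   /* clear the lowest set bit */
        ++count;
    }
    return count;
}

/* Intrinsic version (hypothetical wrapper name).
 * Assumption: GCC or Clang, which provide __builtin_popcount.
 * Other compilers fall back to the portable loop. */
unsigned popcount_intrinsic(uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return (unsigned)__builtin_popcount(x);
#else
    return popcount_loop(x);
#endif
}
```

Both functions return the same value for every input; only the generated code differs, and how it differs depends on the compiler and target flags.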

Loop optimization

In compiler theory, loop optimization is the process of increasing execution speed and reducing the overheads associated with loops. It plays an important role in improving cache performance and making effective use of parallel processing capabilities. Most execution time of a scientific program is spent on loops; as such, many compiler optimization techniques have been developed to make them faster. Since instructions inside loops can be executed repeatedly, it is frequently not possible to give a bound on the number of instruction executions that will be impacted by a loop optimization.
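To illustrate one such transformation, the sketch below shows a simple scaled-add loop next to a hand-unrolled variant. The names `saxpy` and `saxpy_unrolled` and the unroll factor of 4 are assumptions made for this example, not taken from any of the works listed here. The `restrict` qualifiers (C99) promise that `x` and `y` do not overlap, which is one of the facts a compiler typically needs before it will vectorize such a loop on its own at higher optimization levels.

```c
#include <stddef.h>

/* Straightforward loop: y[i] += a * x[i].
 * Iterations are independent, so a compiler may auto-vectorize it. */
void saxpy(float *restrict y, const float *restrict x, float a, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}

/* Same computation, unrolled by 4 by hand (illustrative only).
 * Unrolling reduces the number of loop-control branches per element. */
void saxpy_unrolled(float *restrict y, const float *restrict x, float a, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        y[i]     += a * x[i];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }
    /* Remainder loop handles the last n % 4 elements. */
    for (; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

In practice the first form is usually preferable, since an optimizing compiler can unroll and vectorize it itself; the second form mainly shows what such a transformation looks like.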

AltiVec

AltiVec is a floating-point and integer SIMD instruction set designed and owned by Apple, IBM and Motorola (the AIM alliance), and implemented on versions of the PowerPC such as Motorola's G4 and IBM's G5. AltiVec is a trade name owned solely by Motorola; the instruction set is therefore also called Velocity Engine by Apple and VMX by IBM. After the Cray-1 supercomputer demonstrated the performance of vector processing in 1976, this type of architecture became an important technique for vector computation and, more generally, matrix computation.
