In computer science, instruction scheduling is a compiler optimization used to improve instruction-level parallelism, which improves performance on machines with instruction pipelines. Put more simply, it tries to do the following without changing the meaning of the code:
Avoid pipeline stalls by rearranging the order of instructions.
Avoid illegal or semantically ambiguous operations (typically involving subtle instruction pipeline timing issues or non-interlocked resources).
The pipeline stalls can be caused by structural hazards (processor resource limit), data hazards (output of one instruction needed by another instruction) and control hazards (branching).
Instruction scheduling is typically done on a single basic block. In order to determine whether rearranging the block's instructions in a certain way preserves the behavior of that block, we need the concept of a data dependency. There are three types of dependencies, which also happen to be the three data hazards:
Read after Write (RAW or "True"): Instruction 1 writes a value used later by Instruction 2. Instruction 1 must come first, or Instruction 2 will read the old value instead of the new.
Write after Read (WAR or "Anti"): Instruction 1 reads a location that is later overwritten by Instruction 2. Instruction 1 must come first, or it will read the new value instead of the old.
Write after Write (WAW or "Output"): Two instructions both write the same location. They must occur in their original order.
Technically, there is a fourth type, Read after Read (RAR or "Input"): Both instructions read the same location. Input dependence does not constrain the execution order of two statements, but it is useful in scalar replacement of array elements.
To make sure we respect the three types of dependencies, we construct a dependency graph, which is a directed graph where each vertex is an instruction and there is an edge from I1 to I2 if I1 must come before I2 due to a dependency. If loop-carried dependencies are left out, the dependency graph is a directed acyclic graph.
Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.
Students learn several implementation techniques for modern functional and object-oriented programming languages. They put some of them into practice by developing key parts of a compiler and run time
The course studies techniques to exploit Instruction-Level Parallelism (ILP) statically and dynamically. It also addresses some aspects of the design of domain-specific accelerators. Finally, it explo
Couvre l'optimisation logicielle, l'efficacité du cache, la planification parallèle et les stratégies de distribution de travail pour les programmes parallèles rapides.
Couvre les optimisations logicielles pour améliorer les performances du programme en maximisant les succès de cache et en optimisant la distribution du travail.
NOTOC In computer science, instruction selection is the stage of a compiler backend that transforms its middle-level intermediate representation (IR) into a low-level IR. In a typical compiler, instruction selection precedes both instruction scheduling and register allocation; hence its output IR has an infinite set of pseudo-registers (often known as temporaries) and may still be – and typically is – subject to peephole optimization. Otherwise, it closely resembles the target machine code, bytecode, or assembly language.
En informatique, le profilage de code (ou code profiling en anglais) consiste à analyser l'exécution d'un logiciel afin de connaitre son comportement à l'exécution. Le profilage de code permet de contrôler lors de l'exécution d'un logiciel : la liste des fonctions appelées et le temps passé dans chacune d'elles ; l'utilisation processeur ; l'utilisation mémoire. en rajoutant des instructions au code source originel qui permettent de mesurer le comportement du logiciel lors de l'exécution.
Dans un compilateur, l'allocation de registres est une étape importante de la génération de code. Elle vise à choisir judicieusement dans quel registre du processeur seront enregistrées les variables durant l'exécution du programme que l'on compile. Les registres sont des mémoires internes au processeur, généralement capables de contenir un mot machine. Les opérations sur des valeurs rangées dans des registres sont plus rapides que celles sur des valeurs en mémoire vive, quand ce ne sont pas les seules possibles.
Side-channel disassembly attacks recover CPU instructions from power or electromagnetic side-channel traces measured during code execution. These attacks typically rely on physical access, proximity to the victim device, and high sampling rate measuring in ...
A central task in high-level synthesis is scheduling: the allocation of operations to clock cycles. The classic approach to scheduling is static, in which each operation is mapped to a clock cycle at compile-time, but recent years have seen the emergence o ...
A central task in high-level synthesis is scheduling: the allocation of operations to clock cycles. The classic approach to scheduling is static, in which each operation is mapped to a clock cycle at compile-time, but recent years have seen the emergence o ...