Tomasulo's algorithm is a computer architecture hardware algorithm for dynamic scheduling of instructions that allows out-of-order execution and enables more efficient use of multiple execution units. It was developed by Robert Tomasulo at IBM in 1967 and was first implemented in the IBM System/360 Model 91’s floating point unit. The major innovations of Tomasulo’s algorithm include register renaming in hardware, reservation stations for all execution units, and a common data bus (CDB) on which computed values broadcast to all reservation stations that may need them. These developments allow for improved parallel execution of instructions that would otherwise stall under the use of scoreboarding or other earlier algorithms. Robert Tomasulo received the Eckert–Mauchly Award in 1997 for his work on the algorithm. The following are the concepts necessary to the implementation of Tomasulo's algorithm: The Common Data Bus (CDB) connects reservation stations directly to functional units. According to Tomasulo it "preserves precedence while encouraging concurrency". This has two important effects: Functional units can access the result of any operation without involving a floating-point-register, allowing multiple units waiting on a result to proceed without waiting to resolve contention for access to register file read ports. Hazard Detection and control execution are distributed. The reservation stations control when an instruction can execute, rather than a single dedicated hazard unit. Instructions are issued sequentially so that the effects of a sequence of instructions, such as exceptions raised by these instructions, occur in the same order as they would on an in-order processor, regardless of the fact that they are being executed out-of-order (i.e. non-sequentially). Tomasulo's algorithm uses register renaming to correctly perform out-of-order execution. All general-purpose and reservation station registers hold either a real value or a placeholder value.

About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Related courses (3)
CS-472: Design technologies for integrated systems
Hardware compilation is the process of transforming specialized hardware description languages into circuit descriptions, which are iteratively refined, detailed and optimized. The course presents a
CS-307: Introduction to multiprocessor architecture
Multiprocessors are a core component in all types of computing infrastructure, from phones to datacenters. This course will build on the prerequisites of processor design and concurrency to introduce
CS-471: Advanced multiprocessor architecture
Multiprocessors are basic building blocks for all computer systems. This course covers the architecture and organization of modern multiprocessors, prevalent accelerators (e.g., GPU, TPU), and datacen
Related lectures (15)
Dynamic Scheduling in Processors: Enhancing Instruction Execution
Covers dynamic scheduling in processors, focusing on out-of-order execution and managing instruction dependencies effectively.
CMP Caches I
Explores CMP cache basics, transactional memory, in-flight transactions, cache overflow mechanism, and victim replication.
Introduction to Dynamic Logic
Introduces Dynamic Logic, covering its principles, issues, and design variations, including footed vs. unfooted designs and multiple-output structures.
Show more
Related publications (26)

Imprecise Store Exceptions

Babak Falsafi, Mathias Josef Payer, Yuanlong Li, Siddharth Gupta, Yunho Oh, Qingxuan Kang, Abhishek Bhattacharjee

Precise exceptions are a cornerstone of modern computing as they provide the abstraction of sequential instruction execution to programmers while accommodating microarchitectural optimizations. However, increasing compute capabilities in deep memory hierar ...
ACM2023

Time scales of dynamic stall development on a vertical-axis wind turbine blade

Karen Ann J Mulleners, Sébastien Le Fouest, Daniel Marie Fernex

Vertical-axis wind turbines are excellent candidates to diversify wind energy technology, but their aerodynamic complexity limits industrial deployment. To improve the efficiency and lifespan of vertical-axis wind turbines, we desire data-driven models and ...
2023

Acceleration of graph pattern mining and applications to financial crime

Jovan Blanusa

Various forms of real-world data, such as social, financial, and biological networks, can berepresented using graphs. An efficient method of analysing this type of data is to extractsubgraph patterns, such as cliques, cycles, and motifs, from graphs. For i ...
EPFL2023
Show more
Related concepts (5)
Out-of-order execution
In computer engineering, out-of-order execution (or more formally dynamic execution) is a paradigm used in most high-performance central processing units to make use of instruction cycles that would otherwise be wasted. In this paradigm, a processor executes instructions in an order governed by the availability of input data and execution units, rather than by their original order in a program. In doing so, the processor can avoid being idle while waiting for the preceding instruction to complete and can, in the meantime, process the next instructions that are able to run immediately and independently.
Hazard (computer architecture)
In the domain of central processing unit (CPU) design, hazards are problems with the instruction pipeline in CPU microarchitectures when the next instruction cannot execute in the following clock cycle, and can potentially lead to incorrect computation results. Three common types of hazards are data hazards, structural hazards, and control hazards (branching hazards). There are several methods used to deal with hazards, including pipeline stalls/pipeline bubbling, operand forwarding, and in the case of out-of-order execution, the scoreboarding method and the Tomasulo algorithm.
Scoreboarding
Scoreboarding is a centralized method, first used in the CDC 6600 computer, for dynamically scheduling instructions so that they can execute out of order when there are no conflicts and the hardware is available. In a scoreboard, the data dependencies of every instruction are logged, tracked and strictly observed at all times. Instructions are released only when the scoreboard determines that there are no conflicts with previously issued ("in flight") instructions.
Show more

Graph Chatbot

Chat with Graph Search

Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.

DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.