Summary
In parallel computing, the granularity (or grain size) of a task is a measure of the amount of work (or computation) performed by that task. Another definition of granularity takes into account the communication overhead between processors or processing elements: it defines granularity as the ratio of computation time to communication time, where computation time is the time required to perform the computation of a task and communication time is the time required to exchange data between processors. If T_comp is the computation time and T_comm is the communication time, then the granularity G of a task is

G = T_comp / T_comm

Granularity is usually measured as the number of instructions executed in a particular task. Alternatively, it can be specified in terms of the execution time of a program, combining computation time and communication time.

Depending on the amount of work performed by a parallel task, parallelism can be classified into three categories: fine-grained, medium-grained and coarse-grained parallelism.

In fine-grained parallelism, a program is broken down into a large number of small tasks, which are assigned individually to many processors. The amount of work associated with each task is low, and the work is evenly distributed among the processors; hence, fine-grained parallelism facilitates load balancing. Because each task processes little data, a large number of processors is required to complete the processing, which in turn increases the communication and synchronization overhead. Fine-grained parallelism is therefore best exploited in architectures that support fast communication; shared-memory architectures, with their low communication overhead, are most suitable for it. Since it is difficult for programmers to detect this kind of parallelism by hand, it is usually the compiler's responsibility to identify fine-grained parallelism. An example of a fine-grained system (from outside the parallel computing domain) is the system of neurons in the brain.
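As a rough illustration (not part of the page above), the Python sketch below splits the same workload into many small tasks versus a few large ones and evaluates the granularity ratio G = T_comp / T_comm for hypothetical per-task timings. The task counts and timing values are illustrative assumptions, and because of Python's global interpreter lock the thread pool mainly demonstrates task decomposition and its overhead rather than genuine parallel speedup.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical workload: summing a list of numbers, split into many small
# chunks (fine-grained) or a few large chunks (coarse-grained).
data = list(range(1_000_000))

def process_chunk(chunk):
    # Computation part of one task.
    return sum(chunk)

def run(num_tasks):
    # Break the data into num_tasks roughly equal chunks and process them
    # with a pool of worker threads.
    chunk_size = max(1, len(data) // num_tasks)
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    start = time.perf_counter()
    with ThreadPoolExecutor() as pool:
        partial_sums = list(pool.map(process_chunk, chunks))
    elapsed = time.perf_counter() - start
    return sum(partial_sums), elapsed

# Fine-grained: many small tasks -> more scheduling/communication overhead.
total_fine, t_fine = run(num_tasks=10_000)
# Coarse-grained: few large tasks -> less overhead per unit of computation.
total_coarse, t_coarse = run(num_tasks=8)

print(f"fine-grained   total={total_fine}  time={t_fine:.3f}s")
print(f"coarse-grained total={total_coarse}  time={t_coarse:.3f}s")

# Granularity G = T_comp / T_comm: with hypothetical per-task timings of
# 50 microseconds of computation and 10 microseconds of communication, G = 5.
t_comp, t_comm = 50e-6, 10e-6
print(f"granularity G = {t_comp / t_comm:.1f}")
```

A higher G means each task does more useful computation per unit of communication, which is why the coarse-grained split usually pays less overhead, while the fine-grained split balances load more evenly across workers.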
Related courses (11)
CS-302: Parallelism and concurrency in software
From sensors, to smart phones, to the world's largest datacenters and supercomputers, parallelism & concurrency is ubiquitous in modern computing.
CS-206: Parallelism and concurrency
Course no longer offered for new students; this edition is only a make-up course for those who repeated the year.
CS-470: Advanced computer architecture
The course studies techniques to exploit Instruction-Level Parallelism (ILP) statically and dynamically. It also addresses some aspects of the design of domain-specific accelerators.