Task parallelismTask parallelism (also known as function parallelism and control parallelism) is a form of parallelization of computer code across multiple processors in parallel computing environments. Task parallelism focuses on distributing tasks—concurrently performed by processes or threads—across different processors. In contrast to data parallelism which involves running the same task on different components of data, task parallelism is distinguished by running many different tasks at the same time on the same data.
Barrier (computer science)In parallel computing, a barrier is a type of synchronization method. A barrier for a group of threads or processes in the source code means any thread/process must stop at this point and cannot proceed until all other threads/processes reach this barrier. Many collective routines and directive-based parallel languages impose implicit barriers. For example, a parallel do loop in Fortran with OpenMP will not be allowed to continue on any thread until the last iteration is completed.
X10 (programming language)X10 is a programming language being developed by IBM at the Thomas J. Watson Research Center as part of the Productive, Easy-to-use, Reliable Computing System (PERCS) project funded by DARPA's High Productivity Computing Systems (HPCS) program. Its primary authors are Kemal Ebcioğlu, Saravanan Arumugam (Aswath), Vijay Saraswat, and Vivek Sarkar. X10 is designed specifically for parallel computing using the partitioned global address space (PGAS) model.
Bulk synchronous parallelThe bulk synchronous parallel (BSP) abstract computer is a bridging model for designing parallel algorithms. It is similar to the parallel random access machine (PRAM) model, but unlike PRAM, BSP does not take communication and synchronization for granted. In fact, quantifying the requisite synchronization and communication is an important part of analyzing a BSP algorithm. The BSP model was developed by Leslie Valiant of Harvard University during the 1980s. The definitive article was published in 1990.
Intrinsic functionIn computer software, in compiler theory, an intrinsic function (or built-in function) is a function (subroutine) available for use in a given programming language whose implementation is handled specially by the compiler. Typically, it may substitute a sequence of automatically generated instructions for the original function call, similar to an inline function. Unlike an inline function, the compiler has an intimate knowledge of an intrinsic function and can thus better integrate and optimize it for a given situation.
Compute kernelIn computing, a compute kernel is a routine compiled for high throughput accelerators (such as graphics processing units (GPUs), digital signal processors (DSPs) or field-programmable gate arrays (FPGAs)), separate from but used by a main program (typically running on a central processing unit). They are sometimes called compute shaders, sharing execution units with vertex shaders and pixel shaders on GPUs, but are not limited to execution on one class of device, or graphics APIs.
Algorithmic skeletonIn computing, algorithmic skeletons, or parallelism patterns, are a high-level parallel programming model for parallel and distributed computing. Algorithmic skeletons take advantage of common programming patterns to hide the complexity of parallel and distributed applications. Starting from a basic set of patterns (skeletons), more complex patterns can be built by combining the basic ones.
Fork–join modelIn parallel computing, the fork–join model is a way of setting up and executing parallel programs, such that execution branches off in parallel at designated points in the program, to "join" (merge) at a subsequent point and resume sequential execution. Parallel sections may fork recursively until a certain task granularity is reached. Fork–join can be considered a parallel design pattern. It was formulated as early as 1963.