Module 3: Shared-Memory Programming with OpenMP¶

Characteristics¶

pragmas¶: The preprocessor directive employed by OpenMP. Employ clauses to fine control the parallelized behavior. It works like a domain specific language.
Structured block¶: The code block to be parallelized by OpenMP. It can be a for/while/do-while statement, a function call or a code block enclosed by {}
Threads (OpenMP)¶: OpenMP will fork threads to execute the structured block. The collection of threads is called Team. There is a master thread to execute the codes that are not parallelized and the rest of threads are known as slave, which execute the structured block only.
Synchronization¶: A mechanism to constraint the ordering of execution of instructions to preclude potential problems such as racing condition, and other inconsistency problems.
Variable Scope (OpenMP)¶: As OpenMP is not part of C standard, the scope of variables in the structured block needs special handling. The can be defined as private or shared in OpenMP pragma directives. The default scope of variables declared outside of the block is shared, while the default scope of variables declared inside the block is private.

Parallelize outer loop if possible
Separate parallel and for constructs to reduce fork and join operations if parallelization of inner loop is desirable
Consider cache friendship
- change scheduling method