The DO directive specifies that the iterations of the immediately following DO loop must be dispatched across the existing team of threads so that each iteration is executed by a single thread. The loop that follows a DO directive cannot be a DO WHILE loop or a DO loop that does not have loop control.
You cannot use a GOTO statement, or any other statement, to transfer control into or out of the DO construct.
If you specify the optional END DO directive, it must appear immediately after the end of the DO loop. If you do not specify the END DO directive, an END DO directive is assumed at the end of the DO loop.
The loop iteration variable is private by default, so it is not necessary to declare it explicitly.
If you do not specify the optional NOWAIT clause on the END DO directive, threads synchronize at the END DO directive. If you specify NOWAIT, threads do not synchronize, and threads that finish early proceed directly to the instructions following the END DO directive.
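The DO directive optionally lets you control data scope attributes and specify the schedule type and chunk size, as described under the following headings.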
Controlling Data Scope Attributes
For information about controlling data scope attributes, see Section 6.1.5.2.
Specifying Schedule Type and Chunk Size
The SCHEDULE clause specifies a scheduling algorithm that determines how iterations of the DO loop are divided among and dispatched to the threads of the team. The SCHEDULE clause applies only to the current DO or PARALLEL DO directive.
Within the SCHEDULE clause, you must specify a schedule type and, optionally, a chunk size. The chunk size must be a scalar integer expression.
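As a rough illustration of the SCHEDULE clause on a DO directive, the following sketch requests the DYNAMIC schedule type with a chunk size of 4. It assumes free source form and an OpenMP-capable compiler; the module name do_schedule_demo and the procedure scale_by_two are hypothetical and do not come from this manual.

```fortran
module do_schedule_demo
  implicit none
contains
  ! Hypothetical helper: stores 2*a(i) in b(i) for every element of a.
  ! Assumption: b is at least as long as a.
  subroutine scale_by_two(a, b)
    real, intent(in)  :: a(:)
    real, intent(out) :: b(:)
    integer :: i   ! loop iteration variable, private by default

    !$omp parallel shared(a, b)
    ! Iterations are handed out in chunks of 4 as threads become free.
    !$omp do schedule(dynamic, 4)
    do i = 1, size(a)
      b(i) = 2.0 * a(i)
    end do
    !$omp end do
    !$omp end parallel
  end subroutine scale_by_two
end module do_schedule_demo
```

A caller would USE the module and call scale_by_two(a, b). Because each iteration writes a distinct element of b, the result does not depend on which schedule type is chosen; only the distribution of work changes.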
The following list describes the schedule types and how the chunk size affects scheduling:
You can determine the schedule type used for the current DO or PARALLEL DO directive by using the following prioritized list. The available schedule type closest to the top of the list is used:
You can determine the chunk size used for the current DO or PARALLEL DO directive by using the following prioritized list. The available chunk size closest to the top of the list is used:
Use the noniterative worksharing SECTIONS directive to divide the enclosed sections of code among the team. Each section is executed just one time by one thread.
Precede each section with a SECTION directive. However, the SECTION directive is optional for the first section. The SECTION directive must appear within the lexical extent of the SECTIONS and END SECTIONS directives.
The last section ends at the END SECTIONS directive. When a thread completes its section and there are no undispatched sections, it waits at the END SECTIONS directive unless you specify NOWAIT.
The following example shows how to use the SECTIONS and SECTION directives to execute subroutines XAXIS, YAXIS, and ZAXIS in parallel. The first SECTION directive is optional:
!$OMP PARALLEL
!$OMP SECTIONS
!$OMP SECTION
      CALL XAXIS
!$OMP SECTION
      CALL YAXIS
!$OMP SECTION
      CALL ZAXIS
!$OMP END SECTIONS
!$OMP END PARALLEL
For information about controlling the data scope attributes, see Section 6.1.5.2.
6.1.7.3 SINGLE and END SINGLE Directives
Use the SINGLE directive when you want just one thread of the team to execute the enclosed block of code.
Threads that are not executing the SINGLE directive wait at the END SINGLE directive unless you specify NOWAIT.
In the following example, the first thread that encounters the SINGLE directive executes subroutines OUTPUT and INPUT:
!$OMP PARALLEL DEFAULT(SHARED)
      CALL WORK(X)
!$OMP BARRIER
!$OMP SINGLE
      CALL OUTPUT(X)
      CALL INPUT(Y)
!$OMP END SINGLE
      CALL WORK(Y)
!$OMP END PARALLEL
For information about controlling the data scope attributes, see Section 6.1.5.2.
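The waiting behavior at END SINGLE can be sketched as follows: one thread sets a shared value, and the other threads only read it after they pass END SINGLE. This is a minimal sketch that assumes free source form and an OpenMP-capable compiler; the names single_demo, count_after_init, and factor are hypothetical.

```fortran
module single_demo
  implicit none
contains
  ! Hypothetical helper: one thread initializes factor, then all threads
  ! add it to total once per loop iteration. The result is 3*n.
  function count_after_init(n) result(total)
    integer, intent(in) :: n
    integer :: total
    integer :: factor   ! shared, written by exactly one thread
    integer :: i

    total = 0
    !$omp parallel default(shared) private(i)
    !$omp single
    factor = 3
    !$omp end single
    ! No NOWAIT was given, so every thread has waited at END SINGLE
    ! and factor is set before the loop starts.
    !$omp do
    do i = 1, n
      !$omp atomic
      total = total + factor
    end do
    !$omp end do
    !$omp end parallel
  end function count_after_init
end module single_demo
```

If NOWAIT were added to END SINGLE, a thread could reach the loop before factor is assigned, so the sketch relies on the default synchronization.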
6.1.8 Combined Parallel/Worksharing Constructs
The combined parallel/worksharing constructs provide an abbreviated way to specify a parallel region that contains a single worksharing construct. The combined parallel/worksharing constructs are PARALLEL DO and PARALLEL SECTIONS.
Use the PARALLEL DO directive to specify a parallel region that implicitly contains a single DO directive.
You can specify one or more of the clauses for the PARALLEL and the DO directives (see Section 6.1.6 and Section 6.1.7.1).
The following example shows how to parallelize a simple loop. The loop iteration variable is private by default, so it is not necessary to declare it explicitly. The END PARALLEL DO directive is optional.
!$OMP PARALLEL DO
      DO I=2,N
        B(I) = (A(I) + A(I-1)) / 2.0
      END DO
!$OMP END PARALLEL DO
Use the PARALLEL SECTIONS directive to specify a parallel region that implicitly contains a single SECTIONS directive.
You can specify one or more of the clauses for the PARALLEL and the SECTIONS directives (see Section 6.1.6 and Section 6.1.7.2).
The last section ends at the END PARALLEL SECTIONS directive.
In the following example, subroutines XAXIS, YAXIS, and ZAXIS can be executed concurrently. The first SECTION directive is optional. Note that all SECTION directives must appear in the lexical extent of the PARALLEL SECTIONS/END PARALLEL SECTIONS construct.
!$OMP PARALLEL SECTIONS
!$OMP SECTION
      CALL XAXIS
!$OMP SECTION
      CALL YAXIS
!$OMP SECTION
      CALL ZAXIS
!$OMP END PARALLEL SECTIONS
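For a version of this pattern that can be compiled on its own, the following sketch runs two independent computations as separate sections. It assumes free source form and an OpenMP-capable compiler; the names sections_demo and sum_and_max are hypothetical.

```fortran
module sections_demo
  implicit none
contains
  ! Hypothetical helper: computes the sum and the maximum of a in two
  ! sections that may be executed by different threads.
  subroutine sum_and_max(a, total, biggest)
    real, intent(in)  :: a(:)
    real, intent(out) :: total, biggest

    !$omp parallel sections shared(a, total, biggest)
    !$omp section
    total = sum(a)        ! first section writes only total
    !$omp section
    biggest = maxval(a)   ! second section writes only biggest
    !$omp end parallel sections
  end subroutine sum_and_max
end module sections_demo
```

Each section writes a different variable and only reads the shared array, so no further synchronization is needed between the two.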
Synchronization is the interthread communication that ensures the consistency of shared data and coordinates parallel execution among threads.
Shared data is consistent within a team of threads when all threads obtain the identical value when the data is accessed.
The synchronization constructs are ATOMIC, BARRIER, CRITICAL, FLUSH, MASTER, and ORDERED.
Use the ATOMIC directive to ensure that a specific memory location is updated atomically instead of exposing the location to the possibility of multiple, simultaneously writing threads.
This directive applies only to the immediately following statement, which must have one of the following forms:
x = x operator expr
x = expr operator x
x = intrinsic (x, expr)
x = intrinsic (expr, x)
In the preceding statements:
This directive permits optimization beyond that of a critical section around the assignment. An implementation can replace all ATOMIC directives by enclosing the statement in a critical section. All of these critical sections must use the same unique name.
Only the load and store of x are atomic; the evaluation of expr is not atomic. To avoid race conditions, all updates of the location in parallel must be protected by using the ATOMIC directive, except those that are known to be free of race conditions. The function intrinsic, the operator operator, and the assignment must be the intrinsic function, operator, and assignment, not user-defined ones.
The following restriction applies to the ATOMIC directive:
In the following example, the location Y is updated atomically:
!$OMP ATOMIC
      Y = Y + B(I)
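To show the directive inside a complete parallel loop, the following sketch builds a histogram in which several iterations can update the same array element. It assumes free source form and an OpenMP-capable compiler; the names atomic_demo and histogram are hypothetical, and every value in idx is assumed to be a valid index into counts.

```fortran
module atomic_demo
  implicit none
contains
  ! Hypothetical helper: counts how often each bin number occurs in idx.
  ! Assumption: 1 <= idx(i) <= size(counts) for every i.
  subroutine histogram(idx, counts)
    integer, intent(in)  :: idx(:)
    integer, intent(out) :: counts(:)
    integer :: i

    counts = 0
    !$omp parallel do shared(idx, counts)
    do i = 1, size(idx)
      ! Different iterations may select the same bin, so the update
      ! uses the form x = x operator expr under ATOMIC.
      !$omp atomic
      counts(idx(i)) = counts(idx(i)) + 1
    end do
    !$omp end parallel do
  end subroutine histogram
end module atomic_demo
```

Without the ATOMIC directive, two threads could read the same old count and one increment would be lost.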
To synchronize all threads within a parallel region, use the BARRIER directive. You can use this directive only within a parallel region defined by using the PARALLEL directive. You cannot use the BARRIER directive within the DO, PARALLEL DO, SECTIONS, PARALLEL SECTIONS, and SINGLE directives.
When encountered, each thread waits at the BARRIER directive until all threads have reached the directive.
In the following example, the BARRIER directive ensures that all threads have executed the first loop and that it is safe to execute the second loop:
c$OMP PARALLEL
c$OMP DO PRIVATE(i)
      DO i = 1, 100
        b(i) = i
      END DO
c$OMP BARRIER
c$OMP DO PRIVATE(i)
      DO i = 1, 100
        a(i) = b(101-i)
      END DO
c$OMP END PARALLEL
Use the CRITICAL and END CRITICAL directives to restrict access to a block of code to one thread at a time.
A thread waits at the beginning of a critical section until no other thread in the team is executing a critical section having the same name.
If you specify a critical section name in the CRITICAL directive, you must specify the same name in the END CRITICAL directive.
The following example includes several CRITICAL directives, and illustrates a queuing model in which a task is dequeued and worked on. To guard against multiple threads dequeuing the same task, the dequeuing operation must be in a critical section. Because there are two independent queues in this example, each queue is protected by CRITICAL directives having different names, XAXIS and YAXIS, respectively.
!$OMP PARALLEL DEFAULT(PRIVATE),SHARED(X,Y)
!$OMP CRITICAL(XAXIS)
      CALL DEQUEUE(IX_NEXT, X)
!$OMP END CRITICAL(XAXIS)
      CALL WORK(IX_NEXT, X)
!$OMP CRITICAL(YAXIS)
      CALL DEQUEUE(IY_NEXT,Y)
!$OMP END CRITICAL(YAXIS)
      CALL WORK(IY_NEXT, Y)
!$OMP END PARALLEL
Unnamed critical sections use the global lock from the Pthread package. This allows you to synchronize with other code by using the same lock. Named locks are created and maintained by the compiler and can be significantly more efficient.
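The queuing idea can be reduced to a sketch that compiles on its own, with a shared counter standing in for the queue. Two differently named critical sections protect two independent pieces of shared data. The sketch assumes free source form and an OpenMP-capable compiler; the names critical_demo, process_tasks, queue, and tally are hypothetical.

```fortran
module critical_demo
  implicit none
contains
  ! Hypothetical helper: threads repeatedly claim the next task number
  ! from a shared counter and count how many tasks were processed.
  function process_tasks(ntasks) result(done)
    integer, intent(in) :: ntasks
    integer :: done   ! shared tally of processed tasks
    integer :: next   ! shared number of the next unclaimed task
    integer :: mine   ! private: task claimed by this thread

    next = 1
    done = 0
    !$omp parallel default(shared) private(mine)
    do
      ! Only one thread at a time may take a task number.
      !$omp critical (queue)
      mine = next
      next = next + 1
      !$omp end critical (queue)
      if (mine > ntasks) exit   ! leaves the loop, not the parallel region

      ! The tally is independent of the queue, so it uses another name.
      !$omp critical (tally)
      done = done + 1
      !$omp end critical (tally)
    end do
    !$omp end parallel
  end function process_tasks
end module critical_demo
```

Because each task number is handed out exactly once, the function returns ntasks regardless of how many threads take part.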
6.1.9.4 FLUSH Directive
Use the FLUSH directive to identify a synchronization point at which a consistent view of memory is provided. Thread-visible variables are written back to memory at this point.
To avoid flushing all thread-visible variables at this point, include a list of comma-separated named variables to be flushed.
The following example uses the FLUSH directive for point-to-point synchronization between thread 0 and thread 1 for the variable ISYNC.
!$OMP PARALLEL DEFAULT(PRIVATE),SHARED(ISYNC)
      IAM = OMP_GET_THREAD_NUM()
      ISYNC(IAM) = 0
!$OMP BARRIER
      CALL WORK()
! I Am Done With My Work, Synchronize With My Neighbor
      ISYNC(IAM) = 1
!$OMP FLUSH(ISYNC)
! Wait Till Neighbor Is Done
      DO WHILE (ISYNC(NEIGH) .EQ. 0)
!$OMP FLUSH(ISYNC)
      END DO
!$OMP END PARALLEL
Use the MASTER and END MASTER directives to identify a block of code that is executed only by the master thread.
In the following example, only the master thread executes the routines OUTPUT and INPUT.
!$OMP PARALLEL DEFAULT(SHARED)
      CALL WORK(X)
!$OMP MASTER
      CALL OUTPUT(X)
      CALL INPUT(Y)
!$OMP END MASTER
      CALL WORK(Y)
!$OMP END PARALLEL
Use the ORDERED and END ORDERED directives within a DO construct to allow work within an ordered section to execute sequentially while allowing work outside the section to execute in parallel.
When you use the ORDERED directive, you must also specify the ORDERED clause on the DO directive.
Only one thread at a time is allowed to enter the ordered section, and then only in the order of loop iterations.
In the following example, the code prints out the indexes in sequential order.
!$OMP DO ORDERED,SCHEDULE(DYNAMIC)
      DO I=LB,UB,ST
        CALL WORK(I)
      END DO

      SUBROUTINE WORK(K)
!$OMP ORDERED
      WRITE(*,*) K
!$OMP END ORDERED
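The same ordering guarantee can be checked without relying on printed output. In the following sketch the squaring may run in parallel, while the ordered section appends results strictly in loop order. It assumes free source form and an OpenMP-capable compiler; the names ordered_demo and squares_in_order are hypothetical.

```fortran
module ordered_demo
  implicit none
contains
  ! Hypothetical helper: fills out with 1, 4, 9, ... in loop order.
  subroutine squares_in_order(n, out)
    integer, intent(in)  :: n
    integer, intent(out) :: out(n)
    integer :: i, sq, pos

    pos = 0
    !$omp parallel do ordered schedule(dynamic) private(sq) shared(out, pos)
    do i = 1, n
      sq = i * i          ! outside the ordered section: may run in parallel
      !$omp ordered
      pos = pos + 1       ! inside: one iteration at a time, in loop order
      out(pos) = sq
      !$omp end ordered
    end do
    !$omp end parallel do
  end subroutine squares_in_order
end module ordered_demo
```

The ORDERED clause on the loop directive is required here because the loop body contains an ORDERED directive.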
To enable the use of DIGITAL Fortran parallel compiler directives in your program, you must include the -mp compiler option on your f90 command:
% f90 -mp prog.f -o prog
The format of a DIGITAL Fortran parallel compiler directive is:
prefix directive_name [option[[,] option]...]
All DIGITAL Fortran parallel compiler directives must begin with a directive prefix. Directives are not case-sensitive. Options can appear in any order after the directive name and can be repeated as needed, subject to the restrictions of individual options.
Directives cannot be embedded within continued statements, and statements cannot be embedded within directives. Trailing comments are allowed.
6.2.2.1 Directive Prefixes
The directive prefix you use depends on the source form you use in your program. Use the !$PAR prefix when compiling either fixed source form or free source form programs. Use the C$PAR (or c$PAR) and the *$PAR prefixes only when compiling fixed source form programs.
Fixed Source Form
For fixed source form programs, the prefix is one of the following: !$PAR, C$PAR (or c$PAR), or *$PAR.
For more information about fixed source form prefixes, see Section 6.1.2.1.
Free Source Form
For free source form programs, the prefix is !$PAR. For more information about free source form, see Section 6.1.2.1.
6.2.3 Directive Summary Descriptions
Table 6-3 provides summary descriptions of the DIGITAL Fortran parallel compiler directives. For complete information about the DIGITAL Fortran parallel compiler directives, see Appendix D.
| Directive Format | Description |
|---|---|
| prefix BARRIER | This directive defines a synchronization construct, which, when reached by a thread, blocks further execution by that thread until all threads have reached the barrier. This directive is allowed only within a parallel region, but is not allowed within any worksharing or synchronization construct. |
| prefix CHUNK = chunksize | This directive sets a default chunk size used to divide iterations among the threads of the team. The effect of the specified chunk size depends on the schedule type. This directive is provided for compatibility reasons. |
| prefix COPYIN object[, object]... | This data environment directive specifies that the listed variables, single array elements, and common blocks be copied from the master thread to the PRIVATE data objects having the same name. This directive is allowed only within a parallel region. |
| prefix CRITICAL SECTION [(latch-var)] code prefix END CRITICAL SECTION | These directives define a synchronization construct that specifies a block of code, referred to as a critical section, that is executed by one thread at a time. When a thread enters the critical section, a latch variable is set to closed and all other threads are locked out. When the thread exits the critical section at the END CRITICAL SECTION directive, the latch variable is set to open, allowing another thread access to the critical section. |
| prefix INSTANCE | This data environment construct makes named common blocks available to threads. When you specify SINGLE, all threads share the same instance of the named common blocks. When you specify PARALLEL, the named common blocks are made private to a thread, but global within the thread. |
| prefix MP_SCHEDTYPE = mode | This directive sets a default run-time schedule type. The schedule type does not affect the semantics of the program, but may affect performance. This directive is provided for compatibility reasons. |
| prefix PARALLEL [region-option[[,]region-option]...] code prefix END PARALLEL | These directives define a parallel construct that is a region of a program that must be executed by a team of threads in parallel until the END PARALLEL directive is encountered. Use the worksharing constructs such as PDO, PSECTIONS, and SINGLE PROCESS to divide the work in the parallel region among the threads of the team. The PARALLEL directive takes an optional comma-separated list of options. |
| prefix PARALLEL DO [option[[,]option]...] do_loop [prefix END PARALLEL DO] | These directives define a combined parallel/worksharing construct that provides an abbreviated form of specifying a parallel region that contains a single PDO directive. The PARALLEL DO directive takes an optional comma-separated list of options that can be one or more of the options specified for the PARALLEL and PDO directives. |
| prefix PARALLEL SECTIONS [par-sect-option[[,]par-sect-option]...] code prefix END PARALLEL SECTIONS | These directives define a combined parallel/worksharing construct that provides an abbreviated form of specifying a parallel region that contains a single PSECTIONS directive. The semantics are identical to explicitly specifying the PARALLEL directive immediately followed by a PSECTIONS directive. The PARALLEL SECTIONS directive takes an optional comma-separated list of options that can be one or more of the options specified for the PARALLEL and PSECTIONS directives. |
| prefix PDO [pdo-option[[,]pdo-option]...] do_loop [prefix END PDO [NOWAIT]] | These directives define a worksharing construct that specifies that each set of iterations of the contained do_loop is a unit of work that can be scheduled on a single thread. This directive must be nested within the lexical extent of a PARALLEL directive. The PDO directive takes an optional comma-separated list of options. When the END PDO directive is encountered, an implicit barrier is erected and threads wait at the barrier until all threads have finished. This can be overridden by using the NOWAIT option. If the END PDO directive is not included, an implicit barrier is erected at the last statement in the DO loop. |
| prefix PDONE | This directive specifies that the DO loop in which this PDONE directive is contained should be terminated early. Any iterations already dispatched to threads are executed, but any iterations not already dispatched are not dispatched and not executed. This directive must be nested within the lexical extent of a PDO or PARALLEL directive. When the schedule type is STATIC or INTERLEAVED, this directive has no effect because all loop iterations are dispatched before the DO loop executes. |
| prefix PSECTION[S] [sect-option[[,]sect-option]...] [prefix SECTION] code [prefix SECTION code] prefix END PSECTION[S] [NOWAIT] | These directives define a worksharing construct that specifies one or more sections of independent code that are executed in parallel. Each section must be preceded by the SECTION directive, but the directive is optional for the first section. This directive must be nested within the lexical extent of a PARALLEL directive. The PSECTIONS directive takes an optional comma-separated list of options that specifies which variables are PRIVATE or FIRSTPRIVATE. When the END PSECTIONS directive is encountered, an implicit barrier is erected and threads wait at the barrier until all threads have finished. This can be overridden by using the NOWAIT option. |
| prefix SINGLE PROCESS [proc-option[[,]proc-option]...] code prefix END SINGLE PROCESS [NOWAIT] | These directives define a worksharing construct that specifies a block of code that is executed by only one thread. This directive must be nested within the lexical extent of a PARALLEL directive. The SINGLE PROCESS directive takes an optional comma-separated list of options that specifies which variables are PRIVATE or FIRSTPRIVATE. When the END SINGLE PROCESS directive is encountered, an implicit barrier is erected and threads wait until all threads have finished. This can be overridden by using the NOWAIT option. |
| prefix TASKCOMMON com-blk-name[,com-blk-name]... | This data environment construct makes named common blocks private to a thread, but global within the thread. This directive is semantically equivalent to the INSTANCE PARALLEL directive and differs only in form. You can use the COPYIN option of the PARALLEL directive to copy common block values to the common blocks specified in this directive. |