DIGITAL Fortran 90
User Manual for
DIGITAL UNIX Systems



6.1.7.1 DO and END DO Directives

The DO directive specifies that the iterations of the immediately following DO loop must be dispatched across the existing team of threads so that each iteration is executed by a single thread. The loop that follows a DO directive cannot be a DO WHILE loop or a DO loop that has no loop control.

You cannot use a GOTO statement, or any other statement, to transfer control into or out of the DO construct.

If you specify the optional END DO directive, it must appear immediately after the end of the DO loop. If you do not specify the END DO directive, an END DO directive is assumed at the end of the DO loop.

The loop iteration variable is private by default, so it is not necessary to declare it explicitly.

If you do not specify the optional NOWAIT clause on the END DO directive, threads synchronize at the END DO directive. If you specify NOWAIT, threads do not synchronize, and threads that finish early proceed directly to the instructions following the END DO directive.
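The effect of NOWAIT can be sketched with two independent loops in one parallel region. This is a minimal illustration, not an example from the original manual: the program name NOWAIT_DEMO and the arrays A and B are hypothetical. NOWAIT is safe here only because the second loop never reads the array that the first loop writes.

```fortran
      PROGRAM NOWAIT_DEMO
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 1000
      INTEGER :: I
      REAL :: A(N), B(N)
!$OMP PARALLEL SHARED(A,B) PRIVATE(I)
! The loop below fills A; the next loop touches only B.
!$OMP DO
      DO I = 1, N
         A(I) = REAL(I)
      END DO
!$OMP END DO NOWAIT
! Threads that finish early start here without waiting.
!$OMP DO
      DO I = 1, N
         B(I) = 2.0 * REAL(I)
      END DO
!$OMP END DO
!$OMP END PARALLEL
! Self-check: a nonzero stop code signals a wrong result.
      IF (A(N) .NE. REAL(N)) STOP 1
      IF (B(N) .NE. 2.0 * REAL(N)) STOP 2
      WRITE(*,*) 'A(N) =', A(N), ' B(N) =', B(N)
      END PROGRAM NOWAIT_DEMO
```

If the second loop did read A, the NOWAIT clause would have to be removed so that the threads synchronize before the second loop starts.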

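The DO directive optionally lets you:

  • Control data scope attributes (see Controlling Data Scope Attributes)
  • Specify the schedule type and chunk size by using the SCHEDULE clause (see Specifying Schedule Type and Chunk Size)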

Controlling Data Scope Attributes

For information about controlling data scope attributes, see Section 6.1.5.2.

Specifying Schedule Type and Chunk Size

The SCHEDULE clause specifies a scheduling algorithm that determines how iterations of the DO loop are divided among and dispatched to the threads of the team. The SCHEDULE clause applies only to the current DO or PARALLEL DO directive.

Within the SCHEDULE clause, you must specify a schedule type and optionally, a chunk size. Chunk must be a scalar integer expression.

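The following list describes the schedule types and how the chunk size affects scheduling:

  • STATIC: Iterations are divided into pieces of the size specified by chunk, and the pieces are assigned to threads statically, in round-robin fashion in order of thread number. If you omit chunk, the iteration space is divided into contiguous pieces of approximately equal size, one piece per thread.
  • DYNAMIC: Iterations are divided into pieces of the size specified by chunk. As each thread finishes a piece, it dynamically obtains the next available piece. If you omit chunk, the default is 1.
  • GUIDED: The size of each piece decreases with each succeeding dispatch, and chunk specifies the minimum number of iterations to dispatch each time. If you omit chunk, the default is 1.
  • RUNTIME: The scheduling decision is deferred until run time, when the schedule type and chunk size are taken from the OMP_SCHEDULE environment variable. Do not specify chunk with RUNTIME.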

You can determine the schedule type used for the current DO or PARALLEL DO directive by using the following prioritized list. The available schedule type closest to the top of the list is used:

  1. The schedule type specified in the SCHEDULE clause of the current DO or PARALLEL DO directive
  2. If the schedule type for the current DO or PARALLEL DO directive is RUNTIME, the default value specified in the OMP_SCHEDULE environment variable
  3. The compiler default schedule type of STATIC

You can determine the chunk size used for the current DO or PARALLEL DO directive by using the following prioritized list. The available chunk size closest to the top of the list is used:

  1. The chunk size specified in the SCHEDULE clause of the current DO or PARALLEL DO directive
  2. For RUNTIME schedule type, the value specified in the OMP_SCHEDULE environment variable
  3. For DYNAMIC and GUIDED schedule types, the default is 1
  4. If the schedule type for the current DO or PARALLEL DO directive is STATIC, the loop iteration space is divided by the number of threads in the team
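The two priority lists above can be exercised with a short program. This is a sketch under stated assumptions: the program name SCHED_DEMO and the array COST are hypothetical, and the loop bodies are placeholders for real work. The first loop names its schedule type and chunk size explicitly; the second defers both to the OMP_SCHEDULE environment variable.

```fortran
      PROGRAM SCHED_DEMO
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 64
      INTEGER :: I, TOTAL
      INTEGER :: COST(N)
!$OMP PARALLEL SHARED(COST) PRIVATE(I)
! Explicit schedule: threads claim 4 iterations at a time.
!$OMP DO SCHEDULE(DYNAMIC,4)
      DO I = 1, N
         COST(I) = I * I
      END DO
!$OMP END DO
! Deferred schedule: type and chunk come from OMP_SCHEDULE.
!$OMP DO SCHEDULE(RUNTIME)
      DO I = 1, N
         COST(I) = COST(I) + 1
      END DO
!$OMP END DO
!$OMP END PARALLEL
! The schedule affects performance only, never the result.
      TOTAL = SUM(COST)
      IF (TOTAL .NE. 89504) STOP 1
      WRITE(*,*) 'TOTAL =', TOTAL
      END PROGRAM SCHED_DEMO
```

Before running the program, you could, for example, set the variable from the C shell with setenv OMP_SCHEDULE "GUIDED,8" to select a guided schedule with a minimum chunk size of 8 for the second loop.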

6.1.7.2 SECTIONS, SECTION, and END SECTIONS Directives

Use the noniterative worksharing SECTIONS directive to divide the enclosed sections of code among the team. Each section is executed just one time by one thread.

Precede each section with a SECTION directive. However, the SECTION directive is optional for the first section. The SECTION directive must appear within the lexical extent of the SECTIONS and END SECTIONS directives.

The last section ends at the END SECTIONS directive. When a thread completes its section and there are no undispatched sections, it waits at the END SECTIONS directive unless you specify NOWAIT.

The following example shows how to use the SECTIONS and SECTION directives to execute subroutines XAXIS, YAXIS, and ZAXIS in parallel. The first SECTION directive is optional:


!$OMP PARALLEL 
!$OMP SECTIONS 
!$OMP SECTION 
      CALL XAXIS 
!$OMP SECTION 
      CALL YAXIS 
!$OMP SECTION 
      CALL ZAXIS 
!$OMP END SECTIONS 
!$OMP END PARALLEL 

For information about controlling the data scope attributes, see Section 6.1.5.2.
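A self-contained variant of the same pattern is sketched below. The program name SECT_DEMO and the arrays X, Y, and Z are hypothetical stand-ins for the work done by XAXIS, YAXIS, and ZAXIS. NOWAIT is specified on END SECTIONS because the END PARALLEL directive that follows already synchronizes the team.

```fortran
      PROGRAM SECT_DEMO
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 100
      REAL :: X(N), Y(N), Z(N)
!$OMP PARALLEL SHARED(X,Y,Z)
!$OMP SECTIONS
! Each section is executed once, by one thread of the team.
!$OMP SECTION
      X = 1.0
!$OMP SECTION
      Y = 2.0
!$OMP SECTION
      Z = 3.0
!$OMP END SECTIONS NOWAIT
!$OMP END PARALLEL
! Self-check: every array must hold its section's value.
      IF (ANY(X .NE. 1.0)) STOP 1
      IF (ANY(Y .NE. 2.0)) STOP 2
      IF (ANY(Z .NE. 3.0)) STOP 3
      WRITE(*,*) 'X(1), Y(1), Z(1) =', X(1), Y(1), Z(1)
      END PROGRAM SECT_DEMO
```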

6.1.7.3 SINGLE and END SINGLE Directives

Use the SINGLE directive when you want just one thread of the team to execute the enclosed block of code.

Threads that are not executing the SINGLE directive wait at the END SINGLE directive unless you specify NOWAIT.

In the following example, the first thread that encounters the SINGLE directive executes subroutines OUTPUT and INPUT:


!$OMP PARALLEL DEFAULT(SHARED) 
      CALL WORK(X) 
!$OMP BARRIER 
!$OMP SINGLE 
      CALL OUTPUT(X) 
      CALL INPUT(Y) 
!$OMP END SINGLE 
      CALL WORK(Y) 
!$OMP END PARALLEL 

For information about controlling the data scope attributes, see Section 6.1.5.2.

6.1.8 Combined Parallel/Worksharing Constructs

The combined parallel/worksharing constructs provide an abbreviated way to specify a parallel region that contains a single worksharing construct. The combined parallel/worksharing constructs are:

  • PARALLEL DO (see Section 6.1.8.1)
  • PARALLEL SECTIONS (see Section 6.1.8.2)

6.1.8.1 PARALLEL DO and END PARALLEL DO Directives

Use the PARALLEL DO directive to specify a parallel region that implicitly contains a single DO directive.

You can specify one or more of the clauses for the PARALLEL and the DO directives (see Section 6.1.6 and Section 6.1.7.1).

The following example shows how to parallelize a simple loop. The loop iteration variable is private by default, so it is not necessary to declare it explicitly. The END PARALLEL DO directive is optional.


!$OMP PARALLEL DO 
      DO I=1,N 
        B(I) = (A(I) + A(I-1)) / 2.0 
      END DO 
!$OMP END PARALLEL DO 
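To make the "implicitly contains a single DO directive" wording concrete, the following sketch computes the same result twice: once with the combined PARALLEL DO form and once with an explicit PARALLEL region that contains one DO directive. The program name PARDO_EQUIV and the arrays B1 and B2 are hypothetical, and A is assumed to be declared with a lower bound of 0 so that A(I-1) is valid when I is 1.

```fortran
      PROGRAM PARDO_EQUIV
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 100
      INTEGER :: I
      REAL :: A(0:N), B1(N), B2(N)
      DO I = 0, N
         A(I) = REAL(I)
      END DO
! Combined form.
!$OMP PARALLEL DO SHARED(A,B1)
      DO I = 1, N
         B1(I) = (A(I) + A(I-1)) / 2.0
      END DO
!$OMP END PARALLEL DO
! Expanded form: a parallel region with one DO directive.
!$OMP PARALLEL SHARED(A,B2)
!$OMP DO
      DO I = 1, N
         B2(I) = (A(I) + A(I-1)) / 2.0
      END DO
!$OMP END DO
!$OMP END PARALLEL
! Both forms must produce identical results.
      IF (ANY(B1 .NE. B2)) STOP 1
      WRITE(*,*) 'B1(N) =', B1(N)
      END PROGRAM PARDO_EQUIV
```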

6.1.8.2 PARALLEL SECTIONS and END PARALLEL SECTIONS Directives

Use the PARALLEL SECTIONS directive to specify a parallel region that implicitly contains a single SECTIONS directive.

You can specify one or more of the clauses for the PARALLEL and the SECTIONS directives (see Section 6.1.6 and Section 6.1.7.2).

The last section ends at the END PARALLEL SECTIONS directive.

In the following example, subroutines XAXIS, YAXIS, and ZAXIS can be executed concurrently. The first SECTION directive is optional. Note that all SECTION directives must appear in the lexical extent of the PARALLEL SECTIONS/END PARALLEL SECTIONS construct.


!$OMP PARALLEL SECTIONS 
!$OMP SECTION 
      CALL XAXIS 
!$OMP SECTION 
      CALL YAXIS 
!$OMP SECTION 
      CALL ZAXIS 
!$OMP END PARALLEL SECTIONS 

6.1.9 Synchronization Constructs

Synchronization is the interthread communication that ensures the consistency of shared data and coordinates parallel execution among threads.

Shared data is consistent within a team of threads when all threads obtain the identical value when the data is accessed.

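The synchronization constructs are:

  • ATOMIC directive (see Section 6.1.9.1)
  • BARRIER directive (see Section 6.1.9.2)
  • CRITICAL directive (see Section 6.1.9.3)
  • FLUSH directive (see Section 6.1.9.4)
  • MASTER directive (see Section 6.1.9.5)
  • ORDERED directive (see Section 6.1.9.6)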

6.1.9.1 ATOMIC Directive

Use the ATOMIC directive to ensure that a specific memory location is updated atomically instead of exposing the location to the possibility of multiple, simultaneously writing threads.

This directive applies only to the immediately following statement, which must have one of the following forms:


x = x operator expr 
 
x = expr operator x 
 
x = intrinsic (x, expr) 
 
x = intrinsic (expr, x) 

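In the preceding statements:

  • x is a scalar variable of intrinsic type.
  • expr is a scalar expression that does not reference x.
  • intrinsic is one of MAX, MIN, IAND, IOR, or IEOR.
  • operator is one of +, *, -, /, .AND., .OR., .EQV., or .NEQV.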

This directive permits optimization beyond that of a critical section around the assignment. An implementation can replace all ATOMIC directives by enclosing the statement in a critical section. All of these critical sections must use the same unique name.

Only the load and store of x are atomic; the evaluation of expr is not atomic. To avoid race conditions, all updates of the location in parallel must be protected by using the ATOMIC directive, except those that are known to be free of race conditions. The function intrinsic, the operator operator, and the assignment must be the intrinsic function, operator, and assignment.

The following restriction applies to the ATOMIC directive:

  • All references to storage location x must have the same type and type parameters.

In the following example, the shared location Y is updated atomically, so simultaneous updates by multiple threads cannot be lost.


!$OMP ATOMIC 
    Y = Y + B(I) 
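A complete program using the same statement form is sketched below. It is an illustration only: the program name ATOMIC_DEMO and the array HIST are hypothetical. Several iterations update the same element of HIST, so each update is protected by ATOMIC. An array element is a valid target because it is a scalar location.

```fortran
      PROGRAM ATOMIC_DEMO
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 1000, NBINS = 10
      INTEGER :: I, K
      INTEGER :: HIST(NBINS)
      HIST = 0
!$OMP PARALLEL DO SHARED(HIST) PRIVATE(I,K)
      DO I = 1, N
! K is private, so computing it needs no protection.
         K = MOD(I, NBINS) + 1
! Many iterations map to the same bin; update it atomically.
!$OMP ATOMIC
         HIST(K) = HIST(K) + 1
      END DO
!$OMP END PARALLEL DO
! Self-check: every bin must hold exactly N/NBINS counts.
      IF (ANY(HIST .NE. N / NBINS)) STOP 1
      WRITE(*,*) 'HIST =', HIST
      END PROGRAM ATOMIC_DEMO
```

Without the ATOMIC directive, two threads could read the same old value of HIST(K) and one increment would be lost.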

6.1.9.2 BARRIER Directive

To synchronize all threads within a parallel region, use the BARRIER directive. You can use this directive only within a parallel region defined by using the PARALLEL directive. You cannot use the BARRIER directive within the DO, PARALLEL DO, SECTIONS, PARALLEL SECTIONS, or SINGLE directives.

When encountered, each thread waits at the BARRIER directive until all threads have reached the directive.

In the following example, the BARRIER directive ensures that all threads have executed the first loop and that it is safe to execute the second loop:


c$OMP PARALLEL 
c$OMP DO PRIVATE(i) 
        DO i = 1, 100 
           b(i) = i 
        END DO 
c$OMP BARRIER 
c$OMP DO PRIVATE(i) 
        DO i = 1, 100 
           a(i) = b(101-i) 
        END DO 
c$OMP END PARALLEL 

6.1.9.3 CRITICAL Directive

Use the CRITICAL and END CRITICAL directives to restrict access to a block of code to one thread at a time.

A thread waits at the beginning of a critical section until no other thread in the team is executing a critical section having the same name.

If you specify a critical section name in the CRITICAL directive, you must specify the same name in the END CRITICAL directive.

The following example includes several CRITICAL directives, and illustrates a queuing model in which a task is dequeued and worked on. To guard against multiple threads dequeuing the same task, the dequeuing operation must be in a critical section. Because there are two independent queues in this example, each queue is protected by CRITICAL directives having different names, XAXIS and YAXIS, respectively.


!$OMP PARALLEL DEFAULT(PRIVATE),SHARED(X,Y) 
!$OMP CRITICAL(XAXIS) 
      CALL DEQUEUE(IX_NEXT, X) 
!$OMP END CRITICAL(XAXIS) 
      CALL WORK(IX_NEXT, X) 
!$OMP CRITICAL(YAXIS) 
      CALL DEQUEUE(IY_NEXT,Y) 
!$OMP END CRITICAL(YAXIS) 
      CALL WORK(IY_NEXT, Y) 
!$OMP END PARALLEL 

Unnamed critical sections use the global lock from the Pthread package. This allows you to synchronize with other code by using the same lock. Named locks are created and maintained by the compiler and can be significantly more efficient.
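The use of differently named critical sections for independent data can also be sketched as a small runnable program. The names CRIT_DEMO, EVENLOCK, ODDLOCK, NEVEN, and NODD are hypothetical. Because the two counters are independent, a thread updating one never has to wait for a thread updating the other.

```fortran
      PROGRAM CRIT_DEMO
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 1000
      INTEGER :: I, NEVEN, NODD
      NEVEN = 0
      NODD = 0
!$OMP PARALLEL DO SHARED(NEVEN,NODD) PRIVATE(I)
      DO I = 1, N
         IF (MOD(I, 2) .EQ. 0) THEN
! Only threads in EVENLOCK sections exclude one another.
!$OMP CRITICAL(EVENLOCK)
            NEVEN = NEVEN + 1
!$OMP END CRITICAL(EVENLOCK)
         ELSE
! ODDLOCK is independent of EVENLOCK.
!$OMP CRITICAL(ODDLOCK)
            NODD = NODD + 1
!$OMP END CRITICAL(ODDLOCK)
         END IF
      END DO
!$OMP END PARALLEL DO
! Self-check: half of the iterations are even, half odd.
      IF (NEVEN .NE. N / 2) STOP 1
      IF (NODD .NE. N / 2) STOP 2
      WRITE(*,*) 'NEVEN =', NEVEN, ' NODD =', NODD
      END PROGRAM CRIT_DEMO
```

For a single-statement update such as these, the ATOMIC directive would also work; CRITICAL is shown here because it extends to blocks of several statements.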

6.1.9.4 FLUSH Directive

Use the FLUSH directive to identify a synchronization point at which a consistent view of memory is provided. Thread-visible variables are written back to memory at this point.

To avoid flushing all thread-visible variables at this point, include a list of comma-separated named variables to be flushed.

The following example uses the FLUSH directive for point-to-point synchronization between thread 0 and thread 1 for the variable ISYNC.


!$OMP PARALLEL DEFAULT(PRIVATE),SHARED(ISYNC) 
      IAM = OMP_GET_THREAD_NUM() 
      ISYNC(IAM) = 0 
!$OMP BARRIER 
      CALL WORK() 
! I Am Done With My Work, Synchronize With My Neighbor 
      ISYNC(IAM) = 1 
!$OMP FLUSH(ISYNC) 
! Wait Till Neighbor Is Done 
      DO WHILE (ISYNC(NEIGH) .EQ. 0) 
!$OMP FLUSH(ISYNC) 
      END DO 
!$OMP END PARALLEL 

6.1.9.5 MASTER Directive

Use the MASTER and END MASTER directives to identify a block of code that is executed only by the master thread.

In the following example, only the master thread executes the routines OUTPUT and INPUT.


!$OMP PARALLEL DEFAULT(SHARED) 
      CALL WORK(X) 
!$OMP MASTER 
      CALL OUTPUT(X) 
      CALL INPUT(Y) 
!$OMP END MASTER 
      CALL WORK(Y) 
!$OMP END PARALLEL 

6.1.9.6 ORDERED Directive

Use the ORDERED and END ORDERED directives within a DO construct to allow work within an ordered section to execute sequentially while allowing work outside the section to execute in parallel.

When you use the ORDERED directive, you must also specify the ORDERED clause on the DO directive.

Only one thread at a time is allowed to enter the ordered section, and then only in the order of loop iterations.

In the following example, the code prints out the indexes in sequential order.


!$OMP DO ORDERED,SCHEDULE(DYNAMIC) 
      DO I=LB,UB,ST 
         CALL WORK(I) 
      END DO 
!$OMP END DO

! WORK is a separate subroutine called from inside the loop.
      SUBROUTINE WORK(K) 
!$OMP ORDERED 
      WRITE(*,*) K 
!$OMP END ORDERED 
      END SUBROUTINE WORK
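The same idea is sketched below as one complete program. The names ORDERED_DEMO, SQ, SEQ, and NEXT are hypothetical. The assignment to SQ can run in any order across threads, while the ordered section records iteration numbers strictly in loop order.

```fortran
      PROGRAM ORDERED_DEMO
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 8
      INTEGER :: I, NEXT
      INTEGER :: SQ(N), SEQ(N)
      NEXT = 0
!$OMP PARALLEL DO ORDERED,SCHEDULE(DYNAMIC)
      DO I = 1, N
! Outside the ordered section: runs in parallel, any order.
         SQ(I) = I * I
! Inside: one thread at a time, in iteration order.
!$OMP ORDERED
         NEXT = NEXT + 1
         SEQ(NEXT) = I
!$OMP END ORDERED
      END DO
!$OMP END PARALLEL DO
! Self-check: iterations must have been recorded as 1..N.
      DO I = 1, N
         IF (SEQ(I) .NE. I) STOP 1
      END DO
      WRITE(*,*) 'SEQ =', SEQ
      END PROGRAM ORDERED_DEMO
```

NEXT is shared and is modified only inside the ordered section, which is why no further synchronization is needed for it.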

6.2 DIGITAL Fortran Parallel Compiler Directives

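The topics described include:

  • Compiler command line option (see Section 6.2.1)
  • Format for DIGITAL Fortran parallel directives (see Section 6.2.2)
  • Directive summary descriptions (see Section 6.2.3)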

6.2.1 Compiler Command Line Option

To enable the use of DIGITAL Fortran parallel compiler directives in your program, you must include the -mp compiler option on your f90 command:


% f90 -mp prog.f -o prog

6.2.2 Format for DIGITAL Fortran Parallel Directives

The format of a DIGITAL Fortran parallel compiler directive is:


prefix directive_name [option[[,] option]...] 

All DIGITAL Fortran parallel compiler directives must begin with a directive prefix. Directives are not case-sensitive. Options can appear in any order after the directive name and can be repeated as needed, subject to the restrictions of individual options.

Directives cannot be embedded within continued statements, and statements cannot be embedded within directives. Trailing comments are allowed.

6.2.2.1 Directive Prefixes

The directive prefix you use depends on the source form you use in your program. Use the !$PAR prefix when compiling either fixed source form or free source form programs. Use the C$PAR (or c$PAR) and the *$PAR prefixes only when compiling fixed source form programs.

Fixed Source Form

For fixed source form programs, the prefix is one of the following:

  • !$PAR
  • C$PAR (or c$PAR)
  • *$PAR

For more information about fixed source form prefixes, see Section 6.1.2.1.

Free Source Form

For free source form programs, the prefix is !$PAR. For more information about free source form, see Section 6.1.2.1.

6.2.3 Directive Summary Descriptions

Table 6-3 provides summary descriptions of the DIGITAL Fortran parallel compiler directives. For complete information about the DIGITAL Fortran parallel compiler directives, see Appendix D.

Table 6-3 DIGITAL Fortran Parallel Compiler Directives
Directive Format
  Description
prefix BARRIER
  This directive defines a synchronization construct, which, when reached by a thread, blocks further execution by that thread until all threads have reached the barrier. This directive is allowed only within a parallel region, but is not allowed within any worksharing or synchronization construct.
prefix CHUNK¹ = chunksize
  This directive sets a default chunk size used to divide iterations among the threads of the team. The effect of the specified chunk size depends on the schedule type. This directive is provided for compatibility reasons.
prefix COPYIN¹ object[, object]...
  This data environment directive specifies that the listed variables, single array elements, and common blocks be copied from the master thread to the PRIVATE data objects having the same name.

This directive is allowed only within a parallel region.

prefix CRITICAL SECTION [(latch-var)]

code

prefix END CRITICAL SECTION
  These directives define a synchronization construct that specifies a block of code, referred to as a critical section, that is executed by one thread at a time. When a thread enters the critical section, a latch variable is set to closed and all other threads are locked out. When the thread exits the critical section at the END CRITICAL SECTION directive, the latch variable is set to open, allowing another thread access to the critical section.
prefix INSTANCE SINGLE /com-blk-name/[[,]/com-blk-name/]...
prefix INSTANCE PARALLEL /com-blk-name/[[,]/com-blk-name/]...
  This data environment construct makes named common blocks available to threads.

When you specify SINGLE, all threads share the same instance of the named common blocks.

When you specify PARALLEL, the named common blocks are made private to a thread, but global within the thread.

prefix MP_SCHEDTYPE¹ = mode
  This directive sets a default run-time schedule type. The schedule type does not affect the semantics of the program, but may affect performance. This directive is provided for compatibility reasons.
prefix PARALLEL [region-option[[,]region-option]...]

code

prefix END PARALLEL
  These directives define a parallel construct that is a region of a program that must be executed by a team of threads in parallel until the END PARALLEL directive is encountered. Use the worksharing constructs such as PDO, PSECTIONS, and SINGLE PROCESS to divide the work in the parallel region among the threads of the team.

The PARALLEL directive takes an optional comma-separated list of options that specifies:

  • Whether the statements in the parallel region are executed in parallel by a team of threads or serially by a single thread (IF option)
  • Whether variables are PRIVATE, FIRSTPRIVATE, or SHARED
  • Whether variables have a DEFAULT data scope attribute
  • Whether master thread common block values are copied to TASKCOMMON threads (COPYIN option)
prefix PARALLEL DO [par-do-option[[,]par-do-option]...]
prefix DOACROSS¹ [par-do-option[[,]par-do-option]...]

do_loop

[prefix END PARALLEL DO]
  These directives define a combined parallel/worksharing construct that provides an abbreviated way to specify a parallel region that contains a single PDO directive.

The PARALLEL DO directive takes an optional comma-separated list of options that can be one or more of the options specified for the PARALLEL and PDO directives.

prefix PARALLEL SECTIONS [par-sect-option[[,]par-sect-option]...]

code

prefix END PARALLEL SECTIONS
  These directives define a combined parallel/worksharing construct that provides an abbreviated way to specify a parallel region that contains a single PSECTIONS directive. The semantics are identical to explicitly specifying the PARALLEL directive immediately followed by a PSECTIONS directive.

The PARALLEL SECTIONS directive takes an optional comma-separated list of options that can be one or more of the options specified for the PARALLEL and PSECTIONS directives.

prefix PDO [pdo-option[[,]pdo-option]...]

do_loop

[prefix END PDO [NOWAIT]]
  These directives define a worksharing construct that specifies that each set of iterations of the contained do_loop is a unit of work that can be scheduled on a single thread.

This directive must be nested within the lexical extent of a PARALLEL directive.

The PDO directive takes an optional comma-separated list of options that specifies:

  • Whether variables are PRIVATE, FIRSTPRIVATE, LASTLOCAL, or REDUCTION
  • How iterations are scheduled onto threads and whether this is deferred until run time (MP_SCHEDTYPE option)
  • How many iterations each thread is assigned (CHUNK or BLOCKED option)
  • Whether iterations are in an ordered sequence (ORDERED option)

When the END PDO directive is encountered, an implicit barrier is erected and threads wait at the barrier until all threads have finished. This can be overridden by using the NOWAIT option.

If the END PDO directive is not included, an implicit barrier is erected at the last statement in the DO loop.

prefix PDONE
  This directive specifies that the DO loop in which this PDONE directive is contained should be terminated early. Any iterations already dispatched to threads are executed, but any iterations not already dispatched are not dispatched and not executed.

This directive must be nested within the lexical extent of a PDO or PARALLEL directive.

When the schedule type is STATIC or INTERLEAVED, this directive has no effect because all loop iterations are dispatched before the DO loop executes.

prefix PSECTION[S] [sect-option[[,]sect-option]...]

[prefix SECTION]


code

[prefix SECTION

code ]

prefix END PSECTION[S] [NOWAIT]
  These directives define a worksharing construct that specifies one or more sections of independent code that are executed in parallel. Each section must be preceded by a SECTION directive, although the directive is optional for the first section.

This directive must be nested within the lexical extent of a PARALLEL directive.

The PSECTIONS directive takes an optional comma-separated list of options that specifies which variables are PRIVATE or FIRSTPRIVATE.

When the END PSECTIONS directive is encountered, an implicit barrier is erected and threads wait at the barrier until all threads have finished. This can be overridden by using the NOWAIT option.

prefix SINGLE PROCESS [proc-option[[,]proc-option] ...]

code

prefix END SINGLE PROCESS [NOWAIT]
  These directives define a worksharing construct that specifies a block of code that is executed by only one thread.

This directive must be nested within the lexical extent of a PARALLEL directive.

The SINGLE PROCESS directive takes an optional comma-separated list of options that specifies which variables are PRIVATE or FIRSTPRIVATE.

When the END SINGLE PROCESS directive is encountered, an implicit barrier is erected and threads wait until all threads have finished. This can be overridden by using the NOWAIT option.

   
prefix TASKCOMMON com-blk-name[,com-blk-name]...
  This data environment construct makes named common blocks private to a thread, but global within the thread.

This directive is semantically equivalent to the INSTANCE PARALLEL directive and differs only in form.

You can use the COPYIN option of the PARALLEL directive to copy common block values to the common blocks specified in this directive.


¹ This format is for fixed source form only.

