The THREADPRIVATE directive makes named common blocks private to a thread but global within the thread.
The THREADPRIVATE directive takes the following form:
c
Is one of the following: C (or c), !, or * (see Chapter 6).
Rules and Restrictions for the THREADPRIVATE Directive
cb is the name of the common block you want made private to a thread. Each thread gets its own copy of the common block, so data written to the common block by one thread is not directly visible to other threads.
During serial portions and MASTER sections of the program, accesses are to the master thread copy of the common block. On entry to the first parallel region, data in the THREADPRIVATE common blocks should be assumed to be undefined unless a COPYIN clause is specified on the PARALLEL directive.
When a common block that is initialized using DATA statements appears in a THREADPRIVATE directive, each thread copy is initialized once prior to its first use. For subsequent parallel regions, data in THREADPRIVATE common blocks is guaranteed to persist only if the dynamic threads mechanism has been disabled and the number of threads is the same for all the parallel regions.
The following restrictions apply to the THREADPRIVATE directive:
Examples
In the following example, the common blocks BLK1 and FIELDS are specified as thread private:
      COMMON /BLK1/ SCRATCH
      COMMON /FIELDS/ XFIELD, YFIELD, ZFIELD
c$OMP THREADPRIVATE(/BLK1/,/FIELDS/)
c$OMP PARALLEL DEFAULT(PRIVATE) COPYIN(/BLK1/,ZFIELD)
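The per-thread copy and COPYIN behavior described above can be sketched as a small free-form program. This is only an illustration under stated assumptions: the names tp_sketch, master_copy_after_region, and tp_demo and the values used are hypothetical, and the sketch assumes a compiler that accepts OpenMP Fortran directives in the `!$omp` form. If the directives are ignored, the code runs serially and gives the same result.

```fortran
module tp_sketch
  implicit none
contains
  ! Returns the master thread's copy of SCRATCH after one parallel region.
  integer function master_copy_after_region() result(val)
    integer :: scratch
    common /blk1/ scratch
    !$omp threadprivate(/blk1/)

    scratch = 10                 ! serial code writes the master thread copy
    !$omp parallel copyin(/blk1/)
    scratch = scratch + 1        ! each thread updates only its own copy
    !$omp end parallel
    val = scratch                ! serial code reads the master thread copy again
  end function master_copy_after_region
end module tp_sketch

program tp_demo
  use tp_sketch
  implicit none
  ! The master copy was incremented exactly once, by the master thread.
  print *, 'master copy after region:', master_copy_after_region()
end program tp_demo
```

Because every thread increments its own copy, the value seen in the serial code afterward does not depend on how many threads ran the region.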
The set of DIGITAL Fortran Parallel Compiler Directives allows you to specify the actions to be taken by the compiler and run-time system to execute a DIGITAL Fortran program in parallel.
For information about the directive format, refer to Chapter 6.
D.2.1 BARRIER Directive
The BARRIER directive is the same as the OpenMP Fortran API BARRIER
directive (see Section D.1.2).
D.2.2 CHUNK Directive
The CHUNK directive sets a default chunksize to adjust the number of iterations assigned to a thread. The effect of CHUNK varies, depending on the scheduling type.
A CHUNK directive takes the following form:
c
Is one of the following: C (or c), !, or * (see Chapter 6).
chunksize
Is a scalar integer expression.
The effect of chunksize varies by scheduling type, as follows:
Defaults
The chunksize used for any parallel DO loop can be determined from the following prioritized list. The available value closest to the top of the list is used:
D.2.3 COPYIN Directive
The COPYIN directive specifies that the values of the listed data objects be copied from the master thread to the PRIVATE data objects of the same name in slave threads.
A COPYIN directive takes the following form:
c
Is one of the following: C (or c), !, or * (see Chapter 6).
object
Is one of the following:
- variable-name
- single-array-element
- /common-block-name/
Rules and Restrictions for the COPYIN Directive
Single array elements can be copied, but array sections cannot.
SHARED variables cannot be copied.
When an ALLOCATABLE array is to be copied, it must be allocated when the COPYIN directive is encountered.
COPYIN directives are permitted only within one of the following constructs:
Example
C$PAR COPYIN A,B, /X/, C(I)
This directive specifies that a and b, the entire contents of common block X, and the ith element of c be copied from the master thread to the PRIVATE data objects of the same name.
D.2.4 CRITICAL SECTION Directive Construct
The CRITICAL SECTION directive restricts access to the enclosed code to only one thread at a time.
A CRITICAL SECTION directive takes the following form:
c
Is one of the following: C (or c), !, or * (see Chapter 6).
latch-var
Is a naturally aligned INTEGER(4) or INTEGER(8) SHARED variable.
Using an explicit latch variable allows the program to control whether multiple critical sections have unique latches or use the same latch. When the CRITICAL SECTION directive is used without an explicit latch variable, the compiler supplies a unique latch for each critical section.
Rules and Restrictions for CRITICAL SECTION and END CRITICAL SECTION Directives
Critical sections can appear anywhere an executable DIGITAL Fortran statement can appear.
If a latch variable name is specified, the program must explicitly initialize the latch variable to 0 (zero) before any CRITICAL SECTION using that latch variable is executed.
A thread waits at the beginning of a critical section until no other thread in the team is executing a critical section having the same latch variable name. When the thread that is executing the critical section reaches the END CRITICAL SECTION directive, the latch variable is set to 0 (zero), allowing another thread to enter the critical section.
The program must not reuse that latch variable in anything other than a CRITICAL SECTION until all uses as a latch variable are complete.
All unnamed critical sections map to the same latch variable name
supplied by the compiler. Critical section latch variable names are
global to the program.
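The mutual exclusion described above can be sketched with the OpenMP Fortran API CRITICAL directive, using a name where the DIGITAL form would use a latch variable. That substitution is an assumption made so the sketch is accepted by current OpenMP compilers. The names crit_sketch, count_with_critical, total_latch, and crit_demo are hypothetical.

```fortran
module crit_sketch
  implicit none
contains
  ! Increments a shared counter n times; the critical section keeps
  ! two threads from updating the counter at the same time.
  integer function count_with_critical(n) result(total)
    integer, intent(in) :: n
    integer :: i

    total = 0
    !$omp parallel do shared(total)
    do i = 1, n
      !$omp critical (total_latch)
      total = total + 1          ! only one thread at a time executes this
      !$omp end critical (total_latch)
    end do
    !$omp end parallel do
  end function count_with_critical
end module crit_sketch

program crit_demo
  use crit_sketch
  implicit none
  ! With the critical section in place no update is lost.
  print *, 'total =', count_with_critical(1000)
end program crit_demo
```

Two critical sections that use the same name exclude each other, while sections with different names can run concurrently. This mirrors the choice between sharing a latch and using unique latches.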
D.2.5 INSTANCE Directive Construct
The INSTANCE directive specifies the availability of named common blocks.
The INSTANCE directive takes the following form:
c
Is one of the following: C (or c), !, or * (see Chapter 6).
- SINGLE
Specifies that there will be a single instance of the named common blocks. This implies that all threads share the same copy, and assignments to the constituent items in the common blocks occurring in one thread affect the values of those items in the same named common blocks in other threads. INSTANCE SINGLE is the default for named common blocks.
- PARALLEL
The INSTANCE PARALLEL directive is the same as the OpenMP Fortran API THREADPRIVATE directive (see Section D.1.15).
- /com-blk-name/
The common block name delimited by slashes (/name/).
D.2.6 MP_SCHEDTYPE Directive
The MP_SCHEDTYPE directive sets a default run-time scheduling type. The scheduling type does not affect the semantics of the program, but may affect performance.
An MP_SCHEDTYPE directive takes the following form:
c
Is one of the following: C (or c), !, or * (see Chapter 6).
mode
Is one of the following:
- DYNAMIC
- GSS
- GUIDED
- INTERLEAVE
- INTERLEAVED
- RUNTIME
- SIMPLE
- STATIC
- DYNAMIC
When a thread becomes available for more work, it is assigned the next chunksize of the remaining iterations. This is sometimes described as threads competing for iterations. If less than one chunksize of iterations remains, the next available thread is assigned all the remaining iterations.
- GSS
GSS is an alternative spelling of GUIDED.
- GUIDED
Similar to DYNAMIC, except that the number of iterations assigned is relatively large at the beginning of the loop, and decreases exponentially as threads become available for more work. The number of iterations assigned is not necessarily divisible by chunksize.
For this scheduling type, chunksize is the minimum number of iterations that can be assigned when a thread becomes available for work. When the number of iterations remaining to be assigned is less than or equal to chunksize, all the remaining iterations are assigned to the next available thread.
In some cases, setting a chunksize greater than one improves execution efficiency as the loop nears termination, by reducing contention among the threads for the small number of remaining iterations.
- INTERLEAVE
INTERLEAVE is an alternative spelling of INTERLEAVED.
- INTERLEAVED
Chunks of iterations are assigned to threads in a round-robin fashion.
- RUNTIME
Environment variables are used to manage scheduling.
Environment variable names are case-sensitive, but their values are not case-sensitive. Environment variables used are:
- MP_CHUNK: An environment variable that specifies a chunksize, where chunksize is an integer constant.
- MP_SCHEDTYPE: An environment variable that specifies one of the following: DYNAMIC, GSS, GUIDED, INTERLEAVE, INTERLEAVED, SIMPLE, or STATIC.
- SIMPLE
SIMPLE is an alternative spelling of STATIC.
- STATIC
Assigns each slave thread one contiguous group of iterations. Each thread is assigned an approximately equal number of iterations.
STATIC is the default scheduling type when no other method has been specified.
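The scheduling types described above can be sketched as simple mapping functions. The exact formulas used by the run-time system are not given here, so the ones below are assumptions chosen only to match the descriptions: equal contiguous blocks for STATIC, round-robin chunks for INTERLEAVED, and a GUIDED grab of the remaining iterations divided by the thread count, never smaller than chunksize. The names sched_sketch, static_owner, interleaved_owner, guided_next, and sched_demo are hypothetical. Threads are numbered from 0.

```fortran
module sched_sketch
  implicit none
contains
  ! STATIC (assumed mapping): one contiguous block of iterations per thread.
  pure integer function static_owner(i, n, nthreads) result(t)
    integer, intent(in) :: i, n, nthreads
    integer :: blk
    blk = (n + nthreads - 1) / nthreads     ! ceiling(n / nthreads)
    t = (i - 1) / blk
  end function static_owner

  ! INTERLEAVED (assumed mapping): chunks are dealt out round-robin.
  pure integer function interleaved_owner(i, chunk, nthreads) result(t)
    integer, intent(in) :: i, chunk, nthreads
    t = mod((i - 1) / chunk, nthreads)
  end function interleaved_owner

  ! GUIDED (assumed rule): size of the next grab, given what is left.
  pure integer function guided_next(remaining, chunk, nthreads) result(k)
    integer, intent(in) :: remaining, chunk, nthreads
    k = max(chunk, (remaining + nthreads - 1) / nthreads)
    k = min(k, remaining)                   ! the last grab takes what is left
  end function guided_next
end module sched_sketch

program sched_demo
  use sched_sketch
  implicit none
  integer :: i, remaining, k

  ! Owner of each of 12 iterations on 4 threads.
  print '(a,12i2)', 'STATIC      :', (static_owner(i, 12, 4), i = 1, 12)
  print '(a,12i2)', 'INTERLEAVED :', (interleaved_owner(i, 2, 4), i = 1, 12)

  ! Successive GUIDED grabs for 100 iterations, chunksize 4, 4 threads.
  remaining = 100
  do while (remaining > 0)
    k = guided_next(remaining, 4, 4)
    print '(a,i4)', 'GUIDED grab :', k
    remaining = remaining - k
  end do
end program sched_demo
```

Under these assumptions the GUIDED grabs start large and shrink until they reach chunksize, which is the behavior the GUIDED description calls for.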
Rules and Restrictions for the MP_SCHEDTYPE Directive
The MP_SCHEDTYPE directive can appear anywhere in a DIGITAL Fortran program. When more than one MP_SCHEDTYPE directive appears in the same program, the most recently encountered directive is used.
Defaults
The scheduling type used for any parallel DO loop can be determined from the following prioritized list. The available value closest to the top of the list is used:
The DYNAMIC and GUIDED scheduling types introduce a certain amount of overhead to manage the continuing assignment of iterations to threads during the execution of the loop. However, this overhead is sometimes offset by better load balancing when the average execution time of iterations is not uniform throughout the DO loop.
The STATIC and INTERLEAVED types assign all of the iterations to the threads in advance, with each thread receiving approximately equal numbers of iterations. One of these is usually the most efficient scheduling type when the average execution time of iterations is uniform throughout the DO loop.
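The load-balancing trade-off described above can be illustrated with an idealized model. The sketch below assumes that iteration i has a known cost, that DYNAMIC scheduling with a chunksize of 1 hands each iteration to the least-loaded thread, and that scheduling overhead is zero. None of these assumptions comes from the run-time system itself. The names balance_sketch, static_makespan, dynamic_makespan, and balance_demo are hypothetical.

```fortran
module balance_sketch
  implicit none
contains
  ! Largest per-thread load when contiguous blocks are fixed in advance.
  pure integer function static_makespan(cost, nthreads) result(worst)
    integer, intent(in) :: cost(:), nthreads
    integer :: blk, t, lo, hi
    blk = (size(cost) + nthreads - 1) / nthreads
    worst = 0
    do t = 0, nthreads - 1
      lo = t * blk + 1
      hi = min((t + 1) * blk, size(cost))
      if (lo <= hi) worst = max(worst, sum(cost(lo:hi)))
    end do
  end function static_makespan

  ! Largest per-thread load when each iteration goes to the thread
  ! that has the least work so far (idealized DYNAMIC, chunksize 1).
  pure integer function dynamic_makespan(cost, nthreads) result(worst)
    integer, intent(in) :: cost(:), nthreads
    integer :: load(nthreads), i, t
    load = 0
    do i = 1, size(cost)
      t = minloc(load, dim=1)
      load(t) = load(t) + cost(i)
    end do
    worst = maxval(load)
  end function dynamic_makespan
end module balance_sketch

program balance_demo
  use balance_sketch
  implicit none
  integer :: i
  integer :: cost(16)

  cost = [(i, i = 1, 16)]        ! iteration i costs i units: not uniform
  print *, 'STATIC  worst thread load:', static_makespan(cost, 4)
  print *, 'DYNAMIC worst thread load:', dynamic_makespan(cost, 4)
end program balance_demo
```

With these rising costs the contiguous STATIC blocks leave the last thread with the most expensive iterations, while the idealized DYNAMIC assignment spreads them out. With uniform costs the two results coincide and the DYNAMIC overhead, which this model ignores, buys nothing.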
For details on default chunksize and the effect of specifying
chunksizes for the same program in more than one context, see
Section D.2.2.
D.2.7 PARALLEL Directive Construct
The PARALLEL directive construct is the same as the OpenMP Fortran API PARALLEL directive construct (see Section D.1.10) with the following exceptions:
D.2.8 PARALLEL DO Directive
The PARALLEL DO directive is the same as the OpenMP Fortran API PARALLEL DO directive (see Section D.1.11) with the following exception:
D.2.9 PARALLEL SECTIONS Directive
The PARALLEL SECTIONS directive is the same as the OpenMP Fortran API PARALLEL SECTIONS directive (see Section D.1.12).
D.2.10 PDO Directive Construct
The PDO directive specifies that the iterations of the immediately following DO loop must be executed in parallel. The loop that follows a PDO directive cannot be a DO WHILE or a DO loop without loop control. The iterations of the DO loop are distributed across the already existing threads.
A PDO directive takes the following form:
c
Is one of the following: C (or c), !, or * (see Chapter 6).
pdo-option
Is one of the following:
- BLOCKED (chunksize)
- CHUNK (chunksize)
- FIRSTPRIVATE (var[[,] var]...)
- LASTLOCAL (var[[,] var]...)
- LAST LOCAL (var[[,] var]...)
- LOCAL (var[[,] var]...)
- [MP_SCHEDTYPE =] mode
- (ORDERED)
- PRIVATE (var[[,] var]...)
- REDUCTION (var[[,] var]...)
- BLOCKED (chunksize)
BLOCKED is an alternate spelling of CHUNK.
- CHUNK (chunksize)
Adjusts the number of consecutive iterations assigned to a thread. At the end of the PDO construct, chunksize reverts to the default. The effect of CHUNK varies, depending on the scheduling type. For details, see Section D.2.2.
A chunksize specified in a PDO directive supersedes any chunksize set with a CHUNK directive earlier in the program, and applies only for the duration of the PDO construct.
For details on default chunksize and on the effect of specifying chunksizes for the same program in more than one context, see Section D.2.2.
- chunksize
Is a scalar integer expression.
- FIRSTPRIVATE (var[[,] var]...)
The FIRSTPRIVATE option is the same as the OpenMP Fortran API FIRSTPRIVATE clause (see Section D.1.6).
- LASTLOCAL (var[[,] var]...)
The LASTLOCAL option is the same as the OpenMP Fortran API LASTPRIVATE clause (see Section D.1.6).
- LAST LOCAL (var[[,] var]...)
LAST LOCAL is an alternative spelling for LASTLOCAL, even in free form.
- LOCAL (var[[,] var]...)
LOCAL is an alternative spelling for PRIVATE.
- [MP_SCHEDTYPE=] mode
Controls the scheduling type and allocation of work for the PDO construct.
mode
Is one of the following:
- DYNAMIC
- GSS
- GUIDED
- INTERLEAVE
- INTERLEAVED
- RUNTIME
- SIMPLE
- STATIC
For a description of these scheduling types, see Section D.2.6.
At the end of the PDO construct, the scheduling type reverts to the default (see Section D.2.6). The scheduling type does not affect the correctness of the program, but may affect performance.
- (ORDERED)
Specifies that the iterations are assigned to threads in the same iteration order as generated by a normal DIGITAL Fortran DO statement.
- PRIVATE (var[[,] var]...)
The PRIVATE option is the same as the OpenMP Fortran API PRIVATE clause (see Section D.1.6).
- REDUCTION (var[[,] var]...)
The REDUCTION option in the DIGITAL Fortran parallel compiler directive set is different from the REDUCTION clause in the OpenMP Fortran API directive set. In the OpenMP Fortran API directive set, both a variable and an operator type are given. In the DIGITAL Fortran parallel compiler directive set, the operator is not given in the directive. The compiler must be able to determine the reduction operation from the source code. The REDUCTION option can be applied to a variable in a DO loop only if the variable meets the following criteria:
- must be scalar
- must be assigned to exactly once in the DO loop
- must be read from exactly once in the DO loop and only in the right side of the assignment
- the assignment must be one of the following forms:
    x = x operator expr
    x = expr operator x (except for subtraction)
    x = operator(x, expr)
    x = operator(expr, x)
where operator is one of the following supported reduction operations: +, -, *, .AND., .OR., .EQV., .NEQV., MAX, MIN, IAND, or IOR.
The compiler rewrites the reduction operation by computing partial results into local variables and then combining the results into the reduction variable. The reduction variable must be SHARED in the enclosing context.
- do_loop
Is a DIGITAL Fortran DO construct with loop control.
- NOWAIT
The NOWAIT clause is the same as the OpenMP Fortran API NOWAIT clause (see Section D.1.6).
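The rewrite described under the REDUCTION option, partial results in local variables followed by a combining step, can be sketched in serial code. This is a hand-written model for a sum reduction, not the compiler's actual output. The contiguous split of iterations and the names reduction_sketch, blocked_sum, and reduction_demo are assumptions made for the example.

```fortran
module reduction_sketch
  implicit none
contains
  ! Models a sum reduction over a(:) as it might run on nthreads threads.
  pure integer function blocked_sum(a, nthreads) result(x)
    integer, intent(in) :: a(:), nthreads
    integer :: partial(0:nthreads-1)   ! one local accumulator per thread
    integer :: blk, t, i

    blk = (size(a) + nthreads - 1) / nthreads
    partial = 0

    ! Step 1: each "thread" t accumulates its own block of iterations.
    do t = 0, nthreads - 1
      do i = t * blk + 1, min((t + 1) * blk, size(a))
        partial(t) = partial(t) + a(i)   ! the x = x + expr form
      end do
    end do

    ! Step 2: the partial results are combined into the reduction variable.
    x = 0
    do t = 0, nthreads - 1
      x = x + partial(t)
    end do
  end function blocked_sum
end module reduction_sketch

program reduction_demo
  use reduction_sketch
  implicit none
  integer :: i
  integer :: a(100)

  a = [(i, i = 1, 100)]
  ! The blocked result matches the ordinary serial sum.
  print *, 'blocked sum:', blocked_sum(a, 4), ' serial sum:', sum(a)
end program reduction_demo
```

The two-step form gives the same answer as a serial loop for integer addition. For floating-point data the combining order can change rounding, so results may differ slightly from a serial run.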
Rules and Restrictions for PDO and END PDO Directives
There are several ways of specifying a scheduling type. The scheduling type defaults to STATIC when no other information is available. For detailed information on scheduling type defaults, see Section D.2.6.
PDO directives are permitted only within the lexical extent of the PARALLEL and END PARALLEL directives.
For more information about rules and restrictions, refer to Chapter 6. Note that the restrictions referring to a single SCHEDULE and ORDERED clause do not apply.