This entire appendix applies only to Compaq Fortran on Compaq Tru64 UNIX systems.
This appendix summarizes the library routines available for use with
directed parallel decomposition requested by the
-mp
and
-omp
compiler options. Where applicable, new programs should call run-time
parallel library routines using the OpenMP Fortran API format
( Section D.1). For compatibility with existing programs, the
Compaq Fortran compiler recognizes equivalent routines of the formats
described in Section D.2. Thus, for example, if your program calls
_OtsGetNumThreads, the Compaq Fortran compiler interprets that as a call
to omp_get_num_threads.
D.1 OpenMP Fortran API Run-Time Library Routines
This section describes library routines that control and query the parallel execution environment. The section also documents the general-purpose lock routines supported by Compaq Fortran.
The following summary table lists the supported OpenMP Fortran API run-time library routines. These routines are all external procedures.
Routine Name | Usage |
---|---|
omp_set_num_threads | Set the number of threads to use for the next parallel region. |
omp_get_num_threads | Get the number of threads currently in the team executing the parallel region from which the routine is called. |
omp_get_max_threads | Get the maximum value that can be returned by calls to the omp_get_num_threads() function. |
omp_get_thread_num | Get the thread number, within the team, in the range from zero to omp_get_num_threads() minus one. |
omp_get_num_procs | Get the number of processors that are available to the program. |
omp_in_parallel | Inform whether or not a region is executing in parallel. |
omp_set_dynamic | Enable or disable dynamic adjustment of the number of threads available for execution of parallel regions. |
omp_get_dynamic | Inform if dynamic thread adjustment is enabled. |
omp_set_nested | Enable or disable nested parallelism. |
omp_get_nested | Inform if nested parallelism is enabled. |
omp_init_lock | Initialize a lock to be used in subsequent calls. |
omp_destroy_lock | Disassociate a lock variable from any locks. |
omp_set_lock | Make the executing thread wait until the specified lock is available. |
omp_unset_lock | Release the executing thread from ownership of a lock. |
omp_test_lock | Try to set the lock associated with a lock variable. |
D.1.1 omp_set_num_threads
Sets the number of threads to use for the next parallel region.
Syntax:
INTERFACE
  SUBROUTINE omp_set_num_threads (number_of_threads)
    INTEGER number_of_threads
  END SUBROUTINE omp_set_num_threads
END INTERFACE

INTEGER scalar_integer_expression
CALL omp_set_num_threads (scalar_integer_expression)
Description:
The scalar integer expression is evaluated, and its value is used as the number of threads. This routine takes effect only when called from serial portions of the program. Its behavior is undefined if it is called from a portion of the program where the omp_in_parallel function returns TRUE.
A call to omp_set_num_threads sets the maximum number of threads to use for the next parallel region when dynamic adjustment of the number of threads is enabled. A call to the omp_set_num_threads routine overrides the OMP_NUM_THREADS environment variable (see Table 6-4).
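For example (the program name and the team size of 4 below are arbitrary illustrations, not requirements), the call is made from the serial part of the program before the parallel region is entered:

      PROGRAM SET_TEAM_SIZE
      EXTERNAL omp_get_num_threads
      INTEGER omp_get_num_threads
! Request a team of 4 threads while still in serial code
      CALL omp_set_num_threads (4)
!$OMP PARALLEL
! Every member of the team reports the team size
      PRINT *, 'Threads in team: ', omp_get_num_threads ()
!$OMP END PARALLEL
      END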
See Also:
omp_get_dynamic()
omp_get_num_threads()
omp_in_parallel()
omp_set_dynamic()
D.1.2 omp_get_num_threads
Returns the number of threads currently in the team executing the parallel region from which it is called.
Syntax:
INTERFACE
  INTEGER FUNCTION omp_get_num_threads ()
  END FUNCTION omp_get_num_threads
END INTERFACE

INTEGER result
result = omp_get_num_threads ()
Description:
This function interacts with the omp_set_num_threads() routine and the OMP_NUM_THREADS environment variable, which control the number of threads in a team. If the number of threads has not been explicitly set by the user, the default is implementation dependent.
The omp_get_num_threads() function binds to the closest enclosing PARALLEL directive (see Chapter 6); it returns 1 if the call is made from the serial portion of a program, or from a nested parallel region that is serialized.
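The following sketch (the program name is chosen only for illustration) shows the binding rule: the call made from serial code returns 1, while the call made inside the parallel region returns the team size:

      PROGRAM TEAM_SIZE
      EXTERNAL omp_get_num_threads
      INTEGER omp_get_num_threads
! Outside any parallel region the function returns 1
      PRINT *, 'Serial part: ', omp_get_num_threads ()
!$OMP PARALLEL
! Inside the region it returns the number of threads in the team
      PRINT *, 'Parallel region: ', omp_get_num_threads ()
!$OMP END PARALLEL
      END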
See also:
omp_set_num_threads()
The OMP_NUM_THREADS environment variable (see Table 6-4)
D.1.3 omp_get_max_threads
Returns the maximum value that can be returned by calls to the omp_get_num_threads() function.
Syntax:
INTERFACE
  INTEGER FUNCTION omp_get_max_threads ()
  END FUNCTION omp_get_max_threads
END INTERFACE

INTEGER result
result = omp_get_max_threads ()
Description:
If your program uses omp_set_num_threads() to change the number of threads, subsequent calls to omp_get_max_threads() return the new value. Even when dynamic adjustment of the number of threads is enabled (see omp_set_dynamic()), you can use omp_get_max_threads() to allocate data structures that are maximally sized for each thread.
This function has global scope.
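As a sketch of the data-structure use mentioned above (the array name partial and its element type are illustrative only), storage can be sized once for the largest possible team, so each thread indexes its own slot even if dynamic adjustment later shrinks the team:

      PROGRAM MAX_THREADS_STORAGE
      EXTERNAL omp_get_max_threads, omp_get_thread_num
      INTEGER omp_get_max_threads, omp_get_thread_num
      REAL, ALLOCATABLE :: partial(:)
! Allocate one slot per thread that could ever join a team
      ALLOCATE (partial(0:omp_get_max_threads()-1))
      partial = 0.0
!$OMP PARALLEL
! Each thread writes only its own slot
      partial(omp_get_thread_num()) = 1.0
!$OMP END PARALLEL
      PRINT *, partial
      END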
Return Values:
This function returns the maximum value that omp_get_num_threads() can return, whether it is called from a serial region or from a parallel region.
See also:
omp_set_num_threads()
omp_set_dynamic()
D.1.4 omp_get_thread_num
Returns the thread number within the team.
Syntax:
INTERFACE
  INTEGER FUNCTION omp_get_thread_num ()
  END FUNCTION omp_get_thread_num
END INTERFACE

INTEGER result
result = omp_get_thread_num ()
Description:
This function binds to the closest enclosing PARALLEL directive (see Chapter 6). The master thread of the team is thread zero.
Return Values:
The value returned ranges from zero to omp_get_num_threads()-1. The function returns zero when called from a serial region or from within a nested parallel region that is serialized.
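A common use of the thread number is manual partitioning of an iteration space. In the following sketch (the array size N and the bounds arithmetic are illustrative), each thread initializes its own contiguous chunk of a shared array:

      PROGRAM THREAD_CHUNKS
      EXTERNAL omp_get_thread_num, omp_get_num_threads
      INTEGER omp_get_thread_num, omp_get_num_threads
      INTEGER, PARAMETER :: N = 1000
      INTEGER me, nt, lo, hi
      REAL a(N)
!$OMP PARALLEL PRIVATE(me, nt, lo, hi)
      me = omp_get_thread_num ()
      nt = omp_get_num_threads ()
! Thread me handles elements lo..hi of the shared array a
      lo = 1 + (me * N) / nt
      hi = ((me + 1) * N) / nt
      a(lo:hi) = 0.0
!$OMP END PARALLEL
      END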
See also:
omp_get_num_threads()
omp_set_num_threads()
D.1.5 omp_get_num_procs
Returns the number of processors that are available to the program.
Syntax:
INTERFACE
  INTEGER FUNCTION omp_get_num_procs ()
  END FUNCTION omp_get_num_procs
END INTERFACE

INTEGER result
result = omp_get_num_procs ()
Return Values:
This function returns an integer value indicating the number of processors available to your program.
D.1.6 omp_in_parallel
Returns whether or not a region is executing in parallel.
Syntax:
INTERFACE
  LOGICAL FUNCTION omp_in_parallel ()
  END FUNCTION omp_in_parallel
END INTERFACE

LOGICAL result
result = omp_in_parallel ()
Description:
This function has global scope.
Return Values:
This function returns TRUE if it is called from the dynamic extent of a region executing in parallel, even if nested regions exist that may be serialized; otherwise, it returns FALSE. A parallel region that is serialized is not considered to be a region executing in parallel.
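For example (the subroutine name CHECK_CONTEXT is purely illustrative), the same subroutine reports FALSE when called from serial code and TRUE when called from within the parallel region:

      SUBROUTINE CHECK_CONTEXT
      EXTERNAL omp_in_parallel
      LOGICAL omp_in_parallel
      PRINT *, 'Executing in parallel? ', omp_in_parallel ()
      END

      PROGRAM IN_PARALLEL_DEMO
      CALL CHECK_CONTEXT              ! prints F
!$OMP PARALLEL
      CALL CHECK_CONTEXT              ! prints T on every thread
!$OMP END PARALLEL
      END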
D.1.7 omp_set_dynamic
Enables or disables dynamic adjustment of the number of threads available for execution in a parallel region.
Syntax:
INTERFACE
  SUBROUTINE omp_set_dynamic (enable)
    LOGICAL enable
  END SUBROUTINE omp_set_dynamic
END INTERFACE

LOGICAL scalar_logical_expression
CALL omp_set_dynamic (scalar_logical_expression)
Description:
To obtain the best use of system resources, certain run-time environments automatically adjust the number of threads that are used for executing subsequent parallel regions. This adjustment is enabled only if the value of the scalar logical expression passed to omp_set_dynamic is TRUE. Dynamic adjustment is disabled if the value of the scalar logical expression is FALSE.
When dynamic adjustment is enabled, the number of threads specified by the user becomes the maximum thread count. The number of threads remains fixed throughout each parallel region and is reported by the omp_get_num_threads() function.
A call to omp_set_dynamic overrides the OMP_DYNAMIC environment variable.
The default for dynamic thread adjustment is implementation dependent. A user code that depends on a specific number of threads for correct execution should explicitly disable dynamic threads. Implementations are not required to provide the ability to dynamically adjust the number of threads, but they are required to provide the interface in order to support portability across platforms.
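A sketch of the disabling pattern described above (the team size of 8 is an arbitrary illustration): a program that depends on an exact thread count turns dynamic adjustment off before fixing the number of threads, and can confirm the setting with omp_get_dynamic():

      PROGRAM FIXED_TEAM
      EXTERNAL omp_get_dynamic, omp_get_num_threads
      LOGICAL omp_get_dynamic
      INTEGER omp_get_num_threads
! Disable dynamic adjustment so the requested count is honored exactly
      CALL omp_set_dynamic (.FALSE.)
      CALL omp_set_num_threads (8)
      PRINT *, 'Dynamic adjustment enabled? ', omp_get_dynamic ()
!$OMP PARALLEL
      PRINT *, 'Team size: ', omp_get_num_threads ()
!$OMP END PARALLEL
      END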
See also:
omp_get_dynamic()
omp_get_num_threads()
The OMP_DYNAMIC environment variable (see Table 6-4)
D.1.8 omp_get_dynamic
Determines the status of dynamic thread adjustment.
Syntax:
INTERFACE
  LOGICAL FUNCTION omp_get_dynamic ()
  END FUNCTION omp_get_dynamic
END INTERFACE

LOGICAL result
result = omp_get_dynamic ()
Return Values:
This function returns TRUE if dynamic thread adjustment is enabled; otherwise, it returns FALSE. The function always returns FALSE if dynamic adjustment of the number of threads is not implemented.
See also:
omp_set_dynamic()
omp_set_num_threads()
The OMP_DYNAMIC environment variable (see Table 6-4)
D.1.9 omp_set_nested
Enables or disables nested parallelism.
Syntax:
INTERFACE
  SUBROUTINE omp_set_nested (enable)
    LOGICAL enable
  END SUBROUTINE omp_set_nested
END INTERFACE

LOGICAL scalar_logical_expression
CALL omp_set_nested (scalar_logical_expression)
Description:
If the value of the scalar logical expression is FALSE, nested parallelism is disabled, and nested parallel regions are serialized and executed by the current thread. This is the default. If the value of the scalar logical expression is set to TRUE, nested parallelism is enabled, and parallel regions that are nested can deploy additional threads to form the team.
A call to omp_set_nested overrides the OMP_NESTED environment variable (see Table 6-4).
When nested parallelism is enabled, the number of threads used to execute the nested parallel regions is implementation dependent. This allows implementations that comply with the OpenMP standard to serialize nested parallel regions, even when nested parallelism is enabled.
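The following minimal sketch (the program name is chosen only for illustration) enables nesting before an outer region whose threads each encounter an inner PARALLEL directive; whether the inner regions actually run in parallel remains implementation dependent:

      PROGRAM NESTED_DEMO
      EXTERNAL omp_get_nested, omp_get_thread_num
      LOGICAL omp_get_nested
      INTEGER omp_get_thread_num
      CALL omp_set_nested (.TRUE.)
      PRINT *, 'Nested parallelism enabled? ', omp_get_nested ()
!$OMP PARALLEL
!$OMP PARALLEL
! With nesting enabled, each outer thread may get its own inner team
      PRINT *, 'Inner region thread: ', omp_get_thread_num ()
!$OMP END PARALLEL
!$OMP END PARALLEL
      END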
See Also:
omp_get_nested()
D.1.10 omp_get_nested
Determines the status of nested parallelism.
Syntax:
INTERFACE
  LOGICAL FUNCTION omp_get_nested ()
  END FUNCTION omp_get_nested
END INTERFACE

LOGICAL result
result = omp_get_nested ()
Description:
This function returns TRUE if nested parallelism is enabled. If nested parallelism is disabled, it returns FALSE. The function always returns FALSE if nested parallelism is not implemented.
See Also:
omp_set_nested()
D.1.11 Lock Routines
The OpenMP run-time library includes a set of general-purpose locking routines. Your program must not attempt to access any lock variable, var, except through the routines described in this section. The var lock variable is an integer of a KIND large enough to hold an address. On Compaq Tru64 UNIX systems, var should be declared as INTEGER(KIND=8).
Using the lock control routines requires that they be called in a specific sequence:
1. The lock to be associated with the lock variable must first be initialized (omp_init_lock).
2. The associated lock is made available to the executing thread (omp_set_lock or omp_test_lock).
3. The executing thread is released from ownership of the lock (omp_unset_lock).
4. When finished, the lock must be disassociated from the lock variable (omp_destroy_lock).
A simple omp_set_lock and omp_unset_lock combination satisfies this requirement. If you want your program to do useful work while waiting for the lock to become available, you can use the combination of omp_test_lock and omp_unset_lock instead. For example:
      PROGRAM LOCK_USAGE
      EXTERNAL OMP_TEST_LOCK, OMP_GET_THREAD_NUM
      LOGICAL OMP_TEST_LOCK
      INTEGER OMP_GET_THREAD_NUM
      INTEGER(KIND=8) LCK          ! Lock variable must be of pointer size
      CALL OMP_INIT_LOCK(LCK)
!$OMP PARALLEL SHARED(LCK) PRIVATE(ID)
      ID = OMP_GET_THREAD_NUM()
      CALL OMP_SET_LOCK(LCK)
      PRINT *, 'MY THREAD ID IS ', ID
      CALL OMP_UNSET_LOCK(LCK)
      DO WHILE (.NOT. OMP_TEST_LOCK(LCK))
        CALL SKIP(ID)              ! Do not yet have lock, do something else
      END DO
      CALL WORK(ID)                ! Have the lock, now do work
      CALL OMP_UNSET_LOCK(LCK)
!$OMP END PARALLEL
      CALL OMP_DESTROY_LOCK(LCK)
      END
The lock control routines are described in detail in the following
sections.
D.1.11.1 omp_init_lock
Initializes a lock associated with a given lock variable for use in subsequent calls.
Syntax:
INTERFACE
  SUBROUTINE omp_init_lock (var)
    INTEGER(KIND=8) var
  END SUBROUTINE omp_init_lock
END INTERFACE

INTEGER(KIND=8) v
CALL omp_init_lock (v)
Description:
The initial state of the lock variable v is unlocked.
Restriction:
Attempting to call this routine with a lock variable that is already
associated with a lock is an invalid operation and will cause a
run-time error.
D.1.11.2 omp_destroy_lock
Disassociates a given lock variable from any locks.
Syntax:
INTERFACE
  SUBROUTINE omp_destroy_lock (var)
    INTEGER(KIND=8) var
  END SUBROUTINE omp_destroy_lock
END INTERFACE

INTEGER(KIND=8) v
CALL omp_destroy_lock (v)
Restriction:
Attempting to call this routine with a lock variable that has not been
initialized is an invalid operation and will cause a run-time error.
D.1.11.3 omp_set_lock
Makes the executing thread wait until the specified lock is available.
Syntax:
INTERFACE
  SUBROUTINE omp_set_lock (var)
    INTEGER(KIND=8) var
  END SUBROUTINE omp_set_lock
END INTERFACE

INTEGER(KIND=8) v
CALL omp_set_lock (v)
Description:
When the lock becomes available, the thread is granted ownership.
Restriction:
Attempting to call this routine with a lock variable that has not been
initialized is an invalid operation and will cause a run-time error.
D.1.11.4 omp_unset_lock
Releases the executing thread from ownership of the lock.
Syntax:
INTERFACE
  SUBROUTINE omp_unset_lock (var)
    INTEGER(KIND=8) var
  END SUBROUTINE omp_unset_lock
END INTERFACE

INTEGER(KIND=8) v
CALL omp_unset_lock (v)
Description:
If the thread does not own the lock specified by the variable, the behavior is undefined.
Restriction:
Attempting to call this routine with a lock variable that has not been
initialized is an invalid operation and will cause a run-time error.
D.1.11.5 omp_test_lock
Tries to set the lock associated with the lock variable var.
Syntax:
INTERFACE
  LOGICAL FUNCTION omp_test_lock (var)
    INTEGER(KIND=8) var
  END FUNCTION omp_test_lock
END INTERFACE

INTEGER(KIND=8) v
LOGICAL result
result = omp_test_lock (v)
Return Values:
If the attempt to set the lock specified by the variable succeeds, the function returns TRUE; otherwise, the function returns FALSE. In either case, the routine does not wait for the lock to become available.
Restriction:
Attempting to call this routine with a lock variable that has not been
initialized is an invalid operation and will cause a run-time error.
D.2 Other Parallel Threads Routines
Compaq Fortran supports the set of parallel thread routines described in this section primarily for existing programs. For creating new programs, the set of routines described in Section D.1 is preferred.
Where indicated in the following table, the _Otsxxxx (Compaq spelling) and the mpc_xxxx (compatibility spelling) routine names are equivalent. For example, calling _OtsGetNumThreads is the same as calling mpc_numthreads.
Routine Name | Description |
---|---|
_OtsInitParallel | Start slave threads for parallel processing if they have not yet been started implicitly (normally, the threads have been started by default at the first parallel region). Call as a subroutine with two arguments (see Section D.2.2). |
_OtsStopWorkers or mpc_destroy | Stop any slave threads created by parallel library support. This routine cannot be called from within a parallel region. After this call, new slave threads will be implicitly created the next time a parallel region is encountered, or can be created explicitly by calling _OtsInitParallel. Call as a subroutine (see Section D.2.3). |
_OtsGetNumThreads or mpc_numthreads | Return the number of threads that are being used in the current parallel region (if running within one), or the number of threads that have been created so far (if not currently within a parallel region). Invoke as an integer function (see Section D.2.4). |
_OtsGetMaxThreads or mpc_maxnumthreads | Return the number of threads that would normally be used for parallel processing in the current environment. This is affected by the environment variable MP_THREAD_COUNT, by the number of processes in the current process's processor set, and by any call to _OtsInitParallel. Invoke as an integer function (see Section D.2.5). |
_OtsGetThreadNum or mpc_my_threadnum | Return a number that identifies the current thread. The main thread is 0, and slave threads are numbered densely from 1. Invoke as an integer function (see Section D.2.6). |
_OtsInParallel or mpc_in_parallel_region | Return 1 if you are currently within a parallel region, or 0 if not. Invoke as an integer function (see Section D.2.7). |
D.2.1 Calling the _Otsxxxx and mpc_xxxx Routines
To call the _Otsxxxx or mpc_xxxx routines, use the cDEC$ ALIAS directive (described in the Compaq Fortran Language Reference Manual) to handle the mixed-case naming convention and missing trailing underscore. For example, to call the _OtsGetThreadNum routine with an alias of OtsGetThreadNum, use the following code:
      integer a(10)
      INTERFACE
        INTEGER FUNCTION OtsGetThreadNum ()
        !DEC$ ALIAS OtsGetThreadNum, '_OtsGetThreadNum'
        END FUNCTION OtsGetThreadNum
      END INTERFACE
!$par parallel do
      do i = 1,10
        print *, "i=",i, " thread=", OtsGetThreadNum ()
      enddo
      end
Alternatively, to use the compatibility naming convention of mpc_my_threadnum:
      integer a(10)
      INTERFACE
        INTEGER FUNCTION mpc_my_threadnum ()
        !DEC$ ALIAS mpc_my_threadnum, 'mpc_my_threadnum'
        END FUNCTION mpc_my_threadnum
      END INTERFACE
!$par parallel do
      do i = 1,10
        print *, "i=",i, " thread=", mpc_my_threadnum ()
      enddo
      end
D.2.2 _OtsInitParallel
Starts slave threads for parallel processing.
Syntax:
INTERFACE
  SUBROUTINE otsinitparallel (nthreads, attr)
  !DEC$ ALIAS otsinitparallel, '_OtsInitParallel'
    INTEGER nthreads
    INTEGER (KIND=8) attr
  !DEC$ ATTRIBUTES VALUE :: nthreads, attr
  END SUBROUTINE otsinitparallel
END INTERFACE
Description:
Starts slave threads for parallel processing if they have not yet been started implicitly. Use this routine if you want to:
The arguments are:
D.2.3 _OtsStopWorkers or mpc_destroy
Stops any slave threads created by parallel library support.
Syntax:
INTERFACE
  SUBROUTINE otsstopworkers ()
  !DEC$ ALIAS otsstopworkers, '_OtsStopWorkers'
  END SUBROUTINE otsstopworkers
END INTERFACE

CALL otsstopworkers ()
Description:
Stop any slave threads created by parallel library support. Use this
routine if you need to perform some operation (such as a call to
fork()
) that cannot tolerate extra threads running in the process. This
routine cannot be called from within a parallel region. After this
call, new slave threads will be implicitly created the next time a
parallel region is encountered, or can be created explicitly by calling
otsinitparallel.
D.2.4 _OtsGetNumThreads or mpc_numthreads
Returns the number of threads being used (in a parallel region) or created so far (if not in a parallel region).
Syntax:
INTERFACE
  INTEGER FUNCTION otsgetnumthreads ()
  !DEC$ ALIAS otsgetnumthreads, '_OtsGetNumThreads'
  END FUNCTION otsgetnumthreads
END INTERFACE

INTEGER result
result = otsgetnumthreads ()
Description:
Returns the number of threads that are being used in the current parallel region (if running within one), or the number of threads that have been created so far (if not currently within a parallel region). You can use this call to decide how to partition a parallel loop. For example:
      nt = otsgetnumthreads ()
c$par parallel do
      do i = 0,nt-1
        work(i) = 0
        k0 = 1+(i*n)/nt
        k1 = ((i+1)*n)/nt
        do j = 1,m
          do k = k0,k1
            ! use work(i)
          enddo
        enddo
      enddo
D.2.5 _OtsGetMaxThreads or mpc_maxnumthreads
Returns the maximum number of threads for the current environment.
Syntax:
INTERFACE
  INTEGER FUNCTION otsgetmaxthreads ()
  !DEC$ ALIAS otsgetmaxthreads, '_OtsGetMaxThreads'
  END FUNCTION otsgetmaxthreads
END INTERFACE

INTEGER result
result = otsgetmaxthreads ()
Description:
Returns the number of threads that would normally be used for parallel
processing in the current environment. This is affected by the
environment variable MP_THREAD_COUNT, by the number of processes in the
current process's processor set, and by any call to otsinitparallel.
D.2.6 _OtsGetThreadNum or mpc_my_threadnum
Returns the number of the current thread.
Syntax:
INTERFACE
  INTEGER FUNCTION otsgetthreadnum ()
  !DEC$ ALIAS otsgetthreadnum, '_OtsGetThreadNum'
  END FUNCTION otsgetthreadnum
END INTERFACE

INTEGER result
result = otsgetthreadnum ()
Description:
Returns a number that identifies the current thread. The main thread is 0, and slave threads are numbered densely from 1.