Compaq Fortran
Release Notes for Compaq Tru64
UNIX Systems
In addition, the ability to call parallel HPF subprograms from
non-parallel (Fortran or non-Fortran) main programs is not supported
in this release. For more information, see Chapter 6 of the DIGITAL
High Performance Fortran 90 HPF and PSE Manual.
1.7.4 Version 5.3 New Features
The following new Compaq Fortran features are now supported:
- The following new features are now supported:
- You can now CALL a function. In other words, a routine that is
declared to be a FUNCTION can be invoked by a CALL statement. The
function's return value is discarded.
- Compaq Fortran now supports COMPLEX(KIND=16), also spelled
COMPLEX*32. This is a complex number composed of two 128-bit extended
floating-point numbers (that is, REAL(KIND=16)). Complete documentation is
in the updated Compaq Fortran Language Reference Manual as well as the
/usr/lib/cmplrs/fort90/decfortran90.hlp help file. Here are some
highlights:
- COMPLEX*32 or COMPLEX(KIND=16) declares a pair of REAL*16 128-bit
reals as a complex pair. It occupies 32 bytes.
- COMPLEX*32 constants are (x,y) where at least one of x and y is a
REAL*16 constant, for example, (1,2Q0).
- COMPLEX arithmetic supports + - * / ** . Mixed-type arithmetic that
involves a COMPLEX*32 operand converts the other operands to COMPLEX*32,
since COMPLEX*32 is the largest type.
- COMPLEX*32 can be read and written in all I/O forms.
- Command line option "-real_size 128" forces "COMPLEX" to be
COMPLEX*32 and DOUBLE COMPLEX to be COMPLEX*32. "-double_size 128"
forces DOUBLE COMPLEX to be COMPLEX*32.
- Intrinsic generic functions that take COMPLEX now take COMPLEX*32.
New specific intrinsic functions for COMPLEX*32 are CQABS, QIMAG,
QCONJG, CQCOS, CQEXP, CQLOG, QREAL, CQSIN, CQSQRT, QCMPLX.
- Operations involving a COMPLEX*16 and a REAL*16 now produce a
COMPLEX*32 result. These used to produce a COMPLEX*16 result.
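The COMPLEX*32 highlights above can be illustrated with a minimal sketch. It is not taken from the Compaq documentation: the program name, the kind parameters qp and dp, and the variable names are illustrative assumptions. The extended kind is obtained with SELECTED_REAL_KIND rather than written as 16, and the sketch assumes a compiler that provides a 128-bit real kind.

```fortran
program complex32_sketch
  implicit none
  ! Assumption: the compiler has a real kind with at least 33 decimal
  ! digits (REAL*16 in the notes). If it does not, qp is negative and the
  ! declarations below are rejected at compile time.
  integer, parameter :: qp = selected_real_kind(33)
  integer, parameter :: dp = kind(1.0d0)
  complex(kind=qp) :: z
  complex(kind=dp) :: zd
  real(kind=qp)    :: x

  ! The notes write such a constant as (1,2Q0). The Q exponent letter is
  ! an extension, so the portable kind-suffix form is used here.
  z  = (1.0_qp, 2.0_qp)
  zd = (3.0_dp, 4.0_dp)
  x  = 2.0_qp

  ! Generic intrinsics accept the extended complex kind.
  print *, 'abs(z)   = ', abs(z)
  print *, 'conjg(z) = ', conjg(z)

  ! Mixing a double-precision complex with an extended real yields a
  ! result of the extended complex kind.
  print *, 'result is extended complex: ', kind(zd * x) == qp
end program complex32_sketch
```

The last line checks the kind of the mixed expression instead of printing a value, which keeps the sketch independent of how a given compiler formats extended-precision output.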
- The BUFFERED= keyword has been added to the OPEN and INQUIRE
statements. The default is BUFFERED='NO' for all I/O, in which case the
RTL empties its internal buffer for each WRITE. If BUFFERED='YES' is
specified and the device is a disk, the internal buffer will be filled,
possibly by many WRITE statements, before it is emptied.
If the OPEN has BUFFERCOUNT and BLOCKSIZE arguments, their product is the
size in bytes of the internal buffer. If these are not specified, the
default size is 8192 bytes. This internal buffer will grow to hold the
largest single record but will never shrink.
- Character vector constructors may now have unequal length elements.
The length of each element is the maximum of the element lengths. For
example,
(/ 'ab', 'abc', 'a' /) == (/ 'ab ', 'abc', 'a  ' /)
- The Compaq Extended Math Library (CXML) routines are updated in the
Compaq Fortran kit. See the CXML release notes in:
/usr/opt/XMDCOM360/docs/XMD360_release_note.txt
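The padding rule for character constructors with unequal-length elements, described earlier in this list, can be sketched as follows. This is an illustrative sketch, not code from the Compaq kit: the program and variable names are assumptions. Because the unequal-length form is an extension that other compilers may reject, it appears only in a comment, and the executable statement uses the hand-padded equivalent.

```fortran
program char_constructor_sketch
  implicit none
  character(len=3) :: padded(3)

  ! Extension form described in the notes (may be rejected elsewhere):
  !   padded = (/ 'ab', 'abc', 'a' /)
  ! Equivalent form with every element padded by hand to the longest
  ! element length, which is 3 here.
  padded = (/ 'ab ', 'abc', 'a  ' /)

  ! Brackets make the trailing blanks visible in the output.
  print '(3("[",a,"]"))', padded
end program char_constructor_sketch
```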
- The following new f90 command options are now supported:
- -arch ev67 and -tune ev67 now provide instruction set support and
performance tuning for the ev67 processor (21264A chip), which adds the
count extension (CIX) instructions POPCNT, LEADZ, and TRAILZ.
- -align sequence allows the components of a SEQUENCEd derived type to be
aligned according to the alignment rules set by the user. The default
alignment rules are to align components on natural boundaries. The
default is -align nosequence, which means components of a SEQUENCEd
derived type will be packed, regardless of the current alignment rules
set by the user.
- -fast now sets -align sequence so that SEQUENCEd derived type
components can be naturally aligned for improved performance.
- -fast now sets -arch host -tune host.
- -assume buffered_io turns on buffered I/O for all Fortran logical units
opened for sequential writing. The default is -assume nobuffered_io.
- -dname=value now allows a quoted string as value. For example,
-ddate="nov 20, 1999" passes the character string nov 20, 1999 as the
value of date to cpp(1) and to the Compaq Fortran 90 compiler.
- -warn hpf tells the compiler to do both syntactic and semantic checking
on HPF directives. The default is -warn nohpf unless -wsf is specified,
in which case -warn hpf is assumed.
- -f77rtl tells the compiler to use the run-time behavior of Compaq
Fortran 77 instead of Compaq Fortran 90. For example, this affects the
output form for NAMELIST. The default is -nof77rtl.
- -mixed_str_len_arg tells the compiler that the hidden length passed for
a character argument is to be placed immediately after its corresponding
character argument in the argument list. The default is
-nomixed_str_len_arg, which places the hidden lengths in sequential order
at the end of the argument list.
- The file suffix .f90 now tells the driver that the file contains
Fortran 90 free-form source that must be preprocessed by cpp(1). cpp(1)
produces an intermediate .i90 file that is then compiled.
1.7.5 Version 5.3 Important Information
Some important information to note about this release:
- As of Compaq Fortran V5.3, the f77 command executes the Compaq Fortran
90 compiler instead of the Compaq Fortran 77 compiler. Use f77 -old_f77
to execute the Compaq Fortran 77 compiler.
- There are five INCLUDE files in /usr/include that give definitions
of DFAO RTL symbols:
- for_fpe_flags.f - flags for for_set/get_fpe(3f)
- fordef.f - return values for the fp_class intrinsic
- foriosdef.f - values for STAT= IO status results
- forompdef.f - interface blocks to the omp_* routines
- forreent.f - flags for for_set_reentrancy(3f)
- PARAMETER constants are now allocated in a read-only PSECT.
- Files that contain declarations that will be INCLUDEd into source
code should declare data fully so that command line options used to
compile the source code do not unexpectedly affect the INCLUDEd
declarations. For example, if I is declared INTEGER, then using the -i2
option changes I from INTEGER*4 to INTEGER*2. If I is declared INTEGER*4,
then its definition is not affected by -i2.
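The advice about fully declared INCLUDE data can be sketched with two declarations. This is an illustrative sketch: the program and variable names are assumptions, and it assumes, as in Compaq Fortran, that KIND=4 denotes a 4-byte integer (INTEGER*4 in the notation used above). In practice the declarations would live in the INCLUDEd file; they are shown inline so the sketch is self-contained.

```fortran
program explicit_kind_sketch
  implicit none
  ! Default INTEGER: its size follows command line options such as -i2.
  integer :: i_default
  ! Explicitly sized INTEGER: not affected by such options.
  ! Assumption: KIND=4 means 4 bytes, as in Compaq Fortran.
  integer(kind=4) :: i_fixed

  ! BIT_SIZE is an inquiry function, so the variables need no value.
  print *, 'default INTEGER bytes:  ', bit_size(i_default) / 8
  print *, 'INTEGER(KIND=4) bytes:  ', bit_size(i_fixed) / 8
end program explicit_kind_sketch
```

Compiling the sketch with and without an integer-size option should change only the first reported size, which is the effect the note warns about.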
1.7.6 Version 5.3 Corrections
From version X5.2-829-4296F ECO 01 to FT1 T5.3-860-4498G, the following
corrections have been made:
- Fix problem with wrong generated code if an OPTIONAL and omitted
descriptor-based dummy argument is passed as an actual argument to a
routine which declares that argument as OPTIONAL.
- Fix problem where ASSOCIATED did not always return the correct
result for a pointer component that was transferred via pointer
assignment.
- Enable display of array bounds larger than 32 bits in listing
summary.
- Fix internal compiler error for certain uses of defined assignment
where multiple defined operators appeared in the right-hand side of the
assignment.
- Add /ALIGN=SEQUENCE (/ALIGN:SEQUENCE, -align sequence) which
specifies that SEQUENCE types may be padded for alignment.
- Make the default for BLANK= in OPEN match the documentation when
the -f66 (/NOF77) switch is specified, which is to default to
BLANK='ZERO'. Previously, BLANK='NULL' was used regardless.
- Allow array constructors to have scalar CHARACTER source elements
of varying size.
- Correct problem where a call to a routine with the C and VARYING
attributes generates incorrect code.
- Make sure that -g3 does not turn off optimization.
- Fix internal compiler error for statement function which uses the
function return variable of the host function.
- Fix internal compiler error for an incorrect program that uses a
component of a derived type variable in an automatic array bounds
expression when the derived type is undefined and IMPLICIT NONE is used.
- Fix internal compiler error when RESULT variable has same name as a
previously seen FUNCTION.
- Fix problem with PUBLIC/PRIVATE attributes in a particularly
complicated module usage.
- Eliminate spurious error message for valid generic procedure
reference.
- Fix problem with DATA initialization of zero-origin arrays.
- Fix problem where compiler would not allow "# linenum" to appear in
a source file if a !DEC$ or !MS$ directive was seen.
- Don't give "unused" warning for EQUIVALENCEd variable.
- Properly treat INT(n,KIND=) in an array constructor.
- Don't disable type checking for %LOC.
- Properly parse generic INTERFACE whose name begins with TO.
- When -align dcommons is used, make sure that POINTER objects in
COMMON are aligned on quadword boundaries.
- Correctly parse program with IF construct whose name begins with IF.
- Fix a case where two NaNs sometimes compared as equal.
- If an attempt is made to DEALLOCATE an item which is not
DEALLOCATEable, such as an array slice, a run-time error is now given.
Previously, the results were unpredictable.
From version FT1 T5.3-860-4498G to FT2 T5.3-893-4499U, the following
corrections have been made:
- Allocate all PARAMETER constants in a read-only PSECT.
- Ensure that locally-allocated derived-type arrays are naturally
aligned.
- Generate correct code for pointer assignment of an array generated
from a section of a derived type.
- Eliminate internal compiler error in certain cases with dummy
argument that has OPTIONAL and INTENT(OUT) attributes.
- Flag square-bracket array constructor syntax as an extension.
- Eliminate internal compiler error for certain uses of TRANSFER.
- Properly detect ambiguous generic reference when all distinguishing
arguments are OPTIONAL.
- Eliminate internal compiler error for a case involving a PRIVATE
POINTER in a module.
- Eliminate spurious "this name has already been used as an external
procedure" error for recursive function which returns a derived type.
- "Directive not supported on this platform" diagnostic is now
informational, not warning severity.
- Allow array sections in DATA statement variable list.
From version FT2 T5.3-893-4499U to V5.3-915-449BB, the following
corrections have been made:
- Eliminate access violation on some platforms for ALLOCATE of
pointer in a derived type.
- Correct problem where compiler could omit putting out declaration
for a routine symbol.
- Handle non-present, optional dummy arguments as third argument to
INDEX, SCAN, and VERIFY.
- Generate correct code when passing character array slices as
arguments.
- Fix case of contiguous array slice as first argument to TRANSFER.
- Fix INQUIRE by IOLIST of ALLOCATABLE arrays.
- Correct problem involving pointer assignment with sections of a
derived type.
- Eliminate inappropriate error messages when overloading SIGN
intrinsic.
- Eliminate internal compiler error when "-" defined as both unary
and binary operators in separate modules.
- Eliminate spurious unused warning for pointer target.
- Implement OMP interpretation regarding DEFAULT(NONE).
- Eliminate spurious standards diagnostic for !DEC$ UNROLL.
- Correct problem with accessibility of NAMELIST names from module.
- When -real_size 64 and -double_size 128 are used, make sure DOUBLE
PRECISION gets REAL*16.
- Correct evaluation of FLOAT intrinsic with -real_size 64.
- Correct problem with array constructors in format expressions.
1.7.7 HPF in Compaq Fortran Version 5.3
As in Fortran 90 Version 5.2, the HPFLIBS subset replaces the old
PSESHPF subset. If you previously installed the PSESHPF subset you do
not need to delete it. If you choose to delete it, delete it before you
install the Fortran 90 V5.3 HPFLIBS170 subset. If you delete the
PSESHPF subset after you install the Fortran HPFLIBS170 subset, you
need to delete the HPFLIBS170 subset and then reinstall it. For
information on using the setld command to check for and delete subsets,
see the Compaq Fortran Installation Guide for Tru64 UNIX Systems.
To execute HPF programs compiled with the -wsf switch, you must have both
PSE160 and Fortran 90 Version 5.3 with the HPFLIBS170 subset installed.
For this release, the order of installation is important: first install
PSE160, and then install Fortran 90 Version 5.3 with the HPFLIBS170
subset. The HPFLIBS170 subset must be installed last; installing in this
order ensures that it is set up properly.
If you also need to use the latest versions of MPI and PVM, you must
install PSE180. PSE180 contains only MPI and PVM support. The support
for HPF programs compiled with the -wsf option is only found in PSE160.
Therefore you must install both versions of PSE and you must install
PSE180 after PSE160.
To install Compaq Fortran with HPF and MPI and PVM, install them in the
following order. The order is very important.
- Delete any old versions that you wish to delete.
- Install PSE160.
- Install Compaq Fortran Version 5.3 including the HPFLIBS170 subset.
- Install PSE180.
The HPF runtime libraries in Compaq Fortran Version 5.3 are only
compatible with PSE Version 1.6. Programs compiled with this version
will not run correctly with older versions of PSE. In addition,
programs compiled with older compilers will no longer run correctly
when linked with programs compiled with this version. Relinking is not
sufficient; programs must be recompiled and relinked.
If you cannot install these in the order described, follow these
directions to correct the installation:
- If you have installed Fortran Version 5.3 but are missing PSE160,
then install PSE160. Delete the HPFLIBS170 subset of Fortran V5.3 and
then reinstall the HPFLIBS170 subset.
- If you installed Fortran Version 5.3 first and then PSE160, then
delete the HPFLIBS170 subset of Fortran V5.3. Next, reinstall the
HPFLIBS170 subset.
- If you already have Fortran Version 5.3 and PSE160 installed but
did not install the HPFLIBS170 subset of Fortran V5.3, then simply
install the HPFLIBS170 subset.
- If you deleted any old PSESHPF subset after installing Fortran
V5.3, this will also cause problems. In this case delete the HPFLIBS170
subset of Fortran Version 5.3 and then reinstall the HPFLIBS170 subset.
- If you installed PSE180 before PSE160, then delete PSE180 and
reinstall it now.
For more information about installing PSE160, see the Compaq
Parallel Software Environment Release Notes, Version 1.6.
For more information about installing PSE180, see the Compaq
Parallel Software Environment Release Notes, Version 1.8.
1.7.8 Version 5.3 Known Problems
The following known problems exist with Compaq Fortran Version 5.3:
- The following is a list of known problems for -omp parallel support
in Version 5.3:
- Nested parallel regions are not supported by -omp. A program that
contains nested parallel regions will cause the compiler to fail with an
internal error.
1.8 New Features, Corrections, and Known Problems in Version 5.2
Version 5.2 is a minor release that includes corrections to problems
discovered since Version 5.1 was released and certain new features.
1.8.1 Version 5.2 ECO 01 New Features
The following new Compaq Fortran (DIGITAL Fortran 90) features are now
supported:
- IVDEP Directive
The IVDEP directive assists the compiler's dependence analysis. It can
also be specified as INIT_DEP_FWD (INITialize DEPendences ForWarD). The
IVDEP directive takes the following form:
cDEC$ IVDEP
c    Is one of the following: C (or c), !, or *.
The IVDEP directive is an assertion to the compiler's optimizer
about the order of memory references inside a DO loop.
The IVDEP directive tells the compiler to begin dependence analysis by
assuming all dependences occur in the same forward direction as their
appearance in the normal scalar execution order. This contrasts with
normal compiler behavior, which is for the dependence analysis to make no
initial assumptions about the direction of a dependence.
The IVDEP directive must precede the DO statement for each DO loop it
affects. No source code lines, other than the following, can be placed
between the IVDEP directive statement and the DO statement:
- An UNROLL directive
- A PARALLEL DO directive (TU*X only)
- A PDO directive (TU*X only)
- Placeholder lines
- Comment lines
- Blank lines
The IVDEP directive is applied to a DO loop in which you know that
dependences are in lexical order. For example, if two memory references
in the loop touch the same memory location and one of them modifies the
memory location, then the first reference to touch the location has to
be the one that appears earlier lexically in the program source code.
This assumes that the right-hand side of an assignment statement is
"earlier" than the left-hand side.
The IVDEP directive informs the compiler that the program would behave
correctly if the statements were executed in certain orders other than
the sequential execution order, such as executing the first statement or
block to completion for all iterations, then the next statement or block
for all iterations, and so forth. The optimizer can use this information,
along with whatever else it can prove about the dependences, to choose
other execution orders.
Example
In the following example, the IVDEP directive provides more information
about the dependences within the loop, which may enable loop
transformations to occur:
!DEC$ IVDEP
DO I=1, N
   A(INDARR(I)) = A(INDARR(I)) + B(I)
END DO
In this case, the scalar execution order follows:
1. Retrieve INDARR(I).
2. Use the result from step 1 to retrieve A(INDARR(I)).
3. Retrieve B(I).
4. Add the results from steps 2 and 3.
5. Store the results from step 4 into the location indicated by
A(INDARR(I)) from step 1.
IVDEP directs the compiler to initially assume that when steps 1
and 5 access a common memory location, step 1 always accesses the
location first because step 1 occurs earlier in the execution sequence.
This approach lets the compiler reorder instructions, as long as it
chooses an instruction schedule that maintains the relative order of
the array references.
- UNROLL Directive
The UNROLL directive tells the compiler's optimizer how many times to
unroll a DO loop. It takes the following form:
cDEC$ UNROLL [(n)]
c    Is one of the following: C (or c), !, or *.
n    Is an integer constant. The range of "n" is 0 through 255.
The UNROLL directive must precede the DO statement for each DO loop
it affects. No source code lines, other than the following, can be
placed between the UNROLL directive statement and the DO statement:
- An IVDEP directive
- A PARALLEL DO directive (TU*X only)
- A PDO directive (TU*X only)
- Placeholder lines
- Comment lines
- Blank lines
If "n" is specified, the optimizer unrolls the loop "n" times. If
"n" is omitted, or if it is outside the allowed range, the optimizer
picks the number of times to unroll the loop.
The UNROLL directive overrides any setting of loop unrolling from the
command line.
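The placement rules for IVDEP and UNROLL described above can be seen together in one small program. This is a sketch under stated assumptions, not an example from the Compaq manuals: the program name, the array size, and the unroll count of 4 are illustrative choices. The index array is built as a permutation so that no two iterations touch the same element of A. Compilers that do not recognize the !DEC$ prefix treat both directive lines as comments.

```fortran
program ivdep_unroll_sketch
  implicit none
  integer, parameter :: n = 8
  integer :: i
  integer :: indarr(n)
  real    :: a(n), b(n)

  ! Set up the data. INDARR is a permutation of 1..N (reversed order),
  ! so each element of A is updated by exactly one iteration.
  do i = 1, n
     indarr(i) = n + 1 - i
     a(i) = 0.0
     b(i) = real(i)
  end do

  ! Both directives precede the DO statement they affect. Per the notes,
  ! an UNROLL directive may sit between IVDEP and the DO statement.
  ! The count 4 is an arbitrary illustrative value within 0 through 255.
!DEC$ IVDEP
!DEC$ UNROLL (4)
  do i = 1, n
     a(indarr(i)) = a(indarr(i)) + b(i)
  end do

  print *, a
end program ivdep_unroll_sketch
```

Whether either directive changes the generated code depends on the compiler and optimization level; the program's result is the same with or without them.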
Some important information to note about this release:
- -fast now implies "-arch host -tune host" as defaults. These can be
overridden with explicit options. Note that this has an impact on
redistributed programs - if they are to run on older generation
processors than the compiling host, -arch, at least, must be overridden.
- The command line option "-source_listing" is not documented, but it
produces a listing file with a file extension of ".lis" (as opposed to
"-V", which produces a .l listing file).
- This ECO release includes the two subsets XMDLOA351 (DXML serial
libraries) and XMDPLL351 (DXML parallel libraries).
- Note that there is an installation order issue with PSE: PSE160
should be installed BEFORE Fortran, since the Fortran kit has newer HPF
libraries. If you are also using MPI and/or PVM, then there is also a
dependency with the latest MPI/PVM kits, which are in PSE V1.8
(PSE180): the installation order needs to be PSE160 then Fortran then
PSE180 OR PSE160 then PSE180 then Fortran.
From version V5.2-705-428BH to X5.2-829-4296F, the following
corrections have been made:
- Correct a problem with PACK when the first argument is a
two-dimensional slice of a three-dimensional array.
- Correct problem with ADJUSTL, ADJUSTR and COTAN with array element
arguments.
- Fix internal compiler error for certain uses of LL* intrinsics.
- Prevent internal compiler error when the size of a return value is
based on a call to a pure function with the argument to this function.
- Correct problems with nested uses of SPREAD intrinsic.
- Make ASSOCIATED return the correct result when the target is an
element of a deferred-shape array.
- Correct a problem with a USE...ONLY of some symbols from an
EQUIVALENCE group in a module. Previously, the compiler might generate
an external reference to the wrong symbol.
- Correct a problem with EOSHIFT of a structure array with a
multidimensional structure component.
- Eliminate the unnecessary use of temporary array copies in many
cases.
- Add support for specific names IMVBITS, JMVBITS and KMVBITS
(already documented).
- Correct a problem where calling an ELEMENTAL routine with a pointer
array may give incorrect results.
- Fix TRANSFER intrinsic where the MOLD is a character substring with
non-zero base, e.g., TRANSFER(X, CH(I1:I2)).
- Fix problem where CSHIFT of an array of derived type generated bad
code.
- Correct problem with pointer assignment when the right-hand-side is
an array of derived types.
- Correct problems involving function return value whose size depends
on properties of input arguments.
- Fix problem that caused internal compiler error with RESHAPE.
- Fix problem where IBCLR of mixed-kind arguments gave wrong answer.
- When fpp is invoked, have it also look in the current directory for
include files.
- Correct problem with I/O of a slice of an assumed-size array.
- Issue error message for lexically nested parallel regions.
- In listing summary, list zero-length COMMON PSECTs.
- Eliminate spurious warning when passing a POINTER or assumed-shape
array in COMMON to a routine with a compatible dummy argument
declaration.
- Fix internal compiler error involving array-valued functions with
entry points.
- Generate correct code for unusual (and non-standard) dummy aliasing
case involving an EQUIVALENCEd variable passed as an argument.
- Fix problem with incorrect code for a call to ALLOCATE or
DEALLOCATE where STAT= is specified using an array element.
- -fast now implies -arch host -tune host as defaults. These can be
overridden with explicit options. Note that this has an impact on
redistributed programs - if they are to run on older generation
processors than the compiling host, -arch, at least, must be overridden.
- Fix internal compiler error for certain programs which CALL a
function.
- Correct compiler abort with ASSOCIATED (X,(Y))
- Don't give standards warning for ELEMENTAL PURE.
- Consider FORALL index variables "used" for -warn unused purposes.
- Disallow leading underscore in identifiers, as documented.
- Correct problem with implied DO loop in non-INTEGER array
constructors in initialization expressions.
- Allow expression involving array constructors in an initialization
expression.
- %LOC is treated the same as LOC for type checking purposes.
- Correct problem involving generic routine resolution.
- SEQUENCE now byte-packs fields, as the documentation says.
- Correct compiler abort with RESHAPE in initialization expression.
- Correct compiler abort for case with defined operators.
- Correct compiler abort for syntax error X(;,:)
- Give appropriate error if DO loop variable is too small for range.
- Correct compiler abort for LEN_TRIM(array) in initialization
expression.
- Correct compiler abort for SIZE(non-array).
- Correct problems with ISHFT(array) in initialization expression.
- Allow SHAPE in initialization expression.
- Don't give standards warning for use of INDEX in initialization
expression.
- Consider statement function dummy argument "used" for /warn=unused.
- Correct compiler abort for invalid syntax in a Variable Format
Expression (VFE).
- Correct compiler abort for module procedure with ENTRY.
- Allow full set of F95-permitted intrinsic functions in
specification expressions.
- Correct compiler abort with invalid VFE in FORMAT.
- Correct problem with accessibility of MODULE symbols when two
modules define the symbol but one has marked it PRIVATE.
- Correct compiler abort for certain programs when -i8 and -wsf
specified.
- Correct problem with missing and duplicate alignment warnings.
- Allow repeated NULL() in DATA initialization when variables have
different types.
- Correct spurious "shapes do not conform" error.
- Correct compiler abort for invalid program using wrong component in
ASSOCIATED.
- When -names as_is specified, don't make IMPLICIT case-sensitive.
- Give standards warning for Q exponent letter in floating literals.
- Generate correct code for generic which replaces MIN or MAX.
- Give more reasonable error message when variable used as control
construct name.
- Eliminate spurious message for vector-valued subscript in defined
assignment.
- Give error if INTENT not properly specified for defined assignment.
- Correct internal compiler error for overloaded MAX.
- Eliminate spurious warning for FORALL.
- Give warning when INTENT(IN) argument modified in PURE FUNCTION.
- Eliminate spurious error for valid DATA with array subscript.
- Allow ORDER in RESHAPE to be non-constant.
- Fix compiler abort with RESHAPE.
- Don't give unused warning for TARGET argument used in pointer
assignment.
- Properly distinguish STRUCTUREs with the same name in different
CONTAINed routines.
- Allow NULL() to initialize a pointer to derived type.
- Eliminate incorrect warning for a variable named IF when -omp is specified.
- Don't give unused warning for array constructor implied-DO variable.
- Allow INTRINSIC :: name (new in F95).
- Eliminate spurious standards warning for certain obscure uses of
UNPACK.
- Eliminate compiler abort when transformational intrinsic used
(illegally) in statement function.
- Raise limit of number of items in a FORMAT from 200 to 2048.
- Disallow invalid INTENT keywords.
- Allow CALL of a typed identifier (Compaq Fortran 77 extension).
- Correct problem where USE-associated identifiers aren't seen in
certain cases involving renaming.
- Correctly evaluate CEIL intrinsic when used in a specification
expression.
- Allow SIZE intrinsic to be overloaded.
- Don't issue spurious "function value has not been defined" warning
for case involving ENTRY and RESULT.
- Fix internal compiler error involving defined assignment.
- Fix problem with incorrect CHARACTER initialization values and CHAR
function.
- Disallow array constructor being used to initialize a scalar.
- Allow ALLOCATE/DEALLOCATE of argument to PURE SUBROUTINE.
- Fix problem for certain uses of period separators for derived type
fields.
- Eliminate spurious syntax error for use-associated variable in
NAMELIST.
- Eliminate spurious syntax error for certain uses of variable format
expression in FMT=.
- Allow as an extension the use of a name previously seen in a CALL
statement as an actual argument without an EXTERNAL statement or
explicit interface.
- Eliminate spurious overflow message for MS-style base-2 constant.
- Correct problem with generic routine matching.
- Correct internal compiler error when function return value used in
statement function expression.
1.8.2 Version 5.2 New Features
Version 5.2 supports the following new features:
- The following new features are now supported:
- The f90 compiler now gives "uninitialized variable" warnings at
optimization levels lower than -O4.
- The RTL now has support for handling units *, 5 and 6 as separate
units. Use of this feature requires both RTL and compiler support.
Programs must be compiled with a version of the compiler that
implements this support and linked with or use a shareable RTL that
implements the support. Older existing images will continue to work
with the newer RTL. As a consequence of separating the units, if you
connect unit 6 to a file and then write to unit *, that write
produces output to the console (or stdout device). Previously,
a write to unit * would go to the same file connected to unit 6.
This new behavior is consistent with that of VMS and MS-FPS.
- For F90, a NAMELIST input group can start with either an ampersand
(&) or dollar sign ($) in any column and can be terminated by a slash
(/), an ampersand (&), or a dollar sign ($) in any column.
- The DIGITAL Extended Math Library (DXML) routines are now included
in the Compaq (DIGITAL) Fortran kit.
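The NAMELIST input-group delimiters described in the list above can be tried with a short sketch. It is illustrative only: the program name, the group name params, the unit number 10, and the variables are assumptions. The executable input uses the standard delimiters (a leading & and a closing /); the $ forms are mentioned only in a comment because they are an extension that other compilers may reject.

```fortran
program namelist_sketch
  implicit none
  integer :: n
  real    :: x
  namelist /params/ n, x

  n = 0
  x = 0.0

  ! Write one input group to a scratch file, then read it back.
  ! Standard delimiters are used here: the group starts with & and ends
  ! with /. Per the notes, Compaq Fortran also accepts $ as the opening
  ! character and & or $ as the terminator, in any column.
  open (unit=10, status='scratch', form='formatted')
  write (10, '(a)') ' &params n = 3, x = 1.5 /'
  rewind (10)
  read (10, nml=params)
  close (10)

  print *, 'n = ', n, ' x = ', x
end program namelist_sketch
```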
- The following new f90 command options are now supported:
- -assume gfullpath causes the full source file path to be included in
the debug information. The default is -assume nogfullpath.
- -assume [no]pthreads_lock lets you select the kind of locking used for
an unnamed critical section (when parallel processing is requested with
-mp or -omp). Using the default, -assume nopthreads_lock, provides the
fastest performance by providing a single lock for all unnamed critical
sections (but does not lock out other process threads).
To request more restrictive locking, specify -assume pthreads_lock. This
locks out all other process threads in addition to all critical
sections, which slows application performance.
When using -assume nopthreads_lock (default), enter critical is used with
the _OtsGlobalLock argument. With -assume pthreads_lock, enter critical
is used with the _OtsPthreadLock argument.
- -arch ev6 generates instructions for ev6 processors (21264 chips). This
option permits the compiler to generate any EV6 instruction, including
instructions contained in the BWX (Byte/Word manipulation instructions)
or MAX (Multimedia instructions) extension, square root and
floating-point convert, and count extension. Applications compiled with
this option may incur emulation overhead on ev4, ev5, ev56, and pca56
processors, but will still run correctly.
1.8.3 Version 5.2 Important Information
Some important information to note about this release:
- UNIX Virtual Memory from the Compaq Tru64 UNIX docset
There is a new manual in V4.0D of the docset: "System Configuration and
Tuning". Section 4.7.3 from that book is "Increasing the Available
Address Space".
If your applications are memory-intensive, you may want to increase the
available address space. Increasing the address space will cause only a
small increase in the demand for memory. However, you may not want to
increase the address space if your applications use many forked
processes.
The following attributes determine the available address space for
processes:
vm-maxvas
This attribute controls the maximum amount of virtual address
space available to a process. The default value is 1 GB (1073741824).
For Internet servers, you may want to increase this value to 10 GB.
per-proc-address-space
max-per-proc-address-size
These attributes control the maximum amount of user process
address space, which is the maximum number of valid virtual regions.
The default value for both attributes is 1 GB.
per-proc-stack-size
max-per-proc-stack-size
These attributes control the maximum size of a user process stack.
The default value of the per-proc-stack-size attribute is 2097152
bytes. The default value of the max-per-proc-stack-size attribute is
33554432 bytes. You may need to increase these values if you receive
"cannot grow stack" messages.
per-proc-data-size
max-per-proc-data-size
These attributes control the maximum size of a user process data
segment. The default value of the per-proc-data-size attribute is
134217728 bytes. The default value of the max-per-proc-data-size is 1
GB. You can use the setrlimit function to control the consumption of
system resources by a parent process and its child processes. See
setrlimit(2) for information.
- If you try to link -non_shared a parallel application that uses -mp or
-omp, you must explicitly add -lpset in addition to the libraries f90
links in.
- The -nod command switch is now available to allow symbol definitions
(using -d) to be passed to fpp but not to be passed to the conditional
compilation facility inside the f90 compiler.
- When -arch ev6 is used, the f90 driver will add -qlm_ev6 before -lm on
the cc command so ld will look for the EV6-tuned math library.
- Please note the behavior of NOWAIT reductions: each thread
contributes its part, and proceeds without waiting for the final value
of the reduction variable. The reduction variable's value is undefined
until a synchronization operation has occurred, or the parallel region
is left.
- UNIX v4.0D contains ld options to restrict library searches to
shared and archived libraries. See -no_so, -no_archive, and -so_archive
in the ld(1) man page.
- Use the setld -d option to install the software to another root
directory. Everything in the installation then hangs off that root.
Commands like f90 can be pointed to by PATH, the DECF90 environment
variable can point to where the compiler is, -l can tell f90 where the
RTL is, and the LD_LIBRARY_PATH environment variable can be used to
ensure that the desired versions of shareable libraries are picked up
at run time.
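The NOWAIT reduction behavior noted in the list above can be
illustrated with a minimal sketch. The program, array, and variable
names are hypothetical, and the example assumes free-form source
compiled with -omp; it is an illustration only, not code taken from the
product documentation.

```fortran
PROGRAM nowait_reduction
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 1000
  INTEGER :: i
  REAL :: total
  REAL, DIMENSION(n) :: a

  a = 1.0
  total = 0.0

!$OMP PARALLEL SHARED(a, total) PRIVATE(i)
!$OMP DO REDUCTION(+:total)
  DO i = 1, n
    total = total + a(i)
  END DO
!$OMP END DO NOWAIT
  ! total is undefined here: a thread can reach this point before the
  ! other threads have contributed their partial sums.
!$OMP BARRIER
  ! After the barrier, total holds the complete sum.
!$OMP SINGLE
  PRINT *, 'sum =', total
!$OMP END SINGLE
!$OMP END PARALLEL
END PROGRAM nowait_reduction
```

If the value of the reduction variable is needed immediately after the
loop, the simplest choice is to omit NOWAIT so that the implied barrier
at END DO provides the synchronization.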
1.8.4 Version 5.2 Corrections
From version V5.1-594-3882K to FT1 T5.2-682-4289P, the following
corrections have been made:
- Don't create stack temporary for character operands to ALL except
when absolutely necessary.
- Add -warn argument_checking warning for mismatch between INTEGER
kinds with explicit interface.
- Add -warn argument_checking warning for insufficient arguments.
- Improve display of various diagnostic messages so that the
"pointer" is more appropriate.
- Fix internal compiler error when compiling a -mp or -omp program
with any COMMON or EQUIVALENCED data declared in a PRIVATE,
LASTPRIVATE, FIRSTPRIVATE, or REDUCTION list.
- Fix problem with TRANSFER of CHARACTER items using non-1 substring
offset.
- Don't give use-before-defined warning for pointer structure
assignment.
- Allow LOC(intrinsic_name).
- Allow RECORDs of empty STRUCTUREs.
- Allow repeat counts in FORMATs to be up to 2147483647.
- Always quadword-align EQUIVALENCE groups.
- Prevent internal compiler error with very long list of -D
definitions.
- Correct problem relating to use of an AUTOMATIC array in a parallel
region.
- Allow contained function result to have dimension bounds depend
upon size of one of its array arguments.
- Eliminate inappropriate argument mismatch warning with record
structures when -wsf is specified.
- Add support for -assume gfullpath, which causes the full source file
path to be included in the debug information.
- If -check bounds is in effect, don't optimize implied-DO in I/O as
this can prevent bounds checking from occurring.
- Eliminate inappropriate use-before-defined warnings when passing
array slices.
- Improve generated code when calling routines with INTENT(IN).
- Prevent an output statement (WRITE, etc.) from inhibiting
use-before-defined warnings.
- Improve generated code when calling intrinsic functions.
- -fast or -math_library fast implies -check nopower.
- Fortran 90 interpretation 100 - ASSOCIATED of two zero-sized arrays
always returns .FALSE..
- Eliminate internal compiler error for
LOC(character-parameter-constant).
- Eliminate "text handle table overflow" errors for certain programs
that had very large and complicated single statements (e.g., DATA).
- Allow structure field names which are the same as relational
operators.
- In pointer assignment, where the right-hand-side is a structure
constructor, enforce the standard's requirement that the constructor
expression be an allowable target.
- Allow a module procedure as an actual argument.
- Eliminate inappropriate error about use of PRIVATE type declared
later in the module.
- Eliminate parsing error where a KIND specifier is continued across
multiple source lines.
- Eliminate parsing error involving an assignment to a variable whose
name begins with "PARAMETE".
- When passing an element of a named array constant as an actual
argument, make sure that sequence association works as if it had been a
variable.
- Correct problem with visibility of inherited identifier.
- Eliminate internal compiler error for PARAMETER declaration where
the constant value is an undefined identifier.
- Eliminate internal compiler error involving a statement function
having the same name as another routine in the same compilation.
- Make severity of -warnings declarations diagnostics warning instead
of error.
- Eliminate internal compiler error when all source is
conditionalized away.
- Eliminate internal compiler error for certain programs which use
TRANSFER in a PARAMETER declaration.
- Allow a tab character in a FORMAT.
- Assume INTEGER type for bit constants where required.
- Don't sign extend result of ICHAR in a PARAMETER definition.
- Eliminate internal compiler error for certain programs using
functions with mask arguments.
- Make !DEC$ATTRIBUTES (no space) work in any column in fixed-form.
- Give proper error instead of internal compiler error when QFLOAT
used on platforms that don't support REAL*16.
- Don't consider a DECODE to modify the buffer argument for purposes
of INTENT.
- Eliminate internal compiler error for certain programs when -assume
dummy_aliases is in effect.
- Correct problem with certain programs using STRUCTUREs with %FILL
fields.
- When -real_size 64 is in effect, intrinsics with explicitly REAL*4
or COMPLEX*8 arguments are no longer inappropriately promoted to
REAL*8/COMPLEX*16.
- Do not cause internal compiler error for reference to undefined
user operator.
- Allow use of an array-constructor's implied DO variable in a
specification expression.
- Allow SIZE argument to be omitted to IISHFTC, JISHFTC, KISHFTC.
- Make result type of IBSET, IBCLR, IBITS, etc. be type of the first
argument.
- Allow up to 256 arguments to an intrinsic function (e.g., MAX, MIN)
in a specification expression - the previous limit was 8.
- Give error for passing an array section with vector subscript to
INTENT(INOUT) or INTENT(OUT) argument.
- Fix internal compiler error for use in the length specification
expression for a function LEN(concatenation) where one of the
concatenation arguments is a passed-length argument to the function
being declared.
- Fix internal compiler error for use in the length specification
expression for a function LEN(TRIM(arg)) where arg is a passed-length
argument to the function being declared.
- Treat a negative declared length for a CHARACTER variable as if it
were zero.
- Properly parse "ELSE IFCONSTRUCT" where CONSTRUCT is a construct
name.
- Give an error when an AUTOMATIC variable is DATA initialized.
- Properly propagate (or not) PRIVATE attribute for nested USE.
- Eliminate undeserved argument conformance error in certain cases
involving WHERE masks.
- Ensure that the return kind of ICHAR is "default integer", no
matter what kind that is (due to integer_size switch).
- Fix internal compiler error for type constructor with string
argument for numeric element.
- Fix internal compiler error when an INTERFACE TO block has certain
syntax errors.
- Correctly parse non-standard 'n syntax for REC= in I/O statement
when the I/O list contains a quoted literal.
- Fix problem relating to ONLY and nested USE.
- Make variables whose names begin with $ have implicit INTEGER type.
- Allow $ in the range for IMPLICIT (sorts after Z).
- If a program has multiple USE statements where the module files
cannot be found, give error messages for each of them.
- Allow SIZEOF in EQUIVALENCE array index.
- Fix internal compiler error with certain array initializers
containing an implied DO.
- Accept F95-style reference to MAXVAL, MINVAL, MAXLOC, MINLOC with a
mask as a second non-keyword argument.
- Accept F95-style reference to PRODUCT and SUM with a mask as a
second non-keyword argument.
- Don't give inappropriate alignment warnings for REAL*16 variables
in COMMON.
- Don't give error message for empty FORALL statement body.
- Allow FORALL to be nested 7 deep (previous limit was 3).
- Correctly parse certain complex instances of named FORALL.
- Allow RESULT of ENTRY to have same name as host FUNCTION.
- Demote diagnostic for not using all active combinations of FORALL
index names from error to warning.
- Eliminate inappropriate error for certain uses of intrinsic
functions in a specification expression.
- Eliminate internal compiler error for a peculiar (and erroneous)
case of a USE of a NAMELIST whose group contains a variable inherited
from another module but which isn't visible due to an ONLY list.
- Make OPTIONS /EXTEND_SOURCE persistent across an INCLUDE.
- Add support for defined assignment statement from within a WHERE
statement.
- Allow a function result length to be computed using a field of an
array element, where the array is a derived type passed as a dummy
argument.
- Fix problem with functions returning complex/doublecomplex.
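As an illustration of the F95-style mask references mentioned in the
list above (MAXVAL, MINLOC, SUM, and PRODUCT with a mask as the second
non-keyword argument), here is a small sketch. The program name and
data values are hypothetical.

```fortran
PROGRAM f95_mask_args
  IMPLICIT NONE
  INTEGER :: a(6) = (/ 3, -1, 7, 0, 9, -4 /)

  ! In each call the second positional argument is a LOGICAL mask,
  ! not DIM.
  PRINT *, MAXVAL(a, a < 8)     ! largest element below 8: 7
  PRINT *, MINLOC(a, a > 0)     ! position of smallest positive: 1
  PRINT *, SUM(a, a > 0)        ! sum of the positive elements: 19
  PRINT *, PRODUCT(a, a < 0)    ! product of the negative elements: 4
END PROGRAM f95_mask_args
```

The same calls can be written with keywords, for example
MAXVAL(a, MASK=a < 8), which is the form that strict Fortran 90
requires when DIM is omitted.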
From version FT1 T5.2-682-4289P to FT2 T5.2-695-428AU, the following
corrections have been made:
- Allow an ALLOCATABLE variable to be PRIVATE in a parallel scope.
- Support ISHC for INTEGER*8.
- Correct problem with overlapping CHARACTER assignment in FORALL.
- Correct debug information for CHARACTER POINTERs.
- Correct problems with ISHFTC which can cause alignment errors.
- Correct problem with FORALL and WHERE with non-default integer size.
- Don't issue spurious UNUSED warning for argument whose interface
comes from a MODULE.
- Fix internal compiler error for invalid IMPLICIT syntax.
- Eliminate inappropriate type mismatch error for certain cases of
references to a generic procedure with a procedure argument.
- Allow use of . field separator in addition to % in
ALLOCATE/DEALLOCATE.
- Give warning of unused variable in module procedure when
appropriate.
- Do not allow a non-integer/logical expression in a logical IF.
- Fix another case of recognizing a RECORD field that has the same
name as a relational operator.
- Correct compiler failure for CMPLX(R8,R8) when real_size=64 is in
effect.
- Allow gaps in keyword names in MAX/MIN, for example MAX(A1=x,A4=y).
- Correct compiler failure when a COMPLEX array is initialized with a
REAL array constructor.
- Correct compiler failure when the CHAR intrinsic is used in an
initialization expression.
- Correct compiler failure ("possible out of order or missing USE")
in certain uses of nested MODULEs and ONLY.
- Show correct source pointer for syntax error in declaration.
From version FT2 T5.2-695-428AU to V5.2-705-428BH, the following
corrections have been made:
- The compiler now accepts a new DEFAULT keyword on the !DEC$
ATTRIBUTES directive. This tells the compiler to ignore any compiler
options that change external routine or COMMON block naming or argument
passing conventions, and uses just the other attributes specified (if
any). The options which this affects are -names and -assume underscore.
- Avoid giving a spurious "Inconsistent THREADPRIVATE declaration of
common block" error if one COMMON block has a name which is an initial
substring of another and one of them is named in a THREADPRIVATE
directive.
- Prevent FUSE XREF from dying when !DEC$ ATTRIBUTES is used.
- Add support for -source_listing option. The listing file has the
extension .lis.
- The f66 option now establishes OPEN defaults of STATUS='NEW' and
BLANK='ZERO'.
- Correct compiler failure with RESHAPE and SHAPE used in an
initialization expression.
- Eliminate spurious error when a defined operator is used in a
specification expression.
- Correct compiler failure when undefined user-defined operator is
seen.
- Eliminate spurious error when component of derived type named
constant is used in a context where a constant is required.
- Correct problem with host association and contained procedure.
- Correct compiler failure with WHERE when non-default integer_size
is in effect.
1.9 High Performance Fortran (HPF) Support in Version 5.2
Compaq Fortran (DIGITAL Fortran 90) Version 5.2 supports the entire
High Performance Fortran (HPF) Version 2.0 specification with the
following exceptions:
- Nested FORALL statements
- WHERE statements within FORALL statements
- Passing CYCLIC(N) arguments to EXTRINSIC (HPF_LOCAL) routines. See
Section 1.9.5.3.
- Accessing non-local data (other than arguments) within PURE
functions in FORALL statements
- SORT_UP library procedure
- SORT_DOWN library procedure
In addition, the compiler supports many HPF Version 2.0 approved
extensions including:
- Extrinsic (HPF_LOCAL) routines
- Extrinsic (HPF_SERIAL) routines
- Mapping of derived type components
- Pointers to mapped objects
- Shadow-width declarations
- All HPF_LOCAL_LIBRARY routines (except LOCAL_BLKCNT, LOCAL_LINDEX,
and LOCAL_UINDEX). Other exceptions are the approved extensions to
HPF_LOCAL_LIBRARY routines.
- ON directive within INDEPENDENT loops
- RESIDENT directive used with INDEPENDENT loops
1.9.1 Optimization
This section contains release notes relevant to increasing code
performance. You should also refer to Chapter 7 of the DIGITAL High
Performance Fortran 90 HPF and PSE Manual for more detail.
1.9.1.1 The -fast Compile-Time Option
To get optimal performance from the compiler, use the -fast option if
possible.
Use of the -fast option is not permitted in certain cases, such as
programs with zero-sized data objects or with very small
nearest-neighbor arrays.
For More Information:
- On the cases where use of -fast is not permitted, see the
"Optimizing" and "Compiling" chapters of the DIGITAL High Performance
Fortran 90 HPF and PSE Manual.
1.9.1.2 Non-Parallel Execution of Code
The following constructs are not handled in parallel:
- Reductions with non-constant DIM argument.
- CSHIFT, EOSHIFT and SPREAD with non-constant DIM argument.
- Some array-constructors
- PACK, UNPACK, RESHAPE
- xxx_PREFIX, xxx_SUFFIX, GRADE_UP, GRADE_DOWN
- In the current implementation of Compaq Fortran 95/90, all I/O
operations are serialized through a single processor; see Chapter 7 of
the DIGITAL High Performance Fortran 90 HPF and PSE Manual for more
details.
- Date and time intrinsics, including DATE_AND_TIME, SYSTEM_CLOCK,
DATE, IDATE, TIME, and SECNDS
If an expression contains a non-parallel construct, the entire
statement containing the expression is executed in a nonparallel
fashion. The use of such constructs can cause degradation of
performance. Compaq recommends avoiding the use of constructs to which
the above conditions apply in the computationally intensive kernel of a
routine or program.
1.9.1.3 INDEPENDENT DO Loops Currently Parallelized
Not all INDEPENDENT DO loops are currently parallelized. It is
important to use the -show hpf or -show hpf_indep compile-time option,
which will give a message whenever a loop marked INDEPENDENT is not
parallelized.
Currently, a nest of INDEPENDENT DO loops is parallelized whenever the
following conditions are met:
- When INDEPENDENT DO loops are nested, the NEW keyword must be used
to assert that all loop variables (except the outer loop variable) are
NEW. It is recommended that the outer DO loop variable be in the NEW
list, as well.
- The loop does not contain any of the constructs listed in
Section 1.9.1.2 that cause non-parallel execution.
- Each subscript of each array reference must either
- contain no references to INDEPENDENT DO loop variables, or
- contain one reference to an INDEPENDENT DO loop variable and the
subscript expression is an affine function of that DO loop variable.
- At least one array reference must reference all the independent
loops in a nest of independent loops.
- The compiler must be able to prove that the loop nest either
- requires no inter-processor communication, or
- can be made to require no inter-processor communication with
compiler-generated copyin/copyout code around the loop nest.
- Any reductions in an interior (i.e. any but the outer) loop may use
an INDEPENDENT DO index as a subscript only if that index represents a
serially distributed dimension of the array. An exception to this is
the index of the outermost DO loop, which may be used as a subscript
even if it represents a non-serially distributed array dimension.
- There must not be any assignments to scalars, except for NEW or
reduction variables.
- Any procedure call inside an INDEPENDENT DO loop must either be
PURE, or be encapsulated in an ON HOME RESIDENT region (see
Section 1.9.5.6).
When the entire loop nest is encapsulated in an ON HOME RESIDENT
region, then only the first two restrictions apply.
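The conditions above can be sketched with a small loop nest that is
intended to satisfy them. This is a hedged illustration: the program
name, array names, and the BLOCK,BLOCK mapping are assumptions, and
whether a given loop is actually parallelized should be confirmed with
-show hpf_indep.

```fortran
PROGRAM indep_nest
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 512
  REAL, DIMENSION(n,n) :: A, B
  INTEGER :: i, j
!HPF$ DISTRIBUTE A(BLOCK,BLOCK)
!HPF$ ALIGN B(:,:) WITH A(:,:)

  B = 1.0

  ! Both loop variables are listed as NEW on the outer loop.
!HPF$ INDEPENDENT, NEW(i,j)
  DO i = 1, n
!HPF$ INDEPENDENT
    DO j = 1, n
      ! Each subscript is an affine function of exactly one loop
      ! variable, and A(i,j) references every loop in the nest.
      ! B is aligned with A, so no communication is needed.
      A(i,j) = 2.0 * B(i,j)
    END DO
  END DO

  PRINT *, A(1,1), A(n,n)
END PROGRAM indep_nest
```

The loop body contains no scalar assignments, no procedure calls, and
none of the non-parallel constructs from Section 1.9.1.2.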
For More Information:
- On enclosing INDEPENDENT DO loops in an ON HOME RESIDENT region,
see Section 1.9.5.6
1.9.1.4 Nearest-Neighbor Optimization
The following is a list of conditions that must be satisfied in an
array assignment, FORALL statement, or INDEPENDENT DO loop in order to
take advantage of the nearest-neighbor optimization:
- Relevant arrays with the POINTER or TARGET attributes must have
shadow edges explicitly declared with the SHADOW directive.
- The arrays involved in the nearest-neighbor style assignment
statements should not be module variables or variables assigned by USE
association. However, if both the actual and all associated dummies are
assigned a shadow-edge width with the SHADOW directive, this
restriction is lifted.
- A value must be specified for the -wsf option on the command line.
- Some interprocessor communication must be necessary in the
statement.
- Corresponding dimensions of an array must be distributed in the
same way (though they can be offset using an ALIGN directive). If the
-nearest_neighbor flag's optional nn field is used to specify a maximum
shadow-edge width, only constructs with a subscript difference
(adjusted for any ALIGN offset) less than or equal to the value
specified by nn will be recognized as nearest neighbor. For example,
the assignment statement (FORALL (i=1:n) A(i) = B(i-3)) has a subscript
difference of 3. In a program compiled with the flag
-nearest_neighbor 2, this assignment statement would not be eligible
for the nearest neighbor optimization.
- The left-hand side array must be distributed BLOCK in at least one
dimension.
- The arrays must not have complicated subscripts (no vector-valued
subscripts, and any subscripts containing a FORALL index must be affine
functions of one FORALL index; further, that FORALL index must
not be repeated in any other subscript of a particular array reference).
- Statements with scalar subscripts are eligible only if that array
dimension is (effectively) mapped serially.
- Subscript triplet strides must be known at compile time and be
greater than 0.
- The arrays must be distributed BLOCK or serial (*) in each
dimension.
Compile with the -show hpf or -show hpf_nearest switch to see which
lines are treated as nearest-neighbor.
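As a rough sketch of a statement that is intended to meet the
conditions listed above, consider the following four-point stencil. The
program name, array names, and sizes are hypothetical; the example
assumes the program is compiled with a value given for -wsf, and
-show hpf_nearest should be used to confirm how the compiler actually
treats the statement.

```fortran
PROGRAM stencil
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 256
  REAL, DIMENSION(n,n) :: A, B
  INTEGER :: i, j
!HPF$ DISTRIBUTE A(BLOCK,BLOCK)
!HPF$ ALIGN B(:,:) WITH A(:,:)

  A = 0.0
  B = 1.0

  ! Every subscript is an affine function of a single FORALL index,
  ! no index is repeated within one array reference, and the largest
  ! subscript difference is 1.
  FORALL (i = 2:n-1, j = 2:n-1)                                  &
    A(i,j) = 0.25 * (B(i-1,j) + B(i+1,j) + B(i,j-1) + B(i,j+1))

  PRINT *, A(2,2)
END PROGRAM stencil
```

Because both arrays are distributed BLOCK in each dimension and the
references to B reach into neighboring blocks, some interprocessor
communication is necessary, which is one of the listed requirements.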
Nearest-neighbor communications are not profiled by the pprof
profiler. See the section about the pprof Profile Analysis Tool in the
Parallel Software Environment (PSE) Version 1.6 release notes.
For More Information:
- On profiling nearest-neighbor computations, see the section about
the pprof Profile Analysis Tool in the Parallel Software Environment
(PSE) Version 1.6 release notes.
- On using EOSHIFT for nearest-neighbor computations, see
Section 1.9.1.6.
1.9.1.5 Widths Given with the SHADOW Directive Agree with Automatically Generated Widths
When compiler-determined shadow widths don't agree with the widths
given with the SHADOW directive, less efficient code will usually be
generated.
To avoid this problem, create a version of your program without the
SHADOW directive, and compile with the -show hpf or -show hpf_near
option. The compiler will generate messages that include the sizes of
the compiler-determined shadow widths. Make sure that any widths you
specify with the SHADOW directive match the compiler-generated widths.
1.9.1.6 Using EOSHIFT Intrinsic for Nearest Neighbor Calculations
In the current compiler version, the compiler does not always recognize
nearest-neighbor calculations coded using EOSHIFT. Also, EOSHIFT is
sometimes converted into a series of statements, only some of which may
be eligible for the nearest neighbor optimization.
To avoid these problems, Compaq recommends using CSHIFT or FORALL
instead of EOSHIFT if these alternatives meet the needs of your program.
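One possible way to follow this recommendation is sketched below: an
end-off shift by one element is rewritten as a FORALL statement plus an
explicit boundary assignment. The program and array names are
hypothetical, and the sketch assumes the default boundary value of
zero for a REAL array.

```fortran
PROGRAM eoshift_alt
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 8
  REAL, DIMENSION(n) :: A1, A2, B
  INTEGER :: i
!HPF$ DISTRIBUTE B(BLOCK)
!HPF$ ALIGN A1(:) WITH B(:)
!HPF$ ALIGN A2(:) WITH B(:)

  B = (/ (REAL(i), i = 1, n) /)

  ! Original form: end-off shift, last element filled with zero.
  A1 = EOSHIFT(B, SHIFT=1)

  ! Alternative form: FORALL for the interior, then the boundary.
  FORALL (i = 1:n-1) A2(i) = B(i+1)
  A2(n) = 0.0

  PRINT *, ALL(A1 == A2)   ! prints T
END PROGRAM eoshift_alt
```

CSHIFT is a drop-in alternative only when the wrapped-around element is
acceptable or is overwritten afterward, since it fills the last element
with B(1) rather than zero.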
1.9.2 New Features
This section describes the new HPF features in this release of Compaq
Fortran.
1.9.2.1 RANDOM_NUMBER Executes in Parallel
The RANDOM_NUMBER intrinsic subroutine now executes in parallel for
mapped data. The result is a significant decrease in execution time.
1.9.2.2 Improved Performance of TRANSPOSE Intrinsic
The TRANSPOSE intrinsic will execute faster for most arrays that are
mapped either * or BLOCK in all dimensions.
1.9.2.3 Improved Performance of DO Loops Marked as INDEPENDENT
Certain induction variables are now recognized as affine functions of
the INDEPENDENT DO loop indices, thus meeting the requirements listed
in Section 1.9.1.3. Now, the compiler can parallelize array references
containing such variables as subscripts. An example is next.
! Compiler now recognizes a loop as INDEPENDENT because it
! knows that variable k1 is k+1.
      PROGRAM gauss
      INTEGER, PARAMETER :: n = 1024
      REAL, DIMENSION (n,n) :: A
!HPF$ DISTRIBUTE A(*,CYCLIC)
      DO k = 1, n-1
        k1 = k+1
!HPF$ INDEPENDENT, NEW(i)
        DO j = k1, n
          DO i = k1, n
            A(i,j) = A(i,j) - A(i,k) * A(k,j)
          ENDDO
        ENDDO
      ENDDO
      END PROGRAM gauss