6 WRITING PORTABLE PROGRAMS
One of the key reasons for having a Starlink programming standard is to promote software portability.
What is meant by this term, and why is it important?
6.1 Meaning of Portability
Other things being equal, it is clearly desirable for applications to be usable on different computers
rather than be limited to just one type. Equally clearly, there may be a tradeoff between the extra
trouble of ensuring that application code is highly machine-independent and the work of modifying
or rewriting programs from time to time. The Starlink Application Programming Standard recognizes
this tradeoff and allows the programmer to choose what degree of portability is appropriate, taking
into account:
- the type of software;
- its life expectancy;
- who will support it long-term; and
- the extent to which the programmer is prepared to rely on infrastructure software
provided by others.
To put the recommendations of the Starlink Standard in context, consider four degrees of portability,
called here Absolute Portability, Portable Fortran, Adaptable Fortran and Laissez-Faire.
ABSOLUTE PORTABILITY is where application source code compiles and runs on all types
of computer without any alterations whatsoever. Once the programmer has completed
work on an application, the code need never again be touched. To achieve this result, the
programmer must be fully insulated from the facilities offered by the platform. Because the
Fortran 77 standard does not include all those things which applications need to do, and in any
case compilers vary in their compliance with and interpretation of the standard or have
bugs, it is not possible to rely on pure standard Fortran. (Similar arguments apply to other
languages.) The classic solution is to write applications in a private programming language,
and to accommodate differences between computers, compilers and operating systems by
providing different versions of the language interpreter software. This approach has the benefit
that for any new platform, once a new version of the system software has been written,
unlimited quantities of application code will run. However, having to use the system’s own
programming language provokes scepticism among users, introduces extra training needs,
produces code which can only run within the system, and reduces the convenience and
effectiveness of online source code debuggers. These drawbacks led Starlink to reject this
approach.
PORTABLE FORTRAN, Starlink’s recommendation, is to write applications in an industry-standard
language – Fortran 77 – with controlled use of certain platform-dependent features as sanctioned by
Sections 2 and 4 of the present document. Significant departures from standard Fortran (for example
the use of %VAL) should be present in only a small minority of modules, with most routines in de facto
standard Fortran. These departures can, if and when necessary, easily be edited using simple
preprocessors like forconv or even by hand. Furthermore, the programs can be understood and
modified by non-specialists.
ADAPTABLE FORTRAN, also embraced by the Starlink Standard, differs from Portable Fortran in the
degree to which departures from ANSI Fortran are tolerated. While gratuitous use of platform-specific
features is frowned upon, it is accepted that some use of such features will be convenient and
relatively harmless. Programs of this general level of portability are easy to write and to adapt
manually for new platforms as required.
LAISSEZ-FAIRE programming is where programmers can use whatever the current machine’s
Fortran compiler accepts – the objective is simply to have a program that works. If a new computer is
introduced, authors can decide whether to adapt, rewrite or scrap their applications. This style of
programming lies outside the Starlink Standard and is deprecated for anything more than a casual
one-off.
The Absolute Portability and Portable Fortran categories presuppose substantial quantities of
infrastructure software, libraries and utilities which leave the programmer free to concentrate on the
application itself rather than worrying about user interfaces, error handling, input/output and so on.
At the lowest levels within the infrastructure there is a small platform-specific kernel, which has to be
rewritten for each new machine. The Adaptable Fortran and Laissez-Faire categories allow
programmers to provide their own infrastructure if they wish.
Starlink’s recommendation is to base programs on the various standard tools and libraries, and to
aim for the PORTABLE FORTRAN level in applications code.
6.2 Why Portability Matters
Despite the fact that for its first decade Starlink supported just one platform – VAX/VMS – the
importance of avoiding platform-specific software has been stressed from the beginning. There are
two main reasons for this. Firstly, there were and are collaborating astronomical institutions using
non-Starlink types of computer – Data General, Fujitsu, Perkin-Elmer, CDC, Cray and various Unix
platforms – and it is useful if they can run Starlink applications, and programs written by Starlink
users. The second reason is to enable existing software to run on different sorts of Starlink equipment
– currently Sun and DECstation as well as VAX/VMS. Quite apart from further Unix-based platforms,
it is also possible that fast specialized processors will be added to the existing Starlink systems,
and there is interest in using various types of Personal Computer. Attention to software
portability – which means resisting the temptation to use Sun and DECstation features now
just as much as avoiding VAX dependency in the past – means great benefits in the long
term.
6.3 Achieving Portability
Programmers who have followed the recommendations given earlier in Section 2 are likely to
encounter fewer difficulties in adapting their code to run on new and multiple platforms than
programmers who have not. Of those recommendations, the key one is to work only with ANSI
Standard Fortran 77 and only to use a VAX extension or an extension available on some other
platform when it is essential, or safe, to do so. This advice should be borne in mind when reading the
following notes, many of which refer to problems that afflict code where non-standard Fortran
extensions have been used. The notes concentrate on the specific problem of adapting VAX/VMS
applications to run on Unix platforms but also serve to illustrate more general portability
issues.
- ANSI FORTRAN 77 STANDARD: The VMS compiler allows a number of violations of the
Fortran standard that will cause problems on a Unix system; some are rejected by the
Unix compilers but others cannot be detected at compile time and will cause programs
to fail.
- Overlapping character substrings: The character strings on either side of an
assignment statement must not overlap. If not detected at compile time, such an
overlap will produce incorrect results on Unix platforms.
- Illegal string concatenation: The concatenation operator // cannot be used in
circumstances that would require the allocation of arbitrary amounts of dynamic
memory at run-time. For example, a CHARACTER*(*) dummy argument of a
subroutine cannot be concatenated with another character string in the argument list
of a subroutine or function call. (The ANSI standard puts it thus: a passed-length
character dummy argument may only be the operand of a concatenation operator
within an assignment statement.)
- Mixing character and numeric data in a COMMON block: Separate COMMON blocks
are required for character data on the one hand, and numeric and logical data on the
other. (Similarly, it is illegal to EQUIVALENCE character data with anything else.)
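The first two restrictions above can be illustrated with a minimal sketch. The program and routine names (CHRDEM, REPORT, SHOW) are hypothetical and are not part of any Starlink library; the point is only that a temporary variable removes the overlap, and that a local buffer keeps the concatenation inside an assignment statement.

```fortran
      PROGRAM CHRDEM
*  Sketch: character handling that stays within ANSI Fortran 77.
      CHARACTER*10 STR
      CHARACTER*10 TMP

*  Shift the string right by one place. STR(2:10) = STR(1:9) would
*  be illegal because the two substrings overlap, so copy through a
*  temporary variable.
      STR = 'ABCDEFGHIJ'
      TMP = STR
      STR(2:10) = TMP(1:9)
      WRITE (*, '(1X, A)') STR

      CALL REPORT('portable')
      END

      SUBROUTINE REPORT(NAME)
*  NAME is a passed-length dummy argument, so it may be concatenated
*  only within an assignment statement. Build the text in a local
*  buffer first, then pass the buffer on.
      CHARACTER*(*) NAME
      CHARACTER*40 BUFFER

      BUFFER = 'Name: ' // NAME
      CALL SHOW(BUFFER)
      END

      SUBROUTINE SHOW(TEXT)
      CHARACTER*(*) TEXT
      WRITE (*, '(1X, A)') TEXT
      END
```

The fixed length of BUFFER (40 here) is an arbitrary choice for the sketch; real code would need a length suited to the longest text expected.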
- INPUT/OUTPUT: The Fortran I/O system is not tightly enough specified to avoid problems
with different implementations:
- Most compilers have their own set of non-standard I/O keywords, especially in
OPEN statements. If use of such keywords is unavoidable they must only appear in
explicitly platform-dependent routines, not in the middle of large programs.
- Compilers vary in their tolerance of illegal combinations of keywords, which must
be avoided.
- I/O unit numbers: On Unix platforms the I/O unit numbers 0, 5 and 6 refer to
the standard error, input and output channels respectively and cannot be used
for anything else. Furthermore, only those unit numbers can usefully be used for
reading from and writing to the terminal; other logical units are buffered in a way
that is inappropriate for terminal I/O.
- Version numbers: The Unix file system does not have file versions; opening a file
with STATUS='NEW' when the file already exists will either destroy the contents of
the file or fail, depending on the system.
- READONLY: On the DECstation, the non-standard keyword READONLY is required
in order to open a file that you do not have write access to. The Sun compiler issues
a warning message if READONLY is used.
- RECORDTYPE: On Unix platforms, opening an existing file with the non-standard
keyword RECORDTYPE='FIXED' requires that the RECL keyword is used as well
because, unlike VMS, the Unix file system does not store the record length in the file
header.
- Unformatted direct-access files: In the OPEN statement for direct-access files the
Fortran standard requires the record length to be specified, by means of the RECL
keyword. In the case of a formatted file, the length is in characters; however, the
Fortran standard does not specify the units of length for an unformatted file. For
unformatted files the Sun uses bytes, whereas the VAX and DECstation use numeric
storage units (the space required to store a REAL or INTEGER value).
- Printer control codes: In the OPEN statement, CARRIAGECONTROL='LIST' is
non-standard and is not supported by some
platforms. There is no machine-independent way of specifying whether a text file
contains Fortran printer control codes or not, and the effect of typing out text files
produced by Fortran programs or of reading such files into a text editor cannot be
predicted. This problem must be handled through per-platform code variations or
by using per-platform utilities for processing the files (for example the fpr command
on the DECstation and Sun).
- Prompt strings: There is no portable way of suppressing CR/LF after a message
has been output, though the VAX, DECstation and Sun all use the non-standard ‘$’
edit descriptor. Provision must be made for per-platform variations at this point in
an application.
- Status: The I/O status values returned by OPEN, CLOSE, READ and WRITE
are non-portable and application code should avoid using them in anything more
than a general way (or should use the ERR_FIOERR routine – see SUN/104).
Unfortunately, it is not even possible to map the numbers from the different
platforms onto a single adopted set of values since the conditions that each platform
reports as errors are different.
- End-of-File: A READ or WRITE statement which includes the ERR= specifier behaves
differently on different platforms when end-of-file is encountered. The condition is
treated as an error on the VAX and DECstation but not on the Sun. To comply with
the ANSI standard, all platforms return a negative IOSTAT value at end-of-file.
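The advice on unit numbers, status values and end-of-file can be sketched as follows. This is an illustration only: the program name IOSDEM and the choice of unit 10 are assumptions, and the code tests IOSTAT purely by sign (zero for success, negative for end-of-file, positive for an error) without interpreting any particular positive value.

```fortran
      PROGRAM IOSDEM
*  Sketch: read numbers from a scratch file, testing IOSTAT only in
*  a general way. Unit 10 is used so as to avoid units 0, 5 and 6.
      INTEGER LUN
      PARAMETER (LUN = 10)
      INTEGER IOS, N, I
      REAL VALUE, TOTAL

      OPEN (UNIT=LUN, STATUS='SCRATCH', FORM='FORMATTED', IOSTAT=IOS)
      IF (IOS .NE. 0) STOP 'Cannot open scratch file'
      DO 10 I = 1, 3
         WRITE (LUN, '(F10.2)') REAL(I)
 10   CONTINUE
      REWIND (LUN)

      N = 0
      TOTAL = 0.0
 20   CONTINUE
      READ (LUN, '(F10.2)', IOSTAT=IOS) VALUE
      IF (IOS .EQ. 0) THEN
*  Success: accumulate the value and read the next record.
         N = N + 1
         TOTAL = TOTAL + VALUE
         GO TO 20
      ELSE IF (IOS .GT. 0) THEN
*  Error: the actual number is platform-specific, so do not test it.
         WRITE (*, '(1X, A)') 'Read error'
      END IF

*  IOS is negative here if the loop ended at end-of-file.
      WRITE (*, '(1X, A, I3, A, F8.2)')
     :   'Read', N, ' values, sum', TOTAL
      CLOSE (LUN)
      END
```

Using IOSTAT alone, rather than ERR= or END= branches, keeps the control flow the same whether or not a platform classes end-of-file as an error.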
- DATA STORAGE ALIGNMENT: The VAX is unusual in imposing no restriction on the
addresses of data; many architectures generate a hardware error if, for example, a floating point
operand has an odd rather than an even address. Both the Sun and the DECstation do this and
although both operating systems handle the error successfully and allow the program to
continue, it is at the expense of both a huge execution time overhead and a mysterious message
being output. COMMON blocks should therefore be arranged such that the longer data types
always appear before shorter data types.
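A minimal sketch of such a layout is given below, using standard data types only. The block names (IMGNUM, IMGCHR) and variable names are hypothetical. The DOUBLE PRECISION item comes first, followed by the REAL, INTEGER and LOGICAL items, and the character data are kept in a separate COMMON block as the ANSI standard requires.

```fortran
      PROGRAM CMNDEM
*  Sketch: COMMON blocks laid out so that no item is misaligned.
*  The block and variable names are hypothetical.
      DOUBLE PRECISION EPOCH
      REAL SCALE
      INTEGER NPIX
      LOGICAL VALID
      CHARACTER*16 TITLE

*  Numeric and logical data: longest type first.
      COMMON /IMGNUM/ EPOCH, SCALE, NPIX, VALID

*  Character data kept in a COMMON block of its own.
      COMMON /IMGCHR/ TITLE

      EPOCH = 1950.0D0
      SCALE = 1.5
      NPIX = 512
      VALID = .TRUE.
      TITLE = 'Test image'
      WRITE (*, '(1X, A, I5, F6.2, F8.1, L2)')
     :   TITLE, NPIX, SCALE, EPOCH, VALID
      END
```

Where non-standard short types are unavoidable, the same rule would place them after all the standard numeric items.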
- THE BACKSLASH CHARACTER: Unix compilers treat the backslash character as an escape
character (so that for example \t is translated into a tab character) and to insert a true backslash
character the source must have two backslash characters (i.e. \ on VMS must be converted to \\
on Unix). The forconv tool (SUN/111) can be used to insert the extra backslash character when
converting source code.
Some compilers have a switch that turns off the special meaning of the backslash character but
using this is unwise – see the remarks on compiler switches later.
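An alternative that is not described in this document, offered only as a sketch, is to avoid typing the backslash in the source at all and to generate it with the CHAR intrinsic. This assumes the ASCII character set, in which the backslash has code 92; the program name BSLDEM and the string contents are hypothetical.

```fortran
      PROGRAM BSLDEM
*  Sketch: obtain a backslash without typing one in the source, so
*  that the text compiles identically whether or not the compiler
*  treats backslash as an escape. Assumes the ASCII character set,
*  in which backslash has code 92.
      CHARACTER*1 BSLASH
      CHARACTER*20 PATH

      BSLASH = CHAR(92)
      PATH = 'dir' // BSLASH // 'file'
      WRITE (*, '(1X, A)') PATH
      END
```

Source written this way needs neither conversion nor a compiler switch, at the cost of depending on the character code.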
- THINGS THAT WORK BY ACCIDENT ON VAX: There are a number of bugs that can go
unnoticed on VMS but will cause programs to fail on other systems:
- Argument mismatches across subroutine and function calls: For example a
DOUBLE PRECISION argument passed to a subroutine expecting a REAL happens
to work on VMS but is a bug.
- Uninitialized variables: On a VAX uninitialized variables will be set to zero; on Suns
and DECstations they will not (see section 2, rule 36).
- Missing SAVE statements: On VMS the values of variables local to a subroutine are
retained between calls to the subroutine. On other systems they may not be; on the
DECstation, for example, it depends on compiler switches. (See section 2, rule 14.)
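The last two points can be illustrated by a routine that counts its own calls. This is a minimal sketch with hypothetical names (SAVDEM, TALLY, NSEEN): the SAVE statement makes the retention of the local variable explicit, and the DATA statement gives it a defined initial value rather than relying on it starting at zero.

```fortran
      PROGRAM SAVDEM
      INTEGER I, NCALL
      DO 10 I = 1, 3
         CALL TALLY(NCALL)
 10   CONTINUE
      WRITE (*, '(1X, A, I2)') 'Calls made:', NCALL
      END

      SUBROUTINE TALLY(N)
*  Return the number of times this routine has been called.
      INTEGER N
      INTEGER NSEEN

*  Without SAVE, NSEEN may be undefined on re-entry; without DATA,
*  its initial value cannot be assumed to be zero.
      SAVE NSEEN
      DATA NSEEN / 0 /

      NSEEN = NSEEN + 1
      N = NSEEN
      END
```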
- THINGS THAT WORK BY ACCIDENT ON UNIX: Similarly, there are bugs that go
undetected on many Unix systems which will cause problems on VAX/VMS. For example, if a
CHARACTER*n argument is not declared as such in a subprogram, it is often possible to get
away with this on Unix systems but not on VMS.
- UNUSED VARIABLES: The Unix compilers always complain about declared but unused
variables. (See section 2, rule 32.)
- INCLUDES: INCLUDE statements, by their very nature, cannot avoid involvement with the
syntax of file names, which makes it difficult if not impossible to write source code that will
run on many machines with absolutely no change. However, the following scheme keeps the
changes to a minimum and allows whatever changes are necessary to be
automated.
- On the VAX call your INCLUDE files xxxx.for where xxxx is some name of your
own choosing.
- Include them with a statement like:
INCLUDE 'xxxx'
- Either compile your code in the directory where the included files are stored or define
xxxx as a logical name.
- On Unix call the INCLUDE file xxxx (with no file extension) and compile your code in the
same directory.
The INCLUDE files that are used when calling Starlink subroutine libraries have logical
names defined on the VAXs so that, for example, SAE_PAR.FOR is included with the
statement:
INCLUDE 'SAE_PAR'
On Unix the corresponding file is called sae_par and is stored in /star/include along with all
the other Starlink INCLUDE files so that the INCLUDE statement must be changed
to:
INCLUDE '/star/include/sae_par'
The forconv program described in SUN/111 will accomplish the conversion from VMS to Unix.
The reverse operation can be done with a simple edit script.
It is also possible, though not at present the recommended technique, to set up a soft link file
pointing to the required INCLUDE file and to specify the name of the soft link file in the
INCLUDE statement.
- ONE MODULE PER SOURCE FILE: Code that is going to be inserted into a subroutine library
(a Unix archive) must have just one routine per source file before it is compiled. This is because
the Unix equivalent (ar) of the VMS librarian does not split object files into separate modules
when it inserts them into a library. The Unix command fsplit will split a Fortran source file into
separate files.
- COMPILER SWITCHES: It is unwise to do anything requiring use of special compiler
switches. There are sure to be problems in the future when someone – not the original author –
compiles the program, in good faith, without the switch. Examples are the switches that allow
code to extend beyond column 72 (see section 2, rule 59) and to disable the special meaning of
the backslash character.
- WHEN ALL ELSE FAILS: Unavoidable per-machine variations can be handled either by
using the forconv preprocessor (SUN/111) or by using separate files. Where the
latter technique is used, a code identifying the platform should be appended to the
name:
    _vax     VAX
    _sun4    Sun SPARCstation etc.
    _ind     platform-independent substitute
and the file extension should identify the language in the normal way (for example, .f for
Fortran).
Thus, different versions of a Fortran routine fsub for, respectively, VAX and Sun, would be
fsub_vax.f and fsub_sun4.f. Care must be taken not to exceed 15 characters or ar will truncate
the file name.
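As a sketch of the separate-files technique, consider the units of RECL for unformatted direct-access files discussed earlier. The function RECLUN and the file name recl_sun4.f are hypothetical, and 4-byte REAL and INTEGER values are assumed. The platform-specific routine is shown together with a caller only to make the example self-contained; in practice each would live in its own source file.

```fortran
*  File recl_sun4.f (hypothetical): Sun version of a small
*  platform-specific routine. A VAX or DECstation version in its
*  own file would return 1 instead, since RECL is counted there in
*  numeric storage units.
      INTEGER FUNCTION RECLUN()
*  Number of RECL units occupied by one REAL or INTEGER value in an
*  unformatted direct-access file. Assumes 4-byte REAL and INTEGER.
      RECLUN = 4
      END

*  Platform-independent caller: each record holds NVAL REAL values.
      PROGRAM DADEMO
      INTEGER NVAL
      PARAMETER (NVAL = 8)
      INTEGER RECLUN
      EXTERNAL RECLUN
      INTEGER I, IOS, LREC
      REAL DATA1(NVAL), DATA2(NVAL)

*  Work out the record length in whatever units the platform uses.
      LREC = NVAL * RECLUN()
      OPEN (UNIT=11, STATUS='SCRATCH', ACCESS='DIRECT',
     :      FORM='UNFORMATTED', RECL=LREC, IOSTAT=IOS)
      IF (IOS .NE. 0) STOP 'Cannot open file'

      DO 10 I = 1, NVAL
         DATA1(I) = REAL(I)
 10   CONTINUE
      WRITE (11, REC=1) DATA1
      READ (11, REC=1) DATA2
      WRITE (*, '(1X, A, F5.1)') 'Last value read back:', DATA2(NVAL)
      CLOSE (11)
      END
```

Only the one-line function differs between platforms, so the application code itself stays at the Portable Fortran level.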