### 6 WRITING PORTABLE PROGRAMS

One of the key reasons for having a Starlink programming standard is to promote software portability. What is meant by this term, and why is it important?

#### 6.1 Meaning of Portability

Other things being equal, it is clearly desirable for applications to be usable on different computers rather than be limited to just one type. Equally clearly, there may be a tradeoff between the extra trouble of ensuring that application code is highly machine-independent and the work of modifying or rewriting programs from time to time. The Starlink Application Programming Standard recognizes this tradeoff and allows the programmer to choose what degree of portability is appropriate, taking into account:

• the type of software;
• its life expectancy;
• who will support it long-term; and
• the extent to which the programmer is prepared to rely on infrastructure software provided by others.

To put the recommendations of the Starlink Standard in context, consider four degrees of portability, called here Absolute Portability, Portable Fortran, Adaptable Fortran and Laissez-Faire.

ABSOLUTE PORTABILITY is where application source code compiles and runs on all types of computer without any alterations whatsoever. Once the programmer has completed work on an application, the code need never again be touched. To achieve this result, the programmer must be fully insulated from the facilities offered by the platform. Because the Fortran 77 standard does not include all those things which applications need to do, and in any case compilers vary in their compliance with and interpretation of the standard or have bugs, it is not possible to rely on pure standard Fortran. (Similar arguments apply to other languages.) The classic solution is to write applications in a private programming language, and to accommodate differences between computers, compilers and operating systems by providing different versions of the language interpreter software. This approach has the benefit that for any new platform, once a new version of the system software has been written, unlimited quantities of application code will run. However, having to use the system’s own programming language provokes scepticism among users, introduces extra training needs, produces code which can only run within the system, and reduces the convenience and effectiveness of online source code debuggers. These drawbacks led Starlink to reject this approach.

PORTABLE FORTRAN, Starlink’s recommendation, is to write applications in an industry-standard language – Fortran 77 – with controlled use of certain platform-dependent features as sanctioned by Sections 2 and 4 of the present document. Significant departures from standard Fortran (for example the use of %VAL) should be present in only a small minority of modules, with most routines in de facto standard Fortran. These departures can, if and when necessary, easily be edited using simple preprocessors like forconv or even by hand. Furthermore, the programs can be understood and modified by non-specialists.

ADAPTABLE FORTRAN, also embraced by the Starlink Standard, differs from Portable Fortran in the degree to which departures from ANSI Fortran are tolerated. While gratuitous use of platform-specific features is frowned upon, it is accepted that some use of such features will be convenient and relatively harmless. Programs of this general level of portability are easy to write and to adapt manually for new platforms as required.

LAISSEZ-FAIRE programming is where programmers can use whatever the current machine’s Fortran compiler accepts – the objective is simply to have a program that works. If a new computer is introduced, authors can decide whether to adapt, rewrite or scrap their applications. This style of programming lies outside the Starlink Standard and is deprecated for anything more than a casual one-off.

The Absolute Portability and Portable Fortran categories presuppose substantial quantities of infrastructure software, libraries and utilities which leave the programmer free to concentrate on the application itself rather than worrying about user interfaces, error handling, input/output and so on. At the lowest levels within the infrastructure there is a small platform-specific kernel, which has to be rewritten for each new machine. The Adaptable Fortran and Laissez-Faire categories allow programmers to provide their own infrastructure if they wish.

Starlink’s recommendation is to base programs on the various standard tools and libraries, and to aim for the PORTABLE FORTRAN level in applications code.

#### 6.2 Why Portability Matters

Despite the fact that for its first decade Starlink supported just one platform – VAX/VMS – the importance of avoiding platform-specific software has been stressed from the beginning. There are two main reasons for this. Firstly, there were and are collaborating astronomical institutions using non-Starlink types of computer – Data General, Fujitsu, Perkin-Elmer, CDC, Cray and various Unix platforms – and it is useful if they can run Starlink applications, and programs written by Starlink users. The second reason is to enable existing software to run on different sorts of Starlink equipment – currently Sun and DECstation as well as VAX/VMS. Quite apart from further Unix-based platforms, it is also possible that fast specialized processors will be added to the existing Starlink systems, and there is interest in using various types of Personal Computer. Attention to software portability – which means resisting the temptation to use Sun and DECstation features now just as much as avoiding VAX dependency in the past – means great benefits in the long term.

#### 6.3 Achieving Portability

Programmers who have followed the recommendations given earlier in Section 2 are likely to encounter fewer difficulties in adapting their code to run on new and multiple platforms than programmers who have not. Of those recommendations, the key one is to work only with ANSI Standard Fortran 77 and only to use a VAX extension or an extension available on some other platform when it is essential, or safe, to do so. This advice should be borne in mind when reading the following notes, many of which refer to problems that afflict code where non-standard Fortran extensions have been used. The notes concentrate on the specific problem of adapting VAX/VMS applications to run on Unix platforms but also serve to illustrate more general portability issues.

• ANSI FORTRAN 77 STANDARD:   There are a number of violations of the Fortran standard that are allowed by the VMS compiler that will cause problems on a UNIX system; some are rejected by the compilers but others cannot be detected at compile time and will cause programs to fail.
  • Overlapping character substrings:   The character strings on either side of an assignment statement must not overlap. If not detected at compile time, such an overlap will produce incorrect results on Unix platforms.
  • Illegal string concatenation:   The concatenation operator // cannot be used in circumstances that would require the allocation of arbitrary amounts of dynamic memory at run-time. For example, a CHARACTER*(*) dummy argument of a subroutine cannot be concatenated with another character string in the argument list of a subroutine or function call. (The ANSI standard puts it thus: a passed-length character dummy argument may only be the operand of a concatenation operator within an assignment statement.)
  • Mixing character and numeric data in a COMMON block:   Separate COMMON blocks are required for character data on the one hand, and numeric and logical data on the other. (Similarly, it is illegal to EQUIVALENCE character data with anything else.)
• INPUT/OUTPUT:   The Fortran I/O system is not tightly enough specified to avoid problems with different implementations:
  • Most compilers have their own set of non-standard I/O keywords, especially in OPEN statements. If use of such keywords is unavoidable they must only appear in explicitly platform-dependent routines, not in the middle of large programs.
  • Compilers vary in their tolerance of illegal combinations of keywords, which must be avoided.
  • I/O unit numbers:   On Unix platforms the I/O unit numbers 0, 5 and 6 refer to the standard error, input and output channels respectively and cannot be used for anything else. Furthermore, only those unit numbers can usefully be used for reading from and writing to the terminal; other logical units are buffered in a way that is inappropriate for terminal I/O.
  • Version numbers:   The Unix file system does not have file versions; opening a file with STATUS='NEW' when the file already exists will either destroy the contents of the file or fail, depending on the system.
  • READONLY:   On the DECstation, the non-standard keyword READONLY is required in order to open a file that you do not have write access to. The Sun compiler issues a warning message if READONLY is used.
  • RECORDTYPE:   On Unix platforms, opening an existing file with the non-standard keyword RECORDTYPE='FIXED' requires that the RECL keyword is used as well because, unlike on VMS, the file system does not store the record length in the file header.
  • Unformatted direct-access files:   In the OPEN statement for direct-access files the Fortran standard requires the record length to be specified, by means of the RECL keyword. In the case of a formatted file, the length is in characters; however, the Fortran standard does not specify the units of length for an unformatted file. For unformatted files the Sun uses bytes, whereas the VAX and DECstation use numeric storage units (the space required to store a REAL or INTEGER value).
  • Printer control codes:   In the OPEN statement, CARRIAGECONTROL='LIST' is non-standard and is not supported by some platforms. There is no machine-independent way of specifying whether a text file contains Fortran printer control codes or not, and the effect of typing out text files produced by Fortran programs or of reading such files into a text editor cannot be predicted. This problem must be handled through per-platform code variations or by using per-platform utilities for processing the files (for example the fpr command on the DECstation and Sun).
  • Prompt strings:   There is no portable way of suppressing CR/LF after a message has been output, though the VAX, DECstation and Sun all use the non-standard ‘$’ edit descriptor. Provision must be made for per-platform variations at this point in an application.
  • Status:   The I/O status values returned by OPEN, CLOSE, READ and WRITE are non-portable and application code should avoid using them in anything more than a general way (or should use the ERR_FIOERR routine – see SUN/104). Unfortunately, it is not even possible to map the numbers from the different platforms onto a single adopted set of values since the conditions that each platform reports as errors are different.
  • End-of-File:   A READ or WRITE statement which includes the ERR= specifier behaves differently on different platforms when end-of-file is encountered. The condition is treated as an error on the VAX and DECstation but not on the Sun. To comply with the ANSI standard, all platforms return an IOSTAT value of -1.
• DATA STORAGE ALIGNMENT:   The VAX is unusual in imposing no restriction on the addresses of data; many architectures generate a hardware error if, for example, a floating point operand has an odd rather than an even address. Both the Sun and the DECstation do this and although both operating systems handle the error successfully and allow the program to continue, it is at the expense of both a huge execution time overhead and a mysterious message being output. COMMON blocks should therefore be arranged such that the longer data types always appear before shorter data types.
• THE BACKSLASH CHARACTER:   Unix compilers treat the backslash character as an escape character (so that for example \t is translated into a tab character) and to insert a true backslash character the source must have two backslash characters (i.e. \ on VMS must be converted to \\ on Unix). The forconv tool (SUN/111) can be used to insert the extra backslash character when converting source code.

Some compilers have a switch that turns off the special meaning of the backslash character but using this is unwise – see the remarks on compiler switches later.
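
Several of the input/output notes above can be illustrated with a minimal sketch. It assumes nothing beyond Fortran 77 plus IMPLICIT NONE (a common extension); the names RDDEMO, LU, IOS, N and VALUE and the record contents are hypothetical. It uses a unit number other than 0, 5 and 6, restricts OPEN to standard keywords, and interprets IOSTAT only in a general way, by its sign.

```fortran
      PROGRAM RDDEMO
*  Illustrative sketch only: RDDEMO, LU and the record contents are
*  hypothetical and are not part of any Starlink library.
      IMPLICIT NONE

*  Unit number chosen to avoid 0, 5 and 6, which Unix reserves.
      INTEGER LU
      PARAMETER ( LU = 10 )

      INTEGER IOS, N, VALUE

*  Only standard OPEN keywords are used; a scratch file avoids any
*  dependence on file-name syntax or file version numbers.
      OPEN ( UNIT = LU, STATUS = 'SCRATCH', FORM = 'FORMATTED',
     :       ACCESS = 'SEQUENTIAL', IOSTAT = IOS )
      IF ( IOS .NE. 0 ) THEN
         WRITE ( *, * ) 'Unable to open scratch file'
         STOP
      END IF

      WRITE ( LU, '( I6 )' ) 11
      WRITE ( LU, '( I6 )' ) 22
      REWIND ( LU )

*  Read until end-of-file, interpreting IOSTAT by sign only:
*  zero is success, negative is end-of-file, positive is an error.
      N = 0
 10   CONTINUE
      READ ( LU, '( I6 )', IOSTAT = IOS ) VALUE
      IF ( IOS .EQ. 0 ) THEN
         N = N + 1
         GO TO 10
      ELSE IF ( IOS .GT. 0 ) THEN
         WRITE ( *, * ) 'Read error'
      END IF

      WRITE ( *, * ) 'Records read:', N
      CLOSE ( LU )
      END
```

Using IOSTAT= without ERR= or END= keeps the end-of-file test in one place and avoids relying on whether a given platform treats end-of-file as an error.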

• THINGS THAT WORK BY ACCIDENT ON VAX:   There are a number of bugs that can go unnoticed on VMS but will cause programs to fail on other systems:
  • Argument mismatches across subroutine and function calls:   For example a DOUBLE PRECISION argument passed to a subroutine expecting a REAL happens to work on VMS but is a bug.
  • Uninitialized variables:   On a VAX uninitialized variables will be set to zero; on Suns and DECstations they will not (see section 2, rule 36).
  • Missing SAVE statements:   On VMS the values of variables local to a subroutine are retained between calls to the subroutine. On other systems they may not be; on the DECstation, for example, it depends on compiler switches. (See section 2, rule 14.)
• THINGS THAT WORK BY ACCIDENT ON UNIX:   Similarly, there are bugs that go undetected on many Unix systems which will cause problems on VAX/VMS. For example, if a CHARACTER*n argument is not declared as such in a subprogram, it is often possible to get away with this on Unix systems but not on VMS.
• UNUSED VARIABLES:   The Unix compilers always complain about declared but unused variables. (See section 2, rule 32.)
• INCLUDES:   INCLUDE statements unavoidably involve the syntax of file names, which makes it difficult, if not impossible, to write source code that runs on many machines with absolutely no change. However, the following scheme keeps the changes to a minimum and allows whatever changes are necessary to be automated.
  • On the VAX, call your INCLUDE files xxxx.for, where xxxx is some name of your own choosing.
  • Include them with a statement like:
    INCLUDE 'xxxx'
  • Either compile your code in the same directory as the included files are stored or define xxxx as a logical name.
  • On Unix, call the INCLUDE file xxxx (with no file extension) and compile your code in the same directory.

The INCLUDE files that are used when calling Starlink subroutine libraries have logical names defined on the VAXs so that, for example, SAE_PAR.FOR is included with the statement:

INCLUDE 'SAE_PAR'

On Unix the corresponding file is called sae_par and is stored in /star/include along with all the other Starlink INCLUDE files so that the INCLUDE statement must be changed to:

INCLUDE '/star/include/sae_par'

The forconv program described in SUN/111 will accomplish the conversion from VMS to Unix. The reverse operation can be done with a simple edit script.

It is also possible, though not at present the recommended technique, to set up a soft link file pointing to the required INCLUDE file and to specify the name of the soft link file in the INCLUDE statement.
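
The earlier notes on data storage alignment, separate COMMON blocks for character data, uninitialized variables and missing SAVE statements can be sketched together as follows. This is an illustration under stated assumptions, not a prescribed layout: CNTDEM, NEXTID, /DEMNUM/, /DEMCHR/ and all variable names are hypothetical. The two program units share one listing for brevity; code destined for a library would place each in its own source file.

```fortran
      PROGRAM CNTDEM
*  Illustrative sketch only: CNTDEM, NEXTID, /DEMNUM/ and /DEMCHR/
*  are hypothetical names invented for this example.
      IMPLICIT NONE
      INTEGER NEXTID
      EXTERNAL NEXTID

*  Numeric COMMON block: longest data type first, so that each item
*  falls on a suitably aligned address.
      DOUBLE PRECISION SCALE
      INTEGER NPIX
      LOGICAL READY
      COMMON /DEMNUM/ SCALE, NPIX, READY

*  Character data are kept in a separate COMMON block.
      CHARACTER * ( 16 ) TITLE
      COMMON /DEMCHR/ TITLE

      INTEGER I, ID

*  Every variable is given a value before it is used.
      SCALE = 1.0D0
      NPIX = 0
      READY = .TRUE.
      TITLE = 'demo'

      DO 10 I = 1, 3
         ID = NEXTID( )
 10   CONTINUE
      WRITE ( *, * ) 'Last identifier issued:', ID
      END

      INTEGER FUNCTION NEXTID( )
*  Return 1, 2, 3, ... on successive calls.
      IMPLICIT NONE
      INTEGER COUNT

*  SAVE makes retention of COUNT between calls explicit, and DATA
*  gives it a defined initial value.
      SAVE COUNT
      DATA COUNT / 0 /

      COUNT = COUNT + 1
      NEXTID = COUNT
      END
```

Without the SAVE and DATA statements, NEXTID would depend on behaviour that VMS happens to provide but other platforms need not.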

• ONE MODULE PER SOURCE FILE:   Code that is going to be inserted into a subroutine library (a Unix archive) must have just one routine per source file before it is compiled. This is because the Unix equivalent (ar) of the VMS librarian does not split object files into separate modules when it inserts them into a library. The Unix command fsplit will split a Fortran source file into separate files.
• COMPILER SWITCHES:   It is unwise to do anything requiring use of special compiler switches. There are sure to be problems in the future when someone – not the original author – compiles the program, in good faith, without the switch. Examples are the switches that allow code to extend beyond column 72 (see section 2, rule 59) and to disable the special meaning of the backslash character.
• WHEN ALL ELSE FAILS:   Unavoidable per-machine variations can be handled either by using the forconv preprocessor (SUN/111) or by using separate files. Where the latter technique is used, a code identifying the platform should be appended to the name:

| Suffix | Platform |
| --- | --- |
| _vax | VAX/VMS |
| _sun4 | Sun SPARCstation etc. |
| _mips | DECstation etc. |
| _pcm | PC/Microsoft |
| _ind | platform-independent substitute |

and the file extension should identify the language in the normal way:

| Extension | Language |
| --- | --- |
| .f | FORTRAN |
| .c | C |

Thus, different versions of a Fortran routine fsub for, respectively, VAX and Sun, would be fsub_vax.f and fsub_sun4.f. Care must be taken not to exceed 15 characters or ar will truncate the file name.
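
As a hedged illustration of the separate-files technique, the sketch below shows what a platform-independent substitute for a prompt routine might look like. The routine PROMPT, the program ASKER and the file names prompt_ind.f, prompt_vax.f and asker.f are all hypothetical; each unit would live in its own file, and each file name stays within the 15-character limit.

```fortran
*  File prompt_ind.f (hypothetical name): the platform-independent
*  substitute, written in standard Fortran 77 plus IMPLICIT NONE.
      SUBROUTINE PROMPT( TEXT )
      IMPLICIT NONE
      CHARACTER * ( * ) TEXT

*  The cursor moves to a new line after the prompt.  A VAX-specific
*  version in prompt_vax.f could instead use the non-standard '$'
*  edit descriptor, as in '( 1X, A, $ )', to suppress the CR/LF.
      WRITE ( *, '( 1X, A )' ) TEXT
      END

*  File asker.f (hypothetical name): the caller is identical on
*  every platform because it depends only on the PROMPT interface.
      PROGRAM ASKER
      IMPLICIT NONE
      CALL PROMPT( 'Enter a value:' )
      END
```

Confining the variation to one small routine means that only the file chosen at build time differs between platforms, while the calling code is left untouched.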