As indicated several times earlier, many of the details of mixed language programming are implementation dependent. This section deals in turn with each type of hardware that Starlink possesses. Given that programs can be written in a portable way, you may wonder whether you need to know about the implementation specific details at all. You do when debugging programs, since the debugger works on the output of the macros that hide the implementation specific details from the programmer.
There is some duplication between the following subsections, one for each type of operating system, particularly in the examples. This has been done so that each section can be read separately from any other.
A Sun computer is based on a 32 bit architecture. Data can be addressed in multiples of 1, 2, 4, 8 or 16 bytes, a byte being 8 bits. References to FORTRAN and C in this subsection refer to the Sun FORTRAN and ANSI C compilers.
There is a simple correspondence between Sun FORTRAN and C numeric variable types. The standard types are given in the upper part of Table 1 and non-standard extensions in the lower part. The extensions should generally be avoided for reasons of portability; however, they are provided since HDS (see SUN/92) has corresponding data types.
type | Sun FORTRAN | Sun C
--------------------------------------------------
INTEGER | INTEGER | int
REAL | REAL | float
DOUBLE | DOUBLE PRECISION | double
LOGICAL | LOGICAL | int
 | CHARACTER*1 | char
CHARACTER | CHARACTER*n | char[n]
--------------------------------------------------
BYTE | BYTE | signed char
WORD | INTEGER*2 | short int
UBYTE | | unsigned char
UWORD | | unsigned short int
POINTER | INTEGER | unsigned int
Although C defines the unsigned data types unsigned char (range 0 to 255), unsigned short (range 0 to 65535) and unsigned int (range 0 to 4294967295), there are no corresponding unsigned data types in FORTRAN. There is also a C type called long int; however, on Suns this is the same as an int.
The C language does not specify whether variables of type char should be stored as signed or unsigned values. On Suns, they are stored as signed values in the range -128 to 127.
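The ranges quoted above can be checked on any particular machine with the standard header limits.h. The following is a minimal sketch; the function names are illustrative only and are not part of any Starlink library.

```c
#include <limits.h>

/* Largest value of each unsigned type on the machine compiling this code. */
unsigned long max_unsigned_char(void)  { return UCHAR_MAX; }
unsigned long max_unsigned_short(void) { return USHRT_MAX; }
unsigned long max_unsigned_int(void)   { return UINT_MAX; }

/* Returns 1 if plain char is a signed type on this machine, 0 otherwise. */
int plain_char_is_signed(void) { return CHAR_MIN < 0; }
```

Calling these functions on a machine of interest shows at once whether the assumptions made in this subsection hold there.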
Similarly, there is no C data type that corresponds to the FORTRAN data type COMPLEX. However, since Sun FORTRAN passes all numeric variables by reference, a COMPLEX variable could be passed to a C subprogram, where it might be handled as a structure consisting of two variables of type float.
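As a sketch of how a COMPLEX variable might be handled in C, the following assumes that the real part is stored first and the imaginary part second. The structure name fcomplex and the routine name conjg1_ are hypothetical, and the trailing underscore follows the Sun external name convention described in this subsection.

```c
/* Hypothetical C view of a FORTRAN COMPLEX value: real part first,
   then imaginary part, each stored as a float. */
struct fcomplex {
    float re;
    float im;
};

/* Would be called from FORTRAN as: CALL CONJG1( Z ), where Z is COMPLEX.
   FORTRAN passes Z by reference, so the argument is a pointer. */
void conjg1_(struct fcomplex *z)
{
    z->im = -z->im;   /* replace Z by its complex conjugate */
}
```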
A Sun FORTRAN LOGICAL value can be passed to a C int. Sun FORTRAN and C both use zero to represent a false value and anything else to represent a true value, so there is no problem with converting the data values.
The Sun FORTRAN compiler appends an underscore character to all external names that it generates. This applies to the names of subroutines, functions, labelled common blocks and block data subprograms.
To understand how to pass arguments between Sun FORTRAN and C programs, it is necessary to understand the possible methods that the operating system can use for passing arguments and how each language makes use of them. There are three ways that an actual argument may be passed to a subroutine: by value, by reference or by descriptor. What is actually passed as an argument should always be a four byte word. The differences arise in how that word is interpreted.
Sun FORTRAN passes all data types other than CHARACTER by reference, i.e. the address of the variable or array is put in the argument list. CHARACTER variables are passed by a mixture of reference and value. The argument list contains the address of the character variable being passed, but there is also an extra argument added at the end of the argument list for each character variable. This gives the actual length of the FORTRAN CHARACTER variable and so this datum is being passed by value. These extra arguments are hidden from the FORTRAN programmer, but must be explicitly included in any C routines.
C uses call by value to pass all variables, constants (except string constants), expressions, array elements, structures and unions that are actual arguments of functions. It uses call by reference to pass whole arrays, string constants and functions. C never uses call by descriptor as a default.
To pass a C variable of type double by value requires the use of two longwords in the argument list. Similarly, if a C structure is passed by value, then the number of bytes that it takes up in the argument list can be large. This is a dangerous practice and all structures should be passed by reference. Since, by default, Sun FORTRAN does not pass variables by value anyway, this should not give rise to any problems.
In Sun FORTRAN, the default argument passing mechanism can be overridden by use of the %VAL and %REF functions. These functions are not portable and should be avoided whenever possible. The %DESCR function provided in VAX FORTRAN is not provided on a Sun. In C there is no similar way of “cheating” as there is in FORTRAN; however, this is not necessary as the language allows more flexibility itself. For example, if you wish to pass a variable named x by reference rather than by value, you simply put &x as the actual argument instead of x.
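The difference between the two mechanisms can be seen in pure C. In this minimal sketch the function names by_value and by_reference are purely illustrative.

```c
/* Receives a copy of the caller's value; the caller's variable is untouched. */
void by_value(int x)
{
    x = x + 1;
    (void) x;   /* the modified copy is discarded on return */
}

/* Receives the address of the caller's variable and can modify it. This is
   the form that a routine called from FORTRAN must take. */
void by_reference(int *x)
{
    *x = *x + 1;
}
```

A caller would write by_value(x) in the first case and by_reference(&x) in the second.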
Since C provides more flexibility in the mechanism of passing arguments than does FORTRAN, it is C that ought to shoulder the burden of handling the different mechanisms. All numeric variables and constants, array elements, whole arrays and function names should be passed into and out of C functions by reference. Numeric expressions will be passed from FORTRAN to C by reference and so the corresponding dummy argument in the C function should be declared to be of type “pointer to type”. When C has a constant or an expression as an actual argument in a function call, it can only pass it by value. Sun FORTRAN cannot cope with this and so in a C program, all expressions should be assigned to variables before being passed to a FORTRAN routine.
Here are some examples to illustrate these points.
The C function name requires the underscore as the FORTRAN compiler generates this automatically.
In this first example, a Sun FORTRAN program passes an INTEGER and a REAL variable to a C function. The values of these arguments are then assigned to two local variables. They could just as well have been used directly in the function by referring to the variables *a and *b instead of assigning their values to the local variables x and y. Since the FORTRAN program passes the actual arguments by reference, the dummy arguments used in the declaration of the C function should be pointers to the variables that are being passed.
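A minimal sketch consistent with this description might look as follows. The routine name c1_, the calling program shown in the comment and the text that is printed are assumptions made for illustration.

```c
#include <stdio.h>

/* The calling FORTRAN program might be:
 *
 *       PROGRAM FORT1
 *       INTEGER A
 *       REAL B
 *       A = 1
 *       B = 2.0
 *       CALL C1( A, B )
 *       END
 */

/* FORTRAN passes A and B by reference, so both dummy arguments are
   pointers. The trailing underscore matches the external name that the
   Sun FORTRAN compiler generates for C1. */
void c1_(int *a, float *b)
{
    int x;
    float y;

    x = *a;   /* copy the values into local variables */
    y = *b;
    printf("Here is an integer %d and a real %f\n", x, y);
}
```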
Now an example of calling a Sun FORTRAN subroutine from C.
The C main function declares and initializes a variable, i, and declares a function fort2_ (note the underscore). It calls fort2_, passing the address of the variable i rather than its value, as this is what the FORTRAN subroutine will be expecting.
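The calling sequence can be sketched as below. So that the sketch links without a FORTRAN compiler, fort2_ is given here as a C stand-in for the FORTRAN subroutine, and the caller is a function named call_fort2 rather than main. Both choices, and the body of the subroutine, are assumptions made for illustration.

```c
/* Stand-in for the FORTRAN subroutine. The real routine might be:
 *
 *       SUBROUTINE FORT2( I )
 *       INTEGER I
 *       I = I + 1
 *       END
 *
 * in which case the C code would contain only the declaration
 *       extern void fort2_( int *i );
 */
void fort2_(int *i)
{
    *i = *i + 1;
}

/* C caller: the address of i is passed, because FORTRAN expects a
   reference and not a value. */
int call_fort2(void)
{
    int i = 2;

    fort2_(&i);
    return i;
}
```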
As we have seen, the case of scalar numeric arguments is fairly straightforward. However, the passing of character variables between Sun FORTRAN and C is more complicated. Sun FORTRAN passes character variables by passing the address of the character variable and then adding an extra value to the argument list that is the size of the character variable. Furthermore, FORTRAN deals with fixed-length, blank-padded strings, whereas C deals with variable-length, null-terminated strings. The simplest possible example of a character argument is given here as an illustration. Don’t worry if it looks complicated: the F77 macros described in Section 5 hide all of these details from the programmer, and in a portable manner as well!
The second variable declaration in the C subprogram declares a local variable to be a string and initializes it. This string is then copied to the storage area that the subprogram argument points to, taking care not to copy more characters than the argument has room for. Finally any remaining space in the argument is filled with blanks, the null character being overwritten. You should always fill any trailing space with blanks in this way.
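A minimal sketch of such a C subprogram follows. The routine name c3_ and the text of the string are assumptions, and memcpy is used for the copy so that no null character is ever written into the FORTRAN variable.

```c
#include <string.h>

/* Would be called from FORTRAN as: CALL C3( FSTRING ), with FSTRING of
   type CHARACTER*n. The hidden length n arrives by value as an extra
   trailing argument. */
void c3_(char *fstring, int fstring_length)
{
    int i;
    const char *string = "This is a string";
    int n = (int) strlen(string);

    /* Copy no more characters than the FORTRAN variable can hold. */
    if (n > fstring_length) n = fstring_length;
    memcpy(fstring, string, (size_t) n);

    /* Blank-pad the remainder; FORTRAN strings carry no null terminator. */
    for (i = n; i < fstring_length; i++)
        fstring[i] = ' ';
}
```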
The way that the return value of a function is handled is very much like a simple assignment statement. The value is actually returned in one or two of the registers of the CPU, depending on the size of the data type. Consequently there is no problem in handling the value of any function that returns a numerical value, as long as the storage used by the value being returned and the value expected correspond (see Table 1).
The case of a function that returns a character string is more complex. The way that Sun FORTRAN returns a character variable as a function value is to add two hidden extra entries to the beginning of the argument list. These are a pointer to a character variable and the value of the length of this variable. If a C function wishes to emulate a FORTRAN CHARACTER function, then you must explicitly add these two extra arguments to the C function. Any value that the C function returns will be ignored. Here is an example to illustrate this.
The C function copies some asterisks into the location that Sun FORTRAN will interpret as the return value of the FORTRAN CHARACTER function. The number of such asterisks is specified by the single argument of the FORTRAN function and the rest of the string is filled with blanks.
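A C function of this kind might be sketched as follows. The name stars_ is hypothetical, and the clamping of the count to the available length is a precaution added here.

```c
/* Emulates a FORTRAN function declared as CHARACTER*(*) STARS, called as
   STRING = STARS( N ). Sun FORTRAN prepends two hidden arguments: the
   address of the result and its length. Nothing is returned through the
   C return value, since it would be ignored. */
void stars_(char *result, int result_length, int *n)
{
    int i;
    int count = *n;

    /* Never write outside the storage that FORTRAN has provided. */
    if (count < 0) count = 0;
    if (count > result_length) count = result_length;

    for (i = 0; i < count; i++)
        result[i] = '*';
    for (; i < result_length; i++)
        result[i] = ' ';   /* blank-pad the rest of the result */
}
```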
Although FORTRAN and C use different methods for representing global data, it is actually very easy to mix them. If a Sun FORTRAN common block contains a single variable or array, then the corresponding C variable simply needs to be declared as extern and the two variables will use the same storage.
C external variable:
Note that the name of the C variable corresponds to the name of the FORTRAN common block, not the name of the FORTRAN variable. This example shows that you can use the same storage area for both Sun FORTRAN and C strings. However, you must still beware of the different ways in which FORTRAN and C handle the end of a string.
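The single-variable case can be sketched as follows, assuming a common block named BLOCK that holds one CHARACTER*(10) variable. These names and the helper fill_block are hypothetical. The array is defined rather than merely declared so that the sketch links without FORTRAN.

```c
/* FORTRAN side:
 *
 *       CHARACTER*(10) STRING
 *       COMMON /BLOCK/ STRING
 */

/* In a real mixed language program this would be only a declaration,
 *       extern char block_[10];
 * with the storage supplied by FORTRAN. */
char block_[10];

/* Fill the common block the FORTRAN way: blank padded, no terminating null. */
void fill_block(char c)
{
    int i;

    block_[0] = c;
    for (i = 1; i < (int) sizeof block_; i++)
        block_[i] = ' ';
}
```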
If the FORTRAN common block contains more than one variable or array, then the C variables must be contained in a structure.
If you wish to access the Sun FORTRAN blank common block, then the corresponding C structure should be called _BLNK__.
C external variable:
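A common block holding several variables might be mapped onto a C structure as in this sketch. The block name PARAMS, its contents and the function scaled_count are hypothetical, and the structure is defined rather than merely declared so that the sketch links without FORTRAN.

```c
/* FORTRAN side:
 *
 *       INTEGER COUNT
 *       REAL SCALE
 *       COMMON /PARAMS/ COUNT, SCALE
 */

/* The members must appear in the same order, and with the same types,
   as the variables in the common block. */
struct params {
    int   count;
    float scale;
};

/* With FORTRAN supplying the storage this would be declared as
 *       extern struct params params_;
 * For the blank common block the external name would be _BLNK__. */
struct params params_;

/* Example of C code reading the shared storage. */
float scaled_count(void)
{
    return (float) params_.count * params_.scale;
}
```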
This section applies to Alpha OSF/1, Ultrix/RISC and possibly other DEC Unix systems.
The machine specific details relating to mixed language programming are almost identical to those for the Sun and so the previous subsection should be consulted for more details. This is not to say that there are no differences between the DECstation and Sun compilers, merely that they do not generally impinge on the question of mixed language programming.
One place where the DEC system may differ from the Sun is in how logical values are handled. The original FORTRAN compiler for the DECstation (FORTRAN for RISC) used the Sun interpretation of logical values, i.e. zero is false, non-zero is true. The more recent DEC FORTRAN compiler uses the VMS convention that only checks the lowest bit of a value, so 0 is false, 1 is true, 2 is false, 3 is true, etc. When DEC FORTRAN sets a LOGICAL variable to TRUE, all the bits in the data are set to 1, resulting in a numerical equivalent value of -1. Unfortunately this means that the correct value of the macros F77_ISFALSE and F77_ISTRUE used in a C function depends on which FORTRAN compiler you are using. It is not possible to handle this automatically, so you must be sure to use the right values for the macros. The default assumption is that you are using the newer DEC FORTRAN compiler. Fortunately this is unlikely to be a problem in practice, since a TRUE value will normally be 1 or -1, and these values will be handled correctly by either compiler.
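The two conventions can be compared directly in C. The functions below are illustrative only and are not the definitions of F77_ISTRUE and F77_ISFALSE. They assume the usual two's complement representation of integers.

```c
/* Non-zero is true: the Sun and FORTRAN for RISC convention. */
int true_if_nonzero(int value)
{
    return value != 0;
}

/* Only the lowest bit is tested: the VMS and DEC FORTRAN convention. */
int true_if_low_bit_set(int value)
{
    return (value & 1) != 0;
}
```

The two functions agree for 0, 1 and -1 and disagree for a value such as 2, which is why a TRUE value of 1 or -1 is safe with either compiler.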
The DEC Alpha machines can use addresses up to 64 bits long, but where FORTRAN INTEGERs are used to hold an address, only 32 bits can be held. However, the linker has flags -T and -D which can be used to ensure that allocated memory addresses will fit into 32 bits. The user generally does not have to worry about these, as they are inserted automatically if the relevant Starlink library link script (e.g. hds_link) is used.
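Whether a given address survives storage in 32 bits can be tested in C. This is a minimal sketch that assumes a compiler providing the header stdint.h, which is more recent than ANSI C. The function name is hypothetical.

```c
#include <stdint.h>

/* Returns 1 if the address can be stored in a 32 bit FORTRAN INTEGER,
   treated as a bit pattern, without losing information; 0 otherwise. */
int address_fits_in_32_bits(const void *p)
{
    uintptr_t address = (uintptr_t) p;

    /* Truncate to 32 bits, widen again and compare with the original. */
    return address == (uintptr_t) (uint32_t) address;
}
```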
A VAX computer is based on a 32 bit architecture. Data can be addressed as bytes (8 bits), words (16 bits), longwords (32 bits), quadwords (64 bits) or octawords (128 bits). The terminology is a hangover from the PDP-11 series of computers and the basic unit of storage on a VAX is the longword. References to FORTRAN and C in this subsection refer to the VAX FORTRAN and VAX C compilers produced by DEC.
There is a simple correspondence between VAX FORTRAN and VAX C numeric variable types. The standard types are given in the upper part of Table 2 and non-standard extensions in the lower part. The extensions should generally be avoided for reasons of portability. However, they are provided since HDS (see SUN/92) has corresponding data types.
type | VAX FORTRAN | VAX C
--------------------------------------------------
INTEGER | INTEGER | int
REAL | REAL | float
DOUBLE | DOUBLE PRECISION | double
LOGICAL | LOGICAL | int
 | CHARACTER*1 | char
CHARACTER | CHARACTER*n | char[n]
--------------------------------------------------
BYTE | BYTE | char
WORD | INTEGER*2 | short int
UBYTE | | unsigned char
UWORD | | unsigned short int
POINTER | INTEGER | unsigned int
Although VAX C defines the unsigned data types unsigned char (range 0 to 255), unsigned short (range 0 to 65535) and unsigned int (range 0 to 4294967295), there are no corresponding unsigned data types in FORTRAN. There is also a C type called long int; however, in VAX C this is the same as an int.
The C language does not specify whether variables of type char should be stored as signed or unsigned values. On VMS, they are stored as signed values in the range -128 to 127.
Similarly, there is no C data type that corresponds to the FORTRAN data type COMPLEX. However, since VAX FORTRAN passes all numeric variables by reference, a COMPLEX variable could be passed to a VAX C subprogram, where it might be handled as a structure consisting of two variables of type float.
A VAX FORTRAN LOGICAL value can be passed to a VAX C int, but care must be taken over the interpretation of the value since VAX FORTRAN only considers the lower bit of the longword to be significant (0 is false, 1 is true) whereas VAX C treats any numerical value other than 0 as true. When VAX FORTRAN sets a logical value to true, it sets all the bits. This corresponds to a numerical value of minus one.
To understand how to pass arguments between VAX FORTRAN and VAX C programs, it is necessary to understand the possible methods that VMS can use for passing arguments and how each language makes use of them. VMS defines a procedure calling standard that is used by all compilers written by DEC for the VMS operating system. This is described in the “Introduction to the VMS Run-Time Library” manual with additional information in the “Introduction to VMS System Services” manual. If you have a third party compiler that does not conform to this standard then you will not be able to mix the object code that it produces with that from DEC compilers. There are three ways that an actual argument may be passed to a subroutine: by value, by reference or by descriptor. What is actually passed as an argument should always be a longword. The differences arise in how that longword is interpreted. Note the word “should” in the statement that the argument should always be a longword. VAX C will occasionally generate an argument that is longer than one longword. This is a violation of the VAX procedure calling standard. It causes no problems for pure VAX C programs, but is a potential source of problems for mixed language programs.
VAX FORTRAN passes all data types other than CHARACTER by reference, i.e. the address of the variable or array is put in the argument list. CHARACTER variables are passed by descriptor. The descriptor contains the type and class of descriptor, the length of the string and the address where the characters are actually stored.
VAX C uses call by value to pass all variables, constants (except string constants), expressions, array elements, structures and unions that are actual arguments of functions. It uses call by reference to pass whole arrays, string constants and functions. VAX C never uses call by descriptor as a default method of passing arguments.
To pass a VAX C variable of type double by value requires the use of two longwords in the argument list and so is a violation of the VAX procedure calling standard. The passing of a VAX C structure that is bigger than one longword is a similar violation. It is always better to pass C structures by reference, although this should not be a problem in practice since in the case of a pure VAX C program, everything is handled consistently and in the case of a mixture of FORTRAN and C, you would not normally pass variables by value anyway.
In VAX FORTRAN, the default argument passing mechanism can be overridden by use of the %VAL, %REF and %DESCR functions. These functions are not portable and should be avoided whenever possible. The only exception is that %VAL is used in Starlink software for passing pointer variables. In VAX C there is no similar way of “cheating” as there is in VAX FORTRAN; however, this is not necessary as the language allows more flexibility itself. For example, if you wish to pass a variable named x by reference rather than by value, you simply put &x as the actual argument instead of x. To pass something by descriptor, you need to construct the appropriate structure and pass the address of that. See the DEC manual “Guide to VAX C” for further details.
Since C provides more flexibility in the mechanism of passing arguments than does FORTRAN, it is C that ought to shoulder the burden of handling the different mechanisms. All numeric variables and constants, array elements, whole arrays and function names should be passed into and out of C functions by reference. Numeric expressions will be passed from VAX FORTRAN to VAX C by reference and so the corresponding dummy argument in the C function should be declared to be of type “pointer to type”. When C has a constant or an expression as an actual argument in a function call, it can only pass it by value. VAX FORTRAN cannot cope with this and so in a VAX C program, all expressions should be assigned to variables before being passed to a FORTRAN routine.
Here are some examples to illustrate these points.
In this first example, a FORTRAN program passes an INTEGER and a REAL variable to a C function. The values of these arguments are then assigned to two local variables. They could just as well have been used directly in the function by referring to the variables *a and *b instead of assigning their values to the local variables x and y. Since the VAX FORTRAN program passes the actual arguments by reference, the dummy arguments used in the declaration of the VAX C function should be pointers to the variables that are being passed.
Now an example of calling a VAX FORTRAN subroutine from VAX C.
The VAX C main function declares and initializes a variable, i, and declares a function fort2. It calls fort2, passing the address of the variable i rather than its value, as this is what the VAX FORTRAN subroutine will be expecting.
As we have seen, the case of scalar numeric arguments is fairly straightforward. However, the passing of CHARACTER variables between VAX FORTRAN and VAX C is more complicated. VAX FORTRAN passes CHARACTER variables by descriptor and VAX C must handle these descriptors. Furthermore, there is the point that FORTRAN deals with fixed-length, blank-padded strings, whereas C deals with variable-length, null-terminated strings. It is also worth noting that VAX/VMS machines handle CHARACTER arguments in a manner which is different from the usual Unix way. The simplest possible example of a CHARACTER argument is given here in all of its gory detail. You will be pleased to discover that this example is purely for illustration. The important point is that it is different from the Sun example and, anyway, the F77 macros described in Section 5 hide all of these differences from the programmer, thereby making the code portable.
The second variable declaration in the C subprogram declares a local variable to be a string and initializes it. This string is then copied to the storage area that the subprogram argument points to, taking care not to copy more characters than the argument has room for. Finally any remaining space in the argument is filled with blanks, the null character being overwritten. You should always fill any trailing space with blanks in this way. What should definitely not be done is to modify the descriptor to indicate the number of non blank characters that it now holds. The VAX FORTRAN compiler will not expect this to happen and it is likely to cause run-time errors. See the DEC manual “Guide to VAX C” for more details of handling descriptors in VAX C.
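The handling of a string descriptor can be sketched in portable C as follows. The structure string_descriptor is a hypothetical stand-in, with simplified member names, for the descriptor structure that VAX C declares in its own header file. The routine name c3 and the text of the string are also assumptions.

```c
#include <string.h>

/* Hypothetical stand-in for a fixed-length string descriptor. */
struct string_descriptor {
    unsigned short length;    /* length of the string in bytes */
    unsigned char  dtype;     /* data type code */
    unsigned char  dclass;    /* descriptor class code */
    char          *pointer;   /* address of the first character */
};

/* Would be called from VAX FORTRAN as: CALL C3( FSTRING ). The single
   argument is the address of the descriptor for FSTRING. */
void c3(struct string_descriptor *fstring)
{
    int i;
    const char *string = "This is a string";
    int n = (int) strlen(string);

    /* Copy no more characters than the FORTRAN variable can hold. */
    if (n > fstring->length) n = fstring->length;
    memcpy(fstring->pointer, string, (size_t) n);

    /* Blank-pad the rest; the descriptor itself is left unmodified. */
    for (i = n; i < fstring->length; i++)
        fstring->pointer[i] = ' ';
}
```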
If an actual argument in a VAX FORTRAN routine is an array of characters, rather than just a single character variable, the descriptor that describes the data is different. It is defined by the macro dsc$descriptor_a instead of dsc$descriptor_s. This contains extra information about the number of dimensions and their bounds; however, this can generally be ignored since the first part of the dsc$descriptor_a descriptor is the same as the dsc$descriptor_s descriptor. This extra information can be unpacked from the descriptor, but to do so would lead to non-portable code. It is generally better to use the address of the array that is passed in the descriptor and to pass any array dimensions as separate arguments. The C subroutine then has all of the information that it requires and can handle the data as an array or by using pointers, as the programmer sees fit. See example 4 for an illustration of this.
The way that the return value of a function is handled is very much like a simple assignment statement. In practice, the value is actually returned in one or two of the registers of the CPU, depending on the size of the data type. Consequently there is no problem in handling the value of any function that returns a numerical value, as long as the storage used by the value being returned and the value expected correspond (see Table 2). If a VAX C function is treated as a LOGICAL function by VAX FORTRAN, there is no problem as long as the VAX C function ensures that it returns a value that will be interpreted correctly. The best thing to do is to make sure that the C function can only return zero (for false) or minus one (for true).
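A C function written to this rule might look like the following sketch. The function name ispos and its purpose are hypothetical.

```c
/* Would be declared in VAX FORTRAN as: LOGICAL ISPOS, and used as
   ISPOS( VALUE ). Returns -1 (all bits set) for true and 0 for false, so
   the result is read correctly whether the caller tests only the lowest
   bit or tests for a non-zero value. */
int ispos(int *value)
{
    return (*value > 0) ? -1 : 0;
}
```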
The case of a function that returns a character string is more complex. The way that VAX FORTRAN returns a CHARACTER variable as a function value is to add a hidden extra entry to the beginning of the argument list. This is a pointer to a character descriptor. If a VAX C function wishes to return a function value that VAX FORTRAN will interpret as a character string, then you must explicitly add an extra argument to the VAX C function and build the appropriate structure in your C function. This may seem rather complicated, but what it boils down to is that the following two segments of VAX FORTRAN are equivalent (but only in VAX FORTRAN).
CHARACTER
. It is left as an exercise for the
reader to demonstrate that the above assertion is true using just FORTRAN.
Although FORTRAN and C use different methods for representing global data, it is actually very easy to mix them. If a VAX FORTRAN common block contains a single variable or array, then the corresponding VAX C variable simply needs to be declared as extern and the two variables will use the same storage.
C external variable:
Note that the name of the C variable corresponds to the name of the FORTRAN common block, not the name of the FORTRAN variable. This example shows that you can use the same storage area for both VAX FORTRAN and VAX C strings. However, you must still beware of the different way in which FORTRAN and C handle the end of a string.
If the FORTRAN common block contains more than one variable or array, then the C variables must be contained in a structure.
If you wish to access the VAX FORTRAN blank common block, then the corresponding VAX C structure should be called $BLANK.
C external variable:
The F77 macros have been designed to cope with other systems as far as is possible. It should be possible to modify the include file f77.h to cope with most computers. The places where this may prove difficult, or even impossible, are likely to be due to arguments being passed in an unforeseen way.
The include file also declares the functions used for handling character strings. The declarations are written as function prototypes and assume that the C compiler will handle this feature of ANSI C. If a particular C compiler does not support this feature, then the header file could easily be modified to take this into account.