<!--===- docs/FortranForCProgrammers.md

   Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
   See https://llvm.org/LICENSE.txt for license information.
   SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

-->

# Fortran For C Programmers

```eval_rst
.. contents::
   :local:
```

This note is limited to essential information about Fortran so that
a C or C++ programmer can get started more quickly with the language,
at least as a reader, and avoid some common pitfalls when starting
to write or modify Fortran code.
Please see other sources to learn about Fortran's rich history,
current applications, and modern best practices in new code.

## Know This At Least

* There have been many implementations of Fortran, often from competing
  vendors, and the standard language has been defined by U.S. and
  international standards organizations. The various editions of
  the standard are known as the '66, '77, '90, '95, 2003, 2008, and
  (now) 2018 standards.
* Forward compatibility is important. Fortran has outlasted many
  generations of computer systems hardware and software. Standard
  compliance notwithstanding, Fortran programmers generally expect that
  code that has compiled successfully in the past will continue to
  compile and work indefinitely. The standards sometimes designate
  features as being deprecated, obsolescent, or even deleted, but that
  can be read only as discouraging their use in new code -- they'll
  probably always work in any serious implementation.
* Fortran has two source forms, which are typically distinguished by
  filename suffixes. `foo.f` is old-style "fixed-form" source, and
  `foo.f90` is new-style "free-form" source. All language features
  are available in both source forms. Neither form has reserved words
  in the sense that C does. Spaces are not required between tokens
  in fixed form, and case is not significant in either form.
* Variable declarations are optional by default. Variables whose
  names begin with the letters `I` through `N` are implicitly
  `INTEGER`, and others are implicitly `REAL`. These implicit typing
  rules can be changed in the source.
* Fortran uses parentheses in both array references and function calls.
  All arrays must be declared as such; other names followed by parenthesized
  expressions are assumed to be function calls.
* Fortran has a _lot_ of built-in "intrinsic" functions. They are always
  available without a need to declare or import them. Their names reflect
  the implicit typing rules, so you will encounter names that have been
  modified so that they have the right type (e.g., `AIMAG` has a leading `A`
  so that it's `REAL` rather than `INTEGER`).
* The modern language has means for declaring types, data, and subprogram
  interfaces in compiled "modules", as well as legacy mechanisms for
  sharing data and interconnecting subprograms.

## A Rosetta Stone

Fortran's language standard and other documentation use some terminology
in particular ways that might be unfamiliar.

| Fortran | English |
| ------- | ------- |
| Association | Making a name refer to something else |
| Assumed | Some attribute of an argument or interface that is not known until a call is made |
| Companion processor | A C compiler |
| Component | Class member |
| Deferred | Some attribute of a variable that is not known until an allocation or assignment |
| Derived type | C++ class |
| Dummy argument | C++ reference argument |
| Final procedure | C++ destructor |
| Generic | Overloaded function, resolved by actual arguments |
| Host procedure | The subprogram that contains a nested one |
| Implied DO | There's a loop inside a statement |
| Interface | Prototype |
| Internal I/O | `sscanf` and `snprintf` |
| Intrinsic | Built-in type or function |
| Polymorphic | Dynamically typed |
| Processor | Fortran compiler |
| Rank | Number of dimensions that an array has |
| `SAVE` attribute | Statically allocated |
| Type-bound procedure | Kind of a C++ member function but not really |
| Unformatted | Raw binary |

## Data Types

There are five built-in ("intrinsic") types: `INTEGER`, `REAL`, `COMPLEX`,
`LOGICAL`, and `CHARACTER`.
They are parameterized with "kind" values, which should be treated as
non-portable integer codes, although in practice today these are the
byte sizes of the data.
(For `COMPLEX`, the kind type parameter value is the byte size of one of the
two `REAL` components, or half of the total size.)
The legacy `DOUBLE PRECISION` intrinsic type is an alias for a kind of `REAL`
that should be more precise, and bigger, than the default `REAL`.

`COMPLEX` is a simple structure that comprises two `REAL` components.

`CHARACTER` data also have length, which may or may not be known at compilation
time.
`CHARACTER` variables are fixed-length strings and they get padded out
with space characters when not completely assigned.
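
As a small illustration of kinds and of `CHARACTER` padding, here is a minimal sketch.
The program and variable names are made up for this example, and `real64` is
the named constant from the standard `iso_fortran_env` module (Fortran 2008),
used here to avoid hard-coding a kind value.

```fortran
program kinds_and_padding
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  real(kind=real64) :: x        ! a 64-bit REAL, requested without writing "8"
  complex(kind=real64) :: z     ! two 64-bit REAL components
  character(len=8) :: name      ! fixed length: always 8 characters

  x = 1.0_real64
  z = (1.0_real64, -1.0_real64)
  name = 'abc'                  ! stored as 'abc' followed by 5 spaces

  print *, storage_size(x), storage_size(z)   ! sizes in bits
  print '(3a)', '[', name, ']'                ! prints [abc     ]
  print '(3a)', '[', trim(name), ']'          ! prints [abc]
end program kinds_and_padding
```

The `TRIM` intrinsic removes trailing spaces, which is usually what a C
programmer expects when treating the value as a string.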

User-defined ("derived") data types can be synthesized from the intrinsic
types and from previously-defined user types, much like a C `struct`.
Derived types can be parameterized with integer values that either have
to be constant at compilation time ("kind" parameters) or deferred to
execution ("len" parameters).

Derived types can inherit ("extend") from at most one other derived type.
They can have user-defined destructors (`FINAL` procedures).
They can specify default initial values for their components.
With some work, one can also specify a general constructor function,
since Fortran allows a generic interface to have the same name as that
of a derived type.

Last, there are "typeless" binary constants that can be used in a few
situations, like static data initialization or immediate conversion,
where type is not necessary.

## Arrays

Arrays are not types in Fortran.
Being an array is a property of an object or function, not of a type.
Unlike C, one cannot have an array of arrays or an array of pointers,
although one can have an array of a derived type that has arrays or
pointers as components.
Arrays are multidimensional, and the number of dimensions is called
the _rank_ of the array.
In storage, arrays are stored such that the last subscript has the
largest stride in memory, e.g. A(1,1) is followed by A(2,1), not A(1,2).
And yes, the default lower bound on each dimension is 1, not 0.

Expressions can manipulate arrays as multidimensional values, and
the compiler will create the necessary loops.

## Allocatables

Modern Fortran programs use `ALLOCATABLE` data extensively.
Such variables and derived type components are allocated dynamically.
They are automatically deallocated when they go out of scope, much
like C++'s `std::vector<>` class template instances are.
The array bounds, derived type `LEN` parameters, and even the
type of an allocatable can all be deferred to run time.
(If you really want to learn all about modern Fortran, I suggest
that you study everything that can be done with `ALLOCATABLE` data,
and follow up all the references that are made in the documentation
from the description of `ALLOCATABLE` to other topics; it's a feature
that interacts with much of the rest of the language.)

## I/O

Fortran's input/output features are built into the syntax of the language,
rather than being defined by library interfaces as in C and C++.
There are means for raw binary I/O and for "formatted" transfers to
character representations.
There are means for random-access I/O using fixed-size records as well as for
sequential I/O.
One can scan data from or format data into `CHARACTER` variables via
"internal" formatted I/O.
I/O from and to files uses a scheme of integer "unit" numbers that is
similar to the open file descriptors of UNIX; i.e., one opens a file
and assigns it a unit number, then uses that unit number in subsequent
`READ` and `WRITE` statements.

Formatted I/O relies on format specifications to map values to fields of
characters, similar to the format strings used with C's `printf` family
of standard library functions.
These format specifications can appear in `FORMAT` statements and
be referenced by their labels, in character literals directly in I/O
statements, or in character variables.

One can also use compiler-generated formatting in "list-directed" I/O,
in which the compiler derives reasonable default formats based on
data types.

## Subprograms

Fortran has both `FUNCTION` and `SUBROUTINE` subprograms.
They share the same name space, but functions cannot be called as
subroutines or vice versa.
Subroutines are called with the `CALL` statement, while functions are
invoked with function references in expressions.

There is one level of subprogram nesting.
A function, subroutine, or main program can have functions and subroutines
nested within it, but these "internal" procedures cannot themselves have
their own internal procedures.
As is the case with C++ lambda expressions, internal procedures can
reference names from their host subprograms.

## Modules

Modern Fortran has good support for separate compilation and namespace
management.
The *module* is the basic unit of compilation, although independent
subprograms still exist, of course, as well as the main program.
Modules define types, constants, interfaces, and nested
subprograms.

Objects from a module are made available for use in other compilation
units via the `USE` statement, which has options for limiting the objects
that are made available as well as for renaming them.
All references to objects in modules are done with direct names or
aliases that have been added to the local scope, as Fortran has no means
of qualifying references with module names.

## Arguments

Functions and subroutines have "dummy" arguments that are dynamically
associated with actual arguments during calls.
Essentially, all argument passing in Fortran is by reference, not value.
One may restrict access to argument data by declaring that dummy
arguments have `INTENT(IN)`, but that corresponds to the use of
a `const` reference in C++ and does not imply that the data are
copied; use `VALUE` for that.

When it is not possible to pass a reference to an object, or a sparse
regular array section of an object, as an actual argument, Fortran
compilers must allocate temporary space to hold the actual argument
across the call.
This is guaranteed to happen when an actual argument is enclosed
in parentheses.
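
The difference between the default by-reference association and `VALUE` can be
sketched as follows. The module and procedure names (`arg_demo`, `bump`,
`bump_copy`) are invented for this illustration.

```fortran
module arg_demo
  implicit none
contains
  subroutine bump(n)
    integer, intent(inout) :: n   ! associated with the caller's variable
    n = n + 1
  end subroutine bump

  subroutine bump_copy(n)
    integer, value :: n           ! a private copy, as with C pass-by-value
    n = n + 1                     ! changes only the local copy
  end subroutine bump_copy
end module arg_demo

program main
  use arg_demo
  implicit none
  integer :: k
  k = 1
  call bump(k)         ! k becomes 2
  call bump_copy(k)    ! k is still 2
  print *, k
end program main
```

Placing the subprograms in a module gives them explicit interfaces, which the
`VALUE` attribute requires.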

The compiler is free to assume that dummy arguments are not aliased with
each other, or with other data, in any way that would affect the results.
In other words, if some object can be written to under one name, it's
never going to be read or written using some other name in that same
scope.
```
  SUBROUTINE FOO(X,Y,Z)
  X = 3.14159
  Y = 2.1828 ! ASSUMED NOT TO MODIFY X
  Z = 2 * X ! CAN BE FOLDED AT COMPILE TIME
  END
```
This is the opposite of the assumptions under which a C or C++ compiler must
labor when trying to optimize code with pointers.

## Overloading

Fortran supports a form of overloading via its interface feature.
By default, an interface is a means for specifying prototypes for a
set of subroutines and functions.
But when an interface is named, that name becomes a *generic* name
for its specific subprograms, and calls via the generic name are
mapped at compile time to one of the specific subprograms based
on the types, kinds, and ranks of the actual arguments.
A similar feature can be used for generic type-bound procedures.

This feature can be used to overload the built-in operators and some
I/O statements, too.

## Polymorphism

Fortran code can be written to accept data of some derived type or
any extension thereof using `CLASS`, deferring the actual type to
execution, rather than the usual `TYPE` syntax.
This is somewhat similar to the use of `virtual` functions in C++.

Fortran's `SELECT TYPE` construct is used to distinguish between
possible specific types dynamically, when necessary. It's a
little like C++17's `std::visit()` on a discriminated union.

## Pointers

Pointers are objects in Fortran, not data types.
Pointers can point to data, arrays, and subprograms.
A pointer can only point to data that has the `TARGET` attribute.
Outside of the pointer assignment statement (`P=>X`) and some intrinsic
functions and cases with pointer dummy arguments, pointers are implicitly
dereferenced, and the use of their name is a reference to the data to which
they point instead.

Unlike C, a pointer cannot point to a pointer *per se*, nor can it be
used to implement a level of indirection to the management structure of
an allocatable.
If you assign to a Fortran pointer to make it point at another pointer,
you are making the pointer point to the data (if any) to which the other
pointer points.
Similarly, if you assign to a Fortran pointer to make it point to an allocatable,
you are making the pointer point to the current content of the allocatable,
not to the metadata that manages the allocatable.

Unlike allocatables, pointers do not deallocate their data when they go
out of scope.

A legacy feature, "Cray pointers", implements dynamic base addressing of
one variable using an address stored in another.

## Preprocessing

There is no standard preprocessing feature, but every real Fortran implementation
has some support for passing Fortran source code through a variant of
the standard C source preprocessor.
Since Fortran is very different from C at the lexical level (e.g., line
continuations, Hollerith literals, no reserved words, fixed form), using
a stock modern C preprocessor on Fortran source can be difficult.
Preprocessing behavior varies across implementations and one should not depend on
much portability.
Preprocessing is typically requested by the use of a capitalized filename
suffix (e.g., "foo.F90") or a compiler command line option.
(Since the F18 compiler always runs its built-in preprocessing stage,
no special option or filename suffix is required.)
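
A minimal sketch of conditional compilation in a preprocessed free-form
source file might look like this. The file name `demo.F90` and the macro
name `DEBUG` are assumptions made for the example; many compilers accept a
`-D` option to define such a macro, but check your compiler's documentation.

```fortran
! demo.F90 (hypothetical file name; the capitalized suffix typically
! requests preprocessing)
program demo
  implicit none
#ifdef DEBUG
  print *, 'debug build'      ! compiled only when DEBUG is defined
#else
  print *, 'release build'    ! compiled otherwise
#endif
end program demo
```

Keeping directives on their own lines, starting in column 1, avoids most
of the lexical trouble described above.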

## "Object Oriented" Programming

Fortran doesn't have member functions (or subroutines) in the sense
that C++ does, in which a function has immediate access to the members
of a specific instance of a derived type.
But Fortran does have an analog to C++'s `this` via *type-bound
procedures*.
This is a means of binding a particular subprogram name to a derived
type, possibly with aliasing, in such a way that the subprogram can
be called as if it were a component of the type (e.g., `X%F(Y)`)
and receive the object to the left of the `%` as an additional actual argument,
exactly as if the call had been written `F(X,Y)`.
The object is passed as the first argument by default, but that can be
changed; indeed, the same specific subprogram can be used for multiple
type-bound procedures by choosing different dummy arguments to serve as
the passed object.
The equivalent of a `static` member function is also available by saying
that no argument is to be associated with the object via `NOPASS`.

There's a lot more that can be said about type-bound procedures (e.g., how they
support overloading) but this should be enough to get you started with
the most common usage.

## Pitfalls

Variable initializers, e.g. `INTEGER :: J=123`, are _static_ initializers!
They imply that the variable is stored in static storage, not on the stack,
and the initialized value lasts only until the variable is assigned.
One must use an assignment statement to implement a dynamic initializer
that will apply to every fresh instance of the variable.
Be especially careful when using initializers in the newish `BLOCK` construct,
which perpetuates the interpretation as static data.
(Derived type component initializers, however, do work as expected.)
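
The static-initializer pitfall can be seen in a short sketch. The names
`counter_demo`, `next_static`, and `next_fresh` are made up for this example.

```fortran
module counter_demo
  implicit none
contains
  integer function next_static()
    integer :: n = 0      ! static initializer: implies SAVE, applied once
    n = n + 1             ! the value persists across calls
    next_static = n
  end function next_static

  integer function next_fresh()
    integer :: n
    n = 0                 ! assignment: executed on every call
    n = n + 1
    next_fresh = n
  end function next_fresh
end module counter_demo

program main
  use counter_demo
  implicit none
  integer :: i
  do i = 1, 3
    ! next_static() yields 1, 2, 3; next_fresh() yields 1 each time
    print *, next_static(), next_fresh()
  end do
end program main
```

A C programmer would expect both functions to return 1 on every call; only
the second one does.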

If you see an assignment to an array that's never been declared as such,
it's probably a definition of a *statement function*, which is like
a parameterized macro definition, e.g. `A(X)=SQRT(X)**3`.
In the original Fortran language, this was the only means for user
function definitions.
Today, of course, one should use an external or internal function instead.

Fortran expressions don't bind exactly like C's do.
Watch out for exponentiation with `**`, which of course C lacks; it
binds more tightly than negation does (e.g., `-2**2` is -4),
and it binds to the right, unlike what any other Fortran and most
C operators do; e.g., `2**2**3` is 256, not 64.
Logical values must be compared with special logical equivalence
relations (`.EQV.` and `.NEQV.`) rather than the usual equality
operators.

A Fortran compiler is allowed to short-circuit expression evaluation,
but not required to do so.
If one needs to protect a use of an `OPTIONAL` argument or possibly
disassociated pointer, use an `IF` statement, not a logical `.AND.`
operation.
In fact, Fortran can remove function calls from expressions if their
values are not required to determine the value of the expression's
result; e.g., if there is a `PRINT` statement in function `F`, it
may or may not be executed by the assignment statement `X=0*F()`.
(Well, it probably will be, in practice, but compilers always reserve
the right to optimize better.)

Unless they have an explicit suffix (`1.0_8`, `2.0_8`) or a `D`
exponent (`3.0D0`), real literal constants in Fortran have the
default `REAL` type -- *not* `double` as is the case in C and C++.
If you're not careful, you can lose precision at compilation time
from your constant values and never know it.
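
The loss of precision from a default `REAL` literal can be sketched like
this. The program and variable names are made up for the example, and the
behavior described in the comments assumes the usual case in which default
`REAL` is narrower than `DOUBLE PRECISION` and no compiler option has changed
the default kinds.

```fortran
program literal_precision
  implicit none
  double precision :: a, b
  a = 0.1      ! default REAL literal: rounded to default precision, then widened
  b = 0.1d0    ! DOUBLE PRECISION literal: rounded once, at full precision
  print *, a == b    ! typically F, since the two roundings differ
  print *, a - b     ! the error silently introduced by the first literal
end program literal_precision
```

Declaring the variable as `DOUBLE PRECISION` does not help; the literal
itself must carry the kind.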