
Coding Guide

Introduction

This document describes the coding guidelines that should be followed when writing or editing the ORAC community aerosol and cloud code. Comments and suggestions are welcome. The aim is to produce and maintain code in a way that will:

  • maximize readability and comprehensibility;
  • encourage uniformity;
  • simplify maintenance;
  • assist portability;
  • promote reliability and fault tolerance.

Licence agreement

A copy of the ORAC licence agreement can be found here. By downloading and using the ORAC code you are agreeing to:

  1. Acknowledge ORAC in any relevant publications.
  2. Feed back into the code any improvements made/bugs found.

Specific guidelines for writing and editing ORAC community code

Version control

All code should be stored under version control. Instructions on how to use the version control system can be found at SVNGuide and SVNUploading; the official Subversion manual can be found at http://svnbook.red-bean.com/

Template

The following template should be used for all new routines:

!-------------------------------------------------------------------------------
! Name: orac_header_template.F90
!
! Purpose:
!
! Description and Algorithm details:
!
! Arguments:
! Name Type    In/Out/Both Description
! ------------------------------------------------------------------------------
!      integer
!      real
!      struct
!
! History:
! YYYY/MM/DD, Initials: Description which should be
!    indented if it carries over a line
!
! $Id$
!
! Bugs:
! None known
!-------------------------------------------------------------------------------

  • Name: The name should fully describe the function of the routine.
  • Purpose: A very short written description of the purpose of the routine. (Omit for module definitions.)
  • Description and algorithm details: A longer written description of the routine outlining inputs and outputs. (Omit for module definitions.) Preferably, include important equations and/or references to web pages or papers.
  • Arguments: For each argument of the routine, list,
    1. Name
    2. Type
    3. If the variable is an input, output, or both
    4. Description
  • History: history of ALL changes made to code must include:
    1. date of edit
    2. Initials of the editor (see below)
    3. nature of the change
  • ! $Id$: The first time the file is checked into Subversion with the appropriate keyword set, this will be filled in with the version number.
  • Bugs: A short description of anything known not to work, with the date and the name of the person responsible for that piece of work.

The following initials are currently found within the code:

Initials | Name             | Initials | Name             | Initials | Name
AS       | Andy Smith       | AY       | Andy Sayer       | AP       | Adam Povey
CA       | Chris Arnold     | CP       | Caroline Poulsen | GM       | Greg McGarragh
GT       | Gareth Thomas    | KS       | Kevin M. Smith   | MJ       | Matthias Jerg
MS       | Martin Stengel   | OS       | Oliver Sus       | SS       | Stefan Stapelberg
TN       | Tim Nightingale  | WJ       | William Jones    |          |
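
As an illustration of the template above, the sketch below shows how the header might be filled in for a small routine. The routine (scale_radiances), its module, its arguments, and the test program are hypothetical and are not part of ORAC; the date and the initials XX are placeholders.

```fortran
!-------------------------------------------------------------------------------
! Name: scale_radiances.F90
!
! Purpose:
! Multiplies an array of radiances by a constant calibration factor.
!
! Description and Algorithm details:
! Each element of the input array is multiplied by the scalar factor and
! written to the output array: rad_out(i) = factor * rad_in(i).
!
! Arguments:
! Name    Type       In/Out/Both Description
! ------------------------------------------------------------------------------
! rad_in  real array In          Uncalibrated radiances
! factor  real       In          Calibration factor
! rad_out real array Out         Calibrated radiances (same size as rad_in)
!
! History:
! YYYY/MM/DD, XX: Original version (placeholder date and initials).
!
! $Id$
!
! Bugs:
! None known
!-------------------------------------------------------------------------------
module scale_radiances_m   ! hypothetical module, used only for this sketch
   implicit none
contains
   subroutine scale_radiances(rad_in, factor, rad_out)
      real, intent(in)  :: rad_in(:)
      real, intent(in)  :: factor
      real, intent(out) :: rad_out(:)

      ! Whole-array operation; rad_in and rad_out must have the same size
      rad_out = factor * rad_in
   end subroutine scale_radiances
end module scale_radiances_m

program demo_header
   use scale_radiances_m
   implicit none
   real :: rad_in(3), rad_out(3)

   rad_in = [1.0, 2.0, 3.0]
   call scale_radiances(rad_in, 2.0, rad_out)
   write(*,*) 'Calibrated radiances: ', rad_out
end program demo_header
```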

Conventions for file naming

All text files will have the file extension “.txt” to distinguish them.

All binary files will have the file extension “.bin” to distinguish them.

All Fortran source code files will have the file extension “.F90” to distinguish them, except those imported using #include statements, which shall have the extension “.inc”.

Conventions for error handling

[Jan 19 2015: This doesn't describe current practice and will be revised once we have a consistent treatment of error handling.] Functions will indicate their success or failure by means of an integer return value. Any function that can fail should return such a status code. Success will be indicated by STATUS_OK (which should be defined as 0), failure by any other value. If the function simply needs to indicate a failure, the value 1 should be returned. In more complex situations, a range of values may be appropriate.

Only functions that cannot fail (or more accurately, where the likelihood of failure is sufficiently small that it is not checked for) are allowed to return a value that is not a status value, e.g. basic mathematical functions can simply return the result value rather than a status code.
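
The status-code convention described above could look something like the following minimal sketch. Only STATUS_OK (defined as 0) comes from the text; STATUS_FAIL, the module name, and the routine safe_divide are assumptions made for illustration and do not exist in ORAC.

```fortran
module status_example   ! hypothetical module name
   implicit none
   integer, parameter :: STATUS_OK   = 0   ! success, as described in the text
   integer, parameter :: STATUS_FAIL = 1   ! assumed name for the simple failure value
contains
   ! Hypothetical routine that can fail, so it returns a status code and
   ! passes its result back through an argument.
   function safe_divide(numerator, denominator, quotient) result(status)
      real, intent(in)  :: numerator, denominator
      real, intent(out) :: quotient
      integer           :: status

      quotient = 0.0
      if (denominator == 0.0) then
         status = STATUS_FAIL
         return
      end if

      quotient = numerator / denominator
      status = STATUS_OK
   end function safe_divide
end module status_example

program demo_status
   use status_example
   implicit none
   real    :: q
   integer :: status

   ! The caller checks the status before using the result
   status = safe_divide(1.0, 4.0, q)
   if (status /= STATUS_OK) then
      write(*,*) 'ERROR: safe_divide() failed with status ', status
      stop 1
   end if
   write(*,*) 'quotient = ', q
end program demo_status
```

A function that cannot reasonably fail, such as a basic mathematical function, would instead return its result directly.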

Conventions for Error Codes

Generic error codes are defined below.

Name Error code Description
DriverFileOpenErr 1000
DriverFileReadErr 1001
DriverFileNotFound 1002
DriverFileDataErr 1003
AMethInvalid 1004 SPixel averaging method
LimitMethInvalid 1005
SegSizeInvalid 1006 Image segment size
ICFileOpenErr 1010
ICFileReadErr 1011
InstIDInvalid 1012
CtrlDataInvalid 1013
ChanFileOpenErr 1020
ChanFileReadErr 1021
ChanFileDataErr 1022
CCFileOpenErr 1030
CCFileReadErr 1031
CCNClassErr 1032
CCSelectError 1033
CCDefaultError 1034
LUTFileOpenErr 1040
LUTFileReadErr 1041
LUTFileDataErr 1042
MSIFileOpenErr 1050
MSIFileReadHeadErr 1051
MSIFileReadDataErr 1052
MSIFileEOFErr 1053
MSIFileCloseErr 1054
CfFileOpenErr 1060
CfFileReadHeadErr 1061
CfFileReadDataErr 1062
CfFileEOFErr 1063
CfFileCloseErr 1064
LsFileOpenErr 1070
LsFileReadHeadErr 1071
LsFileReadDataErr 1072
LsFileEOFErr 1073
LsFileCloseErr 1070
GeomFileOpenErr 1080
GeomFileReadHeadErr 1081
GeomFileReadDataErr 1082
GeomFileEOFErr 1083
GeomFileCloseErr 1084
IntTransErr 1090
LocFileOpenErr 1100
LocFileReadHeadErr 1101
LocFileReadDataErr 1102
LocFileEOFErr 1103
LocFileCloseErr 1104
LwRTMRTMFileOpenErr 1110
LwRTMRTMInstErr 1111
LwRTMRTMDateErr 1112
LwRTMChanErr 1113
LwRTMReadErr 1114
LwRTMPFileOpenErr 1120
LwRTMProfDateErr 1121
LwRTMProfNLatErr 1122
LwRTMProfNLonErr 1123
LwRTMProfErr 1124
LwRTMProfReadErr 1125
SwRTMRTMFileOpenErr 1130
SwRTMRTMInstErr 1131
SwRTMRTMDateErr 1132
SwRTMChanErr 1133
SwRTMReadErr 1134
SwRTMPFileOpenErr 1135
SwRTMProfDateErr 1136
SwRTMProfNLatErr 1137
SwRTMProfNLonErr 1138
SwRTMProfErr 1139
SwRTMProfReadErr 1129
SPixelCentPix 1140
SPixelAllPix 1141
SPixelCloudPix 1142
SPixelAmeth 1143
SPixelInvalid 1144
SPixelCloudFrac 1145
SPixelGeomSol 1150
SPixelGeomSat 1151
SPixelGeomRel 1152
SPixelSurfglint 1153
BkpFileOpenErr 1160
GetRTMLwMaxLat 1170
GetRTMLwMinLat 1171
GetRTMLwMaxLon 1172
GetRTMLwMinLon 1173
GetSurfaceMeth 1180
GetRsCentPix 1190
GetRsAvMeth 1191
GetLwSwRTMLat 1190
GetLwSwRTMLon 1191
APMethErr 1200
FGMethErr 1201
CloudClassMethErr 1210
XMDADMeth 1220
XSDADMeth 1230
InvCholNotPosDef 1240
OutFileOpenErr 1250
DiagFileWriteErr 1251
AlbFileOpenErr 1270
AlbFileReadHeadErr 1271
AlbFileReadDataErr 1272
AlbFileEOFErr 1273
ScanFileOpenErr 1280
ScanFileReadHeadErr 1281
ScanFileReadDataErr 1282
ScanFileEOFErr 1283
ScanFileCloseErr 1280
LUTIntflagErr 1284
RTMIntflagErr 1285
Spixelillum 1290
CWP_Calcerror 1300
illumFileOpenErr 1310
illumFileReadHeadErr 1311
illumFileReadDataErr 1312
illumFileEOFErr 1313
PrimaryFileOpenErr 1400
SecondaryFileOpenErr 1401
PrimaryFileDefinitionErr 1402
SecondaryFileDefinitionErr 1403
PrimaryFileWriteErr 1404
SecondaryFileWriteErr 1405
PrimaryFileCloseErr 1406
SecondaryFileCloseErr 1407

Conventions for output

Any output (screen or file) is slow, computationally expensive, and wasteful of resources (CPU and disk). All non-error output to the screen should be preceded by if (verbose), where verbose is a logical variable passed to the procedure from the main level.
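
A minimal sketch of this convention is given below. The module, the routine process_segment, and its pixel-count argument are hypothetical; only the use of a logical verbose argument passed down from the main level is taken from the text.

```fortran
module verbose_example   ! hypothetical module name
   implicit none
contains
   ! Hypothetical routine that receives verbose from its caller
   subroutine process_segment(n_pixels, verbose)
      integer, intent(in) :: n_pixels
      logical, intent(in) :: verbose

      ! Error messages are always written
      if (n_pixels < 0) then
         write(*,*) 'ERROR: process_segment(): negative pixel count'
         return
      end if

      ! Informational output is written only when verbose is set
      if (verbose) write(*,*) 'Processing ', n_pixels, ' pixels'
   end subroutine process_segment
end module verbose_example

program demo_verbose
   use verbose_example
   implicit none
   logical :: verbose

   ! verbose is set once at the main level and passed to every procedure
   verbose = .false.
   call process_segment(100, verbose)   ! prints nothing

   verbose = .true.
   call process_segment(100, verbose)   ! prints the progress message
end program demo_verbose
```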

General Optimal Estimation coding guidelines

Developers should keep in mind the principles of the OE and community code and follow the guidelines below:

  • Empirical algorithms and tests (e.g. threshold tests) should not be introduced unless absolutely necessary.
  • Code specific to a single instrument should not be introduced. Code should be introduced for generic application.
  • Coding should be as modular as possible.
  • Optional algorithms should be able to be switched on and off.

Procedure for demonstrating code improvements

See wiki:Testbed.

Procedure for reporting bugs

Bugs should be reported at http://proj.badc.rl.ac.uk/orac/report/1

Channel indexing in the main processor

Channel indexing in ORAC is awkward. This derives from the desire that arrays only be as large as they need to be. In the main processor, you may need to consider:

  1. The channels present in the preprocessor files.
  2. The channels we wish to process out of (1).
  3. The thermal, solar, or mixed channels out of (2).
  4. The channels that contain valid or useful data out of (2).
  5. The thermal, solar, or mixed channels out of (4).
  6. The channels in (4) within an array that stores the channels in (3).
  7. The channels in (5) within an array that stores the channels in (3).

It is important when subscripting an array (in a loop or otherwise) that you use the correct set of indices. In most uses it will make little difference, but when channels are missing or not in order of increasing wavelength, incorrect answers can be produced with no obvious warning. The table below provides a means of checking the appropriate subscript, given the bounds of the do loop you are coding (column) and the dimension of the array you are subscripting (row).

Array dimension (row) / loop bounds (column) | i=1,Ctrl%Ind%Ny | i=1,Ctrl%Ind%NSolar | i=1,Ctrl%Ind%NThermal | i=1,SPixel%Ind%Ny | i=1,SPixel%Ind%NSolar | i=1,SPixel%Ind%NThermal
Ctrl%Ind%Ny | i | Ctrl%Ind%YSolar(i) | Ctrl%Ind%YThermal(i) | SPixel%spixel_y_to_ctrl_y_index(i) | SPixel%spixel_y_solar_to_ctrl_y_index(i) | SPixel%spixel_y_thermal_to_ctrl_y_index(i)
Ctrl%Ind%NSolar | * | i | find_in_array(Ctrl%Ind%YSolar, Ctrl%Ind%YThermal(i)) | SPixel%Ind%YSolar(i) | SPixel%spixel_y_solar_to_ctrl_y_solar_index(i) | find_in_array(Ctrl%Ind%YSolar, SPixel%spixel_y_thermal_to_ctrl_y_index(i))
Ctrl%Ind%NThermal | * | find_in_array(Ctrl%Ind%YThermal, Ctrl%Ind%YSolar(i)) | i | SPixel%Ind%YThermal(i) | find_in_array(Ctrl%Ind%YThermal, SPixel%spixel_y_solar_to_ctrl_y_index(i)) | SPixel%spixel_y_thermal_to_ctrl_y_thermal_index(i)
SPixel%Ind%Ny | | | | i | SPixel%Ind%YSolar(i) | SPixel%Ind%YThermal(i)
SPixel%Ind%NSolar | | | | * | i | find_in_array(SPixel%Ind%YSolar, SPixel%Ind%YThermal(i))
SPixel%Ind%NThermal | | | | * | find_in_array(SPixel%Ind%YThermal, SPixel%Ind%YSolar(i)) | i

  • For example, when looping over 1, Ctrl%Ind%NSolar:
    integer :: i, ii
    real, dimension(1:Ctrl%Ind%Ny)       :: input_from_available_channels
    real, dimension(1:Ctrl%Ind%NThermal) :: input_from_thermal_channels
    real, dimension(1:Ctrl%Ind%NSolar,3) :: output_to_solar_channels
    
    ! Loop over solar channels
    do i=1, Ctrl%Ind%NSolar
       ! Write a value to each solar channel
       output_to_solar_channels(i,1) = 0.1
    
       ! Use YSolar to index the solar channels within all available channels
       output_to_solar_channels(i,2) = input_from_available_channels(Ctrl%Ind%YSolar(i))
    
       ! For mixed channels, you may need to copy information from a thermal array to a solar one.
       ! This requires translating an index.
       ii = find_in_array(Ctrl%Ind%YThermal, Ctrl%Ind%YSolar(i))
       if (ii > 0) output_to_solar_channels(i,3) = input_from_thermal_channels(ii)
    end do
    
    • find_in_array is a function within the Int_Routines_def module. It returns the index of the first element of its first argument that is equal to its second argument.
  • Some care is needed with the SPixel%spixel_y_..._index arrays, as these are of length Ctrl%Ind%Ny rather than the length of the appropriate SPixel array. The following is commonly used in ORAC at the moment:
    integer :: i, ii
    real, dimension(1:Ctrl%Ind%Ny)       :: input_from_available_channels
    real, dimension(1:SPixel%Ind%Ny)     :: output_to_processed_channels
    
    ! Loop over channels to be processed
    do i=1, SPixel%Ind%Ny
       ! Store the appropriate index in a more convenient variable name
       ii = SPixel%spixel_y_to_ctrl_y_index(i)
    
       output_to_processed_channels(i) = input_from_available_channels(ii)
    end do
    
  • Spaces marked with an * in the table are where you loop over more elements than there will (possibly) be in the array. These require a construct such as:
    integer :: i, ii
    real, dimension(1:Ctrl%Ind%Ny)         :: input_from_available_channels
    real, dimension(1:Ctrl%Ind%NThermal,2) :: output_to_thermal_channels
    
    ! Initialise the thermal channel counter
    ii = 1
    
    ! Loop over all available channels
    do i=1, Ctrl%Ind%Ny
       ! Check if this channel is thermal
       if (btest(Ctrl%Ind%Ch_Is(i), ThermalBit)) then
          ! Index thermal arrays with thermal counter, ii
          output_to_thermal_channels(ii,1) = 0.1
    
          ! Index available channels arrays with loop iterator, i
          output_to_thermal_channels(ii,2) = input_from_available_channels(i)
    
          ! Increment thermal channel counter
          ii = ii + 1
       end if
    end do
    
    • Note that the array Ctrl%Ind%Ch_Is is a bit flag storing properties of the available channels. Currently, two bits are used, one of which is the ThermalBit tested above. These are parameters defined in the ECP_constants module.
  • When looping over mixed channels, SPixel%spixel_y_mixed_to_spixel_y_solar_index and SPixel%spixel_y_mixed_to_spixel_y_thermal_index exist as a means to reference solar and thermal arrays.
    integer :: i, ii
    real, dimension(1:SPixel%Ind%NSolar)   :: input_from_solar_channels
    real, dimension(1:SPixel%Ind%NThermal) :: input_from_thermal_channels
    real, dimension(1:SPixel%Ind%NMixed,2) :: output_to_mixed_channels
    
    ! Loop over mixed channels to be processed
    do i=1, SPixel%Ind%NMixed
       ii = SPixel%spixel_y_mixed_to_spixel_y_solar_index(i)
       output_to_mixed_channels(i,1) = input_from_solar_channels(ii)
    
       ii = SPixel%spixel_y_mixed_to_spixel_y_thermal_index(i)
       output_to_mixed_channels(i,2) = input_from_thermal_channels(ii)
    end do
    
  • Blank spaces in the table are conversions not currently required in ORAC.
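
The fragments in the list above can be tried out in isolation with a sketch like the one below. It is not ORAC code: plain local arrays stand in for the Ctrl%Ind structure, find_in_array_sketch is a stand-in for find_in_array that is assumed to return 0 when no match is found (consistent with the ii > 0 test used above), and the name SolarBit and both bit positions are assumptions, since the real parameters are defined in ECP_constants.

```fortran
module channel_index_sketch   ! hypothetical module name
   implicit none
   integer, parameter :: SolarBit   = 0   ! assumed name and bit position
   integer, parameter :: ThermalBit = 1   ! assumed bit position
contains
   ! Stand-in for find_in_array: index of the first element of arr equal
   ! to val, or 0 if there is none (assumed behaviour).
   function find_in_array_sketch(arr, val) result(idx)
      integer, intent(in) :: arr(:)
      integer, intent(in) :: val
      integer             :: idx
      integer             :: i

      idx = 0
      do i = 1, size(arr)
         if (arr(i) == val) then
            idx = i
            return
         end if
      end do
   end function find_in_array_sketch
end module channel_index_sketch

program demo_channel_index
   use channel_index_sketch
   implicit none
   integer, parameter :: ny = 4
   integer :: ch_is(ny), ysolar(ny), ythermal(ny)
   integer :: nsolar, nthermal, i, ii

   ! Channels 1 and 2 are solar, channel 3 is mixed, channel 4 is thermal
   ch_is = 0
   ch_is(1) = ibset(ch_is(1), SolarBit)
   ch_is(2) = ibset(ch_is(2), SolarBit)
   ch_is(3) = ibset(ibset(ch_is(3), SolarBit), ThermalBit)
   ch_is(4) = ibset(ch_is(4), ThermalBit)

   ! Loop over all channels, keeping separate counters for the shorter
   ! solar and thermal lists (the "*" case in the table)
   nsolar   = 0
   nthermal = 0
   do i = 1, ny
      if (btest(ch_is(i), SolarBit)) then
         nsolar = nsolar + 1
         ysolar(nsolar) = i
      end if
      if (btest(ch_is(i), ThermalBit)) then
         nthermal = nthermal + 1
         ythermal(nthermal) = i
      end if
   end do

   ! Translate each solar index into a thermal index where one exists
   do i = 1, nsolar
      ii = find_in_array_sketch(ythermal(1:nthermal), ysolar(i))
      if (ii > 0) then
         write(*,*) 'Solar channel ', i, ' is thermal channel ', ii
      else
         write(*,*) 'Solar channel ', i, ' has no thermal counterpart'
      end if
   end do
end program demo_channel_index
```

With these test values only the mixed channel (the third solar channel) has a thermal counterpart, which is the first entry in the thermal list.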


A few notes on conventions etc.

File names and routine names:

  • Routine names have underscores in them, file names generally don't, e.g. function Read_SAD is in source file ReadSAD.F90. This is to take advantage of UNIX case sensitivity, but still make the routine names legible when the Fortran compiler has turned them into upper case and writes them out in error messages.
  • The makefile passes the source files to the C preprocessor before compilation. This allows conditional compilation, which is handy for debugging: put #ifdef DEBUG and #endif around the lines of code you want for debugging (e.g. a write(*,*) statement) and type "make FFLAGS=-DDEBUG" (the double D is intentional).
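
The conditional compilation described above might be used as in the following sketch. The program is hypothetical; it assumes the file has the .F90 extension and is passed through the C preprocessor, as the makefile does for ORAC sources.

```fortran
program demo_debug
   implicit none
   integer :: i
   real    :: total

   total = 0.0
   do i = 1, 5
      total = total + real(i)
#ifdef DEBUG
      ! This line is compiled only when DEBUG is defined (e.g. with -DDEBUG)
      write(*,*) 'DEBUG: i = ', i, ', running total = ', total
#endif
   end do

   write(*,*) 'total = ', total
end program demo_debug
```

Built without -DDEBUG, only the final total is written; built with it, the running total is also written on every iteration.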

Modules:

  • Modules are used to define data structures, i.e. to set out the templates for what they look like, but not actually to declare them (this is done in the code where they're used).
  • Each module has its own source file.
  • Module names for type definitions end in _def, filenames don't, so e.g. module CPL_def is in source file CPL.F90.
  • The compiler creates a .mod file for each module, e.g. CPL_DEF.mod. If you want to use a module in a piece of code make sure the module is compiled before the code that uses it (see the Makefile).

Interfaces:

  • Subroutines that use passed-length (or assumed-shape) arrays must have their interface declared to any subroutine that calls them. There is an interface module (source file XXXRoutines.F90, module name XXX_Routines_def) for each set of subroutines. e.g. SAD_Routines_def defines interfaces for all routines called by Read_SAD, LUT_Routines_def is for all routines called by Read_LUT.
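
The sketch below illustrates the idea of an interface module for an external subroutine with an assumed-shape argument. The names Sum_Positive and Demo_Routines_def are hypothetical and only imitate the XXX_Routines_def naming pattern; in ORAC the subroutine and the interface module would live in separate source files.

```fortran
! External subroutine with an assumed-shape array argument (hypothetical)
subroutine Sum_Positive(values, total)
   implicit none
   real, intent(in)  :: values(:)
   real, intent(out) :: total

   ! Add up only the positive elements
   total = sum(values, mask = values > 0.0)
end subroutine Sum_Positive

! Interface module in the style of XXX_Routines_def (hypothetical name)
module Demo_Routines_def
   implicit none
   interface
      subroutine Sum_Positive(values, total)
         implicit none
         real, intent(in)  :: values(:)
         real, intent(out) :: total
      end subroutine Sum_Positive
   end interface
end module Demo_Routines_def

program demo_interface
   ! Using the module gives the caller the explicit interface that an
   ! assumed-shape argument requires
   use Demo_Routines_def
   implicit none
   real :: total

   call Sum_Positive([1.0, -2.0, 3.5], total)
   write(*,*) 'Sum of positive values = ', total
end program demo_interface
```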

ECP_Constants:

  • This module holds useful constants such as error code parameters, array size limits, format statements. Using parameters for constants in this way should make for more readable and more easily maintainable code.
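
A constants module of this kind could be laid out roughly as follows. The two error code names and values are taken from the error code table in this guide; the module name Demo_Constants, the array size limit MaxNumMeas and its value, and the format string ErrFmt are assumptions made for illustration.

```fortran
module Demo_Constants   ! hypothetical stand-in for a constants module
   implicit none

   ! Error code parameters (names and values from the error code table)
   integer, parameter :: DriverFileOpenErr = 1000
   integer, parameter :: DriverFileReadErr = 1001

   ! Array size limit (assumed name and value)
   integer, parameter :: MaxNumMeas = 36

   ! Format statement stored as a character parameter (assumed name)
   character(len=*), parameter :: ErrFmt = '(a, i5)'
end module Demo_Constants

program demo_constants
   use Demo_Constants
   implicit none
   real    :: y(MaxNumMeas)
   integer :: status

   y = 0.0
   write(*,*) 'Measurement array holds ', size(y), ' elements'

   ! Refer to the error by name rather than by a bare number
   status = DriverFileOpenErr
   write(*, ErrFmt) 'Driver file could not be opened, error code ', status
end program demo_constants
```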

Source code header comments:

  • The file f90_header is a blank template for commenting the header part of an ORAC source file. See existing code for examples of how to fill it in (not all parts apply to all source files, e.g. "algorithm" doesn't make sense for a module defining a structure type).
