An introduction to ORAC for a new user

The Optimal Retrieval of Aerosol and Cloud (ORAC) is an optimal estimation scheme that determines aerosol and cloud properties from multispectral imagery. There are three separate components to the software:

  1. The pre-processor takes a level 1B image from a satellite (an orbit or granule) along with other geophysical data to produce a set of NetCDF files.
  2. Those files are the inputs for the main processor, which performs the retrieval given some set of microphysical properties for a type of aerosol or cloud particle.
  3. The post-processor merges multiple processed outputs (e.g. liquid and ice phase cloud) into the final product.

You will require:

  • SVN, a version control package. It can be found at  http://subversion.tigris.org/.
  • a Fortran 2003 compiler. The author of these instructions has used both ifort and gfortran on an Ubuntu system.

The source code for ORAC is stored at http://proj.badc.rl.ac.uk/orac. The homepage of this site presents a wiki-style set of instructions. The source code itself can be viewed in a browser by clicking "Browse Source" along the top bar of the page. You will be asked for a login, which is managed through the BADC.

If you would like to help develop ORAC then you will need to become a developer. It is essential you follow development protocols and you may wish to join our monthly ORAC meeting.

  • To be added as a developer email Don Grainger (r.grainger@…) and ask to be added as an SVN and trac developer for ORAC. This may take a few days.
  • Once you have been made a developer, you are advised to join the ORAC developers' mailing list. To do so, create an account with http://www.jiscmail.ac.uk, go to http://www.jiscmail.ac.uk/devorac, click "Subscribe or Unsubscribe", select your desired preferences, click "Subscribe", and click the link in the confirmation email that you will eventually receive.

Once logged in, the web interface presents you with a series of folders. 'trunk' is your primary concern, which contains all of the source code for ORAC. Files can be viewed through this interface and their various versions compared. However, this interface is primarily for inspecting the files, not altering or using them. Of particular use is the "Revision Log" link at the top right of the page for any code, which can be used to track changes over any number of revisions.

To download a copy of the ORAC source code, open a terminal and change directory to the folder in which you wish to store the files. Use the command:

   svn checkout http://proj.badc.rl.ac.uk/svn/orac/trunk

This will produce a local working copy of the source code on your machine. For a brief description of the other SVN commands you will require to update the source, see http://proj.badc.rl.ac.uk/orac/trunk/docs/ORAC_Mini_SVN_guide.pdf and/or the SVN book (http://svnbook.red-bean.com/en/1.7/svn-book.pdf).

Broadly, the folder structure is:

  • docs) Documentation, largely similar to that shown in the wiki.
  • idl) Assorted plotting and visualisation codes for ORAC's output written in IDL.
  • config) Configuration files that define the compilers, flags, and libraries used to compile ORAC.
  • patches) Local patches for the libraries used by ORAC.
  • tools) Useful scripts. The regression tests are stored here, a description of which is here.
  • common) Routines which are common to all three components.
  • pre_processing) The ORAC pre-processor source code. The main program is called preprocessing_for_orac.F90 and its executable is orac_preproc.x.
  • src) The ORAC main processor source code. The main program is called ECP.F90 and its executable is orac.
  • post_processing) The ORAC post-processor source code. The main program is called post_process_level2.F90 and its executable is post_process_level2.
  • derived_products) Programs that evaluate the final ORAC product.

Having obtained the ORAC source code, one must now compile it. Instructions are here.

Each ORAC executable takes a single argument - a path to a driver file. A driver file specifies which inputs should be evaluated and specifies various parameters and settings. Python scripts have been written to automate much of that process and we would recommend that you use them in the first instance (instructions here).
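As a rough illustration of how the three components chain together, the Python sketch below builds the command for each stage from the executable names listed in the folder structure. The driver file names and the single `bin_dir` directory are hypothetical (in a fresh build the executables sit in their own source folders), and a real run would typically call the main processor once per particle type. The supplied Python scripts remain the recommended route.

```python
import subprocess
from pathlib import Path

# Executable names as listed in the folder structure above.
EXECUTABLES = ("orac_preproc.x", "orac", "post_process_level2")


def build_commands(bin_dir, drivers):
    """Pair each ORAC executable with its driver file (its single argument)."""
    if len(drivers) != len(EXECUTABLES):
        raise ValueError("expected one driver file per component")
    return [[str(Path(bin_dir) / exe), str(driver)]
            for exe, driver in zip(EXECUTABLES, drivers)]


def run_chain(bin_dir, drivers):
    """Run the three stages in order, stopping at the first failure."""
    for command in build_commands(bin_dir, drivers):
        subprocess.run(command, check=True)


# Hypothetical paths, for illustration only; nothing is executed here.
commands = build_commands("/opt/orac/bin",
                          ["pre.driver", "main.driver", "post.driver"])
for command in commands:
    print(" ".join(command))
```

Calling `run_chain` with the same arguments would execute the stages; `check=True` makes a non-zero exit status raise an exception so that later stages do not run on missing inputs.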



Driver file specifics

If, for whatever reason, you wish to generate or alter a driver file, a description of the arguments of each file follows.

Pre-processor

The following lines are mandatory in the driver file (in this order, without comments):

  1. Name of the sensor. Valid options are ATSR2, AATSR, SLSTR, AVHRR, MODIS, SEVIRI, AHI, and VIIRS.
  2. Path of the level 1B file to be opened.
  3. Path of the geolocation information for that file. For the ATSR series of instruments, this field will be ignored.
  4. Path of the USGS land-use and digital elevation model, found at http://proj.badc.rl.ac.uk/orac/browser/data/Aux_file_CM_SAF_AVHRR_GAC_ori_0.05deg.nc.
  5. Directory of the ECMWF ERA-Interim file appropriate to this scene. (When using BADC data,  http://badc.nerc.ac.uk/data/ecmwf-era-interim, this is specifically the path to the GGAM file.)
  6. Directory containing the RTTOV coefficient file for this sensor. Likely candidates can be found at  http://proj.badc.rl.ac.uk/svn/orac/data/coeffs (if not with your installation of RTTOV).
  7. Directory containing the RTTOV 0.5 degree HSR emissivity data, which can be found at  http://cimss.ssec.wisc.edu/iremis/.
  8. Directory containing the NISE snow/sea-ice map appropriate to this scene, which can be found at  ftp://n5eil01u.ecs.nsidc.org/SAN/OTHR/NISE.004/.
  9. Directory containing the MODIS MCD43C3 surface albedo file appropriate to this scene, which can be found at  http://ladsweb.nascom.nasa.gov.
  10. Directory containing the MODIS MCD43C1 surface BRDF file appropriate to this scene, which can be found at the same address.
  11. Directory containing the MODIS emissivity map (named global_emis_inf10_monthFilled_MYD11C3), which can be found at the same address.
  12. Reciprocal of longitude grid resolution. 1.38888889 is the standard.
  13. Reciprocal of latitude grid resolution. 1.38888889 is the standard.
  14. Directory to save the output files in.
  15. The smallest pixel number to accept in the across-track direction. (A value of 0 in any of the next 4 arguments results in the entire orbit being processed.)
  16. The largest pixel number to accept in the across-track direction.
  17. The smallest pixel number to accept in the along-track direction.
  18. The largest pixel number to accept in the along-track direction.
  19. Version number of NetCDF used.
  20. A comment string specifying the file convention.
  21. A comment string specifying the processing institute.
  22. A comment string specifying the processor used.
  23. A comment string specifying the creator's email address.
  24. A comment string specifying the creator's URL.
  25. A comment string specifying the file version.
  26. A comment string specifying any references appropriate to this file.
  27. A comment string specifying the file's history.
  28. A comment string specifying a summary of the file.
  29. A comment string specifying any important keywords describing the file.
  30. A comment string specifying any comments about the file.
  31. A comment string specifying the project name. This is the prefix of the output file name.
  32. A comment string specifying the data's license.
  33. A UUID for this file.
  34. The time at which the file was generated.
  35. Path of the AATSR drift correction, which can be found at  http://proj.badc.rl.ac.uk/svn/orac/data.
  36. Indicate formatting of ECMWF data. 0 = A single NetCDF file; 1 = Three NetCDF files; 2 = Two GRIB and one NetCDF files (the BADC set); 3 = A single GRIB file (MARS outputs); 4 = Forecast data.
  37. When using three ECMWF files, the GGAS file.
  38. When using three ECMWF files, the SPAM or GPAM file.
  39. True = split the input file into 4096 line chunks; False = Process the entire input as one.
  40. 1 = Process only daytime; 2 = Process only night; 0 or 3 = Process everything.
  41. False = Print nothing to stdout; True = Make verbose output.
  42. This line has been deprecated and is ignored.
  43. True = Assume all directories assigned above actually are full paths; False = When passed a directory, ORAC searches it for an appropriate file.
  44. False = Assume a Lambertian surface; True = Use the full BRDF surface treatment.
  45. Version number of RTTOV used.
  46. Version number of ECMWF data used.
  47. Version number of SVN used.
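Because the mandatory lines are purely positional, a small helper that checks the line count before writing can catch off-by-one mistakes. The following is a minimal sketch, not part of ORAC: the function name, the placeholder values, and the output file name are all hypothetical, and only the count of 47 lines and the positions of the few values set explicitly come from the list above.

```python
import os
import tempfile

N_MANDATORY = 47  # number of mandatory lines in the list above


def write_preproc_driver(path, values):
    """Write one value per line, in the order of the numbered list."""
    if len(values) != N_MANDATORY:
        raise ValueError(f"expected {N_MANDATORY} lines, got {len(values)}")
    with open(path, "w") as handle:
        for value in values:
            handle.write(f"{value}\n")


# Placeholder values only; a real driver needs real paths and settings.
values = [f"placeholder_{i}" for i in range(1, N_MANDATORY + 1)]
values[0] = "AATSR"          # line 1: sensor name
values[11] = "1.38888889"    # line 12: reciprocal of longitude resolution
values[12] = "1.38888889"    # line 13: reciprocal of latitude resolution
values[14:18] = ["0"] * 4    # lines 15-18: 0 processes the entire orbit

driver_path = os.path.join(tempfile.mkdtemp(), "preproc.driver")
write_preproc_driver(driver_path, values)
with open(driver_path) as handle:
    written = handle.read().splitlines()
print(len(written), written[0])
```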

The following are optional arguments. The label is separated from its values by an =:

  • ECMWF_TIME_INT_METHOD) 2 = Interpolate between two ECMWF files to determine the meteorology; Any other number = Use the meteorology from the nearest ECMWF file to this orbit.
  • ECMWF_PATH_2) When using ECMWF_TIME_INT_METHOD=2, this is the directory of the second GGAM file.
  • ECMWF_PATH2_2) When using ECMWF_TIME_INT_METHOD=2, this is the directory of the second GGAS file.
  • ECMWF_PATH3_2) When using ECMWF_TIME_INT_METHOD=2, this is the directory of the second SPAM file.
  • USE_HR_ECMWF) True = Read a second, higher resolution ECMWF file for better coverage of the surface; False = Don't.
  • ECMWF_PATH_HR) Specifies the directory of the high resolution ECMWF file.
  • ECMWF_PATH_HR_2) When using ECMWF_TIME_INT_METHOD=2, this is the directory of the second high resolution ECMWF file.
  • USE_ECMWF_SNOW_AND_ICE) True = Use the snow/ice field in the ECMWF file; False = Use the NISE map.
  • USE_MODIS_EMIS_IN_RTTOV) True = Use the MODIS emissivity retrieval; False = Use the RTTOV emissivity atlas.
  • ECMWF_NLEVELS) The number of levels in the ECMWF data. Valid values are 60, 91, 137.
  • USE_L1_LAND_MASK) True = Use the land/sea mask from the imager data; False = Use the USGS land/sea mask.
  • USE_OCCCI) True = Use the Ocean Colour CCI product; False = Assume a constant sea-surface backscatter and extinction.
  • OCCCI_PATH) Directory of the Ocean Colour CCI product.
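The optional arguments can be generated in the same way as the mandatory ones. The sketch below renders a dictionary as LABEL=value lines and applies two sanity checks drawn from the descriptions above. The function name and the paths are hypothetical, and two things are assumptions rather than documented behaviour: that ECMWF_PATH_2 is always required when ECMWF_TIME_INT_METHOD is 2, and that no spaces are needed around the equals sign.

```python
VALID_NLEVELS = (60, 91, 137)  # valid values listed for ECMWF_NLEVELS


def format_optional(options):
    """Render optional pre-processor arguments as LABEL=value lines."""
    levels = options.get("ECMWF_NLEVELS")
    if levels is not None and levels not in VALID_NLEVELS:
        raise ValueError(f"ECMWF_NLEVELS must be one of {VALID_NLEVELS}")
    # Assumption: interpolating between two files needs the second GGAM path.
    if (options.get("ECMWF_TIME_INT_METHOD") == 2
            and "ECMWF_PATH_2" not in options):
        raise ValueError("interpolation needs ECMWF_PATH_2")
    return [f"{label}={value}" for label, value in options.items()]


# Hypothetical settings, for illustration only.
optional_lines = format_optional({
    "ECMWF_TIME_INT_METHOD": 2,
    "ECMWF_PATH_2": "/hypothetical/ecmwf/second_ggam",
    "USE_ECMWF_SNOW_AND_ICE": True,
    "ECMWF_NLEVELS": 60,
})
print("\n".join(optional_lines))
```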

After running the pre-processor, eleven NetCDF files should be produced in the directory specified by line 14.

Main processor

The behavior of the retrieval is controlled by numerous variables and switches in the Control structure. As such, the main processor has a more advanced driver file parser which can set any variable in that structure. The syntax of the driver mimics standard Fortran.

To illustrate, consider this skeleton (non-functional) driver file.

# ORAC Driver File
Ctrl%FID%Data_Dir          = /home/me/data
Ctrl%Run_ID                = "Super Awesome Data File"
# This is a comment
Ctrl%Verbose               = True
Ctrl%RS%Use_Full_BRDF      = f
Ctrl%Ind%NAvail            = 14 # This is also a comment
Ctrl%Ind%Channel_Proc_Flag = 1,1,1,1,0,0,0,1,1,1,1,0,0,0
Ctrl%X0[1:3]               = 1.0, -0.5, 300
Ctrl%XB[ITau]              = -1.0
Ctrl%XB[IRe]               = -0.848

  • The first line must take exactly this form. (It indicates the new driver file format should be used.)
  • Each line is broadly the name of the field within the structure to set, an equals sign, and the value it should take.
  • A # indicates that all subsequent characters on that line are comments and should be ignored.
  • Quotes "" optionally delimit a string (and escape any characters within).
  • Whitespace is not significant; new lines are.
  • Logical variables, such as Ctrl%Verbose, can be set true with any string starting with T, t, Y, y, or 1; false is F, f, N, n, or 0.
  • An array variable, like Ctrl%Ind%Channel_Proc_Flag, can be set by a comma-separated list. A second array dimension is delimited with a semi-colon ;.
  • You may also assign subsets of an array. Ctrl%X0[1:3] indicates the first to third elements of the array. Other valid indices are
    • [:], the entire array;
    • [:3], the first to third element;
    • [3:], the third to last element;
    • [3], the third element.
  • Various keywords are defined in ECPConstants.F90, which can be used to improve legibility. For example, Ctrl%XB[ITau] sets the optical depth element of the a priori state vector.
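The logical and array-index rules above can be mimicked in a few lines of Python, which may help when checking a driver file by hand. This is only an illustrative sketch of the rules as described, not ORAC's parser: both function names are hypothetical, indices are taken to be 1-based and inclusive as in Fortran, and named keywords such as ITau are not handled.

```python
def parse_logical(text):
    """Interpret a driver-file logical by its first character."""
    first = text.strip()[:1]
    if first in ("T", "t", "Y", "y", "1"):
        return True
    if first in ("F", "f", "N", "n", "0"):
        return False
    raise ValueError(f"not a logical value: {text!r}")


def select(array, index):
    """Return the elements addressed by a 1-based, inclusive index string."""
    if ":" in index:
        start, stop = index.split(":")
        first = int(start) if start else 1
        last = int(stop) if stop else len(array)
        return array[first - 1:last]
    return [array[int(index) - 1]]


state = [10, 20, 30, 40, 50]
print(select(state, "1:3"))   # first to third elements
print(select(state, "3:"))    # third to last element
print(parse_logical("True"), parse_logical("f"))
```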

The following fields are mandatory:

  • # ORAC Driver File) This must be the first line.
  • Ctrl%FID%Data_Dir) The directory containing the input files.
  • Ctrl%FID%Filename) The root name of the input files.
  • Ctrl%FID%Out_Dir) The directory into which the outputs should be saved.
  • Ctrl%FID%SAD_Dir) The directory containing the SAD files.
  • Ctrl%InstName) The name of the instrument to process. For MODIS and AVHRR, this must specify both the instrument and the platform separated by a hyphen (e.g. MODIS-AQUA).
  • Ctrl%Ind%NAvail) The number of channels available in the input files.
  • Ctrl%Ind%Channel_Proc_Flag) A boolean array specifying which of the available channels should be used during processing.
  • Ctrl%LUTClass) The name of the look-up table to use during processing. Choices include:
    Name  Description           Name  Description
    WAT   Liquid cloud          ICE   Ice cloud
    A70   Dust                  A71   Polluted dust
    A72   Light polluted dust   A73   Light dust
    A74   Light clean dust      A75   NHemisphere background
    A76   Clean maritime        A77   Dirty maritime
    A78   Polluted maritime     A79   Smoke

Troubleshooting

  • If you are receiving an error on a line with a string or filename, try surrounding it in "". Some characters (like a full stop) are meaningful to the software and need escaping.
  • If trying to set an allocatable array, make sure its length is declared before its contents. In the above, Ctrl%Ind%NAvail must be set before Ctrl%Ind%Channel_Proc_Flag.
  • The error 'flex parser jammed' probably means you misspelled the name of a variable or forgot an equals, comma, or semi-colon. (At some point these errors will be made more useful.)
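Several of these problems can be caught before ORAC is run. The sketch below is a hypothetical pre-flight check written for this guide, not a tool shipped with ORAC: it looks for the header line, the mandatory fields, and the ordering of Ctrl%Ind%NAvail and Ctrl%Ind%Channel_Proc_Flag. It assumes field names are matched exactly as spelled here, that the flag array should have one entry per available channel, and that no # appears inside a quoted string.

```python
HEADER = "# ORAC Driver File"
MANDATORY = (
    "Ctrl%FID%Data_Dir", "Ctrl%FID%Filename", "Ctrl%FID%Out_Dir",
    "Ctrl%FID%SAD_Dir", "Ctrl%InstName", "Ctrl%Ind%NAvail",
    "Ctrl%Ind%Channel_Proc_Flag", "Ctrl%LUTClass",
)


def check_driver(text):
    """Return a list of problems found in a main-processor driver file."""
    lines = text.splitlines()
    problems = []
    if not lines or lines[0].rstrip() != HEADER:
        problems.append("first line must be '# ORAC Driver File'")
    fields = {}  # field name -> (line number, value) of first assignment
    for number, line in enumerate(lines, start=1):
        body = line.split("#", 1)[0]  # drop comments (ignores quoting)
        if "=" not in body:
            continue
        name, value = body.split("=", 1)
        fields.setdefault(name.strip(), (number, value.strip()))
    for name in MANDATORY:
        if name not in fields:
            problems.append(f"missing mandatory field {name}")
    navail = fields.get("Ctrl%Ind%NAvail")
    flags = fields.get("Ctrl%Ind%Channel_Proc_Flag")
    if navail and flags:
        if navail[0] > flags[0]:
            problems.append("Ctrl%Ind%NAvail must precede Channel_Proc_Flag")
        if len(flags[1].split(",")) != int(navail[1]):
            problems.append("Channel_Proc_Flag length differs from NAvail")
    return problems


# Hypothetical paths and file root, for illustration only.
good_lines = [
    HEADER,
    'Ctrl%FID%Data_Dir = "/hypothetical/in"',
    'Ctrl%FID%Filename = "hypothetical_root"',
    'Ctrl%FID%Out_Dir = "/hypothetical/out"',
    'Ctrl%FID%SAD_Dir = "/hypothetical/sad"',
    "Ctrl%InstName = MODIS-AQUA",
    "Ctrl%Ind%NAvail = 3",
    "Ctrl%Ind%Channel_Proc_Flag = 1,0,1",
    "Ctrl%LUTClass = WAT",
]
good_report = check_driver("\n".join(good_lines))
# Swap the NAvail and Channel_Proc_Flag lines and drop LUTClass.
bad_lines = good_lines[:6] + [good_lines[7], good_lines[6]]
bad_report = check_driver("\n".join(bad_lines))
print(good_report)
print(bad_report)
```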

After running, two NetCDF files should be produced.

Post-processor

The following lines are mandatory in the driver file for cloud-only processing (in this order, without comments):

  1. Full path to the WAT primary file.
  2. Full path to the ICE primary file.
  3. Full path to the WAT secondary file.
  4. Full path to the ICE secondary file.
  5. Full path for the output primary file.
  6. Full path for the output secondary file.
  7. True = Check if the cloud top temperature is consistent with the cloud phase selected; False = Make the cloud phase selection based upon the Pavolonis typing only.

When processing aerosol (with or without cloud), the same arguments are necessary but WAT and ICE should be replaced with the first and second phases evaluated (the ordering of phases isn't relevant). For each additional phase desired, add the full path to the primary and secondary files on adjacent lines. Also, add the following:

USE_BAYESIAN_SELECTION = True

This argument can also be used in cloud processing if you wish to make phase selection based on retrieval cost rather than the Pavolonis typing.

The following are optional arguments. The label is separated from its values by an =:

  • COST_THRESH) When using Bayesian selection, a phase must have greater cost than this limit to be accepted. The default is 0.
  • NORM_PROB_THRESH) When using Bayesian selection, the cost is used to calculate a probability for each phase. To be selected, a phase's probability must be greater than this limit.
  • OUTPUT_OPTICAL_PROPS_AT_NIGHT) True = Output optical depth and effective radius at night; False = Don't.
  • VERBOSE) False = Print nothing to stdout; True = Make verbose output.
  • USE_CHUNKING) True = Split the input file into 4096 line chunks to save memory (highly recommended for AATSR); False = Process the entire input as one.
  • USE_NETCDF_COMPRESSION) True = Compress the output files; False = Don't.
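A post-processor driver for the two-phase case can be assembled as follows. This is a minimal sketch under stated assumptions: the function name and every path are hypothetical, optional arguments are assumed to follow the seven mandatory lines, and the handling of additional phases is left out because their position in the file is not specified here.

```python
def postproc_driver(first, second, output, check_ctt=True, options=None):
    """Return driver lines for two phases.

    first, second and output are (primary_path, secondary_path) pairs.
    """
    # Mandatory order: both primaries, both secondaries, outputs, switch.
    lines = [first[0], second[0], first[1], second[1],
             output[0], output[1], str(check_ctt)]
    # Assumption: optional LABEL = value lines come after the mandatory ones.
    for label, value in (options or {}).items():
        lines.append(f"{label} = {value}")
    return lines


# Hypothetical file names, for illustration only.
lines = postproc_driver(
    first=("/hypothetical/WAT.primary.nc", "/hypothetical/WAT.secondary.nc"),
    second=("/hypothetical/ICE.primary.nc", "/hypothetical/ICE.secondary.nc"),
    output=("/hypothetical/out.primary.nc", "/hypothetical/out.secondary.nc"),
    options={"USE_BAYESIAN_SELECTION": True, "USE_CHUNKING": True},
)
print("\n".join(lines))
```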