Version 2 (modified by domlowe, 15 years ago)

Added Ag's point about cell_methods etc to csmlAPIQuestions

The scope of the CSML API for Alpha was limited to providing access to a single CSML feature type (Grid Series) and performing a single operation (subsetting) on that feature.

The API now needs to undergo a thorough review to ensure it is structured to fulfil our long-term requirements, which involve multiple operations on multiple feature types.

We want to follow the 'processing affordance' pattern as per the Met Office Exeter Communique (pdf). A major part of the API rethink will involve deciding how best to implement this in practice.

Additionally, integrating the alpha CSML API with WCS and the Data Extractor has brought up plenty of other issues that we also need to take into account. Most of these (minor bugs aside) are documented here:

  • Identification of axes - although CSML does not place specific requirements on axis names, applications need to know which are the longitude/latitude/level/time axes. This can usually be inferred from the CSML context/feature type, but should we explicitly have attributes in the CSML document such as isLatitude, isLongitude, isTime etc.?
  • Calendaring - the frame of reference for times may depend on different calendars, e.g. 360_day, Gregorian etc. This information should be stored in the CSML, probably in an srsName attribute - is this always possible? What about when times are encoded as file extracts rather than inline? Currently (alpha) the API refers back to the original data to check the calendar attribute, but this is inefficient.
  • Path names - in the CSML, should we store relative paths or absolute paths? The 'delivered' CSML should probably be relative, but what about the stored CSML?
  • Domain - the domain complement and reference could be a hindrance when we want to do basic axis indexing. At present I don't know the order automatically; this is a major problem for efficient subsetting. (quoted from Ag - could you explain why this is?)
  • Global attributes - applications may need to know 'global attributes', for presentation or otherwise. Without MOLES, we don't have access to these directly from CSML. Is there a case for storing some global attributes in the CSML?
  • CF compliance - to write CF-compliant NetCDF we may need to store additional attributes alongside 'variable' and 'axes' (i.e. attributes of the feature and domain). Which attributes are required for CF compliance? Will this model be viable when the CSML is not derived from underlying CF-NetCDF?
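As a starting point for the axis identification question, here is a minimal sketch of how an application might infer an axis's role from CF-style attributes (axis, standard_name, units, positive) when the CSML carries no explicit isLatitude/isLongitude/isTime flags. The function name identify_axis_role and the returned role strings are hypothetical and not part of the CSML API. The sketch also assumes an unrotated latitude-longitude grid, so that X/Y can be read as longitude/latitude.

```python
from typing import Mapping, Optional

# CF-style hints. Mapping X/Y to longitude/latitude is an assumption that
# only holds for unrotated lat-lon grids; the role names are hypothetical.
_AXIS_ATTR = {"X": "longitude", "Y": "latitude", "Z": "level", "T": "time"}
_STANDARD_NAMES = {"longitude", "latitude", "time"}
_LAT_UNITS = {"degrees_north", "degree_north", "degrees_N", "degree_N"}
_LON_UNITS = {"degrees_east", "degree_east", "degrees_E", "degree_E"}


def identify_axis_role(attrs: Mapping[str, str]) -> Optional[str]:
    """Guess the role of an axis from CF-style attributes, or return None."""
    axis = attrs.get("axis", "").upper()
    if axis in _AXIS_ATTR:
        return _AXIS_ATTR[axis]
    standard_name = attrs.get("standard_name", "")
    if standard_name in _STANDARD_NAMES:
        return standard_name
    units = attrs.get("units", "")
    if units in _LAT_UNITS:
        return "latitude"
    if units in _LON_UNITS:
        return "longitude"
    if " since " in units:  # e.g. "days since 1970-01-01"
        return "time"
    if "positive" in attrs:  # CF marks vertical axes with positive=up/down
        return "level"
    return None  # caller must fall back to the CSML context/feature type


if __name__ == "__main__":
    print(identify_axis_role({"units": "degrees_north"}))
    print(identify_axis_role({"units": "days since 1970-01-01",
                              "calendar": "360_day"}))
```

A heuristic like this fails for data that is not derived from CF-NetCDF, which is one argument for recording the axis roles explicitly in the CSML document.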

  • Data in memory - applications may want to hold the data in memory rather than receiving a link to a NetCDF file.
  • Multiple feature selection - the Data Extractor allows selection of more than one variable (feature). We need to write CSML containing multiple features.
  • Multiple NetCDF files - when the data selection is large we may want to deliver multiple NetCDF files.
  • Upside-down data - in Alpha we didn't use the mappingRule element to specify the orientation of the data, so some of it was the wrong way up. The orientation (e.g. +x-y+z+series) should be calculated when scanning and processed by the API.
  • Variable identification - the DX (and other) GUIs need to provide enough information about a variable for the user to identify what it is. The name is not enough, because you can have different cell_methods (CF-talk) on a variable (such as mean and standard deviation). But even cell_methods might not be definitive. We need to consider what/how this stuff gets to users.
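To make the upside-down data item more concrete, here is a minimal sketch of how the API might interpret an orientation string such as +x-y+z+series and flip a grid accordingly. The string syntax is taken only from the example given in that item. The function names parse_orientation and orient_rows, and the assumption that a 2D grid is held as a list of rows indexed [y][x], are hypothetical and not part of the current API.

```python
import re
from typing import List, Sequence, Tuple


def parse_orientation(rule: str) -> List[Tuple[str, int]]:
    """Split e.g. '+x-y+z+series' into [('x', 1), ('y', -1), ...]."""
    tokens = re.findall(r"([+-])([A-Za-z_]\w*)", rule)
    # Reject strings containing anything other than signed axis names.
    if "".join(sign + name for sign, name in tokens) != rule or not tokens:
        raise ValueError(f"unrecognised orientation rule: {rule!r}")
    return [(name, 1 if sign == "+" else -1) for sign, name in tokens]


def orient_rows(rows: Sequence[Sequence[float]],
                x_sign: int, y_sign: int) -> List[List[float]]:
    """Return a copy of rows[y][x] with negative axes reversed."""
    ordered_rows = rows if y_sign > 0 else list(reversed(rows))
    return [list(row) if x_sign > 0 else list(reversed(row))
            for row in ordered_rows]


if __name__ == "__main__":
    signs = dict(parse_orientation("+x-y+z+series"))
    grid = [[1.0, 2.0], [3.0, 4.0]]
    # y is stored in descending order here, so the rows are reversed.
    print(orient_rows(grid, signs["x"], signs["y"]))
```

If the scanner records the orientation once, the API can apply the flip at read time instead of re-inspecting the coordinate values on every request.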