Version 1 (modified by lawrence, 14 years ago)


Numerical Simulation Discovery Metadata (aka NumSim)

The DIF describes datasets at the discovery level, but where simulations are involved, discovery metadata needs more information than the existing schema provides.

A new schema, which is being trialled at the British Atmospheric Data Centre, can be found here. The proposed schema should be accessible to both the DIF and ISO19115 parent discovery schemas (as it evolves), although at the moment it is rather standalone.

The schema is documented in a PDF file which you can download from here. See also example files: PUM 4.5 Beowulf.xml HadCM3 Beowulf 500 year run

Comments are welcome! Please note that this version has only just started to be trialled with data, so some changes are inevitable, as described below.

You can annotate this page, mail the list at climate-model-doc@…, or email Bryan Lawrence directly.


Two sets of email feedback on V005 have been received and are documented on the NumSimFeedback page. V006 is now the current release.

More responses welcome.

Technical Issues



  • See [NumSim05to06] Changes.

Planned Changes

Planned changes are one of:

  • Definite (we will definitely do this)
  • Probable (it's something we think we should do)
  • Maybe (it's something that seems sensible)
  • Discussion (it's something to think about; in practice these items, if agreed, should move to a later version)


V007-01 ... under active development (Jan/Feb? 2006)

  • (Definite) We need to change the external references to be more standards compliant (there must be a standard way to do this). (See also Response 2.8.)
    • (Maybe) Need to make sure we can traverse between references in discovery easily.
  • (Maybe) There is clearly scope to increase the complexity of the related model element to include both a controlled vocab for the relationship and a description. Where we know a model is a child of a parent we could inherit the component characteristics except where they differ. Maybe that's an Earley Suite issue, not an issue for this ...
    • (Discussion) I don't feel happy with a bare AnyURI either; we should probably wrap it in an element which records the date the URI was last known to be valid, so we can manually check them from time to time.
  • (Probable) Need a script for XML to XHTML and a stylesheet.
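
As a rough illustration of the "wrap the URI with a last-known-valid date" idea in the list above, here is a minimal Python sketch. The element name externalReference, the attribute lastKnownValid and the 180 day threshold are all assumptions made up for this example; none of them are part of the NumSim schema.

```python
import xml.etree.ElementTree as ET
from datetime import date, timedelta


def make_reference(uri, last_valid):
    """Wrap a URI in a hypothetical <externalReference> element.

    The lastKnownValid attribute (an assumed name) records when the
    URI was last checked by hand and found to resolve.
    """
    ref = ET.Element("externalReference",
                     {"lastKnownValid": last_valid.isoformat()})
    ET.SubElement(ref, "uri").text = uri
    return ref


def stale_references(root, today, max_age_days=180):
    """Return URIs whose last validity check is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    stale = []
    for ref in root.iter("externalReference"):
        checked = date.fromisoformat(ref.get("lastKnownValid"))
        if checked < cutoff:
            stale.append(ref.findtext("uri"))
    return stale


# Example record with two references (URIs are placeholders).
record = ET.Element("NumSim")
record.append(make_reference("http://example.org/model-doc", date(2005, 6, 1)))
record.append(make_reference("http://example.org/forcing", date(2006, 1, 15)))

# List the references that are due for a manual check.
print(stale_references(record, today=date(2006, 2, 1)))
```

Something like stale_references could be run from time to time to produce the list of links that need checking by hand.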


  • (Maybe) Do we want sequences or any in most of the complex groups? (Does order matter?) The aim is to make life easy for editors.
  • (Maybe) Follow up on Response 1.2, and consider whether ModelComponents such as "Atmosphere" need to be complex types, so that search software can distinguish between hydrostatic/non-hydrostatic and wet/dry, for example.
    • Curation Issue: Should we pull copies of any pages we link to into the archive?
  • (Discussion) Should we have the initial conditions as attributes (subelements) of the Model rather than as attributes of the simulation? This would help in some regards, but might break the future ability to inherit model descriptions ... Note that Response 2.4 is not keen on this.
    • How would we handle ensembles? For example, one could hold the entire SRESA2 ensemble, or an ensemble member. What discovery records should exist in the first case? What should the response to a search be?
      • Imagine a three-member ensemble: should there be four records? At what point does the user see four? The first response should surely be one.
      • Note that as it stands we need to make clear that an ensemble record which includes multiple initial condition members does not need repeated initial condition elements for each ensemble member. (There is an impossible sentence at the bottom of the simulated description that could be improved a lot to make this clearer.)
  • (Discussion) Should we allow references in the initial condition as we do for the boundary condition?
  • (Discussion) We could support individual members of the climate prediction ensembles by adding to the model element an optional <perturbed> element which could be a list made up of arg param pairs. These would appear in ensemble member descriptions but not in grand ensemble descriptions. The usual issue of what level in the D hierarchy should be exposed to the wider world will arise.
  • (Probable) Conform to the ISO19115 extension mechanism.
  • (Discussion) Should we make the model resolution a properly configurable subelement? (This would support the situation where the model resolution is different from the resolution of the accompanying dataset.)

This should almost certainly use Balaji's formalism.
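
To make the optional <perturbed> element suggested above a bit more concrete, here is a minimal Python sketch of one way it might look. Only the name perturbed comes from the discussion item; the child element parameter, its name attribute and the parameter names used are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def add_perturbed(model, perturbations):
    """Attach a <perturbed> element holding (parameter, value) pairs.

    The <parameter name="..."> child layout is an assumption; the page
    only says the element would be a list of arg/param pairs.
    """
    perturbed = ET.SubElement(model, "perturbed")
    for name, value in perturbations:
        item = ET.SubElement(perturbed, "parameter", {"name": name})
        item.text = str(value)
    return perturbed


def read_perturbed(model):
    """Return (name, value) string pairs, or an empty list if absent."""
    perturbed = model.find("perturbed")
    if perturbed is None:
        return []
    return [(p.get("name"), p.text) for p in perturbed.findall("parameter")]


# An ensemble member description carries its perturbations
# (parameter names here are invented) ...
member = ET.Element("model")
add_perturbed(member, [("entrainment_coefficient", 0.6),
                       ("ice_fall_speed", 1.0)])

# ... while a grand ensemble description simply omits the element.
grand = ET.Element("model")

print(ET.tostring(member, encoding="unicode"))
print(read_perturbed(grand))
```

Because the element is optional, a grand ensemble description stays unchanged and software reading it just sees an empty list.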


  • (Probable) Earley Suite Convergence. What we want to be able to do is generate NumSim entries from Earley Suite descriptions. This may involve some interesting ontological jumps :-)
  • (Discussion) Following Response 2.3: given that "model" has a different meaning in, for example, geology, should we use a different word for model, e.g. simulator?
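
On the ensemble question raised earlier in this list (a three-member ensemble, four records, but one record in the first response), here is a minimal Python sketch of one possible answer. The Record class and its fields are hypothetical and are not taken from the schema; this is an illustration of the idea, not a proposal for how discovery should be implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    record_id: str
    title: str
    parent_id: Optional[str] = None  # set only on ensemble members


def first_response(records, term):
    """Search titles, collapsing ensemble members into their parent."""
    by_id = {r.record_id: r for r in records}
    seen, hits = set(), []
    for r in records:
        if term.lower() not in r.title.lower():
            continue
        # Show the parent ensemble record instead of the member itself.
        top = by_id.get(r.parent_id, r) if r.parent_id else r
        if top.record_id not in seen:
            seen.add(top.record_id)
            hits.append(top)
    return hits


def members_of(records, ensemble_id):
    """Second step: expand an ensemble record into its members."""
    return [r for r in records if r.parent_id == ensemble_id]


# Four records in total: one ensemble plus three members.
records = [Record("ens", "SRESA2 ensemble")] + [
    Record(f"ens-{i}", f"SRESA2 ensemble member {i}", parent_id="ens")
    for i in (1, 2, 3)
]

print(len(first_response(records, "SRESA2")))   # records shown first
print(len(members_of(records, "ens")))          # members shown on expansion
```

In this sketch all four records exist, but the user only sees the members after choosing to expand the ensemble record.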