Changes between Version 3 and Version 4 of CSMLParserHowTo


Timestamp:
14/07/06 12:08:39 (13 years ago)
Author:
domlowe
Comment:

First draft of how to use parser

  • CSMLParserHowTo

    v3 v4  
    33This document covers: 
    44 
    5  * Using the online parser. 
    6  * How to install the parser code. 
    7  * How to parse a CSML file. 
    8  * How to query CSML attributes. 
     5 * Using the online parser 
     6 * How to install the parser 
     7 * How to parse a CSML file 
     8 * How to query CSML attributes 
    99   * directly via the parser 
    1010   * with the high level api to the parser 
     
    1212 
    1313 
    14 The CSML parser is a convential parser in that it can read a CSML (which is encoded as XML) file and determine the structure and properties of the data within. 
     14The CSML parser is a conventional parser in that it can read a CSML file (which is encoded as XML) and determine the structure and properties of the data within. 
    1515The parser creates Python objects representing the contents of the CSML file. These Python objects can then be interrogated either directly, or via a higher level CSML API that provides a more intuitive interface. In addition to the ability to parse CSML you can also use the parser 'in reverse' to create your own CSML documents. 
    1616 
    17 So for each class (type of element) in CSML there is a python class. Each class has 3 methods, __init__(), fromXML() and toXML().  
     17So for each class (type of element) in CSML there is a python class. Each class has 3 methods, init(), fromXML() and toXML().  
    1818The hierarchical relationship between CSML schema elements is also represented within the schema class hierarchy. The upshot being that you can convert to and from XML to Python representations of your CSML document without losing any structural information or any content. 
    19 Rather than go into great detail about how this works (which is the subject of another document (TBA)), here we will concentrate on how to use the parser. 
    2019 
    2120So the root level element of a CSML document is the Dataset element, and there is a python class called Dataset(), which has init, fromXML and toXML methods. 
    22 The hierarchical nature of the parse means that if you call the fromXML or toXML methods of a class it will automatically call the fromXML or toXML methods of it's child classes and this will recurse through the XML hierarchy. 
     21The hierarchical nature of the parser means that calling the fromXML or toXML method of a class automatically calls the same method on every class below it, recursing through the CSML XML hierarchy. So calling the fromXML method of the Dataset class calls the fromXML method of the !FeatureCollection, every Feature, and so on. 
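The recursive behaviour described above can be sketched in a few lines. This is only an illustration, not the CSML parser's own code: the `Dataset` and `Feature` classes below are simplified stand-ins, the XML has no namespaces (real CSML uses gml:id and friends), and it is written for Python 3 with the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


class Feature:
    """Hypothetical leaf class standing in for a CSML feature element."""

    def __init__(self, id=None, description=None):
        self.id = id
        self.description = description

    def fromXML(self, elem):
        # Read this element's own attribute and child text.
        self.id = elem.get("id")
        self.description = elem.findtext("description")

    def toXML(self):
        # Rebuild the element from the stored values.
        elem = ET.Element("Feature", {"id": self.id})
        ET.SubElement(elem, "description").text = self.description
        return elem


class Dataset:
    """Hypothetical root class; it delegates to the classes below it."""

    def __init__(self):
        self.features = []

    def fromXML(self, elem):
        # Calling fromXML on the root calls fromXML on each child in turn.
        self.features = []
        for child in elem.findall("Feature"):
            feature = Feature()
            feature.fromXML(child)
            self.features.append(feature)

    def toXML(self):
        # Calling toXML on the root calls toXML on each child in turn.
        elem = ET.Element("Dataset")
        for feature in self.features:
            elem.append(feature.toXML())
        return elem


source = (
    '<Dataset>'
    '<Feature id="f1"><description>first</description></Feature>'
    '<Feature id="f2"><description>second</description></Feature>'
    '</Dataset>'
)

dataset = Dataset()
dataset.fromXML(ET.fromstring(source))
round_trip = ET.tostring(dataset.toXML(), encoding="unicode")
print(round_trip == source)
```

One call on the root object is enough in each direction; the parent never needs to know how its children read or write themselves.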
     22 
     23Anyway, rather than go into great detail about how this works (which is the subject of another document (TBA)), here we will concentrate on how to use the parser. 
     24 
    2325 
    2426 == So...  how to actually use the parser. == 
     
    2628 
    2729Well, first the easy way. Use the online parser. This is handy for testing that your CSML documents parse as expected. 
    28 Note this is not a true CSML validator, but will show you what how the parser "sees" your CSML. 
     30Note this is not a true CSML validator, but will show you how the parser 'sees' your CSML. If the input and output differ, then something has not parsed well. This could be a problem with your CSML document, something that isn't fully implemented in the parser, or just a bug. Please let me know.  
    2931 
    3032The online parser is simply a web interface to the parser, and allows you to parse a CSML document. You can't do anything with the parsed document, but it is useful as a way of verifying what the parser 'sees' when it parses your CSML document. If you don't have a CSML document, you can download one {HERE}. 
     
    8385 
    8486 == The CSML API == 
    85 As we have just seen, the parser itself provides an API of sorts via the object hierarchy. but it is clumsy to navigate. The most common things you will want to do with features have been wrapped up in a set of simple methods.  
     87As we have just seen, the parser itself provides an API of sorts via the object hierarchy, but it is clumsy to navigate. The most common things you will want to do with features have been wrapped up in a set of simple methods. Rather than accurately document the methods here (!PyDoc does that nicely), this is how to use the methods to perform a subsetting operation on a !GridSeriesFeature: 
     88 
     89{{{ 
     90#!python 
     91import API   #This is all you need to import, the API module will import the parser as API.Parser 
     92 
     93f='coapec.xml' # your CSML file 
     94 
     95#Initialise and parse the dataset 
     96csml = API.Parser.Dataset()  # Create a new empty csml Dataset object 
     97csml.parse(f) # parse the CSML file - this is like calling the fromXML() method of the Dataset 
     98 
     99#You can now interrogate the CSML document: 
     100 
     101#get list of features in the dataset 
     102flist= csml.getFeatureList()  
     103print '\n Here are all the features in %s:' %f 
     104print flist 
     105 
     106#select a feature by name (gml:id) 
     107print '\n Selecting feature with gml:id = %s' %flist[4] 
     108feature=csml.getFeature(flist[4]) 
     109 
     110#These are some attributes, the gml:id and gml:description 
     111print feature.id 
     112print feature.description 
     113 
     114#get the domain of the feature 
     115print '\n The feature has domain reference:'  
     116print feature.getDomainReference() 
     117 
     118#get the domain complement of the feature 
     119print '\n The feature has domain complement :'  
     120#print feature.getDomainComplement() 
     121 
     122#get combined domain, this returns the domainReference and the domainComplement 
     123print '\n The feature has domain:'  
     124#print feature.getDomain() 
     125 
     126#get list of allowed subsettings 
     127print '\n the following feature subsetting operations are allowed:' 
     128print feature.getAllowedSubsettings() 
     129 
     130 
     131#Now we can subset the file based on a selection 
     132 
     133#define a selection (you would base this on the values of the domain ref/complement but I have hardcoded it here) 
     134timeSelection=['2794-12-1T0:0:0.0', '2844-12-1T0:0:0.0']  #min and max values (you can also provide a list of specific values) 
     135spatialSubsetDictionary= {} 
     136spatialSubsetDictionary['latitude']=(-30.0,30.0) 
     137spatialSubsetDictionary['longitude']=(90, 120.0) 
     138#If the feature is defined in any other dimension you can add that here too. 
     139 
     140#request subsetted data from feature (can set output file paths here) 
     141subsetCSML, subsetNetCDF, arraySize=feature.subsetToGridSeries(timeSelection, csmlpath='my.xml', ncpath='my.nc',**spatialSubsetDictionary) 
     142 
     143 
     144#Now we have a subsetted CSML document and a NetCDF file that describe/contain your subsetted data. 
     145print subsetCSML #csml document (string) 
     146print subsetNetCDF # netcdf file (file) 
     147print 'arraySize: %s' %arraySize  #this is just useful - how big is the data. 
     148}}} 
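The `**spatialSubsetDictionary` argument in the listing above is ordinary Python keyword unpacking: each dictionary key becomes a named argument. The sketch below shows one way a receiving function could turn such (low, high) ranges into axis indices. It is a hypothetical illustration in Python 3, not how `subsetToGridSeries` is implemented; the `subset` and `select_indices` helpers and the axis values are made up.

```python
def select_indices(axis_values, bounds):
    """Return the indices of axis_values inside the inclusive (low, high) bounds."""
    low, high = bounds
    return [i for i, value in enumerate(axis_values) if low <= value <= high]


def subset(axes, **selections):
    """Map each named axis in selections to the indices chosen for it."""
    return {name: select_indices(axes[name], bounds)
            for name, bounds in selections.items()}


# Made-up axis values, for illustration only.
axes = {
    "latitude": [-60.0, -30.0, 0.0, 30.0, 60.0],
    "longitude": [60.0, 90.0, 120.0, 150.0],
}

spatialSubsetDictionary = {}
spatialSubsetDictionary["latitude"] = (-30.0, 30.0)
spatialSubsetDictionary["longitude"] = (90, 120.0)

# Equivalent to subset(axes, latitude=(-30.0, 30.0), longitude=(90, 120.0))
result = subset(axes, **spatialSubsetDictionary)
print(result)
```

Because the axes are passed by name, a feature defined in another dimension only needs one more dictionary entry and no change to the call.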
     149 
     150 
     151But wait, perhaps this didn't work. If you couldn't perform the subset operation, then it is because you need more things installing... 
     152All the parser operations shown so far have just operated on a CSML document and the CSML python objects in memory. However, when you perform a subsetting operation, if the data is stored in real data files, then some I/O operations take place. Typically this means installing the cdms module to read NetCDF, Nappy to read NASA Ames, and potentially other modules to read other file formats. I should probably write more on this, but installation is an area that's going to change radically so I won't for now. 
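A quick way to see whether the I/O modules are available before attempting a subset is to ask Python whether it can find them. This is a minimal Python 3 sketch using the standard library only. The import names `cdms` and `nappy` are assumptions based on the module names mentioned above, and `find_missing_modules` is a hypothetical helper, not part of the CSML API.

```python
import importlib.util


def find_missing_modules(names):
    """Return the top-level module names that cannot be found on this system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for the I/O libraries; adjust to match your installation.
missing = find_missing_modules(["cdms", "nappy"])
if missing:
    print("Subsetting may fail; missing modules: %s" % ", ".join(missing))
else:
    print("All I/O modules were found.")
```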
     153 
     154 
     155So, to summarise, you can: 
     156 
     157 * Parse CSML files using the online parser and visualise the content 
     158 * Parse a file using python and interrogate it directly in a fairly long-winded manner. 
     159 * Use the CSML API to parse the file and interrogate it using simple methods. 
     160 * Perform operations on the CSML and underlying data using the CSML API. 
     161 
     162There is another thing you can do and that is to use the parser's toXML() methods to create your CSML. However that is the subject of a [wiki:UsingTheParserToCreateCSML separate how to].