Version 4 (modified by domlowe, 13 years ago)

Note: this how-to is still being edited.

Read methods needed to integrate a new data format into the CSML API

The CSML API uses a unified 'data interface' class (DI) to read various data formats.

The basic read methods that need implementing for a new data format are:

  • DI.openFile(self, fileName) --- opens the file
  • DI.setAxis(self, axisName) --- 'sets' the axis you want to read (axis: e.g. latitude, time, pressure, depth, etc.)
  • DI.getDataForAxis(self) --- returns the entire set of values for that axis
  • DI.setVariable(self, variableName) --- 'sets' the variable you want to read (variable: e.g. Temperature, WindSpeed, etc.)
  • DI.getDataForVar(self) --- returns the entire set of values for that variable
  • DI.getSubsetOfDataForVar(self, **kwargs) --- returns a subset of values for that variable
  • DI.closeFile(self) --- closes the open file
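
Seen from the calling side, the methods are used in a fixed order: open, set, get, close. The following is a minimal sketch of that order, not CSML code. The helper read_axis_and_variable and the class StandInDI are hypothetical; StandInDI keeps its data in a dict so the sequence can be run without any file format library, and the method name getDataForVar follows the class skeleton further down this page.

```python
def read_axis_and_variable(di, fileName, axisName, variableName):
    """Read one axis and one variable through any data interface object."""
    di.openFile(fileName)
    try:
        di.setAxis(axisName)                # choose the axis...
        axisValues = di.getDataForAxis()    # ...then read all of its values
        di.setVariable(variableName)        # choose the variable...
        varValues = di.getDataForVar()      # ...then read all of its values
    finally:
        di.closeFile()                      # always release the file
    return axisValues, varValues


class StandInDI:
    """Hypothetical stand-in for a format-specific data interface."""

    def openFile(self, fileName):
        # A real interface would open fileName; this stand-in fakes the contents.
        self.file = {
            'axes': {'latitude': [0.0, 10.0, 20.0]},
            'variables': {'Temperature': [280.0, 285.0, 290.0]},
        }

    def setAxis(self, axisName):
        self.axis = axisName

    def getDataForAxis(self):
        return self.file['axes'][self.axis]

    def setVariable(self, variableName):
        self.variable = variableName

    def getDataForVar(self):
        return self.file['variables'][self.variable]

    def closeFile(self):
        self.file = None


di = StandInDI()
latitudes, temperatures = read_axis_and_variable(
    di, 'example.xyz', 'latitude', 'Temperature')
```

Because the set and get calls are separate, the interface object has to remember which axis or variable was last set; that stored state is what each format-specific class needs to manage.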

When the CSML API instantiates a DataInterface object (from now on, DI), what is actually returned is a data interface specific to the data format.

In the DataInterface class there is a bit of Python code that does something like this:

                if self.iface == 'nappy':
                        return NappyInterface()
                elif self.iface == 'cdunif':
                        return cdunifInterface()

So if you want to integrate your format, XYZFormat, the first thing to do is to create an XYZInterface class. The dispatch code can then become:

                if self.iface == 'nappy':
                        return NappyInterface()
                elif self.iface == 'cdunif':
                        return cdunifInterface()
                elif self.iface == 'XYZ':
                        return XYZInterface()
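
The same dispatch can also be written as a lookup table, so that adding a format means adding one entry rather than another elif branch. This is only a sketch of that alternative, not how CSML does it: the three classes are empty placeholders, and INTERFACES and get_interface are hypothetical names.

```python
class NappyInterface:
    """Placeholder standing in for the real NappyInterface."""


class cdunifInterface:
    """Placeholder standing in for the real cdunifInterface."""


class XYZInterface:
    """Placeholder for the interface of the new format."""


# Hypothetical registry: iface string -> interface class
INTERFACES = {
    'nappy': NappyInterface,
    'cdunif': cdunifInterface,
    'XYZ': XYZInterface,
}


def get_interface(iface):
    """Return a new format-specific interface for the given iface name."""
    try:
        interface_class = INTERFACES[iface]
    except KeyError:
        raise ValueError('unknown data interface: %r' % iface)
    return interface_class()


xyz = get_interface('XYZ')
```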

So (in Python) you should create a class that looks like this:

class XYZInterface(AbstractDI):
    #Data Interface for XYZ File format
    #each method raises NotImplementedError until you fill it in

    def __init__(self):
        #this might change when CSML is revamped
        self.extractPrefix = '_XYZextract_'

    def openFile(self, filename):
        #some code to open the file
        raise NotImplementedError

    def setAxis(self,axis):
        #some code to set an axis to be queried, may not need to do much, depending on your format
        raise NotImplementedError

    def getDataForAxis(self):
        #some code to return the values for an axis
        raise NotImplementedError

    def setVariable(self,varname):
        #some code to set a variable to be queried, may not need to do much, depending on your format
        raise NotImplementedError

    def getDataForVar(self):
        #some code to return all values for a variable
        raise NotImplementedError

    def getSubsetOfDataForVar(self, **kwargs):
        #takes keyword args defining subset eg
        #subset=getSubsetOfDataForVar(latitude=(0.,10.0), longitude=(90, 100.0), ...)
        #and returns a subset of data for that variable
        raise NotImplementedError

    def closeFile(self):
        #some code to close the file
        raise NotImplementedError
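
To make the skeleton concrete, here is a minimal sketch of a filled-in interface for an invented format. Everything specific to it is an assumption made for illustration: the 'XYZ' file is taken to be JSON with "axes" and "variables" entries, variables are one-dimensional, subset ranges are treated as inclusive, and the class does not subclass AbstractDI so that the example runs on its own.

```python
import json
import os
import tempfile


class XYZInterface:
    # In CSML this would subclass AbstractDI; omitted so the sketch runs alone.

    def __init__(self):
        self.extractPrefix = '_XYZextract_'
        self.file = None

    def openFile(self, filename):
        # Assumed layout: {"axes": {name: [values]}, "variables": {name: [values]}}
        with open(filename) as f:
            self.file = json.load(f)

    def setAxis(self, axis):
        self.axis = axis

    def getDataForAxis(self):
        return self.file['axes'][self.axis]

    def setVariable(self, varname):
        self.varname = varname

    def getDataForVar(self):
        return self.file['variables'][self.varname]

    def getSubsetOfDataForVar(self, **kwargs):
        # Each keyword names an axis and gives a (low, high) range.
        # Simplification: one axis only, bounds inclusive.
        if len(kwargs) != 1:
            raise ValueError('this sketch supports exactly one axis range')
        axisname, (low, high) = next(iter(kwargs.items()))
        axisvalues = self.file['axes'][axisname]
        values = self.getDataForVar()
        return [v for a, v in zip(axisvalues, values) if low <= a <= high]

    def closeFile(self):
        self.file = None


# Write a small example file, then read a subset back through the interface.
path = os.path.join(tempfile.mkdtemp(), 'example.xyz')
with open(path, 'w') as f:
    json.dump({'axes': {'latitude': [0.0, 10.0, 20.0, 30.0]},
               'variables': {'Temperature': [280.0, 285.0, 290.0, 295.0]}}, f)

di = XYZInterface()
di.openFile(path)
di.setVariable('Temperature')
subset = di.getSubsetOfDataForVar(latitude=(5.0, 25.0))
di.closeFile()
```

A real format with multi-dimensional variables would need to apply one range per axis, which is why the method takes arbitrary keyword arguments.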

The clearest way to explain this is to show how the interface differs between the cdms/cdunif and NAPPY data interfaces:

Example Data Interfaces

First, the openFile method. This is straightforward: open the file and assign the open file object to self.file.

CDMS: openFile

        def openFile(self, filename):

NAPPY: openFile

        def openFile(self, filename):

The setAxis method differs between the two interfaces. The cdunif method is straightforward: it grabs an axis object directly from the file. The NAPPY method instead stores the name of the axis in a variable called self.axisstub for later reference; to do this it has to get all the axes and then strip the units from them. The details are confusing, but the basic idea is to store 'something' that gives you a handle back to the axis. That something is internal to the XYZInterface class.
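
The two ways of keeping a handle can be sketched side by side. This is an illustration of the idea only, not the cdunif or NAPPY code: both classes are hypothetical, the "file" is a plain dict of axis label to values, and what exactly self.axisstub holds in the real NAPPY interface is not shown here, so storing the full label is an assumption.

```python
class ObjectHandleInterface:
    """Keeps the axis object itself as the handle."""

    def __init__(self, axes):
        self.axes = axes            # hypothetical: axis name -> values

    def setAxis(self, axis):
        self.axisobj = self.axes[axis]

    def getDataForAxis(self):
        return self.axisobj


class NameHandleInterface:
    """Keeps a label as the handle and looks the axis up again later."""

    def __init__(self, axes):
        self.axes = axes            # labels carry units, e.g. 'Altitude (km)'

    def setAxis(self, axis):
        # Find the axis whose label, minus its units, matches the requested name.
        for label in self.axes:
            if label.split('(')[0].strip() == axis:
                self.axisstub = label
                return
        raise KeyError(axis)

    def getDataForAxis(self):
        return self.axes[self.axisstub]


by_object = ObjectHandleInterface({'latitude': [0.0, 10.0]})
by_object.setAxis('latitude')

by_name = NameHandleInterface({'Altitude (km)': [1.0, 2.0],
                               'Latitude (degrees)': [50.0]})
by_name.setAxis('Altitude')
```

Either way, callers only ever see setAxis followed by getDataForAxis; the kind of handle stored in between is a private choice of the interface class.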

CDMS: setAxis

    def setAxis(self,axis):

NAPPY: setAxis

        def __stripunits(self,listtostrip):
                #strips units of measure from list
                #eg ['Universal time (hours)', 'Altitude (km)', 'Latitude (degrees)', 'Longitude (degrees)']
                #becomes ['Universal time', 'Altitude', 'Latitude', 'Longitude']
                cleanlist = []
                for item in listtostrip:
                        #position of the first opening bracket, or -1 if there is none
                        openbracket = item.find('(')
                        if openbracket != -1:
                                #if brackets exist, strip units.
                                item = item[:openbracket].strip()
                        cleanlist.append(item)
                return cleanlist

        def __getListOfAxes(self):
                return axes

        def setAxis(self,axis):
                axes = self.__getListOfAxes()

CDMS: getDataForAxis

NAPPY: getDataForAxis

CDMS: setVariable

NAPPY: setVariable

CDMS: getDataForVar

NAPPY: getDataForVar

CDMS: getSubsetOfDataForVar

NAPPY: getSubsetOfDataForVar

CDMS: closeFile

NAPPY: closeFile