Changeset 220


Timestamp: 22/09/14 13:29:50 (5 years ago)
Author: mjuckes
Message: work on mip table qc
Location: CCCC/trunk/ceda_cc
Files: 3 edited

  • CCCC/trunk/ceda_cc/config/testStandardNames.txt

    r211 r220  
    2121=global_ncattribute_cv= 
    2222*Global NetCDF attribute controlled vocabularies 
    23 Verify that NetCDF global attributes are consistent with controlled vocabulary constraints. 
     23Verify that NetCDF global attribute values are consistent with controlled vocabulary constraints. 
    2424 
    2525=filename_filemetadata_consistency= 
     
    3333=parse_filename= 
    3434*Parse the file name into component elements 
    35 File names will generally consist of a sequence of elements separated by a special character. This test checks that the correct number of elements are present. 
     35File names will generally consist of a sequence of elements separated by a special character. This test checks that the correct number of elements are present (or that the number of elements is in the correct range). 
    3636 
    3737=parse_filename_timerange= 
    3838*Parse the time range specified in the file name 
    39 If the file name contains a time range, this test will check that the given element has the correct syntax. 
     39If the file name contains a time range, this test will check that the given element has the correct syntax (usually "start-end", where "start" and "end" are strings such as "19900101" or "199001"). 
    4040 
    4141=filename_timerange_length= 
    4242*Verify the number of characters used to specify the time range 
    43 Verify the number of characters used to specify the time range 
     43Verify that the number of characters used to specify the time range fits the requirements. This will be 6 if the time range is specified to the nearest month, 8 if it is specified to the nearest day. 
    4444 
    4545=time_attributes= 
    4646*Verify that the time variable is present (if needed) with the appropriate attributes 
    47 Check that a "time" variable is present and has appopriate attributes, including units and bounds. 
     47Check that a "time" variable is present and has appropriate attributes, including units and, if required, bounds. The bounds attribute is required if the "cell_methods" attribute on the data variable specifies that the data is not instantaneous. 
    4848 
    4949=pressure_levels= 
    5050*Check attributes, bounds and values of pressure levels 
    51 Check attributes, bounds and values of pressure levels 
     51Check attributes, bounds and values of pressure levels. Where data is interpolated to pressure levels, the MIP data request generally defines the levels required. 
    5252 
    5353=height_levels= 
    5454*Check attributes, bounds and values of height levels 
    55 Check properties of the vertical coordinate 
     55Check properties of the height vertical coordinate 
    5656 
    5757=grid_mapping= 
    5858*Check the grid_mapping attributes 
    59 Check the attributes specifying the grid in the grid_mappign variable 
     59Check the attributes specifying the grid in the grid_mapping variable. The usage of the "grid_mapping" variable is defined in the NetCDF CF Convention. 
    6060 
    6161=rotated_latlon_attributes= 
    6262*Check the attributes of the rotated latitude and longitude coordinate variables 
    63 Check the attributes of the rotated latitude and longitude coordinate variables 
     63Check the attributes of the rotated latitude and longitude coordinate variables. This test checks variable attributes (e.g. long_name, standard_name, units, and axis attributes) and the type of the variable. 
    6464 
    6565=rotated_latlon_domain= 
    6666*Check the domain specified by rotated latitude and longitude coordinate variables 
    67 Check the domain specified by rotated latitude and longitude coordinate variables 
     67Check the domain specified by rotated latitude and longitude coordinate variables. There may be some tolerance specified, rather than requiring an exact match. 
    6868 
    6969=regular_grid_attributes= 
    7070*Check the attributes of the latitude and longitude coordinate variables 
    71 Check the attributes of the latitude and longitude coordinate variables 
     71Check the attributes of the latitude and longitude coordinate variables. 
    7272 
    7373=regular_grid_domain= 
    7474*Check the domain specified by latitude and longitude coordinate variables 
    75 Check the domain specified by latitude and longitude coordinate variables 
     75Check the domain specified by latitude and longitude coordinate variables. This may also include a check on the grid spacing. 
    7676 
    7777=filename_timerange_value= 
    7878*Check the time range specified in the file name 
    79 Check the time range specified in the file name 
     79Check that the time range specified in the file name is consistent with the data request (e.g. some variables should be in blocks of 10 years, starting on January 1st in the 1st year of a decade). 
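
The parse_filename_timerange and filename_timerange_length descriptions above can be illustrated with a minimal sketch. It is written for Python 3 and is not taken from ceda_cc: the function name `check_timerange` and the frequency labels "mon" and "day" are hypothetical, and only the "start-end" syntax and the 6/8 character rule come from the test descriptions.

```python
import re

# Hypothetical pattern: two runs of digits separated by a hyphen ("start-end").
TIMERANGE_RE = re.compile(r"^(\d+)-(\d+)$")

# Assumed frequency labels; the lengths 6 (month) and 8 (day) are from the text above.
EXPECTED_LENGTH = {"mon": 6, "day": 8}


def check_timerange(element, frequency):
    """Return (ok, message) for a file name time range element."""
    m = TIMERANGE_RE.match(element)
    if m is None:
        return False, "time range is not of the form start-end"
    start, end = m.groups()
    expected = EXPECTED_LENGTH.get(frequency)
    if expected is None:
        return False, "no length rule known for frequency %r" % frequency
    if len(start) != expected or len(end) != expected:
        return False, "expected %d digits per date, got %d and %d" % (
            expected, len(start), len(end))
    return True, "ok"
```

For example, `check_timerange("199001-199912", "mon")` passes, while the same element checked against "day" fails the length rule.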
  • CCCC/trunk/ceda_cc/extractMipInfo.py

    r217 r220  
    11 
    2 import collections, glob, string 
    3 from fcc_utils2 import mipTableScan, snlist 
     2import collections, glob, string, re 
     3from fcc_utils2 import mipTableScan, snlist, tupsort 
    44from config_c4 import CC_CONFIG_DIR 
    55 
     
    99snl, snla = snc.gen_sn_list( ) 
    1010NT_mip = collections.namedtuple( 'mip',['label','dir','pattern'] ) 
     11NT_canvari = collections.namedtuple( 'canonicalVariation',['conditions','text', 'ref'] ) 
    1112vlist = [ 
    1213('uas', 
     
    7677'INCONSISTENT LONG NAMES' ) ] 
    7778 
     79 
     80class helper: 
     81 
     82  def __init__(self): 
     83    self.applycv = True 
     84    self.re1 = re.compile( '"(.*)"=="(.*)"' ) 
     85 
     86    self.cmip5Tables= ['CMIP5_3hr', 'CMIP5_6hrPlev', 'CMIP5_Amon', 'CMIP5_cfDay', 'CMIP5_cfOff', 'CMIP5_day', 'CMIP5_grids', 'CMIP5_Lmon', 'CMIP5_OImon', 'CMIP5_Oyr', 
     87  'CMIP5_6hrLev', 'CMIP5_aero', 'CMIP5_cf3hr', 'CMIP5_cfMon', 'CMIP5_cfSites', 'CMIP5_fx', 'CMIP5_LImon', 'CMIP5_Oclim', 'CMIP5_Omon' ] 
     88    self.cmip5DefPoint = ['CMIP5_3hr', 'CMIP5_6hrPlev', 'CMIP5_cfOff', 'CMIP5_6hrLev', 'CMIP5_cf3hr', 'CMIP5_cfSites' ] 
     89 
     90    self.canonvar = [ NT_canvari( (('table','CMIP5_3hr'),), 'This is sampled synoptically.', '' ), 
     91                      NT_canvari( (), 'The flux is computed as the mass divided by the area of the grid cell.', 'This is calculated as the convective mass flux divided by the area of the whole grid cell (not just the area of the cloud).' ), 
     92            ] 
     93 
     94    self.canonvar = [] 
     95    for l in open( 'canonicalVariations.txt' ).readlines(): 
     96      if l[0] != '#': 
     97        ix = l.index(':') 
     98        s = string.strip( l[ix:] ) 
     99        r = self.re1.findall( s ) 
     100        assert len(r) == 1, 'Cannot parse: %s' % s 
     101        self.canonvar.append( NT_canvari( (), r[0][0], r[0][1] ) ) 
     102 
     103  def match(self,a,b): 
     104      if type(a) == type( 'X' ) and type(b) == type( 'X' ): 
     105        a0,b0 = map( lambda x: string.replace(x, '__ABSENT__',''), [a,b] ) 
     106        return string.strip( string.replace(a0, '  ', ' '), '"') == string.strip( string.replace(b0, '  ', ' '), '"') 
     107      else: 
     108        return a == b 
     109 
     110  def checkCond( self, table, var, conditions ): 
     111    val = True 
     112    for ck, cv in conditions: 
     113      if ck == 'table': 
     114        val &= table == cv 
     115      elif ck == 'var': 
     116        val &= var == cv 
     117 
     118    return val 
     119         
     120       
     121 
    78122class snsub: 
    79123 
     
    98142snsubber = snsub() 
    99143 
    100 mips = ( NT_mip( 'cmip5','cmip5_vocabs/mip/', 'CMIP5_*' ), 
    101          NT_mip( 'ccmi', 'ccmi_vocabs/mip/', 'CCMI1_*')  ) 
    102 mips = ( NT_mip( 'cmip5','cmip5_vocabs/mip/', 'CMIP5_*' ), 
    103           ) 
    104144 
    105145cmip5_ignore = ['pfull','phalf','depth','depth_c','eta','nsigma','vertices_latitude','vertices_longitude','ztop','ptop','p0','z1','z2','href','k_c','a','a_bnds','ap','ap_bnds','b','b_bnds','sigma','sigma_bnds','zlev','zlev_bnds','zfull','zhalf'] 
     
    107147class mipCo: 
    108148 
    109   def __init__(self,mips): 
     149  def __init__(self,mips,helper=None): 
    110150    self.vl0 = [] 
    111151    self.tl = [] 
    112152    self.td = {} 
     153    self.helper = helper 
    113154    for mip in mips: 
    114155      self._scan(mip) 
     
    166207         if att == '__dimensions__': 
    167208           atl = map( lambda x: string.join( td[x][v][0] ), l ) 
    168            print '#######', v,l,atl 
    169209         else: 
    170210           atl = map( lambda x: td[x][v][1].get(att,'__ABSENT__'), l ) 
     
    251291    
    252292class typecheck1: 
    253   def __init__( self, m, thisatts): 
     293  def __init__( self, m, thisatts,helper=None): 
    254294    self.type2Atts = ['positive','comment', 'long_name', 'modeling_realm', 'out_name', 'standard_name', 'type', 'units', 'flag_meanings', 'flag_values'] 
    255     self.type3Atts = ['positive','modeling_realm', 'out_name', 'standard_name', 'type', 'units', 'flag_meanings', 'flag_values'] 
     295    self.type3Atts = ['positive','long_name','modeling_realm', 'out_name', 'standard_name', 'type', 'units', 'flag_meanings', 'flag_values'] 
    256296    self.type4Atts = ['positive','modeling_realm', 'out_name', 'standard_name', 'type', 'units', 'flag_meanings', 'flag_values'] 
     297    self.type2Atts = ['positive','comment', 'long_name', 'modeling_realm', 'out_name', 'standard_name', 'type', 'units'] 
     298    self.type3Atts = ['positive','long_name','modeling_realm', 'out_name', 'standard_name', 'type', 'units'] 
     299    self.type4Atts = ['positive','modeling_realm', 'out_name', 'standard_name', 'type', 'units'] 
    257300    self.m = m 
    258301    vars = m.vars 
    259302    vdict = m.vdict 
     303    self.helper=helper 
    260304    td = m.td 
    261305    vd2 = {} 
     
    270314       for att in thisatts: 
    271315         if att == '__dimensions__': 
    272            atl = map( lambda x: string.join( td[x][v][0] ), l ) 
    273            print '#######', v,l,atl 
     316           atl = map( lambda x: (string.join( td[x][v][0] ),x), l ) 
    274317         else: 
    275            atl = map( lambda x: td[x][v][1].get(att,'__ABSENT__'), l ) 
    276          atl.sort() 
    277          av = [atl[0],] 
    278          for a in atl[1:]: 
     318           atl = map( lambda x: (td[x][v][1].get(att,'__ABSENT__'),x), l ) 
     319         atl.sort( tupsort(0).cmp ) 
     320         a0 = atl[0][0] 
     321         if a0 == None: 
     322           a0 = "" 
     323         av = [a0,] 
     324         for a,tab in atl[1:]: 
     325           if a == None: 
     326             a = "" 
    279327           if a != av[-1]: 
    280              av.append(a) 
     328             if self.helper != None and self.helper.applycv: 
     329               thisok=False 
     330               pmatch = False 
     331               for cond,src,targ in self.helper.canonvar: 
     332                 if string.find(a,src) != -1 or  string.find(av[-1],src) != -1: 
     333                   ##print 'Potential match ---- ',a 
     334                   ##print src,'###',targ 
     335                   ##print av[-1] 
     336                   pmatch = True 
     337                 if self.helper.checkCond( tab, v, cond ): 
     338                   if self.helper.match(string.replace( a, src, targ ), av[-1]) or self.helper.match(string.replace( av[-1], src, targ ), a): 
     339                     thisok = True 
     340               if thisok: 
     341                 print '############### conditional match found', tab, v 
     342               else: 
     343                 if pmatch: 
     344                   ##print '########### no matvh found' 
     345                   pass 
     346                 av.append(a) 
     347             else: 
     348               av.append(a) 
    281349         adict[att] = av 
     350         if v == "snd": 
     351           print adict 
    282352        
    283353## check for type 2 
     
    288358       elif all( map( lambda x: len(adict[x]) == 1, self.type3Atts )): 
    289359           tval = 3 
     360       elif all( map( lambda x: len(adict[x]) == 1, self.type4Atts )): 
     361           tval = 4 
    290362       else: 
    291363           l = map( lambda x: '%s:%s, ' % (x,len(adict[x]) ), self.type2Atts ) 
     
    295367       elif tval == 3: 
    296368         type3.append( v) 
     369       elif tval == 4: 
     370         type4.append( v) 
    297371       else: 
    298          type4.append(v) 
     372         type5.append(v) 
    299373    xx = float( len(vars) ) 
    300     print string.join( map( lambda x: '%s (%5.1f%%);' % (x,x/xx*100), [len(type1), len(type2), len(type3), len(type4)] ) ) 
     374    print string.join( map( lambda x: '%s (%5.1f%%);' % (x,x/xx*100), [len(type1), len(type2), len(type3), len(type4), len(type5)] ) ) 
    301375    self.type1 = type1 
    302376    self.type2 = type2 
    303377    self.type3 = type3 
    304378    self.type4 = type4 
     379    self.type5 = type5 
    305380 
    306381  def exportHtml( self, typecode ): 
     
    324399""" 
    325400    fixedType3TmplB = "<li>%s [%s]: %s: %s [%s]</li>\n" 
     401    fixedType4TmplB = "<li>%s [%s]: %s [%s]</li>\n" 
     402    fixedType5TmplA = """ [%(units)s]</h3> 
     403       out_name: %(out_name)s; type: %(type)s <br/> 
     404""" 
     405    fixedType5TmplB = "<li>%s [%s]: %s, %s [%s]: %s</li>\n" 
    326406         
    327407    if typecode == 1: 
     
    367447      oo.close() 
    368448            
    369     elif typecode == 3: 
    370       oo = open( 'type3.html', 'w' ) 
    371       self.type3.sort() 
    372       oo.write( '<h2>Variables with varying long_name/comment</h2>\n' ) 
    373       for v in self.type3: 
     449    elif typecode in [3,4,5]: 
     450      oo = open( 'type%s.html' % typecode, 'w' ) 
     451      thistype,h2,al,tmplA,tmplB = { 3:(self.type3,"Variables with varying comment",['long_name','comment','cell_methods'], fixedType3TmplA, fixedType3TmplB), 
     452                      4:(self.type4,"Variables with varying long_name",['long_name','cell_methods'],fixedType3TmplA, fixedType4TmplB), 
     453                      5:(self.type5,"Remaining variables",['standard_name','long_name','cell_methods','realm'],fixedType5TmplA, fixedType5TmplB) }[typecode] 
     454      thistype.sort() 
     455      oo.write( '<h2>%s</h2>\n' % h2 ) 
     456      for v in thistype: 
    374457            l = self.m.vdict[v] 
    375458            etmp = {} 
    376459            for a in allAtts: 
    377460                etmp[a] = self.m.td[l[0]][v][1].get( a, 'unset' ) 
    378             oo.write( '<h3>' + v + (fixedType3TmplA % etmp) ) 
     461            oo.write( '<h3>' + v + (tmplA % etmp) ) 
    379462            oo.write( '<ul>\n' ) 
    380463            for t in l: 
    381464              dims = string.join( self.m.td[t][v][0] ) 
    382               sa = tuple( [t,dims,] + map( lambda x: self.m.td[t][v][1].get( x, 'unset' ), ['long_name','comment','cell_methods'] ) ) 
    383               oo.write( fixedType3TmplB % sa ) 
     465              sa = tuple( [t,dims,] + map( lambda x: self.m.td[t][v][1].get( x, 'unset' ), al ) ) 
     466              oo.write( tmplB % sa ) 
    384467            oo.write( '</ul>\n' ) 
    385468      oo.close() 
    386469            
     470mips = ( NT_mip( 'cmip5','cmip5_vocabs/mip/', 'CMIP5_*' ), 
     471         NT_mip( 'ccmi', 'ccmi_vocabs/mip/', 'CCMI1_*')  ) 
     472mips = ( NT_mip( 'cmip5','cmip5_vocabs/mip/', 'CMIP5_*' ), ) 
     473mips = ( NT_mip( 'ccmi', 'ccmi_vocabs/mip/', 'CCMI1_*'),  ) 
    387474m = mipCo( mips )   
     475h = helper() 
    388476 
    389477allatts = ms.al 
     
    393481  if a not in thisatts: 
    394482    thisatts.append(a) 
    395 s =typecheck1( m, thisatts) 
     483s =typecheck1( m, thisatts, helper=h) 
    396484s.exportHtml( 1 ) 
    397485s.exportHtml( 2 ) 
    398486s.exportHtml( 3 ) 
     487s.exportHtml( 4 ) 
     488s.exportHtml( 5 ) 
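
The canonical variation handling added to extractMipInfo.py (the `helper` class and its use in `typecheck1`) can be sketched in Python 3 as follows. This is an illustration under assumptions, not the committed code: the layout of canonicalVariations.txt is inferred from the `l.index(':')` call and the regular expression in `helper.__init__`, the function names are hypothetical, and blank lines are skipped here although the original does not handle them.

```python
import re

# Same pattern as helper.re1 in the changeset: "source"=="target"
PAIR_RE = re.compile(r'"(.*)"=="(.*)"')


def parse_variations(lines):
    """Return (source, target) pairs from lines of the assumed form label: "src"=="targ"."""
    pairs = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        _, _, rest = line.partition(":")
        found = PAIR_RE.findall(rest.strip())
        if len(found) != 1:
            raise ValueError("Cannot parse: %s" % rest.strip())
        pairs.append(found[0])
    return pairs


def normalise(text):
    """Mimic helper.match: drop __ABSENT__, collapse double spaces, strip quotes."""
    return text.replace("__ABSENT__", "").replace("  ", " ").strip('"')


def equivalent(a, b, pairs):
    """True if a and b agree directly or after substituting one canonical variation."""
    if normalise(a) == normalise(b):
        return True
    for src, targ in pairs:
        if normalise(a.replace(src, targ)) == normalise(b):
            return True
        if normalise(b.replace(src, targ)) == normalise(a):
            return True
    return False
```

Unlike the changeset, this sketch ignores the per-table and per-variable conditions tested by `checkCond`; every entry loaded from the file has an empty condition tuple in the committed code, so those conditions always pass there.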
  • CCCC/trunk/ceda_cc/summary.py

    r212 r220  
    11 
    22import string, sys, glob, os 
     3import collections 
    34 
    45HERE = os.path.dirname(__file__) 
     
    67  HERE = '.' 
    78print '############################ %s' % HERE 
     9 
     10NT_esn = collections.namedtuple( 'errorShortName', ['name', 'long_name', 'description' ] ) 
     11class errorShortNames(object): 
     12 
     13  def __init__(self,file='config/testStandardNames.txt' ): 
     14    assert os.path.isfile(file), 'File %s not found' % file 
     15    ii = map( string.strip, open(file).readlines() ) 
     16    ll = [[ii[0],]] 
     17    for l in ii[1:]: 
     18      if len(l) > 0 and l[0] == '=': 
     19        ll.append( [l,] ) 
     20      else: 
     21        ll[-1].append( l ) 
     22    self.ll = [] 
     23    for l in ll: 
     24      if len(l) < 2: 
     25        print l 
     26      else: 
     27        self.ll.append( NT_esn( string.strip(l[0],'='), l[1][1:], string.join(l[2:]) ) ) 
    828 
    929def cmin(x,y): 
     
    109129    self.testnames() 
    110130    if dohtml: 
     131      self.htmlEsn( ) 
    111132      self.htmlout( ee, ff, esum ) 
    112133 
     
    137158    print s 
    138159 
     160  def htmlEsn( self ): 
     161    esn = errorShortNames() 
     162    cnt = '<h1>Error Short Names</h1>\n' 
     163    for l in esn.ll: 
     164      cnt += '''<a name="%s"><h2>%s</h2></a> 
     165            <p><i>%s</i><br/> 
     166             %s 
     167             </p> 
     168             ''' % (l.name,l.name, l.long_name, l.description ) 
     169     
     170    self.__htmlPageWrite( 'html/ref/errorShortNames.html', cnt ) 
     171 
    139172  def htmlout( self, ee, ff, esum ): 
    140173    if not os.path.isdir( 'html' ): 
    141174      os.mkdir( 'html' ) 
     175      os.mkdir( 'html/ref' ) 
    142176      os.mkdir( 'html/files' ) 
    143177      os.mkdir( 'html/errors' ) 
     
    184218      ks = ee[k][1].keys() 
    185219      ks.sort() 
     220      sect_esn = None 
    186221      for k2 in ks: 
    187222        nn += 1 
     223        this_esn = string.split(k2,']')[0][1:] 
     224        if this_esn != sect_esn: 
     225          sect_esn = this_esn 
     226          list.append( '<h2>%s: %s<a href="../ref/errorShortNames.html#%s">(definition)</a></h2>' % (k,this_esn, this_esn) ) 
    188227        list.append( eItemTmpl % (nn,k, ee[k][1][k2][0], k2  ) ) 
    189228        l2 = [] 
     
    196235        self.__htmlPageWrite( efp, ePage ) 
    197236    eIndexContent = """<h1>List of detected errors</h1> 
    198 Code[number of files with error]: result  
     237<p>Code[number of files with error]: result <br/> 
     238Click on the code to see a list of the files in which each error is detected. 
     239</p> 
    199240<ul>%s</ul> 
    200241"""  % (string.join(list, '\n' ) ) 
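
The record format read by `errorShortNames` in summary.py (a `=name=` line, a `*long name` line, then description text, as seen in config/testStandardNames.txt above) can be sketched in Python 3 as follows. The function name `parse_short_names` is hypothetical, and the sketch differs from the committed class in two labelled ways: lines before the first `=name=` header are ignored, and trailing whitespace in the description is stripped.

```python
import collections

ErrorShortName = collections.namedtuple(
    "ErrorShortName", ["name", "long_name", "description"])


def parse_short_names(lines):
    """Group lines into records; a line starting with '=' opens a new record."""
    records = []
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("="):
            current = [line]
            records.append(current)
        elif current is not None:
            current.append(line)

    result = []
    for rec in records:
        if len(rec) < 2:
            continue  # header with no long name; the original prints these instead
        name = rec[0].strip("=")
        long_name = rec[1][1:]  # drop the leading '*'
        description = " ".join(rec[2:]).strip()
        result.append(ErrorShortName(name, long_name, description))
    return result
```

Each resulting tuple carries the three fields that `htmlEsn` writes into html/ref/errorShortNames.html: the short name (also used as the anchor), the long name, and the description.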