source: TI01-discovery/branches/ingestAutomation-upgrade/OAIBatch/Utilities.py @ 4854

Subversion URL: http://proj.badc.rl.ac.uk/svn/ndg/TI01-discovery/branches/ingestAutomation-upgrade/OAIBatch/Utilities.py@4854
Revision 4854, 803 bytes checked in by cbyrom, 11 years ago

Add new ingest script to allow ingest of DIF docs from an eXist-hosted
atom feed. NB: this required restructuring the original OAI harvester
so that shared code can be reused, by abstracting it out into a new class,
abstractdocumentingester.

Add new documentation and tidy up the codebase, removing dependencies where possible to simplify things.

from xml.etree import ElementTree as ET
import logging
from ndg.common.src.lib.ETxmlView import loadET, nsdumb

def idget(xml, dataType='DIF'):
    ''' Given an xml document (string), parse it using ElementTree and
    find the identifier within it. Supports dataTypes of 'DIF' and 'MDIP'.
    '''
    et = loadET(xml)
    # helper used for the element lookups below
    helper = nsdumb(et)
    if dataType == 'DIF':
        # DIF records carry the identifier in Entry_ID
        return helper.getText(et, 'Entry_ID')
    elif dataType == 'MDIP':
        # MDIP records carry the identifier in DatasetIdentifier
        return helper.getText(et, 'DatasetIdentifier')
    else:
        raise TypeError('idget does not support datatype [%s]' % dataType)

import unittest

class TestCase(unittest.TestCase):
    """ Tests as required """

    def setUp(self):
        # Assumption: self.difxml is not defined anywhere in this revision.
        # This minimal stand-in is hypothetical; substitute a real DIF record.
        self.difxml = '<DIF><Entry_ID>NOCSDAT192</Entry_ID></DIF>'

    def testidget(self):
        self.assertEqual(idget(self.difxml), 'NOCSDAT192')

if __name__ == "__main__":
    unittest.main()
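The lookup described in the idget docstring can be illustrated without the ndg.common helpers. The following is only a minimal sketch using the standard library: the names find_identifier, local_name and ID_ELEMENTS are hypothetical and not part of this codebase, the sample namespace URI is a placeholder, and matching on the local tag name is an assumed stand-in for whatever nsdumb does internally.

```python
from xml.etree import ElementTree as ET

# Hypothetical mapping from data type to the element holding the identifier
# (element names taken from idget; the mapping itself is an assumption).
ID_ELEMENTS = {'DIF': 'Entry_ID', 'MDIP': 'DatasetIdentifier'}


def local_name(tag):
    """Strip a leading '{namespace}' from an ElementTree tag."""
    return tag.rsplit('}', 1)[-1]


def find_identifier(xml, data_type='DIF'):
    """Return the identifier text from an XML string, ignoring namespaces.

    Returns None when the expected element is absent.
    """
    if data_type not in ID_ELEMENTS:
        raise TypeError('find_identifier does not support datatype [%s]' % data_type)
    wanted = ID_ELEMENTS[data_type]
    root = ET.fromstring(xml)
    for element in root.iter():
        # Skip anything whose tag is not a plain string.
        if isinstance(element.tag, str) and local_name(element.tag) == wanted:
            return (element.text or '').strip()
    return None


if __name__ == '__main__':
    # Placeholder document; the namespace URI is illustrative only.
    sample = '<DIF xmlns="http://example.org/dif"><Entry_ID>NOCSDAT192</Entry_ID></DIF>'
    print(find_identifier(sample))
```

Matching on the local name keeps the sketch independent of which namespace (if any) a given record declares, at the cost of ignoring namespaces entirely; whether that is acceptable depends on the records being ingested.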