source: TI01-discovery/branches/ingestAutomation-upgrade/OAIBatch/ @ 3797

Revision 3797, 1.2 KB, checked in by cbyrom, 12 years ago

Upgraded version of the ingest code branch, including a major refactoring of the
ingest scripts to make them more object-oriented, allowing re-use and simplification
of code, and removal of the reliance on the eXist DB to store data; this will now all
be stored in and looked up from the Postgres DB.

import sys


class SchemaNameSpace:
    '''
    Class to change/correct namespaces to the latest ones used by NDG discovery.
    NB: currently only handles correction of DIF files.
    '''
    def __init__(self, infile, outfile, format):
        '''
        Constructor - with the logic to do the namespace change
        @param infile: file to correct namespaces in
        @param outfile: file to create with the corrected namespaces
        @param format: format of the file being processed; DIF is the only format currently processed
        '''
        self.format = format
        self.ff = open(infile, 'r')
        self.ww = open(outfile, 'w')
        try:
            self.lines = self.ff.readlines()
            for self.line in self.lines:
                # Replace the opening <DIF ...> tag; every other line is copied unchanged
                if self.format == "DIF" and self.line.startswith('<DIF'):
                    print("INFO: changing line for %s. output to %s" % (infile, outfile))
                    self.line = '<DIF xmlns="" xmlns:xsi="">\n'
                self.ww.write(self.line)
        finally:
            # Close both files even if reading or writing fails
            self.ff.close()
            self.ww.close()


if __name__ == "__main__":
    # Expected arguments: infile outfile format
    f = sys.argv[1]
    w = sys.argv[2]
    form = sys.argv[3]
    SchemaNameSpace(f, w, form)
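
The rewrite performed by the constructor can also be sketched as a standalone function, which makes the behaviour easier to exercise without command-line arguments. This is only an illustrative sketch: the names `fix_dif_namespace`, `DIF_ROOT` and `demo`, and the sample file contents, are hypothetical and not part of the repository. The replacement tag is copied exactly as it appears in the listing, with empty namespace values.

```python
import os
import tempfile

# Replacement tag copied as shown in the listing (namespace values are empty there)
DIF_ROOT = '<DIF xmlns="" xmlns:xsi="">\n'


def fix_dif_namespace(infile, outfile, fmt="DIF"):
    """Copy infile to outfile, replacing any line that starts with '<DIF'.

    Hypothetical helper mirroring SchemaNameSpace.__init__; returns the
    number of lines that were replaced.
    """
    changed = 0
    with open(infile, "r") as src, open(outfile, "w") as dst:
        for line in src:
            if fmt == "DIF" and line.startswith("<DIF"):
                line = DIF_ROOT
                changed += 1
            dst.write(line)
    return changed


def demo():
    """Run the helper on a made-up DIF fragment and return (count, output text)."""
    workdir = tempfile.mkdtemp()
    src_path = os.path.join(workdir, "in.xml")
    dst_path = os.path.join(workdir, "out.xml")
    with open(src_path, "w") as f:
        f.write('<DIF xmlns="old">\n<Entry_ID>example</Entry_ID>\n</DIF>\n')
    changed = fix_dif_namespace(src_path, dst_path)
    with open(dst_path) as f:
        return changed, f.read()
```

As in the original class, only a line that begins with `<DIF` is replaced, so a root tag preceded by whitespace or sharing a line with the XML declaration would be copied through unchanged.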