Changeset 5218 for TI01-discovery


Timestamp:
22/04/09 12:18:10
Author:
cbyrom
Message:

Tidy up docs + fix bug with wrongly scoped variable + remove unnecessary
global scoping of another variable.

Location:
TI01-discovery/branches/ingestAutomation-upgrade/OAIBatch
Files:
2 edited

  • TI01-discovery/branches/ingestAutomation-upgrade/OAIBatch/README.txt

--- README.txt (r4854)
+++ README.txt (r5218)
@@ -18,8 +18,8 @@
 The oai_document_ingester script ingests documents obtained by the OAI harvester.
 
-Usage: python oai_document_ingester.py [OPTION] <datacentre>"
- - where:\n   <datacentre> is the data centre to ingest data from; and options are:"
- -v - verbose mode for output logging"
- -d - debug mode for output logging"
+Usage: python oai_document_ingester.py [OPTION] <datacentre>
+ - where:\n   <datacentre> is the data centre to ingest data from; and options are:
+ -v - verbose mode for output logging
+ -d - debug mode for output logging
 
 
@@ -36,15 +36,15 @@
 polls this feed again, it will try to retrieve and ingest the new document.
 
-Usage: python feeddocumentingester.py [OPTION] <feed> [interval=..], [ingestFromDate=..]"
-              [eXistDBHostname=..], [eXistPortNo=..], [dataCentrePoll=..]"
- - where:\n <feed> is the atom feed to ingest data from; options are:"
- -v - verbose mode for output logging"
- -d - debug mode for output logging"
- and keywords are:"
- interval - interval, in seconds, at which to retrieve data from the feed"
- ingestFromDate - date, in format, 'YYYY-MM-DD', from which documents should be ingested - if not set, ingest date is taken as the current time"
- eXistDBHostname - name of eXist DB to retrieve data from - NB, this will likely be where the feed is based, too - default is 'chinook.badc.rl.ac.uk'"
- eXistPortNo - port number used by the eXist DB - defaults to '8080'"
- dataCentrePoll - data centre whose documents should be polled for - e.g 'badc', 'neodc' - if not set, all documents on a feed will be ingested"
+Usage: python feeddocumentingester.py [OPTION] <feed> [interval=..], [ingestFromDate=..]
+              [eXistDBHostname=..], [eXistPortNo=..], [dataCentrePoll=..]
+ - where:\n <feed> is the atom feed to ingest data from; options are:
+ -v - verbose mode for output logging
+ -d - debug mode for output logging
+ and keywords are:
+ interval - interval, in seconds, at which to retrieve data from the feed
+ ingestFromDate - date, in format, 'YYYY-MM-DD', from which documents should be ingested - if not set, ingest date is taken as the current time
+ eXistDBHostname - name of eXist DB to retrieve data from - NB, this will likely be where the feed is based, too - default is 'chinook.badc.rl.ac.uk'
+ eXistPortNo - port number used by the eXist DB - defaults to '8080'
+ dataCentrePoll - data centre whose documents should be polled for - e.g 'badc', 'neodc' - if not set, all documents on a feed will be ingested
 
 NB, the feed URL will typically be pointing at the RESTful interface to an eXist DB which is hosing the feed.
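
For illustration, the `key=value` keywords listed in the README usage text could be parsed along the following lines. This is a minimal sketch and not the parser used by feeddocumentingester.py: the function name `parse_keywords` and the error handling are assumptions. Only the keyword names and the two defaults are taken from the README text.

```python
def parse_keywords(args):
    """Split 'key=value' command-line arguments into a dict of settings.

    Hypothetical helper: the real script's parsing is not shown in this changeset.
    """
    # Defaults quoted in the README usage text.
    settings = {
        "eXistDBHostname": "chinook.badc.rl.ac.uk",
        "eXistPortNo": "8080",
    }
    # Keyword names documented in the README.
    allowed = {"interval", "ingestFromDate", "eXistDBHostname",
               "eXistPortNo", "dataCentrePoll"}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or key not in allowed:
            raise ValueError("Unrecognised keyword argument: %r" % arg)
        settings[key] = value
    return settings


if __name__ == "__main__":
    print(parse_keywords(["interval=60", "dataCentrePoll=badc"]))
```

Values are kept as strings here; converting `interval` or `eXistPortNo` to integers is left to the caller.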
  • TI01-discovery/branches/ingestAutomation-upgrade/OAIBatch/abstractdocumentingester.py

--- abstractdocumentingester.py (r5167)
+++ abstractdocumentingester.py (r5218)
@@ -156,13 +156,13 @@
                 self._NDG_dataProvider = False
 
-                self._datacentre_config_filename = self._base_dir + 'datacentre_config/' + datacentre + "_config.properties"
-                logging.info("Retrieving data from datacentre config file, " + self._datacentre_config_filename)
+                datacentre_config_filename = self._base_dir + 'datacentre_config/' + datacentre + "_config.properties"
+                logging.info("Retrieving data from datacentre config file, " + datacentre_config_filename)
 
                 # Check this file exists; if not, assume an invalid datacentre has been specified
-                if not os.path.isfile(self._datacentre_config_filename):
+                if not os.path.isfile(datacentre_config_filename):
                     sys.exit("ERROR: Could not find the config file; either this doesn't exist or the datacentre " \
-                        "specified (%s) is invalid\n" %self.datacentre)
+                        "specified (%s) is invalid\n" %datacentre)
 
-                datacentre_config_file = open(self._datacentre_config_filename, "r")
+                datacentre_config_file = open(datacentre_config_filename, "r")
 
                 for line in datacentre_config_file.readlines():
@@ -184,5 +184,5 @@
 
                 if self._harvest_home == "":
-                    sys.exit("Failed at getting harvested records directory stage. datacentre config file tried = %s" %self._datacentre_config_filename)
+                    sys.exit("Failed at getting harvested records directory stage. datacentre config file tried = %s" %datacentre_config_filename)
 
                 logging.info("harvested records are in " + self._harvest_home)
@@ -194,10 +194,10 @@
 
                 if self._datacentre_format == "":
-                    sys.exit("Failed at stage: getting datacentre format. datacentre config file tried = %s" %self._datacentre_config_filename)
+                    sys.exit("Failed at stage: getting datacentre format. datacentre config file tried = %s" %datacentre_config_filename)
 
                 logging.info("format being harvested: " + self._datacentre_format)
 
                 if self._datacentre_namespace == "":
-                    sys.exit("Failed at stage: getting datacentre namespace. datacentre config file tried = %s" %self._datacentre_config_filename)
+                    sys.exit("Failed at stage: getting datacentre namespace. datacentre config file tried = %s" %datacentre_config_filename)
 
                 logging.info("datacentre namespace: " + self._datacentre_namespace)
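
The scoping change in this file can be summarised with a small standalone sketch. It assumes a hypothetical helper `find_datacentre_config` and a hypothetical `ConfigError` exception (the real code calls `sys.exit` from inside a method). The point it illustrates is that the config filename is held in a local variable, and that the error message is formatted with the `datacentre` argument rather than an attribute on `self`, which presumably was not set at that point in the original code.

```python
import os


class ConfigError(Exception):
    """Hypothetical exception; the changeset itself calls sys.exit instead."""


def find_datacentre_config(base_dir, datacentre):
    """Return the path of the config file for `datacentre`, or raise ConfigError."""
    # Local variable: the filename is only needed while locating and reading
    # the file, so it does not need to live on the instance.
    config_filename = os.path.join(
        base_dir, "datacentre_config", datacentre + "_config.properties")

    if not os.path.isfile(config_filename):
        # Format the message with the function argument, which is always
        # in scope here, rather than with an instance attribute.
        raise ConfigError(
            "Could not find the config file; either this doesn't exist or "
            "the datacentre specified (%s) is invalid" % datacentre)

    return config_filename
```

Called with a base directory and a data centre name such as 'badc', the helper either returns the path to `badc_config.properties` or raises with the offending name in the message.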