Ticket #482 (closed defect: fixed)
[WG] Upgrade BADC production of DIFs
Reported by: | selatham | Owned by: | selatham |
---|---|---|---|
Priority: | blocker | Milestone: | PROD |
Component: | community | Version: | |
Keywords: | | Cc: | |
Change History
comment:1 Changed 14 years ago by selatham
- Status changed from new to assigned
- Description modified (diff)
comment:2 Changed 14 years ago by selatham
- Owner changed from selatham to ko23
- Status changed from assigned to new
Run latest bulkdestubb.jar over BADC moles records to produce DIFs.
Got a problem where Simplelinks come out with 'URI' or 'Logo' prepended to the URL:
<Related_URL>
  <URL>URIhttp://badc.nerc.ac.uk/data/chablis</URL>
  <Description> - </Description>
</Related_URL>
<Related_URL>
  <URL>Logohttp://badc.nerc.ac.uk/graphics/logos/nerc-2.gif</URL>
  <Description> - </Description>
</Related_URL>
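One way to spot affected records is to check that every URL value starts with a URL scheme. This is only a minimal sketch in Python 3, not part of bulkdestubb.jar or the OAIBatch code: the helper names (find_bad_urls, local_name), the accepted schemes and the sample string are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical check, not taken from bulkdestubb.jar or OAIBatch.
# Assumption: a valid Related_URL starts with http://, https:// or ftp://.
SCHEME = re.compile(r"^(https?|ftp)://")


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def find_bad_urls(xml_text):
    """Return the text of every <URL> element that lacks a URL scheme."""
    root = ET.fromstring(xml_text)
    bad = []
    for elem in root.iter():
        if local_name(elem.tag) == "URL":
            text = (elem.text or "").strip()
            if not SCHEME.match(text):
                bad.append(text)
    return bad


# Illustrative sample: one broken link and one good link.
sample = """<DIF>
  <Related_URL><URL>URIhttp://badc.nerc.ac.uk/data/chablis</URL></Related_URL>
  <Related_URL><URL>http://badc.nerc.ac.uk/data/chablis</URL></Related_URL>
</DIF>"""

print(find_bad_urls(sample))
```

An empty <URL/> element is reported as an empty string, so the same check also catches links that have lost their value altogether.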
comment:3 Changed 14 years ago by selatham
- Priority changed from required to blocker
This is now a blocker as all BADC DIFs are invalid.
comment:4 Changed 14 years ago by selatham
Also, Parameters are just not coming out in the DIF. I've changed the 'unknown' terms and vocabs to 'null' in moles as per conversation with Kev, but they are still not appearing.
<dgStdParameterMeasured>
  <dgValidTerm>EARTH SCIENCE</dgValidTerm>
  <dgValidTermID>
    <ParentListID>http://vocab.ndg.nerc.ac.uk/term/P111</ParentListID>
    <TermID>GCAT0001</TermID>
  </dgValidTermID>
  <dgValidSubterm>
    <dgValidTerm>EARTHSCIENCE</dgValidTerm>
    <dgValidTermID>
      <ParentListID>http://vocab.ndg.nerc.ac.uk/term/121</ParentListID>
      <TermID>null</TermID>
    </dgValidTermID>
    <dgValidSubterm>
      <dgValidTerm>Atmosphere</dgValidTerm>
      <dgValidTermID>
        <ParentListID>http://vocab.ndg.nerc.ac.uk/term/P131</ParentListID>
        <TermID>null</TermID>
      </dgValidTermID>
      <dgValidSubterm>
        <dgValidTerm>AtmosphericChemistry</dgValidTerm>
        <dgValidTermID>
          <ParentListID>http://vocab.ndg.nerc.ac.uk/term/P141</ParentListID>
          <TermID>null</TermID>
        </dgValidTermID>
        <dgValidSubterm>
          <dgValidTerm>OxygenCompounds</dgValidTerm>
          <dgValidTermID>
            <ParentListID>null</ParentListID>
            <TermID>null</TermID>
          </dgValidTermID>
        </dgValidSubterm>
      </dgValidSubterm>
    </dgValidSubterm>
  </dgValidSubterm>
</dgStdParameterMeasured>
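To see at a glance which levels of a parameter carry 'null' IDs, the nested dgValidSubterm chain can be flattened into a list. The following is a rough sketch in Python 3 under stated assumptions: term_chain is a hypothetical helper, the sample is a shortened, namespace-free copy of the structure above, and real moles records may need namespace-qualified tag names.

```python
import xml.etree.ElementTree as ET


def term_chain(param):
    """Yield (term, parent_list_id, term_id) for each nesting level.

    Hypothetical helper: walks dgStdParameterMeasured and then each
    directly nested dgValidSubterm, one level at a time.
    """
    node = param
    while node is not None:
        yield (
            node.findtext("dgValidTerm"),
            node.findtext("dgValidTermID/ParentListID"),
            node.findtext("dgValidTermID/TermID"),
        )
        node = node.find("dgValidSubterm")


# Shortened illustrative sample (assumes no XML namespace).
sample = """<dgStdParameterMeasured>
  <dgValidTerm>EARTH SCIENCE</dgValidTerm>
  <dgValidTermID>
    <ParentListID>http://vocab.ndg.nerc.ac.uk/term/P111</ParentListID>
    <TermID>GCAT0001</TermID>
  </dgValidTermID>
  <dgValidSubterm>
    <dgValidTerm>Atmosphere</dgValidTerm>
    <dgValidTermID>
      <ParentListID>http://vocab.ndg.nerc.ac.uk/term/P131</ParentListID>
      <TermID>null</TermID>
    </dgValidTermID>
  </dgValidSubterm>
</dgStdParameterMeasured>"""

for term, list_id, term_id in term_chain(ET.fromstring(sample)):
    flag = "  <-- null ID" if "null" in (list_id, term_id) else ""
    print(f"{term}: {term_id}{flag}")
```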
comment:5 Changed 14 years ago by selatham
By the way, the DIFs produced cannot be parsed by exist or elementTree:-
storing document badc.nerc.ac.uk__DIF__dataent_chablis.xml (0 of 1) ...could not parse file /usr/local/WSClients/OAIBatch/data/badc/discovery_corrected/badc.nerc.ac.uk__DIF__dataent_chablis.xml: org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
<DIF xmlns="http://gcmd.gsfc.nasa.gov/Aboutus/xml/dif/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
LINE:
Traceback (most recent call last):
  File "/usr/local/WSClients/OAIBatch/oai_ingest.py", line 238, in ?
    ident=getID(original_filename)
  File "/usr/local/WSClients/OAIBatch/oai_ingest.py", line 40, in getID
    d=DIF(xml)
  File "/usr/local/WSClients/OAIBatch/DIF.py", line 51, in __init__
    raise ValueError,'DIF input cannot be parsed into an ElementTree instance:\n%s'%xml
ValueError: DIF input cannot be parsed into an ElementTree instance:
<DIF xmlns="http://gcmd.gsfc.nasa.gov/Aboutus/xml/dif/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> ...
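Both errors point at a document that is cut off before its root element closes. A well-formedness check before ingest would catch this earlier. The sketch below is Python 3 with the standard xml.etree.ElementTree module (the OAIBatch code in the traceback is older Python 2); check_well_formed and the truncated sample are hypothetical and not taken from oai_ingest.py or DIF.py.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None


# Illustrative sample: the root <DIF> element is never closed.
truncated = (
    '<DIF xmlns="http://gcmd.gsfc.nasa.gov/Aboutus/xml/dif/">'
    "<Entry_ID>example</Entry_ID>"
)

ok, message = check_well_formed(truncated)
print(ok, message)
```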
comment:6 Changed 14 years ago by selatham
- Cc ko23 added
The parameters are coming out now that I am putting 'list level' in - although the 'detailed variable' is not appearing.
comment:7 Changed 14 years ago by selatham
URLs are still incorrect. Now coming out as:-
<Related_URL>
  <URL/>
  <Description>URL to aid in delivering data. Note that this may point directly to the data or, more likely, point to the web site of the curator.</Description>
</Related_URL>
<Related_URL>
  <URL/>
  <Description> - </Description>
</Related_URL>
<Related_URL>
  <URL/>
  <Description> - </Description>
</Related_URL>
comment:8 Changed 14 years ago by selatham
DIFs can be parsed by exist and elementtree now. Therefore they will ingest into NDG discovery now.
But URLs still wrong - see last comment.
comment:10 Changed 14 years ago by selatham
- Status changed from new to assigned
- Owner changed from ko23 to selatham
- Cc ko23 removed
- Milestone changed from ReFactored_Discovery_WebServices to PROD
The URL stuff was actually brought up in ticket #356 which I'm re-assigning to Kev.
However, now got a problem with xqueries timing out with the current exist config.
comment:11 Changed 14 years ago by selatham
- Status changed from assigned to closed
- Resolution set to fixed
The timeout was deliberate for front-end issues. Now agreed that the XQuery times out rather than exist itself. Re-set the config. Bulk generator runs now. (URL stuff gone to Kev, ticket #356.)