Ticket #853 (closed task: fixed)

Opened 12 years ago

Last modified 12 years ago

[WG] xmlHandler2 doesn't handle non-ASCII gracefully

Reported by: lawrence
Owned by: lawrence
Priority: critical
Milestone: PROD Final
Component: community
Version:
Keywords:
Cc: spascoe

Description

See:  http://glue.badc.rl.ac.uk:8081/view/badc.nerc.ac.uk__DIF__dataent_11872529492720024

Which currently gives:

>>  status,xmlh=ndgRetrieve.ndgRetrieve(ndgO,request.environ['ndgConfig'],logger,format)
Module ows_server.models.ndgRetrieve:79 in ndgRetrieve
>>  x=xmlHandler2.xmlHandler(r,string=1)
Module ows_server.models.xmlHandler2:41 in __init__
>>  self.tree=ET.parse(xmlf).getroot()
Module elementtree.ElementTree:859 in parse
Module elementtree.ElementTree:583 in parse
Module elementtree.ElementTree:1242 in feed
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 3033: ordinal not in range(128)

Change History

comment:1 Changed 12 years ago by lawrence

Also a problem for

http://glue.badc.rl.ac.uk:8081/view/grid.bodc.nerc.ac.uk__DIF__EDMED1048008

comment:2 Changed 12 years ago by lawrence

  • Status changed from new to closed
  • Resolution set to fixed

This should be fixed in changeset:2832

comment:3 Changed 12 years ago by lawrence

And deployed on glue ...

comment:4 Changed 12 years ago by lawrence

  • Status changed from closed to reopened
  • Resolution fixed deleted

Damn, still got problems:

 http://glue.badc.rl.ac.uk:8081/view/badc.nerc.ac.uk__NDG-B1__dataent_11872529492720024

Module ows_server.controllers.retrieve:58 in view
>>  status,x=interface.GetXML(uri)
Module ows_server.lib.ndgInterface:129 in GetXML
>> status,xmlh=ndgRetrieve.ndgRetrieve(ndgO,request.environ['ndgConfig'],logger,format)
Module ows_server.models.ndgRetrieve:74 in ndgRetrieve
>>  x=xmlHandler2.xmlHandler(r,string=1)
Module ows_server.models.xmlHandler2:38 in __init__
>>  xmlf=StringIO.StringIO(self.xmls.encode('utf-8')) # StringIO is supposed to be unicode! .encode('utf-8'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 13394: ordinal not in range(128)
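
The UnicodeDecodeError here is consistent with self.xmls already being a byte string rather than unicode: in Python 2, calling .encode('utf-8') on a byte str first decodes it implicitly with the ASCII codec, and that decode fails on the first byte above 0x7f. The sketch below spells out that hidden step in Python 3 syntax. The sample payload and the helper name py2_style_encode are made up for illustration and are not taken from xmlHandler2.

```python
# Hypothetical payload: UTF-8 bytes containing one non-ASCII character.
# U+00A9 encodes to the two bytes 0xC2 0xA9, so 0xC2 is the first offender.
raw = "<a>\u00a9</a>".encode("utf-8")


def py2_style_encode(data):
    """Spell out what Python 2 does for byte_str.encode('utf-8')."""
    # Step 1 is implicit in Python 2: decode the bytes with the ASCII codec.
    # Step 2 is the encode that was actually requested.
    return data.decode("ascii").encode("utf-8")


try:
    py2_style_encode(raw)
    error = None
except UnicodeDecodeError as exc:
    # Same exception class and codec as in the traceback above.
    error = exc

print(error)
```

If this reading is right, the input was probably already encoded bytes, so encoding it a second time could never succeed, whatever the target encoding.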

comment:5 Changed 12 years ago by lawrence

  • Status changed from reopened to closed
  • Resolution set to fixed

Well this is done now in changeset:2834.

It's done with a nasty hack: everything is forced to be UTF-8, and anything that isn't valid UTF-8 is simply replaced. This is not nice, but I can't find my way through the morass of Unicode ...
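
For anyone reading this later, here is a minimal sketch of the "force to UTF-8, replace what isn't" idea, written in Python 3 syntax. It is not the code from changeset:2834; the function names to_utf8_bytes and parse_xml are hypothetical, and it uses the standard library xml.etree.ElementTree rather than the standalone elementtree package in the tracebacks.

```python
import io
import xml.etree.ElementTree as ET


def to_utf8_bytes(payload):
    """Return payload as UTF-8 bytes, replacing anything undecodable."""
    if isinstance(payload, bytes):
        # Assumption: incoming bytes are meant to be UTF-8. Invalid
        # sequences become U+FFFD instead of raising UnicodeDecodeError.
        text = payload.decode("utf-8", errors="replace")
    else:
        text = payload
    return text.encode("utf-8")


def parse_xml(payload):
    """Parse XML from text or bytes after normalising it to UTF-8."""
    # The parser is always handed bytes, never text, so no implicit
    # ASCII conversion can happen on the way in.
    return ET.parse(io.BytesIO(to_utf8_bytes(payload))).getroot()


root = parse_xml(b"<title>Z\xfcrich</title>")  # 0xFC alone is not valid UTF-8
print(repr(root.text))
```

The cost is the one noted above: bytes that are not valid UTF-8 (for example Latin-1 input) are lost and show up as replacement characters. A document whose XML declaration names a different encoding would also conflict with the forced UTF-8 bytes.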

comment:6 Changed 12 years ago by lawrence

A slightly more elegant solution in xmlHandler via changeset:2835, but it's still fairly ugly.
