Thursday, October 06, 2005

SAX Parser Funny Stuff

It turns out that the startElementNS method of a SAX ContentHandler is not guaranteed to receive the QName of the element. The name always arrives as a (URI, local name) tuple, but the qname argument may be None unless the feature_namespace_prefixes feature is enabled. The following code shows how to enable it:


from xml.sax.handler import feature_namespaces, feature_namespace_prefixes
from xml.sax import make_parser

# MyContentHandler is your own ContentHandler subclass that defines startElementNS
contentHandler = MyContentHandler()

parser = make_parser()

# Perform namespace processing
parser.setFeature(feature_namespaces, True)

# Report original prefixed names (i.e. qname is passed to startElementNS)
parser.setFeature(feature_namespace_prefixes, True)

parser.setContentHandler(contentHandler)

# inputSource may be a filename, a file-like object or an InputSource
parser.parse(inputSource)


I was made aware of this issue when YAXL failed to parse XML source properly on somebody else's machine; the elements were being named "None". That tip led to this fix. Merci Marc!
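For anyone who runs into the same "None" names, here is a minimal sketch of a handler that does not depend on the qname being present. The names NameCollector and collect_names are made up for this illustration and are not part of YAXL. It also assumes that a parser driver may refuse feature_namespace_prefixes outright; as far as I can tell, the expat driver bundled with current Python releases does exactly that and always passes None for the qname, so the handler falls back to the local name from the (URI, local name) tuple.

```python
import io
from xml.sax import (make_parser, SAXNotRecognizedException,
                     SAXNotSupportedException)
from xml.sax.handler import (ContentHandler, feature_namespaces,
                             feature_namespace_prefixes)


class NameCollector(ContentHandler):
    """Hypothetical handler: records one name per start tag."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElementNS(self, name, qname, attrs):
        uri, localname = name
        # Prefer the prefixed name when the parser reports it,
        # otherwise fall back to the local name instead of None.
        self.names.append(qname if qname is not None else localname)


def collect_names(xml_bytes):
    """Hypothetical helper: parse bytes and return the collected names."""
    parser = make_parser()
    parser.setFeature(feature_namespaces, True)
    try:
        parser.setFeature(feature_namespace_prefixes, True)
    except (SAXNotSupportedException, SAXNotRecognizedException):
        # Assumption: this driver cannot report prefixes, so the
        # handler's local-name fallback will be used.
        pass
    handler = NameCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.names
```

Calling collect_names(b'<a:root xmlns:a="urn:x"><a:child/></a:root>') should give one entry per element and never None; whether the entries carry the a: prefix depends on the parser driver in use.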
