\$\begingroup\$

I have this XML-to-JSON function based on ElementTree. It looks very simple, but so far it does what it's supposed to do: give a JSON description of the document's ElementTree.

    import xml.etree.ElementTree as ET

    def dirtyParser(node):
        '''dirty xml parser
        parses tag, attributes, text, children recursive
        returns a nested dict'''
        # mapping with recursive call
        res = {'tag': node.tag,
               'attributes': node.attrib,
               'text': node.text,
               'children': [dirtyParser(c) for c in node]}
        # remove blanks and empties; iterate over a copy so popping is safe
        for k, v in list(res.items()):
            if v in ['', '\n', [], {}, None]:
                res.pop(k, None)
        return res

Usage:

    >>> some_xml = ET.fromstring(u'<?xml version="1.0" encoding="UTF-8" ?><records><record><him>Maldonado, Gavin G.</him><her>Veda Parks</her></record></records>')
    >>> dirtyParser(some_xml)
    {'tag': 'records', 'children': [{'tag': 'record', 'children': [{'tag': 'him', 'text': 'Maldonado, Gavin G.'}, {'tag': 'her', 'text': 'Veda Parks'}]}]}

Is it really that reliable?

\$\endgroup\$

    1 Answer

    \$\begingroup\$

    It's probably not reliable unless your XML data is simple.

    1. XML is tricky!
      1. You forgot the .tail attribute, which contains any text that follows an element's closing tag, up to the next tag.
      2. Whitespace is significant in XML, and you discard some of it, so you won't be able to go back to the same XML document.
      3. And everything else I don't know about.
    2. The way Python represents dictionaries is different from JSON. For example, JSON only allows " for quoting strings, not '. You can use json.dumps to solve this problem.
    3. More obviously, if you were representing this data using JSON, your data would look like:

      {"records": [{"him": "Maldonado, Gavin G.", "her": "Veda Parks"}]}

      or something like that. That's very different from what you're outputting, so your program does not really represent your data in JSON; it represents, in JSON, the XML that represents your data. Converting to "real JSON" is much more difficult, except for some very specific XML, and such a converter would not be useful as a general-purpose tool.
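
To make points 1 and 2 concrete, here is a minimal sketch. The mixed-content sample document and the variable names are made up for illustration; only `Element.tail` and `json.dumps` come from the standard library. It shows where the text after a closing tag ends up, and how `json.dumps` produces double-quoted JSON from a Python dict.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample with mixed content: " world" follows </b>,
# so ElementTree stores it in b.tail, not in p.text.
root = ET.fromstring('<p>Hello <b>big</b> world</p>')
bold = root[0]

print(repr(root.text))  # 'Hello '
print(repr(bold.text))  # 'big'
print(repr(bold.tail))  # ' world'

# A dict in roughly the shape dirtyParser builds, with the tail added.
node = {'tag': bold.tag, 'text': bold.text, 'tail': bold.tail}

# json.dumps serializes the dict as real JSON (double quotes only).
as_json = json.dumps(node)
print(as_json)  # {"tag": "b", "text": "big", "tail": " world"}
```

A parser that only reads `.text` would silently lose " world" here.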

    This program may be useful to you in some specific scenarios, but you'd better explicitly state what kind of data you accept and reject anything else. Also, what's the point of this?
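
As a rough illustration of "state what you accept and reject anything else", a purpose-specific converter for the sample document might look like the sketch below. The function name `records_to_json` and the exact rules it enforces are assumptions made for this example, not something taken from the original code.

```python
import json
import xml.etree.ElementTree as ET


def _is_blank(text):
    """True if text is None or contains only whitespace."""
    return text is None or not text.strip()


def records_to_json(xml_text):
    """Convert <records><record><field>text</field>...</record></records> to JSON.

    Hypothetical helper: raises ValueError for anything outside that shape.
    """
    root = ET.fromstring(xml_text)
    if root.tag != 'records' or root.attrib or not _is_blank(root.text):
        raise ValueError('expected a plain <records> root')
    records = []
    for record in root:
        if (record.tag != 'record' or record.attrib
                or not _is_blank(record.text) or not _is_blank(record.tail)):
            raise ValueError('unexpected content near <%s>' % record.tag)
        entry = {}
        for field in record:
            # Reject nested elements, attributes, repeated fields and stray text.
            if (len(field) or field.attrib or field.tag in entry
                    or not _is_blank(field.tail)):
                raise ValueError('unsupported field <%s>' % field.tag)
            entry[field.tag] = field.text or ''
        records.append(entry)
    return json.dumps({'records': records})


sample = ('<records><record><him>Maldonado, Gavin G.</him>'
          '<her>Veda Parks</her></record></records>')
print(records_to_json(sample))
# {"records": [{"him": "Maldonado, Gavin G.", "her": "Veda Parks"}]}
```

Because every unexpected attribute, nested element, or stray piece of text raises an error, the function never quietly drops data the way a generic converter might.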

    \$\endgroup\$
    • #3 caught me: the code more or less works, but it doesn't deal with the conceptual difference between XML and JSON, so it's stupid and useless. I'd better extract my data from the raw XML with a purpose-specific function! (Commented Sep 22, 2014 at 9:00)
