xmltodict

xmltodict is a Python module that makes working with XML feel like you are working with JSON, as in this "spec":

>>> doc = xmltodict.parse("""
... <mydocument has="an attribute">
...   <and>
...     <many>elements</many>
...     <many>more elements</many>
...   </and>
...   <plus a="complex">
...     element as well
...   </plus>
... </mydocument>
... """)
>>>
>>> doc['mydocument']['@has']
u'an attribute'
>>> doc['mydocument']['and']['many']
[u'elements', u'more elements']
>>> doc['mydocument']['plus']['@a']
u'complex'
>>> doc['mydocument']['plus']['#text']
u'element as well'

It's very fast (Expat-based) and has a streaming mode with a small memory footprint, suitable for big XML dumps like Discogs or Wikipedia:

>>> def handle_artist(_, artist):
...     print artist['name']
>>>
>>> xmltodict.parse(GzipFile('discogs_artists.xml.gz'),
...     item_depth=2, item_callback=handle_artist)
A Perfect Circle
Fantômas
King Crimson
Chris Potter
...
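For readers on Python 3, the streaming call can be sketched roughly as follows. The inline `XML` sample and the `names` list are made up for illustration and stand in for a real dump. Note that in current xmltodict releases the callback has to return a true value to keep parsing going.

```python
import xmltodict

# Hypothetical stand-in for a large dump such as discogs_artists.xml.gz.
XML = """
<artists>
  <artist><name>A Perfect Circle</name></artist>
  <artist><name>Fantômas</name></artist>
  <artist><name>King Crimson</name></artist>
</artists>
"""

names = []


def handle_artist(path, artist):
    # path is a list of (tag, attributes) pairs leading to the item;
    # artist is the dict built for one <artist> element.
    names.append(artist["name"])
    return True  # a falsy return value would interrupt parsing


# item_depth=2 fires the callback for each element two levels deep,
# so the whole document is never built up as one big dict.
xmltodict.parse(XML, item_depth=2, item_callback=handle_artist)

for name in names:
    print(name)
```

A file object opened in binary mode (for example one returned by `gzip.open`) can be passed in place of the string.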

It can also be used from the command line to pipe objects to a script like this:

import sys, marshal
while True:
    _, article = marshal.load(sys.stdin)
    print article['title']

$ cat enwiki-pages-articles.xml.bz2 | bunzip2 | xmltodict.py 2 | myscript.py
AccessibleComputing
Anarchism
AfghanistanHistory
AfghanistanGeography
AfghanistanPeople
AfghanistanCommunications
Autism
...
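The consumer script loops forever, so it ends with an `EOFError` once the pipe is exhausted. A minimal Python 3 sketch of a reader that stops cleanly is shown here, assuming the stream is a sequence of marshal-serialized `(path, item)` tuples as the script above implies. The helper name `iter_items` and the in-memory sample data are hypothetical.

```python
import io
import marshal


def iter_items(stream):
    """Yield (path, item) tuples from a binary stream of marshal records."""
    while True:
        try:
            yield marshal.load(stream)
        except EOFError:
            return  # end of stream reached


# Demo with an in-memory stream standing in for sys.stdin.buffer.
fake_stdin = io.BytesIO(
    marshal.dumps(([("page", None)], {"title": "AccessibleComputing"}))
    + marshal.dumps(([("page", None)], {"title": "Anarchism"}))
)
titles = [article["title"] for _, article in iter_items(fake_stdin)]
print(titles)
```

In a real script on the receiving end of the pipe, `sys.stdin.buffer` would be passed instead of `fake_stdin`, since marshal needs a binary stream on Python 3.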

Or just cache the dicts so you don't have to parse that big XML file again. You do this only once:

$ cat enwiki-pages-articles.xml.bz2 | bunzip2 | xmltodict.py 2 | gzip > enwiki.dicts.gz

And you reuse the dicts with every script that needs them:

$ cat enwiki.dicts.gz | gunzip | script1.py
$ cat enwiki.dicts.gz | gunzip | script2.py
...
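As an alternative to piping through `gunzip`, a script could open the cache directly. This is only a sketch: the helper `load_cached` and the demo file are hypothetical, and the cache is assumed to hold marshal records of `(path, item)` tuples. Keep in mind that the marshal format is not guaranteed to be stable across Python versions, so a cache is best read by the same interpreter version that wrote it.

```python
import gzip
import marshal
import os
import tempfile


def load_cached(path):
    """Yield every marshal record stored in a gzip-compressed cache file."""
    with gzip.open(path, "rb") as stream:
        while True:
            try:
                yield marshal.load(stream)
            except EOFError:
                return  # no more records in the cache


# Demo: build a tiny hypothetical cache, then read it back.
cache_path = os.path.join(tempfile.mkdtemp(), "demo.dicts.gz")
with gzip.open(cache_path, "wb") as out:
    for title in ("Autism", "Anarchism"):
        marshal.dump(([("page", None)], {"title": title}), out)

cached_titles = [item["title"] for _, item in load_cached(cache_path)]
print(cached_titles)
```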

You can also convert in the other direction, using the unparse() function:

>>> mydict = {
...     'page': {
...         'title': 'King Crimson',
...         'ns': 0,
...         'revision': {
...             'id': 547909091,
...         }
...     }
... }
>>> print unparse(mydict)
<?xml version="1.0" encoding="utf-8"?>
<page><ns>0</ns><revision><id>547909091</id></revision><title>King Crimson</title></page>
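A Python 3 variant of the same conversion might look like the sketch below. It uses the `pretty=True` keyword of `unparse()` for indented output and then parses the result again to show the round trip. The variable names `xml` and `roundtrip` are chosen here for illustration.

```python
import xmltodict

mydict = {
    "page": {
        "title": "King Crimson",
        "ns": 0,
        "revision": {"id": 547909091},
    }
}

# pretty=True asks unparse() to indent the output and add newlines.
xml = xmltodict.unparse(mydict, pretty=True)
print(xml)

# Parsing the result gives the same structure back, with values as strings.
roundtrip = xmltodict.parse(xml)
print(roundtrip["page"]["title"])
```

Numbers do not survive the round trip as numbers: XML carries only text, so `ns` and `id` come back as strings.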

Ok, how do I get it?

You just need to install it from PyPI:

$ pip install xmltodict

There is an official Fedora package for xmltodict. If you are on Fedora or RHEL, you can do:

$ sudo yum install python-xmltodict

Donate

If you love xmltodict, consider supporting the author on Gittip.

martinblech/xmltodict · GitHub
