Hello. I'm trying to iterate through XML files with the lxml.etree module, but I'm having a bit of trouble. For now, I'm only using one RSS feed, http://distrowatch.com/news/dw.xml. Eventually, though, I'm going to have the program prompt the user for a feed URL, store the input in a variable, and then pass that variable to etree.parse() as the URL to load.
from lxml import etree
tree = etree.parse("http://distrowatch.com/news/dw.xml")
root = tree.getroot()
for channel in root:
    for item in channel:
        print(item.text)
This prints out a good chunk of data from the XML file, but what I'd really like to do is print out only the <title> of each feed headline, which is contained within an <item> element. At first I thought of doing something like this:
for channel in root:
    for item in channel:
        print(item[0].text)
But that didn't work, as I got a 'list index out of range' error.
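A likely explanation, sketched here on a small made-up feed rather than the real one: the inner loop visits every child of <channel>, not just the <item> elements, and the first children are usually leaf elements such as the channel's own <title>. A leaf has no children, so indexing it with [0] raises IndexError. The sample XML and the headline_titles helper below are hypothetical, and the sketch assumes the feed's <item> and <title> elements are not namespaced. It only uses iter() and findtext(), which exist in both lxml.etree and the standard library's xml.etree.ElementTree, so it falls back to the latter if lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:
    # Fallback: the stdlib module offers the same calls used in this sketch.
    import xml.etree.ElementTree as etree

# Hypothetical sample data standing in for the real feed.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <item>
      <title>First headline</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def headline_titles(root):
    """Return the <title> text of every <item> element below root."""
    # iter("item") walks the whole tree and yields only <item> elements;
    # findtext("title") returns the text of the first <title> child.
    return [item.findtext("title") for item in root.iter("item")]


root = etree.fromstring(SAMPLE_RSS)

# Reproduce the failure: the channel's first child is its own <title>,
# which has no children, so indexing it raises IndexError.
channel = root[0]
first_child = channel[0]
try:
    first_child[0]
except IndexError as exc:
    print("IndexError:", exc)

# Print only the headline titles.
for title in headline_titles(root):
    print(title)
```

With the real feed, the same helper could presumably be called on tree.getroot() in place of the root parsed from the sample string.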