I'm using this command:
xmllint --xpath 'substring-after(string(//item/link), "_")' rss.xml
It produces the desired output, but only for the first element. How can I fix it so that it is applied to every link?
I'm open to using any utility, so long as the sample input is accepted and one expression can be used to get the desired output.
Sample Input:
<rss version="2.0">
  <channel>
    <title>Malicious IPs | By Last Bad Event | Project Honey Pot</title>
    <link><![CDATA[http://www.projecthoneypot.org/list_of_ips.php]]></link>
    <description/>
    <copyright>Copyright 2021 Unspam Technologies, Inc</copyright>
    <language>en-us</language>
    <lastBuildDate>July 03 2021 07:15:12 PM</lastBuildDate>
    <image>
      <url>http://www.projecthoneypot.org/images/small_phpot_logo.jpg</url>
      <title>Project Honey Pot | Distribute Spammer Tracking System</title>
      <link>http://www.projecthoneypot.org</link>
    </image>
    <item>
      <title>92.204.241.167 | C</title>
      <link>http://www.projecthoneypot.org/ip_92.204.241.167</link>
      <description>Event: Bad Event | Total: 3,061 | First: 2021-03-27 | Last: 2021-07-03</description>
      <pubDate>July 03 2021 07:15:12 PM</pubDate>
    </item>
    <item>
      <title>181.24.239.244</title>
      <link>http://www.projecthoneypot.org/ip_181.24.239.244</link>
      <description>Event: Bad Event | Total: 1 | First: 2021-07-03 | Last: 2021-07-03</description>
      <pubDate>July 03 2021 07:15:12 PM</pubDate>
    </item>
    <item>
      <title>193.243.195.66 | S</title>
      <link>http://www.projecthoneypot.org/ip_193.243.195.66</link>
      <description>Event: Bad Event | Total: 4 | First: 2021-06-12 | Last: 2021-07-03</description>
      <pubDate>July 03 2021 07:15:12 PM</pubDate>
    </item>
  </channel>
</rss>
Desired Output:
92.204.241.167 181.24.239.244 193.243.195.66
Present Output:
92.204.241.167
Have you considered xmlstarlet rather than xmllint?
xmlstarlet sel -t -m "EXP1" -v "EXP2"
xmllint --xpath '//item/link' rss.xml | sed 's/\(.*_\)\(.*\)\(<.*$\)/\2/g'
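Some background on why only the first value comes back: in XPath 1.0, string() applied to a node-set converts only the first node in document order, so substring-after() never sees the other links. Since any utility is acceptable, here is a minimal sketch of the same extraction in Python using only the standard library. The function name link_suffixes and the trimmed SAMPLE string are my own, for illustration; they are not from the question.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample feed (illustrative; only the link elements matter).
SAMPLE = """<rss version="2.0">
  <channel>
    <link><![CDATA[http://www.projecthoneypot.org/list_of_ips.php]]></link>
    <image>
      <link>http://www.projecthoneypot.org</link>
    </image>
    <item>
      <link>http://www.projecthoneypot.org/ip_92.204.241.167</link>
    </item>
    <item>
      <link>http://www.projecthoneypot.org/ip_181.24.239.244</link>
    </item>
    <item>
      <link>http://www.projecthoneypot.org/ip_193.243.195.66</link>
    </item>
  </channel>
</rss>"""


def link_suffixes(xml_text):
    """Return the text after the first "_" of every //item/link element."""
    root = ET.fromstring(xml_text)
    suffixes = []
    # ".//item/link" selects only <link> children of <item>, so the
    # channel-level and image-level links are skipped.
    for link in root.iterfind(".//item/link"):
        text = link.text or ""
        # partition() yields (before, sep, after); "after" is empty when no
        # "_" is present, which mirrors XPath's substring-after().
        suffixes.append(text.partition("_")[2])
    return suffixes


if __name__ == "__main__":
    print("\n".join(link_suffixes(SAMPLE)))
```

To run it against the real file rather than the inline string, ET.parse("rss.xml").getroot() can replace ET.fromstring(xml_text), assuming the feed is saved as rss.xml as in the question.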