Use an XML parser. Personally, I like XML::Twig and perl.
    #!/usr/bin/env perl
    use strict;
    use warnings;
    use XML::Twig;

    # Parse the whole file into an in-memory twig (tree).
    my $twig = XML::Twig->new();
    $twig->parsefile('your_file.xml');

    # Print the name attribute of every saw:user element.
    foreach my $saw_user ( $twig->get_xpath('//saw:user') ) {
        print $saw_user->att('name'), "\n";
    }
This prints:
    [email protected]
    [email protected]
    [email protected]
If you want a 'one liner' then instead:
    perl -MXML::Twig -0777 -e 'print map { $_->att("name")."\n" } ( XML::Twig->parse( <> )->get_xpath("//saw:user") )' your_xml_file
Please, for the sake of future maintenance programmers and sysadmins, DO NOT use regular expressions to parse XML. Why, you may ask? Because, taking your XML as an example, it can look like any of these and still be semantically identical:
(your example +
    <?xml version="1.0" encoding="utf-8"?>
    <saw:ibot jobID="36" priority="normal" version="1" xmlns:saw="com.siebel.analytics.web/report/v1">
      <saw:schedule disabled="false" timeZoneId="(GMT-05:00) Eastern Time (US &amp; Canada)">
        <saw:start endTime="23:59:00" repeatMinuteInterval="60" startImmediately="true" />
        <saw:recurrence runOnce="false">
          <saw:weekly fri="true" mon="true" thu="true" tue="true" wed="true" weekInterval="1" />
        </saw:recurrence>
      </saw:schedule>
      <saw:dataVisibility runAs="cgm" type="recipient" />
      <saw:choose>
        <saw:when condition="true">
          <saw:deliveryContent>
            <saw:headline>
              <saw:caption>
                <saw:text>Availability Parity Alert for Next 14 Days (@{NQ_SESSION.LBL_Next_14_Arrival_Days})</saw:text>
              </saw:caption>
            </saw:headline>
            <saw:conditionalReport/>
          </saw:deliveryContent>
          <saw:postActions/>
        </saw:when>
        <saw:otherwise/>
      </saw:choose>
      <saw:deliveryDestinations>
        <saw:destination category="dashboard" />
        <saw:destination category="activeDeliveryProfile" />
      </saw:deliveryDestinations>
      <saw:recipients customize="false" specificRecipients="false" subscribers="true">
        <saw:subscribers>
          <saw:user name="[email protected]" />
          <saw:user name="[email protected]" />
          <saw:user name="[email protected]" />
        </saw:subscribers>
      </saw:recipients>
      <saw:conditionQuery>
        <saw:reportRefNode path="/shared/Quote/Product/Alerts/Daily Availability Parity Alert - Next 14 Days - Content" />
      </saw:conditionQuery>
    </saw:ibot>
Or like this (note tag wrapping of elements)
    <?xml version="1.0" encoding="utf-8"?>
    <saw:ibot jobID="36" priority="normal" version="1"
      xmlns:saw="com.siebel.analytics.web/report/v1">
      <saw:schedule disabled="false"
        timeZoneId="(GMT-05:00) Eastern Time (US &amp; Canada)">
        <saw:start endTime="23:59:00" repeatMinuteInterval="60"
          startImmediately="true"/>
        <saw:recurrence runOnce="false">
          <saw:weekly fri="true" mon="true" thu="true" tue="true" wed="true"
            weekInterval="1"/>
        </saw:recurrence>
      </saw:schedule>
      <saw:dataVisibility runAs="cgm" type="recipient"/>
      <saw:choose>
        <saw:when condition="true">
          <saw:deliveryContent>
            <saw:headline>
              <saw:caption>
                <saw:text>Availability Parity Alert for Next 14 Days (@{NQ_SESSION.LBL_Next_14_Arrival_Days})</saw:text>
              </saw:caption>
            </saw:headline>
            <saw:conditionalReport/>
          </saw:deliveryContent>
          <saw:postActions/>
        </saw:when>
        <saw:otherwise/>
      </saw:choose>
      <saw:deliveryDestinations>
        <saw:destination category="dashboard"/>
        <saw:destination category="activeDeliveryProfile"/>
      </saw:deliveryDestinations>
      <saw:recipients customize="false" specificRecipients="false"
        subscribers="true">
        <saw:subscribers>
          <saw:user name="[email protected]"/>
          <saw:user name="[email protected]"/>
          <saw:user name="[email protected]"/>
        </saw:subscribers>
      </saw:recipients>
      <saw:conditionQuery>
        <saw:reportRefNode
          path="/shared/Quote/Product/Alerts/Daily Availability Parity Alert - Next 14 Days - Content"/>
      </saw:conditionQuery>
    </saw:ibot>
Or like this:
    <?xml version="1.0" encoding="utf-8"?>
    <saw:ibot jobID="36" priority="normal" version="1" xmlns:saw="com.siebel.analytics.web/report/v1"
    ><saw:schedule disabled="false" timeZoneId="(GMT-05:00) Eastern Time (US &amp; Canada)"
    ><saw:start endTime="23:59:00" repeatMinuteInterval="60" startImmediately="true"
    /><saw:recurrence runOnce="false"
    ><saw:weekly fri="true" mon="true" thu="true" tue="true" wed="true" weekInterval="1"
    /></saw:recurrence></saw:schedule><saw:dataVisibility runAs="cgm" type="recipient"
    /><saw:choose
    ><saw:when condition="true"
    ><saw:deliveryContent
    ><saw:headline
    ><saw:caption
    ><saw:text
    >Availability Parity Alert for Next 14 Days (@{NQ_SESSION.LBL_Next_14_Arrival_Days})</saw:text></saw:caption></saw:headline><saw:conditionalReport
    /></saw:deliveryContent><saw:postActions
    /></saw:when><saw:otherwise
    /></saw:choose><saw:deliveryDestinations
    ><saw:destination category="dashboard"
    /><saw:destination category="activeDeliveryProfile"
    /></saw:deliveryDestinations><saw:recipients customize="false" specificRecipients="false" subscribers="true"
    ><saw:subscribers
    ><saw:user name="[email protected]"
    /><saw:user name="[email protected]"
    /><saw:user name="[email protected]"
    /></saw:subscribers></saw:recipients><saw:conditionQuery
    ><saw:reportRefNode path="/shared/Quote/Product/Alerts/Daily Availability Parity Alert - Next 14 Days - Content"
    /></saw:conditionQuery></saw:ibot>
Hopefully these samples show that reformatting your XML in a PERFECTLY VALID fashion can make your regex break mysteriously one day.
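To make that failure concrete, here is a minimal sketch using only core Perl. The regex, the helper name names_by_line, and the example address are all made up for illustration; the regex is just the kind of line-oriented pattern people tend to write for sed or awk, not anything taken from the question.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical line-oriented regex: it assumes the whole <saw:user .../>
# tag sits on a single line.
my $naive = qr{<saw:user name="([^"]*)"\s*/>};

# Two layouts of the same element (the address is invented for this sketch).
my %layout = (
    one_line => qq{<saw:user name="alice\@example.com" />\n},
    wrapped  => qq{<saw:user\n  name="alice\@example.com"\n/>\n},
);

# Apply the regex one line at a time and collect the captured names.
sub names_by_line {
    my ($xml) = @_;
    my @names;
    for my $line ( split /\n/, $xml ) {
        push @names, $1 if $line =~ $naive;
    }
    return @names;
}

for my $style ( sort keys %layout ) {
    my @found = names_by_line( $layout{$style} );
    printf "%-8s found %d name(s)\n", $style, scalar @found;
}
```

The one-line layout yields one name and the wrapped layout yields none, even though both describe the same element. A parser gives the same answer for both.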
1. Don't parse XML with sed or awk.
2. We can't provide you examples of code to run without seeing the XML that contains the data you want to retrieve.
3. Don't parse XML with sed or awk.
4. Please update your question to provide a minimal example XML file.
5. Don't parse XML with sed or awk.

{} marker to indent the content by four spaces. I'll do it for you once again...

/tmp/xml:33.18: Opening and ending tag mismatch: subscribers line 29 and recipients and other errors