Assuming some XML input document like the following:
<?xml version="1.0"?>
<root>
  <ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180409120000102" claimEndDate="2018-04-02" claimStartDate="2018-04-02" sourceSystemId="abcd" claimActionCode="00"/>
  <ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180430120000281" claimEndDate="2018-04-17" claimStartDate="2018-04-17" sourceSystemId="abcd" claimActionCode="00"/>
  <ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180430120000281" claimEndDate="2018-04-18" claimStartDate="2018-04-18" sourceSystemId="abcd" claimActionCode="00"/>
  <ProfessionalClaim paymentIndicator="P" claimProcessedDateTime="20180430120000281" claimEndDate="2018-04-19" claimStartDate="2018-04-19" sourceSystemId="abcd" claimActionCode="00"/>
</root>
... we may use xmlstarlet to extract the claimStartDate attribute's value from each ProfessionalClaim node that has another ProfessionalClaim node following it, together with that next ProfessionalClaim node's claimEndDate attribute's value:
xmlstarlet select --template \
    --match '//ProfessionalClaim[following-sibling::ProfessionalClaim/@claimEndDate]' \
    --value-of 'concat(@claimStartDate, " ", following-sibling::ProfessionalClaim/@claimEndDate)' \
    -nl input.txt
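This first matches each ProfessionalClaim node that is followed by at least one sibling ProfessionalClaim node carrying a claimEndDate attribute.
For each such node, the value of its claimStartDate attribute is concatenated with the value of the claimEndDate attribute of the following ProfessionalClaim node, with a single space character as delimiter. The expression following-sibling::ProfessionalClaim/@claimEndDate selects that attribute on every later sibling, but XPath 1.0 converts a node set to a string by taking its first node in document order, so concat() only sees the value from the nearest one.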
Given my example document above, this would generate
2018-04-02 2018-04-17
2018-04-17 2018-04-18
2018-04-18 2018-04-19
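If xmlstarlet is not at hand, the same pairing can be cross-checked with a few lines of Python using only the standard library. This is a minimal sketch rather than a replacement for the command above: the function name start_next_end_pairs and the SAMPLE string are made up for illustration, and SAMPLE is the example document trimmed down to the two date attributes. It follows the same rule as the XPath expressions, pairing each node's claimStartDate with the claimEndDate of the nearest later sibling that has one.

```python
import xml.etree.ElementTree as ET

# Example document, trimmed to the two attributes that matter here (assumption:
# the other attributes do not affect the pairing).
SAMPLE = """<?xml version="1.0"?>
<root>
  <ProfessionalClaim claimEndDate="2018-04-02" claimStartDate="2018-04-02"/>
  <ProfessionalClaim claimEndDate="2018-04-17" claimStartDate="2018-04-17"/>
  <ProfessionalClaim claimEndDate="2018-04-18" claimStartDate="2018-04-18"/>
  <ProfessionalClaim claimEndDate="2018-04-19" claimStartDate="2018-04-19"/>
</root>"""


def start_next_end_pairs(xml_text):
    """Pair each claim's start date with the next sibling claim's end date.

    Hypothetical helper: a claim is skipped when no later sibling
    ProfessionalClaim carries a claimEndDate attribute.
    """
    root = ET.fromstring(xml_text)
    pairs = []
    # Visit every element so that claims are found at any depth, like '//'.
    for parent in root.iter():
        claims = parent.findall("ProfessionalClaim")  # direct children only
        for index, current in enumerate(claims):
            # Nearest later sibling that has a claimEndDate attribute.
            for later in claims[index + 1:]:
                if "claimEndDate" in later.attrib:
                    start = current.get("claimStartDate", "")
                    pairs.append((start, later.get("claimEndDate")))
                    break
    return pairs


if __name__ == "__main__":
    for start, end in start_next_end_pairs(SAMPLE):
        print(start, end)
```

Run against the trimmed sample, this prints the same three lines as the xmlstarlet command. The last ProfessionalClaim node produces no line because nothing follows it.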