Prerequisite: this should be done using ksh script commands.

I have the following document and need to extract all the <Sw:RMARecrd> elements whose <Doc:Crspdt> child contains BSDTGB20 or MITMUS30.

<?xml version="1.0" encoding="UTF-8" ?>
<Sw:RMAFile xmlns:Sw="urn:swift:snl:ns.Sw" xmlns:Doc="urn:swift:snl:ns.Doc" xmlns:SwSec="urn:swift:snl:ns.SwSec">
  <Sw:RMAFileHdr>
    <Sw:Bic8Lst>
      <Doc:Bic8>BSDTGB20</Doc:Bic8>
      <Doc:Bic8>BSDTUS30</Doc:Bic8>
      Doc:Bic8>BWTRUS30</Doc:Bic8>
      <Doc:Bic8>MELNJPJ0</Doc:Bic8>
      <Doc:Bic8>NEIMGB20</Doc:Bic8>
      <Doc:Bic8>ZYHJGB20</Doc:Bic8>
      <Doc:Bic8>ZYIYUS30</Doc:Bic8>
      <Doc:Bic8>ZYJDGB20</Doc:Bic8>
    </Sw:Bic8Lst>
    <Sw:SvcLst>
      <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    </Sw:SvcLst>
    <Sw:FileMaintncSts>Partial</Sw:FileMaintncSts>
    <Sw:FileDesc/><Sw:CrDtTm>2016-08-01T10:17:02Z</Sw:CrDtTm>
    <Sw:TltRecrd>254</Sw:TltRecrd>
    <Sw:LAU><Sw:LAUVal>RRgL2lsocXDswCHxgnf4ww==</Sw:LAUVal></Sw:LAU>
  </Sw:RMAFileHdr>
  <Sw:RMARecrd>
    <Sw:Tp>Received</Sw:Tp>
    <Sw:RMASts>Rejected</Sw:RMASts>
    <Doc:Issr>ZYLCUS30</Doc:Issr>
    <Doc:Crspdt>BSDTGB20</Doc:Crspdt>
    <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    <Doc:IssdDtTm>2014-09-12T13:16:19Z</Doc:IssdDtTm>
  </Sw:RMARecrd>
  <Sw:RMARecrd>
    <Sw:Tp>Received</Sw:Tp>
    <Sw:RMASts>Enabled</Sw:RMASts>
    <Doc:Issr>AGIGUS30</Doc:Issr>
    <Doc:Crspdt>BSDTUS30</Doc:Crspdt>
    <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    <Doc:IssdDtTm>2013-06-26T13:20:20Z</Doc:IssdDtTm>
  </Sw:RMARecrd>
  <Sw:RMARecrd>
    <Sw:Tp>Received</Sw:Tp>
    <Sw:RMASts>Enabled</Sw:RMASts>
    <Doc:Issr>AQRMUS30</Doc:Issr>
    <Doc:Crspdt>BSDTUS30</Doc:Crspdt>
    <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    <Doc:IssdDtTm>2014-11-05T02:17:34Z</Doc:IssdDtTm>
  </Sw:RMARecrd>
  <Sw:RMARecrd>
    <Sw:Tp>Received</Sw:Tp>
    <Sw:RMASts>Enabled</Sw:RMASts>
    <Doc:Issr>BLBGGB20</Doc:Issr>
    <Doc:Crspdt>BSDTUS30</Doc:Crspdt>
    <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    <Doc:IssdDtTm>2015-11-20T10:30:18Z</Doc:IssdDtTm>
    <SwSec:Signature>
      <SwSec:SignedInfo><Sw:Reference><Sw:DigestValue>s6ytg+2eV+e4Pg0UzUwD+lW0hAudR3N/VeSwleg3bzU=</Sw:DigestValue></Sw:Reference></SwSec:SignedInfo>
      <SwSec:SignatureValue>
PEMF@Proc-Type: 4,MIC-ONLY
Content-Domain: RFC822
EntrustFile-Version: 2.0
Originator-DN: cn=blbggb2l-2,ou=prod,o=blbggb2l,o=swift
Orig-SN: 1416707530
MIC-Info: SHA256, RSA,
TwfVoV22y+iqiNwiZ5p40kGk7a9Gm8bHcdPH1bzF19063Q8BsglE59dF8Fsscnk8
M1SuDzwAVZFI4Na1iqf/cAbuugVbXKThBUAtNrqypVehrsl4BOXkU3LK0XGVtrDj
oVHsBs0k8zhk/6cOBUIWr2O+WQA9opvgMEYdaNqVW2OC+UCBsDV8gDyZFvi/cnVR
mEn4OOEKfNrQMvPR+ackPWFdb5FE70N/L2IZjrYGPcVbkR/UBg6zCOojuEOqbSdO
EEzT5DVd8d3AHb2NeqXoYNnRmkxK9qqIijCw5VHTPCBANmKuJVlciMW0Vv+rrbsU
MIIP/MkoPPW17r0Ts9acoQ==
      </SwSec:SignatureValue>
      <SwSec:KeyInfo>
        <SwSec:SignDN>cn=blbggb2l-2,ou=prod,o=blbggb2l,o=swift</SwSec:SignDN>
        <SwSec:CertPolicyId>1.3.21.6.2</SwSec:CertPolicyId>
      </SwSec:KeyInfo>
      <SwSec:Manifest>
        <Sw:Reference><Sw:DigestRef>Authorisation</Sw:DigestRef><Sw:DigestValue>aLxFLajsQFYloHlaU2GZPfudNO9sdeqGPb3G8GBkweA=</Sw:DigestValue></Sw:Reference>
        <Sw:Reference><Sw:DigestRef>Sw.E2S</Sw:DigestRef><Sw:DigestValue>7XFoTufTG0l2fMNoC+mzpAmTKgeipVlcTK0Q3KlW8fw=</Sw:DigestValue></Sw:Reference>
        <Sw:Reference><Sw:DigestRef>Sw.NRS</Sw:DigestRef><Sw:DigestValue>qRuWmiLLsuT2lamWkG8Zo7qRrxqolRCWNLPs//OsvCE=</Sw:DigestValue></Sw:Reference>
      </SwSec:Manifest>
    </SwSec:Signature>
  </Sw:RMARecrd>
  <Sw:RMARecrd>
    <Sw:Tp>Received</Sw:Tp>
    <Sw:RMASts>Enabled</Sw:RMASts>
    <Doc:Issr>BLBGGB50</Doc:Issr>
    <Doc:Crspdt>BSDTUS30</Doc:Crspdt>
    <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    <Doc:IssdDtTm>2014-11-17T17:30:27Z</Doc:IssdDtTm>
  </Sw:RMARecrd>
  <Sw:RMARecrd>
    <Sw:Tp>Received</Sw:Tp>
    <Sw:RMASts>Enabled</Sw:RMASts>
    <Doc:Issr>BRIPUS40</Doc:Issr>
    <Doc:Crspdt>BSDTUS30</Doc:Crspdt>
    <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    <Doc:IssdDtTm>2014-07-22T06:28:12Z</Doc:IssdDtTm>
  </Sw:RMARecrd>
  <Sw:RMARecrd>
    <Sw:Tp>Received</Sw:Tp>
    <Sw:RMASts>Enabled</Sw:RMASts>
    <Doc:Issr>CFSMAU20</Doc:Issr>
    <Doc:Crspdt>BSDTUS30</Doc:Crspdt>
    <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    <Doc:IssdDtTm>2015-02-26T23:24:52Z</Doc:IssdDtTm>
  </Sw:RMARecrd>
  <Sw:RMARecrd>
    <Sw:Tp>Received</Sw:Tp>
    <Sw:RMASts>Enabled</Sw:RMASts>
    <Doc:Issr>CITIBGS0</Doc:Issr>
    <Doc:Crspdt>BSDTUS30</Doc:Crspdt>
    <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    <Doc:IssdDtTm>2014-04-08T07:34:10Z</Doc:IssdDtTm>
  </Sw:RMARecrd>
  <Sw:RMARecrd>
    <Sw:Tp>Received</Sw:Tp>
    <Sw:RMASts>Enabled</Sw:RMASts>
    <Doc:Issr>CITICZP0</Doc:Issr>
    <Doc:Crspdt>BSDTUS30</Doc:Crspdt>
    <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    <Doc:IssdDtTm>2014-01-20T07:52:11Z</Doc:IssdDtTm>
  </Sw:RMARecrd>
</Sw:RMAFile>

Given the above file, the output I am hoping to achieve is the following:

<?xml version="1.0" encoding="UTF-8" ?>
<Sw:RMAFile xmlns:Sw="urn:swift:snl:ns.Sw" xmlns:Doc="urn:swift:snl:ns.Doc" xmlns:SwSec="urn:swift:snl:ns.SwSec">
  <Sw:RMAFileHdr>
    <Sw:Bic8Lst>
      <Doc:Bic8>BSDTGB20</Doc:Bic8>
    </Sw:Bic8Lst>
    <Sw:SvcLst>
      <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    </Sw:SvcLst>
    <Sw:FileMaintncSts>Partial</Sw:FileMaintncSts>
    <Sw:FileDesc/><Sw:CrDtTm>2016-08-01T10:17:02Z</Sw:CrDtTm>
    <Sw:TltRecrd>254</Sw:TltRecrd>
    <Sw:LAU><Sw:LAUVal>RRgL2lsocXDswCHxgnf4ww==</Sw:LAUVal></Sw:LAU>
  </Sw:RMAFileHdr>
  <Sw:RMARecrd>
    <Sw:Tp>Received</Sw:Tp>
    <Sw:RMASts>Rejected</Sw:RMASts>
    <Doc:Issr>ZYLCUS30</Doc:Issr>
    <Doc:Crspdt>BSDTGB20</Doc:Crspdt>
    <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    <Doc:IssdDtTm>2014-09-12T13:16:19Z</Doc:IssdDtTm>
  </Sw:RMARecrd>
</Sw:RMAFile>
  • Please format your post. Commented Aug 1, 2016 at 12:39
  • Why the close votes? Commented Aug 1, 2016 at 14:10

1 Answer

(This assumes that the missing < before Doc:Bic8>BWTRUS30, near the beginning of the XML, has been fixed.)

You don't actually want to extract (select) the data; you want to delete the data you don't want to see.

Using XMLStarlet:

$ xml ed -d '//Sw:RMARecrd[Doc:Crspdt != "BSDTGB20" and Doc:Crspdt != "MITMUS30"]' data.xml

This returns

<?xml version="1.0" encoding="UTF-8"?>
<Sw:RMAFile xmlns:Sw="urn:swift:snl:ns.Sw" xmlns:Doc="urn:swift:snl:ns.Doc" xmlns:SwSec="urn:swift:snl:ns.SwSec">
  <Sw:RMAFileHdr>
    <Sw:Bic8Lst>
      <Doc:Bic8>BSDTGB20</Doc:Bic8>
      <Doc:Bic8>BSDTUS30</Doc:Bic8>
      <Doc:Bic8>BWTRUS30</Doc:Bic8>
      <Doc:Bic8>MELNJPJ0</Doc:Bic8>
      <Doc:Bic8>NEIMGB20</Doc:Bic8>
      <Doc:Bic8>ZYHJGB20</Doc:Bic8>
      <Doc:Bic8>ZYIYUS30</Doc:Bic8>
      <Doc:Bic8>ZYJDGB20</Doc:Bic8>
    </Sw:Bic8Lst>
    <Sw:SvcLst>
      <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    </Sw:SvcLst>
    <Sw:FileMaintncSts>Partial</Sw:FileMaintncSts>
    <Sw:FileDesc/>
    <Sw:CrDtTm>2016-08-01T10:17:02Z</Sw:CrDtTm>
    <Sw:TltRecrd>254</Sw:TltRecrd>
    <Sw:LAU>
      <Sw:LAUVal>RRgL2lsocXDswCHxgnf4ww==</Sw:LAUVal>
    </Sw:LAU>
  </Sw:RMAFileHdr>
  <Sw:RMARecrd>
    <Sw:Tp>Received</Sw:Tp>
    <Sw:RMASts>Rejected</Sw:RMASts>
    <Doc:Issr>ZYLCUS30</Doc:Issr>
    <Doc:Crspdt>BSDTGB20</Doc:Crspdt>
    <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    <Doc:IssdDtTm>2014-09-12T13:16:19Z</Doc:IssdDtTm>
  </Sw:RMARecrd>
</Sw:RMAFile>

If you additionally want to remove the corresponding <Doc:Bic8> entries:

$ xml ed \
    -d '//Sw:RMARecrd[Doc:Crspdt != "BSDTGB20" and Doc:Crspdt != "MITMUS30"]' \
    -d '//Doc:Bic8[. != "BSDTGB20" and . != "MITMUS30"]' \
    data.xml

Which returns

<?xml version="1.0" encoding="UTF-8"?>
<Sw:RMAFile xmlns:Sw="urn:swift:snl:ns.Sw" xmlns:Doc="urn:swift:snl:ns.Doc" xmlns:SwSec="urn:swift:snl:ns.SwSec">
  <Sw:RMAFileHdr>
    <Sw:Bic8Lst>
      <Doc:Bic8>BSDTGB20</Doc:Bic8>
    </Sw:Bic8Lst>
    <Sw:SvcLst>
      <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    </Sw:SvcLst>
    <Sw:FileMaintncSts>Partial</Sw:FileMaintncSts>
    <Sw:FileDesc/>
    <Sw:CrDtTm>2016-08-01T10:17:02Z</Sw:CrDtTm>
    <Sw:TltRecrd>254</Sw:TltRecrd>
    <Sw:LAU>
      <Sw:LAUVal>RRgL2lsocXDswCHxgnf4ww==</Sw:LAUVal>
    </Sw:LAU>
  </Sw:RMAFileHdr>
  <Sw:RMARecrd>
    <Sw:Tp>Received</Sw:Tp>
    <Sw:RMASts>Rejected</Sw:RMASts>
    <Doc:Issr>ZYLCUS30</Doc:Issr>
    <Doc:Crspdt>BSDTGB20</Doc:Crspdt>
    <Doc:SvcNm>swift.fin!p</Doc:SvcNm>
    <Doc:IssdDtTm>2014-09-12T13:16:19Z</Doc:IssdDtTm>
  </Sw:RMARecrd>
</Sw:RMAFile>

You may want to be more restrictive with the matching, though. Because // matches the named element at any depth in the document, it is safer to spell out the full path down to the nodes instead.
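As a rough sketch of that stricter matching, the two expressions could be anchored at the document root. The absolute paths below are read off the nesting of the sample document, and `strict_filter` is a hypothetical helper name, not something XMLStarlet provides. On some systems the executable is installed as `xmlstarlet` rather than `xml`.

```shell
# Absolute paths, following the element nesting of the sample document.
records='/Sw:RMAFile/Sw:RMARecrd[Doc:Crspdt != "BSDTGB20" and Doc:Crspdt != "MITMUS30"]'
bic8s='/Sw:RMAFile/Sw:RMAFileHdr/Sw:Bic8Lst/Doc:Bic8[. != "BSDTGB20" and . != "MITMUS30"]'

# strict_filter is a hypothetical helper; it writes the filtered XML to stdout.
strict_filter() {
    xml ed -d "$records" -d "$bic8s" "$1"
}
```

It would be called as `strict_filter data.xml`.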

  • I want the same output, but using a ksh script.
    – Trupti
    Commented Aug 2, 2016 at 8:53
  • @Trupti Install XMLStarlet, then call it from your ksh script as above. You do not want to use line-oriented tools for XML processing. In general, don't parse structured data with regular expressions.
    – Kusalananda
    Commented Aug 2, 2016 at 9:11
  • Can we do it directly using scripting commands?
    – Trupti
    Commented Aug 2, 2016 at 9:35
  • xmlstarlet IS a scripting command, in exactly the same way that grep, or awk, or sed are scripting commands. It just happens to be one that's written specifically to handle XML data, which is a good thing because parsing XML or HTML with regular expressions just doesn't work.
    – cas
    Commented Aug 4, 2016 at 5:09
  • @Trupti Do you have Python or Perl? If so, you can easily install an XML parsing module (e.g. under your home directory, like in ~/bin/). With Perl, it's trivially easy: just use the cpan command. It's not much harder with Python.
    – cas
    Commented Aug 8, 2016 at 10:35
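
To illustrate what calling XMLStarlet from a ksh script might look like, here is a minimal sketch that builds the same two delete expressions from a list of BICs to keep. The function names `build_predicate` and `filter_rma` are hypothetical. The sketch assumes the BICs are plain alphanumeric codes (no quotes) and that the executable is called `xml`, as in the answer; some systems install it as `xmlstarlet`.

```shell
# Build an XPath predicate such as
#   Doc:Crspdt != "A" and Doc:Crspdt != "B"
# from a node expression ($1) and the BICs to keep (remaining arguments).
build_predicate() {
    node=$1
    shift
    pred=
    for bic in "$@"; do
        # Join the individual tests with " and ".
        pred="${pred:+$pred and }$node != \"$bic\""
    done
    printf '%s\n' "$pred"
}

# Delete every record and header entry that is not in the keep list.
# Usage: filter_rma FILE BIC [BIC ...]
filter_rma() {
    if [ "$#" -lt 2 ]; then
        echo "usage: filter_rma file bic [bic ...]" >&2
        return 1
    fi
    file=$1
    shift
    xml ed \
        -d "//Sw:RMARecrd[$(build_predicate Doc:Crspdt "$@")]" \
        -d "//Doc:Bic8[$(build_predicate . "$@")]" \
        "$file"
}
```

With these definitions, `filter_rma data.xml BSDTGB20 MITMUS30 > filtered.xml` should run the same two deletions as the second command in the answer.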
