My question is not related to "Parse XML to get node value in bash script?". Also, I cannot install or use any new XML parser, as per company policy. This needs to be achieved using shell/perl/awk/sed.
I will try to rephrase my question:
1) We have a process.log file that contains a lot of text data, with some XML data published in between.
2) There are thousands of different XML documents published in the log along with the other text data.
3) I need to select only the XML documents that are published after the "Outgoing XML:" marker.
4) The XML document that is selected and copied to a new file must be the one whose ALERTID tag matches a given value.
5) The ALERTID value will be provided as input to the script. So in our case mGMjhgHgffHhhFdH1u4 will be provided as input, and we need to select the full XML document published for this ALERTID. Starting tag is from and ending tag is
6) So I need to copy the relevant Outgoing XML document to a new file, based on a particular ALERTID, so that it can be replayed in different environments.
Format of the log:
Info Jan 11 17:30:26.12122 The process is not responding to heartbeats
Debug Jan 11 17:30:26.12123 Incoming XML :<xml version "1.0" encoding ="UTF-8"?>
<Alert trigger = "true" >
<Alerttype>orderReject</Alerttype>
<AlertID>ghghfsjUtYuu78T1</AlertID>
<Order>uusingas</Order>
<Quantity>1254</Quanity>
</Alert> (CreateInitEventHandler. C:356)
Debug Jan 11 17:30:26.12199 The process is going down with warnings
Debug Jan 11 17:30:26.148199 Outgoing XML: <xml version "1.0" encoding ="UTF-8"?>
<Alert trigger = "true" >
<Alerttype>orderheld</Alerttype>
<AlertID>mGMjhgHgffHhhFdH1u4</AlertID>
<Order>uwiofhdf</Order>
<Quantity>7651</Quanity>
</Alert>(CreateEventHandler. C:723)
Debug Jan 11 17:30:26.13214 The process has restarted and thread opened
Debug Jan 11 17:30:26.13215 The heartbeat is recieved from alertlistener process
Now the requirement is to take AlertID in the input, scan the process log and extract the matching outgoing XML in a separate file.
Using awk, I am able to extract all the outgoing XML documents, but I am not sure how to extract only the one related to a particular AlertID.
Eg:
awk '/Outgoing/{p=1; s=$0} p && /<\/Alert>/ {print $0 FS s; s=""; p=0} p' 1.log > 2.log
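One possible way to narrow this to a single AlertID is to buffer each Outgoing block and print it only if it contains the requested ID. The following is a rough sketch, not a tested solution for the real log. It assumes the log is line-oriented as in the sample, that every block ends on a line containing </Alert>, and that the <AlertID> element sits on one line. The function name extract_alert and the awk variables id, buf, found and inblock are made up for illustration. It matches text line by line and does not parse XML, so it will break if the XML layout changes.

```shell
#!/bin/sh
# extract_alert ALERTID LOGFILE
# Prints the "Outgoing XML:" block whose <AlertID> equals ALERTID.
# Hypothetical helper: the name and variables are illustrative only.
extract_alert() {
    awk -v id="$1" '
        # A new outgoing block starts here: reset state and strip the
        # log prefix so the buffer begins at the XML itself.
        /Outgoing XML:/ {
            inblock = 1; found = 0; buf = ""
            sub(/^.*Outgoing XML:[ \t]*/, "")
        }
        inblock {
            # index() does a plain string comparison, so the ID is
            # never interpreted as a regular expression.
            if (index($0, "<AlertID>" id "</AlertID>")) found = 1
            if (match($0, /<\/Alert>/)) {
                # Keep text up to and including </Alert>, dropping
                # the trailing "(CreateEventHandler. C:723)" part.
                buf = buf substr($0, 1, RSTART + RLENGTH - 1) "\n"
                if (found) printf "%s", buf
                inblock = 0
            } else {
                buf = buf $0 "\n"
            }
        }
    ' "$2"
}

# Run only when called with exactly two arguments.
if [ "$#" -eq 2 ]; then
    extract_alert "$1" "$2"
fi
```

Saved as, say, extract_alert.sh (a hypothetical file name), it could be run as: sh extract_alert.sh mGMjhgHgffHhhFdH1u4 process.log > replay.xml. Incoming blocks are ignored because buffering only starts on an "Outgoing XML:" line.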
awk (which matches regular expressions) is not going to be useful in extracting data from XML (which is context-free). If you don't know those terms, it means that you cannot correctly parse XML with regular expressions.