I have a log file in which XML documents are being logged. I need to search for and extract every XML document that contains a specific string in any one of its nodes.
E.g. the log file will contain multiple XML documents that include the search param:
randomlogentry1
randomlogentry2
Printing XML:<CreateDataABC>
<Tag1>searchparam</Tag1>
</CreateDataABC>
randomlogentry3
randomlogentry4
randomlogentry5
Printing XML: <DataCreatedABC>
<TagA>otherparam</TagA>
<TagB>searchparam</TagB>
<TagC>otherparam</TagC>
</DataCreatedABC>
randomlogentry6
randomlogentry7
The expected output is the two XMLs printed to the console or written to separate files.
XML1:
<CreateDataABC> <Tag1>searchparam</Tag1> </CreateDataABC>
XML2:
<DataCreatedABC> <TagA>otherparam</TagA> <TagB>searchparam</TagB> <TagC>otherparam</TagC> </DataCreatedABC>
The position of 'searchparam' within an XML is never fixed; the only constants are the 'ABC' string and 'searchparam' itself.
I thought of using sed to extract the text between two line numbers, for which I tried the following:
- Search for the searchparam and note its line number.
- Find the next occurrence of ABC and get its line number.
However, I can't seem to find the previous occurrence of ABC starting from a specific line!
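To make the goal concrete, here is a minimal Python sketch of the behaviour I am after. It is not the sed approach above: instead of searching backwards from 'searchparam', it scans forward for each opening tag whose name ends in 'ABC', reads up to the matching closing tag, and keeps the block only if it contains the search string. It assumes the root tags carry no attributes, that blocks of the same name are not nested, and that the whole log fits in memory; the names XML_BLOCK, extract_matching_xml and sample_log are made up for illustration.

```python
import re

# Assumption: each XML of interest has a root element whose name ends in "ABC"
# and has no attributes, e.g. <CreateDataABC> ... </CreateDataABC>.
# The backreference \1 forces the closing tag to match the opening tag's name;
# re.DOTALL lets "." span line breaks so multi-line XML is captured.
XML_BLOCK = re.compile(r"<(\w*ABC)>.*?</\1>", re.DOTALL)


def extract_matching_xml(log_text, search_param):
    """Return every ...ABC XML block in log_text that contains search_param."""
    blocks = (match.group(0) for match in XML_BLOCK.finditer(log_text))
    # Plain substring test: this also matches if search_param appears in a tag name.
    return [block for block in blocks if search_param in block]


# Hypothetical sample modelled on the log format in the question.
sample_log = """randomlogentry1
Printing XML:<CreateDataABC>
<Tag1>searchparam</Tag1>
</CreateDataABC>
randomlogentry3
Printing XML: <OtherABC>
<TagA>otherparam</TagA>
</OtherABC>
Printing XML: <DataCreatedABC>
<TagA>otherparam</TagA>
<TagB>searchparam</TagB>
</DataCreatedABC>
randomlogentry6
"""

matches = extract_matching_xml(sample_log, "searchparam")
for index, xml in enumerate(matches, start=1):
    print(f"XML{index}:")
    print(xml)
```

For a real log, sample_log would be replaced by the contents of the file, and each entry of matches could be written to its own file instead of printed.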
Has anyone done this before?
EDIT: Updated the example log format and expected output.