
There is a function Parse_XML, shown below:

    Parse_XML() {
        TDIR=$1
        _VERSION=
        _REVISION=
        _FILENAME=
        _COMPONENT=
        _DESCRIPT=
        _ISITOA=0
        _NOLOG=0
        _OAVERSION=
        local TMP=/tmp/tmpfile.txt-$$
        local JUNK

        # find the cpq_package XML file and assign it to file
        local file=
        for xmlfile in *.xml
        do
            if [ -n "$(head ${xmlfile} | grep '<cpq_package')" ] ; then
                file="${xmlfile}"
                break
            fi
        done

        if [ -z "${file}" ] || [ ! -f "${file}" ]
        then
            _NOLOG=1
            return
        fi

        ${echo} `grep \<version $file|awk -F = '{print $2}'|awk '{print $1}'|tr -d '"'` > $TMP
        read _VERSION JUNK < $TMP
        ${echo} `grep \<version $file|awk -F '=' '{print $3}'|awk '{print $1}'|tr -d '"'` > $TMP
        read _REVISION JUNK < $TMP
        _OAVERSION=${_VERSION}
        _VERSION=${_VERSION}${_REVISION}

Here the version and revision are fetched from lines like these in the XML file:

    <version value="GPK5" revision="B" type_of_change="1"/>
    <version value="GPK5" revision="" type_of_change="1"/>

Some of the revisions are empty strings and some have one character, so the command

 grep \<version CP057761.xml|awk -F = '{print $2}'|awk '{print $1}'|tr -d '"' 

fetches all the versions from the XML file and stores them in the TMP file, and the command

grep \<version CP057761.xml|awk -F '=' '{print $3}'|awk '{print $1}'|tr -d '"' 

fetches the revisions of all the version elements in the XML file, which may belong to different versions.

So sometimes the revision of a previous version is fetched and appended to a version whose own revision is empty.

How can I modify this command

    ${echo} `grep \<version $file|awk -F = '{print $2}'|awk '{print $1}'|tr -d '"'` > $TMP
    read _VERSION JUNK < $TMP
    ${echo} `grep \<version $file|awk -F '=' '{print $3}'|awk '{print $1}'|tr -d '"'` > $TMP
    read _REVISION JUNK < $TMP
    _OAVERSION=${_VERSION}
    _VERSION=${_VERSION}${_REVISION}

so that it searches the XML file only for the value held in the _VERSION variable and fetches the revision of that particular version? When the version has a revision, _VERSION should print GPK5B, and when the revision is empty, _VERSION should print GPK5.

I fixed the issue by searching for $_VERSION in the grep for the revision instead of \<version. That fetched only the revisions for that particular version, and read _REVISION JUNK < $TMP gave me the revision. So basically I wanted only the latest revision along with the version. I'm sorry I wasn't clear in my question before.
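A related way to avoid the mix-up, sketched here rather than taken from the original script, is to read both attributes from the same `<version .../>` line so an empty revision can never pick up a neighbour's value. The helper names `xml_attr` and `version_with_revision` are hypothetical, and the sketch assumes that each `<version>` element sits on one line and that the first such line is the one wanted.

```shell
#!/bin/sh
# Sketch only: pair value and revision from ONE <version .../> line.
# Assumptions (not from the original script): one element per line,
# and the first <version line in the file is the wanted one.

# xml_attr NAME: print the value of attribute NAME from the line on stdin.
# Prints an empty line when the attribute is present but empty.
xml_attr() {
    sed -n 's/.*[[:space:]]'"$1"'="\([^"]*\)".*/\1/p'
}

# version_with_revision FILE: print value immediately followed by revision.
version_with_revision() {
    # Print the first line containing "<version", then stop reading.
    line=$(sed -n '/<version/{p;q;}' "$1")
    value=$(printf '%s\n' "$line" | xml_attr value)
    revision=$(printf '%s\n' "$line" | xml_attr revision)
    printf '%s%s\n' "$value" "$revision"
}
```

Inside Parse_XML this could be used as `_VERSION=$(version_with_revision "$file")`, which also removes the need for the temporary file. It is still line-based text matching, so it breaks if the element is split across lines.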

2 Answers

Use an XML parser to parse XML data. xmlstarlet is one.

Given file.xml containing

    <root>
      <version value="GPK5" revision="B" type_of_change="1"/>
      <version value="GPK5" revision="" type_of_change="1"/>
    </root>

Then

xmlstarlet sel -t -m '//version' -v '@value' -v '@revision' -n file.xml 

Outputs

    GPK5B
    GPK5
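If only one version string is wanted, as in the question, the same query can be restricted to a single element. The following is a minimal sketch, assuming xmlstarlet is installed and that the first `<version>` element in document order is the one wanted; `get_version` is a hypothetical name.

```shell
#!/bin/sh
# get_version FILE: print value+revision of the first <version> element.
# Assumption: "first in document order" is the element the script needs.
get_version() {
    xmlstarlet sel -t -m '(//version)[1]' -v '@value' -v '@revision' -n "$1"
}
```

Inside the function from the question this could replace both grep pipelines, for example `_VERSION=$(get_version "$file")`.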

    Don't use sed or regular expressions to parse HTML/XML. You cannot, and must not, parse structured text like XML/HTML with tools designed to process raw text lines. If you need to process XML/HTML, use an XML/HTML parser. The great majority of languages have built-in support for parsing XML, and there are dedicated tools like xidel, xmlstarlet or xmllint if you need a quick shot from a command-line shell. Never accept a job if you don't have access to proper tools.

    xidel is the most advanced command-line XML/HTML parser out there.

    Its syntax is more intuitive than that of xmlstarlet and xmllint if you know the XPath query language:

    xidel -e '//version/(@value||""||@revision)' -s file.xml

    GPK5B
    GPK5
