11

Below I have an example of an Adobe XML swidtag used to track inventory. I need to parse out relevant information using xmllint in bash and output that to a new text file.

For example I would like to parse the following

swid:entitlement_required_indicator swid:product_title swid:product_version swid:name swid:numeric swid:major swid:minor swid:build swid:review 

I have tried using this, but it will not let me read the namespace

xmllint --xpath '//swid:product_version/swid:name/text()' file.xml 

I've also tried

xmllint --xpath "//*[local-name1()='product_version']/*[local-name2()='name']/text()" file.xml 

But got these errors

xmlXPathCompOpEval: function local-nameame1 not found
XPath error : Unregistered function
XPath error : Stack usage errror
XPath evaluation failure

Sample tag file for Creative Suite 5

The following sample is for Adobe Photoshop CS5 serialized as Creative Suite 5 Master Collection (Suite).

<?xml version="1.0" encoding="utf-8"?>
<swid:software_identification_tag
    xsi:schemaLocation="http://standards.iso.org/iso/19770/-2/2008/schema.xsd software_identification_tag.xsd"
    xmlns:swid="http://standards.iso.org/iso/19770/-2/2008/schema.xsd"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <!--Mandatory Identity elements -->
  <swid:entitlement_required_indicator>true</swid:entitlement_required_indicator>
  <swid:product_title>Acrobat XI Pro</swid:product_title>
  <swid:product_version>
    <swid:name>1.0</swid:name>
    <swid:numeric>
      <swid:major>1</swid:major>
      <swid:minor>0</swid:minor>
      <swid:build>0</swid:build>
      <swid:review>0</swid:review>
    </swid:numeric>
  </swid:product_version>
  <swid:software_creator>
    <swid:name>Adobe Systems Incorporated</swid:name>
    <swid:regid>regid.1986-12.com.adobe</swid:regid>
  </swid:software_creator>
  <swid:software_licensor>
    <swid:name>Adobe Systems Incorporated</swid:name>
    <swid:regid>regid.1986-12.com.adobe</swid:regid>
  </swid:software_licensor>
  <swid:software_id>
    <swid:unique_id>CreativeCloud-CS6-Mac-GM-MUL</swid:unique_id>
    <swid:tag_creator_regid>regid.1986-12.com.adobe</swid:tag_creator_regid>
  </swid:software_id>
  <swid:tag_creator>
    <swid:name>Adobe Systems Incorporated</swid:name>
    <swid:regid>regid.1986-12.com.adobe</swid:regid>
  </swid:tag_creator>
  <!--Optional Identity elements -->
  <swid:license_linkage>
    <swid:activation_status>activated</swid:activation_status>
    <swid:channel_type>SUBSCRIPTION</swid:channel_type>
    <swid:customer_type>RETAIL</swid:customer_type>
  </swid:license_linkage>
  <swid:serial_number>909702426602037824854600</swid:serial_number>
</swid:software_identification_tag>

    4 Answers

    23

    This discussion is enlightening.

    At the very least, even if not ideal, you should be able to do:

    xmllint --xpath "//*[local-name()='product_version']/*[local-name()='name']/text()" file.xml 

    Or use xmlstarlet instead:

    xmlstarlet sel -t -v //swid:product_version/swid:name file.xml 
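Since the question asks for several fields written to a text file, the local-name() approach can be wrapped in a small loop. This is only a sketch: the function names swid_field and extract_swid, the name=value output format, and the output file name are my own assumptions, not anything xmllint provides. It uses the XPath string() function, which returns the text of the first match, and it scopes name, major, minor, build and review to product_version because name also occurs under software_creator and other elements.

```bash
#!/bin/bash
# Sketch only: swid_field, extract_swid and the name=value output format are
# illustrative assumptions, not part of xmllint.

# Print the text of the first element whose local name is $2, optionally
# restricted to descendants of the element whose local name is $3.
swid_field() {
    local xml_file=$1 name=$2 parent=${3:-}
    local xpath="//*[local-name()='${name}']"
    if [ -n "$parent" ]; then
        xpath="//*[local-name()='${parent}']${xpath}"
    fi
    # string(...) yields the first match's text, or "" if nothing matches
    xmllint --xpath "string(${xpath})" "$xml_file"
}

# Write the fields from the question to a text file, one name=value per line.
extract_swid() {
    local xml_file=$1 out_file=$2 field
    {
        for field in entitlement_required_indicator product_title; do
            printf '%s=%s\n' "$field" "$(swid_field "$xml_file" "$field")"
        done
        # "name" is not unique in the tag, so look only inside product_version
        for field in name major minor build review; do
            printf 'product_version_%s=%s\n' "$field" \
                "$(swid_field "$xml_file" "$field" product_version)"
        done
    } > "$out_file"
}

# Usage (file names are placeholders):
#   extract_swid file.xml swid_fields.txt
```

Container elements such as product_version and numeric are left out on purpose, since their string value is just the concatenated text of their children.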
    2
    • The discussion you linked is very enlightening, thank you. In "//*[local-name()='product_version']", is local-name something that I define myself, e.g. "//*[name1()='product_version']"? I tried renaming it and got
      – macman
      Commented Nov 27, 2012 at 16:19
    • 1
      The reason this works in XMLStarlet is a feature: "In order to handle namespaces with greater ease, XMLStarlet (versions 1.2.1+) will use the namespace prefixes declared on the root element of the input document."
      – Tanz87
      Commented Jul 6, 2020 at 18:00
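If the XMLStarlet build is older than the version mentioned in that comment, or the prefix is not declared on the root element, the namespace can be bound explicitly with the -N option of xmlstarlet sel. A minimal sketch, assuming the binary is installed as xmlstarlet; the function name product_version_name and the variable swid_ns are made up for illustration:

```bash
#!/bin/bash
# Sketch: bind the swid prefix explicitly instead of relying on the
# root-element shortcut. product_version_name and swid_ns are hypothetical names.
swid_ns='http://standards.iso.org/iso/19770/-2/2008/schema.xsd'

product_version_name() {
    # -N prefix=uri registers the namespace for the XPath expressions that follow;
    # -n appends a newline after the value.
    xmlstarlet sel -N "swid=${swid_ns}" -t \
        -v '/swid:software_identification_tag/swid:product_version/swid:name' \
        -n "$1"
}

# Usage (file name is a placeholder):
#   product_version_name file.xml
```

The prefix used with -N does not have to match the one in the document; only the namespace URI has to be the same.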
    8

    Try using a here-doc. Example:

    #!/bin/bash
    xmllint --shell file.xml <<EOF
    setns swid=http://standards.iso.org/iso/19770/-2/2008/schema.xsd
    xpath //swid:product_version/swid:name/text()
    EOF

    Works with later versions of xmllint that support the --xpath parameter.

      3

      With an older version of xmllint (which doesn't support --xpath) you can set a namespace and query more intuitively thus (but you have to grep out some additional garbage):

      #!/bin/bash
      echo 'setns swid=http://standards.iso.org/iso/19770/-2/2008/schema.xsd
      cat //swid:product_version/swid:name/text()' | \
          xmllint --shell file.xml | egrep -v '^(/ >| -----)'
      1
      • nice and clear. used a similar approach in my answer below.
        – roblogic
        Commented Nov 10, 2017 at 5:10
      1

      I had similar issues reading pom.xml (a Maven configuration file) in a shell script for Jenkins. To ensure a good result, I would spell out the full path from the root:

      xmllint --xpath "/*[local-name()='software_identification_tag']/*[local-name()='product_version']/*[local-name()='name']/text()" file.xml 

      You don't seem to have the problem here, but suppose your XML has additional content like this:

      <swid:product_specifics>
        <swid:product_version>
          ...
        </swid:product_version>
      </swid:product_specifics>

      Then xmllint --xpath "//*[local-name()='product_version']/*[local-name()='name']/text()" file.xml won't work as intended, because the leading // also matches the nested product_version.

      In my situation, a pom.xml has many "version" elements, so if you want a specific one, the path should be exact, otherwise you'll get multiple values you don't want.
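Writing exact paths with local-name() gets verbose quickly, so a small helper can build them from a slash-separated list of element names. This is a sketch under my own assumptions: local_name_xpath is a hypothetical function, it expects a path without a leading slash, and it assumes element names contain no quotes or glob characters.

```bash
#!/bin/bash
# Sketch: local_name_xpath is a hypothetical helper, not an xmllint feature.
# It turns "a/b/c" into
#   /*[local-name()='a']/*[local-name()='b']/*[local-name()='c']
# so the query is anchored at the document root and matches one exact path.
local_name_xpath() {
    local path=$1 step xpath=""
    local IFS='/'            # split the argument on slashes only
    for step in $path; do
        xpath="${xpath}/*[local-name()='${step}']"
    done
    printf '%s\n' "$xpath"
}

# Usage (file name is a placeholder):
#   xmllint --xpath "$(local_name_xpath software_identification_tag/product_version/name)/text()" file.xml
```

Because the generated expression starts at the root and names every step, a product_version nested somewhere else in the document is not picked up.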
