
I would like to know how I can get the values of the nodes at the following paths:

config/global/resources/default_setup/connection/host
config/global/resources/default_setup/connection/username
config/global/resources/default_setup/connection/password
config/global/resources/default_setup/connection/dbname

from the following XML:

<?xml version="1.0"?>
<config>
    <global>
        <install>
            <date><![CDATA[Tue, 11 Dec 2012 12:31:25 +0000]]></date>
        </install>
        <crypt>
            <key><![CDATA[70e75d7969b900b696785f2f81ecb430]]></key>
        </crypt>
        <disable_local_modules>false</disable_local_modules>
        <resources>
            <db>
                <table_prefix><![CDATA[]]></table_prefix>
            </db>
            <default_setup>
                <connection>
                    <host><![CDATA[localhost]]></host>
                    <username><![CDATA[root]]></username>
                    <password><![CDATA[pass123]]></password>
                    <dbname><![CDATA[testdb]]></dbname>
                    <initStatements><![CDATA[SET NAMES utf8]]></initStatements>
                    <model><![CDATA[mysql4]]></model>
                    <type><![CDATA[pdo_mysql]]></type>
                    <pdoType><![CDATA[]]></pdoType>
                    <active>1</active>
                </connection>
            </default_setup>
        </resources>
        <session_save><![CDATA[files]]></session_save>
    </global>
    <admin>
        <routers>
            <adminhtml>
                <args>
                    <frontName><![CDATA[admin]]></frontName>
                </args>
            </adminhtml>
        </routers>
    </admin>
</config>

I also want to assign each value to a variable for further use. Any ideas?

9 Answers

Using bash and xmllint (as given by the tags):

xmllint --version
#  xmllint: using libxml version 20703

# Note: Newer versions of libxml / xmllint have a --xpath option which
# makes it possible to use xpath expressions directly as arguments.
# --xpath also enables precise output in contrast to the --shell & sed approaches below.
#xmllint --help 2>&1 | grep -i 'xpath'

{
# the given XML is in file.xml
host="$(echo "cat /config/global/resources/default_setup/connection/host/text()" | xmllint --nocdata --shell file.xml | sed '1d;$d')"
username="$(echo "cat /config/global/resources/default_setup/connection/username/text()" | xmllint --nocdata --shell file.xml | sed '1d;$d')"
password="$(echo "cat /config/global/resources/default_setup/connection/password/text()" | xmllint --nocdata --shell file.xml | sed '1d;$d')"
dbname="$(echo "cat /config/global/resources/default_setup/connection/dbname/text()" | xmllint --nocdata --shell file.xml | sed '1d;$d')"
printf '%s\n' "host: $host" "username: $username" "password: $password" "dbname: $dbname"
}

# output
# host: localhost
# username: root
# password: pass123
# dbname: testdb

In case there is just an XML string and the use of a temporary file is to be avoided, file descriptors are the way to go with xmllint (which is given /dev/fd/3 as a file argument here):

set +H
{
xmlstr='<?xml version="1.0"?>
<config>
    <global>
        <install>
            <date><![CDATA[Tue, 11 Dec 2012 12:31:25 +0000]]></date>
        </install>
        <crypt>
            <key><![CDATA[70e75d7969b900b696785f2f81ecb430]]></key>
        </crypt>
        <disable_local_modules>false</disable_local_modules>
        <resources>
            <db>
                <table_prefix><![CDATA[]]></table_prefix>
            </db>
            <default_setup>
                <connection>
                    <host><![CDATA[localhost]]></host>
                    <username><![CDATA[root]]></username>
                    <password><![CDATA[pass123]]></password>
                    <dbname><![CDATA[testdb]]></dbname>
                    <initStatements><![CDATA[SET NAMES utf8]]></initStatements>
                    <model><![CDATA[mysql4]]></model>
                    <type><![CDATA[pdo_mysql]]></type>
                    <pdoType><![CDATA[]]></pdoType>
                    <active>1</active>
                </connection>
            </default_setup>
        </resources>
        <session_save><![CDATA[files]]></session_save>
    </global>
    <admin>
        <routers>
            <adminhtml>
                <args>
                    <frontName><![CDATA[admin]]></frontName>
                </args>
            </adminhtml>
        </routers>
    </admin>
</config>
'

# exec issue
#exec 3<&- 3<<<"$xmlstr"
#exec 3<&- 3< <(printf '%s' "$xmlstr")
exec 3<&- 3<<EOF
$(printf '%s' "$xmlstr")
EOF

{ read -r host; read -r username; read -r password; read -r dbname; } < <(
   echo "cat /config/global/resources/default_setup/connection/*[self::host or self::username or self::password or self::dbname]/text()" |
      xmllint --nocdata --shell /dev/fd/3 |
      sed -e '1d;$d' -e '/^ *--* *$/d'
)

printf '%s\n' "host: $host" "username: $username" "password: $password" "dbname: $dbname"

exec 3<&-
}
set -H

# output
# host: localhost
# username: root
# password: pass123
# dbname: testdb

Using xmllint and the --xpath option, it is very easy. You can simply do this:

XML_FILE=/path/to/file.xml

HOST=$(xmllint --xpath 'string(/config/global/resources/default_setup/connection/host)' "$XML_FILE")
USERNAME=$(xmllint --xpath 'string(/config/global/resources/default_setup/connection/username)' "$XML_FILE")
PASSWORD=$(xmllint --xpath 'string(/config/global/resources/default_setup/connection/password)' "$XML_FILE")
DBNAME=$(xmllint --xpath 'string(/config/global/resources/default_setup/connection/dbname)' "$XML_FILE")

If you need to get to an element's attribute, that's also easy using XPath. Imagine you have the file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<addon id="screensaver.turnoff" name="Turn Off" version="0.10.0" provider-name="Dag Wieërs">
..snip..
</addon>

The needed shell statements would be:

VERSION=$(xmllint --xpath 'string(/addon/@version)' "$ADDON_XML")
AUTHOR=$(xmllint --xpath 'string(/addon/@provider-name)' "$ADDON_XML")

    Although there are a lot of answers already, I'll chime in with xml2.

    $ xml2 < test.xml
    /config/global/install/date=Tue, 11 Dec 2012 12:31:25 +0000
    /config/global/crypt/key=70e75d7969b900b696785f2f81ecb430
    /config/global/disable_local_modules=false
    /config/global/resources/db/table_prefix
    /config/global/resources/default_setup/connection/host=localhost
    /config/global/resources/default_setup/connection/username=root
    /config/global/resources/default_setup/connection/password=pass123
    /config/global/resources/default_setup/connection/dbname=testdb
    /config/global/resources/default_setup/connection/initStatements=SET NAMES utf8
    /config/global/resources/default_setup/connection/model=mysql4
    /config/global/resources/default_setup/connection/type=pdo_mysql
    /config/global/resources/default_setup/connection/pdoType
    /config/global/resources/default_setup/connection/active=1
    /config/global/session_save=files
    /config/admin/routers/adminhtml/args/frontName=admin

    With a little magic you can even set those as variables directly:

    $ eval $(xml2 < test.xml | tr '/, ' '___' | grep =)
    $ echo $_config_global_resources_default_setup_connection_host
    localhost
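
If running eval on data taken from a file feels too risky (shell metacharacters in a value would be interpreted), the same path=value lines can be picked apart with a plain read loop instead. This is only a minimal sketch, assuming the input looks like the xml2 output shown above. The `flat` variable is a stand-in for `xml2 < test.xml` so that the example runs without xml2 installed, and the variable names are my own choice:

```shell
#!/bin/sh
# Sample lines in xml2's "path=value" format, copied from the output above.
# Assumption: in real use this would be flat=$(xml2 < test.xml)
flat='/config/global/resources/db/table_prefix
/config/global/resources/default_setup/connection/host=localhost
/config/global/resources/default_setup/connection/username=root
/config/global/resources/default_setup/connection/password=pass123
/config/global/resources/default_setup/connection/dbname=testdb'

prefix=/config/global/resources/default_setup/connection

# Keep only the four wanted paths; the value is everything after the first "=".
while IFS= read -r line; do
    case $line in
        "$prefix/host="*)     host=${line#*=} ;;
        "$prefix/username="*) username=${line#*=} ;;
        "$prefix/password="*) password=${line#*=} ;;
        "$prefix/dbname="*)   dbname=${line#*=} ;;
    esac
done <<EOF
$flat
EOF

printf '%s\n' "host: $host" "username: $username" "password: $password" "dbname: $dbname"
```

Because the loop is fed by a here-document rather than a pipe, it runs in the current shell and the four variables are still set after it finishes.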

      The following works when run against your test data:

      { read -r host; read -r username; read -r password; read -r dbname; } \
        < <(xmlstarlet sel -t -m /config/global/resources/default_setup/connection \
            -v ./host -n \
            -v ./username -n \
            -v ./password -n \
            -v ./dbname -n)

      This puts the content into variables host, username, password and dbname.
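
The `< <(...)` process substitution is a bash feature. What matters is that the `read` commands run in the current shell rather than at the end of a pipeline, so the variables are still set afterwards. In a plain POSIX sh, a here-document wrapped around a command substitution should give the same effect. A rough sketch, in which `print_values` is a hypothetical stand-in for the xmlstarlet call so that it runs without xmlstarlet installed:

```shell
#!/bin/sh
# Hypothetical stand-in for the xmlstarlet command: prints one value per line.
print_values() {
    printf '%s\n' localhost root pass123 testdb
}

# The here-document feeds the command's output to the brace group,
# which runs in the current shell, so the assignments persist.
{
    read -r host
    read -r username
    read -r password
    read -r dbname
} <<EOF
$(print_values)
EOF

printf '%s\n' "host: $host" "username: $username" "password: $password" "dbname: $dbname"
```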

      • xmlstarlet: command not found, so this command is not useful to me :(
        Commented Jul 17, 2013 at 15:31
      • @MagePsycho bash does not have any built-in support for XML parsing. You either need to have a tool that does (xmlstarlet, xsltproc, a modern Python, etc), or you can't parse XML correctly.
        Commented Jul 17, 2013 at 15:48
      • @CharlesDuffy is there a way to get the value, maybe using a regex pattern or something else?
        Commented Jul 17, 2013 at 15:51
      • @MagePsycho you can just install xmlstarlet. In any case, you should never use regular expressions to parse (X)HTML.
        – terdon
        Commented Jul 17, 2013 at 15:57
      • @MagePsycho I was about to post the same link terdon already did. In short: No.
        Commented Jul 17, 2013 at 16:05

      A pure bash function, just for the unfortunate case when you are not allowed to install anything appropriate. This may, and probably will, fail on more complicated XML:

      function xmlpath()
      {
        local expr="${1//\// }"
        local path=()
        local chunk tag data

        while IFS='' read -r -d '<' chunk; do
          IFS='>' read -r tag data <<< "$chunk"

          case "$tag" in
            '?'*) ;;
            '!--'*) ;;
            '![CDATA['*) data="${tag:8:${#tag}-10}" ;;
            ?*'/') ;;
            '/'?*) unset path[${#path[@]}-1] ;;
            ?*) path+=("$tag") ;;
          esac

          [[ "${path[@]}" == "$expr" ]] && echo "$data"
        done
      }

      Usage:

      bash-4.1$ xmlpath 'config/global/resources/default_setup/connection/host' < MagePsycho.xml
      localhost

      Known issues:

      • slow
      • searches only by tag names
      • no character entity decoding
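
The last limitation can be softened a little. Below is a minimal sketch of a helper, assuming only the five predefined XML entities matter (numeric character references such as `&#38;` are not handled). The name `xml_unescape` is my own and not part of the function above, and it leans on sed, so it is not strictly pure bash:

```shell
#!/bin/sh
# Decode the five predefined XML entities in a single string argument.
# &amp; is decoded last so that "&amp;lt;" becomes "&lt;" and not "<".
xml_unescape() {
    printf '%s\n' "$1" | sed \
        -e 's/&lt;/</g' \
        -e 's/&gt;/>/g' \
        -e 's/&quot;/"/g' \
        -e "s/&apos;/'/g" \
        -e 's/&amp;/\&/g'
}

xml_unescape 'Tom &amp; Jerry &lt;cartoon&gt;'
# prints: Tom & Jerry <cartoon>
```

It could be applied to the parser's result, for example `value=$(xml_unescape "$(xmlpath some/path < file.xml)")`.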

        Using xq (from https://kislyuk.github.io/yq/) to just get those strings out:

        #!/bin/sh

        set -- \
            config/global/resources/default_setup/connection/host \
            config/global/resources/default_setup/connection/username \
            config/global/resources/default_setup/connection/password \
            config/global/resources/default_setup/connection/dbname

        IFS=:
        xq -r --arg path_string "$*" \
            'getpath(($path_string | split(":") | map(split("/")))[])' file.xml

        This gives the path expressions to xq as a :-delimited list in the variable $path_string. This string is subsequently split into its constituent paths, and these are then further split into path elements, so that one path internally may look like

        [ "config", "global", "resources", "default_setup", "connection", "dbname" ] 

        The path arrays are given to the getpath() function which extracts the values located at those paths.

        The output, for the given XML document, will be

        localhost
        root
        pass123
        testdb

        Creating shell assignment statements instead:

        #!/bin/sh

        set -- \
            config/global/resources/default_setup/connection/host \
            config/global/resources/default_setup/connection/username \
            config/global/resources/default_setup/connection/password \
            config/global/resources/default_setup/connection/dbname

        eval "$(
            IFS=:
            xq -r --arg path_string "$*" '
                ($path_string | split(":") | map(split("/"))[]) as $path |
                "\($path[-1])=\(getpath($path)|@sh)"' file.xml
        )"

        printf 'host = "%s"\n' "$host"
        printf 'user = "%s"\n' "$username"
        printf 'pass = "%s"\n' "$password"
        printf 'database = "%s"\n' "$dbname"

        For the given paths and XML document, the xq statement above would create the output

        host='localhost'
        username='root'
        password='pass123'
        dbname='testdb'

        This would be safe to eval to assign the host, username, password, and dbname shell variables.

        The output of the script would be

        host = "localhost"
        user = "root"
        pass = "pass123"
        database = "testdb"
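
The eval above is safe because of jq's `@sh` filter: it wraps each string in single quotes and rewrites any embedded single quote as `'\''`, so the shell sees the value as one literal word. To illustrate what that quoting does, here is a rough shell equivalent. The `sh_quote` helper is my own stand-in, not something xq or jq provides, and it assumes a single-line value:

```shell
#!/bin/sh
# Illustrative stand-in for jq's @sh applied to a single-line string:
# wrap the value in single quotes and turn each ' into '\''.
sh_quote() {
    printf '%s\n' "$1" | sed -e "s/'/'\\\\''/g" -e "s/^/'/" -e "s/\$/'/"
}

# A value that would run a command if it were handed to eval unquoted.
value='pass$(echo injected); it'\''s'

assignment="password=$(sh_quote "$value")"
printf '%s\n' "$assignment"

# After quoting, eval only performs the assignment; nothing is executed.
eval "$assignment"
printf '%s\n' "$password"
```

The second line printed is the original value, unchanged, which shows that the command substitution inside it was never run.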

          You can call the PHP command-line interface from a bash script to handle more complex tasks that would otherwise take many lines of shell code. First write your solution as a PHP script, then pass it parameters in CLI mode. That gives you access to PHP's full-featured XML parsers.

          It seems that your environment lets you run PHP in CLI mode via ssh/shell access:

          php -f yourxmlparser.php 

          Do all the work within your PHP file, making use of the command-line parameters it can take.

          You can even assign the returned values to shell variables and continue with the rest of your shell script.

          The other way is to use grep to match the required value within the XML file, if you are sure that the structure of the file will not change over time.

          • Use case for all of the above has merit. 5 Stars for all combined.
            – Cymatical
            Commented Mar 11, 2021 at 17:49

          This answer uses only standard sh/bash commands. /test.xml is the XML file from the question:

          #!/bin/sh
          cat /test.xml | while read line;do
           [ "$(echo "$line" | grep "<host>")" ]&& echo "host: $(echo $line | cut -f3 -d'[' | cut -f1 -d']')"
           [ "$(echo "$line" | grep "<username>")" ]&& echo "username: $(echo $line | cut -f3 -d'[' | cut -f1 -d']')"
           [ "$(echo "$line" | grep "<password>")" ]&& echo "password: $(echo $line | cut -f3 -d'[' | cut -f1 -d']')"
           [ "$(echo "$line" | grep "<dbname")" ]&& echo "dbname: $(echo $line | cut -f3 -d'[' | cut -f1 -d']')"
          done

          output:

          host: localhost
          username: root
          password: pass123
          dbname: testdb

          If you want to write these values to files, use this method:

          #!/bin/sh
          cat /test.xml | while read line;do
           [ "$(echo "$line" | grep "<host>")" ]&& echo "$line" | cut -f3 -d'[' | cut -f1 -d']' > /config/global/resources/default_setup/connection/host
           [ "$(echo "$line" | grep "<username>")" ]&& echo "$line" | cut -f3 -d'[' | cut -f1 -d']' > /config/global/resources/default_setup/connection/username
           [ "$(echo "$line" | grep "<password>")" ]&& echo "$line" | cut -f3 -d'[' | cut -f1 -d']' > /config/global/resources/default_setup/connection/password
           [ "$(echo "$line" | grep "<dbname")" ]&& echo "$line" | cut -f3 -d'[' | cut -f1 -d']' > /config/global/resources/default_setup/connection/dbname
          done

          Note that this method overwrites the output files: each run replaces whatever they previously contained.


            Using Raku (formerly known as Perl_6):

            I recognize the OP requested a bash script, but since other answers have deviated from this requirement, here's a Raku solution (4 one-liners):

            raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.lookfor(:TAG<host>).>>.cdata>>.data.put;'
            raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.lookfor(:TAG<username>).>>.cdata>>.data.put;'
            raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.lookfor(:TAG<password>).>>.cdata>>.data.put;'
            raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.lookfor(:TAG<dbname>).>>.cdata>>.data.put;'

            OUTPUT:

            localhost
            root
            pass123
            testdb

            Briefly, Raku is called at the bash command line, and -M module XML is loaded with the command -MXML. The xml file is opened with open-xml and stored in the $xml object. Then the $xml object is queried recursively for desired tags [ in point of fact, the lookfor(...) code is a shortcut for elements(..., :RECURSE) ]. Then the CDATA values are extracted.

            There are other ways to get the desired data, such as simply walking the XML-parse tree:

            raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); .cdata.map(*.data).put for $xml.nodes[1].nodes[7].nodes[3].nodes[1].nodes[1,3,5,7];' 

            Which can be simplified to:

            raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); .cdata>>.data.put for $xml.nodes[1][7][3][1][1,3,5,7];' 

            The two lines of code above each return:

            localhost
            root
            pass123
            testdb

            https://github.com/raku-community-modules/XML
            https://raku.org/

            [For alternative solutions in Raku, there's also the LibXML module, which provides bindings to the (possibly faster) libxml2 library. See https://modules.raku.org/dist/LibXML:cpan:WARRINGD].
