
I've been digging through Stack Exchange for the past few days and have found bits and pieces of what I'm trying to accomplish, but I'm unsure how to put it all together.

I'm trying to create a script that makes curl calls to an API. Each call returns a whole bunch of XML, which I then want to parse down to just certain values. Overall, I want this script to make the call, parse out the values, set them as variables, and display them.

I may have found a working solution of sorts, but is it practical?

    #!/bin/bash
    test=$(curl -k --silent "https://username:password@website.api.address?.full=true&name=devicename")
    test2=$(curl -k --silent "https://username:password@website.api.address?.devicestatus&name=devicename")
    variable1=$grep -oPm1 "(?<=<name>)[^<]+" <<< "$test:)
    variable2=$grep -oPm1 "(?<=<status>)[^<]+" <<< "$test:)
    echo "$variable"
    echo "$variable2"

    [admin]>./script
    SwitchName
    UP

Here's the XML I'm trying to dig through:

    <?xml version="1.0" ?>
    <queryResponse type="AccessPointDetails" rootUrl="https://website/webacs/api/v1/data" requestUrl="https://website/webacs/api/v1/data/AccessPointDetails?.full=true&amp;name=devicename" responseType="listEntityInstances" count="1" first="0" last="0">
      <entity url="https://website/webacs/api/v1/data/AccessPointDetails/14008947223" type="AccessPointDetails" dtoType="accessPointDetailsDTO">
        <accessPointDetailsDTO id="14008947223" displayName="14008947223">
          <clientCount>6</clientCount>
          <clientCount_2_4GHz>0</clientCount_2_4GHz>
          <clientCount_5GHz>6</clientCount_5GHz>
          <ipAddress>172.16.83.5</ipAddress>
          <name>devicename</name>
          <unifiedApInfo> ...... </unifiedApInfo>
          <upTime>609857</upTime>
        </accessPointDetailsDTO>
      </entity>
    </queryResponse>

    <?xml version="1.0" ?>
    <queryResponse type="AccessPointDetails" rootUrl="https://website/webacs/api/v1/data" requestUrl="https://website/webacs/api/v1/data/AccessPointDetails?.full=true&amp;name=devicename" responseType="listEntityInstances" count="1" first="0" last="0">
      <entity url="https://website/webacs/api/v1/data/AccessPointDetails/14008947223" type="AccessPointDetails" dtoType="accessPointDetailsDTO">
        <accessPointDetailsDTO id="14008947223" displayName="14008947223">
          <name>devicename</name>
          <status>UP</status>
          <unifiedApInfo> ...... </unifiedApInfo>
        </accessPointDetailsDTO>
      </entity>
    </queryResponse>
  • Or if it's easier, should I assign my api call to a variable and then parse the xml out? variable=$(curl -k "https://.....")
    – Ryan
    Commented Apr 20, 2016 at 16:31
  • Don't comment on your own questions, edit them.
    Commented Apr 20, 2016 at 16:48
  • Also, parsing XML with grep is an ugly hack (which may be OK, but don't expect it to work in all cases)
    Commented Apr 20, 2016 at 16:50
  • You might benefit from using xmlstarlet or similar tools that are actually meant to parse xml, e.g., unix.stackexchange.com/a/225682/4252 . Posting xml returned from the curl call and desired output/outcome will also help us better help you (-:
    – KM.
    Commented Apr 20, 2016 at 18:54

1 Answer


First some comments/questions to get you thinking in a different way:

(in other words, this started as a comment but became an actual answer somewhere along the way)

  1. Why are you even trying to extract the devicename element when you already know it - it's what you used to fetch the XML (with name=devicename in the URL)?

  2. Even if you didn't already have it, the output of the second curl command (with ?.devicestatus) contains both the name and status elements, so you only need to make the second request.

  3. Your variable1= and variable2= lines are seriously messed up. You've used $grep rather than $(grep on both lines, and you've terminated the double-quoted string with a : rather than another double-quote.

    i.e. it should be like <<< "$test", not <<< "$test:

  4. As others have already mentioned in comments, using regular expressions to parse XML is really not a good way to do it. Use an XML processor instead, e.g. xmlstarlet is a useful tool for working with XML in shell scripts. Or write your script in a language that has XML processing libraries available (e.g. perl or python). Search here on this site and on https://stackoverflow.com/ for many examples.

  5. Because of 3. and 4. above, the answer to your question is "No, this is not practical because it won't work at all and also because regular expressions shouldn't be used here".

Now for some possible solutions:

This just fixes the syntax errors in your script so that it should run:

    #!/bin/bash
    test1=$(curl -k --silent "https://username:password@website.api.address?.full=true&name=devicename")
    test2=$(curl -k --silent "https://username:password@website.api.address?.devicestatus&name=devicename")
    variable1=$(grep -oPm1 "(?<=<name>)[^<]+" <<< "$test1")
    variable2=$(grep -oPm1 "(?<=<status>)[^<]+" <<< "$test2")
    echo "$variable1"
    echo "$variable2"

That's far from optimal, though, not least because regular expressions can't reliably parse XML. Trying to do so is an ugly hack at best, and can only work if conditions (i.e. the XML input) are absolutely perfect for what you're trying to extract. Even small changes in the XML output by the server (like eliminating excess spaces, including newlines) can and will break your script.
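To make that fragility concrete, here's a small sketch that runs the same grep call against three variants of the same data. The sample XML strings and the extract_status function name are made up for illustration, and it assumes GNU grep (for the -P option).

```shell
#!/bin/bash

# Hypothetical helper wrapping the grep call from the script above.
extract_status() {
    grep -oPm1 '(?<=<status>)[^<]+' <<< "$1"
}

# Variant 1: compact, exactly the shape the regex expects.
compact='<accessPointDetailsDTO><name>devicename</name><status>UP</status></accessPointDetailsDTO>'

# Variant 2: same value, but the element now carries an attribute
# (made-up attribute, purely to show the effect).
with_attr='<accessPointDetailsDTO><name>devicename</name><status code="1">UP</status></accessPointDetailsDTO>'

# Variant 3: same value, but pretty-printed with the text on its own line.
multiline=$'<status>\n  UP\n</status>'

# Only the first variant yields a value; the other two come back empty
# even though all three are valid XML carrying the same status.
echo "compact:   '$(extract_status "$compact")'"
echo "with_attr: '$(extract_status "$with_attr")'"
echo "multiline: '$(extract_status "$multiline")'"
```

An XML processor treats all three variants as the same element with the same text (give or take surrounding whitespace), which is the whole point of using one.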

If I were trying to do what you seem to be doing, here's roughly how I'd do it:

    #!/bin/bash

    U='username'
    P='password'
    site='website.api.address'

    element_base='queryResponse/entity'
    element_AP="${element_base}/accessPointDetailsDTO"
    element_status="${element_AP}/status"

    devname='devicename'

    url="https://${U}:${P}@${site}?.devicestatus&name=${devname}"

    xml=$(curl -k --silent "$url")
    status=$(printf '%s\n' "$xml" | xmlstarlet sel -t -v "$element_status")

    echo "$devname: $status"

One of the useful things about writing the script this way is that by building up the various strings ($url and $element_status in particular) from other variables, it's easy to change them without much risk of typos or other errors. They can also come from the command line (e.g. U="$1" ; P="$2" ; devname="$3" or using getopts to process command-line options like -u username -p password -d devicename) or from a config file, or both. You could also provide multiple devnames on the command line and fetch them in a loop.
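As a rough sketch of the getopts idea, something like the following would handle -u, -p and -d. The parse_args function name, the error messages and the demo values are my own placeholders, not anything the API or your script requires.

```shell
#!/bin/bash

# Hypothetical option parser: -u username, -p password, -d devicename.
# Sets the global variables U, P and devname; returns non-zero on bad input.
parse_args() {
    local opt OPTIND=1          # local OPTIND so the function can be called again
    U='' P='' devname=''
    # The leading ':' puts getopts in silent mode so we print our own errors.
    while getopts ':u:p:d:' opt; do
        case "$opt" in
            u) U="$OPTARG" ;;
            p) P="$OPTARG" ;;
            d) devname="$OPTARG" ;;
            :) echo "option -$OPTARG needs a value" >&2; return 1 ;;
            *) echo "unknown option -$OPTARG" >&2; return 1 ;;
        esac
    done
    # All three options are required in this sketch.
    if [ -z "$U" ] || [ -z "$P" ] || [ -z "$devname" ]; then
        echo "usage: $0 -u username -p password -d devicename" >&2
        return 1
    fi
}

site='website.api.address'

# Demo call with placeholder values; a real script would use: parse_args "$@"
if parse_args -u username -p password -d devicename; then
    url="https://${U}:${P}@${site}?.devicestatus&name=${devname}"
    echo "$url"
fi
```

Keeping the parsing in a function means the rest of the script only ever sees $U, $P and $devname, no matter whether they came from options, positional arguments or a config file.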

Here's another version of the script that combines some of those ideas:

    #!/bin/bash

    # get username and password, and remove them from the args
    U="$1" ; shift
    P="$1" ; shift    #edited. was $2

    site='website.api.address'

    element_base='queryResponse/entity'
    element_AP="${element_base}/accessPointDetailsDTO"
    element_status="${element_AP}/status"

    url="https://${U}:${P}@${site}?.devicestatus"

    for devname in "$@" ; do
      xml=$(curl -k --silent "${url}&name=${devname}")
      status=$(printf '%s\n' "$xml" | xmlstarlet sel -t -v "$element_status")

      echo "$devname: $status"
    done
