
I would like to know how I can split my data from the following format:

<datas>
<data>
<name>Name1</name>
</data>
<data>
<name>Name2</name>
</data>
</datas>

to the following format:

<data><name>Name1</name></data>
<data><name>Name2</name></data>

The parsed data would be sent to a Python script as follows:

 python script.py '<data><name>Name1</name></data>'
 python script.py '<data><name>Name2</name></data>'

I have tried commands like:

echo 'cat /datas/data' | xmllint --shell file.xml 

but how can I pass the output in the desired format to the Python script?

  • Is the formatting important, or do you just want to extract all <data> tags and the lower tags?
    – Kusalananda
    Commented Jul 14, 2016 at 14:56
  • Thank you for helping me with the formatting, Kusalananda.
    – Aryise
    Commented Jul 14, 2016 at 14:56
  • The format is important, because I am passing the formatted data to a Python script as an argument.
    – Aryise
    Commented Jul 14, 2016 at 14:56
  • It would be better if the Python script parsed the XML (using an XML parser) and extracted the bits it needed...
    – Kusalananda
    Commented Jul 14, 2016 at 15:03
  • Oh! I thought the format you mentioned was <data><name>. There can be newlines after the data tag. The Python script is using xml.etree.ElementTree.
    – Aryise
    Commented Jul 14, 2016 at 15:18
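
Following the suggestion in the comments to let Python do the XML parsing, here is a minimal sketch of how the splitting could look with xml.etree.ElementTree, which the asker says the script already uses. The function name `compact_records` and the inline `SAMPLE` string are made up for this illustration; only the element names come from the question.

```python
import xml.etree.ElementTree as ET

# Sample input copied from the question (hypothetical stand-in for file.xml).
SAMPLE = """<datas>
<data>
<name>Name1</name>
</data>
<data>
<name>Name2</name>
</data>
</datas>"""


def compact_records(xml_text, tag="data"):
    """Return each <tag> child of the root as a one-line XML string."""
    root = ET.fromstring(xml_text)
    records = []
    for elem in root.findall(tag):
        # Drop whitespace-only text and tails, i.e. the newlines between
        # tags; real text such as "Name1" is kept untouched.
        for node in elem.iter():
            if node.text is not None and not node.text.strip():
                node.text = None
            if node.tail is not None and not node.tail.strip():
                node.tail = None
        records.append(ET.tostring(elem, encoding="unicode"))
    return records


for record in compact_records(SAMPLE):
    print(record)
```

For a real file, `ET.parse("file.xml").getroot()` could replace the `ET.fromstring` call. Whether the compact strings are then passed on as arguments or processed in the same script is a design choice the thread leaves open.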

2 Answers


I would preprocess the data with XMLStarlet:

$ xml sel -t -c '/datas/data' -nl data.xml
<data>
<name>Name1</name>
</data><data>
<name>Name2</name>
</data>

Then it depends on how your Python script wants to read this data. Hopefully, it's from a file or from standard input...
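
If the script does read this output as text, one thing to watch is that the output shown above is a run of sibling <data> elements with no single root, which xml.etree.ElementTree will not parse as-is. A minimal sketch of one workaround, assuming it is acceptable to wrap the text in a throwaway root element (the names `wrapper`, `FRAGMENTS`, and `names_from_fragments` are invented for this example):

```python
import xml.etree.ElementTree as ET

# Copy of the XMLStarlet output shown above; a real script might use
# sys.stdin.read() here instead of this inline sample.
FRAGMENTS = """<data>
<name>Name1</name>
</data><data>
<name>Name2</name>
</data>
"""


def names_from_fragments(text):
    """Parse sibling <data> elements that lack a common root element."""
    # ElementTree needs exactly one root element, so wrap the fragments
    # in a <wrapper> element that exists only for parsing.
    root = ET.fromstring("<wrapper>" + text + "</wrapper>")
    return [data.findtext("name") for data in root.findall("data")]


print(names_from_fragments(FRAGMENTS))
```

The wrapping trick only works while the input has no XML declaration of its own, so treat it as a sketch rather than a general solution.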

  • Hmm, can xml sel work on Mac OS? What should I install to get the xml command? :)
    – Aryise
    Commented Jul 14, 2016 at 15:38
  • @Aryise I'm working on the command line on a MacBook Air running El Capitan.
    – Kusalananda
    Commented Jul 14, 2016 at 15:39
  • @Aryise I'm using XMLStarlet from NetBSD's pkgsrc package system, but I believe it's available through Homebrew as well.
    – Kusalananda
    Commented Jul 14, 2016 at 15:41
  • Oh my... I am using Mac OS X Yosemite. It tells me: -bash: xml: command not found :(
    – Aryise
    Commented Jul 14, 2016 at 15:41
  • @Aryise You will have to install it through some means. Homebrew and MacPorts have many good utilities for Mac, and both have XMLStarlet.
    – Kusalananda
    Commented Jul 14, 2016 at 15:43

I'd use XSLT.

The XSLT stylesheet looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/datas">
    <xsl:apply-templates select="data"/>
  </xsl:template>
  <xsl:template match="data">
    <data><name><xsl:value-of select="./name"/></name></data><xsl:text>&#xa;</xsl:text>
  </xsl:template>
</xsl:stylesheet>

For the transformation, use the program xsltproc.

Say your input file is named in.xml and the XSLT stylesheet is named in.xsl. Then the call is:

 xsltproc in.xsl in.xml 

output:

<?xml version="1.0"?>
<data><name>Name1</name></data>
<data><name>Name2</name></data>
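
To hand each output line to the script as a separate argument, one option is a small Python wrapper. This is only a sketch under assumptions: the helper names (`record_lines`, `build_command`, `run_all`) are invented here, `script.py` is the script name from the question, and `run_all` assumes xsltproc is on the PATH. Passing the arguments as a list avoids the shell, so the `<` and `>` characters are not treated as redirections.

```python
import subprocess
import sys

# The xsltproc output shown above, inlined so the sketch runs without xsltproc.
SAMPLE_OUTPUT = """<?xml version="1.0"?>
<data><name>Name1</name></data>
<data><name>Name2</name></data>
"""


def record_lines(output):
    """Return the non-empty output lines, skipping the XML declaration."""
    lines = (line.strip() for line in output.splitlines())
    return [line for line in lines if line and not line.startswith("<?xml")]


def build_command(record, script="script.py"):
    """Build an argument list for one record; no shell quoting is needed."""
    return [sys.executable, script, record]


def run_all(stylesheet, xml_file):
    """Run xsltproc, then start the script once per record (not called below)."""
    result = subprocess.run(
        ["xsltproc", stylesheet, xml_file],
        capture_output=True, text=True, check=True,
    )
    for record in record_lines(result.stdout):
        subprocess.run(build_command(record), check=True)


# Demo: show the script name and argument that would be used per record.
for record in record_lines(SAMPLE_OUTPUT):
    print(build_command(record)[1:])
```

Since `run_all` takes the XML file name as a parameter, the same stylesheet can be reused when the input file changes between runs.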
  • Is there a way to do this without modifying the XML file? I don't think I am allowed to modify it. I can only change the structure of the execution flow. Currently, I am using a .sh script to schedule the run for the robot. Now the robot should run in a different instance for each <data> tag, so I can only change the script for it. :(
    – Aryise
    Commented Jul 14, 2016 at 15:19
  • You don't have to modify your input file. The XML code in my example is the stylesheet you have to provide. In my example: YOUR DATA => in.xml, the stylesheet => in.xsl. Just copy the example XML code into in.xsl and it should work.
    – murphy
    Commented Jul 14, 2016 at 15:21
  • Oh, I see. Although it takes an additional step, it can be a backup plan if I really can't parse the XML directly the way I want. I should automate the process of copying the data in XML to XSL, because the XML file that is used is dynamic. For instance, today's run can use a.xml, but the next run can use either b.xml or c.xml.
    – Aryise
    Commented Jul 14, 2016 at 15:24
