
xml1.xml

    <app>
    <bbb>
    <jjj>test1</jjj>
    </bbb>
    <bbb>
    <jjj>test2</jjj>
    </bbb>
    </app>

xml2.xml


    <app>
    <bbb>
    <jjj>test2</jjj>
    </bbb>
    <bbb>
    <jjj>test3</jjj>
    </bbb>
    <bbb>
    <jjj>test4</jjj>
    </bbb>
    </app>

Can I combine the two files into one file, as below?

    <app>
    <bbb>
    <jjj>test1</jjj>
    </bbb>
    <bbb>
    <jjj>test2</jjj>
    </bbb>
    <bbb>
    <jjj>test3</jjj>
    </bbb>
    <bbb>
    <jjj>test4</jjj>
    </bbb>
    </app>

3 Answers

Adapted from https://stackoverflow.com/questions/10163675/merge-xml-files-in-php

    $doc1 = new DOMDocument();
    $doc1->load('xml1.xml');

    $doc2 = new DOMDocument();
    $doc2->load('xml2.xml');

    // get 'app' element of document 1
    $app1 = $doc1->getElementsByTagName('app')->item(0);

    // iterate over 'bbb' elements of document 2
    $items2 = $doc2->getElementsByTagName('bbb');
    for ($i = 0; $i < $items2->length; $i++) {
        $item2 = $items2->item($i);

        // import/copy item from document 2 to document 1
        $item1 = $doc1->importNode($item2, true);

        // append imported item to document 1 'app' element
        $app1->appendChild($item1);
    }

    $doc1->save('merged.xml');
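
Note that the loop above appends every <bbb> from the second document, so test2 would appear twice in merged.xml. Here is a minimal sketch of the same import-and-append idea with a duplicate check, written in Python's standard xml.etree.ElementTree rather than PHP. The function name merge_unique is hypothetical, the inline strings stand in for xml1.xml and xml2.xml, and treating two <bbb> elements as duplicates when their <jjj> text matches is an assumption based on the example data.

```python
import xml.etree.ElementTree as ET


def merge_unique(first_xml: str, second_xml: str) -> str:
    """Append <bbb> elements of second_xml to first_xml, skipping repeated <jjj> text.

    Hypothetical helper for illustration only.
    """
    root1 = ET.fromstring(first_xml)
    root2 = ET.fromstring(second_xml)

    # <jjj> values already present in the first document
    seen = {bbb.findtext("jjj") for bbb in root1.findall("bbb")}

    for bbb in root2.findall("bbb"):
        key = bbb.findtext("jjj")
        if key in seen:
            continue  # e.g. test2 exists in both inputs
        seen.add(key)
        root1.append(bbb)

    return ET.tostring(root1, encoding="unicode")


# Inline stand-ins for the two files from the question (assumption).
XML1 = "<app><bbb><jjj>test1</jjj></bbb><bbb><jjj>test2</jjj></bbb></app>"
XML2 = (
    "<app><bbb><jjj>test2</jjj></bbb><bbb><jjj>test3</jjj></bbb>"
    "<bbb><jjj>test4</jjj></bbb></app>"
)

merged = merge_unique(XML1, XML2)
print(merged)
```

To work on real files, ET.parse('xml1.xml').getroot() could replace ET.fromstring, and ElementTree.write could save the result.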

    It looks like you can do a merge sort and then prune the result. With -m, sort assumes you know what you're doing: it treats its inputs as already sorted and runs a single pass over two or more of them, interleaving lines as their lexicographic sort order converges.
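
A rough model of that single pass, sketched with Python's heapq.merge rather than GNU sort (an assumption made purely for illustration). Like a merge-only sort, it never checks that its inputs are sorted; it just keeps emitting the smaller of the current head lines. The two lists are the question's files with one tag per line and no indentation, which is also an assumption. The exact interleaving will differ from what GNU sort prints, because that depends on the locale's collation rules and on any leading whitespace in the real files.

```python
import heapq

# The question's two files, one tag per line (layout is an assumption).
file1 = [
    "<app>", "<bbb>", "<jjj>test1</jjj>", "</bbb>",
    "<bbb>", "<jjj>test2</jjj>", "</bbb>", "</app>",
]
file2 = [
    "<app>", "<bbb>", "<jjj>test2</jjj>", "</bbb>",
    "<bbb>", "<jjj>test3</jjj>", "</bbb>",
    "<bbb>", "<jjj>test4</jjj>", "</bbb>", "</app>",
]

# Single pass: repeatedly take the smaller head line (plain code-point order).
merged = list(heapq.merge(file1, file2))
print("\n".join(merged))
```

Every input line survives the merge, including both <app> and both </app> tags, which is why a pruning step is still needed afterwards.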

    Here's what GNU sort -m (--merge) prints for your example:


        <app>
        <app>
        <bbb>
        <bbb>
        <jjj>test1</jjj>
        </bbb>
        <bbb>
        <jjj>test2</jjj>
        </bbb>
        <bbb>
        <jjj>test2</jjj>
        </bbb>
        </app>
        <jjj>test3</jjj>
        </bbb>
        <bbb>
        <jjj>test4</jjj>
        </bbb>
        </app>

    So at least it's all folded in now but, like I said, you still have to prune it. This sed script will do that for your examples:

        sort -m /tmp/xml[12] | sed -ne:n -e'$!s|/a..> *$|bbb>|;$p' \
            -e'\|^[^>]*b.*\n|{N;P;D;}' \
            -eN -e's|\(.*\)\n\(.*\n\)* *\1 *$|\1|' \
            -e's|\n|&|3;tD' -ebn -e:D -eP\;D

    It just ensures it's got at least three lines stacked as it works through the input, and it compares the first line in the stack against the last when the first line isn't a <bbb> tag.


        <app>
        <bbb>
        <jjj>test1</jjj>
        </bbb>
        <bbb>
        <jjj>test2</jjj>
        </bbb>
        <bbb>
        <jjj>test3</jjj>
        </bbb>
        <bbb>
        <jjj>test4</jjj>
        </bbb>
        </app>

      You can't do this reliably with plain shell tools on Linux. To handle XML, you really need an XML parser.

      However, plenty of scripting languages have suitable libraries. My personal favourite is Perl with the XML::Twig library. (This is very commonly available in Unix package managers, despite not being part of 'core'.)

          #!/usr/bin/env perl
          use strict;
          use warnings;

          use XML::Twig;

          #load both
          my $first  = XML::Twig->new->parsefile('xml1.xml');
          my $second = XML::Twig->new->parsefile('xml2.xml');

          #iterate bbb elements in second file
          foreach my $bbb ( $second->get_xpath('//bbb') ) {

              #extract 'text' of jjj element (of this bbb element)
              my $jjj = $bbb->first_child_text('jjj');

              #use xpath query to check it doesn't exist first.
              if ( not $first->get_xpath("//bbb/jjj[string()='$jjj']") ) {
                  print $jjj, " not in first, splicing\n";

                  #cut/paste (note - done in memory, so original file isn't altered)
                  $bbb->move( 'last_child', $first->root );
              }
          }

          #set output formatting - can do some odd things with particularly strange XML.
          $first->set_pretty_print('indented_a');
          $first->print;

          ## if you want to save it:
          open( my $output, '>', "combined.xml" ) or die $!;
          print {$output} $first->sprint;
          close($output);
