3

I want to combine two sed commands into one, but I don't know how; I've tried several things without success.

I want every slash between a specific pair of tags to be replaced with a backslash, which would give:

Source:

<FilePath>a/b/c/d</FilePath>
<OtherTags>Bob</OtherTags>
<FilePath>1/2/3/4</FilePath>

Result:

<FilePath>a\b\c\d</FilePath>
<OtherTags>Bob</OtherTags>
<FilePath>1\2\3\4</FilePath>

I found this command, which changes the text between the tags:

sed -i -e 's/\(<FilePath>\).*\(<\/FilePath>\)/<FilePath>TEXT_TO_REPLACE_BY<\/FilePath>/g' test.txt 

But this command replaces all of the text. What I want is to keep the text and only replace the slashes with backslashes, with this kind of command:

sed -e 's/\//\\/g' test.txt 

But I struggle to combine those two.

Thanks for your help.

  • This is possible with sed, but difficult. It would be easier with awk and even easier with perl. Is awk available? Perl?
    Commented Sep 7, 2020 at 19:37
  • Can you assume that there are no < or > inside the tags (and in particular no nested tags)? Otherwise you really need a library that can parse XML properly.
    Commented Sep 7, 2020 at 19:42
  • Since the whole point of your question is to not replace / with \ inside of <OtherTags>, it would have been useful for testing if your example had included lines where the text within <OtherTags> contains slashes. Just Bob doesn't give us anything to prove whether a potential solution works or not. Bob/Smith would have been much more useful.
    – Ed Morton
    Commented Sep 8, 2020 at 21:43

6 Answers

3

With GNU awk for the 3rd arg to match():

awk 'match($0,/(.*<FilePath>)(.*)(<\/FilePath>.*)/,a){ gsub("/","\\",a[2]); $0=a[1] a[2] a[3] } 1' file
<FilePath>a\b\c\d</FilePath>
<OtherTags>Bob/Smith</OtherTags>
<FilePath>1\2\3\4</FilePath>

The above was run using this input:

$ cat file
<FilePath>a/b/c/d</FilePath>
<OtherTags>Bob/Smith</OtherTags>
<FilePath>1/2/3/4</FilePath>
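For an awk without the three-argument match() (it is a GNU extension), a minimal sketch of the same idea using only index(), substr() and gsub() could look like the following. The function name fix_filepaths is hypothetical, and the sketch assumes that every <FilePath> is closed on the same line and contains no nested tags. It walks the line element by element, so more than one FilePath element per line is handled.

```shell
# fix_filepaths: hypothetical helper name; reads stdin, writes stdout.
fix_filepaths() {
  awk '
  {
    out = ""; rest = $0
    # Consume the line one <FilePath>...</FilePath> element at a time.
    while ((s = index(rest, "<FilePath>")) > 0) {
      head = substr(rest, 1, s + 9)     # text up to and including <FilePath>
      tail = substr(rest, s + 10)       # everything after the opening tag
      e = index(tail, "</FilePath>")
      if (e == 0) break                 # no closing tag on this line: leave the rest alone
      body = substr(tail, 1, e - 1)     # text between the two tags
      gsub("/", "\\", body)             # only this part is rewritten
      out = out head body "</FilePath>"
      rest = substr(tail, e + 11)       # continue after the closing tag
    }
    print out rest
  }'
}

printf '%s\n' '<FilePath>a/b</FilePath> <OtherTags>Bob/Smith</OtherTags> <FilePath>1/2</FilePath>' | fix_filepaths
# prints: <FilePath>a\b</FilePath> <OtherTags>Bob/Smith</OtherTags> <FilePath>1\2</FilePath>
```

Text outside the FilePath elements is copied through untouched, which is why Bob/Smith keeps its slash.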
    1

    command

    sed -e 's/\//\\/g' -e 's/<\\/<\//g' filename 

    output

    <FilePath>a\b\c\d</FilePath>
    <OtherTags>Bob</OtherTags>
    <FilePath>1\2\3\4</FilePath>
      1

      With GNU sed in extended regex mode, we can progressively replace one slash at a time inside the XML tag FilePath, provided it opens and closes on the same line and assuming the tag is not part of quotes or comments.

      sed -Ee ' :a;s|<(FilePath)>([^/]*(/[^/]*)*)/([^/]*</\1>)|<\1>\2\\\4|;ta' file 
      perl -lpe ' s{<FilePath>\K.*?(?=</FilePath>)} <$& =~ tr|/|\\|r>xge; ' file 

      We isolate the portion between the opening and closing of tag and transform the forward slashes to backslashes in the portion therein.

      We can also write the regex across several lines to make the intent easier to read, then strip the whitespace before handing it to sed:

      snr='
        s| <(FilePath)>
           ( [^/]* ([/][^/]*)* ) / ( [^/]* ) </\1>
         |<\1> \2 \\ \4 </\1>|
      '
      ws=$'\t \n'
      sed -E ":a;${snr//[$ws]/};ta" file
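The looping idea (replace one slash, then branch back with t while a substitution succeeded) can also be sketched with a narrower pattern. This is a variant, not the command above: it uses [^<]* so that a match can never run across another tag, which matters when several elements share one line. The helper name filepath_slashes_sed is hypothetical, and the sketch assumes there are no nested tags inside FilePath.

```shell
# Hypothetical helper: reads stdin, writes stdout.
# Each pass turns one "/" inside a FilePath element into "\" (the last
# remaining one of the leftmost element that still has a slash);
# "ta" jumps back to label "a" for as long as a substitution succeeded.
# [^<]* keeps the match inside a single element, so slashes in other
# tags on the same line are never reached.
filepath_slashes_sed() {
  sed -E -e ':a' -e 's|(<FilePath>[^<]*)/([^<]*</FilePath>)|\1\\\2|' -e 'ta'
}

printf '%s\n' '<FilePath>a/b/c</FilePath> <OtherTags>Bob/Smith</OtherTags>' | filepath_slashes_sed
# prints: <FilePath>a\b\c</FilePath> <OtherTags>Bob/Smith</OtherTags>
```

Splitting the script into separate -e expressions avoids relying on a semicolon after the label, which not every sed accepts.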
        1

        Assuming that the document is a well formed XML document such as

        <?xml version="1.0"?>
        <root>
          <FilePath>a/b/c/d</FilePath>
          <OtherTags>Bob</OtherTags>
          <FilePath>1/2/3/4</FilePath>
        </root>

        Then, using xmlstarlet, we may convert the forward slashes to backslashes in the values of all FilePath nodes:

        $ xmlstarlet ed -u '//FilePath' -x 'translate(., "/", "\")' file.xml
        <?xml version="1.0"?>
        <root>
          <FilePath>a\b\c\d</FilePath>
          <OtherTags>Bob</OtherTags>
          <FilePath>1\2\3\4</FilePath>
        </root>

        The XPath function translate() changes the characters in the 2nd argument (/) to the characters in the 3rd argument (\) in the string referenced by the 1st argument (., the current node's value). The translate() function is applied to all nodes matching the XPath //FilePath. This XPath pattern matches FilePath nodes anywhere in the entire document.

          0

          Here's a fairly simple solution built with a pipeline using sed and tr. It assumes that:

          • There are no nested tags inside <FilePath>…</FilePath> (the replacement is only performed up to the next < after <FilePath>).
          • <FilePath> does not appear inside a literal string (no <![CDATA[<FilePath>/blah]]> or <mytag label="<FilePath>">).
          • Your sed implementation processes the last line correctly even if it doesn't have a terminating newline.

          The principle is to use tr to switch between < and line breaks; that way, when sed processes a “line”, it's actually processing the text between one open/close tag and the next open/close tag.

          tr '<\n' '\n<' | sed '/^FilePath>/ y:/:\\:' | tr '<\n' '\n<' 
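To see what sed actually receives, the first tr step can be run on its own. The sketch below uses a made-up two-line sample and a hypothetical wrapper name, swap_and_translate, and it relies on the assumption listed above that sed keeps the last line unterminated when the input has no final newline (GNU sed does).

```shell
sample='<FilePath>a/b</FilePath>
<OtherTags>Bob/Smith</OtherTags>'

# Step 1 only: swap "<" and newline, so every tag starts a new "line".
# The lines printed are, in order: an empty line, "FilePath>a/b",
# "/FilePath><", "OtherTags>Bob/Smith" and "/OtherTags><".
printf '%s\n' "$sample" | tr '<\n' '\n<'
echo

# Full round trip: only "lines" that start with "FilePath>" are translated,
# then the second tr restores the original layout.
swap_and_translate() {
  tr '<\n' '\n<' | sed '/^FilePath>/ y:/:\\:' | tr '<\n' '\n<'
}

printf '%s\n' "$sample" | swap_and_translate
```

Because a closing tag becomes a line starting with /FilePath>, it never matches ^FilePath> and its slash is left alone.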

          Here's a Perl solution. It assumes that:

          • There are no nested tags inside <FilePath>…</FilePath>.
          • <FilePath> does not appear inside a literal string (no <![CDATA[<FilePath>/blah]]> or <mytag label="<FilePath>">).
          • <FilePath> is always followed by </FilePath> on the same line.

          The construction is fairly natural: it applies the function backslashify to the text inside <FilePath>…</FilePath>. The regex .*? is a non-greedy match: a greedy match .* would replace everything from the first <FilePath> to the last </FilePath> on the line if there were multiple <FilePath>…</FilePath> chunks on the same line. s:(?<!<)/(?!>):\\:g is a fancier version of tr:/:\\: that avoids replacing slashes if they come immediately before > or after <, which allows nested tags.

          perl -pe 'sub backslashify {local $_ = $_[0]; s:(?<!<)/(?!>):\\:g; return $_} s:(<FilePath>)(.*?)(</FilePath>):$1.backslashify($2).$3:eg' 
            0

            To avoid confusion, I'd advise using a separator other than / in sed when / appears in your patterns. I personally use ~ in such cases. A one-liner that suits your needs looks like this:

             sed '/<FilePath>/s~/~\\~g;/<\\FilePath>/s~<\\~</~' your_file 

            Explanation: /<FilePath>/s~/~\\~g replaces / with \ on lines containing the substring <FilePath>. This replaces every /, so </FilePath> becomes <\FilePath>. To revert that, the second part of the expression is needed: /<\\FilePath>/s~<\\~</~

            • No, this is not at all what the question requested. The goal is to replace slashes with backslashes only between the tags, not on the whole line.
              Commented Sep 7, 2020 at 19:39
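The objection is easy to check with a line that carries a slash outside the FilePath element. The sketch below wraps the one-liner from this answer in a hypothetical function, whole_line_variant, and feeds it a made-up single-line sample.

```shell
# Hypothetical wrapper around the one-liner from this answer; reads stdin.
whole_line_variant() {
  sed '/<FilePath>/s~/~\\~g;/<\\FilePath>/s~<\\~</~'
}

printf '%s\n' '<FilePath>a/b</FilePath> <OtherTags>Bob/Smith</OtherTags>' | whole_line_variant
# prints: <FilePath>a\b</FilePath> <OtherTags>Bob\Smith<\OtherTags>
```

The first command rewrites every slash on the line, and the second restores only the first <\ it finds, so both the content and the closing tag of OtherTags end up altered. With one element per line, as in the question's sample, the difference does not show.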
