3

I have a file, say "SAMPLE.txt", with the following content:

P1
10,9:6/123456
P2
blah blah
P1
10,9:5/98765
P2
blah blah
P1
blah blah
P2

I want an output file, say "RESULT.txt", like this:

Value1:123456
Value2:98765
Value3:NULL

I first need to fetch the content between each P1 and P2 pair, then find the value that follows 10,9*/ and save it as a separate value. In case a P1--P2 block doesn't contain this value, I want to save it as "NULL".

How can I code the above in shell/awk?

I am very new to scripting.

    5 Answers

    2

    This works and is fully portable:

    sed '\|^P1.*|!d;s||Value:|
    :n
    N;\|\nP2|!bn
    s|:.*\n10,9[^/]*/|:|
    s|\n.*||;s|:$|:NULL|'

    The flow works like this:

    1. First it addresses a line ^beginning with P1

    2. If the current line !doesn't match it deletes it.

    3. If it does, it replaces P1 with Value:

    4. It then sets the :next label and pulls in the Next line.

    5. If \nP2 is !not yet found, it branches back to the :next label and pulls in another line, repeating until it is.

    6. It then replaces everything from the : through \n10,9 and up to the first occurring / character with a single :.

    7. It deletes the first \newline available and all following characters.

    8. If the last character is the :colon following Value it inserts the string NULL.

    Steps 6 and 7 are what makes it work - in 6 it deletes the \newline preceding your desired number string, but if that isn't there then the next \newline will be the one immediately following Value: so everything goes in step 7.
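To make the flow above easier to follow, here is the same script laid out one command per line, with a comment for each numbered step. Treat it as a sketch: the wrapper function name `extract_values` and the temporary sample file are assumptions added for illustration, and the sample assumes each "blah blah" sits on a single line.

```shell
# Hypothetical wrapper (name is an assumption); $1 is the input file.
extract_values() {
    sed '
# 1-2: delete every line that does not begin with P1
\|^P1.*|!d
# 3: the empty regex reuses the last one, so the P1 line becomes "Value:"
s||Value:|
# 4: set label n, then append the next input line to the pattern space
:n
N
# 5: keep looping until a line starting with P2 has been appended
\|\nP2|!bn
# 6: replace everything from the colon through "10,9.../" with a colon
s|:.*\n10,9[^/]*/|:|
# 7: drop the first remaining newline and everything after it
s|\n.*||
# 8: nothing after the colon means the block had no value
s|:$|:NULL|
' "$1"
}

# Assumed sample data, written to a temporary file.
sample=$(mktemp)
printf '%s\n' P1 '10,9:6/123456' P2 'blah blah' \
              P1 '10,9:5/98765' P2 'blah blah' \
              P1 'blah blah' P2 > "$sample"
result=$(extract_values "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Splitting the commands onto separate lines does not change what sed does; it only leaves room for the `#` comments.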

    Here it is in action:

    sed '\|^P1.*|!d;s||Value:|
    :n
    N;\|\nP2|!bn
    s|:.*\n10,9[^/]*/|:|
    s|\n.*||;s|:$|:NULL|' <<\DATA
    P1
    10,9:6/123456
    P2
    blah blah
    P1
    10,9:5/98765
    P2
    blah blah
    P1
    blah blah
    P2
    DATA

    OUTPUT:

    Value:123456
    Value:98765
    Value:NULL
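The question asks for numbered labels (Value1, Value2, ...), while the sed output carries a bare `Value:` on every line. One possible way to add the counter is to pipe the result through a small awk filter. This is a sketch only; the function name `number_values` is an assumption, and it relies on sed having emitted exactly one line per block.

```shell
# Hypothetical helper (name is an assumption): reads "Value:..." lines on stdin.
number_values() {
    # NR is the current line number, so each line gets its own index.
    awk '{ sub(/^Value:/, "Value" NR ":") } 1'
}

numbered=$(printf '%s\n' 'Value:123456' 'Value:98765' 'Value:NULL' | number_values)
printf '%s\n' "$numbered"
```

In practice the filter would sit at the end of the sed pipeline, for example `sed '...' SAMPLE.txt | number_values > RESULT.txt`.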
    3
    • @StéphaneChazelas - thanks very much for the edit - I didn't know labels had to be separated in that way. Which leaves me wondering - is it only that they can't precede other statements on the line but can be preceded themselves, or is it neither? I'm not going to alter it - it's cleaner looking this way anyway - but would it work POSIXly if it were instead: ...s||Value|;:n?
      – mikeserv
      Commented Jun 9, 2014 at 19:53
    • cmd;:n is OK. Label names can have any character including ; (possibly not trailing blanks). In that regard GNU sed is not POSIX compliant (though nobody in their right mind would use labels such as n;N;...).
      Commented Jun 9, 2014 at 19:58
    • @StéphaneChazelas - I was actually just looking at it. I wonder why GNU sed discriminates between the way it handles :label and r,a,i,w - it obeys the spec in all of the latter cases but bends the rules for the former though it's the same rule for all. Probably it's what you say - the semi-colon makes the difference. As for being in my right mind - I don't think any such claim would stand up under scrutiny, so I don't make it... Anyway, thanks again, as always. Someday you're gonna let me know where you keep all of that info...
      – mikeserv
      Commented Jun 9, 2014 at 20:04
    2

    With perl (slurps the whole file in memory though):

     perl -0777 -ne 'while (/P1\n(.*?)\nP2/gs) { printf "Value%d:%s\n", ++$n, $1 =~ /^10,9.*\// ? $'\'': "NULL"}' 
      1

      A perl solution:

      $ perl -F'/' -alne '
          if (/P1/../P2/ and $_ !~ /^P/) {
              print "Value@{[++$i]}:",$F[1]?$F[1]:"NULL";
          } ' file
      Value1:123456
      Value2:98765
      Value3:NULL

      An awk solution:

      $ awk -F'/' '/P2/{f=0};/P1/{f=1;next};f{print "Value"++i":"($2?$2:"Null")}' file
      Value1:123456
      Value2:98765
      Value3:Null
        1

        Another awk solution:

        $ awk '$0=="P1" {f=1} $0=="P2" {f=0}f' file | paste - - | \
          awk -F"\t" '$2~/[0-9]/ {split ($2,a,"/"); \
          print "Value"NR":"a[2]} $2!~/[0-9]/ {print "Value"NR":NULL"}'
        Value1:123456
        Value2:98765
        Value3:NULL
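The awk approaches above act on each line between the markers, so they appear to rely on every P1--P2 block holding exactly one line. If a block can hold several lines, a single awk pass that remembers the value and prints once at P2 may be safer. The following is a minimal sketch under that assumption; the function name `extract_numbered` and the variable names are made up for illustration.

```shell
# Hypothetical helper (name is an assumption); $1 is the input file.
extract_numbered() {
    awk '
    # Block start: assume no value until a 10,9 line proves otherwise.
    $0 == "P1" { inblock = 1; val = "NULL"; next }
    # Block end: emit exactly one numbered line per block.
    $0 == "P2" && inblock {
        printf "Value%d:%s\n", ++n, val
        inblock = 0
        next
    }
    # First 10,9 line inside the block: keep the text after the first slash.
    inblock && /^10,9/ && val == "NULL" {
        slash = index($0, "/")
        if (slash && slash < length($0)) val = substr($0, slash + 1)
    }
    ' "$1"
}

# Assumed sample data; the last block holds two filler lines on purpose.
sample_c=$(mktemp)
printf '%s\n' P1 '10,9:6/123456' P2 'blah blah' \
              P1 '10,9:5/98765' P2 'blah blah' \
              P1 blah blah P2 > "$sample_c"
result_c=$(extract_numbered "$sample_c")
printf '%s\n' "$result_c"
rm -f "$sample_c"
```

Because output happens only when P2 is seen, the number of lines inside a block does not affect the numbering.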
          0

          Thanks a lot guys. This is the piece of code that finally solved my problem.

          nawk -v fname="${filename}" -F '/|:' '
          function isnum(x){return(x==x+0)}
          /P1/,/P3/{
              # Found start: increment i, reset variables, go to next line
              if(/P1/){
                  ++i
                  fid =""
                  count++
                  next
              }
              # Found end: validate variable and print, go to next line
              if(/P3/){
                  printf "%s|",count
                  printf "%s|",isnum(fid)?fid:"NULL"
                  next
              }
              if(!fid && /36,59:*/) {
                  fid = $NF
              }
          }
          ' "${filename}" >>output.txt

          But now I am having another issue for which I have created a separate thread.

          Here is the link if you guys can help.

          https://stackoverflow.com/questions/24277167/finding-and-replacing-text-inside-awk-block?noredirect=1#comment37509363_24277167
