I've been trying to extract form data from a huge file. I need a very specific pattern, which has eluded me so far.
The log has this consistent part:

Machine info and user info blah blah blah [senderID=60, ipaddress=/10.1.1.11:8443, serviceIdinList=[13], serviceBitbox=11111, servicesList= | BeatController | BeatMaker | WaveShow, client=apache, 

All lines look like this.
From each such line I need to produce this:

senderID=60, ipaddress=/10.1.1.11:8443, serviceIdinList=[13], serviceBitbox=11111, servicesList= | BeatController | BeatMaker | WaveShow, 

*Note: everything after "WaveShow," is irrelevant, as is everything before "senderID".

I've tried this command from a post here:

sed -n '/servicesList=/{s/.*servicesList=//;s/\S*=.*//;p}'

but it only prints out

servicesList= | BeatController | BeatMaker | WaveShow

I have tried modifying it over several iterations with regex, and played with grep and sed, but made no progress.

    2 Answers

    0

    If you want to output everything from senderID= through "WaveShow," (inclusive), then you need this sed command:

    sed -n 's/.*\(senderID=.*WaveShow,\).*/\1/p' 

    This will capture everything between those two strings using the \( and \) brackets and output it using \1 (and \2 etc. if you have more captures).
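As a quick check, the sed command above can be run against the sample line from the question. This is only a sketch: the variable names `line` and `result` are made up for illustration, and the input is fed through printf rather than read from the real log file.

```shell
# Sample log line copied from the question (variable name is illustrative).
line='Machine info and user info blah blah blah [senderID=60, ipaddress=/10.1.1.11:8443, serviceIdinList=[13], serviceBitbox=11111, servicesList= | BeatController | BeatMaker | WaveShow, client=apache, '

# -n suppresses default output; the trailing p prints only lines where
# the substitution succeeded, and \1 is the text captured by \( ... \).
result=$(printf '%s\n' "$line" | sed -n 's/.*\(senderID=.*WaveShow,\).*/\1/p')

printf '%s\n' "$result"
```

For the real log, the same sed command would presumably take the file name as its last argument instead of reading from a pipe.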

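    Note that the leading .* is "greedy", meaning that if the string senderID= appears twice in a line, the match starts at the last occurrence and everything before it, including the first senderID=, is discarded. If this is not what you want, then sed is not the correct tool, because its regular expressions have no non-greedy quantifier; perl can handle this. The command then becomes: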

    perl -ne 'print if s/.*?(senderID=.*WaveShow,).*/$1/' 

    -n means "execute a loop for each line of input, and don't print the line at the end of the loop". -e specifies the expression to execute inside the loop.

    The ? after the .* changes the * to match as little as possible (i.e. match non-greedily). The parentheses cause perl to group that part and capture it, which can then be used as $1 for the first capture, $2 for the second, etc.

    However, that is not the optimal way of doing it in perl. The following is a lot better, as it does not modify the string needlessly; it just captures the text and prints it:

    perl -ne 'print "$1\n" if /(senderID=.*WaveShow,)/' 

    There are probably many more ways of doing this in perl, perhaps even more efficiently.

      0

      Is the trailing comma required?

      If not, this should work:

      grep senderID filename | cut -d '[' -f 2- | cut -d ',' -f -5

      Output:

      senderID=60, ipaddress=/10.1.1.11:8443, serviceIdinList=[13], serviceBitbox=11111, servicesList= | BeatController | BeatMaker | WaveShow
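To see what each cut stage contributes, the pipeline can be split in two and run on the sample line from the question. This is a sketch with illustrative variable names (`line`, `after_bracket`, `result`); the grep step is left out because there is only one input line.

```shell
# Sample log line copied from the question.
line='Machine info and user info blah blah blah [senderID=60, ipaddress=/10.1.1.11:8443, serviceIdinList=[13], serviceBitbox=11111, servicesList= | BeatController | BeatMaker | WaveShow, client=apache, '

# Stage 1: split on '[' and keep field 2 onward, i.e. drop everything up
# to and including the first '['. Later '[' characters are preserved.
after_bracket=$(printf '%s\n' "$line" | cut -d '[' -f 2-)

# Stage 2: split on ',' and keep the first five fields, which ends the
# output just before the comma that follows WaveShow.
result=$(printf '%s\n' "$after_bracket" | cut -d ',' -f -5)

printf '%s\n' "$result"
```

This approach assumes the text before senderID never contains a '[' and that the wanted part always has exactly five comma-separated fields; if either varies between lines, a pattern-based command is likely the safer choice.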
