
I am writing a script to run a piece of software. Inside a while loop, I want to trim the text held in a variable so that the result can be used in other parts of the command. What is the correct way to do this?

This code works when only one --msa file is run:

while read -r i; do raxml-ng --msa ../C049.laln_1l --model $i --prefix C049-rT; done < ../C049.model 

As a brief introduction: raxml-ng is the software I am running, and its parameters are set by the --msa, --model, and --prefix options. Every --msa file has its own corresponding --model and --prefix. I gave them the same base name to make scripting easier. E.g., C049.laln_1l needs to be matched with C049.model and C049-rT.

Building on the example above, I can loop the command over other --msa files that have the same extension like this:

while read -r i; do while read -r j; do raxml-ng --msa ../$i.laln_1l --model $j --prefix $i-rT; done < ../$i.model; done < msalist

Now I have a list of --msa files (listed in msalist) to run, and some of them have different file extensions.

The msalist file contains:

C049.laln_1l
C092.laln_1l
C016.laln_1l
gc30_part.cseq
gc3f.glist.cseq
...

I named the model and prefix using only the text before the first . in each name.

E.g., the list for the model parameter:

C049.model
C092.model
C016.model
gc30_part.model
gc3f.model
...

The same applies to the prefix parameter.

So when writing the bash script to loop over all the --msa files in msalist, I tried "$( sed 's/\..*//g' "$i" )".model to get C049.model instead of C049.laln_1l.model, but it doesn't work.

trees=$2
threads=$3
while read -r i; do
  while read -r j; do
    raxml-ng --msa ../"$i" --model "$j" --prefix "$( sed 's/\..*//g' "$i" )"-rT
  done < ../"$( sed 's/\..*//g' "$i" )".model
done < "$alnlist"

How can I trim the names in msalist so that they can be used for --model and --prefix?

    1 Answer

    To get the part before the first . in any POSIX shell, you can just use ${var%%.*}. So here:

    while IFS= read -r i; do
      prefix=${i%%.*}
      while IFS= read -r j; do
        raxml-ng --msa ../"$i" --model "$j" --prefix "$prefix-rT"
      done < "../$prefix.model"
    done < "$alnlist"
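To make the expansion more concrete, here is a minimal sketch of how ${var%%.*} behaves next to its single-% sibling. The function names strip_first_dot and strip_last_dot are hypothetical, made up for this illustration only. The sample file names are taken from the question.

```shell
#!/bin/sh
# strip_first_dot and strip_last_dot are hypothetical helper names
# used only for this illustration.

strip_first_dot() {
  # %% removes the LONGEST suffix matching the pattern ".*",
  # i.e. everything from the first dot onwards.
  printf '%s\n' "${1%%.*}"
}

strip_last_dot() {
  # % removes the SHORTEST suffix matching ".*",
  # i.e. only the last dot and what follows it.
  printf '%s\n' "${1%.*}"
}

strip_first_dot C049.laln_1l      # prints C049
strip_first_dot gc3f.glist.cseq   # prints gc3f
strip_last_dot  gc3f.glist.cseq   # prints gc3f.glist
```

In ${i%%.*}, i is the variable name, %% is the "remove longest matching suffix" operator, and .* is a shell pattern (a literal dot followed by anything), not a regular expression.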

    Also note the syntax to read a line is IFS= read -r line, not read -r line.
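As a rough illustration of why the IFS= part matters, the sketch below feeds the same line, padded with spaces, to both forms. The function names read_plain and read_verbatim are hypothetical, and the example assumes IFS still has its default value when read_plain runs.

```shell
#!/bin/sh
# read_plain and read_verbatim are hypothetical names for this demonstration.

read_plain() {
  # With the default IFS, read strips leading and trailing blanks.
  printf '%s\n' "$1" | { read -r line; printf '[%s]\n' "$line"; }
}

read_verbatim() {
  # With an empty IFS for the read command, the line is kept as-is.
  printf '%s\n' "$1" | { IFS= read -r line; printf '[%s]\n' "$line"; }
}

read_plain    '  C049.laln_1l  '   # prints [C049.laln_1l]
read_verbatim '  C049.laln_1l  '   # prints [  C049.laln_1l  ]
```

For file names without leading or trailing blanks both forms give the same result, but IFS= read -r is the form that preserves every line unchanged.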

    Here, you could also do:

    while IFS=. read -r prefix rest; do
      while IFS= read -r j; do
        raxml-ng --msa ../"$prefix.$rest" --model "$j" --prefix "$prefix-rT"
      done < "../$prefix.model"
    done < "$alnlist"
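A small sketch of what IFS=. read -r prefix rest does to a single name may help here. The function split_name is a hypothetical wrapper written just for this example.

```shell
#!/bin/sh
# split_name is a hypothetical wrapper used only to show the splitting.

split_name() {
  printf '%s\n' "$1" | {
    # Split on dots: the first field goes to prefix, and the last
    # variable (rest) receives everything that remains, dots included.
    IFS=. read -r prefix rest
    printf 'prefix=%s rest=%s\n' "$prefix" "$rest"
  }
}

split_name C049.laln_1l      # prints prefix=C049 rest=laln_1l
split_name gc3f.glist.cseq   # prints prefix=gc3f rest=glist.cseq
```

One thing to keep in mind with this variant: for a name that contains no dot at all, rest would be empty, so "$prefix.$rest" would end in a trailing dot and no longer match the original name.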

    If you wanted to use sed to remove everything starting with the first ., first note that sed 's/\..*//' removes . followed by any number of characters from every line of its input, not from the input as a whole. You would also need to pass the content of $i as input, not as an argument: sed treats its arguments as the names of files to read input from. So:

    printf '%s\n' "$i" | sed 's/\..*//' 
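If you did go the sed route, the output would still need to be captured with command substitution before it can be used in the file name or prefix. A minimal sketch, where prefix_via_sed is a hypothetical helper name and the sample value comes from the question:

```shell
#!/bin/sh
# prefix_via_sed is a hypothetical helper name for this illustration.

prefix_via_sed() {
  # Feed the VALUE to sed on standard input; do not pass it as an
  # argument, or sed would try to open a file with that name.
  printf '%s\n' "$1" | sed 's/\..*//'
}

i=gc3f.glist.cseq
prefix=$(prefix_via_sed "$i")

printf '%s\n' "../$prefix.model"   # prints ../gc3f.model
printf '%s\n' "$prefix-rT"         # prints gc3f-rT
```

This forks an extra process for every name, so the ${i%%.*} expansion remains the lighter option inside a loop.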

    for instance. To remove everything starting with the first . in the whole input, though, it would rather have to be:

    printf '%s\n' "$i" | sed '
      :1
      $!{
        # except on the last line, append the next line to the
        # pattern space and loop
        N
        b1
      }
      s/\..*//'
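The difference between the two sed commands only shows up when the value contains a newline. The sketch below compares them on a made-up two-line value. The names per_line and whole_input are hypothetical, and the expected output assumes a sed in which . also matches a newline inside the pattern space, as GNU sed does.

```shell
#!/bin/sh
# per_line and whole_input are hypothetical names for this comparison.

per_line() {
  # Substitution is applied to each input line separately.
  printf '%s\n' "$1" | sed 's/\..*//'
}

whole_input() {
  # Collect all lines into the pattern space first, then substitute once.
  printf '%s\n' "$1" | sed '
    :1
    $!{
      N
      b1
    }
    s/\..*//'
}

# A made-up value with an embedded newline.
value='a.b
c.d'

per_line "$value"      # prints "a" and "c" on separate lines
whole_input "$value"   # prints "a" only
```

For the single-line file names in the question both commands give the same result, so the simpler per-line form is enough there.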
    • Thanks for the help! I have been looking into this the whole morning! It is working now! Btw, would you mind elaborating on {i%%.*}? What does the i%% refer to?
      – web
      Commented Sep 30, 2022 at 16:11
