\$\begingroup\$

As part of my bash routine, I print a message in the terminal about the status of the workflow. The message is split into two parts (part 1: the beginning of the task; part 2: the status on finishing):

echo -n "Dataset is being processed ! "; execution of some AWK script; echo " Processing has been COMPLETED!" 

Here is the implementation in bash, including part of the AWK code:

# print phrase 1: initiation of the process
echo -n "Dataset is being rescored.. Please wait"; sleep 0.5
# this is the process: make a directory for the results and execute the AWK code to process the input file
mkdir ${results}
# Apply the following AWK code to the directories containing the input file
while read -r d; do
    awk -F, ' }' "${d}_"*/target_file.csv > "${results}/"${d%%_*}".csv"
done < <(find . -maxdepth 1 -type d -name '*_*_*' | awk -F '[_/]' '!seen[$2]++ {print $2}')
# print phrase 2: the result of the process, which appears in the terminal next to phrase 1
# this prints "COMPLETED" letter by letter with a pause of 0.2 sec between letters
echo -n " C"; sleep 0.2; echo -n "O"; sleep 0.2; echo -n "M"; sleep 0.2; echo -n "P"; sleep 0.2; echo -n "L"; echo -n "E"; sleep 0.2; echo -n "T"; echo -n "E"; sleep 0.2; echo "D!"

When I run this script in bash, everything seems to be OK, and I have not noticed any problems related to the code between the two 'echo -n' blocks. Could splitting the status phrase with "echo -n" in this way lead to bugs in the routine? Any suggestions for implementing such a status message in bash using different syntax?

\$\endgroup\$
  • \$\begingroup\$Micro-review - printf %s is more portable than echo -n.\$\endgroup\$
    Commented Apr 29, 2021 at 14:03
  • \$\begingroup\$Thanks! Should I then replace all the echo calls in my code with printf when printing messages to the terminal during script execution? What is the advantage of using printf >&2? Cheers\$\endgroup\$
    – Hot JAMS
    Commented Apr 30, 2021 at 7:35

1 Answer

\$\begingroup\$

I see very little error-checking in this script. It's important to know whether mkdir succeeded, for example (it certainly won't as it stands, as results is never assigned).

We really ought to be quoting variable expansions, to prevent unwanted word-splitting:

mkdir "$results" || exit 
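A slightly fuller sketch of that check might look like the following. The function name make_results_dir and the default value given to results are hypothetical (the original script never assigns results), so treat them as placeholders:

```shell
#!/usr/bin/env bash

# Hypothetical default: the original script never assigns 'results',
# so a placeholder location is assumed here.
results=${results:-"${TMPDIR:-/tmp}/rescore_results"}

# make_results_dir DIR: create DIR, reporting failure on stderr.
make_results_dir() {
    # -p: do not fail if the directory already exists.
    if ! mkdir -p -- "$1"; then
        printf 'Cannot create directory: %s\n' "$1" >&2
        return 1
    fi
}

# Stop early rather than let the later redirections fail one by one.
make_results_dir "$results" || exit 1
```

Sending the diagnostic to stderr (>&2) keeps it apart from any data the script writes to standard output.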

The arbitrary sleep values need documenting. Why do we need to sleep, and how was the duration determined? Can we wait for something instead?

Not all echo implementations accept the -n option. To be portable, we should use printf %s instead.
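As a minimal sketch of the two-part status message written with printf, something like this could work. The helper names status_begin and status_end are made up for illustration, and sending the messages to stderr is an assumption about how the script is used, not something the question requires:

```shell
#!/usr/bin/env bash

# status_begin MSG: print MSG without a trailing newline.
# Passing the text as an argument to '%s' means any '%' or '\' in it
# is printed literally rather than being interpreted by printf.
status_begin() {
    printf '%s' "$1" >&2
}

# status_end MSG: finish the same terminal line with a space, MSG and a newline.
status_end() {
    printf ' %s\n' "$1" >&2
}

status_begin 'Dataset is being rescored.. Please wait'
:   # placeholder for the real work (mkdir, the awk loop, ...)
status_end 'COMPLETED!'
```

Writing status text to stderr means that redirecting the script's standard output to a file or pipe captures only data, while the messages still reach the terminal.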

The Awk command in the read loop looks like it's corrupted: the closing brace is unmatched.

Since this program is quite chatty, consider printing each file name as it's reached. Or a fraction complete:

readarray -t files \
  < <(find . -maxdepth 1 -type d -name '*_*_*' | awk -F '[_/]' '!seen[$2]++ {print $2}')
for i in "${!files[@]}"
do
    d=${files[$i]}
    printf 'Processing (%d/%d)\r' "$i" "${#files[@]}"
    awk -F, $'\n}' "${d}_"*/target_file.csv >"$results/${d%%_*}.csv"
done

I don't understand why we have the !seen[$2]++ {print $2} in the find pipeline - find will output each filename exactly once anyway. Much better to have find print just the directories, and zero-terminate them so we're robust enough to handle all possible filenames:

readarray -t -d '' files \
  < <(find * -maxdepth 0 -type d -name '*_*_*' -print0)
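To illustrate why the zero-terminated form is more robust, here is a small self-contained sketch. It assumes Bash 4.4 or later (for readarray -d ''), builds a throwaway directory tree with mktemp, and uses a hypothetical helper list_dirs; none of these names come from the original script:

```shell
#!/usr/bin/env bash

# list_dirs ROOT: print the immediate subdirectories of ROOT whose names
# match *_*_*, each terminated by a NUL byte.
list_dirs() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -name '*_*_*' -print0
}

# Throwaway test tree; the second name contains a newline on purpose.
demo=$(mktemp -d)
mkdir "$demo/a_b_c" "$demo/"$'odd\nname_1_2' "$demo/plain"

# -d '' splits the input on NUL, so the embedded newline stays inside one element.
readarray -t -d '' dirs < <(list_dirs "$demo")

for i in "${!dirs[@]}"; do
    # %q shows each name in shell-quoted form, making the newline visible.
    printf 'Processing (%d/%d): %q\n' "$((i + 1))" "${#dirs[@]}" "${dirs[$i]##*/}" >&2
done

rm -rf -- "$demo"
```

With newline-delimited input, the directory whose name contains a newline would have been split into two array elements; here the array holds exactly the two matching directories.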
\$\endgroup\$
