I have the following csv file:
"V1","V2","V3","V4","V5","V6","V7","V8","V9","V10","Class"
65,Female,0.7,0.1,187,16,18,6.8,3.3,0.9,1
62,Male,10.9,5.5,699,64,100,7.5,3.2,0.74,1
62,Male,7.3,4.1,490,60,68,7,3.3,0.89,1
58,Male,1,0.4,182,14,20,6.8,3.4,1,1
72,Male,3.9,2,195,27,59,7.3,2.4,0.4,1
46,Male,1.8,0.7,208,19,14,7.6,4.4,1.3,1
I am only interested in the columns V1:age, V2:sex, V8:grade1, V9:grade2.
I would like to create a bash script that will output the data where V9 is equal to 3 and sort the output by sex, showing the Female data first.
I am a 100% beginner with bash scripts and although I know how to obtain this output from shell, I could only come up with this when it comes to bash script commands:
#!/usr/bin/env bash
INPUT=./phpOJxGL9.csv
OLDIFS=$IFS
IFS=','
[ ! -f $INPUT ] && { echo "$INPUT file not found"; exit 99; }
echo Grade2 = 3
echo Age Sex Grade2 Grade1
echo '************************'
while read V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
do
    if [ $V9 -eg "3" ]; then
        cut -d',' -f1,2,8,9 | sort -k2 -t','
    fi
done < $INPUT
IFS=$OLDIFS
The output should look somewhat like this:
Can anyone help?
perl, awk, or Python - this task is very easy in Python using the pandas lib, and this can be used in a script.
bash, but easy in languages designed for text handling, like perl, Python, ... But see man cut paste bash.
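The tools named in these comments can be chained from a bash script. Below is a minimal sketch, not a definitive answer: filter_grade2 is a hypothetical helper name, and the value 3.3 in the example call is an assumption made only because no row in the posted sample has V9 exactly equal to 3.

```shell
#!/usr/bin/env bash
# Sketch only: filter_grade2 is a hypothetical helper name.
# Usage: filter_grade2 FILE VALUE
filter_grade2() {
    local input=$1 target=$2
    if [ ! -f "$input" ]; then
        echo "$input file not found" >&2
        return 99
    fi
    # awk:  skip the header (NR > 1) and keep rows whose 9th field equals the target.
    # cut:  keep columns 1, 2, 8 and 9 (age, sex, grade1, grade2).
    # sort: order by the 2nd field only, so Female sorts before Male.
    awk -F',' -v t="$target" 'NR > 1 && $9 == t' "$input" |
        cut -d',' -f1,2,8,9 |
        sort -t',' -k2,2
}

# Example call (file name taken from the question; 3.3 is an assumed value,
# chosen because no sample row has V9 exactly equal to 3):
# filter_grade2 ./phpOJxGL9.csv 3.3
```

awk does the comparison here because it handles decimal values such as 3.3, which the shell's own integer tests do not. Called with 3 instead of 3.3, the sketch prints nothing for the sample data.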
if [$V9 -eg 3]; then is junk: the variable should be double-quoted, it needs spaces either side of each square bracket, the operator for equal is -eq, none of your data is equal to 3, and shell does not do real numbers anyway. This cannot be the script you are running. Pass all scripts through shellcheck.net before running.
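For illustration, the points above could be applied to the read loop roughly as follows. This is a sketch under assumptions: print_matching is a hypothetical helper name, and it compares V9 as a string because -eq accepts integers only, so a value like 3.3 would make that test fail with an error.

```shell
#!/usr/bin/env bash
# Sketch only: print_matching is a hypothetical helper name.
# Usage: print_matching FILE VALUE
print_matching() {
    local input=$1 target=$2
    local V1 V2 V3 V4 V5 V6 V7 V8 V9 rest
    # IFS=, is set for read only, so the global IFS is left untouched;
    # -r keeps backslashes literal; "rest" collects the remaining columns.
    while IFS=, read -r V1 V2 V3 V4 V5 V6 V7 V8 V9 rest; do
        # Quoted variable, spaces inside the brackets, and a string
        # comparison (=) because the values are decimals, not integers.
        if [ "$V9" = "$target" ]; then
            printf '%s,%s,%s,%s\n' "$V1" "$V2" "$V8" "$V9"
        fi
    done < "$input" | sort -t',' -k2,2   # sort once, after the loop
}

# Example call (3.3 is an assumed value; the sample has no V9 equal to 3):
# print_matching ./phpOJxGL9.csv 3.3
```

The sort sits after done so that it sees all matching rows at once; running cut and sort inside the loop body would instead make them consume the rest of the input file.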