split file lines based on first field

Question

I have a file with the content shown under Input below, and I want to convert it to the form shown under Required output.

Input
```
1,a,b,c
2,b,c
3,e,f
4,l
```
Required output
```
1,a
1,b
1,c
2,b
2,c
3,e
3,f
4,l
```

The values in the first field are unique: no two input lines share the same first field.

I am new to scripting and not sure how to do this.

Similar (but not exactly the same): awk command to delimit the second column (Kusalananda, commented Jul 25, 2019 at 13:40)
Should that last 4,1 (four-one) be 4,l (four-ell) like in the input file? (ilkkachu, commented Jul 25, 2019 at 14:43)

pLumo · Accepted Answer · 2019-07-25 13:32:08Z

You can use awk and loop through the fields starting with 2:

awk -F, '{ OFS=FS; for (i=2;i<=NF;i++) print $1,$i }' file

Output:

1,a
1,b
1,c
2,b
2,c
3,e
3,f
4,l

Don't keep setting OFS for every input line, just do it once when you set FS: awk 'BEGIN{FS=OFS=","} { for (i=2;i<=NF;i++) print $1,$i }' file (Ed Morton, commented Jul 25, 2019 at 17:08)
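To check the BEGIN-block variant from the comment above against the question's sample data, here is a minimal sketch that wraps it in a shell function and pipes the four input lines into it. The function name split_by_first_field is made up for this example, and the data is fed on stdin rather than read from a file.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example): prints one
# "first,field" line for each field after the first.
# FS and OFS are set once in BEGIN instead of on every input line.
split_by_first_field() {
    awk 'BEGIN { FS = OFS = "," } { for (i = 2; i <= NF; i++) print $1, $i }' "$@"
}

# Sample input from the question, piped in on stdin.
printf '%s\n' '1,a,b,c' '2,b,c' '3,e,f' '4,l' | split_by_first_field
```

Note that a line holding only a first field (no comma) produces no output with this loop, because it starts at field 2.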

Stéphane Chazelas · Accepted Answer · 2019-07-25 16:02:13Z

With sed, you would do

sed -E 's/([^,]*,)([^,]*),/\1\2\n\1/;P;D' file

Note that using \n in the replacement string works only in GNU sed. On other systems you would need to use an actual newline, preceded by a backslash:

sed -E 's/([^,]*,)([^,]*),/\1\2\
\1/;P;D' file

-E means extended regular expressions, so I can use () instead of \(\). This is just for readability.
[^,]* matches a string without a comma, so it matches one field.
Thus, [^,]*,[^,]*, matches the first two fields. I put () around the fields so I can reuse them as \1 and \2 in the replacement.
The s command replaces the first two fields with themselves, adds a newline and repeats the first field on the new line. So the line is split in two: 1,a,b,c becomes one line with 1,a and another with 1,b,c.
Now P prints the first line in the buffer (we know that it is already fine for printing).
D deletes the first line from the buffer and, if anything is left in the buffer afterwards, starts the script over without reading new input. So the remaining 1,b,c will again get split into the 1,b and 1,c lines.
If only one x,y pair is left, the pattern no longer matches, so no newline gets inserted; sed does not cycle again on this buffer and continues with the next input line.
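The split, print, delete cycle described above can be imitated in plain POSIX shell, which may make the steps easier to follow. This is only an illustrative sketch, not what sed does internally; the function name split_line and the variables buf, key and rest are invented for the example.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example): mimics the
# s///;P;D cycle on a single line held in "buf".
split_line() {
    buf=$1
    while :; do
        key=${buf%%,*}      # first field, e.g. "1"
        rest=${buf#*,}      # everything after the first comma, e.g. "a,b,c"
        case $rest in
            *,*) ;;         # a third field still exists: keep splitting
            *) break ;;     # only "x,y" (or less) left: stop
        esac
        printf '%s,%s\n' "$key" "${rest%%,*}"   # like P: print the first pair
        buf="$key,${rest#*,}"                   # like D: keep the remainder, e.g. "1,b,c"
    done
    printf '%s\n' "$buf"    # the last pair is printed unchanged
}

split_line '1,a,b,c'
```

Each pass through the loop corresponds to one restart of the sed script on the same buffer.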

aborruso · Accepted Answer · 2019-07-25 16:09:00Z

Using Miller (https://github.com/johnkerl/miller)

mlr --c2n --ofs "," --implicit-csv-header then reshape -r "[^1]" -o item,value then cut -x -f item input.txt

The output will be:

1,a
1,b
1,c
2,b
2,c
3,e
3,f
4,l

Rakesh Sharma · Accepted Answer · 2019-07-26 03:34:38Z

You can accomplish this task in various ways, as shown below:

$ sed -e '
    :a
    s/,/\n/2;/\n/!b
    P;s/,.*\n/,/;ba
' file.csv

Explanation:

We try to change the second comma to a newline. If that is not possible, the pattern space has fewer than two commas and is simply handed over to stdout.

Otherwise, we print the leading two comma-separated fields, then remove the second field so that the third now becomes the second, and so on.

$ perl -F, -lane ' my $f1 = shift @F; print join ",", $f1, $_ for @F; ' file.csv

Explanation:

Each line is split into fields on commas, and perl stores the fields in the array @F. We shift the first field off @F and store it in the scalar $f1. Then we print $f1 joined with each remaining element of the array in turn.
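For readers without perl at hand, the same shift-then-loop idea can be sketched in POSIX shell, using the positional parameters in place of @F. This is a rough equivalent under that assumption, not a translation of the perl internals; the function name split_fields and its variable names are invented for the example.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example): reads lines on
# stdin, splits each on commas, shifts off the first field and prints
# it in front of every remaining field.
split_fields() {
    while IFS= read -r line; do
        old_ifs=$IFS
        IFS=,
        set -f              # no filename globbing while splitting
        set -- $line        # fields become $1, $2, ... (like @F)
        set +f
        IFS=$old_ifs
        [ "$#" -gt 0 ] || continue   # skip empty lines
        first=$1
        shift               # like: my $f1 = shift @F
        for value in "$@"; do
            printf '%s,%s\n' "$first" "$value"
        done
    done
}

printf '%s\n' '1,a,b,c' '4,l' | split_fields
```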

$ perl -F\(,\) -lane ' my $f1 = shift @F; print $f1, splice @F, 0, 2 while @F > 1 ; ' file.csv

$ perl -F, -lane 'print $F[0], $_ for /,(?:(?!,).)*/g' file.csv

$ sed -Ee 's/,?[^,]*/[&] /g' file.csv | dc -e " [q]sq [SMlN1+sNz1<a]sa [dnLMn10anlN1-dsN0<b]sb [?z0=q0sNlaxlbxclcx]sclcx "

Result:

1,a
1,b
1,c
2,b
2,c
3,e
3,f
4,l

Praveen Kumar BS · Accepted Answer · 2019-07-28 14:31:49Z

I tried the command below and it worked fine:

count_line=`awk '{print NR}' p.txt | sort -nr | sed -n '1p'`
for ((i=1; i<=$count_line; i++)); do
  j=`awk -v i="$i" -F "," 'NR==i{print $1}' p.txt`
  k=`awk -v i="$i" -F "," 'NR==i{print NF}' p.txt`
  for ((z=2; z<=$k; z++)); do
    awk -v i="$i" -v j="$j" -v z="$z" -F "," 'NR==i{print j","$z}' p.txt
  done
done

Output:

1,a
1,b
1,c
2,b
2,c
3,e
3,f
4,l
