The reason your command outputs nothing is that your printf statement outputs
:MixerGrinder:35000\$
... which matches nothing in the file, instead of
:MixerGrinder:Electronics:10:35000$
... which matches the lines that you wanted to match.
It does that because you escaped the $ in the single-quoted printf format string, and because you used 2,5 rather than 2-5 in the field list given to cut: 2,5 selects only fields 2 and 5, while 2-5 selects fields 2 through 5. The $ does not need escaping, since the string is single-quoted, and it would not need it otherwise either, as it does not introduce a shell expansion in this position.
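The two points above can be checked on a single record. This is a minimal sketch that uses one of the records from the question; the variable names are only illustrative.

```shell
# One record taken from the question's expected output
line='808:MixerGrinder:Electronics:10:35000'

# 2,5 is a list: fields 2 and 5 only
two_and_five=$(printf '%s\n' "$line" | cut -d : -f 2,5)

# 2-5 is a range: fields 2 through 5
two_to_five=$(printf '%s\n' "$line" | cut -d : -f 2-5)

printf '%s\n' "$two_and_five"   # MixerGrinder:35000
printf '%s\n' "$two_to_five"    # MixerGrinder:Electronics:10:35000

# \$ is not an escape sequence that printf knows, so with bash's printf the
# backslash ends up in the output, as described above
escaped=$(printf ':%s\$' "$two_and_five")
printf '%s\n' "$escaped"

# Without the backslash, the $ is printed as-is and can act as a regex anchor
plain=$(printf ':%s$' "$two_to_five")
printf '%s\n' "$plain"          # :MixerGrinder:Electronics:10:35000$
```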
You also wrote < (...) instead of <(...). There must be no space between < and (...); the space breaks the process substitution and should have caused a syntax error (bash: syntax error near unexpected token `(').
An alternative approach, similar to yours, but that allows for spaces in the fields:
$ grep -f <(cut -d : -f 2-5 catalog.txt | sort | uniq -d | sed 's/.*/:&$/') catalog.txt
808:MixerGrinder:Electronics:10:35000
809:MixerGrinder:Electronics:10:35000
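To see what grep actually receives as its pattern list, the inner pipeline can be run on its own. The sketch below is only an illustration: the sample file is hypothetical (only records 808 and 809 come from the question), and the patterns are written to a temporary file instead of a process substitution so that each step can be inspected.

```shell
# Hypothetical sample data; only records 808 and 809 come from the question
tmpdir=$(mktemp -d)
cat >"$tmpdir/catalog.txt" <<'EOF'
807:Toaster:Electronics:5:2000
808:MixerGrinder:Electronics:10:35000
809:MixerGrinder:Electronics:10:35000
810:Desk Lamp:Furniture:3:1500
EOF

# Fields 2-5 of every record that occurs more than once, wrapped as :...$
# (uniq -d prints one copy of each repeated line; sed adds the : and the $)
patterns=$(cut -d : -f 2-5 "$tmpdir/catalog.txt" | sort | uniq -d | sed 's/.*/:&$/')
printf '%s\n' "$patterns"

# Same match as the one-liner, with the patterns stored in a regular file
printf '%s\n' "$patterns" >"$tmpdir/patterns"
matches=$(grep -f "$tmpdir/patterns" "$tmpdir/catalog.txt")
printf '%s\n' "$matches"

rm -r "$tmpdir"
```

Since grep treats each pattern as a regular expression, field values that contain characters such as . or * would match more loosely than a literal comparison.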
Alternatively, use Miller (mlr) to add a temporary count field to each record, holding the number of records that share the same values in fields 2, 3, 4, and 5. Then filter the resulting data, keeping only the records whose count is greater than 1, and remove the count field.
$ mlr --nidx --fs colon count-similar -g 2,3,4,5 then filter '$count>1' then cut -x -f count catalog.txt
808:MixerGrinder:Electronics:10:35000
809:MixerGrinder:Electronics:10:35000
This reads the data as an integer-indexed input file (--nidx) that uses colons as field separators (--fs colon). If the data is actually a header-less CSV file, use --csv and -N in place of --nidx. This allows Miller to understand quoted CSV fields.
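Where Miller is not available, the same count-then-filter idea can be sketched with awk reading the file twice. This is not what Miller does internally, only a rough equivalent; the sample file is hypothetical apart from records 808 and 809, and it does not handle quoted CSV fields.

```shell
# Hypothetical sample data; only records 808 and 809 come from the question
tmpdir=$(mktemp -d)
cat >"$tmpdir/catalog.txt" <<'EOF'
807:Toaster:Electronics:5:2000
808:MixerGrinder:Electronics:10:35000
809:MixerGrinder:Electronics:10:35000
810:Desk Lamp:Furniture:3:1500
EOF

# Pass 1 (NR == FNR): count the records for each key built from fields 2-5.
# Pass 2: print every record whose key was counted more than once.
dupes=$(awk -F : '
    { key = $2 FS $3 FS $4 FS $5 }
    NR == FNR { count[key]++; next }
    count[key] > 1
' "$tmpdir/catalog.txt" "$tmpdir/catalog.txt")
printf '%s\n' "$dupes"

rm -r "$tmpdir"
```

Unlike the sort-based pipeline, this keeps the records in their original order and compares the fields literally rather than as regular expressions.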
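If the ID in the first field is always three characters wide and duplicate records are adjacent in the file, GNU uniq alone may be enough:
uniq -D -s3 catalog.txt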