3

I have a text file containing 1000 lines in this format:

001122 abc def ghi
334455 xyz aaa bbb
667788 ccc ccc ddd

How can I convert it into this format using a Linux command by adding spaces to certain columns?

00 11 22 abc def ghi
33 44 55 xyz aaa bbb
66 77 88 ccc ccc ddd
5
  • 2
    Is the "certain column" always the first column, or are you looking for a solution that lets you specify which column to space out by its number?
    – Ed Morton
    Commented Apr 18, 2020 at 14:53
  • @EdMorton I've assumed the latter in my answer, otherwise it would be a trivial question with dozens of duplicates already on the SE network.
    Commented Apr 18, 2020 at 17:11
  • You may want to look at unix.stackexchange.com/help/someone-answers, and also take the tour: unix.stackexchange.com/tour (you'll receive a badge when you've done so).
    – Kusalananda
    Commented Apr 18, 2020 at 20:26
  • Are work orders accepted on this site? Isn't any demonstrated effort required? "Don't just dump the problem statement"
    Commented Apr 19, 2020 at 15:28
  • From Meta: "... Show the research that you’ve already done. As you already saw in How do I ask a good question?, tell us what you’ve already found and why it didn’t meet your needs)"
    Commented Apr 19, 2020 at 15:46

6 Answers

5

Naive but straightforward:

$ sed 's/\(..\)\(..\)\(..\)/\1 \2 \3/' file
00 11 22 abc def ghi
33 44 55 xyz aaa bbb
66 77 88 ccc ccc ddd

That is, capture the first three pairs of characters on each line, and write them back with spaces inserted between them in the replacement string.

Fancy but requires thinking:

$ sed 's/../ &/3; s/../ &/2' file
00 11 22 abc def ghi
33 44 55 xyz aaa bbb
66 77 88 ccc ccc ddd

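The first expression replaces the 3rd match of .. on each line with a space followed by whatever those .. matched. The second expression then does the same for the 2nd match. Working from right to left means the first insertion does not shift the position of the match that the second expression targets.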
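To see why the order of the two expressions matters, here is a small sketch that runs the steps separately on one sample line. The variable names and the reversed variant are assumptions added for illustration; they are not part of the original answer.

```shell
# One sample line in the question's format.
line='001122 abc def ghi'

# Step 1 only: put a space before the 3rd two-character match ("22").
step1=$(printf '%s\n' "$line" | sed 's/../ &/3')

# Both steps in the answer's order: 3rd match first, then 2nd match.
both=$(printf '%s\n' "$line" | sed 's/../ &/3; s/../ &/2')

# Reversed order: the first insertion shifts the text, so the "3rd match"
# is no longer the third pair of digits.
reversed=$(printf '%s\n' "$line" | sed 's/../ &/2; s/../ &/3')

printf '%s\n' "$step1" "$both" "$reversed"
```

The first two lines printed should be `0011 22 abc def ghi` and `00 11 22 abc def ghi`, while the reversed order gives `00 1 122 abc def ghi`.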

    4

    This works with any awk in any shell on every UNIX box, lets you specify which column to change, and is independent of the characters in that column:

    $ awk -v c=1 '{gsub(/../,"& ",$c); sub(/ $/,"",$c)}1' file
    00 11 22 abc def ghi
    33 44 55 xyz aaa bbb
    66 77 88 ccc ccc ddd

    $ awk -v c=2 '{gsub(/../,"& ",$c); sub(/ $/,"",$c)}1' file
    001122 ab c def ghi
    334455 xy z aaa bbb
    667788 cc c ccc ddd

    $ awk -v c=3 '{gsub(/../,"& ",$c); sub(/ $/,"",$c)}1' file
    001122 abc de f ghi
    334455 xyz aa a bbb
    667788 ccc cc c ddd
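As a minimal sketch, the one-liner can be wrapped in a small shell function that reads from standard input. The name `space_pairs` and the second sample line are assumptions made for illustration only. The second call shows an odd-length field, where the leftover character ends up as its own group.

```shell
# space_pairs is a hypothetical helper name; $1 is the column number,
# and the data is read from standard input.
space_pairs() {
    awk -v c="$1" '{ gsub(/../, "& ", $c); sub(/ $/, "", $c) } 1'
}

# Column 1 has an even length, so it splits into clean pairs.
col1=$(printf '001122 abc def ghi\n' | space_pairs 1)

# Column 2 has five characters here (made-up input): two pairs plus one leftover.
col2=$(printf '001122 abcde def ghi\n' | space_pairs 2)

printf '%s\n' "$col1" "$col2"
```

Because the field is modified, awk rebuilds the line with single spaces between fields, so tabs or repeated spaces in the input are not preserved.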
      4

      A simple sed command is all that is needed (replace filename with the actual file):

      sed -E 's|([0-9]{2})([0-9]{2})([0-9]{2})[[:blank:]]*(.*)|\1 \2 \3 \4|g' filename 

      If you want to change the source file (filename) in place, pass in the -i option:

      sed -i -E 's|([0-9]{2})([0-9]{2})([0-9]{2})[[:blank:]]*(.*)|\1 \2 \3 \4|g' filename 

      Explanation:

      ([0-9]{2}), used three times, captures three consecutive groups of 2 digits

      (.*) captures everything else on the line, which is all the letters

      [[:blank:]]* matches any spaces or tabs between the digits and the letters

      \1 through \4 refer to the captured groups

      Note that this will only work with GNU sed. Almost all mainstream Linux distributions come with GNU sed. If you are using macOS, your sed is BSD sed, unless you have installed GNU sed, which is available as gsed.
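If GNU sed is not available, a roughly equivalent substitution can be written in POSIX basic regular expression syntax, which should not need -E. This is a sketch under that assumption, not the answer's own command: it anchors the match at the start of the line and leaves the rest of the line untouched instead of capturing it with (.*). The sample line with a tab is made up to mimic stray whitespace after the digits.

```shell
# Made-up input: six digits, then a tab and two spaces before the text.
input=$(printf '001122\t  abc def ghi')

# BRE form: \( \) are capture groups and \{2\} is the repeat count.
# [[:blank:]]* swallows the stray whitespace, and one space is written back.
result=$(printf '%s\n' "$input" |
    sed 's/^\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{2\}\)[[:blank:]]*/\1 \2 \3 /')

printf '%s\n' "$result"
```

For this input the result should be `00 11 22 abc def ghi`.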

      10
      • Although technically correct, why bother using vertical bar when the more conventional forward slash would suffice?
        – DannyNiu
        CommentedApr 18, 2020 at 12:48
      • 1
        I find it easier to use |. Although it does not make any difference in this instance, when you have strings like https://, you won't have to bother escaping https:\/\/ if you use |.
        – GMaster
        CommentedApr 18, 2020 at 12:50
      • Thank you for the quick reply. The spacing between the digits is correct, but 7 extra spaces are added before the text; when I backspace once it becomes correct. Any help? ``` 00 11 22 abc def ghi 33 44 55 xyz aaa bbb 66 77 88 ccc ccc ddd ```
        – Rozer
        CommentedApr 18, 2020 at 13:21
      • @Rozer It looks like there are some tabs or spaces that are not visible in the input text you provided. Anyway, I have updated my answer. Try it and see if it helps.
        – GMaster
        CommentedApr 18, 2020 at 14:01
      • Thank you @GMaster problem solved
        – Rozer
        CommentedApr 18, 2020 at 14:06
      3

      A generic version in awk for any number and position of spaces:

      awk -v s='2,4' '{f=!split(s,a,",");for(i in a){r="^.{"a[i]+f++"}";gsub(r,"& ")}}1'
      00 11 22 abc def ghi
      ⋮
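POSIX does not specify the order in which `for (i in a)` visits array elements, and some older awk implementations do not support the `.{n}` interval syntax. A hedged alternative sketch uses a counted loop and substr instead. The function name `insert_spaces` is hypothetical, and this is a different technique from the one-liner in this answer, not a rewrite of it.

```shell
# insert_spaces is a hypothetical helper; $1 is a comma-separated list of
# character positions (in ascending order) after which a space is inserted.
insert_spaces() {
    awk -v s="$1" '{
        n = split(s, a, ",")
        for (i = 1; i <= n; i++) {
            # Each earlier insertion shifts later positions right by one.
            pos = a[i] + (i - 1)
            $0 = substr($0, 1, pos) " " substr($0, pos + 1)
        }
    } 1'
}

spaced=$(printf '001122 abc def ghi\n' | insert_spaces 2,4)
printf '%s\n' "$spaced"
```

With positions 2,4 the sample line should come out as `00 11 22 abc def ghi`.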

      A more powerful version, where characters other than a space can be inserted:

      spacers(){ awk -v s="$1" '{f=!split(s,a,/[^*0-9]*/);split(s,p,/[*0-9]*/); for(i in a){if(""==b=a[i])continue; r="^.{"(b!="*"?b+f++:length($0))"}"; gsub(r,"&"p[i+1])}} 1' $2;} 

      That way, you can do e.g.:

      spacers '0|2 4 6|[email protected] |* |' file
      |00 11 22| [email protected] | def ghi |

      which is great for creating org-mode tables and piping directly to clipboard.

      Note: The shell-function also accepts data through STDIN.

      (Earlier versions of this answer contained a generic awk solution that used sed for the final replacement.)

        2

        Being completely lazy about typing here,

        sed -E "s/([0-9]{2})/\1 /g; s/ +/ /g" file1 

        This puts a space after every pair of digits and then squeezes each run of spaces into a single space.

        Or, perhaps even lazier

        sed 's/./& /4;s/./& /2' file1 
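One caveat with the first command in this answer is that it acts on every pair of digits on the line, not only those in the first column. The sketch below illustrates this; the function name `lazy` and the second input line, which contains digits in a later field, are assumptions and do not come from the question.

```shell
# lazy is a hypothetical wrapper around the first command, reading stdin.
lazy() { sed -E 's/([0-9]{2})/\1 /g; s/ +/ /g'; }

# Input in the question's format: only the first column contains digits.
ok=$(printf '001122 abc def ghi\n' | lazy)

# Made-up input with digits in the third column: they get split as well.
side_effect=$(printf '001122 abc 2020 ghi\n' | lazy)

printf '%s\n' "$ok" "$side_effect"
```

The first line should print as `00 11 22 abc def ghi`, and the second as `00 11 22 abc 20 20 ghi`.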
          2

          If the input data is exactly as depicted, GNU cut is an option. Note that the --output-delimiter has to be explicitly set to a space. This makes for a very rigid solution unlike some of the other answers, lacking both the flexibility to deal with variable string length in the first field and the ability to designate an arbitrary field to operate on.

          cut -c1-2,3-4,5- --output-delimiter=' ' <file
          00 11 22 abc def ghi
          33 44 55 xyz aaa bbb
          66 77 88 ccc ccc ddd
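The rigidity mentioned above can be seen with a first field that is longer than six characters. This sketch assumes GNU cut, since --output-delimiter is not part of POSIX cut; the function name `split_fixed` and the eight-digit sample line are made up for illustration.

```shell
# split_fixed is a hypothetical wrapper; it assumes GNU cut.
split_fixed() { cut -c1-2,3-4,5- --output-delimiter=' '; }

# Six-digit first field, as in the question.
six=$(printf '001122 abc def ghi\n' | split_fixed)

# Eight-digit first field (made-up input): only the first two pairs are split off.
eight=$(printf '00112233 abc def ghi\n' | split_fixed)

printf '%s\n' "$six" "$eight"
```

The character ranges are fixed, so the eight-digit field should come out as `00 11 2233`, with the remaining digits left together.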
