Using regex with IFS variable to split Bash string

Question

I am trying to split a string into an array by any character that is not alphanumeric. Can I assign a regex pattern to the IFS variable to accomplish this?

I have tried it like so:

input="$1"
IFS="[^a-zA-Z]" read -ra name_parts <<< "$input"

But this splits the string by any "a" or "A" - not even recognizing the "^". This question looks similar by title, but does not appear to be about the question I'm asking.

I should add that I am actually only concerned about alphabetic characters, so "alphanumeric" was inaccurate. I don't need to catch [0-9]. - omeganumeric, Commented Jun 29, 2020 at 5:38
What is your expected output and your input string? Clearly this is an XY problem. - Inian, Commented Jun 29, 2020 at 5:48
This is an exercise for generating acronyms from names that may contain spaces, dashes, underscores, or some shell globbing characters. I am able to pass my particular set of tests with IFS=" |-|_|*". I understand the XY problem, but I wanted to understand the limits of using IFS, and to think about how I might solve it for an unlimited variety of possible delimiters. I read about IFS, but was unable to find specifics about that limitation. Thanks for your answer. - omeganumeric, Commented Jun 29, 2020 at 5:55

Inian · Accepted Answer · 2020-06-29 05:45:28Z

IFS cannot be used that way. It does not take a regular expression. The shell treats each character in IFS as a literal delimiter when it splits words after expansion, and read uses the same characters to split a line into fields. E.g.

IFS=: read -r v1 v2 <<<"foo:bar"

What you have defined in IFS="[^a-zA-Z]" is taken literally, i.e. each of [, ^, a, -, z, A, Z and ] is used as a separator to split your input string, which is clearly not what you intended.
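To see the literal treatment in action, here is a minimal sketch. The input string "banana split" is made up for illustration and is not from the question.

```shell
#!/usr/bin/env bash
# Illustrative input (an assumption, not from the question).
input="banana split"

# Every character in IFS is a literal delimiter here: [ ^ a - z A Z ]
IFS="[^a-zA-Z]" read -ra name_parts <<< "$input"

# Print one field per line, bracketed so that leading spaces stay visible.
printf '<%s>\n' "${name_parts[@]}"
```

The string is cut at every "a", giving the four fields "b", "n", "n" and " split". The space is kept as part of the last field because it is not one of the IFS characters.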

ilkkachu · Accepted Answer · 2020-06-29 09:17:41Z

IFS is just a bunch of characters (or bytes), not a regex. But you could use e.g. awk or sed to split the string based on a regex, print it out with a simpler separator and then read it with the shell's read.

read -ra name_parts < <(awk -vFS='[^a-zA-Z]' -vOFS=' ' '{$1=$1; print}' <<< "$input")

or

read -ra name_parts < <(sed -e 's/[^a-zA-Z]/ /g' <<< "$input")
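As a quick check of the sed variant, the sketch below runs it on a made-up input (the string is an assumption chosen to include a space, a dash and an underscore). It relies on Bash process substitution and on the default IFS being in effect for read.

```shell
#!/usr/bin/env bash
# Illustrative input (an assumption): space, dash and underscore as delimiters.
input="Complementary metal-oxide_semiconductor"

# Replace every non-alphabetic character with a space, then let read
# split the result on the default IFS (space, tab, newline).
read -ra name_parts < <(sed -e 's/[^a-zA-Z]/ /g' <<< "$input")

# Show the resulting array, one element per line.
printf '%s\n' "${name_parts[@]}"
```

Because read splits on runs of whitespace, several consecutive delimiters in the input do not produce empty array elements.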

Rakesh Sharma · Accepted Answer · 2020-06-29 07:23:43Z

Instead of tinkering with IFS, you're better off mapping the input string and then splitting it using the default IFS:

read -ra name_parts <<<"$(printf '%s\n' "$input" | LC_ALL=C tr -cs 'a-zA-Z\n' '[ *]')"

Now the array name_parts will hold the string sliced at the non-alphabetic positions.
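Since the asker mentions in the comments that the goal is to build acronyms, here is a hedged sketch of how the array could be used for that. The input string and the acronym variable are assumptions for illustration, and the ${var^^} uppercase expansion needs Bash 4 or later.

```shell
#!/usr/bin/env bash
# Illustrative input (an assumption, not from the question).
input="Portable Network-Graphics"

# Map every run of non-alphabetic characters to a single space, then split.
read -ra name_parts <<< "$(printf '%s\n' "$input" | LC_ALL=C tr -cs 'a-zA-Z\n' '[ *]')"

# Take the first letter of each part and uppercase it (Bash 4+).
acronym=""
for part in "${name_parts[@]}"; do
  first=${part:0:1}
  acronym+=${first^^}
done

printf '%s\n' "$acronym"
```

For this input the parts are "Portable", "Network" and "Graphics", so the script prints PNG.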
