As a Linux system administrator or developer, you'll often find the need to manipulate strings and text in your Bash scripts. One common task is splitting a string into an array – that is, breaking up a string into multiple parts and storing each part in a separate array element.
Bash doesn't have a built-in split function, but Linux provides other tools that make splitting strings easy. In this comprehensive guide, we'll explore the various methods of splitting strings into arrays in Bash, including:
- Using the Internal Field Separator (IFS)
- Using the read command with -a
- Using the tr command
- Using the cut command
- Using regex with Bash capabilities
We'll look at examples of each method and explain the syntax, so you can easily split strings for your scripts.
Using the IFS Variable as a Delimiter
The easiest way to split a string into an array in Bash is to use the Internal Field Separator (IFS) variable as a delimiter.
IFS is a shell variable that Bash uses to separate words when it splits input into fields. By default, it contains the space, tab, and newline characters.
Here's an example:
string="LinuxHint tutorials are awesome"
IFS=' ' read -ra array <<< "$string"
for element in "${array[@]}"
do
echo "$element"
done
This splits $string on spaces, stores each word in the array variable, then prints each element.
Here's what it does:
- IFS=' ' – Sets IFS to contain only a space, for this read command only
- read -ra array – Uses read to split its input into the array named array (-r keeps backslashes literal)
- <<< "$string" – Passes $string to read via stdin
- The for loop goes through ${array[@]} and prints each element
The output would be:
LinuxHint
tutorials
are
awesome
By changing the IFS delimiter, you can split on other characters too:
string="LinuxHint,tutorials,are,awesome"
IFS=',' read -ra array <<< "$string"
Now it splits on commas instead of spaces.
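To check what actually ended up in the array, it can help to print the element count and each index. The following is a minimal sketch that reuses the comma example above; the loop over indices is only for inspection.

```bash
#!/usr/bin/env bash
# Sketch: split on commas, then inspect the resulting array.
string="LinuxHint,tutorials,are,awesome"
IFS=',' read -ra array <<< "$string"

# ${#array[@]} is the number of elements; ${!array[@]} expands to the indices.
echo "Number of elements: ${#array[@]}"
for i in "${!array[@]}"; do
  printf '%s: %s\n' "$i" "${array[$i]}"
done
```

Printing the indices this way makes it easy to spot an unexpected empty or missing field.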
Using IFS to split strings is very straightforward, but every character in IFS is treated as a separate single-character delimiter, so it cannot split on a multi-character delimiter such as "::". Next we'll see how to use read itself to split strings.
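For a multi-character delimiter, one possible workaround is to replace the delimiter with a newline using parameter expansion and then read the lines with mapfile. This is a sketch under two assumptions: Bash 4 or later (for mapfile), and fields that never contain a newline. The variable names delim and nl are illustrative only.

```bash
#!/usr/bin/env bash
# Sketch: split on the two-character delimiter "::" (requires Bash 4+ for mapfile).
string="LinuxHint::tutorials::are::awesome"
delim="::"
nl=$'\n'

# Replace every occurrence of the delimiter with a newline,
# then let mapfile store each resulting line as one array element.
mapfile -t array <<< "${string//"$delim"/$nl}"

printf '%s\n' "${array[@]}"
```

Because the delimiter is matched as a whole string, a single colon inside a field would be left alone.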
Using read -a to Split into Arrays
The read command in Bash allows splitting strings into arrays with the -a option. -a tells read to store the tokenized values into an array variable.
Here is an example:
string="LinuxHint tutorials are awesome"
read -ra array <<< "$string"
for element in "${array[@]}"
do
echo "$element"
done
This splits the string on spaces by default and saves the parts to the array variable.
We don't explicitly set an IFS delimiter here. read uses the default space/tab/newline IFS.
The output is:
LinuxHint
tutorials
are
awesome
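One detail worth knowing about the default IFS is that runs of whitespace count as a single separator, while a non-whitespace delimiter such as a comma produces an empty element for each adjacent pair. The sketch below illustrates the difference; the array names words and fields are only examples.

```bash
#!/usr/bin/env bash
# Default IFS: several spaces in a row act as one separator.
read -ra words <<< "one   two"
echo "words: ${#words[@]}"

# Non-whitespace delimiter: two commas in a row enclose an empty field.
IFS=',' read -ra fields <<< "one,,two"
echo "fields: ${#fields[@]}"
```

Here words ends up with two elements and fields with three, the middle one being empty.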
To split on other characters, provide a custom IFS when calling read -a:
string="LinuxHint:tutorials:are:awesome"
IFS=':' read -ra array <<< "$string"
Now it splits on colons instead of spaces.
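The read -a method is simple to use. Because the IFS assignment is written as a prefix to the read command, it applies only to that one command, so the shell's own IFS is left unchanged for the rest of the script.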
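A quick way to convince yourself that a prefixed IFS assignment does not leak into the rest of the script is to print IFS before and after the read. This is a small sketch; printf '%q' is used only to make the whitespace characters visible, and the names before and after are illustrative.

```bash
#!/usr/bin/env bash
# Sketch: the IFS=':' prefix applies to this one read command only.
string="LinuxHint:tutorials:are:awesome"

before=$(printf '%q' "$IFS")
IFS=':' read -ra array <<< "$string"
after=$(printf '%q' "$IFS")

echo "IFS before: $before"
echo "IFS after:  $after"
echo "Elements:   ${#array[@]}"
```

Both IFS lines should show the same value, while the array still holds the four colon-separated parts.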
Next let's look at using the tr command to split strings.
Using tr Command to Split on Character
The tr translation command in Linux allows you to replace or delete characters from stdin. We can also use it to help split strings by providing an appropriate set of characters.
tr works on character sets, so you need to specify the delimiter character set to translate, like this:
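string="LinuxHint,tutorials,are,awesome"
# Translate each comma to a space, then split on the default whitespace IFS
read -ra array <<< "$(tr ',' ' ' <<< "$string")"
for element in "${array[@]}"
do
echo "$element"
done
This uses tr to translate each comma into a space, and read -a then splits the result on the default whitespace. Note that a field which itself contains a space would be split as well.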
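The key things to note in use of tr are:
- The first argument is the set of characters to translate (here, the delimiter)
- The second argument is the set of replacement characters (here, a space); to delete characters instead, use tr -d with a single set
- tr reads stdin as a stream of characters and has no notion of fields, so the actual splitting is done by read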
This makes tr very convenient for single character delimiters in scripts.
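When fields may contain spaces, one variant is to have tr turn the delimiter into newlines and read the lines with mapfile. The following is a sketch that assumes Bash 4 or later and fields without embedded newlines.

```bash
#!/usr/bin/env bash
# Sketch: tr converts commas to newlines; mapfile reads one element per line.
string="Linux Hint,bash tutorials,are awesome"

# Process substitution feeds tr's output to mapfile; -t drops the trailing newlines.
mapfile -t array < <(tr ',' '\n' <<< "$string")

for element in "${array[@]}"; do
  echo "[$element]"
done
```

Each bracketed line in the output is one element, so the spaces inside "Linux Hint" stay intact.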
Next up we have splitting via the cut command.
Using cut to Split Strings on Delimiter
The cut command in Linux is designed for slicing columns in text data, but can also split strings using delimiter characters.
Here is an example:
text="awesome,tutorials,are,LinuxHint"
IFS=',' read -ra array <<< "$(cut -d, -f1-4 <<< "$text")"
for element in "${array[@]}"
do
echo "$element"
done
This processes the $text string with cut, treating commas as field separators:
- -d, – Delimiter is comma
- -f1-4 – Extract fields 1 to 4
cut selects the requested fields and prints them still joined by commas, then read -a with IFS=',' stores the values into the array.
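The advantage of using cut is that it can pick out just the fields you need, for example -f2 or -f1,3. Note that the -d option accepts only a single character, so cut cannot split on multi-character delimiters.
A key thing to remember is that cut itself processes every line of its input, but read only consumes the first line, so with multi-line input only the first line ends up in the array.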
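Where cut is most useful is in keeping only some of the fields before they go into the array. Here is a minimal sketch that keeps fields 2 and 4; the variable name selected is illustrative.

```bash
#!/usr/bin/env bash
# Sketch: use cut to keep only fields 2 and 4, then split them into an array.
text="awesome,tutorials,are,LinuxHint"

# cut prints the selected fields joined by the same delimiter.
selected=$(cut -d, -f2,4 <<< "$text")
IFS=',' read -ra array <<< "$selected"

printf '%s\n' "${array[@]}"
```

The intermediate variable is optional, but it makes it easy to see what cut produced before read splits it.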
Now let's examine using Bash regex capabilities.
Splitting with Bash Regular Expressions
As an advanced technique, Bash also provides support for regex-based pattern matching and string splitting.
Here is an example:
string="awesome::tutorials::are::LinuxHint"
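# [^:]+ stops at the first colon; a greedy .+ would run on to the last "::"
if [[ "$string" =~ ^([^:]+):: ]]
then
array[0]=${BASH_REMATCH[1]}
fi
# Skip the first field, then capture the second
if [[ "$string" =~ ^[^:]+::([^:]+):: ]]
then
array[1]=${BASH_REMATCH[1]}
fi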
# ...
for element in "${array[@]}"
do
echo "$element"
done
This uses two key components:
- [[ string =~ regex ]] – Match string against regex
- ${BASH_REMATCH[1]} – Access the first regex capture group
The regex matches parts of the string and captures them into groups, and the BASH_REMATCH array then holds the captured parts so they can be stored in the array.
With appropriately crafted regexes, you can split and capture strings very precisely.
The main downside to this method is that writing regexes can get complex compared to other tools. But it is very powerful.
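Rather than writing one regex per field, a loop can peel off one field at a time. The sketch below assumes that fields never contain a colon; the names rest and re are illustrative, and the pattern is kept in a variable to avoid quoting surprises inside [[ ]].

```bash
#!/usr/bin/env bash
# Sketch: repeatedly match "field::remainder" until no delimiter is left.
rest="awesome::tutorials::are::LinuxHint"
re='^([^:]*)::(.*)$'
array=()

while [[ $rest =~ $re ]]; do
  array+=("${BASH_REMATCH[1]}")   # the field before the first "::"
  rest=${BASH_REMATCH[2]}         # everything after it
done
array+=("$rest")                  # the last field has no trailing "::"

printf '%s\n' "${array[@]}"
```

Each pass stores the first capture group and continues with the second, so the loop ends once the remainder no longer contains the delimiter.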
Splitting Strings from File Contents
In many real-world scenarios, the string you need to split comes from a file rather than a variable in a script.
Whether it is line-by-line data or a JSON blob, the same principles apply.
Here is an example reading from a CSV data file:
# File data.csv
# Line1: foo,bar,test
# Line2: cat,dog,mouse
while IFS=',' read -ra array
do
for element in "${array[@]}"
do
echo "$element"
done
echo # newline
done < data.csv
This iterates through the file line by line, splitting each line by comma into arrays.
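The same approach works for other line-oriented sources with a simple single-character delimiter, such as command output or streams. It does not understand quoting, so CSV fields that contain quoted commas need a dedicated parser.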
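The loop can be tried end to end with a throwaway file. The following is a sketch that writes two sample rows to a temporary file created with mktemp; the names csv_file, row, and first_columns are hypothetical, and the extra test after read is one way to handle a last line that lacks a trailing newline.

```bash
#!/usr/bin/env bash
# Sketch: create a small CSV file, then split each line into an array.
csv_file=$(mktemp)
printf 'foo,bar,test\ncat,dog,mouse\n' > "$csv_file"

row=0
first_columns=()
# read returns non-zero at end of file; the second test still processes
# a final line that has data but no trailing newline.
while IFS=',' read -ra fields || [[ ${#fields[@]} -gt 0 ]]; do
  row=$((row + 1))
  echo "Row $row has ${#fields[@]} fields; first is ${fields[0]}"
  first_columns+=("${fields[0]}")
done < "$csv_file"

rm -f "$csv_file"
```

Collecting one column into its own array, as first_columns does here, is a common next step once each line is split.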
Now that we have covered the various methods to split strings into arrays in Bash, let's summarize the key points.
Summary
Splitting strings into arrays is a common task in Bash scripting. Here are some key things to remember:
- Use IFS to split on spaces/tabs if they exist in the string
- read -a also uses IFS but allows custom delimiters
- tr translates individual characters, good for single char delimiters
- cut selects fields by a single-character delimiter and handles columns well
- Regex matches offer advanced splitting capabilities
- File data can be split line-by-line or by patterns
Choosing the right method depends on the data format and types of delimiters used. But this guide has equipped you with several options to handle string splitting in scripts.
Bash doesn't have a built-in split function, but Linux provides other flexible commands. Splitting strings into arrays enables easier processing and scripting workflows.
Let me know in the comments if you have any other questions!