As a Linux system administrator or developer, you'll often need to manipulate strings and text in your Bash scripts. One common task is splitting a string into an array: breaking the string into parts and storing each part in a separate array element.

Bash doesn't have a built-in split function, but the shell's own features and standard Linux tools make splitting strings easy. In this guide, we'll explore several methods of splitting strings into arrays in Bash, including:

  • Using the Internal Field Separator (IFS)
  • Using the read command with -a
  • Using the tr command
  • Using the cut command
  • Using regex with Bash capabilities

We'll look at examples of each method and explain the syntax, so you can easily split strings for your scripts.

Using the IFS Variable as a Delimiter

The easiest way to split a string into an array in Bash is to use the Internal Field Separator (IFS) variable as a delimiter.

IFS is a shell variable that Bash uses to decide where to split words, both after expansions and when read parses a line of input. By default, it contains the space, tab, and newline characters.
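
Because the default IFS characters are invisible, it can help to print them in an escaped form. The following is a minimal sketch, assuming a shell where IFS has not been changed; the variable name default_ifs is only for illustration.

```shell
#!/usr/bin/env bash
# Store IFS in a shell-quoted form so the invisible characters become readable.
# default_ifs is a hypothetical name used only in this sketch.
printf -v default_ifs '%q' "$IFS"

# In an unmodified shell this typically prints: $' \t\n'
echo "$default_ifs"
```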

Here's an example:

string="LinuxHint tutorials are awesome"
IFS=' ' read -ra array <<< "$string"

for element in "${array[@]}"
do
  echo "$element"
done

This splits $string on spaces, stores each word in the array variable, then prints each element.

Here's what it does:

  • IFS=' ' – Sets IFS to a single space, for the duration of this read command only
  • read -ra array – Reads one line and splits it into the array named array (-r keeps backslashes literal, -a stores the fields in an array)
  • <<< "$string" – Passes $string to read on stdin as a here-string
  • The for loop walks through "${array[@]}" and prints each element

The output would be:

LinuxHint 
tutorials
are
awesome
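
Once the string is in an array, the individual parts can be counted and addressed by index. A small sketch using the same sample string (the variable names count, first, and last are illustrative additions):

```shell
#!/usr/bin/env bash
string="LinuxHint tutorials are awesome"
IFS=' ' read -ra array <<< "$string"

count=${#array[@]}       # number of elements
first=${array[0]}        # indexes start at 0
last=${array[count-1]}   # the subscript is evaluated as arithmetic

echo "$count parts, first=$first, last=$last"
```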

By changing the IFS delimiter, you can split on other characters too:

string="LinuxHint,tutorials,are,awesome"
IFS=',' read -ra array <<< "$string"

Now it splits on commas instead of spaces.

Using IFS to split strings is straightforward, but every character in IFS is treated as a separate one-character delimiter, so it cannot split on a multi-character string such as "::". Next we'll look more closely at read and its -a option.
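
Two details of IFS splitting are easy to miss: IFS may list several delimiter characters at once, and a non-whitespace delimiter preserves empty fields. The sketch below illustrates both with made-up sample strings; the names mixed, gappy, parts, and fields are only for this example.

```shell
#!/usr/bin/env bash
# Every character listed in IFS acts as a delimiter on its own.
mixed="alpha,beta;gamma"
IFS=',;' read -ra parts <<< "$mixed"
echo "${#parts[@]}"    # 3

# With a non-whitespace delimiter, two adjacent delimiters produce an empty field.
gappy="alpha,,gamma"
IFS=',' read -ra fields <<< "$gappy"
echo "${#fields[@]}"   # 3, the middle element is an empty string
```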

Using read -a to Split into Arrays

The read command in Bash allows splitting strings into arrays with the -a option.

-a tells read to store the tokenized values into an array variable.

Here is an example:

string="LinuxHint tutorials are awesome"

read -ra array <<< "$string" 

for element in "${array[@]}"
do
  echo "$element" 
done

This splits the string on spaces by default and saves the parts to the array variable.

We don't explicitly set IFS here, so read uses the default of space, tab, and newline.

The output is:

LinuxHint
tutorials 
are
awesome
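
With the default IFS, runs of whitespace count as a single separator and leading or trailing whitespace is dropped. A short sketch with a deliberately messy input (the names padded and words are illustrative):

```shell
#!/usr/bin/env bash
# Leading, trailing, and repeated whitespace, including a tab.
padded=$'  one   two\tthree  '
read -ra words <<< "$padded"

echo "${#words[@]}"             # 3
printf '[%s]\n' "${words[@]}"   # [one] [two] [three], one per line
```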

To split on other characters, provide a custom IFS when calling read -a:

string="LinuxHint:tutorials:are:awesome"
IFS=':' read -ra array <<< "$string"

Now it splits on colons instead of spaces.

The read -a method is simple to use. Because the IFS assignment is written as a prefix to read, it applies only to that one command and leaves IFS unchanged for the rest of the script.
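
The scope of the IFS prefix can be checked directly. In this sketch the variable before is a hypothetical helper that records IFS ahead of the call, so the two values can be compared afterwards.

```shell
#!/usr/bin/env bash
before=$IFS   # remember the current value
string="LinuxHint:tutorials:are:awesome"

# The assignment is a prefix of the read command, so it is temporary.
IFS=':' read -ra array <<< "$string"

if [[ "$IFS" == "$before" ]]; then
  echo "IFS is unchanged after read"
fi
```

A standalone `IFS=':'` line, by contrast, stays in effect until it is reset, which can change how later commands in the script split words.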

Next let's look at using the tr command to split strings.

Using tr Command to Split on Character

The tr command translates or deletes characters read from stdin. It does not split anything by itself, but it can turn each delimiter character into a newline so that Bash can read one array element per line.

tr works on character sets, so you pass the delimiter set first and the replacement set second, like this:

string="LinuxHint,tutorials,are,awesome"
# Translate every comma to a newline, then load one line per element
mapfile -t array < <(tr ',' '\n' <<< "$string")

for element in "${array[@]}"
do
   echo "$element"
done

This converts each comma to a newline, and mapfile -t (available in Bash 4 and later) stores each resulting line as an array element.

The key things to note in use of tr are:

  • The first argument is the set of characters to translate (the delimiters)
  • The second argument is the set of replacement characters (here a newline)
  • tr works character by character on the whole stream, so it cannot match a multi-character delimiter

This makes tr convenient for single character delimiters in scripts.
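
As a sketch of tr feeding an array directly, the snippet below translates commas to newlines and loads the lines with mapfile. It assumes Bash 4 or later for mapfile. The -s option is an addition here: it squeezes repeated delimiters so that consecutive commas do not create empty elements. The sample data is made up.

```shell
#!/usr/bin/env bash
string="LinuxHint,,tutorials,are,,,awesome"

# tr turns each comma into a newline; -s then squeezes runs of newlines into one,
# so consecutive commas do not produce empty elements.
mapfile -t array < <(tr -s ',' '\n' <<< "$string")

echo "${#array[@]}"   # 4
printf '%s\n' "${array[@]}"
```

Leave out -s if empty fields are meaningful in your data and should be kept.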

Next up we have splitting via the cut command.

Using cut to Split Strings on Delimiter

The cut command in Linux is designed for slicing columns in text data, but can also split strings using delimiter characters.

Here is an example:

text="awesome,tutorials,are,LinuxHint"
IFS=',' read -ra array <<< "$(cut -d, -f1-4 <<< "$text")"

for element in "${array[@]}"
do 
  echo "$element"
done 

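Here cut selects fields from the comma-separated $text string:

  • -d, – Delimiter is comma
  • -f1-4 – Extract fields 1 to 4

cut prints the selected fields still joined by commas, and read -a with IFS=',' then splits them into the array.

The advantage of using cut is that it can pick out just the fields you want, for example -f2 or -f2-3, before the array is built. Note that -d accepts only a single character, so cut cannot split on a multi-character delimiter.

A key thing to remember is that cut processes every line of its input, but a single read call consumes only the first line, so multi-line input needs a loop or mapfile.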
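
To see how cut behaves on multi-line input, the following sketch collects one column into an array. It assumes Bash 4 or later for mapfile; the two sample records and the array name second are made up for illustration.

```shell
#!/usr/bin/env bash
# Two comma-separated records held in one variable.
text=$'foo,bar,test\ncat,dog,mouse'

# cut picks field 2 from every line; mapfile stores one line per element.
mapfile -t second < <(cut -d, -f2 <<< "$text")

printf '%s\n' "${second[@]}"
# bar
# dog
```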

Now let's examine using Bash regex capabilities.

Splitting with Bash Regular Expressions

As an advanced technique, Bash supports regex matching with the =~ operator, which can be used to pull the individual pieces out of a string.

Here is an example:

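string="awesome::tutorials::are::LinuxHint"

# [^:]+ matches a run of non-colon characters, so each capture stops at the
# next "::" (a greedy .+ would run on to the last "::" in the string)
if [[ "$string" =~ ^([^:]+):: ]]
then
  array[0]=${BASH_REMATCH[1]}
fi

# Skip the first element and its "::", then capture the second element
if [[ "$string" =~ ^[^:]+::([^:]+):: ]]
then
  array[1]=${BASH_REMATCH[1]}
fi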

# ...

for element in "${array[@]}" 
do
   echo "$element" 
done

This uses two key components:

  1. [[ string =~ regex ]] – Match string against regex
  2. ${BASH_REMATCH[1]} – Access regex capture group

The regex matches part of the string and captures it in a group, and the BASH_REMATCH array then exposes each captured group so it can be stored.

With appropriately crafted regexes, you can split and capture strings very precisely.

The main downside to this method is that writing regexes can get complex compared to other tools. But it is very powerful.
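
Rather than writing one regex per element, the same idea can be put in a loop that peels off one piece at a time. This is a sketch, not the only way to do it: the names rest, parts, and pattern are illustrative, and it assumes the pieces themselves never contain a colon.

```shell
#!/usr/bin/env bash
string="awesome::tutorials::are::LinuxHint"

# Group 1: everything before the first "::". Group 2: everything after it.
pattern='^([^:]*)::(.*)$'

rest=$string
parts=()
while [[ $rest =~ $pattern ]]
do
  parts+=("${BASH_REMATCH[1]}")   # store the leading piece
  rest=${BASH_REMATCH[2]}         # keep splitting the remainder
done
parts+=("$rest")                  # the final piece has no trailing "::"

printf '%s\n' "${parts[@]}"
```

Keeping the regex in a variable and expanding it unquoted avoids quoting problems inside [[ ... ]].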

Splitting Strings from File Contents

In many real-world scripts, the string you need to split comes from a file rather than a variable in the script.

For line-oriented, delimiter-separated data, the same principles apply. Structured formats such as JSON are better handled by a dedicated parser.

Here is an example reading from a CSV data file:

# File data.csv 
# Line1: foo,bar,test
# Line2: cat,dog,mouse

while IFS=',' read -ra array
do
  for element in "${array[@]}"
  do
    echo "$element"
  done 
  echo # newline  
done < data.csv

This iterates through the file line by line, splitting each line by comma into arrays.

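The same approach works for any line-oriented text source, such as a pipe or the output of another command. Note that this simple split does not understand quoted CSV fields that contain commas.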
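
A self-contained variant of the file loop is sketched below. It writes the two sample rows to a temporary file first, so it can be run as-is; the names csv, fields, and first_column are illustrative.

```shell
#!/usr/bin/env bash
# Create a throwaway CSV file with the sample rows (mktemp picks the name).
csv=$(mktemp)
printf 'foo,bar,test\ncat,dog,mouse\n' > "$csv"

first_column=()
while IFS=',' read -ra fields
do
  first_column+=("${fields[0]}")   # keep only the first field of each row
done < "$csv"
rm -f "$csv"

printf '%s\n' "${first_column[@]}"
# foo
# cat
```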

Now that we have covered the various methods to split strings into arrays in Bash, let's summarize the key points.

Summary

Splitting strings into arrays is a common task in Bash scripting. Here are some key things to remember:

  • IFS controls which characters count as delimiters; by default these are space, tab, and newline
  • read -a stores the fields in an array, and an IFS=... prefix sets a custom delimiter for that one command
  • tr translates individual characters, good for turning single char delimiters into newlines
  • cut selects fields by a single-character delimiter and handles columns well
  • Regex matches offer advanced splitting capabilities, including multi-character delimiters
  • File data can be split line-by-line with a while read loop

Choosing the right method depends on the data format and types of delimiters used. But this guide has equipped you with several options to handle string splitting in scripts.

Bash doesn't have a built-in split function, but read, IFS, and the standard text tools cover most cases. Splitting strings into arrays enables easier processing and scripting workflows.

Let me know in the comments if you have any other questions!
