If you write scripts to automate tasks on Linux, properly handling file paths is crucial. A misconfigured path can derail even the most well-crafted Bash script.
Understanding absolute and relative paths is essential background knowledge. But beyond that, you need to know how to implement robust paths in real-world scripts.
This comprehensive 4-part guide takes an in-depth look at Linux file paths from an advanced Bash scripter's perspective. You will learn:
- Key path concepts – absolute, relative, symlinks and more
- Why absolute paths matter for reliability and portability
- How to reference script, file and command locations
- Advanced path handling with variables and commands
- Best practices for bulletproof scripting with paths
Follow these proven techniques to level up your scripting game!
Absolute vs Relative Paths: Key Concepts
Before diving into scripting uses, let's recap some key file path concepts in Linux:
Types of Paths
There are two primary types of paths:
Absolute paths provide the full directory tree from the root (/) to the target file or folder. For example:
/home/john/projects/script.sh
Relative paths specify the location relative to the current working directory (CWD). For example, if CWD is /home/john:
projects/script.sh
Absolute paths always point to the same place, no matter where your shell is. Relative paths depend on CWD, so the target varies with your shell's location.
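To make the CWD dependence concrete, here is a minimal sketch. It builds two throwaway directory trees (hypothetical names, created under a temporary directory) and reads the same relative path, projects/script.sh, from each:

```bash
#!/usr/bin/env bash
# Hypothetical demo: the same relative path names two different files,
# depending on the current working directory.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/a/projects" "$demo_root/b/projects"
echo "from a" > "$demo_root/a/projects/script.sh"
echo "from b" > "$demo_root/b/projects/script.sh"

# Read the relative path projects/script.sh with CWD set to the given directory.
read_relative() {
    ( cd "$1" && cat projects/script.sh )
}

first="$(read_relative "$demo_root/a")"
second="$(read_relative "$demo_root/b")"
echo "CWD=a -> $first"
echo "CWD=b -> $second"

# Clean up the temporary tree.
rm -rf "$demo_root"
```

The relative path never changes, yet the file it names does. The subshell around cd keeps the directory change from leaking into the rest of the script.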
Symbolic Links
The ln -s command creates symbolic links (symlinks): special files that serve as references to other files or folders.
Symlinks allow a file to appear in multiple locations at once. For example:
$ ln -s /usr/bin/python3 /home/john/bin/python
Now python refers to /usr/bin/python3 in the context of /home/john/bin.
Tools like readlink -f can resolve symlinks to find the actual underlying file path. This becomes very relevant when handling paths in scripts (as we'll see later).
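As a rough illustration, the sketch below creates a symlink inside a temporary directory and compares plain readlink with readlink -f. The directory layout is made up for the demo, and readlink -f is assumed to be the GNU coreutils version:

```bash
#!/usr/bin/env bash
# Hypothetical demo in a temporary directory; readlink -f is assumed to be
# the GNU coreutils version.
demo_root="$(mktemp -d)"
trap 'rm -rf "$demo_root"' EXIT   # remove the demo files when the script exits
mkdir -p "$demo_root/real" "$demo_root/bin"
touch "$demo_root/real/python3"

# Create bin/python as a symlink pointing at real/python3.
ln -s "$demo_root/real/python3" "$demo_root/bin/python"

# readlink without -f prints the target stored in the link itself;
# readlink -f follows every link and prints the fully resolved path.
stored_target="$(readlink "$demo_root/bin/python")"
resolved="$(readlink -f "$demo_root/bin/python")"
echo "stored target: $stored_target"
echo "resolved:      $resolved"
```

The two values can differ when any directory along the way is itself a symlink, which is why scripts that compare paths usually want the -f form.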
With those basics covered, let's examine why absolute paths matter so much for serious Bash scripting.
Why Absolute Paths Are Essential for Reliable Scripting
Seasoned Bash scripters almost always recommend using absolute instead of relative paths. Here are three major reasons:
1. Portability: Scripts Break with Changing CWD
The most common argument for absolute paths is portability across environments.
If your script relies on relative paths, it can easily break when run from other directories. The relative path will resolve incorrectly if CWD differs.
Here is a simple example script with a relative path:
# script.sh
INPUT_FILE=data/input.csv
python process.py "$INPUT_FILE"
This works fine if you invoke it as:
/home/me/projects$ ./script.sh
Here data/input.csv correctly resolves to /home/me/projects/data/input.csv.
However, if you run the same script from another folder:
/home/me$ ./projects/script.sh
Now there is no data folder in /home/me, so the script fails!
The relative path resolved incorrectly because our CWD – execution context – changed.
Using absolute paths fixes this because the full location is defined from the root folder /, so it always points to the same spot no matter where the script executes from.
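One way to get an absolute path for the failing example above, without hard-coding a machine-specific prefix, is to build it from the script's own location. This is only a sketch: it assumes the data folder sits next to the script and that readlink -f (GNU coreutils) is available.

```bash
#!/usr/bin/env bash
# Sketch: build INPUT_FILE from the script's own directory, not from CWD.
# Assumes a data/ directory sits beside the script and that
# readlink -f (GNU coreutils) is available.
script_path="$(readlink -f -- "${BASH_SOURCE[0]:-$0}")"
script_dir="$(dirname -- "$script_path")"
INPUT_FILE="$script_dir/data/input.csv"

echo "Input file: $INPUT_FILE"
```

With this in place, running the script from /home/me, from /tmp, or from cron yields the same INPUT_FILE value.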
2. Reliability: Relative Paths Break Easily
In addition to portability issues, relative paths are intrinsically less reliable because they are resolved dynamically.
If a script depends on CWD and intermediate folders for path resolution, there are more opportunities for breakage:
- Intermediate folders renamed/moved
- Permissions problems in ancestor tree
- A cd earlier in the script (or in the caller) changing the starting point
With an absolute path like /xyz/input.csv, resolution always starts at / and follows the same components every time.
But the relative path ../../input.csv starts from whatever CWD happens to be and climbs two levels from there, so the result varies with CWD. So more can go wrong!
One folder rename or an unexpected CWD can tank a widely used relative-path script. An absolute path can still break if a folder along it is renamed, but it does not change meaning when CWD changes.
3. Readability: Explicit vs Implicit Target
Finally, absolute paths explicitly define the full target location in an easy-to-understand way.
Seeing /home/bob/data/input.csv makes it clear exactly which input.csv the script will reference.
Whereas with a relative path like ../../data/input.csv, the actual file location depends on CWD context. This makes code harder to reason about for readers unfamiliar with that environment.
In summary:
- Relative paths depend on CWD, risk unpredictability
- Absolute paths define an exact location, reduce possibility of failure
That's why seasoned Bash scripters recommend absolute paths whenever possible!
Okay, hopefully you're convinced on why absolute paths matter. Next let's explore how to actually implement them in Bash scripts.
Referencing Absolute Paths of Files & Scripts
When writing Bash scripts, you'll often need to programmatically reference:
- The script itself
- Other external scripts or files
- System commands
Here are reliable methods to get the absolute paths for each, starting with the script itself:
1. Get Absolute Path of Bash Script Itself
Say you have a script /home/john/scripts/myscript.sh that needs to reference its own location.
Here is the best practice, using the $BASH_SOURCE variable:
#!/bin/bash
ABS_SCRIPT_PATH="$(readlink -f "${BASH_SOURCE[0]}")"
echo "This script is located at $ABS_SCRIPT_PATH"
Breaking this down:
- ${BASH_SOURCE[0]} gives the path to the script being executed, as it was invoked
- readlink -f resolves any relative bits and symlinks to get the full absolute path
- $() captures the output so it can be assigned to the ABS_SCRIPT_PATH variable
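If you call the script as ./myscript.sh, ${BASH_SOURCE[0]} holds exactly that relative string. It is readlink -f that turns it into the actual full path instead of just "./myscript.sh".
(Pro Tip): Prefer ${BASH_SOURCE[0]} over plain $BASH_SOURCE. Both expand to the same value in Bash, but the indexed form makes it explicit that BASH_SOURCE is an array.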
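On systems where readlink -f is not available, a common fallback is to cd into the script's directory and ask for the physical path. The helper name script_dir_of below is hypothetical, and the sketch has a known limitation: it resolves symlinked directories but not a symlink to the script file itself.

```bash
#!/usr/bin/env bash
# Sketch: resolve the directory of a script without readlink -f.
# script_dir_of is a hypothetical helper name. It follows symlinked
# directories via pwd -P but does not resolve a symlink to the script file.
script_dir_of() {
    local src="$1"
    # cd inside a subshell so the caller's CWD is untouched;
    # the empty CDPATH keeps cd from searching other directories.
    ( CDPATH= cd -- "$(dirname -- "$src")" && pwd -P )
}

SCRIPT_DIR="$(script_dir_of "${BASH_SOURCE[0]:-$0}")"
echo "This script lives in $SCRIPT_DIR"
```

Passing the path as an argument keeps the helper easy to test and lets you reuse it for other files, for example script_dir_of "$1" on a file name given by the user.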
2. Get Absolute Path of Other Scripts & Files
To reference other scripts or files in your code:
DATA_SCRIPT="/home/john/pipelines/transform.sh"
LINKED_SCRIPT="$(readlink -f /some/other/script.sh)"
Hard-code the full static path if you know the exact location and it won't change.
Or use readlink -f to resolve the absolute path if the location may vary or involve symlinks.
This gives you a dependable way to access external files and tools.
3. Get Absolute Path of Commands
You'll often need to invoke external Linux commands like grep, awk, etc. in scripts.
Here is an easy way to get the absolute path of any command:
AWK_PATH=$(which awk)
GREP_PATH=$(readlink -f "$(which grep)")
# awk needs a program before the file name; this one prints the first field
"$AWK_PATH" '{ print $1 }' data.csv
"$GREP_PATH" pattern file.txt
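- which <cmd> gets the path of a command (the shell builtin command -v <cmd> does the same job without relying on an external which binary)
- Wrap it in readlink -f to resolve symlinks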
Now you have the full path to reference commands reliably.
Which style works best depends on whether the tool location may change between systems or involve symlinks.
Combine these techniques to access the absolute path of scripts, tools, and files in your code.
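A lookup can also fail, so it helps to check the result before using it. The sketch below uses the shell builtin command -v and a hypothetical helper named resolve_cmd. Note that for builtins and shell functions, command -v prints just the name rather than a file path.

```bash
#!/usr/bin/env bash
# Sketch: look up a command's path with the shell builtin command -v.
# resolve_cmd is a hypothetical helper name.
resolve_cmd() {
    local found
    # command -v prints the path of an external command and fails if none is found.
    found="$(command -v "$1")" || {
        echo "Error - command not found: $1" >&2
        return 1
    }
    printf '%s\n' "$found"
}

GREP_PATH="$(resolve_cmd grep)" || exit 1
echo "grep resolves to $GREP_PATH"

# Use the resolved path, quoted, exactly like the command name.
printf 'alpha\nbeta\n' | "$GREP_PATH" beta
```

Declaring found on its own line matters: writing local found="$(...)" would report the exit status of local, hiding a failed lookup.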
Advanced Path Handling With Variables & Commands
Beyond basic path access, there are some more advanced Bash capabilities worth covering:
Convenient Path Variables
Bash provides variables that give quick access to certain paths:
- $PWD – Current working directory
- $OLDPWD – Previous working directory
- $HOME – Home folder of current user
For example:
~/projects$ echo "My home folder is $HOME"
My home folder is /home/john
Use these pre-defined variables whenever possible instead of hard-coding values.
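A small sketch of how the shell maintains PWD and OLDPWD as you move around. It changes to / only for the demo and then returns to where it started:

```bash
#!/usr/bin/env bash
# Demo: the shell updates PWD and OLDPWD on every cd.
start_dir="$PWD"

cd /
echo "Now in:    $PWD"
echo "Came from: $OLDPWD"
previous="$OLDPWD"      # remember where we were before the cd

# "cd -" jumps back to OLDPWD; it prints the directory, so silence it.
cd - > /dev/null
echo "Back in:   $PWD"
```

Saving the starting directory and returning to it is a handy habit in scripts that must cd temporarily.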
Portable ~/. Configs
To reference a .bashrc or .ssh config from your script, avoid hard-coded absolute paths to user folders.
Use the $HOME variable instead:
# GOOD
SOURCE_SSH="$HOME/.ssh/special_key"
# BAD
SOURCE_SSH="/home/john/.ssh/special_key"
This helps make scripts portable across systems where the user home folder may be named differently.
Splitting Paths with basename and dirname
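The basename and dirname commands let you easily separate paths:
# Do not name the variable PATH: that would overwrite the shell's command search path
FILE_PATH="/usr/local/bin/python3"
dirname "$FILE_PATH" # /usr/local/bin
basename "$FILE_PATH" # python3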
Use these instead of flaky string parsing or substitutions.
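basename also accepts an optional suffix to strip, which covers the common "file name without extension" case. In this sketch the path is a made-up example and does not need to exist. The variable is deliberately not called PATH, since assigning to PATH changes where the shell looks for commands.

```bash
#!/usr/bin/env bash
# Sketch: split a path into directory, file name, and name without extension.
# file_path is a hypothetical example value; the file need not exist.
file_path="/var/log/myapp/server.log"

dir_part="$(dirname -- "$file_path")"          # /var/log/myapp
file_part="$(basename -- "$file_path")"        # server.log
stem_part="$(basename -- "$file_path" .log)"   # server

echo "dir=$dir_part file=$file_part stem=$stem_part"
```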
Normalizing Paths with realpath
The realpath command resolves relative bits and symlinks to standardize paths:
FILE_PATH="/usr/local/lib/../bin/python"
realpath "$FILE_PATH" # /usr/local/bin/python (or its target, if that file is a symlink)
This prevents weird relative sequences and symlinks from breaking path logic.
There are many other handy utilities for programmatic path handling: pwd, readlink, etc. Consult the man pages for these useful tools.
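To see the normalization in action, the sketch below reaches one file through a ".." detour and through a symlink, then checks that realpath maps both spellings to the same result. The layout is invented for the demo, and realpath is assumed to be the GNU coreutils version.

```bash
#!/usr/bin/env bash
# Demo in a temporary directory; realpath is assumed to be the GNU coreutils version.
demo_root="$(mktemp -d)"
trap 'rm -rf "$demo_root"' EXIT   # remove the demo files when the script exits
mkdir -p "$demo_root/lib" "$demo_root/bin"
touch "$demo_root/bin/tool"
ln -s "$demo_root/bin/tool" "$demo_root/lib/tool-link"

# Both spellings name the same file: one uses "..", the other a symlink.
via_dotdot="$(realpath "$demo_root/lib/../bin/tool")"
via_symlink="$(realpath "$demo_root/lib/tool-link")"
echo "$via_dotdot"
echo "$via_symlink"

if [ "$via_dotdot" = "$via_symlink" ]; then
    echo "Both paths refer to the same file"
fi
```

Normalizing before comparing is what makes the string comparison meaningful, since the raw inputs are different strings.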
Now that you know how to access absolute paths, let's cover some pro tips and best practices.
Pro Tips for Bulletproof Paths in Bash Scripts
Here are some key lessons full-stack developers have learned from handling lots of path issues:
Validate Paths Before Usage
It's smart to validate paths early instead of assuming a file/folder exists:
# Check path exists
if [ ! -e "$DATA_FOLDER" ]; then
    echo "Error - $DATA_FOLDER not found" >&2
    exit 1
fi
# Continue with file handling/processing...
Catch missing resources upfront to fail fast, rather than mid-process!
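The existence check can be extended to report why a path is unusable. The following is a sketch with a hypothetical helper name, require_dir, that separates three failure cases:

```bash
#!/usr/bin/env bash
# Sketch: distinguish "missing", "not a directory", and "not readable".
# require_dir is a hypothetical helper name.
require_dir() {
    local dir="$1"
    if [ ! -e "$dir" ]; then
        echo "Error - $dir not found" >&2
        return 1
    elif [ ! -d "$dir" ]; then
        echo "Error - $dir is not a directory" >&2
        return 1
    elif [ ! -r "$dir" ]; then
        echo "Error - $dir is not readable" >&2
        return 1
    fi
}

if require_dir "/"; then
    echo "/ is a readable directory"
fi
```

In a real script you would typically call it as require_dir "$DATA_FOLDER" || exit 1 near the top, before any processing starts.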
Always Prefer Absolute Paths
Favor absolute over relative paths even if not strictly required:
# Prefer this
LOG_FILE="/var/log/myapp.log"
# Over this
LOG_FILE="myapp.log"
Absolute paths are more explicit, portable, and just as easy to use in most cases.
Store Paths in Variables
Centralize path references instead of scattering strings throughout code:
# Bad
python /path/script.py
mv /path/file.txt /new/location
# Good
SCRIPT=/path/script.py
DATA_FILE=/path/file.txt
python "$SCRIPT"
mv "$DATA_FILE" /new/location
Updating paths means changing one spot instead of digging through code.
Standardize Path Format & Handling
Follow consistent conventions in your scripts:
- Use a trailing / on folders to tell them apart from files
- Always surround paths with quotes
- Consider creating path variables or helper functions
These patterns make path logic less bug-prone.
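The quoting rule is the one that bites most often. This sketch counts how many arguments a command receives when a path containing a space is passed with and without quotes. The path is hypothetical and is never opened.

```bash
#!/usr/bin/env bash
# Demo: an unquoted path containing a space is split into two arguments.
count_args() { echo "$#"; }

data_file="/tmp/my reports/input.csv"   # hypothetical path with a space

unquoted_count="$(count_args $data_file)"     # word splitting applies
quoted_count="$(count_args "$data_file")"     # passed as one argument
echo "unquoted: $unquoted_count argument(s)"
echo "quoted:   $quoted_count argument(s)"
```

Unquoted, a command like mv would see two separate paths, "/tmp/my" and "reports/input.csv", neither of which is the intended file.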
Make Paths Configurable
Avoid hardcoded absolute paths when possible. Instead, load paths from:
- Configuration files
- Environment variables
- Command line parameters
This helps make reusable, portable scripts.
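One possible way to layer these sources is shown below: a command line argument wins, then an environment variable, then a built-in default. The variable name MYAPP_DATA_DIR, the helper choose_data_dir, and the default path are all hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: pick a data directory from, in order of priority,
# a command line argument, an environment variable, or a built-in default.
# MYAPP_DATA_DIR, choose_data_dir and the default path are hypothetical names.
choose_data_dir() {
    local cli_arg="${1:-}"
    local default_dir="/var/lib/myapp/data"
    if [ -n "$cli_arg" ]; then
        printf '%s\n' "$cli_arg"
    else
        # ${VAR:-fallback} uses the fallback when VAR is unset or empty.
        printf '%s\n' "${MYAPP_DATA_DIR:-$default_dir}"
    fi
}

DATA_FOLDER="$(choose_data_dir "${1:-}")"
echo "Using data folder: $DATA_FOLDER"
```

Run as ./script.sh /srv/data to override on the command line, or as MYAPP_DATA_DIR=/srv/data ./script.sh to override through the environment.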
Following these tips will help you avoid many headaches from path issues!
Caveats: Symlinks and Edge Cases
There are some edge cases around path resolution in Linux worth noting:
Symlinks can cause problems because the shell's pwd and cd work in logical mode by default: they keep the symlink name you used to enter a directory instead of showing the physical directory it points to. Use pwd -P or cd -P when you need the physical path.
So be careful blindly using CWD values in scripts.
Additionally, some mount points like /proc or /sys have special internal path semantics that can break certain assumptions.
The main lesson is always validate paths explicitly instead of making assumptions!
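The difference between the logical and physical working directory is easy to observe. This sketch enters a directory through a symlink (both created under a temporary directory for the demo) and prints what pwd -L and pwd -P report:

```bash
#!/usr/bin/env bash
# Demo in a temporary directory: logical vs physical working directory.
# Resolve the temp dir to its physical path first so the comparison is clean.
demo_root="$(cd "$(mktemp -d)" && pwd -P)"
trap 'rm -rf "$demo_root"' EXIT   # remove the demo files when the script exits
mkdir "$demo_root/real_dir"
ln -s "$demo_root/real_dir" "$demo_root/link_dir"

# Enter the directory through the symlink, inside a subshell each time.
logical="$(cd "$demo_root/link_dir" && pwd -L)"    # keeps the symlink name
physical="$(cd "$demo_root/link_dir" && pwd -P)"   # resolves the symlink

echo "logical:  $logical"
echo "physical: $physical"
```

If a script compares the working directory against a configured path, using pwd -P on both sides avoids mismatches caused by symlinks.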
Absolute vs Relative: Performance Concerns?
Given all the benefits absolute paths provide, some wonder if there is any downside or penalty to avoid.
A common question – do absolute paths hurt performance compared to relative?
The answer is generally no:
- The kernel caches directory lookups, so resolving a few extra path components is cheap
- Path length has minimal impact on I/O or lookup time in most cases
There can be edge cases with extremely long pathnames (Linux limits a single path to PATH_MAX, 4096 bytes). But for almost all scripts, embed as many absolute paths as you need without performance concerns.
Cross-Platform Portability Considerations
If you plan to run scripts across platforms, keep a few OS-specific considerations in mind:
- Path separators differ between Linux (/home/john) and Windows (C:\Users\john)
- Max file path length varies: Linux 4096 bytes (PATH_MAX), Windows 260 chars by default
- Drive naming styles vary between platforms
- Case-sensitive vs insensitive filesystems impact lookups
Thankfully most modern scripting languages provide cross-platform libraries to abstract these away from your code.
But understanding the core path differences can help debug the rare issue.
Overall, the guidelines here apply broadly across Linux, macOS, and UNIX-style environments. Sticking to absolute paths maximizes portability across these platforms.
Concluding Takeaways
After all that, what are the key lessons on paths and Bash scripting?
- Absolute paths uniquely identify files and remove ambiguity. They boost reliability and portability.
- Relative paths depend on the working directory and break easily when it changes. Only use them for user-specific cases.
- Built-in variables like $BASH_SOURCE let you easily reference a script's location from within the code.
- Commands like readlink -f, realpath, basename and dirname help resolve and disassemble paths programmatically.
- Centralize path handling into variables and config. Validate paths exist before usage.
Learning these best practices will level up your Bash scripting game! While getting a firm grasp of paths takes time, the effort pays dividends in stable automated systems.
You now have a Swiss Army knife of techniques to wrangle paths in Bash scripts like a pro. The world awaits automation with robust absolute path handling!