Tidbits and takeaways on the topic of Bash, the Unix shell scripting and command language that ships as the default for many Linux distributions. Bash is a useful tool but one that developers are often hesitant to approach due to its opaque man pages and symbol-heavy syntax.

Good to know

column is useful for “columnating” text inputs, for example:

cat /etc/passwd | column -t -s ":" | less

(Compare this to cat /etc/passwd) Found via Have you ever used the “column” command in Linux?

How to get help: man, help and info

Knowing how and where to get help is arguably the most important competence of any developer.

Getting help in Bash is a two-step process:

  1. Find out what the type of the command you want help on is: type <COMMAND>
  2. Get help:
    1. If <COMMAND> is a shell builtin, do help <COMMAND>
    2. If <COMMAND> is anything else, try man <COMMAND>
  • info is also available

man stands for manual and, as man’s own man page will tell you, is an interface to the system reference manuals. Exit man by hitting q once you’re done.

A couple of examples

For example, if we type type pwd into a terminal, we are told pwd is a shell builtin so we can do help pwd and we get some useful help.

If instead we type type git, we are told git is /usr/bin/git, so we can try man git, since git has a manual page.
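
The two steps can be folded into one small function. This is only a sketch: gethelp is a hypothetical name, and the choice to show the definition for aliases and functions is an assumption rather than anything the steps above prescribe. It relies on type -t, which prints a single word describing the command's type.

```bash
# Hypothetical helper: choose help or man based on what `type -t` reports.
gethelp() {
  case "$(type -t "$1")" in
    builtin|keyword) help "$1" ;;   # shell builtins and keywords: use help
    file)            man "$1" ;;    # programs on disk: use man
    alias|function)  type "$1" ;;   # nothing to look up: show the definition
    *)               echo "gethelp: $1: not found" >&2; return 1 ;;
  esac
}
```

With this in your shell, gethelp pwd shows the builtin help and gethelp git opens the man page.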

Searching the Manuals

Sometimes, you want to search for a command’s instructions and in that case, you would want man -k <REGEXP>

As man man tells us, passing the -k or --apropos flag to man searches the short manual page descriptions for the given regular expression:

-k, --apropos Equivalent to apropos. Search the short manual page descriptions for keywords and display any matches. See apropos(1) for details.

whatis - Help One-liners

Indispensable is the whatis command, which displays one-line manual page descriptions. This is usually the line from the NAME section at the top of the man page.

For example if we run whatis man we get the following printed to our terminal and the whatis command exits, unlike man which throws us into the manual page.

man (1)              - an interface to the system reference manuals
man (7)              - macros to format man pages

Customising the Shell

bashrc and bash_profile

You can place aliases or functions that you would like to make available inside interactive shells in the .bashrc file usually found in your user directory (~).

For example, if you want to create a convenient set of aliases for using Git, you can add the following to your .bashrc file:

.bash_aliases
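
A minimal sketch of such a set of Git aliases. The alias names below are illustrative assumptions, not a standard set; pick whatever abbreviations suit you.

```bash
# Hypothetical Git shortcuts; the names are arbitrary.
alias gs='git status'
alias ga='git add'
alias gc='git commit'
alias gl='git log --oneline --graph'
```

Running alias with no arguments lists every alias currently defined in the shell.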

If you want to consume positional arguments for what you’re doing, you might want to use a function.

For example, if we want a convenient way to create a new directory and change into it, we can add the following to our .bashrc file:

mkcd() {
	# Only change directory if mkdir succeeded
	mkdir -p "$1" && cd "$1"
}

You’d use it like this:

$ mkcd my-new-directory

Keep in mind that everything in your .bashrc file is run for each new interactive shell you open. So you don’t want to insert commands that are not idempotent (i.e. that have a different effect each time they are run), like appending to the PATH variable. Put these into the .bash_profile file instead.

export PATH=$PATH:/usr/local/bin # not idempotent
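
If a PATH change has to live in a file that is sourced repeatedly, one option is to guard it so that a second run is a no-op. The sketch below assumes a helper called path_append (a hypothetical name) and uses /usr/local/bin only because it is the directory from the example above.

```bash
# Hypothetical helper: append a directory to PATH only if it is not there yet.
path_append() {
  case ":${PATH}:" in
    *":$1:"*) ;;                          # already present: do nothing
    *) PATH="${PATH:+${PATH}:}$1" ;;      # otherwise append it
  esac
}

path_append /usr/local/bin
path_append /usr/local/bin   # second call changes nothing
export PATH
```

Wrapping PATH in colons before matching avoids false hits on directories that merely share a prefix, such as /usr/local/bin2.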

Bash History

Besides the above you can also set the number of lines of history that are saved during a session by setting the HISTSIZE variable, or the number of lines that are saved to the history file by setting the HISTFILESIZE variable.

HISTSIZE=1000
HISTFILESIZE=2000

Note the history saved to disk is stored in the ~/.bash_history file.

You can also choose not to put duplicate lines or lines starting with space in the history with:

HISTCONTROL=ignoreboth

and append to the history file, instead of overwriting it with

shopt -s histappend

Customising the Prompt

You can customise the prompt by setting the PS1 environment variable.

export PS1="\u@\h:\w\$ "

The above will set the prompt to be the username, the hostname, the current working directory and a dollar sign followed by a space.

You can design your own prompt using the following escape sequences:

\d    date
\h    hostname (up to .)
\H    hostname (full)
\n    newline
\r    carriage return
\s    shell name
\t    the time
\u    username
\v    version of bash
\w    working directory (path)
\W    working directory (base)
\$    # for root, $ for normal user
\\    backslash
\[    begin a sequence of non-printing characters (e.g. an ANSI color code); end it with \]
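
As a small illustration of combining these, the sketch below colors the user and host green. The color choice (ANSI code 32) is an arbitrary assumption; the point is that each color code sits between \[ and \] so that Bash does not count it towards the visible width of the prompt.

```bash
# Green user@host, then the working directory and the prompt symbol.
# \e[32m switches to green and \e[0m resets; both are wrapped in \[ ... \].
PS1='\[\e[32m\]\u@\h\[\e[0m\]:\w\$ '
```

Single quotes keep the backslashes literal so that Bash interprets them each time it draws the prompt.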

Setting and Unsetting Flags

When setting flags, - and + serve opposing purposes with - turning on a flag and + turning it off.

For example, when using Bash strict mode, conda activate fails due to an unbound PS1 variable. If you run conda activate inside a script with set -euxo pipefail at the top, you can turn the relevant strict flags off just before the call and back on just after it, so that execution is not halted by a bug in Conda code that you do not maintain:

#!/usr/bin/env bash
 
set -euxo pipefail
 
# some commands here
 
CONDA_ENV_NAME=my-lovely-conda-env
 
# Strict mode temporarily disabled as some unbound variables in conda activate
eval "$(conda shell.bash hook)"
set +eu
conda activate "$CONDA_ENV_NAME"
set -eu
conda env list
 
# some other commands now $(which python) points inside my-lovely-conda-env
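
The toggling itself can be observed without Conda. This sketch inspects the special parameter $-, which holds the single-letter flags currently in effect; the variable name maybe_unset is made up for the demonstration.

```bash
set -u                                   # - turns nounset on
case "$-" in *u*) echo "nounset: on" ;; esac

set +u                                   # + turns it off again
case "$-" in *u*) ;; *) echo "nounset: off" ;; esac

# With nounset off, an unset variable expands to an empty string
# instead of aborting the script.
unset maybe_unset
echo "value: '${maybe_unset}'"
```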

The same convention applies to [declare](https://unix.stackexchange.com/a/565925/275311), which explicitly declares variables (for example, when you want to constrain them to integer type).

Running

help declare

Returns, in abridged form:

declare: declare [-aAfFgilnrtux] [-p] [name[=value] ...]
    Set variable values and attributes.

    Declare variables and give them attributes.  If no NAMEs are given,
    display the attributes and values of all variables.

    Options:
        ...

    Options which set attributes:
      -a        to make NAMEs indexed arrays (if supported)
      -i        to make NAMEs have the `integer' attribute
      -l        to convert the value of each NAME to lower case on assignment
      -n        make NAME a reference to the variable named by its value
      -r        to make NAMEs readonly

    Using `+' instead of `-' turns off the given attribute.

    ...

    Exit Status:
    Returns success unless an invalid option is supplied or a variable
    assignment error occurs.

Note in particular the sentence Using + instead of - turns off the given attribute.
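
A short sketch of that sentence in action, using the integer attribute. The variable name count is arbitrary.

```bash
declare -i count=5   # -i turns the integer attribute on
count+=3             # arithmetic addition, so count becomes 8
declare +i count     # +i turns the attribute off again
count+=3             # plain string append, so count becomes "83"
echo "$count"
```

The same += operator behaves differently before and after declare +i, which is an easy way to confirm the attribute really was removed.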

Deleting by Pattern

Oftentimes you want to delete all files in a directory that match a certain pattern. For example, if you have a directory with a bunch of .pyc files in it, you might want to delete them all.

The appropriate tool to use in this situation is not rm but find, which can be used to search for files and directories.

For example, if we want to delete all .pyc files in the current directory, including nested directories within it, we can do

find . -type f -name "*.pyc" -delete

Alternatively, if we want to delete all the .ckpt files whose filenames without the suffix (i.e. basename filename.ckpt .ckpt) terminate in a digit ([0-9]), we can do either one of:

# 1
find . -type f -name "*[0-9].ckpt" -delete
 
# 2
find . -type f -name "*[[:digit:]].ckpt" -delete

Alternative syntax is:

find . -type f -name '*[0-9].ckpt' -exec rm {} +
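
Because deletion is irreversible, it can help to run the same expression with -print first and only then swap in -delete. The sketch below does this in a throwaway directory; the file names are made up for the demonstration.

```bash
# Hypothetical checkpoint files in a temporary directory.
tmpdir=$(mktemp -d)
touch "$tmpdir/model1.ckpt" "$tmpdir/model2.ckpt" "$tmpdir/final.ckpt"

# Dry run: list what would be deleted.
find "$tmpdir" -type f -name '*[0-9].ckpt' -print

# Same expression, now deleting for real.
find "$tmpdir" -type f -name '*[0-9].ckpt' -delete

ls "$tmpdir"   # only final.ckpt is left
```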

Looping

Syntax: Looping over files, ranges and lines

Bash loops have the syntax:

for i in /etc/rc.*; do
  echo "$i"
done

Iterating over a range is possible with the for i in {start..stop..step} syntax, where the final ..step is optional.

for i in {5..50..5}; do
    echo "Welcome $i"
done

We can also iterate over lines of a file by redirecting the input from a file in the following way:

while IFS= read -r line; do
  echo "$line"
done <file.txt
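
Two details are easy to miss when reading lines this way: leading whitespace is trimmed unless IFS is emptied for read, and a final line with no trailing newline is skipped unless it is checked for explicitly. The sketch below handles both, using a temporary input file created only for the demonstration.

```bash
# Hypothetical input whose last line has no trailing newline.
input=$(mktemp)
printf '  indented line\nback\\slash\nlast line' > "$input"

count=0
# IFS= keeps leading and trailing whitespace; -r keeps backslashes literal;
# the || test still processes a final line that lacks a newline.
while IFS= read -r line || [ -n "$line" ]; do
  count=$((count + 1))
  printf '%d: [%s]\n' "$count" "$line"
done < "$input"

rm -f "$input"
```

Since the loop reads from a redirection rather than a pipe, it runs in the current shell and count keeps its value afterwards.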

Example: Renaming all items in a directory by looping over filenames

Run the following command to iterate over 1, 2, 3, …, 329 and rename WAV files such as 1.wav so that the number is left-padded with zeros to 3 digits. The source name is "${i}.wav" and the destination name is "$(printf '%03d' "$i").wav":

for i in {1..329}; do mv "${i}.wav" "$(printf '%03d' "$i").wav"; done
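
When some numbers in the range may be missing, a guard keeps mv from printing an error for every gap. The sketch below runs in a temporary directory with three made-up files so it can be tried safely.

```bash
# Hypothetical files in a throwaway directory.
workdir=$(mktemp -d)
touch "$workdir/1.wav" "$workdir/2.wav" "$workdir/10.wav"

for i in {1..10}; do
  src="$workdir/${i}.wav"
  [ -e "$src" ] || continue                        # skip numbers with no file
  mv -- "$src" "$workdir/$(printf '%03d' "$i").wav"
done

ls "$workdir"
```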

See the section on Loops from the Bash scripting cheatsheet.

Getting the script directory like Path(__file__)

In Python, you can always get the path of the Python file (module, script) and consequently its parent directory (via .parent or .parents[idx] for ancestors).

In Bash, this is:

#!/bin/bash
 
SCRIPT_DIR=$(realpath "$(dirname "$0")")
 
echo "The directory of this script is: $SCRIPT_DIR"

  • $0: the currently invoked command (so the script)
  • dirname: the directory name
  • realpath: resolve to an absolute (real) path
  • use another command to get the unresolved path e.g. for symlinks
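
One widely used variant, shown here as a sketch rather than as the only correct form, swaps $0 for BASH_SOURCE[0] and realpath for cd plus pwd. BASH_SOURCE[0] still names the file when the script is sourced rather than executed, and pwd reports the logical path, so symlinked directories are not resolved.

```bash
# BASH_SOURCE[0] is the file this line lives in, even when the file is sourced.
# cd + pwd gives an absolute path without resolving symlinks.
SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)

echo "The directory of this script is: $SCRIPT_DIR"
```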

Difference between Single and Double Square Brackets ([ vs [[)

The double brackets, [[ ]], were introduced in the Korn shell as an enhancement that makes tests in shell scripts easier to write. We can think of them as a convenient alternative to single brackets.

It’s available in many shells like Bash and zsh. However, double brackets aren’t POSIX compliant.

[[ is a new, improved version of [, and it is a keyword rather than a program. This makes it easier to use.

The single bracket is a built-in command that’s older than the double bracket. The double bracket is an enhanced version of the single bracket. If we want the portability of scripts, we should prefer single brackets. However, using the double brackets is generally more convenient.

From Are double square brackets [[ ]] preferable over single square brackets [ ] in Bash?:

[[ has fewer surprises and is generally safer to use. But it is not portable - POSIX doesn’t specify what it does and only some shells support it (beside bash, I heard ksh supports it too). For example, you can do

[[ -e $b ]]

to test whether a file exists. But with [, you have to quote $b, because it splits the argument and expands things like "a*" (where [[ takes it literally). That has also to do with how [ can be an external program and receives its argument just normally like every other program (although it can also be a builtin, but then it still has not this special handling).

[[ also has some other nice features, like regular expression matching with =~ along with operators like they are known in C-like languages. Here is a good page about it: What is the difference between test, [ and [[ ? and Bash Tests
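
Both points from the quoted answer can be tried out directly. The sketch below uses a made-up file name containing a space and a made-up version string; it shows the unquoted variable working inside [[, failing inside [, and a regular expression match with =~.

```bash
tmpdir=$(mktemp -d)
f="$tmpdir/my file.txt"      # hypothetical file name containing a space
touch "$f"

# [[ does not word-split $f, so the unquoted expansion is safe here.
if [[ -e $f ]]; then echo "[[ -e \$f ]]: found"; fi

# [ sees the unquoted value as two words and fails with a usage error.
if [ -e $f ] 2>/dev/null; then
  echo "[ -e \$f ]: found"
else
  echo "[ -e \$f ]: failed, quote it as \"\$f\""
fi

# =~ matches a regular expression; captured groups land in BASH_REMATCH.
version="release-42"
if [[ $version =~ ^release-([0-9]+)$ ]]; then
  echo "release number: ${BASH_REMATCH[1]}"
fi

rm -rf "$tmpdir"
```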

Debugging Common Problems (Q&A)

Usage of xargs - Example with file_ends_with_newline - ChatGPT

The command you provided has some issues related to the usage of xargs and the incorrect usage of quotes. Here’s the corrected version of the bash command:

ls | xargs -I{} bash -c 'echo $(tail -c1 "{}" | wc -l)'

Explanation:

  1. ls: Lists the files in the current directory.
  2. xargs -I{}: Takes the output from ls and replaces {} with each file name.
  3. bash -c '...': This is necessary because you need to execute a command that involves shell features like $(...), and this needs to be evaluated by the shell.
  4. "{}": The filename passed by xargs is enclosed in double quotes to handle any filenames with spaces or special characters.
  5. tail -c1 "{}" | wc -l: tail -c1 reads the last byte of the file, and wc -l counts the number of lines, which will be either 0 or 1 depending on whether that last byte is a newline character.

This command will effectively echo 1 if the last character of the file is a newline, otherwise, it will echo 0.
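
The command above splices each file name into the string that bash -c executes, so a name containing quotes or $ could be misread as shell code. A sketch of the same check that avoids this is given below; file_ends_with_newline is a hypothetical function name taken from the heading, and the two demo files are created only for illustration.

```bash
# Hypothetical helper: succeeds if the file's last byte is a newline.
file_ends_with_newline() {
  [ "$(tail -c 1 -- "$1" | wc -l)" -eq 1 ]
}

demo=$(mktemp -d)
printf 'with newline\n' > "$demo/a.txt"
printf 'without newline' > "$demo/b.txt"

# A glob handles names with spaces, unlike parsing the output of ls.
for f in "$demo"/*; do
  if file_ends_with_newline "$f"; then
    echo "$(basename "$f"): 1"
  else
    echo "$(basename "$f"): 0"
  fi
done

# The same check via xargs, passing each name as an argument ($1)
# instead of pasting it into the command string.
find "$demo" -type f -print0 | xargs -0 -I{} bash -c 'tail -c 1 "$1" | wc -l' _ {}
```

The -print0 and -0 pair separates names with NUL bytes, so file names containing spaces or newlines are passed through intact.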