getopts: A Story

Posted on Apr 25, 2021

It looks like I’m in a shell scripting fever lately… which is a great thing! You know, shell scripting is an amazing, very underestimated tool. How many things could’ve been easily tackled as shell scripts but aren’t… (sad noises) So this post is a bit of a follow-up to the last one: let’s dive into some more POSIX shell scripting!

One thing I was doing in the very worst possible way in my shell scripts was parsing CLI arguments. My usual strategy was… horrible… Let’s say we’ve got the following interface in mind:

foobar [-a] file

My code would look like this cute pile of nonsense:

#!/bin/sh

# Cumbersome, horrible implementation for:
# foobar [-a] file

usage () {
	echo 'usage: foobar [-a] file' >&2
	exit 1
}

case "$1" in
-a)	set_a=1
	shift 
	;;
-*)	usage ;; # Not *) because of the 'file' argument
esac

# Our interface *requires* 'file'
[ "$1" ] && file="$1" || usage

# Be gay, do crime...

For a simple interface like this, it works. Now, imagine you’ve got an interface like cras’s (just for the sake of argument, although it’s a binary written in C):

cras [-ailnoOv] [-detT num] [file]

Of course, my solution is very hard to scale up to something like that, or to any interface that combines a couple of options, one of which takes an argument. Yeah, a while-loop would do the trick, sure… but… This must be something folks writing shell scripts have already solved, right? I mean, it’s a very common task, isn’t it?
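To make the pain concrete, here is a rough sketch of what such a hand-rolled while-loop might look like. The function name parse_by_hand and the -d option are invented purely for illustration; they aren't part of foobar's interface above.

```shell
#!/bin/sh

# Hypothetical hand-rolled parser for: foobar [-a] [-d num] file
# (-d is made up here just to have an option that takes an argument).
parse_by_hand () {
	set_a=0 d_arg='' file=''
	while [ $# -gt 0 ]; do
		case "$1" in
		-a)	set_a=1 ;;
		-d)	# The argument must be a separate word: -d 3 works, -d3 doesn't
			[ $# -ge 2 ] || return 1
			d_arg="$2"
			shift
			;;
		--)	shift; break ;;
		-*)	return 1 ;; # Clusters such as -ad end up here too
		*)	break ;;
		esac
		shift
	done
	[ $# -ge 1 ] || return 1
	file="$1"
}

parse_by_hand -a -d 3 notes.txt && echo "a=$set_a d=$d_arg file=$file"
parse_by_hand -ad 3 notes.txt || echo 'clustered options are rejected'
```

It runs, but every convenience (clusters like -ad, glued arguments like -d3) would have to be added by hand, one case at a time.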

When I find myself in this situation, I start reading other people’s code, especially that of utilities I use on a daily basis. Interestingly, it turns out that most of the programs I use are written in anything but POSIX sh. For some reason I mostly use binaries and bash scripts with ad hoc interfaces not based on options (e.g. pass).

So… I went off to look for resources on the internet. I’m not sure where it was, but I found people referring to getopts as the best solution.

And here is where the fun starts. The interface of getopts is basically the one you get from getopt() in C, but even nicer. For our interface above:

#!/bin/sh

# Doesn't seem impressive, but wait until we extend it!
# foobar [-a] file

usage () {
	echo 'usage: foobar [-a] file' >&2
	exit 1
}

# getopts asks for a string listing all options. A leading : turns on silent
# error reporting, so getopts doesn't print its own messages to stderr. The
# second argument is the name of the variable that receives each option.
while getopts :a opt; do
	case "$opt" in
	a)	set_a=1 ;;
	?)	usage ;; # getopts returns ? for unknown options
	esac
done

# Shift $@ to the first non-option parameter (i.e. 'file').
shift $((OPTIND - 1)) 

# Our interface *requires* 'file'
[ "$1" ] && file="$1" || usage

# Be gay, do crime...
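What getopts buys us, even for this tiny interface, is easier to see by feeding it a few argument lists. The following is only a sketch: parse_foobar is a hypothetical wrapper around the same loop, and it resets OPTIND so it can be called several times in one script.

```shell
#!/bin/sh

# Hypothetical helper: parse the foobar interface and report what is left.
parse_foobar () {
	OPTIND=1 # Reset so the function can be called more than once
	set_a=0
	while getopts :a opt; do
		case "$opt" in
		a)	set_a=1 ;;
		\?)	return 1 ;; # Quoted, so it only matches a literal ?
		esac
	done
	shift $((OPTIND - 1))
	echo "a=$set_a rest=$*"
}

parse_foobar -a notes.txt     # a=1 rest=notes.txt
parse_foobar -aa notes.txt    # a=1 rest=notes.txt (clusters are handled)
parse_foobar -- -a            # a=0 rest=-a ('--' ends the options)
parse_foobar notes.txt -a     # a=0 rest=notes.txt -a (stops at the first operand)
```

Note the last call: getopts stops at the first non-option argument, so options placed after 'file' are left untouched in "$@".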

Remember cras’s interface? With getopts it’d look like this:

#!/bin/sh

# OH!
# cras [-ailnoOv] [-detT num] [file]

usage () {
	echo 'usage: cras [-ailnoOv] [-detT num] [file]' >&2
	exit 1
}

# A : *after* an option means the option requires an argument
while getopts :ailnoOvd:e:t:T: opt; do
	case "$opt" in
	a)	set_a=1 ;;
	i)	set_i=1 ;;
	l)	set_l=1 ;;
	# [...]
	d)	set_d=1
		d_arg="$OPTARG" 
		;;
	e)	set_e=1
		e_arg="$OPTARG"
		;;
	# [...]
	# Unquoted, ? is a glob for any single character, so this branch catches
	# both ? (unknown option) and : (missing argument).
	?)	usage ;;
	esac
done
shift $((OPTIND - 1)) 

# Our interface does *not* require 'file'... If not present, grab CWD
file="${1:-$(pwd)}"

# Be gay, do crime... again!
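One detail worth spelling out: with a leading : in the option string, getopts reports errors through the iterator variable instead of printing them. It is set to ? for an unknown option and to : for a missing argument, with the offending option letter in OPTARG. Here is a small sketch of telling the two apart; the report function is hypothetical and declares only -a and -d to keep things short.

```shell
#!/bin/sh

# Hypothetical helper showing what getopts reports in silent mode
# (option string with a leading ':'). Only -a and -d are declared here.
report () {
	OPTIND=1 # Reset so the function can be called more than once
	while getopts :ad: opt; do
		case "$opt" in
		a)	echo 'flag a' ;;
		d)	echo "d=$OPTARG" ;;
		:)	echo "missing argument for -$OPTARG" ;;
		\?)	echo "unknown option -$OPTARG" ;;
		esac
	done
}

report -a -d 3    # flag a, then d=3
report -ad3       # same result: the argument may be glued to the option
report -x         # unknown option -x
report -d         # missing argument for -d
```

Quoting the pattern as \? matters: it makes the branch match a literal question mark rather than any single character.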

Isn’t it awesome? No more worrying about how to deal with CLI options! This works! Wait a second… Is this POSIX? You know I care about this sort of thing, right?

Go figure… getopts is POSIX: Behold the specs! This is even better!

… Hm… wait… If getopts is POSIX… then… what about getopt() in C?

To be honest, lately I’ve been avoiding getopt() in C because I mistakenly thought it was a GNU extension. Truth is, it’s POSIX as well (though not part of the C Standard!); only getopt_long() is a GNU extension, albeit one that other libcs, such as the BSDs’, have since adopted. All this time I’ve been using libsl’s arg.h instead, mainly because of this misconception. I must say, though, I still prefer it and will keep using it: it’s very simple and, being a header in standard C, it’s portable by default even to non-POSIX systems. Not that this is a critical use case for me, but it feels nice.

I wonder where my prejudice came from in the first place. Now I wonder how many other things I might not be aware of in my programming, and in computing in general. One of the things I love about this art is that you’re constantly learning new things… some of which are decades old yet still current! It’s a fun, rewarding journey full of surprises.

Never, ever stop learning!