Ari In The Shell
So yesterday I pushed a new git repo online: wsclean. Nothing fancy at all: it’s just a very simple shell script that helps me to clean my source files from useless tabs and spaces that sometimes are left over due to oversight, quick edits, you name it.
It’s a bit messy, I guess? But the gist of it is that it calls sed to find
and replace (what I consider) unneeded whitespace characters, using a
regexp.[1] Pretty standard UNIX-y stuff, huh? The remaining code just deals
with the files themselves and implements a “failsafe mode,” which sort of
puts me in a bad light as a developer. Not even I trust my own code?
Ohnoes… No, just kidding: I find the failsafe mode very useful if I want to
run diff on the results, just in case I’m dealing with some file that isn’t
tracked by git.
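To make the idea concrete, here’s a minimal sketch of that kind of script. It is not wsclean’s actual source: I’m assuming that “unneeded whitespace” means trailing spaces and tabs, and the function names and the `.clean` suffix are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of the technique, NOT the real wsclean code.

# strip_trailing_ws FILE
# Print FILE to stdout with trailing spaces and tabs removed from each line.
strip_trailing_ws() {
    # [[:blank:]] is the POSIX character class for space and tab.
    sed 's/[[:blank:]]*$//' "$1"
}

# clean_failsafe FILE
# "Failsafe" variant: leave FILE untouched and write the cleaned text
# to FILE.clean (an invented suffix) so both versions can be compared.
clean_failsafe() {
    strip_trailing_ws "$1" > "$1.clean"
}
```

Something like `clean_failsafe notes.txt && diff notes.txt notes.txt.clean` would then show exactly which lines changed before anything gets overwritten.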
My coding and computing experience has improved a lot as I’ve become more shell literate. And sometimes I wonder whether any of my projects could be rewritten from C into POSIX sh… hey, most of what I do is some type of filtered I/O to files: both cras and schain read some file, apply some change to it, and then print the information to stdout with some makeup to make it look better for each project’s goals. Rewriting them into POSIX sh should be trivial, I think.
Shell scripting languages are Turing complete, so in theory you should be
able to solve any computational problem with them. String support is superb.
They’re totally integrated into the POSIX userland… precisely because they’re
the shell (duh!)… What prevents me from doing those rewrites? I could save
myself from the usual boilerplate code that comes from using strtok() to parse
a string! The grep+sed+AWK trifecta is able to do the same tasks and possibly
in a less cumbersome way! Ari, you’re being such a C fangirl…
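As a rough illustration of that point, here’s how splitting a delimited string might look without any strtok() loop. The colon-separated record format and the function names are invented for this example; they don’t come from cras or schain.

```shell
#!/bin/sh
# Hypothetical example: records look like "first:second:third".

# field N RECORD
# Print the Nth colon-separated field of RECORD using awk.
field() {
    printf '%s\n' "$2" | awk -F: -v n="$1" '{ print $n }'
}

# split_record RECORD
# Same split with shell built-ins only: read splits its input on IFS.
# The IFS assignment applies to this one read command.
split_record() {
    IFS=: read -r first second third <<EOF
$1
EOF
    printf '%s|%s|%s\n' "$first" "$second" "$third"
}
```

One difference worth keeping in mind: strtok() collapses runs of delimiters, whereas both `awk -F:` and `read` with a non-whitespace IFS character keep empty fields, so the behavior isn’t a drop-in match.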
Hey, maybe I am a C fangirl. Maybe my way of seeing things is different, but I have a very clear feeling as to when C is better suited than POSIX sh to do the task, even if it involves lots of I/O and strings.
“Ariadna, are you mixing programming and feelings again? Why??”
Yes, I am. All human endeavors are sparked by irrational intuition. Logic keeps things moving from that initial spark.
The shell, as I see it, is a window to interact with the userland. The
strengths and the weaknesses of the shell don’t really come from what it does,
but from the strengths and weaknesses of the userland you’re using. POSIX sh
doesn’t know anything about regexps (in fact, it uses Kleene stars in a way
that is different to all regexp syntaxes), text formatting, dealing with files,
etc. Its built-ins are usually very scarce (and implementation-dependent): most
of what you use in a shell script are programs installed on your computer, not
functions. In some userlands not even test (or its alias [) is a shell
built-in. The very same script might or might not work on someone else’s
system, depending on whether their userland behaves the same way as yours!
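The star difference is easy to see side by side. In this small sketch (the helper names are mine), the shell’s own pattern matching is done with `case`, while the regexp matching has to be delegated to an external program, grep:

```shell
#!/bin/sh
# glob_match STRING PATTERN
# Succeed if STRING matches the shell pattern (glob), using only the shell.
glob_match() {
    case $1 in
        $2) return 0 ;;   # unquoted $2 is treated as a pattern here
        *)  return 1 ;;
    esac
}

# regex_match STRING REGEX
# Succeed if the whole STRING matches the basic regular expression.
# The shell has no regexp engine of its own, so this calls grep.
regex_match() {
    printf '%s\n' "$1" | grep -q -x -e "$2"
}
```

With these, `glob_match abc 'a*'` succeeds because the glob star means “any string,” while `regex_match abc 'a*'` fails because the regexp star means “zero or more of the previous character,” so it only accepts strings like `aaa`. To see how your own shell resolves test, try `type test` and `type [`; the wording of the output varies between shells.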
Yes, I know: there’s a thing called SUS, the Single Unix Specification, which shares its core with POSIX and defines the standard commands in a POSIX environment. If you write portable shell code, any POSIX-compliant system should run it identically. Vendors target this compliance, but hey, bugs exist. I recently discovered a bug in a friend’s AWK code because his implementation didn’t comply with the standard in one very subtle detail, whereas GNU AWK did follow the specs in that area.
So, in my view of things, I write shell scripts when I see my code as an
“automated calling of commands” sort of project. If I see it as an “independent
tool,” I write it in C as a standalone program, even if it makes use of I/O
routines that do have interfaces at the shell level (i.e. most of
unistd.h). I know, I know, this is utterly subjective. I guess the
threshold lies in whether the program is defining some kind of novel interface
on its own; if it does, I go for C. For example, even though these are trivial
interfaces, cras and schain define file formats and a way to deal with those
files. I see them as programs that need to be first-class citizens of the whole
userland, instead of being a composition of other already existing userland
tools. In contrast, wsclean and phrenamer do not, so I see them as
automated execution of other tools.
Again, I’m not trying to set a moral standard here. Who knows what I may end up thinking of all of this in some months or years! I know, I’m the worst.
However, being shell literate is a must for anyone using Linux and Unix-like systems, in my opinion. GUI tools can be powerful, but shell tools open the gates to awesome ways to make your system your own. If you haven’t, look for any guides and tutorials out there (if they cover POSIX shell instead of bash, all the better). It’s fun and teaches a lot about the system you’re running. Don’t be silly, give living in a shell a chance!
[1] I spent too much time deciding whether I’d use regex or regexp in this article! Settled on regexp because I like that it’s got an even number of letters.