I'm Preserving strlcpy()

Posted on Mar 20, 2022

Hi everybody! I wanna tell you about a smallish contribution to the FOSS world I started yesterday… and about why I think it’s necessary and why we should do it more often: preservation. This is not about putting code in cans and storing them in your basement, but about avoiding a sort of “code rot” I’ve noticed over the years… which is very much like what happens to manuscripts and other forms of media that are passed around over the years… or the centuries… So, I’m preserving good ol’ strlcpy() in a repository of my own.

I invite you to read the Readme file in the repository to get a glimpse of the situation with this module. In a nutshell, strlcpy.c allows for safe string copying in C. The routine was written by Todd C. Miller for OpenBSD’s libc and proved very useful for getting rid of the unsafe strcpy() routine and the safer yet unreliable strncpy() routine from the C Standard Library…1 and the rest is FOSS history: people liked it and started using it in their own projects thanks to Miller releasing his code under the ISC license. A true success story in FOSS!
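To make “safe string copying” a bit more concrete, here is a minimal sketch of the behavior you get from a strlcpy()-style routine. Mind you, this is not Miller’s code: sketch_strlcpy() and copy_fits() are hypothetical names of my own, and the stand-in is built on strlen() and memcpy() only so that the example compiles on systems whose libc doesn’t ship strlcpy().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in with strlcpy()-like semantics (NOT Miller's code).
 * Copies at most size - 1 bytes of src into dst, always terminates dst
 * when size is not zero, and returns the length of src.
 */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);

    if (size != 0) {
        /* Leave room for the terminating '\0'. */
        size_t n = (srclen < size - 1) ? srclen : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }

    /* The caller compares this against size to detect truncation. */
    return srclen;
}

/* Hypothetical helper: returns 1 if src fit into dst, 0 if it was truncated. */
int copy_fits(char *dst, const char *src, size_t size)
{
    return sketch_strlcpy(dst, src, size) < size;
}
```

With an 8-byte buffer, copy_fits(buf, "hello, world", sizeof buf) leaves the null-terminated string "hello, " in buf and returns 0, because the 12-byte source didn’t fit. That return-value check is the whole trick.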

But that’s where the problems arose. The original code is meant to be used in OpenBSD, so to use it anywhere else you need to remove a call to a macro that is internal to OpenBSD’s libc (DEF_WEAK()). And this being FOSS, people have added other routines for their own needs, like estrlcpy(), a wrapper around strlcpy() that prints an error message to stderr when an error is hit while copying the string. Awesome stuff, but you find these modifications within other people’s projects… A classic example is the set of include directives you find at the beginning of the module.
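Just to illustrate what such a wrapper might look like, here is a rough sketch. It is an assumption on my part, not the code from any particular project: sketch_estrlcpy() is a hypothetical name, I treat truncation as the “error”, the wording of the message is made up, and sketch_strlcpy() is again a strlen()/memcpy() stand-in so the example builds without a libc strlcpy().

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in with strlcpy()-like semantics (NOT Miller's code). */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);

    if (size != 0) {
        size_t n = (srclen < size - 1) ? srclen : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/*
 * Hypothetical wrapper in the spirit of estrlcpy(): copy the string,
 * and complain on stderr if it had to be truncated.
 */
size_t sketch_estrlcpy(char *dst, const char *src, size_t size)
{
    size_t ret = sketch_strlcpy(dst, src, size);

    if (ret >= size) {
        fprintf(stderr,
                "strlcpy: input string too long (%zu bytes needed, %zu available)\n",
                ret + 1, size);
    }
    return ret;
}
```

Whether the wrapper should only warn, as it does here, or bail out of the program altogether is a design choice each project makes for itself, which is exactly why these additions don’t travel well from one codebase to another.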

Therefore, if you grab the module from those projects, you’ll have to roll back their changes to make the module useful to you.

On top of that, add incorrect copyright information to the mix, because people have been getting the code not from OpenBSD, but from who knows where… So you see very, very weird attribution strings in some versions of strlcpy(), like this one,2 or this baffling discussion on a patch that solves this, but only after a lot of confusion…3

Meanwhile, Apple decided to write their own, totally different version. Probably due to their obsessive NIH syndrome, but maybe they didn’t want to take any chances either. It’s under the APSL (a non-copyleft open source license very similar to the Apache License), so here’s the code if you prefer using Cupertino’s version. I wouldn’t, though… Why would you, seriously? I mean, OpenBSD’s doesn’t call any subroutine whatsoever, while Apple’s abuses memcpy()!

Back to the source “tracing” issue, though.

FOSS is about decentralization of code. In fact, the artificial verticalization of git repos that is touted by GitHub and its clones (e.g. Gitea, GitLab) is one of the most harmful developments in the current FOSS culture. Drew DeVault has explained this way better than I ever could in this post on his blog. So the fact that strlcpy() has had a somewhat chaotic history is actually something we should embrace.

However, maybe because I’m a BA in Spanish Philology who had some dealings with weird manuscripts from 17th-century Spain in her undergrad… I have a love for preserving the source material in a way that is closest to what it was meant to look like… yet OK, my repo also introduces the modifications I mentioned that make the code portable outside OpenBSD… while acknowledging its origins.

This may also have something to do with my love of retrocomputing documentaries and of watching hours upon hours of retro-tech YouTubers? To me they’re heroes who are preserving and restoring history and doing stuff that is so cool! I know, FOSS sorta preserves itself automatically, but still…

It’s not about telling people “Ariadna has the true strlcpy() repo,” but about honoring a part of our history. You should totally use the version of strlcpy() you like the most or even roll your own. I do prefer this one because I trust OpenBSD when it comes to safe code, and you surely know that C “strings” are a source of problems if you don’t know what you’re doing. Miller’s version is impeccable and also a very nice example of how to use pointer arithmetic in a sane way to do something practical.
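If you have never seen this style of code, here is a small sketch of what walking two pointers through a bounded copy can look like. To be clear, this is my own illustration and not Miller’s source; walk_copy() and its variable names are hypothetical, and you should read the real thing in OpenBSD’s tree.

```c
#include <stddef.h>

/*
 * Hypothetical pointer-walking copy with strlcpy()-like semantics
 * (an illustration only, NOT Miller's code).
 */
size_t walk_copy(char *dst, const char *src, size_t size)
{
    const char *s = src; /* read cursor */
    char *d = dst;       /* write cursor */

    if (size != 0) {
        /* The last slot of dst is reserved for the terminating '\0'. */
        char *end = dst + (size - 1);

        while (d < end && *s != '\0')
            *d++ = *s++;
        *d = '\0';
    }

    /* Keep walking src so we can report its full length. */
    while (*s != '\0')
        s++;

    /* Distance between the two pointers is the length of src. */
    return (size_t)(s - src);
}
```

No library calls at all here: the copy, the termination and the length all fall out of moving the two cursors and subtracting pointers at the end.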

I am tempted to write a Makefile that compiles the module into both a static and a shared library to make using it even easier, but I should think about whether it’s really worth it. It’d be convenient, yes… It’d also go way beyond the scope of preserving the module. If it were a library whose source was split into many modules, I would’ve provided a Makefile from the start, but… I mean, look, this module lets itself be copied into any project as easily as it gets.

I truly hope you find this little thing useful or, at least, interesting. It’ll stay just a git repo; I’m not planning to build a whole project around it and I don’t think I’m going to follow upstream for changes.4 I do wish this to be the first of many preservation efforts in my “career,” though… I mean, my original code is amateurish, but I happen to have some training when it comes to dusting off what people wrote in the past… So why not use my weirdness in service of the FOSS community!

Sending you all lots of love as always! 😘 Our Good Lord bless you all!

  1. If you don’t know what I’m referring to: in a nutshell, strcpy() is as brutish as the dd(1) command and will copy a string over to the destination memory address regardless of whether there’s enough memory allocated there… so buffer overflows are very likely. strncpy() only copies the number of bytes you ask for, but the destination string may turn out not to be null-terminated… which may lead to boundary errors later on anywhere else in your code. strlcpy() is basically a strncpy() that guarantees that the destination string will be null-terminated, as long as the size you pass isn’t zero (the trade-off is possible truncation, but the return value, which is the length of the source string, will tell you if that’s the case). ↩︎

  2. Seriously, guys, /* Taken from OpenBSD */? You’ve managed to break the only requirement of the ISC License! You had one job! ↩︎

  3. Notice how the patch wants to attribute the code to the ISC itself! ↩︎

  4. Not sure how a module like this would change any further… but it has seen some changes over the years, so who knows! ↩︎