Tip 42: Don't bother with union

24 December 11.

level: textbook reader
purpose: reduce cognitive load

Part of a series of tips on POSIX and C. Start from the tip intro page, or get 21st Century C, the book based on this series.

I've read a lot of old computing manuals--I've even got a PL/I book on my shelf that I've spent many minutes flipping through. On the language family tree, PL/I is in some ways a relative of a lot of languages, including ALGOL, FORTRAN, SAS, C, and other languages from the era when every language name was capitalized.

These manuals cared a lot about alignment. Computer memory had certain preferred sizes, like eight-byte chunks, and those chunks were requisite for processing. If a variable began halfway through a chunk, you were screwed. I've had to deal with this in my own lifetime, when writing C code to interface with FORTRAN code; alignment bugs ensued.
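To make the alignment idea concrete, here is a minimal sketch of how a modern C compiler handles it on your behalf. The `tagged_value` type and both helper functions are hypothetical names made up for illustration, and the code assumes a C11 compiler for `_Alignof`.

```c
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical record for illustration: a one-byte tag, then a double. */
typedef struct {
    char   tag;
    double value;
} tagged_value;

/* Number of padding bytes the compiler placed between tag and value. */
size_t padding_before_value(void) {
    return offsetof(tagged_value, value) - sizeof(char);
}

/* Nonzero if value starts at a multiple of double's required alignment. */
int value_is_aligned(void) {
    return offsetof(tagged_value, value) % _Alignof(double) == 0;
}
```

The amount of padding varies by platform, so the sketch reports it rather than assuming a number. The point is that `value_is_aligned()` holds without the programmer doing any layout arithmetic.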

R is a statistics-oriented scripting language that, under the hood, is a kind of LISP, implemented via S-expressions, which can be of any type: integer, floating point, text, vector, list of other S-expressions, function, something like two dozen options total. Also, roughly everything is an S-expression--if you had 100,000 of them in memory at one time, it wouldn't be all that surprising, and at that point a union with a fraction of the memory footprint of a struct holding a fixed slot for every type would be noticeable. It's a somewhat clever setup, marred only by its lack of documentation (not that I'm bitter).
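The memory argument can be sketched with a toy version of such a multi-type value. This is not R's actual data structure. The type names, the three payload types, and `bytes_saved_per_value` are all assumptions made up for illustration.

```c
#include <stddef.h>   /* size_t */

/* Hypothetical type tag: says which payload is currently in use. */
typedef enum { T_INT, T_REAL, T_TEXT } value_type;

/* Union version: all three payloads share the same storage. */
typedef struct {
    value_type type;
    union {
        long   i;
        double d;
        char  *s;
    } as;
} compact_value;

/* Struct version: a separate slot for every possible payload. */
typedef struct {
    value_type type;
    long   i;
    double d;
    char  *s;
} roomy_value;

/* Bytes saved per object by using the union instead of the struct. */
size_t bytes_saved_per_value(void) {
    return sizeof(roomy_value) - sizeof(compact_value);
}
```

With only three payload types the saving is a handful of bytes per object, so even 100,000 objects save on the order of a megabyte. The gap widens as the number of possible types grows toward the two dozen mentioned above.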

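To summarize the use cases for unions: interfacing with legacy systems that are picky about alignment, and saving memory when you have hundreds of thousands of objects that could each be one of many types.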

Are you doing either of those? If not, then don't bother--alignment is irrelevant on a modern machine, and any memory savings are nearly so. Meanwhile, there are lots of problems with unions that can easily be avoided by just using structs instead.
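As one illustration of the kind of problem meant here, consider what happens when two members are written in turn. This is a minimal sketch with made-up type and function names. It assumes a typical platform where `long` has no trap representations, so reading the stale member merely returns garbage.

```c
/* Hypothetical pair of types holding the same two members. */
typedef union  { long i; double d; } number_union;
typedef struct { long i; double d; } number_struct;

/* Struct: members have separate storage, so i survives a write to d. */
long struct_int_after_double(void) {
    number_struct n;
    n.i = 42;
    n.d = 3.14;
    return n.i;
}

/* Union: members overlap, so writing d clobbers the bytes behind i.
   Returns nonzero only if i still reads back as 42. */
int union_keeps_int_after_double(void) {
    number_union n;
    n.i = 42;
    n.d = 3.14;
    return n.i == 42;
}
```

With the struct, the caller never has to remember which member was written last. With the union, that bookkeeping is the programmer's job, and the compiler will not complain when it is done wrong.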

PS for those of you who got here by searching for "unions are useless": I am a member of my local labor union. They've done good things.

