Tip 11: String literals
24 October 11. Part of a series of tips on POSIX and C. Start from the tip intro page, or get 21st Century C, the book based on this series.
level: intermediate string user
purpose: understand an annoying subtlety of C string handling
Here is a program that sets up two strings and prints them to the screen:
    #define _GNU_SOURCE //asprintf is a GNU/BSD extension; this asks stdio.h to declare it.
    #include <stdio.h>

    int main(){
        char *s1 = "Thread";
        char *s2;
        asprintf(&s2, "Floss");
        printf("%s\n", s1);
        printf("%s\n", s2);
    }
Both forms will leave a single word in the given string. However, the C compiler treats them in a very different manner, which can trip up the unaware.
Did you try the sample code in tip #10 that showed what strings are embedded into the program binary? In the example here, Thread would be such an embedded string, and s1 could thus point to a location in the executable program itself. How efficient--you don't need to spend run time having the system count characters or waste memory repeating information already in the binary. I suppose in the 1970s this mattered.
Both the baked-in s1 and the allocated-on-demand s2 behave identically for reading purposes, but you can't modify or free s1. Here are some lines you could add to the above example, and their effects:
    s2[0]='f'; //Switch Floss to lowercase.
    s1[0]='t'; //Segfault.
    free(s2);  //Clean up.
    free(s1);  //Segfault.
If you think of a bare string declared like "Thread" as pointing to a location in the program itself, then it makes sense that s1's contents will be absolutely read-only.
I honestly don't know how your compiler really handles a constant string, but it is a fine mental model to presume it is pointing to a location in the program, so writing to it is strictly forbidden.
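If you want the compiler to enforce that mental model for you, one option (not part of the example above, so treat it as a sketch) is to declare the pointer as const char *. The count_char helper below is a hypothetical name made up for illustration; it only reads its argument, so it accepts a literal or an allocated string alike.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count how often c appears in s.
   The parameter is const char *, so the function promises to read only
   and works the same for a string literal or a heap-allocated string. */
size_t count_char(const char *s, char c){
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; s++)
        if (*s == c) n++;
    return n;
}

/* Declaring the pointer const asks the compiler to reject writes
   through it at compile time, instead of failing at run time. */
const char *s1 = "Thread";
/* s1[0] = 't';    <- with the const above, the compiler refuses this line */
```

With this declaration, the bad write from the earlier list is caught when you compile rather than when you run. Casting the const away would still let you shoot yourself in the foot.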
Did you think this would be a series about why C is better than every other language in every way? If so, sorry to disappoint you. The difference between constant and variable strings is subtle and error-prone, and makes hard-coded strings useful only in limited contexts. I can't think of a scripting language where you would need to care about this distinction.
But here is one simple solution: strdup, which is POSIX-standard, and is short for string duplicate. Usage:
char *s3 = strdup("Thread");
The string Thread is still hard-coded into the program, but s3 is a copy of
that constant blob, and so can be freely modified as you wish.
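Here is a minimal sketch of that pattern wrapped in a function. The name lowercase_first is hypothetical, and the _POSIX_C_SOURCE line is an assumption about your build: some compiler settings need it before string.h will declare strdup.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: request POSIX declarations such as strdup */
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of text with its first
   character lowercased. The input, which may be a string literal,
   is never written to. The caller owns the result and must free() it. */
char *lowercase_first(const char *text){
    assert(text != NULL);
    char *copy = strdup(text);       /* writable duplicate of the input */
    if (copy == NULL) return NULL;   /* strdup returns NULL if allocation fails */
    if (copy[0] != '\0')
        copy[0] = (char)tolower((unsigned char)copy[0]);
    return copy;
}
```

Calling lowercase_first("Thread") hands back a fresh string that you can modify further and eventually pass to free, which are exactly the two things you could not do with s1.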