The book version

17 November 12.


Hi. I'm back.

In case you hadn't worked it out, I was out working on turning that tip-every-other-day series into a book, made from paper. That's now done, and the next several entries are going to be about statistical technique. This column here will be a few notes on the book and its production, in case you're interested in such things.

A cover in the O'Reilly style: white background, title (“21st Century C”) against purple background at top, black-and-white animal at bottom. The animal is the Cookie Monster.
Figure One: My mock-up of the cover, which never got used.

The book came out a week or two ago--just over a year and a month after I started all this. Here it is on Amazon, or buy it directly from O'Reilly, or buy the paper book from Amazon, then register the print book at O'Reilly and (at the Your Products Print Books page) upgrade to an electronic version.

You've got options, because print is increasingly just another medium for viewing PDFs. According to the acquiring editor at O'Reilly, most of their book sales are now non-paper editions. A tech press like O'Reilly is clearly ahead of the curve on this, but it's increasingly clear that that's just how it's gonna be, and we are in the transition period from paper-norm + electronic-niche to electronic-norm + paper-niche.

The O'Reilly back-end is of course all smoothly automated. I check out from their Subversion repository, add text, check back in, and next thing you know I have a publication-quality PDF on hand. They make some changes over there, and then what I wrote has an index. When we're all happy with the product, they change a few settings on the back-end, and the edition gets sent out to the world in mobi/PDF/&c. formats.

Yes, they still use Subversion, which was odd to me. Every time the editors and I had logistical troubles, it was over Subversion's trouble with merging simultaneous work.

So writing a book is now largely like printing any other PDF, plus the additional organization that goes into any major project.

Book format vs article/blog format
All that said, the book format is still unique and essential, for a few reasons.

The first is the concept of a largely self-contained work that goes into great detail about a subject. A series of articles or blog posts makes it easy to leave some topics unexplored, because the author either will get to them later or can just add a link or a reference to some other place that covers the topic in detail. Our expectations about a book are that it will cover the ground it chose to cover, in one place and with one consistent worldview and notation. If there are big holes, people will complain and leave two-star reviews on Amazon.

Have you gotten a login to Goodreads? I've found that reviews there tend to be a little more sincere and positive than reviews on larger sites. The reviews feel more like notes written to friends, as opposed to Amazon reviews, which read like an address delivered from a stage.

The second is that correcting an error that has been printed in a thousand copies is rather difficult. That forces another sort of discipline on the author--especially with a technical book where every page will have a few dozen factual statements that could be wrong.

So on both the table-of-contents level and the micro-detail level, there's a stronger incentive to get things right and produce a better quality of output.

Reconciling the two books
This is a blog about Modeling with Data, and here I am plugging another book.

They're both books involving C, so if you left it at that, they'd be overlapping. I really hope you aren't leaving it there.

But if you were using both as a C textbook, you might want to start in Modeling with Data, Chapter 2, which is a pretty complete basic C tutorial. Then, work through 21st Century C, which assumes basic C knowledge, and covers the environment and still more C tricks. I like the coverage of makefiles in 21st Cent. C better than Appendix A of MwD, so I hope you didn't buy MwD for the appendix on makefiles. At that point, the rest of MwD will look really easy and feel really comfortable because--maybe I've mentioned this before--better computing technique makes you a better statistician.

At the end of all that, you'll know everything I do, on top of your own prior knowledge. Books are great that way.



Replies: 3 comments

on Sunday, November 18th, Paul Gribble said

Hi Ben,
Congrats on the new book, looks fantastic.
Logistics Q: did you write it using LaTeX? Or something else?

on Sunday, November 18th, JC said

I'd just like to thank you for making the text freely available -- I'm not a programmer (yet?) but am interested and simply can't afford to take chances on buying books. Especially technical / reference books! So as an interested but poor person, thank you.

Awesome cover by the way ;o)

on Monday, November 19th, BK said

O'Reilly uses DocBook XML for everything, because it provides enough structure for reworking into their many formats.

Manual input of XML is really annoying, so they encouraged me to write in Asciidoc, which is a markdown-like setup that expands into DocBook. Asciidoc is much faster than writing XML, but somewhat unpredictable. (It uses ++ to indicate literal text, so what happens when you refer to C++? Multi-paragraph bullet points, especially those with a code sample, were also a special challenge.)

One often has to write literal XML into the Asciidoc; I avoided all that boilerplate by coming up with a set of m4 macros (see the entry on using m4 to automate boilerplate).

E.g., I would write one m4-laden .c file, then use one set of m4 macros to produce an XML-annotated version, and use another null set to remove all annotations, producing a clean version to be compiled, tested, and put online.
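
As a rough illustration of that two-macro-set idea, here is a minimal sketch in Python rather than m4. Everything in it is an assumption made for the example: the CALLOUT(n) marker, the expand/annotate/clean helpers, and the co element in the output are hypothetical stand-ins, not the macros actually used for the book.

```python
import re
from xml.sax.saxutils import escape

# One annotated source file. CALLOUT(n) is a made-up marker for this sketch.
SOURCE = """\
#include <stdio.h>

int main(void){ CALLOUT(1)
    printf("hello\\n"); CALLOUT(2)
}
"""

def expand(text, macros):
    """Replace each NAME(arg) with macros[NAME](arg), for every NAME in macros."""
    names = "|".join(re.escape(name) for name in macros)
    pattern = re.compile(r"\b(" + names + r")\(([^)]*)\)")
    return pattern.sub(lambda m: macros[m.group(1)](m.group(2)), text)

# Macro set 1: turn each marker into an XML annotation (element name is illustrative).
xml_macros = {"CALLOUT": lambda n: f'<co id="co-{n}"/>'}

# Macro set 2: the null set, where every marker expands to nothing.
null_macros = {"CALLOUT": lambda n: ""}

def annotate(text):
    """Escape &, <, > so the code is XML-safe, then expand the markers into XML."""
    return expand(escape(text), xml_macros)

def clean(text):
    """Drop all markers and the trailing spaces they leave behind."""
    lines = expand(text, null_macros).splitlines()
    return "\n".join(line.rstrip() for line in lines) + "\n"

annotated = annotate(SOURCE)   # version destined for the book's XML
compilable = clean(SOURCE)     # version to compile, test, and put online
```

The point of the design is that there is exactly one source of truth: the annotated file is the only thing edited by hand, and both the book listing and the compilable sample are derived from it, so they cannot drift apart.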

When the book was nearing production, they stripped away all the macros (Asciidoc & m4), leaving us with straight-up XML.

The XML gets converted to PDF on their side, using proprietary styles and fonts that I've never seen.

All in all, a very different experience from _Modeling with Data_, which I wrote in LaTeX using a style file provided by Princeton.
