
March 26, 2012

A Great Books Approach to Understanding Computers

This is something I wrote a couple of years ago, and want to keep around. Unfortunately, the site where I wrote it is going off the air shortly, so I'm moving it here to preserve it.

We are big fans of the Great Books approach in our homeschooling. The idea behind this approach is that certain works of the human mind are so transcendently great that they provide meaningful knowledge and mental or spiritual growth far out of their own time. For example, any two-dimensional geometry book you can find owes its existence to — indeed, is largely a restatement of — Euclid’s Elements. In fact, Elements was key to the development of logic, mathematics, and science. Should not such a powerful book be read by anyone who seeks to understand any of these domains? In a less technical field, is it possible to truly grok civics without reading Plato’s Republic, comparative religion without reading Augustine’s Confessions, or human conflict without reading Sun Tzu’s The Art of War? I think not.

Yet in my chosen field of endeavor, management information systems (a better name than “information technology,” which misses the point of focus), there is no widely accepted canon of work. Certainly, I have been exposed to the great works of computer science only through my own efforts, after I had discovered the great books approach to everything else, and that was well into my career. It was also about that time that I realized how bereft of theory my field is. Take programming, which is considered an art; yet does not an artist go to museums to view da Vinci’s or van Gogh’s work for himself? How else can he place his own creativity in an understandable context? (I suppose, looking at a great deal of what is considered art these days, that that process may have lapsed, however.) A programmer, though, is trained through dry examples and drier texts, by instructors who often know little more than their students about how computers actually work. This was not always the case, but today the levels of abstraction between the user and the machine are so sophisticated, so abstruse, that the vast majority of people in my field are functionally incompetent; that is, they can write working code, but it is not elegant code, and is typically bug-laden code, and is so frequently ill-designed that people accept as a matter of course that restarting their computer to fix a problem is going to be frequently required.

This is, quite simply, a disgrace.

This reading list is my attempt at compiling a list of books that everyone involved with computers as a professional endeavor should read, along with an explanation of why. (Some of these books belong on a general great books list, and some of them are in the Britannica list.) More to the point, the more of these that you read, the better you will understand your craft. The fewer of them you read, the less you know what you are attempting to do. If you have additional suggestions, or think that I am in some way off base in including some particular work, please let me know why.

George Boole, An Investigation of the Laws of Thought

Boole’s work was an attempt to explain how the human brain functions, how people think in a mechanical sense. It is the source of Boolean logic, which underlies all that computers are and do. The foundation of the computer is not the machine, but the logic that it embodies, and that logic owes an incalculable debt to Boole. Though Boole wrote many works of relevance to mathematicians and scientists, this is the work most relevant to understanding computers.
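
To make the claim about Boolean logic concrete, here is a minimal sketch (not taken from Boole's book) of integer addition built from nothing but AND, OR, and XOR on single bits. The function names and the 8-bit default width are assumptions chosen for illustration.

```python
def half_adder(a, b):
    """Add two bits; return (sum_bit, carry_bit)."""
    # XOR (written != for booleans) gives the sum, AND gives the carry.
    return a != b, a and b


def full_adder(a, b, carry_in):
    """Add two bits plus an incoming carry; return (sum_bit, carry_out)."""
    partial_sum, carry_1 = half_adder(a, b)
    total, carry_2 = half_adder(partial_sum, carry_in)
    return total, carry_1 or carry_2


def add_bits(x, y, width=8):
    """Add two non-negative integers one bit at a time.

    The shifts only pick bits out and put them back; the arithmetic itself
    is done entirely by the Boolean adders above. Any carry out of the top
    bit is discarded, as in fixed-width hardware.
    """
    carry = False
    result = 0
    for i in range(width):
        a = bool((x >> i) & 1)
        b = bool((y >> i) & 1)
        bit, carry = full_adder(a, b, carry)
        if bit:
            result |= 1 << i
    return result
```

Calling `add_bits(5, 3)` walks the same chain of adders that a hardware ripple-carry adder would, which is roughly the sense in which the logic, rather than the machine, is the foundation.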

Douglas Hofstadter, Gödel, Escher, Bach and Metamagical Themas

These are philosophy, or mathematics, or logic books; you pick. Anyway, both of these books are so fundamentally tied into the logic of problem-solving and the creation of algorithms that they are essential to understanding reasoning. And since reasoning is essential to understanding programming…. Plus, Metamagical Themas (which is actually a collection of essays) in particular is just fun.

Daniel Hillis, The Pattern on the Stone

I have found no better work for explaining in layman’s terms why computers are the way they are, how they work at a fundamental level, and what are their true constraints and possibilities.

Marvin Minsky, Computation: Finite and Infinite Machines

Once you’ve read Hillis’ The Pattern on the Stone to get the basic concepts of computers, Minsky’s classic work shows what is possible and what is not possible with computers. Minsky dives deeply into Turing machines, and the concept of the universal computer. The main thing that this book teaches is the limits of the possible, and so this book essentially describes the universe of problems that computers can solve, and those that they cannot.
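
For readers who have not met a Turing machine, the following is a small sketch of one, assuming a single tape, a rule table keyed by (state, symbol), and a hypothetical example machine that adds one to a binary number. The names `run_turing_machine`, `INCREMENT`, and `increment_binary` are illustrative and do not come from Minsky's text.

```python
def run_turing_machine(rules, tape, head=0, state="start", blank="_",
                       max_steps=10_000):
    """Run a one-tape Turing machine and return the final tape contents.

    rules maps (state, symbol) -> (symbol_to_write, move, next_state),
    where move is "L", "R", or "N" (stay put). The machine stops when it
    enters the state "halt".
    """
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            # We give up after a fixed budget rather than wait forever.
            raise RuntimeError("no halt within step limit")
        symbol = cells.get(head, blank)
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += {"L": -1, "R": 1, "N": 0}[move]
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)


# Example machine: add one to a binary number, starting at the rightmost bit.
INCREMENT = {
    ("carry", "1"): ("0", "L", "carry"),  # 1 + carry = 0, keep carrying left
    ("carry", "0"): ("1", "N", "halt"),   # 0 + carry = 1, done
    ("carry", "_"): ("1", "N", "halt"),   # ran off the left edge: new digit
}


def increment_binary(bits):
    """Increment a binary string by running the INCREMENT machine on it."""
    return run_turing_machine(INCREMENT, bits, head=len(bits) - 1, state="carry")
```

The step limit is a practical concession: the simulator cannot tell in general whether an arbitrary rule table will ever halt, which is one of the limits on computation that this kind of book explores.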

Donald Knuth, The Art of Computer Programming, all four volumes (and hopefully more before he passes)

I won’t lie to you: these books are rough going. But there is simply no better explanation anywhere of the algorithms that underlie computer science. It’s one thing to know to use a quicksort, and quite another entirely to know why that is not always the best choice, and which choices might be better in certain cases. It is not necessary to read these books to program, only to understand why programming works the way it does.
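
As one illustration of why quicksort is not always the best choice, the sketch below counts comparisons for a deliberately naive quicksort (first element as pivot, not in place) and for insertion sort. Both implementations are simplified assumptions written for this comparison, not Knuth's versions.

```python
def quicksort(items):
    """Naive quicksort; return (sorted_list, comparisons_made)."""
    a = list(items)
    if len(a) <= 1:
        return a, 0
    pivot, rest = a[0], a[1:]
    smaller, larger = [], []
    for x in rest:  # one comparison against the pivot per remaining element
        (smaller if x < pivot else larger).append(x)
    left, left_count = quicksort(smaller)
    right, right_count = quicksort(larger)
    return left + [pivot] + right, len(rest) + left_count + right_count


def insertion_sort(items):
    """Insertion sort; return (sorted_list, comparisons_made)."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] <= key:  # key is already in place relative to a[j]
                break
            a[j + 1] = a[j]  # shift the larger element one slot right
            j -= 1
        a[j + 1] = key
    return a, comparisons
```

On an already-sorted list of 200 items, this quicksort makes 19,900 comparisons while insertion sort makes 199, because a first-element pivot splits sorted input as unevenly as possible. Production quicksorts choose pivots more carefully, but the example shows the kind of case analysis the books teach.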

John Hennessy and David Patterson, Computer Architecture: A Quantitative Approach

This is the best work in existence on the design of modern computers. It explains in detail what makes systems cost-effective, how they are put together, and how they can be best utilized. This is not just a book for computer designers, but also a book for people who have to make purchase decisions, or architecture decisions on how to interconnect systems. It is a book, in short, of how to think about computer systems hardware.

Harold Abelson and Gerald Jay Sussman, Structure and Interpretation of Computer Programs

This book teaches the fundamentals of how to think about programming, as procedure and data. Starting from these first principles, it builds the structure of how to program, and in the process teaches how to think about problem solving, which is at the basis of programming. There is no better way I know to learn how to decompose a problem.

Brian Kernighan and Dennis Ritchie, The C Programming Language

Ordinarily, I would avoid books about specific programming languages, but this is not an ordinary book. This book contains the best tie between high-level languages and low-level computer constructs I have ever seen. It shows pointer operations amazingly well, and completely explores the structure of C-like languages (which include Java, C++, Objective-C, C# and a number of others, collectively the most popular languages in use today). Plus, unlike most programming books, this one is very concise, and has no fluff. You learn from this not so much the C language (though you learn that, too), but how to think about high-level languages.

Brian Kernighan and Rob Pike, The UNIX Programming Environment

Like The C Programming Language, this book is deep and broad. It covers the design of an obsolete system, which would seem to be an odd topic to be placed on a great books list. But here’s the deal: not only is this system the basis of a huge number of modern operating systems, so that the system itself still has relevance, this book teaches how operating systems work as a layered construct. After reading this book, you will be able to tackle any operating system as a user, or an administrator, or a manager with significantly more confidence, because you will understand how to think about operating systems. If you haven’t read this book, your odds of passing a systems administrator interview with me are slim.

Alfred Aho, Ravi Sethi and Jeffrey Ullman, Compilers

Once you know how to write in a high-level language, you need to know how it gets translated into terms the computer can understand. This book tells you how that happens, in all its gory detail. It’s a tough book to get through, but if the guys who created Microsoft’s INI file format had read it, maybe they would have learned enough about parsing to avoid that particular travesty.
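
As a rough picture of what that translation involves, here is a toy sketch: a tokenizer, a recursive-descent parser that emits instructions for an imaginary stack machine, and a tiny interpreter for those instructions. The grammar (integers, `+ - * /`, parentheses), the instruction names, and the use of integer division are all assumptions made for the example; a real compiler is far more involved.

```python
import operator
import re


def tokenize(source):
    """Split an expression into number, operator, and parenthesis tokens."""
    tokens = re.findall(r"\d+|[-+*/()]", source)
    if "".join(tokens) != source.replace(" ", ""):
        raise SyntaxError("unexpected character in " + repr(source))
    return tokens


def compile_expression(source):
    """Translate infix arithmetic into a list of stack-machine instructions."""
    tokens = tokenize(source)
    code = []
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def factor():  # factor := NUMBER | "(" expression ")"
        if peek() is None:
            raise SyntaxError("unexpected end of input")
        token = advance()
        if token == "(":
            expression()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        elif token.isdigit():
            code.append(("PUSH", int(token)))
        else:
            raise SyntaxError("unexpected token " + token)

    def term():  # term := factor (("*" | "/") factor)*
        factor()
        while peek() in ("*", "/"):
            op = advance()
            factor()
            code.append(("MUL",) if op == "*" else ("DIV",))

    def expression():  # expression := term (("+" | "-") term)*
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            code.append(("ADD",) if op == "+" else ("SUB",))

    expression()
    if peek() is not None:
        raise SyntaxError("unexpected token " + peek())
    return code


OPERATIONS = {"ADD": operator.add, "SUB": operator.sub,
              "MUL": operator.mul, "DIV": operator.floordiv}


def run(code):
    """Execute stack-machine instructions and return the final value."""
    stack = []
    for instruction in code:
        if instruction[0] == "PUSH":
            stack.append(instruction[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(OPERATIONS[instruction[0]](left, right))
    return stack.pop()
```

For example, `compile_expression("1 + 2 * 3")` yields pushes for 1, 2, and 3 followed by `MUL` and then `ADD`, so operator precedence has already been resolved by the time the machine runs.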

Brian Kernighan and Rob Pike, The Practice of Programming

There are a lot of books about how to program. This is a book about how to program elegantly, robustly, and efficiently. Frankly, if you have not read this book and understood it, or worked out the ideas for yourself through hard experience, then you probably wouldn’t pass any interview for a programmer that I would give.

Martin Fowler, Refactoring

This is one of those books that creates a new idea, and with it a new term, that then becomes generally accepted without being actually understood. I cannot tell you how many programmers I’ve had to hand this book to in order to show that it’s not just me talking about how to write good, compact, efficient, maintainable code. The thing is, having read this book, what you really take away is not just how to fix badly-designed code, but how to avoid badly designing code in the first place.

James Rumbaugh, et al, Object-Oriented Modeling and Design

Having toured high-level languages, we enter a new abstraction layer: objects. This book is the one that taught me how to think in terms of self-contained, reusable, bulletproof code. It is still unequalled. Modern GUIs and programming languages (including most infamously Java) depend on these concepts, and this book shows how to think about objects.

Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides (the Gang of Four), Design Patterns

When you write object-oriented code, there are certain needs that come up over and over and over again: a class that can only be instantiated once, a structured carrier of data between objects, an abstraction layer to hide the interface from the data, an abstraction layer to hide the data from the details of its physical storage, and so on. This book looks at the most common of these recurrent patterns, and shows how to solve each one correctly, so that you don’t have to work it out from scratch each time. You want to get your programmers writing better, cheaper and more reusable code? Have them read this.
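
The first need mentioned above, a class that can only be instantiated once, is usually called the Singleton pattern. Below is one minimal way it might look in Python; the `Configuration` class and its `settings` attribute are hypothetical, and this simple form is not thread-safe.

```python
class Configuration:
    """A class whose constructor always hands back the same object."""

    _instance = None  # the single shared instance, created on first use

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.settings = {}  # initialised once, then reused
        return cls._instance
```

Every call to `Configuration()` returns the same object, so a value stored through one reference is visible through any other. The book's point is that this problem, and its pitfalls, have already been worked out, so there is no need to rediscover them on each project.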

Grady Booch, James Rumbaugh, Ivar Jacobson, The Unified Modeling Language User Guide

I must admit at the outset that this is not the book that I wish it was; however, it is the closest approach I’ve yet seen. This book describes a software engineering approach to software design, which includes extensive modeling of objects and their interfaces and interactions. That is good. But it does so by propounding one particular representation (UML) as the only “right” way to do this, and also relies on a software engineering approach I consider fundamentally flawed, the Unified Process. Having seen a derivative of the Unified Process in operation, I assure you that it is the wrong way to design software, analogous to the military strategy of attacking into the teeth of the enemy’s defenses, and just as likely either to fail, or to succeed pyrrhically. For that reason, I recommend this book not as a methodology to be followed, but for its completeness in describing what must be considered in software design of object-oriented systems, and how to represent that in a universal notation for object interactions.

Alistair Cockburn, Agile Software Development

This book is the Mr. Hyde to the above book’s Mr. Jekyll. It describes a software design and implementation approach that delivers the highest effectiveness at the lowest cost of any I have seen or used. To the analogy above of the frontal attack, this is the corresponding deep penetration at an unexpected weak point: it is generally effective at very low cost, requires highly skilled practitioners to attempt, and is routinely dismissed by incompetents who generally understand only the right-up-the-middle approach. More specifically, this book describes how to design software iteratively, such that there is very quickly a working version in the hands of users, who then become partners in guiding the developers to a complete realization of the final vision for the project.

Bob Schmidt, Data Modeling for Information Professionals

This book is a great introduction to the theory of data modeling, and gives a sound base for developing logical, rational ways of persisting data. Moreover, this is a book that teaches you how to break down data into self-referential chunks, in the same way that object orientation breaks down process into self-referential chunks, which leads to a better understanding of how to manage data efficiently.

Douglas Comer, et al, Internetworking with TCP/IP (there are three volumes; this is the first)

You want to know how networks function, you come here. This is a three-volume set, and it is indispensable for a network professional or a systems architect.

Albert-Laszlo Barabasi, Linked

This book is an examination of networks. Not computer networks, per se, though those are covered, but all networks of any kind. There are several books about complexity theory (including Chaos, Wolfram’s A New Kind of Science, and The Tipping Point) that are useful to an understanding of the emergent behavior of networked systems, but Barabasi’s is the one most relevant to computer people. In particular, he covers the implications of networks for security (physical as well as electronic), employee retention and other immediately-applicable domains. If you are looking, for example, to craft an organization which is agile, responsive, and powerful, you have to be willing to give up central control. This book explains why the two are incompatible.

Eric Raymond, The New Hacker’s Dictionary

This is a book from which just about any computer person can benefit, because it explains why you see a lot of the terms you do (like why variables are often named “foo”). While it’s in the form of a jargon dictionary, I tend to think of it as an insight into the mind of a master computer wrangler. Plus, it’s a lot of fun.

Steven Levy, Hackers

A question I am often asked is, “How do I hire people like you?” My usual answer is to first hire a person like me. The problem is one of recursion: it takes a good computer person to recognize a computer person, and good computer people want to be surrounded by good computer people, so they’ll find and retain them. All that said, if you don’t know how to recognize a good computer person, read this book. It tells you the mindset and habits that make good computer people good, and will at least help you to tell the MCSE who thinks he knows what he’s doing when in fact he knows only the magic incantations for a few neat tricks from someone with a chance at being great.

Cliff Stoll, The Cuckoo’s Egg

This engaging story of tracking down crackers who broke into Stoll’s network is a must-read for system administrators. It gets you inside a master’s head, and teaches you how to think about security in a practical way.

Neal Stephenson, In the Beginning was the Command Line

I don’t know whether to describe this as an allegory of computer systems, a defense of the CLI, a history of operating systems, a cultural critique (particularly of multiculturalism), a theoretical examination of metaphor, or an essay on the psychology of choice. In any case, read it, even if it sounds a little outdated, given the emphasis on the now long-defunct BeOS.

Fred Brooks, The Mythical Man-Month

Brooks very thoroughly proves that throwing money and people at a problem makes it worse, not better. This is really an argument for competence in management as well as systems people.

Tom DeMarco, Peopleware

Something like 90% of large software projects fail; this is another book that explains why that is: it’s all about the people, what they can do and what they are allowed to do.

Strunk and White, The Elements of Style

Yes, I’m serious! Look, a computer program, or a programming project, or an integration project, or anything else in IT is dependent upon the ability to clearly describe what you want to happen, or what you have done, or how you plan to do something. Computer programming is an expressive act. This is a book about how to express yourself well. In fact, it’s the book on how to express yourself well, and is very similar to K&R’s The C Programming Language in a lot of ways.

Edward Tufte, The Visual Display of Quantitative Information

This is to metrics, presentations and graphics in general what Strunk and White is to language: a guide to clearly and concisely expressing yourself. If you loathe watching a PowerPoint presentation, it’s probably because the creator has never read this book.

Eric Raymond, The Cathedral and the Bazaar

This essay is, in many ways, a manifesto. And it’s a manifesto of a thought process that I don’t always agree with; in particular, I think that there is a place for non-free software (though I also think that place should be much smaller than it currently is). Even so, there is no better explanation of the idea of open source, no better advocacy of the position that software wants to be free.

Edward Yourdon, Death March

Yet another excellent book on why software projects and integration projects fail. In any organization where more than 10% of the projects fail, or where employee turnover exceeds 20%, this is a good place to start fixing the problems in the organization.

We have moved up the hierarchy from the ideas that underlie computing, to the machine, the concept of programming, the operating system, and then high-level program design. We then jumped over into databases, networking, systems integration and administration, and systems architecture. Finally, we looked at systems management. I think that this presents a fairly comprehensive treatment of all phases of information systems, and I certainly hope that it will prove of use. Note that I was following a particular progression here, and I think that it’s incomplete (particularly in the sense of underlying concepts of mathematics and history). For that reason, I strongly recommend that the reader also examine the books noted in Eisenberg’s Creating a Computer Science Canon, which is particularly strong in those areas.

And now a final word of practical advice: if you are going to hire someone for an IT position, and they haven’t read at least a few of these, reconsider.

Preserving the comments:

  1. Excellent list! I can’t disagree with a single one of those books.

    Here’s a few that I have liked. They mostly cover the same material as other books on your list, just from a different angle.

    “Operating System Concepts” by Abraham Silberschatz.

    “Implementing Lean Software Development” by Mary Poppendieck and Tom Poppendieck.

    “Patterns of Enterprise Application Architecture” by Martin Fowler.

    “The Pragmatic Programmer: From Journeyman to Master” by Andrew Hunt, David Thomas.

    Also, C will show you how a computer works, SICP will show you how computation works.

    Posted by Russell  on  03/10/2008  at  01:46 PM
  2. Learning C is akin to learning to play the piano. Except that instead of simply mastering scales and chords on a keyboard, you hit a series of objects in the room which then, as a side effect, bounce off one of the walls (or the ceiling, or the floor, or the vase sitting by the window) and onto the correct keys for you.

    :)

    Posted by IB Bill  on  03/10/2008  at  03:02 PM
  3. Bah! Real men don’t fiddle around in any of those namby-pamby “higher level” languages; they work right down on the bare metal. They write microcode—and they don’t comment it!

    Posted by Francis W. Porretto  on  03/10/2008  at  04:35 PM
  4. Heh. My first language was BASIC. My second was Assembler for the 68B09E. I had been programming Assembler for some four years before I learned FORTRAN, and it was another year before I learned Pascal. Since then, everything I’ve written has been Perl, C or a C derivative, or Java (blech).

    Russell, I have heard good things about The Pragmatic Programmer, but I specifically wanted to exclude books I haven’t read, or at least tried to work through, since I’m not really qualified to comment on them.

    Posted by Jeff Medcalf  on  03/10/2008  at  04:48 PM
  5. For a second there, I thought you were going to leave Brooks out. If you had, I would have had to completely discount everything you said ;)

    Posted by Chris Byrne  on  03/10/2008  at  10:24 PM
  6. A question relating to nothing in particular: as a Libertarian, do you believe the Fed should be manipulating the interest and bond rates to try to stave off furthering the recession or should we just let the market take care of itself?

    Posted by [email address hidden]  on  03/14/2008  at  03:15 PM
  7. I always hate questions in the form of “as a member of [group X], do you believe [A, B and C]?” To the extent that I am a member of a group, it is because I believe A, B and C that I am a member. So I cannot tell you what I think as a member of some group, only what I think as me. If that causes you (or for that matter me) to stick on a label of some group membership, so be it. In either case, it’s not primary.

    Now to answer the question. I think that the government should minimize interference in markets to the largest extent that they can. But remember that currency and bonds are not really a free market: their only supplier is the government. So to the extent that the government controls the “market” anyway, it should act responsibly to prevent fluctuations in that market from having bad effects. And to the extent that the government is interfering in true markets (such as home mortgages and the like), it should cut it out. Not just the latest interference, but all the interference, up to and including the granting or guaranteeing of loans in the first place, the regulation of building standards and lending standards and so forth. (Note that by “government” I mean the Federal government.) A market with fewer distortions would behave in a more healthy manner. The problem is that it’s so hard to unpack one little distortion being considered now from a whole raft of larger distortions already created that the question is nearly meaningless as a philosophical matter, as opposed to as a matter of practical policy.

    Posted by Jeff Medcalf  on  03/14/2008  at  04:39 PM
Posted by jeff at 7:36 PM