March 26, 2012

A Great Books Approach to Understanding Computers

This is something I wrote a couple of years ago, and want to keep around. Unfortunately, the site where I wrote it is going off the air shortly, so I'm moving it here to preserve it.

We are big fans of the Great Books approach in our homeschooling. The idea behind this approach is that certain works of the human mind are so transcendently great that they provide meaningful knowledge and mental or spiritual growth far out of their own time. For example, any two-dimensional geometry book you can find owes its existence to — indeed, is largely a restatement of — Euclid’s Elements. In fact, Elements was key to the development of logic, mathematics, and science. Should not such a powerful book be read by anyone who seeks to understand any of these domains? In a less technical field, is it possible to truly grok civics without reading Plato’s Republic, comparative religion without reading Augustine’s Confessions, or human conflict without reading Sun Tzu’s The Art of War? I think not.

Yet in my chosen field of endeavor, management information systems (a better name than “information technology,” which misses the point of focus), there is no widely accepted canon of work. Certainly, I have been exposed to the great works of computer science only through my own efforts, after I had discovered the great books approach to everything else, and that was well into my career. It was also about that time that I realized how bereft of theory my field is. Take programming, which is considered an art; yet does not an artist go to museums to view da Vinci’s or van Gogh’s work for himself? How else can he place his own creativity in an understandable context? (I suppose, looking at a great deal of what is considered art these days, that that process may have lapsed, however.) A programmer, though, is trained through dry examples, dryer texts, and by instructors who often know little more than their students about how computers actually work. This was not always the case, but today the levels of abstraction between the user and the machine are so sophisticated, so abstruse, that the vast majority of people in my field are functionally incompetent; that is, they can write working code, but it is not elegant code, and is typically bug-laden code, and is so frequently ill-designed that people accept as a matter of course that restarting their computer to fix a problem is going to be frequently required.

This is, quite simply, a disgrace.

This reading list is my attempt at compiling a list of books that everyone involved with computers as a professional endeavor should read, along with an explanation of why. (Some of these books belong on a general great books list, and some of them are in the Britannica list.) More to the point, the more of these that you read, the better you will understand your craft. The fewer of them you read, the less you know what you are attempting to do. If you have additional suggestions, or think that I am in some way off base in including some particular work, please let me know why.

George Boole, An Investigation of the Laws of Thought

Boole’s work was an attempt to explain how the human brain functions, how people think in a mechanical sense. It is the source of boolean logic, which underlies all that computers are and do. The foundation of the computer is not the machine, but the logic that it embodies, and that logic owes an incalculable debt to Boole. Though Boole had many works of relevance to mathematicians and scientists, this is the work most relevant to understanding computers.

Douglas Hofstadter, Gödel, Escher, Bach and Metamagical Themas

These are philosophy, or mathematics, or logic books; you pick. Anyway, both of these books are so fundamentally tied into the logic of problem-solving and the creation of algorithms that they are essential to understanding reasoning. And since reasoning is essential to understanding programming…. Plus, Metamagical Themas (which is actually a collection of essays) in particular is just fun.

Daniel Hillis, The Pattern on the Stone

I have found no better work for explaining in layman’s terms why computers are the way they are, how they work at a fundamental level, and what are their true constraints and possibilities.

Marvin Minsky, Computation: Finite and Infinite Machines

Once you’ve read Hillis’ The Pattern on the Stone to get the basic concepts of computers, Minsky’s classic work shows what is possible and what is not possible with computers. Minsky dives deeply into Turing machines, and the concept of the universal computer. The main thing that this book teaches is the limits of the possible, and so this book essentially describes the universe of problems that computers can solve, and those that they cannot.

Donald Knuth, The Art of Computer Programming, all four volumes (and hopefully more before he passes)

I won’t lie to you: these books are rough going. But there is simply no better explanation anywhere of the algorithms that underly computer science. It’s one thing to know to use a quicksort, and quite another entirely to know why that is not always the best choice, and which choices might be better in certain cases. It is not necessary to read these books to program, only to understand why programming works the way it does.

John Hennessy and David Patterson, Computer Architecture: A Quantitative Approach

This is the best work in existence on the design of modern computers. It explains in detail what makes systems cost-effective, how they are put together, and how they can be best utilized. This is not just a book for computer designers, but also a book for people who have to make purchase decisions, or architecture decisions on how to interconnect systems. It is a book, in short, of how to think about computer systems hardware.

Harold Abelson and Jay Sussman, The Structure and Interpretation of Computer Programs

This books teaches the fundamentals of how to think about programming, as procedure and data. Starting from these first principles, it builds the structure of how to program, and in the process teaches how to think about problem solving, which is at the basis of programming. There is no better way I know to learn how to decompose a problem.

Brian Kernighan and Dennis Ritchie, The C Programming Language

Ordinarily, I would avoid books about specific programming languages, but this is not an ordinary book. This book contains the best tie between high-level languages and low-level computer constructs I have ever seen. It shows pointer operations amazingly well, and completely explores the structure of C-like languages (which include Java, C++, Objective-C, C# and a number of others, collectively the most popular languages in use today). Plus, unlike most programming books, this one is very concise, and has no fluff. You learn from this not so much the C language (though you learn that, too), but how to think about high-level languages.

Brian Kernighan and Rob Pike, The UNIX Programming Environment

Like The C Programming Language, this book is deep and broad. It covers the design of an obsolete system, which would seem to be an odd topic to be placed on a great books list. But here’s the deal: not only is this system the basis of a huge number of modern operating systems, so that the system itself still has relevance, this book teaches how operating systems work as a layered construct. After reading this book, you will be able to tackle any operating system as a user, or an administrator, or a manager with significantly more confidence, because you will understand how to think about operating systems. If you haven’t read this book, your odds of passing a systems administrator interview with me are slim.

Alfred Aho, Ravi Sethi and Jeffrey Ullman, Compilers

Once you know how to write in a high-level language, you need to know how it gets translated into terms the computer can understand. This book tells you how that happens, in all its gory detail. It’s a tough book to get through, but if the guys who created Microsoft’s INI file format had read it, maybe they would have learned enough about parsing to avoid that particular travesty.

Brian Kernighan and Rob Pike, The Practice of Programming

There are a lot of books about how to program. This is a book about how to program elegantly, robustly, and efficiently. Frankly, if you have not read this book and understood it, or worked out the ideas for yourself through hard experience, then you probably wouldn’t pass any interview for a programmer that I would give.

Martin Fowler, Refactoring

This is one of those books that creates a new idea, and with it a new term, that then becomes generally accepted without being actually understood. I cannot tell you how many programmers I’ve had to hand this book to point out that it’s not just me talking about how to write good, compact, efficient, maintainable code. The thing is, having read this book, what you really take away is not just how to fix badly-designed code, but how to avoid badly designing code in the first place.

James Rumbaugh, et al, Object-Oriented Modeling and Design

Having toured high-level languages, we enter a new abstraction layer: objects. This book is the one that taught me how to think in terms of self-contained, reusable, bulletproof code. It is still unequalled. Modern GUIs and programming languages (including most infamously Java) depend on thiese concepts, and this book shows how to think about objects.

Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides (the Gang of Four), Design Patterns

When you write object-oriented code, there are certain needs that come up over and over and over again: a class that can only be instantiated once, a structured carrier of data between objects, an abstraction layer to hide the interface from the data, an abstraction layer to hide the data from the details of its physical storage, and so on. This book looks at the most common of these recurrent patterns, and shows how to solve each one correctly, so that you don’t have to work it out from scratch each time. You want to get your programmers writing better, cheaper and more reusable code? Have them read this.

Grady Booch, James Rumbaugh, Ivar Jacobson, The Unified Modeling Language User Guide

I must admit at the outset that this is not the book that I wish it was; however, it is the closest approach I’ve yet seen. This book describes a software engineering approach to software design, which includes extensive modeling of objects and their interfaces and interactions. That is good. But it does so by propounding one particular representation (UML) as the only “right” way to do this, and also relies on a software engineering approach I consider fundamentally flawed, the Unified Process. Having seen a derivative of the Unified Process in operation, I assure you that it is the wrong way to design software, analogous to the military strategy of attacking into the teeth of the enemy’s defenses, and just as likely either to fail, or to succeed pyrrhicly. For that reason, I recommend this book not as a methodology to be followed, but for its completeness in describing what must be considered in software design of object-oriented systems, and how to represent that in a universal notation for object interactions.

Alistair Cockburn, Agile Software Development

This book is the Mr. Hyde to the above book’s Mr. Jekyll. It describes a software design and implementation approach that works at the lowest cost to highest effectiveness of any I have seen or used. To the analogy above of the frontal attack, this is the corresponding deep penetration at an unexpected weak point: it is generally effective at very low cost, requires highly skilled practitioners to attempt, and is routinely dismissed by incompetents who generally only understand the right up the middle approach. More specifically, this book describes how to design software iteratively, such that there is very quickly a working version in the hands of users, who then become partners in guiding the developers to a complete realization of the final vision for the project.

Bob Schmidt, Data Modeling for Information Professionals

This book is a great introduction to the theory of data modeling, and gives a sound base for developing logical, rational ways of persisting data. Moreover, this is a book that teaches you how to break down data into self-referential chunks, in the same way that object orientation breaks down process into self-referential chunks, which leads to a better understanding of how to manage data efficiently.

Douglas Comer, et al, Internetworking with TCP/IP (there are three volumes; this is the first)

You want to know how networks function, you come here. This is a three-volume set, and it is indispensable for a network professional or a systems architect.

Albert-Laszlo Barabasi, Linked

This book is an examination of networks. Not computer networks, per se, though those are covered, but all networks of any kind. There are several books about complexity theory (including Chaos, Wolfram’s book (Towards a new Science?), Tipping Point) that are useful to an understanding of the emergent behavior of networked systems, but Barabasi’s is the one most relevant to computer people. In particular, he covers the implications of networks on security (physical as well as electronic), employee retention and other immediately-applicable domains. If you are looking, for example, to craft an organization which is agile, responsive, and powerful, you have to be willing to give up central control. This book explains why the two are incompatible.

Eric Raymond, The Complete Hacker’s Dictionary

This is a book from which just about any computer person can benefit, because it explains why you see a lot of the terms you do (like why variables are often named “foo”). While it’s in the form of a jargon dictionary, I tend to think of it as an insight into the mind of a master computer wrangler. Plus, it’s a lot of fun.

Steven Levy, Hackers

A question I am often asked is, “How do I hire people like you?” My usual answer is to first hire a person like me. The problem is one of recursion: it takes a good computer person to recognize a computer person, and good computer people want to be surrounded by good computer people, so they’ll find and retain them. All that said, if you don’t know how to recognize a good computer person, read this book. It tells you the mindset and habits that make good computer people good, and will at least help you to tell the MCSE who thinks he knows what he’s doing when in fact he knows only the magic incantations for a few neat tricks from someone with a chance at being great.

Cliff Stoll, The Cuckoo’s Egg

This engaging story of tracking down crackers who broke into Stoll’s network is a must-read for system administrators. It gets you inside a master’s head, and teaches you how to think about security in a practical way.

Neal Stephenson, In the Beginning was the Command Line

I don’t know whether to describe this as an allegory of computer systems, a defense of the CLI, a history of operating systems, a cultural critique (particularly of multiculturalism), or a theoretical examination of metaphor, or an essay on the psychology of choice. In any case, read it, even if it sounds a little outdated, given the emphasis on the now long-defunct BeOS.

Fred Brooks, The Mythical Man-Month

Brooks very thoroughly proves that throwing money and people at a problem makes it worse, not better. This is really an argument for competence in management as well as systems people.

Tom DeMarco, Peopleware

Something like 90% of large software projects fail; this is another book that explains why that is: it’s all about the people, what they can do and what they are allowed to do.

Strunk and White, The Elements of Style

Yes, I’m serious! Look, a computer program, or a programming project, or an integration project, or anything else in IT is dependent upon the ability to clearly describe what you want to happen, or what you have done, or how you plan to do something. Computer programming is an expressive act. This is a book about how to express yourself well. In fact, it’s the book on how to express yourself well, and is very similar to K&R’s The C Programming Language in a lot of ways.

Edward Tufte, The Visual Display of Quantitative Information

This is to metrics, presentations and graphics in general what Strunk and White is to language: a guide to clearly and concisely expressing yourself. If you loathe watching a Powerpoint presentation, it’s probably because the creator has never read this book.

Eric Raymond, The Cathedral and the Bazaar

This essay is, in many ways, a manifesto. And it’s a manifesto of a thought process that I don’t always agree with; in particular, I think that there is a place for non-free software (though I also think that place should be much smaller than it currently is). There is no better explanation of the idea of open source, no better advocacy of the position that software wants to be free.

Edward Yourdon, Death March

Yet another excellent book on why software projects and integration projects fail. In any organization where more than 10% of the projects fail, or where employee turnover exceeds 20%, this is a good place to start fixing the problems in the organization.

We have moved up the hierarchy from the ideas that underlie computing, to the machine, the concept of programming, the operating system, and then high-level program design. We then jumped over into databases, networking, systems integration and administration, and systems architecture. Finally, we looked at systems management. I think that this presents a fairly comprehensive treatment of all phases of information systems, and I certainly hope that it will prove of use. Note that I was following a particular progression here, and I think that it’s incomplete (particularly in the sense of underlying concepts of mathematics and history). For that reason, I strongly recommend that the reader also examine the books noted in Eisenberg’s Creating a Computer Science Cannon, which is particularly strong in those areas.

And now a final word of practical advice, if you are going to hire someone for an IT position, and they haven’t read at least a few of these, reconsider.

Preserving the comments:

  1. Excellent list! I can’t disagree with a single one of those books.

    Here’s a few that I have liked. They mostly cover the same material as other books on your list, just from a different angle.

    “Operating System Concepts” by Abraham Silberschatz.

    “Implementing Lean Software Development” by Mary Poppendieck and Tom Poppendieck.

    “Patterns of Enterprise Application Architecture” by Martin Fowler.

    “The Pragmatic Programmer: From Journeyman to Master” by Andrew Hunt, David Thomas.

    Also, C will show you how a computer works, SICP will show how you computation works.

    Posted by Russell  on  03/10/2008  at  01:46 PM
  2. Learning C is akin to learning to play the piano. Except that instead of simply mastering scales and chords on a keyboard, you hit a series of objects in the room which then, as a side effect, bounce off one of the walls (or the ceiling, or the floor, or the vase sitting by the window) and onto the correct keys for you.

    smile

    Posted by IB Bill  on  03/10/2008  at  03:02 PM
  3. Bah! Real men don’t fiddle around in any of those namby-pamby “higher level” languages; they work right down on the bare metal. They write microcode—and they don’t comment it!

    Posted by Francis W. Porretto  on  03/10/2008  at  04:35 PM
  4. Heh. My first language was BASIC. My second was Assember for the 68B09E. I had been programming Assembler for some four years before I learned FORTRAN, and it was another year before I learned Pascal. Since then, everything I’ve written has been Perl, C or a C derivative, or Java (blech).

    Russell, I have heard good things about The Pragmatic Programmer, but I specifically wanted to exclude books I haven’t read, or at least tried to work through, since I’m not really qualified to comment on them.

    Posted by Jeff Medcalf  on  03/10/2008  at  04:48 PM
  5. For a second there, I thought you were going to leave Brooks out. If you had, I would have had to completely discount everything you said wink

    Posted by Chris Byrne  on  03/10/2008  at  10:24 PM
  6. A question relating to nothing in particular: as a Libertarian, do you believe the Fed should be manipulating the interest and bond rates to try to stave off furthering the recession or should we just let the maket take care of itself?

    Posted by .(JavaScript must be enabled to view this email address)  on  03/14/2008  at  03:15 PM
  7. I always hate questions in the form of “as a member of [group X], do you believe [A, B and C]?” To the extent that I am a member of a group, it is because I believe A, B and C that I am a member. So I cannot tell you what I think as a member of some group, only what I think as me. If that causes you (or for that matter me) to stick on a label of some group membership, so be it. In either case, it’s not primary.

    Now to answer the question. I think that the government should minimize interference in markets to the largest extent that they can. But remember that currency and bonds are not really a free market: their only supplier is the government. So to the extent that the government controls the “market” anyway, it should act responsibly to prevent fluctuations in that market from having bad effects. And to the extent that the government is interfering in true markets (such as home mortgages and the like), it should cut it out. Not just the latest interference, but all the interference, up to and including the granting or guaranteeing of loans in the first place, the regulation of building standards and lending standards and so forth. (Note that by “government” I mean the Federal government.) A market with fewer distortions would behave in a more healthy manner. The problem is that it’s so hard to unpack one little distortion being considered now from a whole raft of larger distortions already created that the question is nearly meaningless as a philosophical matter, as opposed to as a matter of practical policy.

    Posted by Jeff Medcalf  on  03/14/2008  at  04:39 PM
Posted by jeff at 7:36 PM | TrackBack

March 31, 2009

SVN on MacOS X 10.5

OK, so I was setting up SVN on a MacOS X Leopard (10.5) server, and had a couple of problems. Most of these were relatively easily solved, but here are the secrets that are not revealed by Apple's instructions.

You cannot restart the server once you enable the mod_dav_svn module, until you have at least one site configured to allow WebDAV (under the Options pane in the site config on Server Admin). You might also have to allow folder listing; I don't recall if that was a problem or not.

You have to edit the sites file to do more than put in the DAV and SVNParentPath directives, because the file is not written correctly.

You MUST MUST MUST comment out the ErrorDocument line in the site configuration, or you will get an error like:

Error: 2 (No such file or directory) Description: PROPFIND request failed on 'your project'

(and another that the SVN directory could not be opened)

There's a good two hours gone, almost all on that last part.

Posted by jeff at 11:16 AM | TrackBack

April 15, 2007

Topic Notes: Projects that Fail

One of my pet peeves is enterprise software projects that fail. In my opinion, the only possible reason for an IT project to fail is incompetence, either on the technical staff or (far more likely) the project or above management. The most common management incompetence is the inability to say "no". (And as I am fond of pointing out in other contexts, if you have a manager who can't say "no", it's a management problem one level higher.) All of this is just a roundabout way to introduce Karl Gallagher's insightful essay on how excess requirements and misclassification of requirements kill projects.

It's actually a part of a piece I've been thinking on for some time: why it is that non-IT staff produce far more projects that work for end users than IT staff can produce.

By the way, I'll be writing over at Eternity Road as well, per Fran Porretto's gracious invitation. I haven't figured out which stuff I do will go where, except that the political stuff will probably go over there, and the IT stuff will probably be here or cross-posted, depending on depth.

Posted by jeff at 1:38 PM | TrackBack

February 28, 2007

Thoughts on Programming's Future

I came across this interview with Richard Gabriel, from 2002, that takes a wonderful view of one possible direction for programming:

So, because you can program well or poorly, and because most of it is creative (in that we don't really know what we're doing when we start out), my view is that we should train developers the way we train creative people like poets and artists. People may say,"Well, that sounds really nuts." But what do people do when they're being trained, for example, to get a Master of Fine Arts in poetry? They study great works of poetry. Do we do that in our software engineering disciplines? No. You don't look at the source code for great pieces of software. Or look at the architecture of great pieces of software. You don't look at their design. You don't study the lives of great software designers. So, you don't study the literature of the thing you're trying to build.

Second, MFA programs create a context in which you're creating while reflecting on it. For example, you write poetry while reading and critiquing other poetry, and while working with mentors who are looking at what you're doing, helping you think about what you're doing and working with you on your revisions. Then you go into writers' workshops and continue the whole process, and write many, many poems under supervision in a critical context, and with mentorship. We don't do that with software.

I was talking to Mark Strand, who is one of the first poets who mentored me, and he said, more or less, that how good you are depends on how many poems you've written in your life. About two and a half years ago, I started writing a poem a day, and I've gotten way better since I started doing that. And so, I've probably written about 1000 poems in my life so far, almost all of them in the last two years.

Compare that to how many programs someone has written before they're considered a software developer or engineer. Have they written 1000? No, they've probably written 50. So, the idea behind the MFA in software is that if we want to get good at writing software, we have to practice it, we have to have a critical literature, and we have to have a critical context.

That, more or less, is a take on programming as a creative discipline. It is the classical education approach applied to computing, and it is a beautiful and wonderful idea. The part of me that wrote a lot of poems as a young man, the part that enjoys role-playing games and fiddling with software after everyone is asleep, really likes that approach.

But the part of me that goes to work every day, and looks at building robust systems that work, realizes that that approach to training programmers will only work for a small number: it's too difficult, time-consuming, and expensive. The immediate input is money and hard work, the output gain is in the far future, and is more variable and nebulous. Only a very small proportion of IT people could benefit — the future leaders and mentors. It is important, but not comprehensive, towards solving the software crisis.

(Background note: the software crisis is a term for the problem that it is nearly impossible to write programs that are not buggy, and even techniques like OO programming and formal design cannot help you past a few million lines, by which point the program is essentially guaranteed to be unrunnable because of bugs.)

Another approach is looking at programming languages. I had an epiphany a few weeks ago, wherein I realized that a large common element of many of the design problems being encountered in projects I work with is not that the developers are not good at figuring out how to correctly factor business logic, but that business logic is not an object or a characteristic of an object, so expressing it as objects is fundamentally wrong. Object orientation is just too simple to provide the correct abstractions beyond the model layer and the display-interaction part of the GUI (which is inherently physical). It cannot express business logic (processes, procedures, constraints, timing sequences, coupled rule sets and the like) appropriately.

In looking around for people already working on this, one of my colleagues pointed me to Sun's "metaphors" concept. I think that, from an enterprise software point of view, fixing the problem at this level — ending the artificial separation of design and coding, and allowing units of execution to run anywhere in an organic fashion (delegating deployment and execution decisions by rule set rather than by fixed implementation, for example) — would be a great step forwards. The problem that this attacks is that reusability is hard, currently. Take a simple idea of an object, and put it into a radically different infrastructure than what it was built for, and it often fails. Consider a system with a singleton logger, running on two different systems. Which gets the singleton, and how does the other communicate with it? Or do they each get one singleton, and hope nothing breaks?

In the end, I think that there are three fundamental changes that have to happen before software creation can become robust and inexpensive: a better job must be done training developers, system administrators and technical management, particularly at the beginning of their careers; a better programming paradigm must be developed to simplify the creation and maintenance of large-scale, distributed mixed behavioral/data-centric applications; and we must figure out a way to get people to actually start using the various tools that have become available over the past few decades to simplify the process of system development and integration.

On that last point, two of the little-discussed reasons why engineering software is not like engineering a bridge, are that project managers over bridges are usually qualified engineers in their own right, and that a project manager building a bridge would never dream of saying something like, "We're behind schedule, so we're leaving out half the supports on this one span to meet our deadlines." That happens — all the time — in software development. Posted by jeff at 2:59 PM | Comments (2) | TrackBack

February 19, 2007

The Most Common, and Most Costly Possible, IT Project Management Failure

So what invidious failure could it be to merit the title of most common and most costly IT project management failure? Is it staffing unqualified people? No, that can be fixed without even throwing away a version of whatever it is you're building. Is it bad requirements gathering? Not really, though that is a common failure. Usually, bad requirements gathering means you throw the first one away and build the second to the now properly-gathered requirements. You may end up embarrassed, even fired, but from a corporate point of view it's hardly the end of the world. Is it inadequate testing? Nope, though that, too, is pretty bad and pretty common. An inadequately-tested application means that your customers are debugging the application instead of your programmers. (In other words, think of almost any released Microsoft product.) The bugs that are found can be fixed with point updates — this is painful and expensive, but there are worse project management monsters to slay.

No, the most common and most costly possible IT project management failure is simply managing to dates instead of work effort. This is the gift that just keeps on giving: you pay for the software when you build it badly because you are too rushed to build it well, and again with every fix and every patch and every feature added thereafter. You pay for it over and over again, until you stop.

Let me give you a real-life example. A project I am peripherally involved with (that is, I work with the primary reviewer, and occasionally review; I neither design nor manage nor even approve) has gone through the following cycle:

  1. The first design was horrible. There were hundreds of review comments from the first design review. Many of the basic ideas behind data modeling and object-oriented programming were clearly either not understood, or were so badly expressed as to be almost comically bad. It was very, very clear that the program would be at least twice as large as it had to be, and probably larger, and that it would be so complex as to be nearly impossible to maintain with any efficiency.
  2. The right action at this point would have been to can the design, can the designers, and start over. OK, I probably would have given the designers general feedback and then had them start over, canning them if they still obviously didn't get it. Nonetheless, the key item here is that the design was clearly flawed. (Moreover, it came about during design reviews that the designers and the customers disagreed on what the requirements meant, and the designers were arguing requirements with the customer!) Clearly, the design should not be pursued.
  3. You can already see it coming: the design was used. Why? Because there was a deadline to meet, and the prior design group that was canned had ended up pushing the project well behind schedule. But it's worse than that: not only was the design used, but the project manager decided that the number of comments on the design document meant that the document was flawed, rather than the design, and so canned the "documentation" effort until after coding. Besides, it was already partly coded, even though design wasn't done. Did I mention yet that the requirements gathering done by the first group was terrible, but was used going forwards anyway because of, wait for it, lack of time in the schedule to redo it right. That, by the way, is why the designers ended up arguing requirements with the customer.
  4. Anyway, the beat goes on, and application coding is completed. Now it's time to document the design. Except no one actually knows what the design is. Thus, every design document produced creates more questions than it answers, and the review architects begin to get slowly sucked into the weeds, all the way down to the code more than once. At times, the review architects are creating diagrams to check their understanding, because the design team can't produce the diagrams themselves!
  5. In parallel with the documentation and reviews noted, the application goes through testing. When the user acceptance tests come, the application is rejected utterly. The users find so many functional and security gaps that they refuse to let it be deployed. There wasn't time to design it well, so now the company has to invest in emergency fixes and bringing on extra people and staffing 24 hours and so forth, just to get it deployed at the pilot sites (six of the 180-some sites that will eventually, theoretically, have the application).
  6. There is a follow-on project, that adds significant capabilities and is scheduled for some six months out from the point that the users reject the prior version. Clearly, the user issues need to be fixed immediately to allow wider deployment. Clearly, there's no time to go fix even the most egregious flaws in the architecture and design. Clearly, there's no time to document this before we code it. I'm sure you can see where this is going. The originally-planned follow-on project is pushed out just long enough to put in an interim project, intended to fix the most critical bugs and add the most critical features required of, but not delivered by, the recently-rejected version. And because this has to happen quickly to meet the schedule, the documentation (ie, the design work) is again pushed to the back of the effort, and no time is available for architectural fixes. I can buy that, except that there is still denial going on about the inevitable result: we are digging deeper into the hole of bad design we are already in, and for the same reasons.
  7. By this point, as we are in testing for the interim version, everyone agrees that the design is deeply flawed. The critical flaws that were called out in the original design review, and ignored, more than a year prior have now resulted in bugs that have had the application down for weeks at the pilot sites, as well as a squadron of emergency fixes to correct all kinds of previously-identified, and ignored, issues. But now we're even further behind, because we made some assumptions in November to meet our schedule for February, that required another team to deliver something in October that they told us up front they could not deliver until January. The schedule demanded, and then when reality delivered as promised, more slippages and emergency work followed, as night follows day. So now, let's look ahead to the next version, the one that was pushed out to slot in the interim version.
  8. Are we going to fix the architectural issues? No; no time is available in the schedule for this, because we have to implement these new features for the business, and we were supposed to have delivered them already. So we will have time to design before we code, but we will not have time to fix anything already identified. Indeed, by and large the new team (much more technically competent overall than the old team) will begin by extending the problematic parts of the old design, making the hole deeper, instead of refactoring first.
  9. There is, now, clearly a need identified for a version beyond the one already in design. There are numerous fixes that need to be put in place, as well as adding in sufficient functionality to bring us up to where the original version was supposed to be. This version is about to go into planning, and is scheduled for end-of-year delivery. Now, we've spent three+ years and millions of dollars to build something that should have been done in one year for about a million dollars, and every time we've had to fix or extend the system, we've paid more for it, and taken more time, than we needed to do. Now, it seems, there is finally a version with time and scope for fixes, because we can rearchitect and refactor, then add in the functions, faster and cheaper than we can just add in the functions.
  10. By now, you should have figured this out. If we take the time to rearchitect and refactor, that's time, in the project managers' minds, that is not available for adding in the new functionality that's needed, and all the estimates tell us that every moment is necessary to meet the current schedules. No argument that we are actually shrinking the schedule and budget will be entertained, because there is no connection in the PM's heads between code complexity and size on the one hand, and cost to maintain and extend on the other.

Yeah, I had a terrible day at work. Why do you ask?

Posted by jeff at 9:17 PM | Comments (1) | TrackBack

February 11, 2007

Why Software Tends to be Bad

Designing and building good software is difficult. If you doubt this, answer the following:

  • Do you have/have you used more software with bad interfaces (confusing, hidden features, too much exposed functionality, weird tab orders) or good interfaces (clean, consistent, exposing what is important and hiding power-user options)?
  • Do you have/have you used more software with or without noticeable bugs or design flaws? How about with or without occasional major bugs or design flaws that cause lost data, including application crashes?
  • How often do you just give up on using software because you can't figure it out, or it's too cumbersome, or it behaves badly?
  • How often do you need technical support to get beyond the most rudimentary features of your software?

The problem is that software is terribly prone to failure. Engineers express the degree to which a mechanical system is prone to failure in terms of the number of moving parts and the total number of parts, moving or otherwise. A moving part (think of the wheel bearings in your car) wears down over time, and thus can fail. Properly designed, a non-moving part will not fail under its design loads, but the more parts that there are, the more chance that there will be some flaw in design, manufacture or assembly. The software analogy of a part, in modern languages, is a semicolon; every programmatic statement in languages like C/C++, Java, and Perl terminates with a semicolon. The software analogy of a moving part is code that can change its behavior as circumstances (such as data) change, or code only exercised under uncommon circumstances (and thus that might not be reached by test cases). The number of parts and particularly of "moving parts" in software is far, far larger than any mechanical system, and thus software is inherently more prone to failure than mechanical systems.

This complexity can be mitigated in many ways. These include creation and reuse of standard components for standard tasks or entities (and the consequent multiple cycles of refinement), use of standardized ways of doing common tasks, encapsulation and abstraction, proper requirements gathering and test case development/execution, rigorous unit testing that reaches every branch, automated code analysis, and good logical design practice. The three very powerful tools that have arisen from various combinations of these mitigation techniques and tools are best practices for application design, standardized design patterns, and object-oriented coding practices.

Sadly, in the real world, these practices are more honored in the breach. In part, this arises, in software developed by companies, from the fact that most computer people are magicians, who don't understand these tools, or bureaucrats, who don't want to pay for using them. Proper coding is expensive, and it's often difficult to convince people that it's easier and cheaper to design and code correctly once, than to redesign and recode several times.

In academic and open source software, which attracts vastly more artists, the underlying code is often wonderful, while the interfaces and reports are miserable and the software is incredibly difficult to use. I have actually heard university-based programmers say that if you can't understand their software's interface, the problem is yours, rather than the interface's. If the measure of software's utility is how widely it's used within its problem domain, academic software tends to be the least useful code written.

Of particular concern to businesses, heavyweight development methodologies are expensive, because they assume that people will make mistakes, and mitigate this tendency by making people do sufficient verification work before coding to (theoretically) ensure that mistakes are caught. Agile development is much cheaper, but only works if your people are in the top few per cent of the industry (which makes them more expensive to employ, of course), your development is done in-house, and you have good or at least well-understood business processes already in place.

Building software to do more than you need today is more expensive than building software for only your current needs. Building services and libraries saves you in the long run because you only write them once. The practical upshot is that most managers tend to want to use agile development methods even though their staff is incapable of doing so, or their business processes are immature; and most managers tend to want to ignore reusable code because their budget and schedule are based on this project, not the next one.

But academically-developed software, and much open source, is built by programmers for programmers with very little attention to usability. In some cases, such as for code libraries or faceless servers, this works very well. In others, such as for finished desktop applications, it often works very, very badly. Software that is perfect, but unusable, is not any better (except for strip mining the base code) than software that is imperfect but usable. In many ways, perfect base code with a lousy interface is worse than bad base code with a usable interface.

It is possible to build good software. But it's not common.

Posted by jeff at 6:08 PM | Comments (3) | TrackBack

February 5, 2007

Rethink

From Armed Liberal, check out Michael Wesch's impressive — what? Presentation? Movie? Artwork? — on Web 2.0. It happens that I'm not usually captured by the latest computer buzzwords, because my entry into the Internet was in the Spring of 1988, when not only did the web not exist, but its precursor (gopher) did not exist either. With that background, I often see the new buzzwords as old concepts repackaged. SOA? It's just the concept of services — like your web server or FTP server or mail server — where the interface is on port 80 (the web port) and is defined by description files rather than by a pre-agreed protocol. Useful, yes, but hardly earth-shattering in either concept or execution; I'd be happier to see more businesses get the first level of reuse right than to see more businesses jump on SOA — the payback is higher, and most companies haven't gotten to the level of code reuse across projects in the same group, never mind recycling of code and entities at an enterprise level.

In a way, web 2.0 is like that: it's a buzzword for a collection of web services and sites that really are rehashes of things already there. In a way.

But there is more to it than that, because it has always been the case that increasing the number of people capable of sharing information, and the amount of information they are able to share, changes the world. And web 2.0, stripped of the hype and boiled down to the common elements that tie these various sites and services together, is about one thing only: making it possible for everyone to share any information at all, any time, to anyone, without a mediator, a priest, a government official, an editor, a reviewer or anyone else in the way. That is the promise of web 2.0: universality of conversation, creation, art, life.

Now there I go sounding like the various hype-driven tech press organizations. But seriously, that is a non-trivial change. When I was a child, if someone had an idea, their capacity to share it was limited to people who knew them. Maybe a few would have type-written newsletters, perhaps for a school or company, in which they could share their idea. Even fewer, less than one in one hundred, would have the ability to get an idea consistently into the local newspaper, even as a letter to the editor (because there were so many, and such little space to print them). No more than one in a thousand — probably no more than one in one hundred thousand — could get their idea onto the radio, or on TV, or even in a book. The cost of sharing information was high; the speed was low. The barriers to entry, then, were such that relatively few ideas could be widely and quickly shared.

With web 2.0 — a buzzword I still hate — some professor in a relatively minor college has an idea, creates an expression of that idea, and it gets picked up by a guy in Canada, where it is seen by a blogger from Winds of Change, where I see it, and now you see it here. Moreover, at any or every step of the way, the idea and its expression could be passed verbatim, without notable chance of error, or modified into something completely new. I could, for example, take the idea, and make a different presentation, perhaps in the hated Powerpoint. Or I could snip the video apart and put in my own images and ideas to change the emphasis. You can do that, too. Everyone can.

I'm not one for hype, but sometimes it is justified.


Posted by jeff at 7:04 PM | TrackBack

January 29, 2007

For the Doubters

I used to argue with people every once in a while who claimed that, of course, Microsoft was out to make a profit, but that they were not trying to destroy their competition and lock consumers into their products because that would be wrong, and Microsoft's competitors are just ticked off that they're losing to Microsoft. Oh, really?

Posted by jeff at 11:33 PM | TrackBack

January 4, 2007

Computer People

Working with enterprise IT systems is hard. Sometimes, it is very, very hard. There are a lot of reasons for this, but regardless of the reasons, the combination of difficulty and impact on the bottom line would lead the naïve to conclude that all IT people are uniformly quite bright. This is, sadly, not the case. Many IT people are quite bright, and some are truly exceptional, but this is hardly more of a universal attribute in enterprise IT work than it is in the general population.

IT is not unique in this; in engineering, for example, there are more than a few rocket scientists who are, actually, pretty average intellectually. But engineering generally is a mature field; it has evolved measures to deal with the need to design things that could kill a lot of people if events go awry, using people who are, well, just people. IT is immature, and still depends on above-average skills to get decent results without unaffordable costs. This is, perhaps, a key reason why 90% or more of large IT projects at large companies fail to be completed on time and on budget (over 30% fail to be completed at all!), and even then often lack more than half of their originally-intended feature set.

Since we don't have very much professional legacy of good practice, in comparison to older technical fields, IT managers, project managers and architects generally have to do a lot of muddling through. Frustration with this is, in fact, the core motivation behind the software engineering movement. Part of that muddling through is understanding the kind of people you have to deal with on a daily basis. In IT, there are essentially four types: bureaucrats, magicians, artists and engineers.

Bureaucrats are those people who are entirely or nearly entirely process oriented. To a bureaucrat, a schedule, list or budget is far more meaningful than whether or not the system they are developing or administering actually works. Bureaucrats love arbitrary dates, large numbers of technically-meaningless milestones, and the illusion of control. In fact, to a bureaucrat, controlled failure is more valuable than chaotic success. Bureaucrats typically do not last long as either administrators, technicians or developers; they quickly migrate into management, where they can do the most harm. The rest end up working in call centers, data center operations, deployment or technical audit (in all of which they excel).

Magicians don't actually understand what they are doing. They have some finite list of incantations, sufficient to grant success at taming certain demons. If you've ever come across a system administrator whose idea of fixing a broken system is to restore it from the OS up, going through each installation manual along the way and finishing with restoring the last good data backup, you've seen the magician in action. Pretty much everyone starts out as a magician, and all of us are magicians in some areas; what distinguishes the true magician is that he will always be a magician in a few areas, and utterly unknowing in all other areas. Coders who cannot understand the implications of what they are coding, system/network/database administrators with a one-hammer toolkit, and architects who are lost when a machine is set in front of them and they are actually expected to use it are common examples of the breed.

In comparison to the first two groups, the artist truly understands everything about what he is doing. To an artist, an enterprise system is a symphony, and he is the conductor (and plays all the instruments). If you cannot understand that his code is dependent on the byte order of an undocumented network device in Pittsburgh, it just shows that you don't know enough to speak on his level. If you are appalled when you determine that he spent a month rewriting a rarely-called but compatibility-critical standard system function in assembly and with a non-standard interface (for just that extra bit of performance, and besides, it was a total kludge), or that he is using print queues with a custom back end as a network messaging interface "for convenience", well, that just shows that you aren't worthy. These guys are gold, but I suggest you keep them in the back room, and filter all of their output through someone with more interest in non-geniuses being able to use the systems. Under no circumstances allow an artist to deal with end users, customers, or people whose opinion you value.

Probably even more rare than the artists are the engineers. I don't mean people with "engineer" in their title (often glorified coders or system administrators), but the ones who understand the need for discipline, reliability, repeatability, and solid design. Engineers are in love with six sigma, and actually understand the math involved as well as the pretty pictures that express the math; or alternately, can tell you why six sigma is all fine and good, but could be improved by.... Engineers are the ones who fix a problem once, and document everything. They are also the ones who do the vast majority of the useful work on a project. If you find yourself in possession of an engineer, make sure that they are involved in training and in hiring new people, and under no circumstance underpay them; you do not want them to end up working for your competition. (Actually, that goes for the artists, too; but you can put engineers in charge, while artists will alienate all of your other employees if they are in charge.)

In the total IT population, the proportion is probably about 30-50% bureaucrats, about 40-60% magicians, and maybe 5% each of artists and engineers. The best combination is to have your senior managers be a mix of bureaucrats and engineers; your line managers and architects be almost entirely engineers; your deployment, operations and call center people be mostly bureaucrats; and your developers be artists and engineers. Pair up the magicians with the less-experienced engineers (who will usually be good team leads in any case) or with the more tactful artists, and you'll end up turning many of them into engineers or artists; the remainder will eventually move on, voluntarily or otherwise.

Posted by jeff at 9:23 PM | TrackBack

November 17, 2006

This is Huge

Sun has released the Java environments as open source under the GPL. This has two large and contradictory implications: evolution of the platform, including to smaller and larger devices as well as for stability and features, will speed up; meanwhile compatibility will decline amongst implementations, unless Sun has either a rigid compliance certification or a well-developed process for ensuring that users have the ability to specify strict compliance when running (ie, do not allow non-standard Java to run). The compatibility issue is key for enterprise installations, and I hope Sun has solved this.

Posted by jeff at 2:37 PM | Comments (3) | TrackBack

October 25, 2006

How Corporations Lose a Lot of Money

I work in IT, at a company currently taking a lot of financial hits. In such a company, you would think that every effort would be bent towards fixing problems that cost a lot of money. You would be right only at the smallest level, closest to the ground. Above that, there are counterveiling priorities that always win out. Here are three examples:

1. When a problem is occurring in a project, and you are responsible for getting it fixed, and you don't know what it is exactly, what do you do? Well, if you are a director, you might choose to look into the problem to see what is going on, and then react. You might also fly a dozen people half way (OK, 35% of the way) around the world on a moment's notice to make sure that you have all the right people in place to fix whatever the problem might be.

Now, if the problem were with a production process, such that your company is losing money in six figures or more every day, the "panic" option is not necessarily a bad one: it can easily pay for itself. But when the problem does not incur such a cost, even in the worst case, it is a lot cheaper to find out the nature of the problem, then send the right one or two people to fix it.

But the counterveiling priority in this case is appearing to your boss (a company executive) as if you are doing everything possible to fix the problem. The appearance of action takes precedence over wise use of resources, particularly money and time.

2. If you have a problem with your software development process, such that most of your projects fail, you can look at this a few ways. The first is that you can realize that it is a process problem, and fix the process. The second is that you can realize that it is a process problem, and actually adhere to the process. The third is that you can realize that it is a process problem, but ignore the process. The fourth is that you can fail to realize that it is a process problem, and just assume that your project teams are stupid at best.

I was in a meeting yesterday where we reviewed why we had gotten into a bad state with a project. The project had delivered code, but was having all kinds of performance and functional problems, and the user community was unhappy. The documentation for the project is so bad that at this point, the architects are throwing up their hands and saying they cannot even review the documentation, because it embeds all kinds of objects from tools they do not have, as well as largely consisting of links to other documents, sometimes documents that don't exist or are not viewable for some reason. (Please note: the documentation in question is the documentation that, if you follow the company's stated process for development, is used by the programmers to code the application.) All along, for over a year, the architects assigned to review the project had been throwing up red flags about performance, functionality, and documentation inadequacy.

At the meeting, it was agreed that the problem was that the project manager kept approving deliverables over the objections of the architects on the grounds that it was necessary to meet schedule deadlines. This led to shortcutting the processes that would have caught (indeed, did catch) the problems, prevented them, and delivered a higher-quality application to the users. Now, what to do about it? There are at least two minor enhancements to the application the project delivered to fix a bunch of problems and to prepare for the next major version, and the first of those is expected to start coding any day now, with delivery in November. The next major version is also supposed to start coding any day now, with delivery in March or so.

Since the next major version is basing its coding documents on the flawed documents of the prior version, they are having all kinds of issues. These range from simply having to ignore the documentation and read the code to figure out what's going on, to rewriting code wholesale because they don't realize it's already there, to being unable to deliver a reviewable coding document. So again, what to do?

Well, unfortunately, the schedule pressure is such that it appears the project is going to be given the go-ahead to code without complete and workable coding documentation. Here, the pressure of time leads to taking shortcuts, which leads to problems, which increases the time pressure because the dates did not have any slack in them for fixing problems. But no one is willing to go to the business and say that we're going to slip dates, because that would look bad. The pressure to meet arbitrary deadlines countervails the pressure to cut costs, and overwhelms it. Everyone who matters will see the missed deadline now; no one who cares about costs will really notice the long-term costs, since they will be incurred in the future. Looking good to your customer is more important, it seems, than doing a good job for your customer and actually providing for the customer's needs.

3. Let's say you are the CIO, and you have a long-term strategic vision, developed with millions of dollars of consulting time, that you want to use to change the way your company provides itself with IT services. Now, you know as the CIO that such efforts at the top are meaningless: it's only when they are applied throughout the hierarchy that they are felt. To make that happen, there are many, many necessary steps, none of them sufficient by themselves. These include the actual creation of the vision, communication of the vision to all levels, aligning the organization around the vision and measuring to detect change (or lack of change) towards compliance with that vision.

Generally, for this to work, the highest levels under you have to understand and buy into the vision, and their incentives have to be arranged so that they will push the vision to their teams, who will do that downwards and so on until everyone is moving in the same direction. With a large company, this takes notable amounts of time.

Now, the problem with strategic visions of this nature is that they are not self-enforcing. Buzzwords don't help much when you are facing the kind of day-to-day decisions that come up at lower levels of the hierarchy. Countervailing pressures of schedule and available tools make any strategy difficult to implement in practice. Unless the incentives all point towards the strategy, the cumulative effect of the friction at the ground level overwhelms the higher ideals. How can a project manager, for example, be expected to implement a vision that "integrate[s] emerging technologies" with standardized non-functional requirements to "develop a culture that leans forward", when he is being measured on whether or not next week's release schedule is met? Particularly when non-functional requirements aren't standardized and he has no power to standardize them.

So instead, what ends up happening is that hundreds of architects from around the world are flown in for a week-long conference where the vision is communicated, with the help of numerous artists, novelists and other such aids to capturing the output of the conference in an appealing and very presentation-worthy way, packaged for display to the (always-fawning) technical press. This has virtually no effect on the day-to-day work, because it doesn't address it. The higher-level managers and directors who should be translating this into terms that do make day-to-day sense are being measured on other things, and so that is what they are responding to.

The pressure to look good to the press, and thus to your peers, though, is incredibly strong. Stronger by far, in fact, than the pressure to do the dirty work of rooting out the flaws in execution of your last strategy, and stronger than the pressure to reorganize to execute the new strategy.

In all three cases, which are really just representative of the hundreds of cases that come up over a given year in any large organization, the pressure to look good in some way overwhelms the pressure to do good work and deliver good results. Failing spectacularly is more likely to get you promoted than succeeding quietly and consistently. And it costs money by the bucketload. On the other hand, it's very hard to figure out where the money is going, so looking good has a payoff, and being fiscally sound doesn't.

It's just that the stockholders aren't well served by that approach.

Posted by jeff at 5:46 PM | Comments (2) | TrackBack

October 19, 2006

Pushing a Rope

Perhaps the most annoying characteristic of the IT industry is how relentlessly focused on the future the industry is. In some ways, this is good: we are constantly looking for ways to make life easier and to do things faster, better and cheaper. But there is a limit to our headlong rush: we are very often leaving our businesses behind.

The reality is, IT is not — or should not be — leading an organization unless that organization's core business is IT. IT organizations should be helping the business to advance by finding solutions to the business' problems. But this relentless future focus often puts us out in front, incorporating emerging technologies more because we want to use them than because the business needs them. If you are asking, for example, how blogs can help your business, you are asking the wrong question. There are many questions to which blogs can be all or part of an answer (such as, how can I enable managers of plants on different continents to easily communicate their problems and collaborate to find solutions?), but blogs, and technologies in general, should never be the question.

We all seem to complain about managers who read a magazine article or see a vendor presentation, and then want to immediately deploy a technology into their organization without understanding it. In fact, Scott Adams has made himself a millionaire lampooning that very problem. Yet it seems just as prevalent that it is the IT people who are pushing technology on managers. And that is not unlike pushing a rope: it's very frustrating, and doesn't get you nearly as far as if the person on the other end of the rope is pulling.

Me? I'd be happier if more vendors could implement 10-year-old technology well. One of my favorite ways of figuring out if I'm using good processes is to ask whether I would buy product X (where X is whatever my client produces) if it were made that way. If I were working for a financial services firm, for example, I would ask if their financial services would be better or worse using processes of the maturity and with similar outcomes to the way I make IT systems. If the answer is worse, or if I would not buy it knowing their processes, then it means I'm doing the wrong thing.

I wish more people would ask, and honestly answer, that question. And then do something about it. Instead, I spent a good chunk of my day today getting tasked with incorporating emerging technologies into our standard non-functional requirements for vendors. Heck, like I said, I'd be happy if they could do a better job with old technologies, and we'd get much more bang for the buck putting our money there.

Oh, well.

Posted by jeff at 7:40 PM | Comments (2) | TrackBack

October 10, 2006

Geeky Fun

Google has a (beta) tool for searching code. Anyone who knows programmers knows that they are a little off in the best of times, and thus their comments to explain their code are sometimes, um, colorful. Here's my favorite so far.

I could waste hours with this tool.

Posted by jeff at 11:22 PM | Comments (3) | TrackBack

July 19, 2006

Medcalf's Law

Medcalf's Law:

Any internet service that is not designed from the start to be secure will degrade to worthlessness as its utility is discovered by crackers and spammers.

Fact:

The basic protocols of the Internet are not designed for security. These include the fundamental transports (IP, TCP, UDP) as well as common services (SMTP for email, for example).

Conclusion:

The Internet is doomed to degrade into worthlessness or be replaced by a system designed to be secure.

Sorry, just got tired of deleting the spam that leaks through my various filters. Had to vent.

Posted by jeff at 9:49 AM | Comments (2) | TrackBack

June 23, 2006

Build or Buy

One of the most common decisions that IT executives have to make is whether to build custom software to meet a business need, or to buy COTS (commercial off-the-shelf) software to meet that need. As you might suspect, COTS vendors tend to believe that a 100% buy solution is best for everyone, and large outsourcing vendors tend to believe that a 100% build solution is best for everyone. The fact that each is recommending the solution that happens to be most profitable to them is, of course, incidental: just look at the piles of arguments we've come up with, borrowed or paid for!

The major arguments of the 100% buy side are that developing custom software is wasteful (inefficient and costly, and then you eventually discard the software anyway), risky (if you fail, who do you turn to for support?) and distracting (shouldn't you really be worried about your core business?). The major arguments of the 100% build side are that custom software meets your business needs exactly; custom software can give you a business advantage over your competitors; and if your business model changes, your custom software can evolve to meet your requirements, while COTS software often cannot, or at least not quickly. The dirty little secret of both is that both are entirely correct in their arguments, but neither really addresses the way that you can decide which type of solution best meets your needs. Please, allow me.

The first consideration is whether the business need being met is common across all companies. An example of this would be presentation tools, or word processors, or print queueing software. In those cases, there is zero controversy: buy a COTS tool.

The second consideration is whether the business need being met is unique to your company. If no other company in the world has a need for this tool, there is no COTS tool to address it, and thus no controversy: you have to build a tool or change your business.

Now things get more complicated, because those really are the only two bright-line rules that are available to an IT executive making these decisions. All other choices are balancing trade-offs. Before we get into how to make the decision, let's consider those factors being traded.

The biggest two factors in most managers' heads are schedule and budget, and they are closely linked. The longer it takes to develop or deploy something, the higher the cost, because the people have to be paid for all that time. But reducing the schedule arbitrarily can increase costs beyond those that would be saved by employing people on the project for less time. Consider: if functionality has to be dropped, causing the business need to be met less well, or if milestones are constantly missed, causing the business to have to rework their plans for acceptance testing and deployment, the cost of reworking the business around the change is almost always higher than the labor cost estimated to be saved. Worse, the estimated savings don't materialize if the project misses deadlines, because the people are still there doing the work after your estimated budget for people costs has run dry. Also, look at TCO very closely. It can cost more to implement many COTS products (particularly very complex suites like PeopleSoft and SAP) than it would take to build your own software to meet your needs. (In the case of financial systems, though, unless you are a financial company, I would advise against it: the risks of missing or misinterpreting government regulations are quite high.)

The next most important factor is your IT department's ability to deliver on projects. Some IT managers can staff a project, gather detailed and testable requirements, monitor project managers and architects, know when to get involved, know when to stay out of the mud, and focus on the needs of the business; these are the managers who will consistently deliver their projects, usually on time and budget. Conversely, some IT managers are completely incapable of one or more of those management skills, and cannot deliver on time and on budget even if their project is to install MS Office in a department. Depending on the mix of skills and abilities among the managers, you might be able to deploy COTS or develop custom software, but not both. Recognizing this reality can help you to (a) correct it, or (b) make appropriate choices to increase the chance of successfully delivering a solution to the business.

There is an aside about management that is too critical not to mention here, even though it's a bit off-topic. You as an executive are a manager, too. If your department consistently cannot deliver projects, consider that you might be the problem. Either you've hired bad managers; or created or allowed procedures and policies that impede effective delivery; or forced your managers to respond to constantly changing requirements, schedules and budgets; or failed to understand what the business needs.

Back to the topic at hand. The next consideration is whether meeting 85% of the business need is enough, because that's about all you will usually get with COTS products. In many cases, that really is sufficient, and the gaps can be filled manually or with a process modification to better match the software. Typically, you can easily absorb this level of imperfect match of solution and requirements in areas that are not core to your business. A bank can accept, for example, COTS packages for HR, but might not be able to accept COTS packages for auditing accounts. In your core business area, unless everyone in your industry does things the same way, COTS packages are often not the best choice, because they force you to compromise your core methodologies, or work around the software, in order to be productive. A nearly 100% build policy in non-core, but still important, business areas can cost you, because a focus on meeting every need all the time in the systems can result in added acquisition and development costs that are not justified by the slight improvements in productivity that could be gained.

Your core business is the one place where you can hope to gain a competitive advantage through superior systems and the efficiencies they produce. If there are opportunities to save 10% of the time it takes to do the most common tasks required for your core business, and if you can realize those opportunities, you can beat your competition on price, quality or both. If you have adopted a 100% COTS policy, here is where you will fall behind your competitors. (Which is not to say that you definitely need a custom solution here. Once you adopt a 100% buy or nearly 100% build mindset, you will simply miss many opportunities to do things better the other way. Agility is more effective and profitable than controllability in many cases.)

Finally, the legacy environment into which systems must be integrated has to be considered. The key factor in predicting schedules and budgets is comparison to similar tasks performed previously. If you know how long it takes to implement COTS package A, or develop service B, because you've done it before, you can predict your schedule and budget with some accuracy. One key reason why project management does not work as well with enterprise IT systems as it does with, say, building a bridge is that bridges don't have to deal with legacy systems. Often, even with a COTS package that has been implemented at a thousand companies, you will find that your environment has legacy systems that the new package has never before been integrated with. If you have an environment with many proprietary legacy systems, integration of new COTS systems is exponentially harder than in a relatively clean environment. By contrast, building a solution into an environment heavy with legacy systems often still results in inefficient systems (inevitable when you have to do many stages of extract/transform/load (ETL) or write interfaces for many different protocols), but at least you will be able to ensure that the system integrates well into the current environment. Sadly, in an environment with many legacy systems, it is often the case that adding a new COTS package worsens the problem for the next project, because there's one more proprietary package that your project teams have to figure out how to work with. Where you do buy COTS packages, try very, very hard to ensure that they use industry-standard protocols for inter-system communications, or you may eventually build an environment that cannot grow until many critical systems are ripped out and replaced at more or less the same time. This is not a career-enhancing move.

So when considering schedule and budget, management abilities, system criticality and how closely the system meets business requirements, competitive advantage, and legacy systems in the environment, how does one decide to weight each factor? Sadly, there is no easy rule for this, because each company is different. I will propose a few rules of thumb, though (a rough sketch of how they might be weighed follows the list):

  • If any of the five key trade-offs strongly suggests a solution be COTS or be custom, make everything else work around that.
  • If one or more trade-offs strongly suggests COTS or custom, and one or more of the other trade-offs strongly suggests the opposite, you have a business conflict that has to be solved before you can provide a system to match. First fix the business problem, then reevaluate.
  • The most critical success factor is management capability: you cannot do what you cannot do. You can fix this with hiring or policy changes, or simply accept that your current structure compels a build or a buy decision.
  • After management capability, competitive advantage is the key decision driver. If you cannot gain a competitive advantage from a system, don't build it unless there is no COTS product that can meet your needs. It's sufficient in non-core areas to meet 85% of the needs, and in peripheral areas meeting 50% of the business' needs may be sufficient. In core areas, meeting 100% of the business needs can be vital. In fact, it's often the case that using IT architects to map the business process can result in process improvements even if no system is implemented, because the actual business process is rarely fully understood, even within the business. In core areas, rolling your own is often a good idea even when there is a promising COTS package, unless that package meets every business need completely.
  • Next most important is what it's going to cost and how long it's going to take to implement a solution. Most of the time, this favors COTS products, but not always. Do your due diligence here on how long similar companies have taken to implement the COTS solution, and don't just trust the COTS vendor's word on that: call the customers.
  • Finally, the environment is rarely dispositive — at least, not directly. Generally, the environment really acts as a driver by altering the cost and schedule to implement. In some cases (where you depend on a protocol that COTS products for that space don't support, for example), environment can become directly important. In any case, it's vitally important that your managers and vendors know the environment they are deploying into, or they will be likely to blow right through their schedules without slowing down. In heavily legacy-driven environments, COTS products are usually much more difficult to integrate than generally advertised or expected.
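
To make the ranking concrete, here is a minimal sketch of how those rules of thumb could be encoded (Python; the factor names, scores, and weights are my own illustration, not a formal methodology):

    # Scores run from -2 (strongly favors COTS) to +2 (strongly favors build).
    # The four ranked trade-offs; requirements fit is folded into advantage here.
    factors = {
        "management_capability": 1,    # shop has a solid custom-dev track record
        "competitive_advantage": 2,    # core business area, where fit matters most
        "cost_and_schedule": -1,       # COTS looks faster for this project
        "environment": 0,              # legacy integration is a wash
    }

    def recommend(factors):
        strong = [s for s in factors.values() if abs(s) == 2]
        if 2 in strong and -2 in strong:
            # Rule 2: conflicting strong signals are a business problem first.
            return "resolve the business conflict, then reevaluate"
        if strong:
            # Rule 1: a single strong signal overrides everything else.
            return "build" if strong[0] > 0 else "buy"
        # Otherwise weight the factors in the order the rules of thumb rank them.
        weights = {"management_capability": 4, "competitive_advantage": 3,
                   "cost_and_schedule": 2, "environment": 1}
        total = sum(weights[name] * score for name, score in factors.items())
        return "build" if total > 0 else "buy"

    print(recommend(factors))   # -> "build" with the sample scores above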

This discussion brings up more questions, like how to reduce project risk (more to the point, why most projects fail), whether to use open source products in your business, how to identify areas where IT can make a business more profitable, whether enterprise architecture methodologies are really worth the investment, whether and when to outsource, and a host of others. Topics for another day, I'm afraid.

But if you're interested in knowing about any of those, let me know, and someday I'll tackle it.

Posted by jeff at 5:47 PM | TrackBack

June 22, 2006

I'm Imagining the Customer Service Call

Via Transterrestrial Musings, we have the exploding Dell laptop. The first thing I could think of was the customer service call...

CUSTOMER, after an 18 minute hold: Yes, hello? I'd like to see if my laptop is under warranty, or maybe has a recall ...

I'm sorry? ...

Yes, it's an Inspiron[1] B130, and it ...

Yes, I was running Windows XP, but I don't see ...

No, it doesn't boot. ...

No, the power switch doesn't turn it on at all, and it doesn't matter whether it's plugged into the wall, because...

No, the power's not out. Look, the laptop exploded! ...

No, not Vista: XP. ...

I should hope not!! What I really need is...

No, no. The point is, my laptop exploded and I need to know if it's under warranty, or if there's some recall that covers this. ...

Um, let me see. Where is the serial number tag? ...

Well, it's kind of charred, but the last character looks like Helen Thomas. ...

No, not Dylan Thomas: Helen Thomas. The reporter. ...

No, I don't have the box with me. I'm in Japan at a conference...

What?...

What!?...

What do you mean it's not business hours in Japan?!?!


[1] Yes, that's an actual Dell product line. Yes, marketing people do sometimes suck; why do you ask?

Posted by jeff at 10:56 PM | TrackBack

May 15, 2006

Join Together Applescript for iTunes

Some days, you find a really neat tool on the Net, and you have to share. In this case, it's Join Together, one of Doug's Applescripts for iTunes.

I use my iPod primarily for audiobooks. Until a few months ago, I had been buying the audiobooks and importing them manually. Then, I discovered Audible, which can give you a single file for up to several discs/hours, complete with chapter marks. That's been great, but still doesn't help with the purchased CDs. Enter Join Together.

Take your playlist of files - in this example, 7 CDs that were split into 14 audio files. Tell Join Together that you want to merge them, edit the information to give it the name of the book, then tell it to use the same export settings as the originating set of files (which makes it easy and less time consuming since they were all imported at once). Next, tell it to save it as .m4b (to show up as an Audiobook on the iPod), and optionally, using Apple's Chapter Tool, create chapter marks in the file.

A few minutes later (very quickly with the default settings of the files), a combined file spits out into the iTunes Library. My 14 audio files are now one big file with 14 chapters. Quick and painless.

Doug - thanks a bunch! I'm going to be joining a lot of files now, and maybe using your utility to downsize them in the process.

Posted by Nemo at 11:00 PM | Comments (2) | TrackBack

April 23, 2006

Network Neutrality

There have been a couple of posts by Dale Franks at QandO recently, here and here, dealing with network neutrality (the proposition, or design principle really, that the INTERconnections between NETworks that make up the Internet should not care about the content of the data passing across them — a packet is a packet is a packet). The alternative is "intelligent networks", which look at the source, destination, type or contents of packets and decide to charge differently, or throttle differently, based on those characteristics.

For example, let's say that you have a cable modem, as I do. Your cable company might decide that Google is "using up too much bandwidth" because so many people use it. The cable company would then go to Google and say that traffic to and from Google would be throttled (artificially limited as to the amount that could be sent across the cable company's network) unless Google paid a fee to make up for its "excessive usage" of the cable company's bandwidth. Google would then have the option of striking deals with the cable and telephone monopolies that provide most of the "last mile" broadband to internet users; setting up alternate ways of getting to its users; or being throttled, with the consequent loss of reputation (they will appear to perform badly), traffic and revenue.
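
As an aside, the mechanics of throttling are not exotic; a token bucket is one common way to implement it. A minimal sketch (Python; the rates and the idea of singling out one site are illustrative, not any particular provider's practice):

    import time

    class TokenBucket:
        """Allow traffic up to `rate` bytes/sec, with bursts up to `capacity`."""
        def __init__(self, rate, capacity):
            self.rate = rate
            self.capacity = capacity
            self.tokens = capacity
            self.last = time.monotonic()

        def allow(self, nbytes):
            now = time.monotonic()
            # Refill tokens for the time elapsed, up to the bucket's capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if nbytes <= self.tokens:
                self.tokens -= nbytes
                return True   # forward the packet
            return False      # drop or queue it: this is the "throttle"

    # Packets to and from the disfavored site go through a stingy bucket...
    google_bucket = TokenBucket(rate=50_000, capacity=100_000)   # ~50 KB/s
    # ...while everyone else's traffic is forwarded at the full line rate.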

Presumably, there would not be such a problem with Google's (or Microsoft's, or Yahoo's, or any other large site's) own provider. For one thing, Google pays for vast amounts of bandwidth directly carried by long-haul carriers over dedicated lines. For another thing, if any of Google's (presumably multiple) providers threatened such a thing, Google could just cancel their contract with the provider and use its other alternatives, some or all of which would be glad to take the additional cash in vast amounts.

What really would be happening is that the local providers would be ignoring the fact that their customers are in fact paying for their bandwidth, and would be attempting to get other, deeper pockets to pay for the provider's bandwidth a second time. From a bottom line perspective, it makes a lot of sense in the very, very short run for local providers to use bandwidth shaping and similar technologies to wring more money out of the network, just as it pays for them to set up added services (like email or web content provision or hosting) that are so bad none of their customers will actually use them, so that their costs of setting up are small but they get extra revenue from charging for the unused services. But in the case of bandwidth shaping, this really is only a very, very short term benefit.

Let's say that my local cable company were to put such limitations in place. I would notice, and would immediately cancel my contract in favor of another local vendor that did not have such limitations. If I could not find such a vendor, I'd lease a T1 line (a dedicated, high-speed line). Even for people who cannot lease a T1, there are usually multiple providers in urban areas, where most broadband consumers live. And there's no percentage in inconveniencing your rural users, because let's face it, there aren't enough of them for bandwidth shaping to be a threat to deep-pocketed companies. The collective loss of business, possibly combined with efforts to end local monopolies for cable and phone service, would quickly put an end to the business utility of bandwidth shaping. In any but the short term, unless there is a forced monopoly, there is no percentage in inconveniencing your customers. And while there may be a cable monopoly, and a phone monopoly, and both may collude in inconveniencing their customers, there are still other ways of getting high-speed connections which would suddenly become much more attractive. In other words, there is no monopoly on the provision of high-speed IP connectivity even if particular methods of providing high-speed IP connectivity are monopolies.

So while I support Dale's general idea of deregulating telecom services as much as possible, I don't really see network neutrality violations as being much of a big deal: the Internet routes around damage. While it may require building more connections than currently exist (a good thing in any case) in order to do so, the advantage from bandwidth shaping would be so short term as to create no real problems for internet users.

UPDATE: See also here and here. I thought about whether to put my response to Scott Chaffin in the comments or here, and decided to leave it in the comments. But if you agree with Scott that the problem is "free riders", you might read my response in the comments for a concrete example of why this is not so.

Posted by jeff at 5:36 PM | Comments (9) | TrackBack

April 13, 2006

Carelessness with Data

Like many IT folks, I carry a flash memory stick around. I recently switched mine to a secure flash drive after losing my original for a couple of days. Stories like this make me even more glad I have done so. Granted, I don't have any military data on mine, but still...

Afghans selling stolen military data

One flash memory drive, the Times reported Thursday, holds the names, photos and phone numbers of people described as Afghan spies working for the military. The data indicates payments of $50 bounties for each Taliban or al-Qaida fighter caught based on the source’s intelligence.

Posted by Nemo at 10:32 AM | Comments (2) | TrackBack

March 28, 2006

IT Insecurity

Perhaps the most damaging aspect of IT security, particularly in large enterprises, is that security policies are not generally set by computer or network security experts. Instead, people who are reckoned to be security experts because they've done it before (without, generally, any user feedback on what they have done, and, except after a disaster, often not even feedback from other security people) are put in charge of making IT security policies; or, worse yet, non-technical managers are. This leads to policies that seem to be quite secure, but are actually not. In particular, most companies really cannot get password policies right, and in the process generally reduce their security.

Consider the typical enterprise: a few computer systems put in a long time ago were gradually built on, expanded and replaced; along the way, numerous other systems were created to solve various tactical problems, and these too were, over time, built on, expanded and replaced; there is a constant press to build on, expand and replace old systems and to create new systems to solve current problems. (Remember: by and large, computer systems exist in enterprises because they solve problems of some kind or another, at least theoretically.)

This leads to a situation where there are many, many dependencies between systems, largely undocumented; there are many systems that no longer really do anything, but appear to; there are many systems that have unknown user bases; and most of the systems don't talk to each other even to the extent of agreeing what a valid user identity might be. (And yes, there are tools to solve these problems, which involve installing more systems....) As a consequence, a user might have one id/password for the corporate portal or blog, another for the email system, another for the network/his computer, another for access through the firewall/proxies to the external Internet, and a half dozen others for other systems. In the perfect case, all of these systems authenticate the user (that is, determine that he is who he says he is) via the same system, so that one id/password combination gets the user into any systems he needs; and in fact via single sign-on the user should only have to log in once to gain access to all of these systems. In practice, virtually no company has fully integrated their access and identity management, and in most companies I've worked for, a given user has an average of some 2 or 3 user ids and between 5 and 10 passwords to remember. And this is where it gets fun, because of security policies that don't work as intended.

Let's say that I have 3 ids that cover 12 systems, and some of them are integrated with a centralized user store so that I only have to remember 4 passwords. That's not bad, as enterprises go, actually. But generally each of these systems that is not integrated with the common user store will have their own password policies, which may be similar to but not quite the same as that of the common user store. (Typically, corporate IT policies will drive security policies to resemble, but not match, each other, because different systems can support different subsets of what the administrators want to implement.)

Now, I can remember 3 ids and 4 passwords, and which systems they apply to. (Some people can't; I have lots of practice.) But there are challenges to my memory, from the aforementioned policies. First, passwords are almost always required to be at least 8 characters. This makes it more difficult to break an encrypted password, should you acquire one, since each additional character multiplies the number of possible combinations by the size of the character set. Second, passwords are almost always required to have different types of characters in them (capital and small letters, numbers, non-numeric characters). This means that an attacker cannot cut out possible combinations by limiting the character set he tries to crack against. These are good policies. Now for the ones that sound good in theory, but are bad in practice.
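
To put rough numbers on that, a minimal back-of-the-envelope calculation (Python):

    # Brute-force search space = (alphabet size) ** (password length).
    lowercase_only = 26 ** 8      # ~2.1e11 combinations
    mixed_alphanum = 62 ** 8      # ~2.2e14: mixing character classes helps
    all_printable  = 94 ** 8      # ~6.1e15
    one_char_more  = 94 ** 9      # ~5.7e17: each character multiplies by 94

    for n in (lowercase_only, mixed_alphanum, all_printable, one_char_more):
        print(f"{n:.1e}")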

Password reuse is often restricted, so that, for example, the last 12 passwords cannot be reused. This sounds good: it means that if an attacker manages to break a password, it will (once changed) not be useful to him for another year. But in practice, it means that you now have to remember a series of passwords at least one longer than the amount you are unable to reuse, or that you have to remember a new set of passwords each time you are required to change passwords. And rest assured, the passwords will be forced to change on different days, so that you are forced to remember new passwords and forget old ones several times per month.

Password complexity is generally enforced. A typically draconian password rule is at least 2 each of capital letters, small letters, and numbers and at least 1 other character must be present, and no character can appear more than twice. (Note that 7 of the minimum 8 characters of your password are generally constrained.) In addition, to make cracking passwords based on user knowledge more difficult, variations on your name or id, and often other bits of information, are generally forbidden. In some cases, variations on common words, where numbers are used to replace letters (1 for l, for example), are also forbidden as too easy to crack. These also sound good in theory, but are in practice problematic. First, different systems are able to implement different parts of this kind of rule set, so that in practice a password that works on one system may not work on others, so that the user cannot even set all their passwords to the same value. Second, the complex passwords that result are difficult to remember, and in combination with the passwords changing on different dates and not being reusable, this makes the memory problem much, much harder. Exponentially harder, in fact. (Also, from a brute force cracking perspective, it means that I as an attacker could eliminate a large part of the potential password set, because those passwords would violate the rules.)
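
To make the rule set concrete, here is a minimal sketch of the kind of check described above (Python; the exact thresholds vary by shop, so treat these as illustrative):

    def acceptable(password, username):
        """Illustrative draconian rule: 2 upper, 2 lower, 2 digits, 1 symbol,
        no character more than twice, no variation on the user's id."""
        if len(password) < 8:
            return False
        if sum(c.isupper() for c in password) < 2:
            return False
        if sum(c.islower() for c in password) < 2:
            return False
        if sum(c.isdigit() for c in password) < 2:
            return False
        if not any(not c.isalnum() for c in password):
            return False                       # at least one symbol
        if any(password.count(c) > 2 for c in set(password)):
            return False                       # no character three or more times
        if username.lower() in password.lower():
            return False                       # no ids embedded in passwords
        return True

    # The perverse side effect: an attacker can discard every candidate that
    # fails this function, shrinking the effective search space.
    print(acceptable("P@55WoRd", "jeff"))      # True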

Passwords are generally forced to be changed every month in a large enterprise. This is a good idea, because it means that a cracked password is only good for a limited time, but it results in more problems, because of the complex password requirements and lack of ability to reuse passwords. The practical upshot is that it becomes nearly impossible to know your ids and passwords after about 4 months of working within such a security regime. "Nearly" impossible, but not impossible: there are in fact three strategies that users can take to cope with this problem.

The first strategy is of only limited use: do things manually. People learn to avoid things that cause them pain or inconvenience, even if that is limited to embarrassment (at, say, having to call the help desk to get a password reset) or lost time (as they try several possibilities). So once systems are too inconvenient to use, people will not use them except when they are required to as part of their job. Peripheral systems (the kind that keep, say, documentation on processes, "important" company news, tracking records, bug reporting systems, frequently asked questions and the like) fall by the wayside as too much additional trouble.

The second strategy is fantastic, and I highly recommend it if you cannot change the security regime at your enterprise: create a system. I won't detail mine, for obvious reasons, but in general you want to create a base password, and make every other password a permutation of that. For example, as an "easy" system, let's say that your base is the ever popular "password". You can vary this to meet the password requirements I gave above by adding capitals: "PassWord", digits for letters: "Pa55Word", and the occasional symbol: "P@55Word". But then you add in an additional factor, the current month, so that as you change your password each month, you know the current one (assuming you've been using the systems, so that you're forced to change monthly) will have either this or last month's number in it. For example, you could use the following set of passwords: "P@01Word", "P@02Word", "P@03Word" and so on to "P@12Word". Assuming, of course, that your security system does not flag "Word" as being a dictionary word, and refuse to let you use it as a part of your password. (Speaking of bad security rules: this one works fine if you're checking the whole password, but really badly when checking a substring of the password.)
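
A minimal sketch of that kind of scheme (Python; and obviously, don't use "password" as your real base):

    def monthly_password(month, base="P@xxWord"):
        """Splice the two-digit month into a fixed base password."""
        return base.replace("xx", f"{month:02d}")

    # January through December: P@01Word, P@02Word, ..., P@12Word
    print([monthly_password(m) for m in (1, 2, 12)])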

But let's face it, most people are idiots. I know that this sounds mean, but having been involved with supporting people using computers over close to two decades now, pragmatism demands that I realize reality. The world is simply not as nice as we'd like it to be. And given that most people are idiots, and that even those people that are not idiots are often too busy to make up a password scheme like I suggested above (particularly when there will almost always be at least one application or system that forces you to violate your password scheme), virtually all enterprise users eventually default to the third practical solution: write down all of your passwords and ids.

Now this is a security no-no: I don't (as an attacker) have to guess your ids or break your passwords if you hand them to me on a piece of paper. Think Prisoner of Azkaban here, where Neville wrote down the passwords to the Gryffindor common room. But it's not the users' fault that they have to write down their ids and passwords: the security system forces them to do so. (I've twice been at companies where I've had to do so, in one case because of a system that assigned passwords on its own, random 5-character passwords, and in the other case because I couldn't come up with a system that would match all of the disparate rules of the various systems I had to use.)

So guess what I found this morning?

I blame the security guys, though, not the poor secretary who had to remember not only her own passwords, but those of the directors she supports. Although, perhaps taping it to the computer was a bit much; she could at least have locked the list in a drawer.

Posted by jeff at 9:00 AM | Comments (2) | TrackBack

February 26, 2006

Breaking The Last Enigma Messages

The M4 Message Breaking Project has broken one of the last three remaining unbroken Enigma messages. There is a client you can download to contribute your computer's time (none for Macs, but there is a Unix client that looks workable for OS X).

(Hat tip: Martin McKeay)

Posted by Nemo at 2:09 PM | TrackBack

February 9, 2006

The Games Will Come

One interesting and largely unexplored fallout of the transition of Macs to Intel processors is how it will affect the software market. You see, if you write a piece of software, the vast majority of it is not particular to any given platform (combination of operating system and hardware configuration). Let's take a simple example, a recent game I've been working on. About 90% of the game is identical regardless of which platform I code for, even assuming I don't use Java (which hides the platform details via abstraction to a virtual machine). The parts that are not identical are filesystem access and video. That's pretty much it.

So once Macs are all on Intel, the software world could become much more interesting. Consider this: a properly designed program, using the DAO and MVC patterns, would only require one library per platform to be different. The rest of the source would compile unchanged for the same processor; the only difference would be in which library is called for disk access and graphics calls, and that's trivial to configure in the installer, which itself can be easily cross-platform.
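
The pattern is easy to sketch (Python here purely to show the shape; the argument above is about compiled code, where the portable 90% would be identical and only these backend libraries would be swapped):

    from abc import ABC, abstractmethod

    class VideoBackend(ABC):
        """The one interface the platform-neutral 90% of the game calls."""
        @abstractmethod
        def draw_frame(self, pixels): ...

    class MacVideo(VideoBackend):
        def draw_frame(self, pixels):
            pass   # would call the Mac graphics library here

    class WindowsVideo(VideoBackend):
        def draw_frame(self, pixels):
            pass   # would call the Windows graphics library here

    def load_backend(platform):
        # The installer's only platform-specific decision: which library to wire in.
        return {"mac": MacVideo, "windows": WindowsVideo}[platform]()

    video = load_backend("mac")   # the rest of the game never knows the platform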

Within a very small span of time, about 3 years at most, it could be possible that well-written, object-oriented software would be inherently available on both Macs and PCs. And that would blow Microsoft's model all to hell, to the benefit of users everywhere.

Thanks to Peeve Farm for getting me thinking this way.

Posted by jeff at 8:23 PM | TrackBack

January 25, 2006

How to Hire a Good System Administrator

I am asked, periodically, how to hire good systems administrators, DBAs and integration people. Since it seems to be a topic of somewhat general interest at least among IT managers, I decided to address it here. But since most of my readers seem to come here for the political posts, I'll put it in the extended section.

Q: What are the qualities to look for in a good admin?

A: You want to get someone who is intelligent, capable of both intuition and logic, lazy, egotistical but not subject to easy shame, and a little bit compulsive (a lot persistent).

The kinds of systems installed in an enterprise are very, very complex. It takes someone with a fair degree of intelligence to be able to remember the interconnections and interactions of the various systems, as well as the components on each one, and know where to start looking when problems surface. For the same reason, you need someone capable of intuition (no one will ever completely understand the behavior of sufficiently complex installations), so that he'll know where to start looking, and with a very structured, logical mind, so that he'll know what to investigate and what to discard.

You want someone lazy, because lazy people hate to do work. Well, you've got the computers to do the work, so why should administrators have to do things? Things break, or are just not quite polished enough. A lazy system administrator will sometimes take weeks fixing things so that they don't break again, ever, or polishing the edges on something to save himself 5 minutes of work in a day. These kinds of improvements add up, in both system stability and reduced workload. That means that you can expand your environment without expanding your work force.

You want someone egotistical, because an egotistical person does not want to admit they cannot solve a problem. So they will work quite hard to solve a problem. And again, these are very complex systems, so someone without the ego investment will often fail to solve the really odd problems, the ones that you can work around at a cost, but whose causes are not apparent. However, you can't get someone so egotistical that they cannot admit to anything that would shame them: that type covers up their mistakes, and complete openness is required in order to track down unintended side effects. (Managers can help this along by not shooting the messenger, even when he's there to tell you that he just unintentionally took down your entire production environment.)

The reason that you want someone who is a bit compulsive, and quite persistent, is that some problems hover just below the level of "must fix now", so they don't get fixed. This is, again, often the case with difficult problems of uncertain provenance that have a (painful, but workable) workaround. A good admin will use his spare time to track down and fix these problems, because they bug him, and he can't leave them alone.

Q: How can I gauge an admin's true experience level, given how misleading resumes often are?

A: I've found that the best way to do this is to offer him the root password on a really big, really fast, really new system. If he's all eager to try it out, he's not very experienced. (If he starts asking about detailed OS levels, configuration and such, that's a clue.) A mid-level admin will just accept the password and ask what the machine is used for. A very experienced admin will groan, at least inwardly, and try to figure out how to avoid having the root password: he's been here before, and it's a jading experience after a while.

Q: What are good questions or tests?

A: Well, assuming that you are not particularly proficient yourself, find someone who is. Those kinds of questions vary by system, and it's hard to generalize. There are a few things you can do even if you don't understand the systems very well yourself:

Break the system (a test system, that is) in a known way. (For UNIX machines, setting application startup files to mode 000, or an invalid owner, is a good way to do this. Or fill up the filesystem on a box with a large file, open the file in an editor, then delete the file from the filesystem and leave the editor running. See if the candidate can figure out why the filesystem is full even though there aren't any files in it.) Let the admin fix it. It doesn't necessarily matter if he does; what you are looking for is whether he goes for the right ideas or not. The cleverer the break, the better the test, and the more likely the admin will be way off at first as he looks for the simple things. (If you see hoofprints, it's a good troubleshooting technique to look for horses before you look for zebras.)
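
If you want the filesystem trick as a recipe, here is a minimal sketch in Perl — the path is invented, and it should obviously be pointed at a small scratch filesystem, not anything you care about:

    #!/usr/bin/perl
    # Fill the filesystem, then unlink the file while holding it open.
    # The blocks stay allocated until this process exits, so df reports
    # a full filesystem that ls and du cannot explain.
    open my $fh, '>', '/scratch/bigfile' or die "open: $!";
    print {$fh} 'x' x (1024 * 1024) for 1 .. 512;    # ~512MB of junk
    unlink '/scratch/bigfile' or die "unlink: $!";   # the name is gone...
    sleep;                                           # ...the space is not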

Explain that each of your 45 boxes has a separate root password, that these are cryptic and random and changed monthly, and that this is done for security reasons. If he does not vociferously question your sanity, he either does not understand security or he is not assertive enough to stand up for himself when he knows he's right. Or, alternatively, he figures he can get around it behind your back. In any case, if he doesn't protest, don't hire him.

Another good one is to posit a situation where a critical security patch has been made available by the vendor, and his manager is insisting it be installed immediately. What process does he use? If he starts with the production systems, or doesn't consider outage windows, he may never have worked in a true enterprise shop: the risk of compromising the data and threatening application uptime by patching in haste almost always outweighs the short-term risk of leaving a system unpatched. So expect protests that "now" is not a good time to install things, unless "now" is already in an outage window and the patch has already been tested.

Oh, and ask what kind of puzzles he liked to do when he was a kid (or likes now). Every good admin I know did puzzles as a kid: word finds, crosswords, cryptoquotes, logic puzzles — something.

Q: What did you mean before about resumes being misleading?

A: System administration is an apprenticed art. There are no formal methods to administration that are worth the time to learn, because problems are too complex, diverse and non-repetitive (well, the meaningful problems don't repeat, anyway) to reduce to a set of rules. As with auto mechanics, it's a process of accumulation of techniques and insights that leads to good results. The quality of an admin depends more on how he was mentored, or whether he was capable of self-mentoring (some are), than on how many machines of what kinds in what circumstances he has worked on. That kind of stuff doesn't show up on a resume. The best admin I ever hired had something like 6 months of formal experience, and boxes he played with at home. (He was a security guard before I hired him.) The worst admin I ever hired had 15 years' experience on UNIX systems. Don't trust resumes.

Posted by jeff at 12:26 PM | Comments (4) | TrackBack

January 18, 2006

Free Markets Work; Who Would Have Guessed?

Offshoring low-skill (call center) and medium-skill (Java web app coding) work to India, China and other low-labor-cost countries has been going on for some time. Long enough, in fact, that the labor pool there is drying up, and the resulting shortage of workers is driving wages rapidly upward. Hey, guess what: markets work.

(hat tip: Karl Gallagher)

Posted by jeff at 12:54 PM | TrackBack

December 22, 2005

More Flash Drive Horrors

She found her master’s degree in a trash can

For anyone who's ever obsessed about a project but forgotten to back up the data, watched a computer screen fizzle just before a deadline or left crucial documents in a cab -- here is a story about backing up, and moving forward.

Read the story, then repeat after me: Backups will save my life.

Posted by Nemo at 10:55 AM | Comments (1) | TrackBack

December 21, 2005

True, But...

Via Peeve Farm, I found a couple of posts discussing whether Apple's upcoming release of Intel-based Macs capable of running Windows will put Dell out of business. On the pro side is Daniel Jalkut and on the con side is Bob Crosley (who has about the coolest blog banner I've seen yet).

Certainly, Jalkut has a point, as recent high-profile negative reviews have made clear that you can pay high-end prices for a low-end PC if you go with the wrong brand. (Sony is much better, but no-name knock-offs are not much worse than a Dell.) But that's not enough. As Crosley notes, it is businesses — not individuals — that buy most PCs. And Crosley has fingered one of the two big reasons why that is: remote support is easier on PCs. There is another reason, though, why businesses will not switch to Mac: software. There are simply some key business apps that require Windows. Without it, how do you plan on configuring your Checkpoint Firewall-1? The UNIX console is, I think, abandoned now (Nemo can confirm or deny). But even if not, the point remains: there are some applications with not only no Mac version, but no Mac equivalent.

Now I was deliberately vague in the above commentary, because I did not distinguish between Apple hardware and the MacOS X operating system. It's a near certainty that companies will not switch immediately over to MacOS X — or even within a span of several years. It's possible that MacOS X will gain business software that whittles away at the advantages PCs have in a corporation, but those advances will be useless in tipping a company to running on MacOS until they are all in place. That is a powerful disincentive for software makers.

But there are some ways in which Apple could certainly make major inroads into corporations with their Intel-based machines as a hardware company. The first is that Apple could sell their hardware to businesses touting them as Windows systems. Moreover, Apple could tout them as the most stable Windows systems around, because the Apple hardware is rigorously configuration-controlled, and is not very internally expandable. This combination means that Apple will always be able to thoroughly test their hardware with Windows in ways most PC makers cannot do. Further, Apple can market to businesses based on more reliable hardware than most PC makers can: only Sony is in Apple's league for producing quality hardware. Finally, Apple can market to businesses on the basis of drop-in replacement: an Intel-based Mac-mini would simply plug into the peripherals already in use, with the possible exception of companies that have not yet switched to USB-based mice and keyboards.

Essentially, to get into this market, Apple would have to do three things: stand up a business support division with both sales and service capability to handle the needs of corporations (particularly with regards to predictable delivery of hardware on short delivery notice and flexible financial terms), including Windows technical support; bundle Windows pre-installed (even though companies will immediately overwrite it with their own ghost images, they generally need the license) in their business-class machines, and where they ship a mouse, ship a two-button mouse; and finally, refrain from selling MacOS to the businesses unless invited to do so. The first meets the business need for IT: predictable and dependable sales, service and technical support. The second meets IT expectations: you don't want to surprise desktop IT guys with new ways of doing things. They are remarkably inflexible in the little things. The third is how you keep from annoying your customers. In other words, Apple would have to stand up a parallel business PC company with only minimal overlap (hardware and shared resources like financial and IT departments) with their consumer-oriented MacOS X-based business.

It's a fair bet that Apple could move a lot of hardware that way: companies want good deals, but they also recognize long-term savings as an important part of the purchase decision. Companies buy based on TCO, not purchase price. By doing so, Apple would put some serious cash into their bottom line, growing their company in large leaps. As long as Apple didn't get distracted away from their MacOS X innovation, and the accompanying hardware innovation, there's not a downside there. In fact, the increased sales volume would also make the MacOS X systems cheaper, and give Apple more money to put into software development.

In addition, Apple could simply do what Microsoft has done for years, and claim marketshare based on hardware sales (Microsoft, at least for a time, counted every Intel-compatible CPU sold as a sale of Windows!), which would incentivize developers to at least port to MacOS X. And another fun side effect of increased sales, particularly into businesses, is that hardware developers would have incentives to include Mac drivers for their hardware, which is currently another point of issue for some users.

And then too, something interesting would happen as a side effect. Even though Apple would not be pushing MacOS X to businesses, some of the employees of those companies would begin to buy Macs to run Windows. And some of those would dual boot, and realize the differences between MacOS X and Windows. Whether that resulted in demands upon Microsoft to get better, or users switching to MacOS X, is hard to predict. But what is not hard to predict is that the resulting demand would, one way or another, lead to improved user experience overall.

Posted by jeff at 2:57 PM | Comments (1) | TrackBack

November 21, 2005

Writing Unmaintainable Code

This is a brilliant discussion of how to code badly. Sadly, a few of my favorite examples have been left out. For example:

  • In the interest of efficiency, unroll subroutines and methods. Not only does this avoid the overhead of a method call or a JMP instruction, it also allows you to write, for example, your logging code with subtle differences based on where it is called. Make use of the latter property.
  • Object orientation is so confusing. Use large objects filled with procedural code — preferably code related to several different subproblems — so that you can instantiate the minimum number of objects. However, every single utility method should have its own class, as an efficiency. You don't need logging all the time, for example, but you always want to be able to get your hands on an employee, and if you need an employee you will probably also need to know about the building the employee works in. This is particularly useful when someone is trying to reuse your code without truly understanding it.
  • If you program in an object oriented language, and store your data in a SQL database, you can arrange things so that you create objects like Employees — a single object holding every employee, which callers must iterate to find the one they need, keeping track of the array index themselves — rather than an Employee object holding the one employee you want. Clever use of this technique can make it impossible to get one row of data, or even all data related to one particular entry, at any one time. (A sketch of this anti-pattern appears after the list.)
  • Use design patterns, but not for their designed purpose. Create a data access object that contains all of the SQL queries you might need, but open and close the database in your main logic, and be sure to have a separate data access object, complete with application logic, for LDAP queries. If the maintenance programmer doesn't understand how the data is stored, how can he use it properly?
  • Lie to introspection routines.
  • Document a method as follows:
    // MUST return a FOO, or misc. calculations of financial returns will be subtly wrong
    Then return anything other than FOO. Make the maintenance programmer figure out if it's a bug or a bad comment. As a bonus, a multi-billion dollar bank might lose confidence in years' worth of calculations, requiring much manual audit work to determine whether or not the calculations are correct. Be sure not to mention, anywhere, which calculations might be subtly wrong.
  • Create two methods that do the same thing, in different ways. The arguments should be in different orders, and the names entirely dissimilar. Use the methods interchangeably.
  • Be stylistically inconsistent about blocks. Use all of:
    abc {
    }

    and

    abc
    {
    }

    and

    abc
        {
        }

    Feel free to mix and match.

  • He mentions using tabs instead of spaces, but neglects the joy of having tabs used in some cases, and spaces used in others. This is particularly fun with odd tab sizes, like '3'.
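
For the data-access bullet above, here is a minimal Perl sketch of the anti-pattern and its cure — the table, column and connection details are invented:

    use DBI;

    my $dbh = DBI->connect('dbi:mysql:hr', 'user', 'pass', { RaiseError => 1 });

    # Wrong: fetch every employee, then scan the array in application code,
    # carefully carrying the magic array index around with you.
    my $all = $dbh->selectall_arrayref('SELECT id, name FROM employees');
    my ($victim) = grep { $_->[1] eq 'Smith' } @$all;

    # Right: ask the database for the one row you actually want.
    my ($id, $name) = $dbh->selectrow_array(
        'SELECT id, name FROM employees WHERE name = ?', undef, 'Smith');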

Posted by jeff at 4:30 PM | TrackBack

November 18, 2005

The Memory Stick Nightmare

Bob Sullivan points out a real story of data loss - one that is certain to become increasingly common over the next few years - in HELP! I LEFT MY IDENTITY IN THE BACKSEAT OF A TAXI:

Last month, Wilcox Memorial Hospital in Kauai had to inform 120,000 past and present patients that their private information had been misplaced. Their names, addresses, Social Security numbers, even medical record numbers had been placed on one of those tiny USB flash drives -- and now, according to a letter sent home, the drive was missing.

I've thought I'd lost my own personal USB widget any number of times. Most recently, I thought I had left it in Philadelphia. I found it two days later in the bottom of my briefcase. On the widget is a variety of things both business and personal: a Quicken file with personal finances (encrypted), my resume, a remote access certificate for the office, pictures, etc. No customer data from the office, but the certificate could have been a problem. It's password protected, but if I hadn't found it within another 3-4 days, I was going to have the certificate revoked (I'm just that anal about it).

I expect that as more stories like the hospital's become public, more of these flash drives will use encryption - which I'm starting to see in the marketplace, but not commonly yet. I would guess that within 12-24 months it will be standard. In the meantime, expect more letters like the one the hospital sent home.

Posted by Nemo at 8:32 AM | Comments (1) | TrackBack

November 7, 2005

Poetic Justice

Sony included a rootkit with its copy-protected CDs, with the intention of limiting purchasers' use of the CDs (particularly how many copies can be burned). But the rootkit is so badly written that simply renaming your ripping program makes it invisible to the Sony rootkit. So Sony's return on millions of dollars in sales probably lost to the bad publicity is partial control over the use of its CDs by the few people who cannot use Google to find the way around the copy protection.

Sell Sony short: they are going to have a rough quarter.

Posted by jeff at 10:57 AM | TrackBack

October 18, 2005

Small Annoying Things

Technorati does a terrible job of tracking links. It just doesn't seem to have scaled well; that, or it is badly designed. In particular, links will disappear from the list for a particular blog, even when the link is still there and can be verified by visiting it. Sometimes, the links never show up at all.

Truth Laid Bear's Ecosystem does a fairly good job, but it seems like links just appear and disappear somewhat randomly. I have gone from 80 links to 40 to 63 in a three-day span. Heck if I know why. It's not the variation that bothers me, but the fact that it's hard to tell who's really linking to me.

Blogspot doesn't do trackbacks out of the box — or at least not by default — so blogspot blogs don't show up in my trackbacks when they link to me.

The combination of all three of these means that I don't necessarily know who's linking to me on what topics, and that is annoying. The reason it is annoying is that they are clearly writing about things that interest me (or I wouldn't have written the posts they're linking to), and I sometimes don't even know that they are linking to me unless I run across the post accidentally.

That's annoying.

Posted by jeff at 11:02 PM | Comments (4) | TrackBack

October 1, 2005

Fighting for Control of the Internet? How Pointless

Allow me to let you in on a little secret: the Internet is not controllable in any meaningful way. At least, not permanently.

Lately, the UN (and then the EU) have tried to "take control" of the Internet away from the US government, and have fortunately been refused. There are a lot of bloggers commenting on this, including:

QandO
InstaPundit
Meryl Yourish
Wizbang
Mark in Mexico
UPDATE: ZenPundit
UPDATE: The Glittering Eye

And Belmont Club, who asks: "BTW, who does control the Internet?"

Well, here's the thing: if you don't live in a country that controls everything that you can do with telecommunications (including to whom you can place phone calls), you do, if you want it. All you have to do is decide which networks you want to connect to, agree with them on how addresses are provided and which systems will act as root servers for the naming services, and establish a physical connection between your networks. In actual fact, that is exactly how the Internet came into being, as university and government and corporate networks began to connect to each other. (The naming service has already completely changed once; google "dns history" without the quotes.)

So should some tyrannical agency start compelling the existing backbone providers (that is, the companies that provide connectivity and bandwidth to the ISPs that provide you with connectivity and bandwidth) to censor or control traffic, an alternate Internet would spring up within a very short time period, using different name servers, a different body controlling addressing and ports, and not connecting to the existing Internet, except maybe through a controlled and isolated gateway system. (In an ideal world, the first step would be to fix the underlying problems with IP, such as the lack of encryption and non-deniability at the lowest protocol level and the too-small address space; but we don't live in an ideal world, and getting OS vendors to release patches for all of their extant OSs, including ones they no longer actually sell or support, just isn't going to happen.)

That's it. All it takes is agreement on two things (who assigns numbers and where to go to look for names) and a physical connection, which could be anything from a periodically dialed phone connection to a direct physical line. And it's a given that middlemen would evolve immediately to make it unnecessary for everyone to connect to everyone else — that's a lesson we've already learned. Then middlemen would evolve to connect the middlemen, and voluntary groups would come into being to reach consensus on protocols and standards, and we'd be back where we started, having moved everyone who wants freedom, and is not denied it, over to the new internet, leaving behind the censors, tyrants, and those unfortunate enough to be compelled to live under them.

There's a shorthand term for all of this: the Internet routes around censorship. The Internet was designed to survive a nuclear war; the UN doesn't have a chance.

UPDATE: Little we can do but acquiesce? What are they going to do to make me (or anyone else) change the addresses of a. b. c. and so forth? Frown at us really hard? Issue an ultimatum? They can kiss my named.ca.
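
For anyone who hasn't met named.ca: a BIND name server finds the root of DNS through a plain hints file, and pointing a server at a different root is a matter of editing that file. A sketch, with an invented root name and a documentation-reserved address:

    ; root hints for a hypothetical alternate root
    .                      3600000  IN  NS  A.ALT-ROOT.EXAMPLE.
    A.ALT-ROOT.EXAMPLE.    3600000  IN  A   192.0.2.1

Distribute a file like that (and an addressing authority to go with it), and every resolver that loads it is on the new internet.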

Posted by jeff at 7:15 PM | TrackBack

September 26, 2005

No, No; It's Worse Than That

Francis Porretto talks about the necessity of supporting software after the product is released, and the concomitant effects on the programmers caused by managers' failure to understand that necessity beforehand — or often afterwards. But it's actually worse than that. Consider the proper way to design and release software, the Holy Grail of the software engineering movement (in contrast to programming, which is an art):

  1. Gather business requirements.

    This is the process of getting the information from the people making money for the company, or doing the organization's business, as to what their problem is and what would constitute a suitable solution to that problem. These requirements are ideally listed in a set form including several methods of organization (requirement type, originator, urgency, and so forth), and should include a concrete use case (that is, a statement of what a user would be doing to cause the requirement to come about, and how the software should react in that case). This is how you understand the organization's — your customer's — goal.

  2. Develop technical requirements.

    This is the process of determining how to meet the customer's business requirements. Technical requirements are the things that must be done to create a system that fulfills the customer's business requirements. In particular, this requires that each technical requirement note which business requirements caused it to come into existence, and whether the technical requirement is necessary, sufficient or both to meet that requirement. Each technical requirement must include a test case, that is, a list of instructions of things to do in the completed program, and what result should be found from doing so. (A sample entry appears after this list.) This is your strategy for meeting the customer's needs. These should be reviewed by the customer and signed off as valid prior to beginning any development.

  3. Develop a project plan.

    This is your plan for implementing your strategy in code or system (usually some mix) form. It should include a detailed list of tasks, each of which notes which tasks must be complete before it can be begun and/or completed, whether it is a milestone (a significant event to be reported to the customer, usually triggering further review and approval processes), what resources (including money, people, hardware, building space and so forth) are necessary to begin and/or complete it, and its anticipated duration. The development of the plan is what allows you to set a budget and an estimated completion date, along with known or estimated error probabilities, so that projects contingent on the completion of this project can plan for contingencies. All of your time estimates should be very conservative, or you will be blindsided by events (like when an employee wins the lottery and the project gets set back because you have to find a replacement, or when a backup fails at a critical time).

  4. Develop the product.

    This is the process of executing tasks to complete the plan. This process is managed by a project manager, usually the same one who developed the plan, whose job includes doing whatever is necessary to complete the project on time and on budget with the committed resources. This also includes, in any reasonably complex project, numerous project reviews and sometimes in-line changes to the business requirements, and the associated adaptation of the strategy and plan.

  5. Test the product to requirements.

    This is the process of putting the product through each test case of each technical requirement, and then through each use case of each business requirement, to ensure that the project's goals have been met. There are different types of testing, but all of them come down to verification that the project is complete and correct.
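
To make the requirement-to-test traceability of step 2 concrete, a single entry might look like this (the numbering and content are invented):

    Technical requirement TR-17 (from business requirement BR-04; necessary):
    lock an account after five consecutive failed logins.
    Test case: attempt five logins with a bad password, then one with the
    good password; the sixth attempt must be rejected and an alert logged.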


This is how projects should work. All too often, this is how projects actually work:
  • The IT director has a budget set in advance of the current year, usually based on a vague notion of what projects will need to be done. Many of the projects only have a name, with no actual description beyond that.
  • The IT director talks to someone in the business, or sometimes just gets a notion into his head, and kicks off an IT project in response. The initial email and perhaps a few follow-up meetings between the IT director and an IT manager constitute the "business requirements".
  • The IT manager goes to the project manager and architect, and tells them what technology base to implement on, what final date (and frequently intermediate dates) to hit, and what resources they have to use.
  • The project manager and architect scramble to fit a project into that schedule that meets the requirements, such as they are, as best they can manage. This usually involves cutting functionality, misunderstandings, shortcuts, bad programming practices, failure to document and many other problems.
  • The software is delivered, generally on time or close to it, and the business once again wonders what the point of an IT department is, after all. Outsourcing decisions are often made at this point.

If you find yourself on an IT project where the deadlines and budget were set before the project was designed, the odds are that the project is going to be a failure — often an expensive failure — even if it delivers on time and on budget, because the odds of such a project meeting the business requirements and performance expectations, being maintainable, and having many reusable components are vanishingly small at that point.

Oh, and another tip: embrace the use of open source software, people, for infrastructural bits of code like logging, database access and so on. For that matter, embrace open source applications where they are useful and have large enough user communities to ensure their future development. It will save a lot of time and money otherwise spent reimplementing things that are already solved problems.

Posted by jeff at 11:57 PM | Comments (3) | TrackBack

June 12, 2005

Delete and Ban

I have been wanting for some time to be able to delete a trackback ping, and at the same time ban the IP (same with category pings and comments). Since SixApart hasn't done the work for me, I figured I'd do it myself. In the extended entry are the diffs to enable this functionality.

Of course, there are the caveats. This has only been tested on one installation with two blogs, and only on MT 3.16. Even then the testing has been limited so far, though I suspect it will get a more thorough workout over time. This only works with trackback pings, category pings and comments. (Hopefully, it works correctly with all of those all of the time, but no guarantees on that.) This is not a plug-in, but an alteration to the basic MT code. As such, it carries more risk than installing most plug-ins. Please don't muck up your app, then whine to me about it. In other words: use at your own risk; no warranty; that word does not mean what you think it means; you can't handle the truth. Oh, and the template change allows for translations, but I didn't do any localization at all.

If you use this, please let me know. If you use this and have problems, definitely let me know so that I can fix them. If you are from SixApart, and want to incorporate this, please feel free, especially if you can improve it.

UPDATE: One thing I should note: there are no checks to make sure that you haven't deleted this IP before, so if you are deleting several items from the same place, your banlist will likely include the same IP several times, which is a little inefficient. I'll probably fix that at some point.

UPDATE: OK, I've fixed the code so that it only puts the IP in the ban list once. If you've already updated your CMS.pm with the original code, I suggest going back to the original and then proceeding from the diff in the extended section. I also fixed the display of the code. All of the -> operators weren't showing up correctly.

Here is the change to <MT_BIN>/lib/MT/App/CMS.pm

1832a1833,1846
>             if (($type eq 'ping' || $type eq 'ping_cat' || $type eq 'comment')
>                 && $q->param('deleteAndBan'))
>             {
>                 use MT::IPBanList;
>                 my $existing = MT::IPBanList->load({ 'ip' => $obj->ip, 'blog_id' => $q->param('blog_id') });
>                 unless ($existing) {
>                     my $ban = MT::IPBanList->new;
>                     $ban->blog_id($q->param('blog_id'));
>                     $ban->ip($obj->ip);
>                     $ban->save
>                         or die $ban->errstr;
>                 }
>             }

And here is the change to <MT_BIN>/tmpl/cms/delete_confirm.tmpl

56,64d55
> <TMPL_IF NAME=TYPE_COMMENT>
> <input type="submit" name="deleteAndBan" value="<MT_TRANS phrase="Delete and Ban">" />
> </TMPL_IF>
> <TMPL_IF NAME=TYPE_PING>
> <input type="submit" name="deleteAndBan" value="<MT_TRANS phrase="Delete and Ban">" />
> </TMPL_IF>
> <TMPL_IF NAME=TYPE_PING_CAT>
> <input type="submit" name="deleteAndBan" value="<MT_TRANS phrase="Delete and Ban">" />
> </TMPL_IF>
Posted by jeff at 3:18 PM | TrackBack

June 6, 2005

Apple, Intel, Eeek

Actually, the "Eeek" is not that serious. The big thing I'm worried about in Apple switching to Intel CPUs is not performance (Apple convinced me that they could handle this kind of transition when they went from 680x0 to PPC), but Classic. We have a lot of software that we use - kids' games, cross-platform CDs for education, old versions of apps we don't want to pay $100s to replace and that still work, apps that we need that have never been made available for OS X and so on. If Classic isn't supported, and it's hard for me to figure out how it would be, unless Apple is running a 680x0 emulation layer on a PPC emulation layer on Intel (gurgh!), we can't move to the new machines, at least not completely, until we can find replacements. Of course, if you look at Apple's 2% sales vs. 16% installed base, the suggestion is that Macs last a lot longer, which has certainly been the case for us, so maybe we won't need to upgrade for a long time.

UPDATE: Wizbang has some interesting and intelligent thoughts. Sadly, no comments, because of the near-inevitable religious flamewar nature of OS bigotry.

Posted by jeff at 6:17 PM | Comments (3) | TrackBack

May 26, 2005

Dancing the Happy Dance

I've been waiting for this for a while: the release date for the RedHat Directory Server (based on the Netscape DS 4.x codebase) is June 1. Frankly, I will be amazed if OpenLDAP has much mindshare a year after that. RHDS, assuming it shares the same capabilities as NDS and SunONE (or whatever they're calling it this week), is faster and more flexible, has multi-master replication, better command-line tools, and a management console, stores its configuration in the directory (in a separate suffix), and has better system management and logging support.

With this being open sourced, I can see a few modifications that can make this the (almost) perfect directory server (some of which are in SunONE, and some are not): more granular replication, down to the attribute level; preferred-master replication override on a per-database, per-attribute or per-filter basis; distributed single sign-on support (referring writes of the password retry count to the master is a security hole in a high-load environment, and true multi-master is slower than master/hub/replica schemes); logically-consistent data split across multiple back-ends; support for a query language more flexible than LDAP queries, translated by a front end query engine implemented on the server; and so on. Most importantly, I can reference the source if I have a question. (I once was almost paid double by Sun to answer my own question on a project I was doing for them; shouldn't have told the support guy who they would be calling out to answer it.)
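
For those who haven't written one, an LDAP search filter is prefix-notation boolean matching and nothing more, which is why a richer query language in front of the directory would help. A filter that finds every person named Smith, by common name or surname, looks like this:

    (&(objectClass=person)(|(cn=*smith*)(sn=smith)))

There is no way to express a join, a computed comparison between attributes, or an aggregate in that syntax; a front-end query engine could translate all of those into multiple searches on the server side.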

OK, admittedly, some of these things won't get done - or won't get done quickly - but now I know that if I really need them for a project or a client, I can hire the team to get them done. (Yes, some of the projects I've done are big enough that they could justify that.) In the end, the key advantage of this to me as a directory consultant (and to my clients as directory users) is that an enterprise-class directory is available that can be customized to any purpose the client requires. Moreover, this is an ideal platform for an open source access and identity management solution. (Need more time: anyone care to pay me to design and help implement an open source access and identity management solution?)

As an aside, I got this story from SlashDot. If anyone tells you OpenLDAP is ready for the enterprise, or SecureWay is a good directory server, run - don't walk - away from them. Do not hire them as a directory architect, as they are clearly either incompetent or smoking something illegal.

Posted by jeff at 10:46 PM | Comments (2) | TrackBack

August 17, 2003

Technical Support

Note: this is a post recovered from my old blog, before it died of an insufficient backup. Any comments/trackbacks on it have not been brought over, but can be seen with the original. The date is that of the original posting.

I am not going to tell you about my current technical support woes, because you've heard it all before. I will just say this instead: if you are, say, Microsoft; and if you buy a company, say Connectix, with good products like, say, Virtual PC; and if you basically hang up on the customers when they call support - not even having the decency to hang up in person, but letting the automatic system hang up for you; and if you right before hanging up tell them to go to your support website; and if you then make it impossible for customers to get support on the purchased product over the web because the product doesn't have one of your product numbers; and if you provide no mechanism for customers to say "Hey, I'm trying to get support but I can't even get to the point where I can try to get support"; then you can forget about getting any money out of me to update that excellent product, which I otherwise might have done at some point, when I start using it again regularly.

I hate companies which have the easiest-to-use sales systems in the world, combined with the most impossible-to-use (and expensive if you don't want it to be impossible) support systems. It's basically telling your customer "Thanks for the money, but don't bother us when we've screwed up." And it's not a way to get my money.

The desktop computer industry contains an overabundance of such companies.

UPDATE: OK, the reason that I use VirtualPC is to run Project and Visio, which don't have Mac versions. When you're doing IT work, people expect you to have Office, Visio and Project, and are stunned when you don't. When you are doing this on contract, it's not good for business to tell your clients to pick another format to send you. VirtualPC has the added bonus that you can play PC games on the airplane, and run the occasional Windows-only program you just need to use once - like the console for Checkpoint firewall.

I am most likely about to start doing contract work again, and so would have most likely upgraded VirtualPC were it not for the lousy support I've gotten. So instead, I'll be buying ConceptDraw to replace Visio, and FastTrack Schedule to replace MS Project. (I currently own both Visio and Project.) So MS loses a potential upgrade sale of $100 (plus future upgrades of VirtualPC, Visio and Project) and two other companies get a total of $550 plus future upgrade revenue. All because of an inability to get minimal service on a product.

Posted by jeff at 12:00 AM | TrackBack

July 13, 2003

Eliminating Spam

Note: this is a post recovered from my old blog, before it died of an insufficient backup. Any comments/trackbacks on it have not been brought over, but can be seen with the original. The date is that of the original posting.

Steven Green points to this John Dvorak article, and asks: "can some of you smart network-type people tell me if any of these ideas [for reducing spam technologically] are doable without needing an entirely new email system and software"?

I'm not going to go into detail on all of Dvorak's proposals, but I will say that he misses the point, for an entirely good reason. The Internet was built as a survivable network, meant in fact to survive nuclear strikes on some of its nodes and still keep going. It was also designed to assume security: all of the users would have to be authenticated by a node before obtaining access to the network, and all of the nodes were trusted. This was an excellent architecture in 1988 (when I first got onto the Internet), because all of the users were accountable for their actions - to their employer, or their university, or their government agency, etc. As a result, abuse was a minor problem (mostly incoming freshmen at colleges) and easily dealt with. But the system dealt with abuse - the human part of the system, not the network part. You could actually be shamed off of the Internet back then. Removing the human enforcer of netiquette and good practice was like giving the government the power to raise almost unlimited revenue from a very small proportion of the population: corruption and abuse exploded. Open source routing - where everyone passed traffic by the shortest route, so that my traffic could go right through IBM's internal network if that was the fastest way to the destination - was the first to go, replaced by a few backbone networks, with the multiply-connected large private networks protected by layers of firewalls and large IT staffs. (This increased the brittleness of the network at the same time that it increased security.)

So how do we fix this? We cannot build a safe, secure and reliable system on top of an inherently insecure, self-regulating and untrustworthy network. We would have to build a new network from the ground up. And to do that, we'd have to scrap everything except the physical layer - the NICs, wires, routers, bridges, modems and so forth would be all that's left. The intelligence of the network would have to change, some of it drastically.

The first problem to solve is design: how do you create a secure, trusted network, which at the same time still allows connections from anywhere to anywhere, using any protocol, as the default? How do you do this without excluding people, without forcing a central controlling agency (brittle, arbitrary and power-hungry as any bureaucracy), and without limiting the ability of people to use the network in reasonable ways? How do you manage traffic in such a way that the network can be flexible, without overwhelming small companies who are multiply-connected by passing external traffic through their networks? How do you provide authentication and authorization, in other words, to a global network using only local resources?

It turns out that if you are willing to start with a blank page, it's not that difficult. The major issue to be solved is trust: how do I know whom I'm talking to? Since you don't want a brittle network, that will fall apart when something happens to a small number of nodes, the only trust model that works is to authenticate yourself to some local authentication source. For example, an ISP or a company would provide a directory which lists all of their users, and contains the information necessary to trust that user. Each node, then, has to trust its neighboring nodes. (This is already the case today, in that you cannot establish a physical connection to another node without being their customer or their provider, or entering into an agreement to do so.) Each node would need to cryptologically authenticate itself to its neighbors, and vice versa, and each user would have to authenticate themselves to their node. No node would pass traffic that did not include its partner node's connection key in the message portion of the packet, and no node would alter that information.

On the good side, this lets any traffic be traced back to its source. You cannot fake traffic as being from a node that you are not from, because the network will not pass on the message unless it has authenticated the upstream source - that is, you - and your information is in the packet header. You could insert bogus preceding-node information in the header, but the trace would still be verifiable back to you, the sender, and no further. On the bad side, this would dramatically increase the amount of traffic on the Internet (by increasing the size of any given packet) and would slow down all traffic, bandwidth being equal, because of the overhead of nodes authenticating to each other and the larger byte-count of a given data set.
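
To make the per-hop stamping concrete, here is a toy Perl sketch of one hop's worth of it — the field layout and key handling are invented for illustration, not a protocol proposal:

    use Digest::SHA qw(hmac_sha256_hex);

    # Each node appends its ID and an HMAC over everything it received,
    # keyed with the secret it shares with its downstream neighbor.
    sub stamp_packet {
        my ($node_id, $link_key, $packet) = @_;
        my $mac = hmac_sha256_hex($packet, $link_key);
        return join '|', $packet, $node_id, $mac;
    }

    # The downstream neighbor, knowing the same link key, can verify
    # exactly who handed it this packet -- hop by hop, back to the sender.
    sub verify_last_hop {
        my ($link_key, $stamped) = @_;
        my ($packet, $node_id, $mac) =
            $stamped =~ /^(.*)\|([^|]+)\|([^|]+)$/s;
        return hmac_sha256_hex($packet, $link_key) eq $mac ? $node_id : undef;
    }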

Let's say, though, that we were to use that or some similar measure of trust to guarantee that the network was trusted. Now, we still have a whole host of problems to solve, because the IP spec would have to be rewritten at a pretty fundamental level. This means that the NICs would need updated firmware, or for cheap cards that have their firmware burned into the hardware, the whole NIC would have to be replaced. Then, you would have to rewrite the network stack to take account of the new protocol. You'd have to rewrite all of the protocols like UDP, FTP, SMTP, POP, NNTP, LDAP, SSH and the like. Some of these would be huge efforts, while others would need few or no changes. The OS network stacks would have to be rewritten for each OS, and some applications would need to be rewritten as well, if they deal with the network information at a low level.

All in all, it would be a huge load of work, and not likely to get done as an organized effort. The better way to do it would be to set up such a network privately, amongst friends as it were, and expand that to their friends, and their friends, and so on, and so on... And if you were to gateway to the global internet, you'd lose a lot of the benefits right there.

So in practical terms, I don't think it's going to happen.

Posted by jeff at 12:00 AM | TrackBack

July 11, 2003

Can I Just Have the Wire, Please?

Note: this is a post recovered from my old blog, before it died of an insufficient backup. Any comments/trackbacks on it have not been brought over, but can be seen with the original. The date is that of the original posting.

Aubrey's cable modem problems reminded me of something that's bugged me for years. Why is it so difficult to find high-speed Internet connectivity without non-network services?

Internet service breaks down into two parts, network and service. The network piece includes physical connectivity, addressing, routing, and (optionally) naming services (DNS and reverse DNS). The service piece includes email hosting, web hosting, file service hosting (FTP in particular) and so on - in other words, all of the things you do that need a server of your own (including, say, hosting a weblog). Note that for web browsing, instant messaging and the like, there is no need to have a service - that comes for free with the network, because your end of those applications runs entirely on your own system.

Now, it's certainly true that most people will need both network and service pieces. Even someone who only wants to surf the web and do email needs a service provider for email. However, there are numerous service providers perfectly willing to provide these services (frequently for free or at very low cost) to anyone able to connect to them.

What I cannot find is an abundance of companies willing to provide a high-speed network connection - and nothing else - for a reasonable price. If you want a high-speed connection, you can lease a T1 or better (for about $1200 per month with no restrictions, with fractional connections not much cheaper - and, in that case, not much faster than a good modem), or you can get some form of DSL or cable modem.

The problem is, getting a good DSL or cable modem connection is still expensive for a home user. I pay about $100 per month for mine, to get a commercial DSL connection at 768K down/384K up with 5 static IP addresses and no other services. The reality of it is, the marginal cost to the provider (Verizon, in my case) is almost certainly less than $5 per month. (I have worked on setting up services for several ISPs, and the large cost is in providing the services, not the network.) I could chop that in half, keep the data rate, add email and web (and, I'm sure, other) services, and lose the static IP addresses. At that point, I'd have to go out and pay additional costs to run this weblog, because Verizon's home user plans don't allow for what it would take to get MT to run, and I would have significantly less control over my other server-based services (which I currently provide for myself).

Certainly there is a shortage of IPv4 addresses, but they are available, and the big providers (including mine) have them. Most people would be fine with a pure network connection with dynamic addresses, and a service plan from either their provider (more convenient) or some third party (likely better service and cheaper). The few who want static addresses could be charged more. Still, the connectivity charge in a major city for high-speed service should be on the order of $20 to $30 at most, presuming that the user is getting no more than 16 (raw, 14 usable) addresses, and that includes a hefty profit. Add another $10 to $20 per month for services from a third party (or $30 to $50 per month for services from the line provider) and you'd still be able to save a bundle, while the providers would still make a profit.

So why is it so difficult to get just a high-speed connection, with static addresses and no services? (OK, I suspect that it's an outgrowth of some FCC regulations creating an effective-monopoly or near-monopoly environment, but even so it seems to me that someone would want to take the money they could get just providing Internet connectivity.)

Posted by jeff at 12:00 AM | TrackBack

March 26, 2003

Corporate IT and Outsourcing

Note: this is a post recovered from my old blog, before it died of an insufficient backup. Any comments/trackbacks on it have not been brought over, but can be seen with the original. The date is that of the original posting.

It's worse than this makes it out to be. First, my credentials, because I don't have sources to cite: I've worked in IT for 12 years, almost the entire time in enterprise-level jobs. I've worked as a consultant, a manager, and an admin in various for-profit companies.

OK, that said, the state of IT is worse than Bigwig's article makes out, because he only considers outsourcing as it compares to internal IT departments. Internal IT departments are themselves very inefficient. For example, I worked on a project once which spent a year and millions of dollars to build a production environment that was ill-conceived to begin with. When it was finally working and doing what it was supposed to do (for more money and in more time than was actually necessary, but at least it worked), it was immediately taken out of production because the new VP decided to do things differently. This is more typical than not. It has been said that 90% of IT projects fail, and as far as I can tell, that is true.

So why do big IT projects fail? Generally, they are political, which means that cancelling them is also a political, rather than a business decision. They tend to be thought of by the corporate sponsors in business terms without regard to technical considerations, and by the implementors in technical terms without regards to business considerations. There is a strong desire to shave short term pennies by spending long term dollars.

Another example: I once worked on a call and problem management system for doing technical support of software products. The support contracts were fiendishly complicated. Within a few seconds, the person on the phone had to be able to tell the caller whether he could or could not place a support call as that person from that company for that product on that computer. This was in the mid-to-late '90s, so the computing power of the enterprise-class machines we had then was somewhat less than that of the laptop I'm typing this on. But the product we developed for internal use worked, worldwide and for many products, and fast enough. Because it worked so well, we were asked to expand it out into the other software support lines within the company.

Well, one group had a problem-management system from the time when buying a computer meant that you had lifetime support for free; it ran on mainframes and cost some $40M every year to run. It was so heavily invested, and had so many applications written against it, that it was deemed too important to get rid of. But their application couldn't do the entitlement piece of support (are you contractually entitled to get support?), so it was decided that their mainframe-based problem management system would be used with our UNIX-based client/server entitlement system (the problem management part of our product was to be left to wither and die). To do this, a client had to be developed, taking a year and a half, to talk to both systems.

From a user-interface standpoint, the mainframe-based call management system was so bad (and the graphical client was basically a screen scraper for that app) that the support staff had to grow in order to field the same number of calls. The entire app slowed down, because of the delay of going to two systems instead of one. With the amount of money it took to run that system for one year, we could have finished development of our system (it was rewritten from scratch to work for the whole company, and to eliminate accumulated cruft) and supported it for a decade with constant development. This is common, though: use the more expensive system in worse ways, because we already have it, rather than replacing it with something better. Note: this was considered a great success for the IT group of which we had by then become a part.

To put this in non-IT terms, imagine if you had a car which carried 2 people, got 8 miles to the gallon, and cost $40000 per year in maintenance. Imagine further that you could replace this with a car costing $20000, which carried 6 people comfortably and got 35 miles to the gallon. No-brainer, right? Not if you are in corporate IT, because maintenance comes from a different budget than acquisitions, and it is almost impossible to repurpose funds. (I feel for government heads of department; I really do.)

So what works? Generally, the products I've seen developed by corporate IT which work are those which are developed under a single management chain, generally by a small group of really good people using no formal development methodology, where they are trying to solve real user problems, as opposed to big-picture problems (like, optimize our supply-chain management). These products can grow over time to encompass a larger number of problems for more groups. They tend to go astray when they get big enough and important enough to become politically useful to other management chains. The fight for control is done over requirements, and it results in the biggest fish in the fight getting bigger, and the product getting more expensive and less useful over time.

It's not just me who sees it this way. Most of the IT people I've ever known (and I know a lot of them) see it similarly. (Two programmers at a company I know of were experts in a product. They were transferred to a group with a bigger-fish manager, and put to work on something completely outside of their expertise. The new manager didn't need their skills; he just wanted the headcounts to increase his importance.)

So why don't companies work this way, in general? Well, for two reasons. First, most people in charge of most companies have no clue of how to tie IT projects to business needs. Second, even when they do know how to do this, there is a tendency for projects which are efficient, useful and well-managed to become so important that the power-seeking managers with pointy hair and no clue end up taking them over to stroke their egos, then making bad decisions (if the takeover process was not itself fatal).

Not that I'm bitter, or anything. Time to go catch up on Dilbert.

Posted by jeff at 12:33 PM | TrackBack

March 13, 2003

Theft? Uh, No

Note: this is a post recovered from my old blog, before it died of an insufficient backup. Any comments/trackbacks on it have not been brought over, but can be seen with the original. The date is that of the original posting.

Steven Den Beste is apparently having problems with other sites linking directly to images hosted on his site, rather than copying the images and hosting them on the other sites. In response, he set up his server so that requests for his images with a referer from outside his site will instead get this image:

(and yes, I'm aware of the irony of referring you to his site for the image).

I take issue with the image. The word "theft" is much over-used these days, but I expected more from Den Beste, who generally thinks things through fairly well. (OK, except about computers, and maybe this is where the cognitive dissonance comes from.) If I call you on the phone, I am theoretically depriving you of the use of your phone line for other purposes for the time you are talking to me. But it's not theft, because in reality you have the opportunity to not answer. In effect, what Steven is doing is telling his server not to answer the line if a request comes in for a big image. Fair enough. But it's rudeness, not theft, to link to another person's large images when they don't want you to do so.
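
For anyone who wants to do the same polite not-answering-the-line, the usual Apache recipe looks something like the following — mod_rewrite assumed available, with an invented domain and image name:

    RewriteEngine On
    # Allow empty referers (direct requests, some proxies and firewalls).
    RewriteCond %{HTTP_REFERER} !^$
    # Allow requests referred from our own pages.
    RewriteCond %{HTTP_REFERER} !^http://(www\.)?example\.com/ [NC]
    # Never rewrite the notice image itself, or the rule loops forever.
    RewriteCond %{REQUEST_URI} !hotlink-notice\.gif$
    RewriteRule \.(gif|jpe?g|png)$ /hotlink-notice.gif [L]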

Posted by jeff at 12:00 AM | TrackBack