Any decent programmer can sketch out an algorithm. It takes some special inspiration to explain one by telling an old joke about a painter named Shlemiel.
The connection between software and Yiddish humor may not have been evident until Joel Spolsky began writing his Joel on Software essays and blog in 2000. But now that Spolsky has demonstrated it, the payoff is clear. For starters, Spolsky’s writing is a lot more fun to read than most of the mountain of verbiage under the “how to write software” rubric. And once you’ve read his explanation of the dreaded “Shlemiel the painter algorithm” — a clumsy way of writing a routine that involves repetitious round trips, each one longer than the previous — you won’t easily forget it, whether you’re a programmer who has learned to avoid a dumb mistake or anyone else who might spy Shlemiel-like behavior in the world beyond RAM.
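For readers who have not met Shlemiel, the pattern can be sketched in a few lines of Python. This is an illustration only, not code from Spolsky's essays: the function names and the step counter are assumptions made for the example. The first routine walks from the start of the buffer to find the end before every append, so each trip is longer than the last. The second simply remembers where it left off.

```python
def shlemiel_concat(parts):
    """Append each part after rescanning the buffer from the start."""
    buf = []    # stands in for a string whose end is not tracked
    steps = 0   # characters walked past while hunting for the end
    for part in parts:
        pos = 0
        while pos < len(buf):   # walk all the way back out to the end
            pos += 1
            steps += 1
        buf.extend(part)
    return "".join(buf), steps


def remembering_concat(parts):
    """Append each part at a remembered end position; no rescanning."""
    buf = []
    end = 0     # always equals the current length of buf
    steps = 0   # stays at zero: nothing is ever walked twice
    for part in parts:
        buf.extend(part)
        end += len(part)
    return "".join(buf), steps


text, steps = shlemiel_concat(["ab"] * 4)
same_text, no_steps = remembering_concat(["ab"] * 4)
```

With four two-character pieces the first version walks 0 + 2 + 4 + 6 = 12 characters and the second walks none. The gap grows roughly with the square of the number of pieces.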
Spolsky, a professor’s son who grew up in New Mexico and Israel (where he served in the armed forces), worked as a program manager on Microsoft Excel in the early ’90s and later for the free e-mail provider Juno before cofounding Fog Creek Software. His entertaining and crystal-clear essays have placed him in the very small class of programming pros — from Frederick Brooks near the dawn of computing time to Ellen Ullman, Paul Graham and Dave Winer today — who are able to write about their insular world in a way that wins the respect of their colleagues and the attention of outsiders.
The best of Spolsky’s essays have now been collected between soft covers in a generous volume with a fanciful title: “Joel on Software: And on Diverse and Occasionally Related Matters That Will Prove of Interest to Software Developers, Designers, and Managers, and to Those Who, Whether by Good Fortune or Ill Luck, Work With Them in Some Capacity.”
Amid the clank of old heating pipes at the Fog Creek office atop an old building near New York’s Penn Station, I spoke with Spolsky recently about what’s wrong with software development today, what Microsoft has in common with his grandparents and what schemes for automatically generating code have to do with the writing of Isaac Bashevis Singer.
Why, after decades of ostensible progress in the field, does it still seem so hard to make good software?
Nobody would be surprised if you told them it was hard to do open heart surgery. OK, fine, so maybe software isn’t open heart surgery. But it has about the same number of moving parts, it’s probably just as complicated, and it’s critical in a different way. But there’s something weird about designing software specifically: Everybody thinks they know how to do it, even when they have no training. They just think, Oh sure, I can do that!
There are many different schools of thought on better ways to organize software development. Do you subscribe to any of them?
There’s certainly a lot of faux methodologies, what I often call “big-M” methodologies, extreme programming being a very popular one right now. And even when they’re reflecting good ideas or best practices, the real goal of the methodologies is to sell books, not to actually solve anybody’s problem. And selling the books is actually just a way to sell consulting engagements that the people who write those books do at high cost; that’s their career — giving speeches to people working for very boring companies on how to do software better.
The key problem with the methodologies is that, implemented by smart people — the kind of people who invent methodologies — they work. Implemented by shlubs who will not do anything more than follow instructions they are given, they don’t work.
And the smart people are probably the ones who would have done fine without them, anyway.
Maybe it would have taken them a little longer to get to the correct methodology. Which is why I still believe that there are what business consultants call “best practices.” I hate that term, but that’s what Joel on Software tries to be: Here are some best practices for this, and that, and the other thing. There’s no set of instructions, but there are all kinds of guidelines.
There’s about 8 million things I can tell you. I can tell you, when you do a beta test you’re not going to have a single beta tester who’s willing to install every single beta version of your software — and do a complete test from head to toe of that version. You’re going to have people who try it out on the first day; they’re going to report a bunch of bugs, and that’s going to be the last you hear from them. So if you have 500 beta testers, you need to give 50 of them each of 10 releases so that you actually get some feedback from the third, fourth, fifth, sixth, seventh beta. So that’s a little tiny thing — it doesn’t tell you anything about software management, it’s just a piece of random useful advice. The world is full of those things, and they don’t add up to a methodology.
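The beta-test arithmetic above (500 testers, 10 releases, 50 fresh testers per release) can be sketched as a simple partition. The function name, the tester IDs and the release labels are hypothetical, chosen only for illustration.

```python
def assign_testers(testers, releases):
    """Split testers into equal, disjoint cohorts, one cohort per release.

    Any remainder left over after even division is not assigned.
    """
    per_release = len(testers) // len(releases)
    return {
        release: testers[i * per_release:(i + 1) * per_release]
        for i, release in enumerate(releases)
    }


testers = [f"tester{n:03d}" for n in range(500)]   # hypothetical IDs
releases = [f"beta{n}" for n in range(1, 11)]      # hypothetical labels
cohorts = assign_testers(testers, releases)
```

Each release gets 50 people who have never seen an earlier build, so the later betas still receive first-day attention.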
It’s almost more like folklore.
Yeah, it’s just an art. There should be a guild! It’s like a medieval way of training people to do things, but it worked pretty well. The closest thing I have that people consider to be a methodology is the “Joel test,” and I don’t even consider that a methodology — it’s just a quick way to judge whether a team is doing well or badly.
A set of criteria or yardsticks.
It’s more like CMM [the "capability maturity model" from Carnegie Mellon's Software Engineering Institute] without the headaches of CMM: a quick way to find out if a software team is good or not. It’s certainly possible to fake it.
Microsoft scores pretty high on the Joel test, right?
I think Microsoft is now well in the advanced throes of getting everything 100 percent right in terms of the discipline and just no longer producing products people want to buy with any level of consistency.
You mean that the process is well tuned, but it has lost the bigger picture of what direction the army should be marching?
Partially, and partially also the process is overtuned. There’s this other phenomenon that happens.
I remember once, when my grandparents were alive but were old, they were trying to go out to a restaurant with me one night, and it took a half-hour from the time they were standing by the door to the time we got out the front door because my grandmother kept saying, “Do I have my keys?” Of course she did have her keys. “Did I turn on the burglar alarm?” She went and turned on the burglar alarm. And then she said, “Do I have my keys? Go check if the garden door’s open; I think I was outside today.” “Is the water running?”
Basically, by the time she got to be 65, every mistake she’d ever made in her life she had corrected by creating a new procedure by which she made sure that she never made that mistake again. For example, before she left the house, she double-checked that she had her keys, the burglar alarm was on and so forth. So she had been acquiring these habits to prevent making mistakes she had made in the past. And by the time she got to be 65, it took a half-hour to run through the whole checklist!
What has happened at Microsoft is that someone discovers, for example (this is something I found on a blog at Microsoft, it’s a real story), that in Turkish there are two letter I’s. There’s I with a dot and I without a dot, and in Turkish a capital I without a dot becomes a lowercase i without a dot, whereas in English a capital I without a dot becomes a lowercase i with a dot. So if you’ve got some code that’s checking if something is the word “integer,” and it’s doing it by actually comparing the strings, and if your code just happens, in order to make it a case-insensitive test, to lowercase the thing first, that works fine, until you’re running on Turkish Windows. Then the capital I becomes a lowercase dotless Turkish ı, which does not compare equal to “integer” with a dot on the i, so that code doesn’t work.
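The failure described here can be reproduced in miniature. Python's built-in `str.lower` does not follow the machine's locale, so the sketch below uses a hypothetical helper, `turkish_lower`, to stand in for lowercasing under Turkish rules. The helper and `is_integer_keyword` are assumptions made for the example, not the code from the Microsoft blog.

```python
def turkish_lower(text):
    """Hypothetical stand-in for lowercasing under Turkish casing rules."""
    # Dotless capital I maps to dotless ı; dotted capital İ maps to dotted i.
    return text.replace("I", "ı").replace("İ", "i").lower()


def is_integer_keyword(word, lower):
    """Case-insensitive check done the fragile way: lowercase, then compare."""
    return lower(word) == "integer"


works_in_english = is_integer_keyword("INTEGER", str.lower)
works_in_turkish = is_integer_keyword("INTEGER", turkish_lower)
lowered = turkish_lower("INTEGER")
```

Under the Turkish rules the lowered string begins with a dotless ı, so the comparison fails even though the input is plainly the same word. Comparisons of this kind are usually safer when done with locale-independent case mapping.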
So I found on a blog some guy that was going through thousands and thousands of lines of code trying to find these cases. And he’s writing about it on his blog so that nobody else would ever make that mistake. And you can sort of see that at some point the guy would say, “Before we ship, let’s just make sure there’s a representative of the Turkish Windows team in the room signing off.” You can imagine them making up some system to prevent this mistake. And now it’s gotten to the point where, as Adam Bosworth says, it takes 16 months to get out any release of any Microsoft product. They really can’t steer that ship very fast. They spent something like a year on SP2 [Windows XP Service Pack 2] — it’s a good thing they did that, but it’s almost all cleanup, maintenance, catch-up, for security reasons.
It’s what the Army calls fatigue. Fatigue is everything in the Army that you do to keep your equipment in good working condition: polishing your shoes, brushing your teeth, making sure that you’re ready and that all your bullets are clean and there’s no sand in your gun. It’s all called fatigue, and it takes about two hours a day for an infantry guy. And it’s everything but the actual thing you’re trying to do. Microsoft has now got to the point where it’s like 80 percent, 90 percent fatigue. So even though they’re still scoring a [perfect] 12 on the Joel test, we need another category, which is “and you’re not Microsoft.”
Another problem is that everybody tries to learn about business from Google, Microsoft, eBay and Amazon — and they’re such wacky exceptions. They don’t really apply to you. I say, Microsoft makes their own gravity — they could ship a brown paper bag called Microsoft Brown Paper Bag 1.0 and hundreds of thousands of people would buy it. Or at least try it.
How did you end up working at Microsoft?
It was my first job out of college. I did a summer internship at Microsoft before my senior year. Didn’t like it. Went back full time, didn’t like it. [Laughs] Got transferred to Microsoft in New York. What I didn’t like was living in Seattle — I wanted to be on the East Coast. I transferred back to New York, spent another summer in Seattle working on MSN 1.0. Didn’t like it. Eventually realized that I was just not going to like living in Seattle, and gave up on Microsoft.
I spent a few years on Excel when it was really sort of at the peak. It wasn’t like the early days of Excel when they didn’t know what they were doing. They had learned all the good tricks, and they were performing at their peak. And of course after I left everything went downhill, because it always does. The Excel team at some point decided they didn’t really need to hog all the good people, and so they started farming out some of the best people to other teams. Once you have 100 percent market share, there’s just really no point.
Because there’s some point at which you have a product that does what people want it to do, and you’ve basically solved all the difficult problems?
Or you’ve completely hypnotized yourself into believing that you can’t solve the remaining problems, such as the limit on the number of columns that Excel has, the fact that cut and paste doesn’t work right — all those little problems.
So you say, “Those will never be solved — let’s do something else”?
Exactly. It’s kind of weird but everything that’s happened in Office since about 1995 has been more or less user interface churn. Oh, let’s put in the paper clip! Let’s take out the paper clip! Or slightly redoing the toolbars, and then you have to do it for every single product. There’s all this flotsam that’s not the actual core functionality of the product but just crap on top of it. It’s good crap, but once you get to 100 percent crap …
Have you been writing all along in your career, or is it something new?
Only for four years. Joel on Software started in summer 2000 or so. I’d written papers in college.
You weren’t writing away on the side and at some point decided to go public?
No. My job at Microsoft was program manager, and I was told my job was to write the spec, and I didn’t realize that I could have, like many other program managers, found various excuses to not actually write a spec. So I did it. It was long and detailed. What I was doing there was really writing to a large extent. I got a lot of practice with the use of the typewriter.
Your company makes a content management system and a bug-tracking tool. Which came first?
We really started three companies: a consulting business, where we made a lot of money in the first two months, then that market just completely collapsed in November 2000, and we were back to just the two founders working on the software side. We produced CityDesk, which did OK, nothing to laugh at, but it wasn’t a megahit, it wasn’t really on fire. It had some design decisions which I wouldn’t have made today, necessarily, and I still feel like there isn’t a heck of a lot of money to be made from blogging tools, basically.
The bug tracking, however, became enormously popular. We sort of had a way of doing it that resonated with people. We were very high on the usability score; we had a ready-made audience in the Joel on Software audience of people that might consider using it to spread the word. So we put most of our effort into that, and we consider that to be our lead product right now. We’re about to come out with the first beta of version 4.0.
There are so many stories of software start-ups that set out to make one thing and then find their real product in a different direction. And there are so many people making tools for developers.
That’s a common mistake I never would have made, and that’s why this was our lowest-priority thing. The mistake comes from the fact that you say, “I would want this, I would buy it — surely everybody else would buy it!” And you don’t realize quite how exceptional you are. And there’s never that large a market for development tools, although there seems to be a pretty big market for FogBugz.
I keep hearing that the market for small or medium-size independent software vendors — anyone besides Microsoft and open-source providers — is dead.
Rubbish. Completely not true, because I sell software to all those companies. There’s actually lots of software companies out there that nobody’s ever heard of because they’re in a niche. You ever heard of Blackbaud? I think it’s a couple of hundred people; they’re in, like, South Carolina. And they do software that you use for fundraising, if you’re a charity or a charitable organization. And they’re making piles of money selling to charitable organizations. They’re the No. 1 package in that particular market — Microsoft’s never going to go into it. There’s tons of good software businesses out there. Besides which, for all the fear of Microsoft, they’ve not really demonstrated the ability to move into a new market since the days of Office. How long has it taken them to get anything out of MSN that works?
Eight, nine years?
And the strategy changes every single year, and it’s never very good. You probably remember when Microsoft was trying to do content.
You mean, when was it, 1997, and they were doing “shows”?
Yeah, shows. And they were all terrible. You know why they were terrible? They were tone-deaf. It was like they knew that there should be some kind of creative person. And a creative person should generate something called content that could be thrown up on a Web site, throwing in some advertising. But they just had no ability to generate that stuff.
Can you look at software development today and say, “Things will continue the way they are, it’s just the nature of the beast,” or is there some —
Silver bullet?
Sure, silver bullet, or just some elephant in the room that we haven’t seen yet about how to do things better.
You’re familiar with [Frederick Brooks'] “No Silver Bullet” article. That’s a pretty strong theme. But it’s not like Gödel’s proof, like we now have a proof that there’s no silver bullet.
In some fields you do reach a point where there’s a giant leap of some kind that changes the way people work and think.
There have been a couple of quantum leaps in the history of programming, possibly three. But they’re not the things everybody thinks.
First of all, there are a lot of people who say they’re going to invent some new thing where the business manager is going to specify what they want the software to do and it will mechanically translate that specification, and brrrrt! The trouble is, this is a perpetual-motion machine. We keep inventing it, keep making companies to ship it, they’re always wrong. The fundamental problem that you’re trying to solve here is that humans think of things in vague, mushy terms. In order to visualize something, they don’t have to actually visualize every part of it. Whereas the programmer, in order to actually implement that thing, to create it, needs to have every part specified.
So you can imagine an airplane without even knowing what a rudder is, yet without a rudder an airplane’s not going to work very well. Partially that’s just the way the brain works. The brain is totally optimized for just looking at one thing. Just think about how the eyes work. When you’re looking at the world, you imagine that you have an extremely high resolution, extremely large screen full of billions and billions of pixels in front of you. Yet actually your eye has, I don’t know how many, but under a million pixels. And what’s really happening is, you’re just moving your eye a lot. And your brain is filling in the rest and pretending. Because you can move your eyes so fast, anything you want to see you can see, and anything you want to focus on you can focus on very, very quickly.
So your brain doesn’t actually work the way a computer works. Your brain doesn’t assume that there’s all this input coming in and then process it. Instead, it just has a variety of senses available to it, and it picks the ones it wants to answer whatever questions it has right now. So you ask questions, and your eye goes and finds out the information it needs. So you’re used to thinking that you have the big picture, and you don’t.
And with software, as soon as you start talking about how the pieces are going to fit together, you discover that the things you were imagining are going to be completely ambiguous or impossible or contradictory — instead of being as simple as you say.
One of the biggest mistakes you can make in developing software is to say, “I’m going to build this thing and it’s going to be super easy to use because it’s going to be super simple, it’s going to be unbelievably simple.” And then you say, “OK, how am I going to do footnotes?” Oh, well that’ll be real simple — you just ask for a footnote and you type a footnote, and it’ll just be there. Well, what if the footnote is too long to fit on the page it’s currently on? Um, I don’t know — maybe you have to move part of the footnote onto the next page. Suddenly, it’s not so simple.
As soon as you start to think about how to get everything to really, really, really, really, REALLY work, it becomes much more complicated. So software starts out with a simple vision and it grows a million little hairy messy things. And depending on the quality of the initial vision, those hairy things may be more or less hairy, but they’re going to exist.
And therefore, because software seems so simple and is actually complicated, you can’t implement it until you specify the complication. And all these people that are trying to make the same perpetual-motion machine — where you just write your specification and it automatically becomes code — don’t realize that the specification has to be as detailed as the code in order to work.
In order for it to be translatable into code, or in order for it to become the code.
Right. Isaac Bashevis Singer, I believe, wrote in Yiddish, if I’m not mistaken. But very few people ever read him in Yiddish. The first thing that happened to almost everything he wrote was that it was translated into English. And so it’s almost as if the goal is to create an English-language short story, but you can only write in Yiddish. And you’re going to really have to write the whole story in utter detail, and then it will get mechanically translated.
But now somebody tells you, No, don’t even bother writing the story in Yiddish — just tell me what happens, and I’ll write the story for you in English. Just give me the gist of the story.
So you say, OK, it’s gonna be sort of a love thing, in the shtetl, in Eastern Europe. The guy is a young man, he’s enthralled with the Communist Party. The woman is sort of a traditional Jew from an Orthodox family, and so her father is not very happy with this guy. Make a story out of that. You can’t. It’s not enough of a specification. You’ve got to tell me the guy’s name; you’ve got to tell me what he says and what happens. Otherwise it’s not a story. So because there’s this last-minute phase in which the Yiddish is translated into English, that deceives people into thinking that you could translate anything into Yiddish, including your …
Hold on, I’m trying to follow the analogy here! The Yiddish is the equivalent of …?
The Yiddish is the C++. The English is the machine language code. Just because there is that compilation stage doesn’t mean that the C++ level, the programming language level, doesn’t actually have to specify everything that’s going to happen. And so what a programmer is doing when they translate a quote unquote spec into quote unquote code, although it seems like a translation process, what they’re actually doing is filling in lots and lots of details. And as programmers are wont to do, they’re trying to take something, the vague thing that the humans want, and make it very, very specific, which is the kind of thing the computer wants. That’s not really a translation; it’s more of an interpretation. It’s a very hard thing to do.
The perpetual-motion machine is where you say, let’s just take these business requirements that some business guy came up with and suddenly, instantly make code from them. And that doesn’t work, because the business requirements don’t actually specify what to do with the footnotes when they run off the bottom of the page. And the computer’s going to have to know.
You’re saying there’s a ton of work between the general requirements and the detailed specifications.
There’s requirements, specifications, code — there’s many levels. If you think of the requirements as being this paragraph, and the specification as being this page, and the code as being several pages, at each point, you’re adding more detail. And in fact you can go straight from the requirements to the code, and most people try to do that. The trouble is, it’s like eating your meat without chewing it — you tend to choke a little bit. Whereas the specification is just a hope that before you actually start on the code you can think of as many problems as possible and solve them when it’s still easy to make changes, before you get into the code and it’s more costly to make changes.
So there are a lot of false messiahs in the programming world, which is where the “no silver bullet” concept comes in. On the other hand, there have been big steps forward in productivity; they’re just not as exciting as people think.
Fortran was the first one, where the idea was you would actually write out your equations, and you would be able to write formulas, you know, (I+J) x 2, as opposed to writing LOAD 2, LOAD I, multiply, blah blah blah, in assembler language.
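The contrast can be sketched as follows. The formula is written once as an ordinary expression and once as a list of explicit steps for a toy stack machine. The opcodes `LOAD`, `ADD` and `MUL` and the `run` function are invented for this illustration and do not correspond to any real assembler.

```python
def run(program, variables):
    """Execute (opcode, operand) pairs on a toy stack machine."""
    stack = []
    for op, arg in program:
        if op == "LOAD":
            # Push a literal number, or look up a named variable.
            stack.append(variables[arg] if isinstance(arg, str) else arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


I, J = 3, 4

# The formula style: write the equation as you would on paper.
formula = (I + J) * 2

# The step-by-step style: spell out every load and operation.
program = [
    ("LOAD", "I"),
    ("LOAD", "J"),
    ("ADD", None),
    ("LOAD", 2),
    ("MUL", None),
]
stepwise = run(program, {"I": I, "J": J})
```

Both routes compute the same value. The difference is how much bookkeeping the programmer has to write and read.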
The second thing was Algol, the first procedural programming language, where the idea was you could actually group concepts of subroutines and functions and pieces of code; certain things like the “while” loop were invented, which made another leap forward, in that code could become that much more abstract and modular. A lot of people at this point would add object-oriented programming as a big leap forward. I just don’t include it as a leap forward in productivity. I’m not saying it’s a bad thing, just not as big a leap as these other things.
And the third big leap is what they call the memory-managed languages. Languages with either garbage collection or with reference counting or some other mechanism so you don’t have to decide where memory comes from.
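A small sketch of what this means in practice, using Python as the example language. The `Buffer` class and `use_buffer` function are hypothetical. The code never frees the buffer explicitly, and the weak reference is there only to observe that the runtime reclaimed it. Immediate reclamation when the last reference disappears is typical of reference counting as CPython does it; other implementations may wait for a collector pass, which is why `gc.collect()` is called before checking.

```python
import gc
import weakref


class Buffer:
    """Hypothetical object that owns a block of memory."""

    def __init__(self, size):
        self.data = bytearray(size)


def use_buffer():
    buf = Buffer(1024)          # memory is obtained here, with no explicit allocator call
    probe = weakref.ref(buf)    # lets us observe the object without keeping it alive
    return len(buf.data), probe
    # No free or delete: buf goes out of scope when the function returns.


size, probe = use_buffer()
gc.collect()                    # nudge collectors that do not free immediately
reclaimed = probe() is None
```

The programmer decides neither where the memory comes from nor when it goes back.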
Your essay on “How Microsoft Lost the API War” argues that Web-based applications are winning out over desktop-based “rich client” programs. Is everything headed in that direction?
I think what’s likely to happen is that Firefox right now already has enough market share, or will in the next couple of months, to make it a testing requirement for almost every serious Web developer. Which means that in the next six months, if there are any Web sites that aren’t compatible with Firefox, they will rapidly disappear. Everybody says, “I gotta keep Internet Explorer around for my bank!” Those things are rapidly disappearing.
Once that falls away, you really will have a full choice, Firefox or Internet Explorer, so there will be zero reason left not to go to Firefox. And then you’ll start to see a lot more of a real 50/50 kind of split, or 30/70, whatever. And so now there’s actually a possibility that Firefox might be able to implement some innovations, as long as they downgrade gracefully on Internet Explorer somehow.
So one of the things that might be neat that they could do is, give me a text area, in the text area tag let me say “HTML = true” or something, and that text area becomes a rich edit, and it submits HTML, with little bold tags and italic tags. As a Web designer, I don’t care what the user interface is, that’s entirely up to the Web browser, so Web browsers could show me a little italic button, a bold button, maybe a toolbar and keyboard shortcuts — that’s something the browsers could compete with, how good an HTML editor they give you. So then anybody making any kind of content management system can instantly let you edit rich text, with just this one little tag.
Anyone who uses any kind of Web application would be very grateful.
As soon as you start doing things like that, then the limitations of the Web as a platform will start to fall away very quickly. I don’t know if you’d ever get a serious word processor in a Web browser. Maybe you could.
There’s a couple of other issues which would be nice to solve. One is to give you a way to work offline in limited circumstances. Let’s say you wanted to make Gmail, but Gmail where you could actually take it on a plane and work offline, read your e-mail and send e-mail, queue it up for delivery. And so somebody could develop some kind of standardized system for doing that kind of offline work. Adam Bosworth has been working on this a lot — we’ll see if anything comes out of that, now that he’s at Google. That would be cool.
But the funny thing is, Bosworth has been talking about this same problem for a long time — he’s just obsessed with the person on the airplane. And, lo and behold, airplanes are actually getting Internet connections. And Wi-Fi is spreading like crazy. What’s kind of surprising is that it has turned out to be easier to rewire the entire world for high-bandwidth Internet than it is to make a good replication architecture so you can work disconnected! It’s actually far more likely that this problem will be solved that way, oddly enough.
It assumes either you’re broken or you’re hostile?
It feels like in the last year we’ve begun to see more of that, from Gmail –
Oddpost. Flickr. And there really is a large crowd of people who like having everything on the Web where they can access it everywhere. Not just college students. My dad travels a lot, giving lectures all over the world, and he has switched to Gmail.
There’s more writing about programming on the Web today than ever before — it seems like every developer on the planet has a blog. If this much expertise is being shared, won’t that improve the quality of the software being built?
Partially yes. But the majority of people still don’t read. Or write. The majority of developers don’t read books about software development, they don’t read Web sites about software development, they don’t even read Slashdot. But it is true that Microsoft used to have some severe advantages. They knew this stuff, they’d figured it out, and lots of other people hadn’t. Not just Microsoft — Lotus, whatever.
Places with big groups of developers sharing their tricks.
Yeah. And Digital. That was Digital’s great strength, in those years, though they were more engineering. But even in software, they had a culture of managing engineering. Today, as I look around, even though it’s all on the Web, and there’s a million things you can read about how to make your software project work well and so forth, you still see the same old characters in Silicon Valley that should know better by now, because it’s their third company, making the same mistakes, doing the same things wrong. Making the same silly assumptions.
That’s just human nature — the march of folly.
It’s not even folly. You can sit there and preach, “You should do this!” And even if it’s a very convincing argument, that doesn’t necessarily cause institutional knowledge of why you should do it to be transferred. There’s much to be said for learning from experience.
And the truth is, things have changed so rapidly in programming and computer technology that a lot of the time, when you’re learning from experience, it may be wrong. For example, I wanted to learn how to market software. There’s this great book by this guy Rick Chapman, who has taken over Softletter, the software industry’s newsletter. And he’s got this big thick book — he sells it for like $100 on his Web site — and it’s everything you need to know about marketing software. It’s really great. It talks all about how to get into CompUSA and Egghead, and retail boxes, the companies that will make the boxes for you, and how you get your CDs duplicated, all this kind of stuff. Great details about all the incentive programs you use to make sure that CompUSA puts your box at eye level. All wrong!
All ancient history in the era of downloadable software.
So even if people hung around long enough in this industry to learn from them, you are right to be skeptical of your elders. Your elders may be wrong. I was writing about the benefits of the rich client years ago, and I was wrong. Not thoroughly wrong — there are still benefits. But anybody who listened to me made a mistake.