Famous programs from just a generation or two ago are in danger of disappearing from human ken, forever.
Jul 30, 2003 | For Grady Booch, the nightmare goes something like this: Deep in the future, a team of archaeologists stumbles onto a rare cache of 20th century art, a major assortment of works thought lost to the ravages of time.
The only problem, of course, is that they don't know it. All the images are recorded in an obsolete digital format, JPEG, and nobody knows how to unscramble the data. As a result, the hard disk containing said artwork spends its days not in a museum but as a coffee coaster in some college professor's crowded office.
"It might seem silly now, but put yourself 1,000 years in the future," says Booch, chief scientist at IBM's Rational Software subsidiary. "It's not too hard to imagine."
In an industry where one man's clever C code is another man's Linear B, Booch already knows the frustration of playing software archaeologist. As co-developer of the Unified Modeling Language (UML), a mid-1990s effort to create a common "blueprint" notation for object-oriented software programs, he's spent the last 10 years laboring to spare future programmers the same torment.
It's an uphill battle on a hill that is only growing steeper. With new programs replacing old and no major company or institution playing the central role of source-code archivist, the amount of software history currently circling the memory hole is scarily large. And even if there were a central institution, recent changes to the copyright code have made the transfer of source code from old media to new forms of storage a dicey prospect, legally. Add it all up, and you have the ideal makings for what some are already calling the "digital dark age."
"Things are going to be lost not because people don't want to save them or because the original creators don't want to save them, but because they can't save them," says Brewster Kahle, founder of the Internet Archive, an institution that has lobbied for a safe harbor within the Digital Millennium Copyright Act to shield institutions looking to archive source code.
For Booch, the barriers to software preservation aren't so much legal as educational. Most developers have come to accept the evolvable nature of software programs. What is lacking is the ability to examine static source-code snapshots with a scholarly, comparative eye. In the interest of encouraging that skill, Booch this fall will lead a seminar on software archaeology and preservation at the newly reopened Computer History Museum in Mountain View, Calif.
"Our industry has had a major effect in changing the world," says Booch, talking over the phone from his Denver, Colo., office. "It would be great if we could preserve the artifacts and interview the architects while they're still alive."
Booch isn't alone. Now that the hysteria surrounding Y2K has faded, developers are free to worry about legacy code again. One increasingly common worry: What do you do with it? For every modern offshoot of DOS/Windows, Unix and Macintosh OS evolving with the marketplace, a dozen ghost programs lurk inside yellowed engineering pads, punch-card stacks and slowly degaussing magnetic memories. Even if programmers could get their hands on these programs and find a way to preserve and update their contents, a new question emerges: How do you qualitatively analyze those contents on a historical basis?
"It's funny," says Dave Thomas, a Dallas software consultant and co-author, with Andrew Hunt, of "The Pragmatic Programmer," a 1999 book on software design methods. "Colleges spend a lot of time teaching people how to write code, but very few teach them how to read code. When you think about it, we programmers spend most of our time reading code, not writing code."
To help fill the gap, Thomas served as cohost of the 2001 Software Archaeology: Understanding Large Systems workshop, held at the Object-Oriented Programming, Systems, Languages and Applications (OOPSLA) conference. Starting with the unifying question, "How do you come to grips with 1,000,000 lines of code right away?" conference speakers traded various tips, tools and techniques acquired through professional and personal encounters with unfamiliar systems.
"Whenever we're faced with big problems in software, we tend to fall back on metaphors," says Thomas. "In this case the archaeology metaphor happens to be a good one. Sometimes you do archaeology with a backhoe. Sometimes you do it with a toothbrush."