Paradoxically, it's the Web's very reliability that makes it hard to preserve. "Web logs specifically run on one server, so there's only one copy," explains Brewster Kahle, the Net's best-known archivist, who is trying to capture and preserve the Web as part of his nonprofit Internet Archive.
"In the old Usenet days, you couldn't count on the network, so there were hundreds of copies," says Kahle. "But the Web was built for a network that was reliable. Having one copy is the wrong answer. This is not going in the right direction."
Collin Ong, 30, a Sacramento, Calif., computer scientist, is one of the many Usenet habitués who migrated to Web-based forums in the late '90s, when the booming online population overran Usenet with spam and newbies. Even so, he's troubled by the implications of the evolution from Usenet to the Web: "With forums, each message is pretty much trapped on those sites. I haven't seen any Web forums where messages propagate to other sites. Think of all that's being lost in Web forums that aren't being archived -- and there is no clear, systematic way to do so."
Dan Eicher, 35, knows the unique frustrations of losing precious virtual conversations -- information that still exists somewhere but can't be accessed. As a hobby, the Indianapolis systems engineer takes an interest in the old TI-99/4A computer. For more than a decade, the online service Delphi maintained a forum where several hundred aficionados of the old machines talked about their nerdling obsession and exchanged software for it. But in 1998, Delphi, which was migrating to Web-based forums, canceled the host's contract and shut down the TI-99/4A discussion area.
Eicher and other forum members tried to obtain the old posts from Delphi, but even appealing to the top didn't work. "The bigwigs decided it's not worth their time. So, it's dead. It probably lives on the tapes somewhere, but access to it is nonexistent," says Eicher. Sharing information about an obsolete computer, generated by the users who still love it -- isn't this exactly what online culture was supposed to be all about?
Luckily, vigilant forum members saved the software and support documents from the Delphi TI-99/4A forum, but the users' posts still haven't made it back into circulation. Jerry L. Coffey, 59, a retired mathematical statistician in the Washington, D.C., area who hosted the late forum, says that he has backups of the posts. But even in retirement he hasn't gotten around to going through the 100 megabytes of messages to sort the public messages from the private ones.
"What's been lost is the active message base. It's still around. It's just not in a format that people can get at," says Coffey. "Several people have approached me to try to put it together, and I just haven't found the time. It's just such a chore."
The sorry fate of the TI-99/4A posts on Delphi is a perfect illustration of the archivist's "too few copies" nightmare. The Internet Archive has tried to solve this problem by simply making another copy -- of everything. The Archive does periodic sweeps of the whole Web, taking snapshots of everything that's there. At last count, it had gathered 100 terabytes of data -- about 100 trillion bytes. But this work, while massive, is Sisyphean, in that the Archive just can't get every page -- there's too much. And Web forum postings, in particular, are often generated by dynamic database queries that frustrate the Internet Archive's page-gathering, anyway.
The nonprofit archive has asked the commercial search engines to join the effort by pledging to donate all the files they collect to the Internet Archive. But so far, the only company that's done so is the niche player Alexa Internet, a company founded by Kahle, which was later purchased by Amazon.com.
"All these companies think they're immortal," says Kahle. "It's hard to say to people: 'At some point your company is going to be in the hands of a bunch of lawyers.' "
Who at Google wants to think now about what will happen to those Usenet archives if in five -- or 10, or 50 -- years, their company goes belly up? Ironically, it's hard to get technology executives to think about the future -- the future in a preservationist's terms, that is.
About the writer
Katharine Mieszkowski is a senior writer for Salon Technology.
