Google's restoration of digital history relied on a few heroes' packrat mentality and a mountain of decaying mag tapes.
On May 11, 1981, one Mark Horton, then a graduate student at the University of California at Berkeley, using the e-mail address “ucbvax!mark,” posted this message to the Usenet newsgroup Net.general:
Rusty is right (or is that “Rusty is Wright”?)
- we have ALL in our .ngfile so I tend to forget
this. ALL.ALL may or may not work, but
ALL certainly does. Mark
Then, the ancient Internet scribe added this ominous postscript:
I plan to make the change on Tuesday
unless something horrible happens.
Horton’s message was a response to a previous post, the intact original of which is now lost to history, from one “sdcarl!rusty,” aka Rusty Wright. With this incomplete fragment of a cryptic exchange, the history of Usenet, as we have it today, begins.
The message is the oldest Usenet posting in the 20-year archive, now searchable on Google. It’s the first of some 700 million posts that record Usenet from its early history to the present. Usenet is the sprawling public bulletin board, composed of a vast hierarchy of newsgroups, that grew up alongside the Internet itself.
Granted, this message doesn’t exactly have the ever-quotable and historic ring of Alexander Graham Bell braying on the first telephone call, “Mr. Watson. Come here. I need you.” But then, it’s not the first Usenet message ever; it’s just the first one captured in this vast, yet still incomplete, archive of Usenet’s 35,000 topic categories. It’s an ordinary exchange between two of the first few hundred denizens of Usenet posting back in 1981.
Still, if you squint, you can see glimmers of what’s to follow in this poignant gem of a fragment. What are these geeks talking about, anyway? It’s a meta-post about the system itself, of course! It’s part of a technical discussion of how Usenet should be administered. And catch that corny play on words, goofing off Rusty’s last name: “or is that ‘Rusty is Wright’?”
Geeks talking amongst themselves on Usenet about how Usenet should best be run, while having fun with homonyms: Almost 20 years later, has anything really changed?
In mid-December 2001, Google unveiled its improved Usenet archives, which now go more than a decade deeper into the Net’s past than did the millions of posts that the company salvaged from DejaNews. Now on a browser near you: a glimpse of the prehistory of the Net culture we all take for granted today. The first “me too” post! The first “Make-Money-Fast” post! It’s enough to make even a relative newbie nostalgic for a past she never experienced firsthand.
The debut of the archive touched off a flurry of chatter among the geeks on Slashdot, some of whom had been there back in the day. There were some grumbles. Imagine what it’s like to see your flames from 15 years ago, when Usenet still had the population of a small town, now searchable by anyone on the Web.
“Glad I’ve changed my e-mail address since those long, (best) forgotten days. It wasn’t me, I swear,” joked one poster to Slashdot. Another one griped: “It’s like having naked baby pictures of yourself stapled to your forehead when you walk around.” (Google vows that at the author’s request, they’ll delete old posts; so if you want to be the Internet equivalent of a rare-book burner, go right ahead.)
Google gets the credit for making these relics of the early Net accessible to anyone on the Web. Michael Schmidt, 29, a Google software engineer, spent the last year and a half playing detective, trying to track down the Internet’s lost history: “It was a long and painful investigative process. I was searching on the Web, calling people. There were a lot of dead ends.”
But it was the geeky pragmatism and historical foresight of Usenet old-timers themselves that actually saved the early history of the newsgroups so that we can all poke around in it today. These “archive donors,” whom Google thanks here, gave their copies of the millions of messages they’d saved back to the Net.
The tale of how early Usenet was saved begins with one of the Net’s great old-timers: Henry Spencer. “Henry Spencer is the real hero, because his contributions are what makes this historic,” says Schmidt. “Back in the Stone Age of the Internet, he was already archiving this stuff, and he was the only one doing it.”
Spencer, a legendary Unix hacker — a species not exactly known for humility — is pleasantly understated about his role as Usenet’s great early archivist. He’s the first to point out that he wasn’t really the only one saving those early messages. But the copies he kept of Usenet postings from 1981 to 1991 appear to be the only ones that still exist. “There were several other people who were archiving stuff, but all of them gave up before we did, and as far as I know none of their archiving survived,” he says. For instance, legend has it that two guys at Bell Labs kept back-ups as well, but their stores of these ultra-rare posts are nowhere to be found.
“I’m very glad the stuff is finally out there, and I can stop worrying about how the only copy might get lost,” Spencer says, now that Google has assured the preservation of the more than 2 million old messages he saved. “I’m just glad that this particular great mass of data is no longer my worry.”
One of the early adopters of the computer language C, Spencer is known for his Ten Commandments for C Programmers, as well as for being the coauthor of C News, one of the early programs for transferring and reading Usenet messages.
Now 46 years old, he works as an independent consultant, but back in 1981 he ran the computer facility at the University of Toronto’s zoology department. The geeks over in the university’s computer science department were busy with the Arpanet, but the Department of Defense’s system was too expensive for the zoologists.
“The zoology department may sound like a funny place for pioneering networking work,” says Spencer. “But the computer science department wasn’t very interested in this inferior networking. It was very low-tech by their standards. But it worked and theirs didn’t. Their opinion changed fast when we started providing e-mail.”
That’s how, in the spring of 1981, with a 300 baud modem, the zoology department at the University of Toronto became a central distribution point for Usenet, when the network was just 2 years old.
Traffic was almost unimaginably lighter in those days. Only about 200 people had access to Usenet: “In the first few years, it was at least plausible to come in in the morning and read all the Usenet traffic that had come in, and 15 minutes later be off doing something useful,” remembers Spencer. But even that low level of traffic was too much for the storage requirements of the day. “Pretty soon, it was necessary to think about expiring old stuff,” he says.
It wasn’t a sense of historical importance that initially led Spencer to think about creating an archive. His motivation was much more pragmatic than that: Most of the conversations on Usenet at the time were very technical, and he was reluctant to see the information in them disappear, because it might be useful to the university’s geeks: “A lot of the early traffic was about things like Unix systems bugs, and it seemed unwise to just throw it out.”
So the archiving began with 40 megabytes filling up a new mag tape — each reel one-half inch thick and 10 inches in diameter — every few months. In this era, messages from the outside world came in at the tortoise rate of 300 baud. (“When we got a 1,200 baud auto-dialing modem, that was just wonderful. Twelve-hundred baud was just total luxury,” Spencer recalls.) As Usenet grew, this meant that Spencer and his system administrators had to be selective about which newsgroups they received and archived, keeping technical conversations but throwing away some of the more general discussions that generated a lot of traffic.
“We started dumping stuff that we thought was obviously of no future use, groups that specialized in a lot of talk and no substance, so to speak. For example, fairly early on there was a newsgroup about abortion which specialized in violent arguments.”
That’s why not only the very earliest Usenet posts, from before Spencer started archiving in 1981 (Usenet began in 1979), but even some of the posts from the 1980s are still lost. It’s too bad; today, wouldn’t more of us rather see what was being said about abortion in 1984 than sift through the arcana of bug fixes in systems that have probably been long since retired? “It was perfectly reasonable from the viewpoint of stuff that we might want to use again, but a little sad from today’s viewpoint,” Spencer admits.
For 10 years, the nine-track mag tapes piled up, hanging in a huge rack at the zoology department’s computer facility. Finally, in the early ’90s, with the growth of Usenet outpacing the zoology department’s budget for $15-a-pop tapes, the general archiving project ended.
In the spring of 1991, Bruce Jones, then a grad student in the communications department at the University of California at San Diego, flew to Ontario at his own expense. He was writing his Ph.D. dissertation on the history of Usenet and was eager to get his hands on Spencer’s tapes.
The 141 tapes, most of which held 120 megabytes of posts, now lived at the University of Western Ontario, thanks to a road trip in the middle of the Canadian winter that David Wiseman, the university’s network administrator, had taken earlier that year to unburden the University of Toronto’s zoology department of them.
Jones would spend the next two weeks rescuing the data off them. Not only was the tape technology rapidly becoming obsolete — just try to find a working tape-reader today — but the tapes themselves do not have anything like a 10-year shelf life.
By now the historical import of the tapes was already apparent. But spending two weeks running tapes through a tape-cleaning machine and dumping them on disks was the prerequisite to even looking at them. “Spencer had written a program for removing data from tapes when the tapes went bad,” Jones explains. “I was just the first person who was willing to invest my time and money — a lot of people wanted to see what was on them.” In two weeks, Jones got through the first 105 tapes.
“Usenet has always been about arguing about itself,” Jones says of the posts that were unearthed. “And the arguments that you see today are the same arguments that go way back into the early ’80s, and I’m sure that those arguments will continue well into the future.”
Case in point: the fact that the older parts of the archive are now available on Google has given Usenet denizens something new to argue about. “I’ve already gotten three letters from people accusing me of trying to make money off these archives,” Jones observes wryly. All the “archive donors” gave the posts to Google for posterity.
Over the next 10 years, Wiseman got through the remaining three dozen or so tapes by wangling the time and energies of “bored graduate students.” But by 1995, constrained by university budgets, the archiving project was running out of disk space.
So, Brewster Kahle, the creator of the Web’s other major archiving project, the Internet Archive Wayback Machine, chipped in, donating a then-humongous nine-gigabyte hard drive to the cause.
In the end, they pulled more than 2,056,000 posts off the 141 tapes. “It took us 10 years. I got so busy and everybody else got less interested,” says Wiseman, almost sheepishly. More than 2 million posts: It doesn’t sound like a lot compared to the 700 million total in Google’s archive, but they’re the oldest remnants.
Apparently someone is still interested. Wiseman used FTP to hand off the files to Google. And just after Google announced the availability of the archive, some rogue used FTP to grab the whole archive off the University of Western Ontario’s FTP server — all three gigs of it transferring in one night. “I have no idea what they plan on using it for, since if it’s spam e-mail the addresses are all wrong,” says Wiseman. Now, anyone who wants a full copy will have to ask politely first — it’s no longer on the server.
Google filled in the more recent posts not covered by the old DejaNews archive thanks to Jürgen Christoffel of the German National Research Center for Information Technology, who’d kept his own archives in the ’90s, and Kent Landfield, a network security developer and the maintainer of FAQs.org.
Landfield started archiving with entrepreneurial motives. In 1992 and 1993, while at Sterling Software in Omaha, Neb., Landfield had a side project that sold CDs of the Usenet archive. For $349.95 a year, every month you could get a CD burned with the content of Usenet. It was an attempt to cater to the user with a slower modem who still wanted access to every newsgroup.
“I realized that there was definitely a valuable historical aspect to the CDs themselves,” says Landfield. “The reality is, everybody thought that. We’re all just a bunch of packrats. We all knew there was a value to it, and it was a matter of how and when it would be used.”
Thanks to these packrats, Google estimates that 95 percent of the posts ever made to Usenet are now searchable from the site. But Spencer, for one, can’t help thinking of all that’s still been lost: not just the other 5 percent of Usenet, but also the other early history of online communication.
Think of the Arpanet mailing lists that were the precursors to Usenet. Spencer points out that while most of the mailing lists kept archives, a significant number of them have been lost over time. “The first flame war, things like that, most certainly dates before Usenet,” he says. “And I would bet that a lot of that material is gone, because at some point, nobody thought it was worth saving.”
Katharine Mieszkowski is a Bay Area journalist who covers science and the environment. A Salon senior writer from 2000 to 2009, she chronicled the dot-com boom and bust as a technology correspondent and co-founded the Broadsheet blog. Her Salon stories have been anthologized in "Panic: The Story of Modern Financial Insanity."
A Yale grad, Katharine has also written for the New York Times, Mother Jones, Ms., Rolling Stone, Glamour and Reader's Digest, while her commentaries have appeared on National Public Radio's "All Things Considered." In 1994, she joined her first Internet start-up, Women.com, then known as Women's Wire. Since then, she's also been a writer for Fast Company magazine covering Silicon Valley and a columnist for the San Francisco Bay Guardian investigating local subcultures. In 2001, she was named one of the Top 25 Women on the Web by San Francisco Women on the Web.
Katharine, who grew up near Houston, now lives in the San Francisco Bay Area with her husband and daughter.