Wikipedia's founder builds an open-source search engine

Jimmy Wales invests in a "distributed" effort to index the Web, part of his plan to build an open alternative to Google.


Farhad Manjoo
July 28, 2007 12:52AM (UTC)

Note: This post has been updated below.

Jimmy Wales, the co-founder of Wikipedia, announced today that his for-profit community-hosting site Wikia has deepened its investment in developing an open-source Web search engine. Wikia purchased Grub, a company that makes a distributed Web-crawling program. Instead of having a single set of computers index the Web -- as Google and other search engines do -- Grub passes out the indexing work to computers across the globe.

You can download the Grub client to make your own computer pitch in on the indexing work. While you're not using it, the machine will scan the Web and send back its index to a central server; your scan, combined with input from others running the Grub client, will form the index that will power Wikia's open-source search engine.
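To make the division of labor concrete, here is a minimal sketch in Python of the general idea: each volunteer machine indexes the pages it scans, and a central server merges those partial results into one index. This is not Grub's actual code or protocol. The function names (`index_page`, `merge_indexes`), the example URLs, and the simple word-to-URL index are all assumptions made for illustration.

```python
from collections import defaultdict


def index_page(url, text):
    """Hypothetical client-side step: map each word on one page to its URL."""
    partial = defaultdict(set)
    for word in text.lower().split():
        partial[word].add(url)
    return partial


def merge_indexes(partials):
    """Hypothetical server-side step: combine partial indexes from many clients."""
    merged = defaultdict(set)
    for partial in partials:
        for word, urls in partial.items():
            merged[word] |= urls  # union the URL sets for each word
    return merged


# Two imaginary volunteer machines each scan one page and report back.
client_a = index_page("http://example.com/a", "open source search")
client_b = index_page("http://example.com/b", "open web index")

# The central server folds both reports into a single searchable index.
central = merge_indexes([client_a, client_b])
print(sorted(central["open"]))
```

A real crawler would also have to schedule which machine visits which URL, cope with duplicate and stale reports, and guard against bad data from untrusted clients; none of that is modeled here.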

Wales, who was speaking at the O'Reilly Open Source Convention in Portland, Ore., announced that Wikia has turned Grub into an open-source program; the company hopes for input from developers all over the world.

Unlike Wikipedia, Wikia's search engine will run as a for-profit venture. Gil Penchina, the CEO of Wikia, has said that the company hopes to one day reach 5 percent of the search market -- a number that sounds small but that could be quite lucrative. But because the project is open-source, anyone else could build a competing search engine, whether for profit or not, based on the same index, Wales pointed out to me this afternoon in the briefest of phone conversations (we had some kind of cellphone issue).

Wikia sets out several guidelines for its open engine: It will be transparent -- the algorithms determining how results are ranked will be visible to all. Google and other engines invest huge sums to develop these algorithms, and they guard them extremely closely. But that's precisely why Wales believes we need an open search engine -- the world, he says, must have an alternative to a Web that's ranked by "invisible rules inside an algorithmic black box."

But Wales isn't looking for transparency for transparency's sake: The project rests on the idea that community involvement will improve on today's search results. Whether that's possible seems a gamble; Wikia has not announced a timeline for the project's debut. A search engine is a huge undertaking, and there's something nearly crazy about the idea of doing it with volunteers. But then, Wikipedia and every other open-source project seem somewhat impossible, too; that all those people could make something together doesn't seem likely. Miraculously, though, these projects work -- and the same thing could happen for search.

Update: I just got back in touch with Wales. He clarified, first, that not only will the open search engine take contributions for its source code but community members will also be actively involved in the editorial process governing search engine results.

"The idea would be a wiki-like process where the community can whitelist URLs, blacklist URLs, control for spam, block users who are being bad, that kind of thing," Wales says.

Wales says that Wikia will have a simple front end for the search engine built by the end of this year -- a place where people can "enter a search term and get some results." He adds, "We expect that it probably won't be very good at that point, and we'll probably have to put a big disclaimer on the site: 'We know this isn't very good; please help us to make it better.'"

I asked Wales if it's possible he's too late in starting this -- is Google too entrenched to beat? "Sure," he says. "I could fail. I have no idea. But I'm going to have fun trying."


Farhad Manjoo

Farhad Manjoo is a Salon staff writer and the author of True Enough: Learning to Live in a Post-Fact Society.
