Don’t be afraid of the big bad Gmail

Privacy advocates are frothing about Google's plan to scan e-mail for advertising purposes. A report from an early tester of the service says their concerns are overblown.

By Mathew Honan

Published April 26, 2004 7:30PM (EDT)

I’m not scared of Gmail. I’m not at all worried that my privacy is about to be invaded by the world’s most popular search engine company. Call me brave, call me crazy, but I’m not. Nor should you be.

Gmail is the new Web-based e-mail service from Google that offers a gigabyte of storage space to users. Google announced Gmail on April 1 and has been handing out a limited number of beta accounts since then. The company still doesn’t have a firm release date, but says it will most likely be within the next three to six months. More beta users are being added every day. On April 20, Google began doling out Gmail accounts to “active” users of its weblogging service, Blogger.com. A representative of Google says that there are currently not quite “tens of thousands” of Gmail users, and that it wants to roll out more accounts incrementally. If it’s as good as Google search, and it is, it won’t be long before that number hits the tens of millions.

I was fortunate enough to be one of the early beta testers. Here’s my report on how it works, and why you shouldn’t let it frighten you.

Gmail grew from an internal program to help Google employees better manage their e-mail. Among the program’s most useful features is its incorporation of Google’s search technology right in the inbox. Finding a particular message based on its content has never been easier in a Web-mail application.

“I think the principal issue that we had to contend with early on was the fact that I had a lot of e-mail,” says Google co-founder Sergey Brin. “In Unix, all of your mail is stored in a mail spool file. Oftentimes, people I know would just open up a raw spool file. You can’t do anything but it’s very, very fast, and you can use other Unix tools such as grep. That was the inspiration as much as anything.

“Fairly early on, we decided we wanted an internal search for e-mail. I wanted it for myself. I found I was eventually using it for all my e-mail. And as a consequence of that, I decided everyone should have a version of it and that’s been the more recent effort.”

But Gmail is much more than storage and search. It makes extensive use of JavaScript to give users such niceties as keyboard shortcuts, spell-checking, and seamless composition, replying and forwarding. The result feels like a cross between a Web-based application and a standalone program.

“We wanted to have the benefits of both,” says Brin. “We wanted to have the efficiency of a stand-alone application. I think we aimed for that, but I don’t think we went as far as those kinds of things that really try to duplicate apps. We did not want to go that far. If you push it too far, then it really interferes with what people are used to in a Web application and it causes the browsers to be really confused.”

It has a fairly effective spam filter that uses both rules-based and Bayesian filtering. I redirected much of the spam from my old mail account to Gmail, and it caught every piece of spam I threw at it. Brin says that coming down the pike will be even better filtering tools.

Gmail also sports several interesting organizational features. Messages are grouped together in “conversations,” similar to the way some stand-alone e-mail applications thread replies. But Google does it differently, making it easy for users to view the new text in each reply in a thread without reading the full body of the message that’s being quoted back and forth. Furthermore, each response is expandable and collapsible, with previous replies being collapsed by default. When you need to go back and look at previously quoted text, clicking on an individual message within the conversation causes it to expand instantly.

“While there are mail programs that support threading, it’s often not on by default, and it usually has some funny trade-offs,” says Brin. “Ultimately when they do thread them you don’t see all the messages at once. And then you end up reading the same content twice. We’ve dealt with all those issues, and it makes it easier for me.”

Gmail’s labeling is also interesting. Normally, when you want to organize your e-mail, you have to sort it into different folders, and each message can live in only one of them. Gmail uses “labels” instead, and a single message can carry more than one.

Want to keep your work mail separate from your personal mail? Set up a filter that applies a “work” label to incoming mail from your office’s domain. Have friends at work who send you personal messages? Set up another filter that labels those messages as “friends.” Thus when a co-worker sends you a message about catching an A’s game after work, you don’t have to make a decision about where to file it. It can appear in both places.
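
For the technically curious, the difference between folders and labels can be sketched in a few lines of Python. This is only an illustration of the idea, not Google's code: the `Message` and `Filter` classes, the `apply_filters` function, and the `example.com` addresses are all hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str
    subject: str
    # A set, not a single folder name: one message can hold many labels.
    labels: set[str] = field(default_factory=set)


@dataclass
class Filter:
    """Hypothetical filter rule: attach `label` when the sender matches."""
    label: str
    domain: str = ""                         # match the sender's domain, if set
    addresses: frozenset[str] = frozenset()  # or match specific senders

    def matches(self, msg: Message) -> bool:
        sender = msg.sender.lower()
        by_domain = bool(self.domain) and sender.endswith("@" + self.domain)
        return by_domain or sender in self.addresses


def apply_filters(msg: Message, filters: list[Filter]) -> Message:
    # Every matching filter adds its label; none of them moves the message.
    for f in filters:
        if f.matches(msg):
            msg.labels.add(f.label)
    return msg


filters = [
    Filter(label="work", domain="example.com"),
    Filter(label="friends", addresses=frozenset({"pat@example.com"})),
]
msg = apply_filters(Message("pat@example.com", "A's game after work?"), filters)
print(sorted(msg.labels))  # ['friends', 'work']
```

Because the labels are a set rather than a single location, the co-worker's ballgame invitation ends up under both “work” and “friends” without anyone having to choose.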

Like all the major free-mail services, Google is relying on ads to turn a profit. But Gmail ads are different; they aren’t just appended willy-nilly to the bottom of your message (they don’t show up in the body of your message at all, in fact). Gmail robots automatically scan the text of your incoming messages, and then use that information to deliver targeted ads and related links that appear next to incoming messages.

“The goal is we should only show you an ad when we think it’s going to be useful to you, based on the same technology that you see on many Adsense Ads,” says Brin. “It’s the exact same technology that tries to figure out what from our advertisers is likely to be relevant to the reader.

“It will try to find concepts in the message that match concepts we’ve associated with advertisers, and it will also know how well those various ads performed on all the Adsense sites in our network.”

How well do the ads perform? It’s a mixed bag.

In preparing for this story, I asked several Gmail beta testers to forward the ads they received to me. An e-mail Jessamyn West received about cheese prompted Gmail to serve several cheese-related links, but no ads. Similarly, an e-mail I sent myself to test the system that contained several San Diego travel-related phrases prompted Gmail to serve San Diego-related links, but again, no ads. But more often than not, even the most obvious-seeming messages failed to produce either ads or related links.

Yet when I added specific products or corporate names to the mix, Gmail typically homed in on them immediately. It also seemed to have a much easier time with technology-related subjects. When a friend bemoaned her corporate e-mail server’s tendency to bounce messages over a certain file size, Gmail offered up a sponsored link that would remedy the problem. Messages about Wi-Fi and Web hosting also triggered ads and sponsored links. When I added a few company names to the travel message listed above, Gmail fired back with relevant links. When I sent myself an e-mail with Hoover’s data on Costco, Gmail served both sponsored links to Costco and related pages. Furthermore, it’s smart enough to figure out that just because an e-mail mentions Costco, that doesn’t mean that a Costco ad would be relevant, as evidenced by the H.R.-related ads it served in response to this message.

But it also has a tendency to be goofy. Andre Torrez forwarded a message where Gmail keyed in on an ad appended to the bottom of a Yahoo mail message that had nothing to do with the body of the message. A spam message with a body full of unrelated words triggered two related links and an ad, while a bake sale fundraising e-mail from MoveOn.org triggered ads for “delightful candy bouquets.”

Unlike traditional Adsense ads, however, Gmail ads don’t collect, or reveal to advertisers, the terms used to deliver the ads. “We’re not using any of the data people are clicking on in the messages,” says Brin.

In fact, the company is collecting so little data on user click-throughs that Brin claims to be unaware even of how many ads per message Gmail is serving.

“We did not set a target ratio and, to be honest with you, we weren’t even exactly sure as of yesterday what the ratio was,” says Brin. “Because of all the privacy concerns, we can’t get the kind of basic stats that we ought to have. I get the feeling that right now it’s maybe a third of the time or so.”

Brin’s claim that he doesn’t know exactly how many ads Google is serving comes at an odd time, since Gmail’s highly targeted in-box ads have quickly raised a massive privacy stink. Even as we spoke, California state Sen. Liz Figueroa was in the process of drafting legislation banning the service from California, which is Google’s home state.

“We’re in the process of drafting a piece of legislation that would really ban the Gmail concept that Google is pushing forward,” Figueroa told Salon. “Our premise is that it is an invasion of privacy. They are scanning for content purposes. I know that e-mail is scanned in a general way but this is a scan in a specific way for marketing purposes.”

When Salon posed the question of what, exactly, was wrong with this, if users knew about it beforehand, Sen. Figueroa noted that many users don’t read privacy policies, and pointed out that third parties who send e-mail to Gmail users are unknowingly submitting to having their e-mail scanned.

Indeed, the third-party question is at the heart of the legislation, California S.B. 1822, which Figueroa introduced the day after speaking with Salon. The bill would forbid the review of e-mail content unless Google (or any other e-mail provider) first obtains the consent of all the parties to an e-mail conversation: senders, receivers, everyone. If it passes, the bill will not only block Gmail from scanning incoming e-mails, but it could also end up prohibiting employers and other e-mail providers from filtering e-mail for objectionable content.

“I think more and more people don’t have the time [to read agreements], or just don’t realize how far it’s gone,” says Figueroa, who acknowledges that she hasn’t used Gmail. “And yes, somebody’s got to say for everybody ‘privacy is really an issue for me. I want you to think of that foremost on the agenda so we start with that.’ We have to make Google be aware that this is a product that’s somewhat offensive to a lot of people.”

Oh, but they are. They are.

“I think a lot of people got up in arms before they saw the product,” says Brin. “That was a little unfortunate the way we released it. As a consequence of that, most people who have heard about it, the only thing they have access to is the privacy policy. In many cases, there’s misinformation out there. I think many are misunderstandings of the product.”

For example, one of the early rumors swirling around Gmail, which sprang from unclear wording in its privacy policy, was that e-mail would be archived forever, regardless of whether you delete it. Regardless, even, of whether you close your account. Not so, says Brin.

“That was our fault for not having carefully enough worded the privacy policy,” he notes. “It’s exactly the same as any other Web-mail services. We have a variety of backups because we never want to lose mail. We do in fact delete the messages. It just sometimes takes a while to propagate the messages through all the proxies.”

But in the meantime, Privacy International filed a complaint against Google in 16 countries on April 19.

“Google is showing its true colors. The company pays lip service to privacy but in this case has demonstrated no real commitment to it,” fumed P.I.’s Simon Davies in a press release. “I am beginning to suspect that Google looks at privacy in the same way that a worm looks at a fishhook.”

Far from it. The privacy issue is overblown. Indeed, as Sen. Figueroa herself points out, virtually every piece of e-mail sent across the Internet is already scanned by robots, be it for spam or viruses. If you have a problem with robots reading your mail — with or without your consent — you’re going to have to go back to the U.S. Postal Service, or start encrypting everything. Similarly, the demands that Google erect a “wall” between Gmail and its other sites — such as Search, Groups or Orkut — are not only preposterous, they are counterproductive to the best interests of consumers. Not only that, but it appears that Google’s critics are holding it to a different standard than its competitors. Yahoo and MSN (Hotmail) both collect vastly more personal data that can be linked back to e-mail accounts, including address books, search data and even stock portfolios.

Sometimes, data should be aggregated. It makes for more convenient, useful applications. Would you rather your e-mail not be linked to your address book? Do you really want to have to log in again and again when you try to navigate across the Yahoo suite of Web sites? Is it intrinsically bad for MSN Messenger to notify me when new e-mail arrives in my Hotmail account?

More important, says Brin, is protecting the data you choose to allow Google to handle.

“We treat the data with great care and we never let any personal data get out in any form,” he says. “If you log into Orkut, yeah it’d be nice to see if you have unread messages, so I don’t think that it makes sense to have complete walls. A single e-mail can be very, very sensitive. Therefore we have to treat every single bit very carefully. I’m not sure the aggregation makes that huge of a difference, because ultimately we have to be very guarded about all data. But we will use good judgment.”

What remains to be seen is whether its critics can be counted on to use good judgment as well.
