Spam, Spam, Spam, Spam, Spam, the FTC, and Spam

October 2, 2003
Volume 1, issue 6

ERIC ALLMAN, SENDMAIL

A forum sponsored by the FTC highlights just how bad spam is—and how it’s only going to get worse without some intervention.

The Federal Trade Commission (FTC) held a forum on spam in Washington, D.C., April 30 to May 2. Rather to my surprise, it was a really good, content-full event. The FTC folks had done their homework and had assembled panelists that ran the gamut from ardent anti-spammers all the way to hard-core spammers and everyone in between: lawyers, legitimate marketers, and representatives from vendor groups.

I assume that no readers need to be convinced that spam is a problem. But even I was surprised at how bad it really is. Some panelists said that the spam doubling rate is down to about eight weeks. I checked my spam, and my doubling rate seems to be a touch slower than that—with a 25- to 30-percent growth rate month-to-month (still pretty horrible). With a situation this dismal, all anti-spam filters can do is roll back the clock a bit: a spam filter that lets only 1.5 percent of spam through will be letting through, in one year, as much spam as you are receiving today (assuming the eight-week doubling; if you believe my 25-percent month-to-month growth, you’ll have to wait somewhat over 18 months to be in the same position).
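The arithmetic behind those two figures can be checked with a short sketch. It assumes smooth exponential growth; the function names are illustrative only and do not come from the forum or from any measurement tool.

```python
import math

def weeks_until_back_to_today(pass_fraction, doubling_weeks):
    """Weeks until the spam that slips past the filter equals today's
    unfiltered volume, given a fixed doubling period."""
    growth_needed = 1.0 / pass_fraction        # e.g. 1/0.015, about 67x
    return doubling_weeks * math.log2(growth_needed)

def months_until_back_to_today(pass_fraction, monthly_growth):
    """Same question, with growth expressed as a month-to-month rate."""
    growth_needed = 1.0 / pass_fraction
    return math.log(growth_needed) / math.log(1.0 + monthly_growth)

weeks = weeks_until_back_to_today(0.015, doubling_weeks=8)
months = months_until_back_to_today(0.015, monthly_growth=0.25)
print(f"eight-week doubling: about {weeks:.0f} weeks")
print(f"25 percent per month: about {months:.0f} months")
```

Under these assumptions the first case works out to a little under a year and the second to somewhat over 18 months, in line with the figures above.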

LEGAL AND LEGISLATIVE ISSUES

The FTC being in Washington, it was no surprise that the majority of the discussion at the forum was about legal, legislative, and litigation issues.

The first question was, “What is spam?” This is much harder to answer than it at first sounds. For example, some people define spam as “any e-mail I don’t want to get,” even if the mail is for a list that they really did sign up for. As one panelist pointed out, some people really do want to receive pornography. Most people agreed that getting a newsletter that the recipient has actually requested is not spam. My personal take on the only “reasonable” definition comes down to consent: If you request that you receive something, it’s by definition not spam. However, reselling such a list may or may not result in spam, and not everything unsolicited is spam.

For a great deal of spam, no additional legislation is needed: an FTC study [1] found that two-thirds of all spam is fraudulent in some way, and fraud is already illegal. For example, EarthLink recently won a $25 million case against a spammer under the Computer Fraud and Abuse Act, RICO, the Electronic Communications Privacy Act, and others [2]. Other preexisting laws that have been used include trespass to chattels and conspiracy.

New laws are definitely being proposed. Perhaps the best known is the CAN SPAM Act (S.877) introduced by Senators Conrad Burns (R.-Mont.) and Ron Wyden (D.-Ore.) [3]. Many people feel this bill is problematic: It explicitly permits opt-out marketing and overrides state laws even if they are stronger. Given the explicit permission to send unsolicited e-mail, some forum attendees referred to the Burns-Wyden bill as the “I Can Spam” act.

Burns-Wyden has been amended to allow Private Right of Action by Internet service providers (ISPs). This means that ISPs may sue spammers directly; without private right of action, all they can do is to ask their state attorney general (AG) to sue for them. But, of course, the AG offices are already packed, and as one person from the New York State AG’s office put it, given the choice between going after a spammer that has cost businesses a billion dollars versus a terrorist, it’s clear how they must proceed.

Other legislation in play during the FTC spam forum included the REDUCE Spam Act (HR.1933), sponsored by Rep. Zoe Lofgren (D.-Calif.) and an at-the-time-unintroduced “do not spam registry” bill by Sen. Charles Schumer (D.-N.Y.), now introduced as S.1231. Several other bills have been introduced since the FTC Forum was held.

LEGITIMATE MARKETING

Yes, Virginia, there are such things as legitimate e-mail marketers. And believe it or not, they are probably more upset about spam than you. I do get newsletters that I have requested and really want to read, but often they are classified as spam by content filters.

It wasn’t hard to tell the legitimate folks from the spammers. Several people from existing e-mail marketing companies insisted that opt-in, preferably double-opt-in, was the only way to go; without it, the value of the lists wasn’t high enough. As one of them put it, “We sell quality, not quantity.” The spammers, of course, insisted that their business would be crippled if they couldn’t send blind e-mail, and so opt-out was the only way to go. Interestingly enough, the Direct Marketing Association (DMA), an industry trade group with a history that predates spam, was in strong support of the spammers on this one, even though the legitimate marketers (the people the DMA purports to represent) were saying that they don’t need opt-out.

The question of opt-out versus opt-in was, of course, fairly contentious. There are several ways to run e-mail lists. Going from best to worst:

The first, double opt-in, requires that a subscriber e-mail two messages to get on a list. The first message requests addition of thus-and-such address (this first message can be done via a Web form, e-mail, or even scanned badges at a conference). The list owner then sends a confirmation (“challenge”) message saying, “If you really want to subscribe, reply to this message”—usually with some random number in the subject to prevent guessing. Only when that reply is received is the address added to the list.

The second is confirmed opt-in. It works exactly like double opt-in, except that the confirmation message says, “You have been added; do this if you want to unsubscribe.”

The third is plain opt-in. You sign up for a list and you start getting it. This has the obvious problem that if someone else signs you up, you still haven’t given consent.

The fourth is opt-out (they send you anything they care to, and you get to later request that they stop sending it to you) with a working unsubscribe link.

The fifth is opt-out with no unsubscribe link.

The sixth is opt-out with a nonworking unsubscribe link.

The seventh is opt-out with an unsubscribe link that actually confirms your address as belonging to a live account.

One problem with methods four through seven is that you, as consumer, can’t tell the difference between them. In particular, since you can’t tell the difference between methods four and seven, even experts differ on whether you should ever try to unsubscribe. There’s actually even another case: The unsubscribe link removes you from the list in question, but it also adds your address to another list. To my mind, this is morally equivalent to method seven—perhaps even worse.
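The double opt-in flow at the top of the list can be sketched in a few lines. This is a minimal in-memory illustration, not any particular list manager's implementation; the class and method names are hypothetical, and actual mail delivery is left out.

```python
import secrets

class DoubleOptInList:
    """Addresses join the list only after replying to a challenge."""

    def __init__(self):
        self.pending = {}          # token -> address awaiting confirmation
        self.subscribers = set()

    def request_subscription(self, address):
        # The random token would go in the subject of the confirmation
        # message so that a third party cannot guess it.
        token = secrets.token_hex(8)
        self.pending[token] = address
        return token

    def confirm(self, token):
        # Called when a reply carrying the token comes back.
        address = self.pending.pop(token, None)
        if address is None:
            return False           # unknown or already-used token
        self.subscribers.add(address)
        return True

mailing_list = DoubleOptInList()
token = mailing_list.request_subscription("reader@example.com")
mailing_list.confirm(token)
print(sorted(mailing_list.subscribers))
```

Confirmed opt-in would differ only in adding the address at request time and treating the follow-up message as an unsubscribe offer, which is why it ranks lower: a forged request still puts the victim on the list.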

TECHNOLOGY

There are many technologies available for addressing spam today. Some of them don’t stand alone: For example, whitelists (discussed below) aren’t very interesting without some other technology backing them up. Others can work independently or be combined to good effect.

Whitelists. Simply put, whitelists are sets of e-mail addresses that will automatically skip other spam checks. They are fundamental to challenge/response systems, described below—in fact, you can think of challenge/response as nothing more than an automated way of building a whitelist (where any addresses not whitelisted are rejected). They do have one noticeable problem, however: Since SMTP is not authenticated, if a spammer can guess an address on my whitelist then they can get past my checks. For this reason, authenticated sender schemes are ultimately going to be necessary.

Disposable addresses. Not really an anti-spam technique in itself, disposable addresses can reduce the damage if you accidentally sign up at a Web site that sells its lists. Essentially, you give each Web site a separate address, all of which forward back to you. If one of the addresses gets abused, you can turn off that single address. This is commonly implemented using “+detail” addressing—for example, “[email protected].” However, some spammers have figured out that they can strip off everything after the plus sign. Some people are responding by requiring that there must be something after the plus sign. Spammers will react by adding some random string after the plus sign, and anti-spammers will respond by changing the system to “default to reject” instead of “default to accept.” This does, however, make it harder for the users of disposable addresses to set them up.

Such is life in an arms race—in the long run, everyone loses (except the arms dealers).
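The escalation described above (require a detail, then default to reject) can be sketched as a recipient check. The user name, domain, and detail strings are hypothetical, and a real mail server would apply this policy in its own configuration language rather than in Python.

```python
def accept_recipient(address, base_user, enabled_details):
    """Accept mail only for +detail addresses the user has set up."""
    local_part, _, _domain = address.partition("@")
    user, plus, detail = local_part.partition("+")
    if user != base_user:
        return False
    if not plus or not detail:
        # Bare address or empty detail: a spammer may have stripped the tag.
        return False
    # "Default to reject": unknown details, including random strings
    # appended by spammers, are refused.
    return detail in enabled_details

enabled = {"bookshop", "conference"}
print(accept_recipient("alice+bookshop@example.com", "alice", enabled))
print(accept_recipient("alice@example.com", "alice", enabled))
print(accept_recipient("alice+x7f3q@example.com", "alice", enabled))
```

Turning off an abused address amounts to removing its detail from the enabled set. The cost noted above is visible here too: every new address has to be added to that set before it can be handed out.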

Realtime IP blackhole lists. Essentially, these are lists of IP addresses, distributed in realtime via DNS, that are “known” to be used by spammers. They are centrally managed by various groups and are very controversial. Well-known realtime IP blackhole lists are the MAPS RBL (http://www.mail-abuse.org/rbl/), SpamCop (http://www.spamcop.net/), and SPEWS (http://www.spews.org). They differ in many ways: degree of accountability (MAPS is a company and can be found easily; SPEWS is a completely anonymously run organization—anyone knowing who they are isn’t telling); how an IP address gets on the list (some accept reports and investigate, some accept reports and immediately blacklist the address, some actively search for perceived problems); and how an address gets off the list (some have carefully published policies, while others have essentially unknown criteria). Realtime IP blackhole lists are popular, but tend to have high false-positive rates.
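For readers unfamiliar with how a list is distributed via DNS, the conventional lookup reverses the octets of the connecting IP address, appends the list's zone, and treats any successful answer as "listed." The sketch below assumes that convention; the zone name is a placeholder rather than one of the lists named above, and the resolver is injectable so the logic can be exercised without network access.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up, e.g. 192.0.2.1 -> 1.2.0.192.<zone>."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

def is_listed(ip, zone, resolve=socket.gethostbyname):
    """True if the list returns an address record for this IP."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except OSError:                # lookup failed: not on the list
        return False
    return True

print(dnsbl_query_name("192.0.2.1", "dnsbl.example.org"))
```

Calling `is_listed` with the default resolver performs a real DNS query, so a filter built this way inherits whatever listing and delisting policy the list operator follows.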

Content-based. These look at the contents of a message to try to determine if it is spam. There are really three approaches to this: expert systems, also known as heuristics (clever people read huge amounts of spam to determine patterns that are updated relatively infrequently—e.g., SpamAssassin); fingerprint-based with frequent updates (spam messages are collected either from honey pots, a la Brightmail, or using collaboration-based algorithms such as Razor); and machine learning, usually Bayesian (the anti-spam filter “learns” by being shown examples of spam and non-spam). Fingerprints generally have very low false-positive rates, but are essentially reactive (someone has to actually get some of the spam) and are susceptible to snowflaking (adding random text into each message to make it harder to build fingerprints). Some algorithms can survive snowflaking.
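A deliberately naive fingerprint shows both the low false-positive rate and the weakness to snowflaking. This sketch hashes the whole normalized body, which is an assumption made for illustration; it is not how Brightmail or Razor compute their signatures.

```python
import hashlib
import re

def fingerprint(body):
    """Exact-match fingerprint: hash of the case- and whitespace-normalized body."""
    normalized = re.sub(r"\s+", " ", body.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Fingerprints of messages already reported as spam (hypothetical sample).
known_spam = {fingerprint("Buy cheap pills NOW")}

def matches_known_spam(body):
    return fingerprint(body) in known_spam

print(matches_known_spam("buy  cheap pills now\n"))          # same message, reformatted
print(matches_known_spam("Buy cheap pills NOW xqzt81"))      # snowflaked copy
```

The reformatted copy still matches, and an unrelated message will essentially never collide, but a few random characters are enough to produce a different hash. Schemes that survive snowflaking need fingerprints that tolerate small differences between copies.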

Most content-based anti-spam schemes try to identify spam by in some sense “understanding” the text. For example, for most people a mention of Viagra or little blue pill in a message is a good indicator of spam. But these schemes can be extremely context-dependent—for example, Pfizer, the manufacturers of Viagra, can’t make the same assumption. As a result, content-based filters are best deployed as “close” to the recipient as possible. This is especially important when using Bayesian filters.
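The "learning" in a Bayesian filter can be sketched as word counting over examples the recipient has classified. This is a bare-bones naive Bayes with add-one smoothing, offered as an illustration under those assumptions rather than as the algorithm of any named product; the training messages are invented.

```python
import math
import re
from collections import Counter

class TinyBayesFilter:
    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        return set(re.findall(r"[a-z]+", text.lower()))

    def train(self, text, label):
        """Show the filter one example labeled 'spam' or 'ham'."""
        self.messages[label] += 1
        self.word_counts[label].update(self.tokens(text))

    def spam_log_odds(self, text):
        """Positive means more spam-like, negative more ham-like."""
        odds = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for word in self.tokens(text):
            p_spam = (self.word_counts["spam"][word] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.word_counts["ham"][word] + 1) / (self.messages["ham"] + 2)
            odds += math.log(p_spam / p_ham)
        return odds

f = TinyBayesFilter()
f.train("cheap viagra now", "spam")
f.train("viagra little blue pill", "spam")
f.train("ethernet driver patch", "ham")
f.train("meeting notes attached", "ham")
print(f.spam_log_odds("viagra pill") > 0)
print(f.spam_log_odds("ethernet patch") > 0)
```

Because the weights come entirely from the training examples, a filter trained on a Pfizer employee's mail would learn different weights for the same words, which is the argument for running such filters close to the recipient.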

Note that when I say “understanding” I don’t mean in the hard-core AI sense. Some systems have tried to do this on the basis of simple keyword searches, usually with scoring—that is, the existence of the word Viagra is not enough in itself to classify a message as spam, but several “danger” words in addition (even innocuous words such as longer and satisfy) might push the message over the edge. These schemes often include negative scoring words; for example, the presence of the word Ethernet in the same message might make the message less likely to be spam. There have been some attempts to do full, natural language understanding, but these tend to be too slow for general use.
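The keyword scoring just described can be sketched as follows. The words are the ones used in the text, but the weights and the threshold are invented for illustration and are not taken from any real filter.

```python
import re

# Hypothetical weights: positive for "danger" words, negative for
# words that make spam less likely.
WEIGHTS = {"viagra": 3.0, "longer": 1.0, "satisfy": 1.0, "ethernet": -2.5}
THRESHOLD = 4.0

def spam_score(text, weights=WEIGHTS):
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sum(weight for word, weight in weights.items() if word in words)

def is_spam(text, threshold=THRESHOLD):
    return spam_score(text) >= threshold

print(is_spam("Ask your doctor about Viagra"))
print(is_spam("Viagra lasts longer and will satisfy"))
print(is_spam("Viagra ad broke the Ethernet switch; lasts longer, will satisfy"))
```

With these numbers, one danger word alone stays under the threshold, three together cross it, and the negative-scoring word pulls the last message back under.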

Challenge/response. In this scheme, no mail gets through unless it is whitelisted (this is known as “default to deny”). Whitelists are generally per recipient. If a sender tries to deliver to a protected mailbox, the message in question is held in a quarantine queue and a challenge is returned to the sender. The challenge can range from a simple “reply to this message” to sending an image and asking the sender to enter the word that it contains (since the human brain is much better at visual processing than even powerful computers, this is trivial for people and hard for algorithms). Once the challenge is correctly solved, the sender address is added to the recipient’s whitelist and the original message is delivered.

Although these tend to be very effective, there is anecdotal evidence that some users are confused by the confirmation requests; this is particularly problematic for nontechnical users. EarthLink is now offering challenge/response to its customers, which may prove to be an interesting test case.
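The quarantine-and-whitelist mechanics of challenge/response can be sketched for a single protected mailbox. This is a simplified model with hypothetical names; it does not describe EarthLink's or any other vendor's system, and it stands in for the challenge e-mail with a returned token.

```python
import secrets

class ChallengeResponseMailbox:
    """Default to deny: only whitelisted senders reach the inbox directly."""

    def __init__(self):
        self.whitelist = set()
        self.quarantine = {}       # token -> (sender, message)
        self.inbox = []

    def receive(self, sender, message):
        if sender in self.whitelist:
            self.inbox.append(message)
            return None            # delivered, no challenge needed
        token = secrets.token_hex(8)
        self.quarantine[token] = (sender, message)
        return token               # would be mailed back as the challenge

    def answer_challenge(self, token):
        held = self.quarantine.pop(token, None)
        if held is None:
            return False
        sender, message = held
        self.whitelist.add(sender)     # future mail skips the challenge
        self.inbox.append(message)     # release the original message
        return True

box = ChallengeResponseMailbox()
challenge = box.receive("friend@example.com", "Lunch on Friday?")
box.answer_challenge(challenge)
box.receive("friend@example.com", "Noon works.")
print(box.inbox)
```

The sketch also makes the whitelist weakness concrete: `receive` trusts the sender string it is given, so a forged address that happens to be whitelisted goes straight through.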

Economic solutions. These impose some sort of cost (not necessarily monetary) on the sender of e-mail. In some sense, challenge/response is a form of economic solution, since it is more work for the sender to get a message through, and this translates fairly directly to an economic cost. Other proposals include e-stamps, computational- or memory-based costs (e.g., hashcash), and bonding schemes (the sender posts a bond to a third party that it forfeits if it spams). The concept is that the cost is low enough that an individual who presumably sends fewer than a hundred messages a day pays nominal, even trivial, costs, whereas mailers sending out millions of messages a day incur a substantial cost. Of course, some lists, such as my snark discussion group, have no marketing basis at all, and if I had to attach $10 to each outgoing message (1,000 users at, say, a penny a message), and there were 100 messages a day to that list (snarks are very popular this year), I would be paying $1,000 a day. This makes it extremely unlikely that I would be willing to host the snark discussion list.
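The snark-list arithmetic, worked in whole cents to avoid rounding, looks like this. The penny-per-message price is the figure assumed in the text; the function name is illustrative.

```python
def daily_cost_cents(recipients, posts_per_day, cents_per_message=1):
    """Total postage per day if every delivered copy needs a stamp."""
    return recipients * posts_per_day * cents_per_message

list_cost = daily_cost_cents(recipients=1000, posts_per_day=100)
personal_cost = daily_cost_cents(recipients=1, posts_per_day=100)

print(f"list owner: ${list_cost / 100:,.2f} per day")
print(f"individual: ${personal_cost / 100:,.2f} per day")
```

The same per-message price that costs an individual about a dollar a day costs the owner of a modest noncommercial list a thousand times as much, because the charge scales with copies delivered rather than with messages written.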

The problem with most economic solutions is that the costs are, for the most part, imposed on everyone. Bonding may be an exception to this, but bonding is really only a “get out of jail” card for legitimate mass-mailers. Thus, when I want to contact an old college friend or request product information from the snark-of-the-month company, I have to “pay” something for the privilege. These costs can be ameliorated by whitelisting (my college friend will probably be willing to whitelist me so I only have to pay the cost on the first message) and 800-number-style mailboxes (e.g., product information). However, the very concept of having to pay for e-mail just rubs people the wrong way.
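A computational cost of the hashcash kind can be sketched as a small proof-of-work puzzle: the sender searches for a value whose hash has a required number of leading zero bits, and the recipient checks it with a single hash. This is a simplified illustration of the idea, not the actual hashcash stamp format; the address and the bit count are arbitrary.

```python
import hashlib
from itertools import count

def leading_zero_bits(digest):
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()

def mint_stamp(recipient, bits):
    """Sender's side: expensive search, roughly 2**bits hashes on average."""
    for nonce in count():
        stamp = f"{recipient}:{nonce}"
        if leading_zero_bits(hashlib.sha256(stamp.encode()).digest()) >= bits:
            return stamp

def verify_stamp(stamp, recipient, bits):
    """Recipient's side: one hash, so checking is cheap."""
    if not stamp.startswith(recipient + ":"):
        return False               # stamp was minted for someone else
    return leading_zero_bits(hashlib.sha256(stamp.encode()).digest()) >= bits

stamp = mint_stamp("friend@example.com", bits=12)
print(verify_stamp(stamp, "friend@example.com", bits=12))
```

Because the stamp is tied to one recipient, a bulk mailer cannot reuse it across a list and has to repeat the search for every address. That is the intended asymmetry, and it is also why the cost lands on legitimate list owners as well as on spammers.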

ENABLING TECHNOLOGY

Some of these technologies are predicated on various enabling technologies. For example, whitelists are potential soft spots that give spammers a chance to slip by filters. This is a direct result of the default unauthenticated nature of SMTP. There is an SMTP authentication protocol, but it authenticates a client to the server it connects to (a single hop); it does not prove the identity of the actual sender end to end. Proving sender identity would likely require some e-mail protocol extensions (probably in SMTP), plus some sort of distributed public key infrastructure (PKI). Unfortunately, such PKIs have never been successfully deployed in the context of the highly heterogeneous Internet. A successful PKI would need to be federated (so that no single provider could lock down the market), distributed, and replicated (for performance and resilience). This is a tall order.

Any stamp-based economic solution (that is, where some sort of e-cash is attached to each message) is likely to need some sort of micropayment structure. Again, many people have proposed such infrastructures in the past, but none has succeeded. Just as with PKI, locking into one vendor is a showstopper, and performance, distribution, and resilience are essential.

It’s likely that open protocols and open source implementations will be critical to the creation and acceptance of these enabling technologies. There will be a thicket of patent battles to be fought, many companies will try to lock down the technologies for personal profit, and vendors and consumers must be convinced to participate. For this reason, it may take a decade for these to appear.

OTHER ACTIONS

A group called the Anti-Spam Research Group (ASRG) is operating under the auspices of the Internet Research Task Force (IRTF). The IRTF is a sibling to the better-known Internet Engineering Task Force (IETF) and is tasked with doing research but not establishing protocols. It’s too early to tell if this effort will succeed. However, even if the ASRG has a positive outcome, any real protocol specifications and implementations will have to go through the IETF.

AOL, Microsoft, and Yahoo have recently announced an alliance to solve the spam problem. This triumvirate proposes to work on all sides of the spam problem, from technology to legal solutions. Given their heft in the market, they might be able to push through some significant protocol changes. But it isn’t clear if that’s the good news or the bad news. They promise to include the broader industry in this effort. We’ll see how serious they are on this.

THE SPAM REFRAIN

Spam is a horrible problem, and it’s worsening rapidly. Nucleus Research has recently estimated that spam costs businesses approximately $87 billion each year in the United States alone. At the FTC forum, it was claimed that ISPs are spending billions of dollars combating spam, and this doesn’t count the time lost by individual recipients. The amount of spam is growing faster than tribbles, and just about as destructively. At least tribbles are cuddly and purr. There is growing anecdotal evidence that some users are giving up on e-mail as a medium entirely, because of spam. Even e-mail marketers hate spam.

There are many technologies available in the anti-spam fight. Unfortunately, some of the most promising rely on other technologies, which, although technically possible today, will take long periods to deploy for totally nontechnical reasons. Some of the available technologies threaten to split the Internet into pieces that won’t talk with one another, putting us back to where we were in the early 1980s when I wrote Sendmail to try to impose some sanity onto e-mail.

Personally, I don’t have a lot of sympathy for people who say that they don’t want to have to pay for e-mail. We are all going to have to pay in one way or another. I pay every morning as I slog through the 100 or so spams that get past my filters. I pay every time I update my spam filters or whitelists. In some sense I’m paying by writing this article: I would much rather be working on something that was moving us forward rather than just trying to keep us from slipping backward. We all pay in reduced service from the network, increased ISP costs, and frustration.

To make things worse, we are fighting against ourselves out of the most basic of motives: greed. For example, a company called Mailblocks has recently sued EarthLink for violating patents that it holds on challenge/response technology. These patents were granted in 2000 and 2001; I first encountered challenge/response software probably around ten years ago. Such lawsuits just hold us all back. Until the good guys (that would be us) can learn to work and play well together, spam is going to continue to get worse.

REFERENCES

1. Federal Trade Commission. False claims in spam: A report by the FTC’s division of marketing practices (April 30, 2003); http://www.ftc.gov/reports/spam/030429spamreport.pdf.

2. Kane, Margaret. EarthLink wins $25 million from spammer (July 19, 2002); http://zdnet.com.com/2100-1106-945169.html.

3. Text of Senate bill S.877 (CAN SPAM Act of 2003); http://thomas.loc.gov/.

ERIC ALLMAN is the cofounder and chief technology officer of Sendmail, one of the first open source-based companies. Allman was previously the lead programmer on the Mammoth Project at the University of California at Berkeley. This was his second incarnation at Berkeley, as he was the chief programmer on the INGRES database management project. In addition to his assigned tasks, Allman got involved with the early Unix effort at Berkeley. His first experiences with Unix were with the 4th Edition. Over the years, he wrote a number of utilities that appeared with various releases of BSD, including the -me macros, tset, trek, syslog, vacation, and, of course, sendmail. Allman spent the years between the two Berkeley incarnations at Britton Lee (later Sharebase) doing database user and application interfaces, and at the International Computer Science Institute, contributing to the Ring Array Processor project for neural-net-based speech recognition. He also coauthored the “C Advisor” column for Unix Review for several years. Allman is a former member of the board of directors of USENIX Association.

I Challenge You to Respond?
MICHAEL MAYOR, NETCREATIONS

The war on unsolicited e-mail continues to be fought on many fronts. On the attack are spammers looking to infiltrate your inbox. The front line of defense is made up of millions of consumers armed with filtration software and, of course, the ever-powerful delete button. Capitol Hill is promising these ground troops air support in the form of anti-spam legislation. However, the battlefront I find the most intriguing these days is the escalating weaponry of technology. After all, spam is simply very good technology in the hands of very bad people. So what happens when the good guys launch a technological counterattack? The results aren’t always predictable.

Take the latest weapon for example, the “challenge/response” process employed by such stalwarts as EarthLink and many filtering firms, including Mailblocks, Spam Arrest, and MailFrontier. The premise of challenge/response is based upon the fact that most, if not all, spam contains an invalid “from” e-mail address. Therefore this software “challenges” the sender by immediately firing back an e-mail that the sender must “respond” to before the message is permitted to reach its intended inbox. In other words, if the from address is bogus and no one is on the other end to respond to the challenge, the mail will be rerouted to a less desirable location or blocked altogether.

In theory, what this means to legitimate e-mail marketers (and yes, there are legitimate e-mail marketers) is that their customers will no longer be pummeled by dozens, if not hundreds, of unsolicited e-mails a day. Thanks to ISPs installing challenge/response systems, consumers will have more time to read the e-mail they requested from legitimate marketers, and going forward, they will likely hold a much higher opinion of commercial e-mail in general. In short, marketers will have a better environment for interacting with the people who really do wish to hear from them. All that’s required is for legitimate marketers to respond to these challenges. Simple, right?

Well, let’s just say that most marketers don’t quite see it that way. Putting aside their specific nits regarding this technology, what they seem to dislike the most about challenge/response is the fact that a human being has to oversee the process. Someone must actually respond to each challenge (which is essentially a reply to the message they just sent). Imagine yourself as a legitimate marketer, sending e-mails to people who did in fact ask for them. All of a sudden you need to hire a team of challenge responders to get your mail through, not because you did anything wrong, but because some other bozos gave all e-mail marketing a bad name. Why should the good guys have to pay the penalty?

Don’t get me wrong. I have some reservations about challenge/response and don’t feel it is by any means a magic bullet that will rid the world of spam. However, I do take issue with the marketer’s core objection.

Remember the overused term “one-to-one marketing” of a few years ago? OK, maybe you don’t, but we marketers do. It was everywhere, on Web sites, in countless articles, and it stretched across hundreds of trade-show booths at the dozens of industry shows that we held each year. It was billed as nothing less than the Internet’s greatest advantage over every other form of marketing. E-mail marketing’s great promise was that businesses could actually speak to thousands of customers and/or prospects at the same time, yet in a very personal manner. Businesses could deepen and strengthen the relationships they had initiated through increasingly relevant and personalized content.

It seems pretty straightforward to me that it takes two to have a one-to-one relationship. Good marketers need to be prepared to receive “responses” to their campaigns from any possible direction, not just “orders” or “sign-ups” but also replies to e-mails they just sent, like those generated in challenge/response systems. It’s not about what the marketer wants to hear (orders), it’s about what the customer wants to say (orders and challenges, and everything in between).

Marketers’ expectations of technology are often misguided. Technology won’t do their jobs for them, but it will help them do it better. Occasionally technology gets the pleasure of illustrating this important fact all too well by creating software that does what it should but reveals a chink in the marketing armor in the process. Challenge/response is the perfect example of this.

It’s often fun to watch what happens right after this kind of technology comes out. Usually the first response is denial—“it’s not the marketing, it’s the technology” causing all of the headaches. These marketers will want the new technology turned off immediately. (“These darned challenge/response systems, they’re ruining everything!”) Other marketers will grow from the experience and change the way they do business. (“OK, I guess we have to deal with this.”) The best marketers will thank the new technology and use it to their advantage. (“Challenge/response means less junk and more attention to legitimate e-mail, our e-mail.”)

Technology will play an important role in ultimately winning the war against spam, as techno-weapons such as challenge/response demonstrate, by forcing sender transparency. By creating software that acts as the consumers’ infrared goggles so they can distinguish friend from foe, software developers are taking away the only thing these unscrupulous marketers have going for them: a place to hide.

I am confident that a broad-based sender-transparency technology is not too far off. Let us hope that legitimate marketers who are truly interested in having two-way relationships with their customers can keep up with the pace.

MICHAEL MAYOR is an 18-year veteran of the direct-marketing industry and a recognized pioneer of e-mail marketing. He joined NetCreations as the company’s third employee in 1998, and his vision and leadership played an integral role in helping it to become one of the largest and most respected e-mail list management companies in the industry today. Mayor has also played a significant role in shaping the e-mail marketing industry by developing many of its standards and best practices. He is a leading advocate of privacy and a frequent speaker at industry functions, including taking part in the FTC’s Spam Forum by appearing on the Best Practices panel. Mayor founded and currently chairs the Interactive Advertising Bureau’s Email Committee, and he writes a bimonthly column on e-mail marketing for iMedia. Prior to joining NetCreations, he spent 12 years at Hearst Magazines.

Originally published in Queue vol. 1, no. 6.