
From the Trenches: One ISP's Response to the Problem of Spam



   Nick Christenson           Dan Farmer
   npc@jetcafe.org            zen@earthlink.net
   Independent Researcher     EarthLink Network, Inc.


Introduction

As the Internet has become less of a research-oriented collection of computer-oriented acquaintances and more of a multinational, business-driven communications medium, it has had to deal with problems that simply didn't exist before. One of these is a problem known as "spam", the Internet equivalent of bulk mailings, junk faxes, and unsolicited telemarketing phone calls. This paper details the measures we have employed in our effort to stop spam, with the obvious hope that the techniques used and the lessons we have learned will be of use to others.

Background

In 1994, two Arizona-based immigration attorneys, Canter and Siegel, advertised their services by sending a message to each of the several thousand Usenet newsgroups, whether or not their message was appropriate for the discussions taking place there. This unusual behavior earned them public scorn, and when they continued the practice they were kicked off one ISP after another. These days you can't enter a newsgroup without seeing such messages, typically called "spam," and in some newsgroups (especially the sexually oriented ones) they make up the majority of the traffic.

These sorts of tactics have not been reserved to Usenet news. More recently, individuals have taken to harvesting email addresses from Web sites they maintain, open mailing lists, and Usenet newsgroups so they might send unsolicited advertisements for various products and services. Officially known as UCE (Unsolicited Commercial Email) or UBE (Unsolicited Bulk Email), these practices have also been lumped under the general category of "spam," the definition of which has now been expanded to include essentially "all electronic garbage messages."

One of the more significant problems with spam is that, unlike telemarketing or bulk postal mail, the sender pays very little of the cost of transporting the message. The spammer simply gives a mail host (often an ISP, due to its excellent connectivity, high volume capacity, and general difficulty keeping track of the huge amount of mail and news that passes through its system) a list of targets with a single message to send. The sender incurs very little cost per message - essentially only their time and the cost to set up an account with an ISP. The host that relays the mail pays for the bulk of the transmission in bandwidth, service degradation, and cost of responding to the ensuing complaints. The target site also pays in loss of bandwidth, disk usage, connection costs, overflowing mailboxes, etc.

Of course, the cost that most people complain about is the expenditure of time and effort to sort, read, and delete the unwelcome communique. This can be especially painful when paying for access by the minute or by the byte. Most perceive the situation as unfair, and feel that the costs of sending such messages should be paid for by the initiator, not by the systems that are being abused or by the recipients.

Who we Are

EarthLink is one of the largest Internet Service Providers, serving approximately 450,000 customers, handling about two million pieces of email per day. By mid 1997, EarthLink was targeted by a large number of spammers, and the sheer volume of spam going through our networks was starting to have a significant impact on the performance of our services. At this point, we started taking legal action against the spammers and implemented the technical solutions that we present in this article. Our "zero tolerance" policy has the full backing of our upper management, and we go to great effort to ensure its implementation.

We are often asked if we claim to be so anti-spam, why don't we simply throw a switch and stop it from going through EarthLink's servers altogether? Unfortunately, there is no total solution. Despite deploying a great deal of human and computing resources into the fray along with a large variety of technical and social tools, the spammers (new and old) keep misusing our resources almost as fast as we can stop them. The sheer amount of data flowing through our system defies implementation of a simple and all-inclusive mechanism by which to stop all network abuse.

We believe that the decision on what Internet traffic one wants to see or not see should be made by the end user, if at all possible. And if one decides to see or not see data coming from an individual source, that individual or entity should be able to effect that decision.

On the other hand, we also believe that anyone should have the right to refuse to provide service to those who use resources without paying for them, and also reserve the right to refuse to do business with anyone, especially those who consume a disproportionate amount of server, network, and human resources. People who use other people's resources to deliver their message without express consent are, in essence, stealing. Those that deliver a message pretending to be someone they are not are, in essence, committing fraud. We do not support anyone's claim to commit acts of this nature.

Catchy Title for this section that I haven't figured out yet

At various times, the privilege of being the prime source of spam has made the rounds through the ISP community. AOL, Netcom, MCI, UUNet, EarthLink, Compuserve, the various regional Bell companies, and others have all held this distinction at one time or another. Not surprisingly, this title usually befalls an ISP during a time of extreme growth or other significant set of events in the history of the organization. ISPs are then forced to take their most extreme anti-spam measures when they have the fewest resources available. We believe that almost every entity that has been through this will recommend that organizations tighten up their systems and develop policies and procedures to deal with these situations before they arise. If an organization fails to do this when resources are available, it may be forced to face this problem when it is least able to.

Compounding the problem are the unscrupulous practices by many of the spam purveyors. Internet discussion groups are replete with stories of forged return addresses (so the advertising targets cannot complain to the true sources), hijacked servers (the spammers may use the computing resources of others to distribute their messages without permission), fraudulent identity claims (so that it is more difficult for filtering software to determine if the message comes from a source of spam), and procedures for removal of one's email address from these mailing lists that often do not work. It is no wonder why many have declared outright war on spam, and how consequently spammers have had to resort to even more aggressive hit-and-run tactics to get their messages through.

Caught in the middle are the Internet Service Providers. The subscribers of these services generally don't want to see these messages, but if the ISP tries to filter these connections it is accused of censorship. These organizations have the largest and most powerful email systems in the world, which their subscribers insist be more accepting than corporate servers (which can be restricted by policy and/or firewalled off), so they are natural targets for relaying by the spammers. Also, ISPs provide very inexpensive and convenient access as a jumping off point for the spammers to gain access to the Internet. These problems are not all easily solved, and it doesn't help, nor is it coincidental, that the tremendous increase in demand for Internet capable professionals has been coincident with the time of ISPs' most rapid growth. While this does not excuse an ISP that has ignored these issues, it is improper to believe that you can completely judge another situation without understanding the details of it.

The Current State of the Internet

On Jan 30th, 1998, we decided to use a modified version of SATAN to conduct a quick-n-dirty technical survey of all the ISPs listed in CNET's "Ultimate guide to Internet Service Providers" to get an idea of how many ISPs allowed unrestricted mail relaying. We decided to examine ISPs both because we are in the ISP business ourselves and most of the mail delivered on the Internet today is by ISPs. The results were staggering, quite simply, and go a long way to explain much of the reason why spam is such a problem on the Internet today:

  Number of ISPs checked:                              597
  ISPs allowing unrestricted mail relaying:            320
  Percentage allowing unrestricted mail relaying:      53.6%

(See the appendix for more on the details of the survey and the methodology used.)

Although the final percentage of ISPs having open relays is only an approximate value, it's easy to see that the Internet is indeed a spammer's heaven. Even if a conscientious ISP turns off mail relaying or kicks a spammer off its systems, the miscreant can easily choose a different home or target to abuse. The story is analogous to security problems in general - the solutions are widely known, but apathy, the easy cushion of ignorance, the pain of change or implementation, the lack of auditing/verifying tools, or all of the above are preventing people from doing anything about it.

At the Usenix LISA 11 conference in San Diego in October of 1997, we held a BoF (Birds of a Feather) session on the problem of spam. It became apparent to us that the Internet community was extremely hungry for any advice on practical methods to reduce spam. In this article we hope to provide practical information on how to help protect one's systems and to provide insight into how a site (be it an ISP or otherwise) might structure its response to these sorts of problems.

Unfortunately there appears to be no single solution to this problem, but in the next section we shall discuss various technical and social solutions that can help a site fight spam.

Anti-spam

We can roughly divide our anti-spam efforts into technical and social methods. While both have proved effective at reducing spam, the spammers are sometimes wily and always tenacious, and have been very adaptive in combating our efforts.

Technical efforts to stop spam are, of course, favored by us (being longtime Internet geeks). Real time monitoring is fascinating, but very difficult, if for no other reason than the sheer volume of data flowing through our networks, so we've tried to focus on proactive methods whenever possible.

Unix Mail Relay Filtering

Ever since email was sent via the Internet people have generally configured their machines to accept and attempt to deliver any and all email whatsoever. If their host was not the final destination, it would be dutifully forwarded to the appropriate machine. Indeed, UUCP and early Internet mail would never have worked if this were not true.

This was part of the general philosophy of the early Internet: be a good neighbor, be generous in what one receives and restrictive in what one sends. Thus, if I sent an email message to zen@trouble.org from npc@acm.org, but sent it via the SMTP capable machine mail.earthlink.net, this server would have dutifully tried to deliver it to the trouble.org mail server for me.

This has been subverted by the spammers to (a) make somebody else do the hard part of delivering mail messages, (b) get around an administrative block of this spammer's organization, and/or (c) mask their culpability in this act. For example, let's assume I'm a spammer dialed up to my ISP, and I'm currently logged on to their service at dialup666.faux_isp.net. Now, I have a list of email addresses, maybe many thousands of them, to which I want to send an ad. I connect to mail.good_isp.net, claiming to be niceguy@innocentcompany.com. I then give a list of addresses to send my ad to, and the mail server will dutifully try to send the mail.

This open relaying policy is a friendly thing, in the best tradition of the Internet. On those rare occasions when an email message might get misrouted, machines will try to straighten everything out in a spirit of openness and cooperation. Before the rampant commercialization of the Internet, nobody thought twice about relaying mail for other sites, especially if they spanned networks. In fact, there were several sites that openly offered to do this as a public service. Unfortunately, this has been so badly abused by the spammers that the practice is on its way to being a distant memory on the Internet today. Here is how you can set up a system running the sendmail SMTP agent to prohibit unauthorized mail relaying for trivial and more complex cases.

Simple Case

The easiest way to prevent mail relaying is to simply disallow it altogether. The vast majority of hosts on the Internet can be set up this way. In fact, if the machine in question does not provide remote mail access (typically via the POP or IMAP protocols) or is not a central mail hub this is undoubtedly the way the machine should be set up.

In order to block relaying in this manner, you need to be running the freely distributable version of sendmail, version 8.8 or higher. If you are not running at least this version, an upgrade is in order in any case because of the security problems associated with earlier versions.

In the sendmail.cf file, you simply need to add the following lines, stolen from the antispam rules at the sendmail Web site:


  Scheck_rcpt
  # anything terminating locally is ok
  R< $+ @ $=w >	$@ OK

  # anything originating locally is ok
  R$*		$: $(dequote "" $&{client_name} $)
  R$=w		$@ OK
  R$@		$@ OK

  # anything else is bogus
  R$*		$#error $: "550 Relaying Denied"

These lines can be placed anywhere in the sendmail.cf file as long as they're not in the middle of another ruleset. We like to put ours at the beginning of the file just before the "w" macro is defined.

You do not need to do anything more than add these lines and restart the sendmail daemon for the rules to take effect. These rules also only operate on the envelope of the mail message, not the header, so that sendmail can't be fooled by forged headers. If the sendmail daemon receives email that either is not bound for the machine in question (that is, the machine in the "RCPT TO:" field of the envelope does not match the list of machines in the "w" macro of sendmail), or is not sent by itself, it rejects the connection with error 550 and the message, "Relaying Denied." This is the way we recommend all machines that aren't mail hubs be set up (e.g. desktop machines which need to run sendmail in daemon mode.)

One final thing to note is that it's a bad idea for most hosts to run sendmail in daemon mode at all. Despite the fact that Unix workstations come out of the box with sendmail installed (and almost always with relaying enabled), it is rarely necessary to run sendmail on more than a small fraction of computers on a given network. We urge system administrators to consider using the prospect of mail relaying as an impetus to rearchitect their mail systems, where appropriate. If your systems have been abused in this manner (if not by spam, it might be enough to remember that sendmail is one of the prime ways that intruders break into computers these days) you'll probably find this to be a relatively easy sell to your organization and/or management.

Advanced Case

What if a mail machine does act as a POP or IMAP server? In this case, there very well may be legitimate computers which need to use this machine to relay mail. You can then specify a class - we use "W" - to hold the host and domain names of allowed relayers. In case DNS isn't set up as well as we would like, we additionally specify a class, "C", of all the Class C networks allowed to relay through us. The whole section to be added to the sendmail.cf looks like:


  Scheck_rcpt
  # anything terminating locally is ok
  R< $- @ $=w >   $@ OK

  # anything originating locally is ok
  R$*     $: $(dequote "" $&{client_name} $)
  R$=W    $@ OK
  R$+ . $=W       $@ OK

  # IP address ranges 
  R$*     $: $(dequote "" $&{client_addr} $)
  R$=C . $-       $@ OK

  R$@     $@ OK

  # anything else is bogus
  R$*     $#error $: "550 Relaying Denied"

For this to work, we also need to add the following definitions, also in the sendmail.cf file:


  # file containing domains which are allowed to relay through us.
  FW-o /etc/mail/sendmail.cW

  # file containing legitimate client relayers by Class C prefix.
  FC-o /etc/mail/sendmail.cC

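The file /etc/mail/sendmail.cW might contain something like:

  earthlink.net
  trouble.org

and the file /etc/mail/sendmail.cC might look like:

  208.197.253
  207.217.91
  207.217.118

That's all there is to it. Now, mail will be rejected by this machine unless one of the following conditions holds:

  1. The recipient address in "RCPT TO:" is at a host listed in the "w" class, that is, the mail terminates locally.
  2. The connecting host's name is listed in sendmail.cW, or falls within one of the domains listed there.
  3. The connecting host's IP address falls within one of the Class C networks listed in sendmail.cC.
  4. The mail originates on the machine itself, in which case the client information is empty.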

For those who aren't as familiar with the sendmail.cf file syntax, machines listed on a line that begins with Dw, Cw, or in a file called sendmail.cw make up the complete list of machines and domains for which the machine in question stores mail.

Of course, you can play with the rules, changing the class C networks to class B's, removing the domain checking rules, or whatever is appropriate.
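
For readers who find sendmail's rewriting rules hard to follow, the decision made by the advanced rule set can be modelled in a few lines of Python. This is only an illustrative sketch, not something sendmail runs: the function name may_relay and its arguments are our own invention, and it ignores the details of how sendmail tokenizes addresses. The three sets stand in for the "w", "W", and "C" classes.

```python
def may_relay(rcpt_domain, client_name, client_addr,
              local_names, relay_domains, relay_nets):
    """Return True to accept the recipient, False for '550 Relaying Denied'.

    local_names   -- stands in for class w (hosts we store mail for)
    relay_domains -- stands in for class W (sendmail.cW)
    relay_nets    -- stands in for class C (sendmail.cC, three-octet prefixes)
    """
    # Anything terminating locally is ok.
    if rcpt_domain.lower() in local_names:
        return True

    # The client host name is a listed domain, or a host inside one.
    name = client_name.lower()
    for domain in relay_domains:
        if name == domain or name.endswith("." + domain):
            return True

    # The first three octets of the client address are listed.
    prefix = ".".join(client_addr.split(".")[:3])
    if prefix in relay_nets:
        return True

    # Empty client information means the mail was submitted locally.
    if client_addr == "":
        return True

    # Anything else is bogus.
    return False
```

With the example files shown above, a client named trouble.trouble.org may relay to any destination, while a client at fish.com may only deliver to recipients on the local machine.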

Testing these rule sets

Of course you don't want to simply put these changes in and hope they work - they need to be tested. First you should get an account on a machine from which relaying should not be allowed (not the machine that sendmail is running on!) For example, if the machine with the new relay rules is named death.trouble.org, you should telnet to port 25 of this host from a disallowed host and verify that regular mail works but relaying doesn't by doing the following (the typed commands are the telnet invocation and the "mail from:", "rcpt to:", and "quit" lines):


  fish.com % telnet death.trouble.org 25
  Trying 208.197.253.134...
  Connected to death.trouble.org.
  Escape character is '^]'.
  220 death.trouble.org ESMTP Sendmail 8.8.5/8.6.4 ready at Wed, 5 Nov 1997 14:52:55 -0800 (PST)
  mail from: npc@acm.org
  250 npc@acm.org... Sender ok
  rcpt to: <npc@death.trouble.org>
  250 <npc@death.trouble.org>... Recipient ok
  rcpt to: <npc@acm.org>
  550 <npc@acm.org>... Relaying Denied
  quit

Mail is accepted for the local machine, and denied for destinations not in the sendmail "w" macro.

Now, we test from a machine that should be allowed to relay.

  trouble.trouble.org % telnet death.trouble.org 25
  Trying 208.197.253.134...
  Connected to death.trouble.org.
  Escape character is '^]'.
  220 death.trouble.org ESMTP Sendmail 8.8.5/8.6.4 ready at Wed, 5 Nov 1997 14:52:55 -0800 (PST)
  mail from: npc@death.trouble.org
  250 npc@death.trouble.org... Sender ok
  rcpt to: <npc@death.trouble.org>
  250 <npc@death.trouble.org>... Recipient ok
  rcpt to: <npc@acm.org>
  250 <npc@acm.org>... Recipient ok

In this case, relaying was not denied whether the mail was to be delivered locally or not. These rules work and are probably safe to implement. As always, when making changes to the sendmail.cf you need to restart the sendmail daemon for them to take effect.
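
If there are many hosts to check, the telnet sessions above can be scripted. The following is a rough sketch using Python's standard smtplib module; the function names probe and relay_verdict are hypothetical, and the addresses passed in are whatever local and non-local recipients suit your site. The transaction is reset before any message data is sent, so nothing should actually be delivered.

```python
import smtplib


def relay_verdict(local_code, remote_code):
    """Classify a host from the replies to a local and a non-local RCPT TO."""
    if not 200 <= local_code < 300:
        return "local delivery refused"
    if 200 <= remote_code < 300:
        return "relays for this client"
    return "relaying denied"


def probe(host, sender, local_rcpt, remote_rcpt, port=25, timeout=30):
    """Repeat the manual telnet test against host and return a verdict."""
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.helo()
        smtp.mail(sender)
        # smtplib wraps each address in angle brackets for us.
        local_code, _ = smtp.rcpt(local_rcpt)
        remote_code, _ = smtp.rcpt(remote_rcpt)
        smtp.rset()  # abandon the transaction; no message is sent
    return relay_verdict(local_code, remote_code)
```

Run from a disallowed host, a call such as probe("death.trouble.org", "npc@acm.org", "npc@death.trouble.org", "npc@acm.org") should report "relaying denied"; run from a host that is allowed to relay, the same call should report "relays for this client".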

There are two things to note. First, the angle brackets when typing in the "RCPT TO:" line are mandatory. If these are omitted, you will always get "Relaying Denied". On machines without the Scheck_rcpt ruleset present, you will get "Recipient ok" if they are omitted, but they are required by the SMTP protocol. Second, what is typed as an email address in the "MAIL FROM:" line is irrelevant, as long as it's a proper SMTP email address. This is never checked. Only the host name/IP address of the sending host and the "RCPT TO:" line are ever checked by these relay rules.

You can do these things and more using the Scheck_relay ruleset, but it's been our experience that using this ruleset is buggier, slower, and rarely necessary. Nonetheless, information on these rules and others like them can be found at the sendmail web site or in the second edition of the book Sendmail by Bryan Costales with Eric Allman, published by O'Reilly & Associates, Inc. Both sources are highly recommended.

We also created a SATAN testing module that can be run on individual hosts or large networks. See the tools section below for more information on this.

Blocking Email from Sites that Spam

It is very important to minimize the amount of spam that comes from a site that you maintain. If everyone did this, the amount of spam we all received would dry up substantially. However, it is not the case that everyone is conscientious about these issues, and even if they were, there will always be a few rogue sites who exist to spam. There are several ways to tell sendmail to reject spam from unwanted sources. We demonstrate one set of methods here that do not interfere with the anti-relay rules we presented in previous sections.

As with the anti-relay rules, you have to be using sendmail 8.8 or better to be able to use this technique, which is heavily dependent on some ideas presented in Sendmail, 2nd Ed. by Bryan Costales with Eric Allman. These lines need to be put in your sendmail.cf file. You may as well put them near the Scheck_rcpt rules that prevent relaying.


  Scheck_mail

  R$*			$: $>3 $1
  R$* <@ $+. > $*	$1 <@ $2> $3		strip trailing dots
  R$* <@ $+ > $*	$: $2			isolate the host and save in <>
  R$* . $+ . $+		$: $2 . $3		strip subdomains
  # Two part domain names from which we don't receive mail.
  R$=D			$#error $@ 5.7.1 $: "Access Administratively Denied"

  # List of Class C addresses from which we don't receive mail.
  R$*			$: $(dequote "" $&{client_addr} $)
  R$=A . $-		$#error $@ 5.7.1 $: "Access Administratively Denied"
  R$*			$@ OK


It is also necessary to add lines which define the following classes in the sendmail.cf file:

  # file containing domains to be administratively blocked.
  FD-o /etc/mail/sendmail.cD

  # file containing Class C addresses to be administratively blocked.
  FA-o /etc/mail/sendmail.cA

The file /etc/mail/sendmail.cD contains two level domain names from which electronic mail is to be rejected, based on the occurrence of these names given by the connecting host in the "MAIL FROM:" portion of the SMTP envelope. That is, you can put in items like trouble.org or spammerhaven.com. If you want to reject mail from the subdomain of spammers.spammerhaven.com but not the rest, there is no provision to do that with these rules.

Note that we're filtering on the "MAIL FROM" portion of the envelope. If you remember from the section on testing the rule sets, we said that you can put anything you want in this field; it isn't verified. This makes this part of the rule set less useful, as it will only continue to work as long as the spammers continue to use the same domain name in this field in their spamming programs. Nonetheless, many do, so this can be useful, especially for those spammers that jump around between ISPs. No matter which IP address they connect from, as long as they use the same domain name to indicate where they're coming from, this portion of the rule set will reject their mail. Note also that there's no reason to expect that these domains are real, so putting domains that don't really exist in the sendmail.cD file is likely to always be appropriate, as long as they continue not to be registered as valid domains.

Sometimes it would be really nice, though, to be able to reject mail connections, regardless of what anyone puts in the envelope, from certain segments of the IP address space. There are two ways to do this:

  1. Run sendmail under inetd and use TCP Wrappers to disallow connections from unwanted sources. We don't recommend using this method if the server in question handles a substantial amount of incoming email, but if the amount of email is modest and you're familiar with TCP Wrappers (and if you're not, you should be), this is fairly easy to do.
  2. Use the second part of this rule set. This checks the IP address of the connecting machine, and if it is found in the list of blocked Class C addresses in the sendmail.cA file, mail is rejected. As with other files, class C addresses that are to be blocked look like:
      10.0.0
      172.16.10
      192.168.2
    

With either of these methods, regardless of what the spammer sends, connections from these IP address ranges will simply not be accepted. These methods are effective against spammers who might change what they put in their headers, but always connect from the same network.
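
The two checks made by the Scheck_mail rule set, one on the envelope sender's domain and one on the connecting address, can be summarized in Python as follows. This is a simplified model for illustration only; the function names are our own, and the two sets stand in for the contents of sendmail.cD and sendmail.cA.

```python
def two_level_domain(sender):
    """Reduce 'user@a.b.example.com.' to 'example.com', as the rules do."""
    host = sender.strip("<>").rpartition("@")[2].rstrip(".").lower()
    return ".".join(host.split(".")[-2:])


def is_blocked(sender, client_addr, blocked_domains, blocked_nets):
    """Return True if the MAIL FROM domain or the client network is listed."""
    # Two part domain names from which we don't receive mail (class D).
    if two_level_domain(sender) in blocked_domains:
        return True
    # Class C prefixes from which we don't receive mail (class A).
    prefix = ".".join(client_addr.split(".")[:3])
    return prefix in blocked_nets
```

Because subdomains are stripped before the lookup, listing spammerhaven.com blocks spammers.spammerhaven.com as well, which matches the limitation noted above.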

As a matter of practice, it's our recommendation not to be too quick on the trigger to banish some domain to these files. Many times the domain in question is an unwilling participant in these schemes. Don't assist the bad guys in perpetrating a denial of service attack against yourself by reacting sharply if a lot of unwanted email comes in with forged "MAIL FROM:" lines in the envelope implicating some innocent site with which you may wish to correspond some day. Similarly, many folks may not be as quick to block relaying on their servers as you are, and merely require time and encouragement to patch their systems. In all cases, give folks who are not expressly in the business of spamming the benefit of the doubt, and periodically review your blacklist to make sure nobody is unfairly being blocked.

News Backoff Algorithm

Usenet news spam has been around longer than its email cousin, but it turns out to be fairly easy to implement a technical solution that greatly curtails it without serious side effects. The NNRP daemon (nnrpd) is the process on an INN-based Usenet news server that accepts connections from news reader clients. The first thing you must do is make sure that posting is restricted to only those hosts which should have access to the news server. The file that restricts access is called nnrp.access and is located with the rest of INN's configuration files. The exact location is operating system and version specific. Configuring this file is relatively simple; consult the nnrp.access(5) man page.

Additionally, what we've done is modify the NNRP daemon to keep track of how many posts come from a particular IP address in a period of time. If either the threshold for the number of articles per unit time or the total number of articles is exceeded, the nnrpd daemon goes to sleep for a few seconds. The sleep time increases exponentially with each new successful post until a maximum value is reached; of course, if the posting attempts cease, nnrpd recognizes this and resets the counter after a period of time.
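
To make the idea concrete, here is a small Python sketch of an exponential posting backoff. It is not the nnrpd patch itself, which is written for INN in C: the class name, parameter names, and default values are all assumptions chosen for illustration, and for brevity it applies only a total-count threshold rather than both of the thresholds described above.

```python
class PostBackoff:
    """Per-address posting backoff: free posts, then doubling sleeps."""

    def __init__(self, free_posts=10, base_sleep=2.0,
                 max_sleep=300.0, reset_after=3600.0):
        self.free_posts = free_posts    # posts allowed before any delay
        self.base_sleep = base_sleep    # first delay, in seconds
        self.max_sleep = max_sleep      # ceiling on the delay
        self.reset_after = reset_after  # quiet time that clears the count
        self._state = {}                # ip -> (post count, time of last post)

    def delay_for(self, ip, now):
        """Record one post from ip at time now; return seconds to sleep."""
        count, last = self._state.get(ip, (0, now))
        if now - last >= self.reset_after:
            count = 0                   # the poster went quiet; start over
        count += 1
        self._state[ip] = (count, now)
        over = count - self.free_posts
        if over <= 0:
            return 0.0
        # Double the delay for each post past the threshold, up to the cap.
        return min(self.base_sleep * 2 ** (over - 1), self.max_sleep)
```

A server loop would call delay_for with the client's address and the current time before accepting each post, and sleep for the returned number of seconds.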

This algorithm has been very successful for us on our news service. We have drastically cut down the spam sent through our service without eliciting too many complaints. Indeed, we have found that if these (configurable) values are set properly, very few human posters will notice this policy change while any overly prolific automated posting program will quickly slow down to a crawl.

The backoff patches to INN's nnrpd (for INN Version 1.4unoff4) are available to the public. Dave Hayes came up with the idea and wrote the patches while under contract by EarthLink Network. We expect these options to be added to the base INN distribution in the near future.

Of course, despite our best efforts and intentions, this can adversely affect some legitimate users. The first class of these are the frequent binary posters - their robot posting programs are, as far as this algorithm is concerned, indistinguishable from spammers. The second class of users that might notice this are those who use off line news readers. They slurp down piles of articles, read them off line, generate their responses, connect to the news server, and then send them up in one big batch. If the initial threshold is set to a number above what even an extreme news poster is likely to want to post in a single session, they won't be affected. Even if they are backed off, it may not be a problem for them as the postings will get through eventually, albeit slowly. If they are paying for connect time charges, though, this could be more than annoying.

The number of subscribers we have encountered who are legitimate users of the system but have been significantly affected by this change in service has been very small. For those who need to do robot posting, you could try to provide an authenticated NNRP service for them to post with. The details of such a news protocol have not been incorporated into an Internet standard, but the latest version of INN interoperates with several authenticating news clients.

Auditing Tools

We started by writing a simple script for parsing INN logs to assist humans in identifying spam. It examines news headers and reports on suspicious items - non-local email addresses, stereotypical spammy subject keywords ("FREE", "PANTIES", and others too explicit to print in a family periodical), excessive cross posting, a single person posting too many messages, etc. However, this is simply a reactive tool. Ideally we want to stop the spam before it starts. Nonetheless, we believe it is impossible to stop completely, so maintaining a battery of reactive tools is necessary.
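
The kinds of checks such a script makes might look like the following Python sketch. This is not our production script: the function names, thresholds, and keyword list are placeholders, and it assumes the article headers have already been pulled out of the logs or spool into a dictionary.

```python
from collections import Counter
from email.utils import parseaddr


def audit_article(headers, local_domains, keywords=("FREE",), max_groups=10):
    """Return a list of reasons an article looks suspicious (empty if none)."""
    reasons = []

    # Non-local email address in the From header.
    address = parseaddr(headers.get("From", ""))[1]
    domain = address.rpartition("@")[2].lower()
    if not any(domain == d or domain.endswith("." + d) for d in local_domains):
        reasons.append("non-local From address")

    # Stereotypical subject keywords.
    subject = headers.get("Subject", "").upper()
    for word in keywords:
        if word.upper() in subject:
            reasons.append("subject keyword: " + word)

    # Excessive cross posting.
    groups = [g for g in headers.get("Newsgroups", "").split(",") if g.strip()]
    if len(groups) > max_groups:
        reasons.append("cross-posted to %d groups" % len(groups))

    return reasons


def prolific_posters(all_headers, limit=20):
    """Return the addresses that posted more than limit articles."""
    counts = Counter(parseaddr(h.get("From", ""))[1].lower()
                     for h in all_headers)
    return sorted(addr for addr, n in counts.items() if n > limit)
```

The output is meant for a human to review; none of these signals is proof of abuse on its own.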

SATAN Module for relay and news checking

We also created a module for SATAN which can systematically walk through a network detecting if any hosts allow mail relaying, VRFY and EXPN queries, or are running unrestricted NNTP servers that may need to be protected. This will be packaged with the next release of SATAN and will be available at http://www.trouble.org/satan/spam.html. It's remarkable, even with a fine system administration staff and a conscientious technical crew, how many systems continually keep cropping up with these sorts of problems.

Additional Logging - RADIUS Accounting

Another problem ISPs face is identifying service abusers in real time. If caught "in the act", there is little room for argument as to whether or not they are responsible, and an immediate response can be taken. Needless to say, making a mistake here is both unfair and bad for business; therefore it is important to make this as accurate and efficient a process as possible.

Most dialup access equipment can be set up to use the RADIUS protocol to authenticate users' access to an ISP. An extension to this protocol, RADIUS Accounting, was designed to communicate accounting information between network access gear and an accounting server, and is an exceedingly valuable tool that can greatly help in identifying a resource abuser, albeit after the fact. Unfortunately this is still a problematic solution for us, primarily due to a lack of interoperability standardization and some very poor vendor implementations of this relatively new protocol. However, with the release of RFC 2139, which specifies RADIUS Accounting, we have further hope that sites will be able to support a RADIUS Accounting service stably across a variety of dialup access platforms.

Note that if you have a large number of dialup ports, setting up a RADIUS Accounting server can require a significant amount of planning and resources. The service must be stable, accurate, and reasonably speedy, or it isn't going to do any good. Therefore some thought and planning needs to be expended considering how this service will be set up and maintained, as well as what tools need to be written to access this information. If you are a smaller ISP or otherwise have a small dialup pool to maintain, the RADIUS Accounting logging code in the standard distribution can suffice, but larger services need to plan carefully for the very large volumes of data this service can generate.
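
As an example of the kind of tool that needs to be written, the sketch below answers the most common question asked of accounting data: which account held a given dialup address at a given time? It is an illustration only. The attribute names (User-Name, Framed-IP-Address, Acct-Status-Type, Acct-Session-Id) are standard RADIUS attributes, but the record layout, the "time" field, and the function name are assumptions about how a site might store its logs.

```python
def who_had_address(records, address, when):
    """Return the User-Name whose session held address at time when, or None.

    records is a time-ordered list of dicts, one per accounting record,
    each with a 'time' field (seconds) added by the logging server.
    """
    sessions = {}  # Acct-Session-Id -> [user, address, start, stop]
    for rec in records:
        sid = rec["Acct-Session-Id"]
        if rec["Acct-Status-Type"] == "Start":
            sessions[sid] = [rec["User-Name"], rec["Framed-IP-Address"],
                             rec["time"], None]
        elif rec["Acct-Status-Type"] == "Stop" and sid in sessions:
            sessions[sid][3] = rec["time"]

    for user, addr, start, stop in sessions.values():
        # A session with no Stop record is treated as still open.
        if addr == address and start <= when and (stop is None or when <= stop):
            return user
    return None
```

A real tool would also have to key sessions by access server as well as session ID, since IDs are not guaranteed to be unique across equipment, and would need an index rather than a linear scan at the data volumes a large ISP generates.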

As a final warning, even with RADIUS Accounting we (like many other ISPs) have an additional logging problem. Since we lease POPs from other ISPs (primarily UUNET) and therefore don't own all the resources involved, our people trying to identify the abuse of our systems will not always be the ones who are able to identify the account. This discontinuity makes it doubly important to have a single point of contact internal to EarthLink to manage and facilitate all communications for any network abuse.

Spam Cowboy

Despite our best efforts in the technical arena, we have discovered by far the most important ingredient in reducing spam is not technical in nature, but simply having a single person who understands both the technical and legal issues involved and personally handles the whole investigative process from beginning to end. This includes (but is not limited to) watching the logs (manually or assisted by automated processes) for suspicious behavior, determining if the records indicate potential abuse, deducing the originating host, checking the specific piece of access gear or logs for the abuser's identity, and at least recording that person's identity for possible action by the ISP. The basic idea is to eliminate any middle-men in determining who is responsible for a given infraction. Taking the appropriate action should be done as quickly as possible, because even a single abuser can do a lot of harm in a relatively short period of time! In addition, a rapid response by the ISP to a spamming incident tells the attackers that it would probably be unproductive to attempt further abuses if they were to sign up again with this particular organization.

As this is a relatively new job description, it's nearly impossible to find people who have any experience to fill the position - every organization must either develop these resources in house or steal them from another ISP. Not only must they have a good level of technical competence, a well developed sense of ethics, and a good set of social skills (interacting with individuals at other ISPs and organizations, law enforcement personnel, and customers demands this) but they also must have a very thick skin. Complaints from the general public, griping from the subscribers, and telephone calls from the abusers (which can even take the form of death threats!) are a daily occurrence. It really is a taxing job, and it is difficult for those who haven't experienced it first hand to understand its demands.

Punishment

One controversial measure which we employed was to modify the EarthLink Subscriber Acceptable Use Policy (AUP) to include a provision to charge $200 to a subscriber who commits acts of network abuse, which include spamming as we have described it here. Employing this was not without considerable controversy within EarthLink, and we had to lobby our legal department and upper management to get it passed. Fortunately, having worked closely with them throughout the process, we haven't experienced any negative legal fallout from imposing these fines.

Collecting the fines turned out to be very simple - we charge the credit card we already use for billing. From the start we consciously tried to reduce the number of "friendly fire" accidents by focusing only on the more egregious offenses and using these as examples. This way we've managed to hit back hard against the really bad offenders, while sending a message to the casual spammer that they should think twice before using our service for these purposes.

As with any policy change, it was vital to get the message out to our subscribers. We modified the AUP on our web page, sent email to all of our subscribers detailing the changes, and printed an article in our bimonthly newsletter, Blink, which is also online. The response we received from our subscribers regarding these changes has been overwhelmingly positive. The problem is well known and understood, and our candid description of what we were doing about it and how it would affect our customers was very well received.

This has been a big success for us, and we heartily recommend that other ISPs consider adopting a similar measure. It does require some serious work to accomplish, but we have found it to be more than worth pursuing.

Negative Solutions

There are several sets of measures that folks have taken on the Internet in an attempt to deal with spam that weren't mentioned here because we do not like them. Some so-called solutions are, in our opinions, not solutions at all, for they advocate an eye for an eye (or worse) philosophy. We feel that in some cases these "solutions" are at least as worrisome as the spamming problem they're attempting to solve, and we do not recommend that they be adopted.

Terrorism

Foremost among these are what can only be described as terrorist attacks: Ping-of-Death, mailbombing, smurfing, hacking, and other denial of service or outright attacks against both the spam purveyors and the unwilling accessories to their offenses. These attacks are worse than the spam itself, for while spammers are typically out for monetary gain, terrorists have real malice behind their actions with an intent to injure. In addition, these efforts can often have far reaching and unintended consequences, not only for the target of these sorts of attacks, but also for innocent victims along the path of destruction.

Black-Hole Routing

Another so-called solution that some folks have adopted to combat the spammers is to fail to route their networks; at the last LISA one such group claimed to have a set of participants that could eliminate a target's capability to see about twenty percent of the Internet by blacklisting them. While we believe that a terminal or endpoint network certainly has the right to not accept traffic from places they do not wish to communicate with, the potential for abuse makes this a practice we cannot support.

First, transit networks should not do this, only endpoint networks. As an ISP, we should not prevent the folks to whom we provide service from being able to contact anyone on the Internet that they choose. Under no circumstances should we censor their access without their express consent. If they ask us to filter, that is an entirely different matter and entirely acceptable.

Second, on more than one occasion, legitimate users have been cut off from a significant portion of the Internet accidentally, despite their innocence of any form of network abuse. We cannot, in good conscience, support a system where this is such a strong probability.

Third, this solution, as it is currently implemented, bestows a great deal of power to an individual, so a potential for abuse is there. Even though we don't suspect that any unethical activity is likely, the mere possibility of this is distressing.

In addition, misconduct, were it to occur, or the appearance of misconduct by the individuals maintaining a black hole list might make the spammers appear to be victims, rather than the network abusers that we believe they are.

We do not support these sorts of activities in any way, shape, or form. We implore those who employ these methods to desist, and we call on other legitimate organizations to decry these methods as well.

Future Work

There's a lot more still going on that will or may build on the efforts we have outlined here; unfortunately they aren't all positive or constructive, in our opinion. Here are some of those efforts.

Positive Solutions

Sendmail's no-relaying default

Starting with sendmail 8.9, sendmail will have mail relaying off by default - this should cut down the number of open relays by a considerable margin, since sendmail is still by far the most popular mail transfer agent on the Internet.

SMTP Backoff

Since our Usenet news backoff solution was so wildly successful, we've turned our attention to doing this for SMTP as well. We're currently talking with Eric Allman with the hope that he will add these capabilities to sendmail. What we would like to see is that the mail for which the recipient is local would stream through unaffected, while at the same time, outbound mail being relayed through the mail system would be subject to the same basic kinds of backoff procedures that we use for news. There's no completion date on this, but you might want to start looking for it sometime in 4Q 1998.
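To make the intended behavior concrete, here is a minimal Python sketch of one way such a policy could be expressed. It is not sendmail code and not a description of any planned sendmail feature. The class name, the parameters, and in particular the doubling rule are assumptions chosen for illustration; the only parts taken from the text above are that mail for local recipients passes with no delay and that relayed outbound mail is slowed down progressively.

```python
from collections import defaultdict


class RelayBackoff:
    """Delay relayed mail more and more as one client keeps submitting it."""

    def __init__(self, free_messages=5, base_delay=1.0, max_delay=3600.0):
        self.free_messages = free_messages  # relayed messages allowed with no delay
        self.base_delay = base_delay        # seconds of delay for the first throttled message
        self.max_delay = max_delay          # cap so the delay cannot grow without bound
        self.relayed = defaultdict(int)     # client address -> relayed messages seen so far

    def delay_for(self, client, recipient_is_local):
        """Return how many seconds to hold this message before accepting it."""
        if recipient_is_local:
            return 0.0                      # local delivery streams through unaffected
        self.relayed[client] += 1
        excess = self.relayed[client] - self.free_messages
        if excess <= 0:
            return 0.0
        # Assumed policy: double the delay for each message past the allowance.
        # The exponent is capped so the arithmetic stays within float range.
        exponent = min(excess - 1, 30)
        return min(self.base_delay * 2 ** exponent, self.max_delay)


backoff = RelayBackoff(free_messages=2)
delays = [backoff.delay_for("192.0.2.10", recipient_is_local=False) for _ in range(5)]
print(delays)  # prints [0.0, 0.0, 1.0, 2.0, 4.0]
```

In this sketch the per-client count never decreases. A usable version would need to let the count decay over time, so that an occasional burst from a legitimate user is not penalized indefinitely.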

Real-Time Monitoring

We have the unenviable (from a security perspective) situation of having a large amount of network traffic and bandwidth that will only grow larger. Trying to monitor 50 megabits a second of email in real-time is a difficult task at best - we have yet to find something that can keep up with this volume of traffic. However, with the recent release of Network Flight Recorder, a programmable high-speed network monitoring tool, we are hoping to put more significant effort into solving this problem. Having a tool that could warn us of network abuses as they occur could help us greatly mitigate our current dilemma. It remains to be seen whether this or any network monitoring tool can keep up with present and future load.

IP Caller-ID

We are envisioning a system whereby a unique identifier is handed out to a computer with a dynamic IP address when it signs on. This information can be used to grant or deny access to individual client machines within a single piece of dialup access gear. In this way multiple ISPs could share a common dialup access provider without making themselves vulnerable to network abuse by the other ISPs' subscribers, as they would be using IP based authentication alone. This idea is hardly even in its infancy, but it is a technical possibility which might be worth pursuing.
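Since no design exists yet, the following Python sketch shows only the general shape of the idea, under assumptions of our own choosing: the identifier is a random token, the access gear keeps a table from dynamic address to the ISP and token issued at sign-on, and a service checks that table before granting access. The class, method names, ISP labels, and addresses are all hypothetical.

```python
import secrets


class CallerIdTable:
    """Identifiers handed out by one piece of shared dialup access gear."""

    def __init__(self):
        self._by_ip = {}  # dynamic IP address -> (ISP name, unique identifier)

    def sign_on(self, ip, isp):
        """Record a new session and return the identifier issued for it."""
        identifier = secrets.token_hex(8)
        self._by_ip[ip] = (isp, identifier)
        return identifier

    def sign_off(self, ip):
        """Forget the identifier when the session ends and the address is freed."""
        self._by_ip.pop(ip, None)

    def allowed(self, ip, identifier, isp):
        """Grant access only if the identifier matches this address and this ISP."""
        return self._by_ip.get(ip) == (isp, identifier)


gear = CallerIdTable()
token = gear.sign_on("192.0.2.44", "isp-a")
print(gear.allowed("192.0.2.44", token, "isp-a"))  # prints True: isp-a's own subscriber
print(gear.allowed("192.0.2.44", token, "isp-b"))  # prints False: another ISP on the same gear
```

The point of the sketch is that the decision depends on who signed on, not merely on which address the connection comes from, which is what IP based authentication alone cannot tell apart on shared gear.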

ISP version of NCTDE

Starting in December 1997, the long distance telephone companies (IXCs) put into full service a blind database maintained by an external entity for the purpose of coordinating information on households which are a bad credit risk, that is, they jump from one long distance provider to another without paying their bills. This database is known as NCTDE (National Consumer Telecommunications Data Exchange). Because of the way this database is structured, the phone companies have obtained an anti-trust exemption from the Department of Justice. A nearly identical system, called NTDE (National Telecommunications Data Exchange), has been in service for over two years to track businesses in this manner.

No research or work that we're aware of has been done on this yet, but it seems reasonable that a similar service might be implemented in the ISP world, and that this service might be expandable to track spammers.

Alternate Mail Delivery Systems

While there have always been alternatives to sendmail, there has never been a serious challenge to its supremacy as the Unix mailer of choice on the Internet. However, there are two mailers that have stirred up quite a bit of interest and popularity - Qmail, by Dan Bernstein, and the upcoming VMailer by Wietse Venema - that have various anti-spam features, and, of course, have mail relaying off by default.

Resource Sharing Among ISPs

Resources can include real time information, as well as personnel, hardware, and software. Rapid and easy communication among ISPs on resource abuse may have a great deal of promise in reducing the overall impact of the spammers on the Internet, although there are significant technical and legal barriers to making this happen. However, we hope groups like IOPS will help establish a dialog on how the ISP industry as a whole can cooperate to reduce the spam problem.

Laws

For better or for worse it is primarily through legislation that governments have such an enormous impact on how the Internet functions. Opinion is currently split between those who believe that a legal approach would be a productive way to attack the spam problem and those who believe that government intervention is more to be feared than invited. We believe that both of these viewpoints are reasonable and are ourselves split on this issue, so in this section we cover the basic arguments for and against legislative solutions. In addition we give a rough, very US-centric overview of some of the more prominent bills currently being put forth.

Until now the US government has mostly let the Internet grow and evolve in a fairly unfettered state. This combined with the overwhelming success of the Internet has some fearing that if the government does intervene on the issue of spam that it will be an invitation for even more legislation on other issues that will have undesired consequences.

On the other hand, the often cited Junk Fax law (47 USC 227) has had a powerful and beneficial effect on curbing this nuisance in the fax world, and it's easy to understand why many folks believe that extending it to cover UCE and Usenet News spam would be very beneficial. If an anti-spam law would have an analogous impact to the Junk Fax legislation, it would be hard for anyone who opposes spam, no matter how anarchistic they might be, not to concede that such legislation was a good thing. Indeed, even if some small interference by government in other Internet areas were a consequence, on balance it might well be a price worth paying.

The bottom line is that any debate on whether the spam problem should be addressed via legislation leads to two key questions. First, would the legislation be effective in solving the problem? Second, is the price of the direct and indirect consequences of this legislation worth paying? Obviously and unfortunately, the answers to these questions are unknowable at the present time.

Virtually everyone does agree that it is by no means certain that good legislation will result from any governmental legislative efforts. However, since it is the nature (and, indeed, the vocation) of politicians and lawyers to legislate on topical issues, it seems certain that some laws will be forthcoming. And if poor legislation does get passed it would almost certainly be more difficult to undo this and obtain effective legal solutions in the future.

It's worth noting that almost no laws have been passed anywhere in the world to cover spam. It is absolutely in our best interests (out of self-preservation, if nothing else) to both try to understand the issues and to guide our legislators by written and verbal commentary. If the issues involving UCE are important to you, we urge you to educate both yourself and your legislators, regardless of your personal stance.

Here is a listing of the most prominent pieces of pending U.S. legislation and some brief commentary on how we view their relative merits. Some good, if partisan, overview of these bills is available at the CAUCE (the Coalition Against Unsolicited Commercial Email) web site. The focus of all these bills is on email, not Usenet spam.

Of course, even with good legislation there are problems. First and foremost, these laws would be only applicable within the United States. A major reason there are currently few problems with junk faxes that originate outside the country is because sending these is prohibitively expensive (although perhaps junk faxers just haven't figured out how to use the Internet for this yet). Since the Internet currently employs a distance insensitive pricing model, legislative action in this country could simply prompt a migration of the spammers to offshore locations. Not only would this lead to further loading of already congested international links, but it could lead to either an international law enforcement nightmare, or, more likely, a situation where nobody can take effective legal action.

In any case, unless the organizations or individuals sending the UCE are held accountable, there's not much anyone can do about overseas or domestic spamming. For all of these reasons and many more, technical solutions to UCE certainly are easier to implement! And while it is possible that good legislation may help reduce spam on the Internet, it is our opinion that even very good new laws will not completely solve the problem and strong technical mechanisms will still be the first line of defense.

Conclusion

For better or for worse, spam is here to stay. Electronic mail and news are both simply too effective a communication tool to be ignored by people either wanting to make money or spread a message to the masses. However, we are not advocating total elimination of UCE - we simply want people to use responsible and acceptable distribution practices. Abusing resources is simply not acceptable, and that is what this paper is trying to help prevent.

Unfortunately, even if everyone implemented all of the solutions we discussed in this paper, spam would continue. It is always going to be possible to misuse the Internet, because its two main strengths, power and flexibility, are particularly easy to exploit. But we do feel that some measures and practices are better than others, and if everyone (or even only ISPs) adopted them, they would definitely reduce the total amount of spam on the Internet. The most significant positive changes are:

The problems caused by spam and UCE are real and they are significant. The entire Internet has been affected by this malady, but there are things we can do to alleviate the problem. We believe that a combination of existing technical solutions, many of them described in this article, future technical work, and cooperation among Internet Service Providers can significantly reduce the impact of network abusers without unduly affecting responsible Internet users. Much work still needs to be done on this, but in this article we have shared the experiences and techniques that EarthLink Network has found useful in combating these problems, and we hope to start a dialogue on how we can further reduce the problems that spamming causes for the Internet as a whole.

Final Note

Also, note that this document is a work in progress. This article represents the version as of early February, 1998. Updated versions can be found at:

   http://www.trouble.org/security/spam_war.html

Acknowledgments

We'd like to thank David Beckemeyer for information on his IP Caller-ID proposal, Tim Bosserman for providing information on RADIUS Accounting, Dave Hayes for his work on the news backoff patches, Harris Schwartz for information on the results of our efforts, and Lisa Hoyt for her unceasing work in making the spammers leave EarthLink Network for some other provider, for now....

Appendix

We utilized a new SATAN module that used a very simple method to determine if a host allowed unrestricted mail relaying or not. Our methodology:

Note that the percent token (%) was used instead of the at sign (@) to determine if the system was a mail relay. Simple anti-mail relay rules in the SMTP daemon (and those proposed initially by sendmail.org) would allow this sort of mail to be delivered; we found several sites that blocked the latter method but not the former.

The return codes were then examined. If an appropriate response was received (250, etc.) the host was assumed to be an unrestricted mail relay. Obviously it would be best to actually send the mail and see if it was delivered, rather than this partial test, but the difficulties of scanning arbitrarily large networks from arbitrary hosts make this a more palatable (at least, significantly easier to program) solution.
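The partial test described above can be sketched in Python as follows. This is not the SATAN module, which is not reproduced here. It assumes one common form of percent-routed address (user%target@relay) and assumes that any 2xx reply to RCPT counts as an "appropriate response"; the function names and example hosts are hypothetical. The probe function uses the standard smtplib module and stops before DATA, so no message is actually sent.

```python
import smtplib


def percent_hack_recipient(user, target_domain, relay_host):
    """Build a recipient that asks relay_host to forward mail on to target_domain."""
    return f"{user}%{target_domain}@{relay_host}"


def looks_like_open_relay(rcpt_reply_code):
    """Treat any 2xx reply to RCPT TO as acceptance of the relay request."""
    return 200 <= rcpt_reply_code < 300


def probe(relay_host, sender, user, target_domain, timeout=10):
    """Run the partial test against one host and report the verdict."""
    recipient = percent_hack_recipient(user, target_domain, relay_host)
    with smtplib.SMTP(relay_host, 25, timeout=timeout) as smtp:
        smtp.helo()
        smtp.mail(sender)
        code, _ = smtp.rcpt(recipient)
        smtp.rset()  # abandon the transaction: no DATA command, so nothing is delivered
    return looks_like_open_relay(code)


# Example of the address that would be offered to a host under test.
print(percent_hack_recipient("postmaster", "example.org", "mail.example.com"))
# prints postmaster%example.org@mail.example.com
```

Because the verdict rests on the reply code alone, a host that accepts the recipient and later discards or bounces the message is counted as a relay, which is one way this kind of test can overstate the number of open relays.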

The method used is error prone in many ways, however. While none of them are fatal, false positives could occur in numerous ways:

On the other hand, false negatives are possible as well, due to:

Ideally we would hope that the errors would either not come up or simply cancel themselves out, but in practice the results found in this survey are probably an upper limit, likely within 10% of the final total (this is not an exact science!).