WindowsDevCenter.com

Kill Internet Ads with HOSTS and PAC Files

by Sheryl Canter
03/30/2004

If you want to block web site ads and offensive web content, you don't need to buy special software. Instead, you can use two techniques that work with any browser. One uses the HOSTS file built into Windows, and the other uses PAC files, a feature of all modern browsers. Problems can crop up with both approaches; this article explains why they occur and how to solve them.

Few web sites host their own banner ads. Typically they sign up with ad servers that deliver content and track views and clicks. Thus you can block most web site ads by blocking a fairly limited number of ad servers. HOSTS and PAC files can block web ads by blocking access to these ad servers. You can also block other sites serving objectionable content.

What Is the HOSTS File?

Unless a computer is configured to use a proxy server, the HOSTS file is the first place a browser looks for an IP address when you type in a URL such as www.permutations.com. Only if the domain name is not found in the HOSTS file does the browser query the DNS server. This lookup order is what makes the HOSTS file an effective means of blocking web site ads.
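The lookup order described above can be modeled in a few lines of Python. This is only a sketch of the idea, not how Windows implements name resolution: the `resolve` function, the `hosts_table` dictionary, the `fake_dns` dictionary standing in for a DNS server, and the example.com names and 192.0.2.x addresses are all hypothetical.

```python
from typing import Callable, Dict, Optional


def resolve(hostname: str,
            hosts_table: Dict[str, str],
            dns_lookup: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return an IP address for hostname, consulting the HOSTS table before DNS."""
    key = hostname.lower()           # host names are case-insensitive
    if key in hosts_table:
        return hosts_table[key]      # a HOSTS entry wins; DNS is never asked
    return dns_lookup(key)           # fall back to DNS only on a miss


# Hypothetical data: one blocked ad server and a stand-in for a DNS server.
hosts_table = {"localhost": "127.0.0.1", "ads.example.com": "127.0.0.1"}
fake_dns = {"www.permutations.com": "192.0.2.10", "ads.example.com": "192.0.2.99"}

print(resolve("ads.example.com", hosts_table, fake_dns.get))
print(resolve("www.permutations.com", hosts_table, fake_dns.get))
```

The first call returns the loopback address even though the stand-in DNS table has a different answer for the same name, which is exactly the behavior the ad-blocking trick relies on.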

The HOSTS file is stored in different places depending on your operating system:

Windows 95/98/Me             c:\windows\hosts
Windows NT/2000/XP Pro       c:\winnt\system32\drivers\etc\hosts
Windows XP Home              c:\windows\system32\drivers\etc\hosts

It's a text file you can open in Notepad. Comments at the top explain the simple syntax. Each line consists of an IP address, a domain name, and an optional comment placed after a pound sign. The one default entry in every HOSTS file looks like this:

127.0.0.1      localhost      # this is the IP address of your local computer

127.0.0.1 is a special IP address called the "loopback" because it refers to the local computer. The loopback address gives developers a way to test network software without being physically connected to a network. This prevents buggy network hardware or software from obscuring test results. The loopback address also can be used to prevent web ads from displaying.
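To make the HOSTS syntax concrete, here is a minimal Python sketch that parses one line into its IP address and host names and discards the comment. The function name `parse_hosts_line` is hypothetical, and the sketch assumes fields are separated by spaces or tabs; it accepts more than one host name per line, which HOSTS files also allow.

```python
from typing import List, Optional, Tuple


def parse_hosts_line(line: str) -> Optional[Tuple[str, List[str]]]:
    """Parse one HOSTS line into (ip, [names]); return None if it has no entry."""
    text = line.split("#", 1)[0].strip()   # drop everything after the pound sign
    if not text:
        return None                        # blank line or comment-only line
    fields = text.split()                  # fields are separated by spaces or tabs
    if len(fields) < 2:
        return None                        # an IP address with no name is not an entry
    return fields[0], fields[1:]


print(parse_hosts_line("127.0.0.1      localhost      # this is the IP address of your local computer"))
```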

Figure 1. A site with flashing banner ads before and after ad blocking.

To use the HOSTS file to block web ads, you add a list of hosts serving objectionable content (such as ad servers), and associate these domains with the loopback address -- your own computer. Then when you navigate to a site that contains banner ads, the browser looks on your own machine for the ads and never visits the ad server. Thus the ads are never displayed, and the ad server has no opportunity to put tracking cookies on your computer.

Compiling a list of ad servers for an ad-blocking HOSTS file would take a lot of time, but happily you don't have to do it. There are numerous ad-blocking HOSTS files available for download on the Internet. Mike Scallas distributes one that is updated each month.

Regular updates are necessary because new ad servers pop up all the time. If you see an ad while running an ad-blocking HOSTS file, it means one of two things: (1) the ad is hosted on the site's own server, or (2) it comes from a new ad server that isn't in your HOSTS file yet. To find out where the ad is coming from, right-click it and select Copy Shortcut. If the ad is hosted on the site, you can't block it with a HOSTS file because HOSTS files can only block entire sites. (This is not true of PAC files, which I'll discuss later.) If it's a new ad server, paste the domain portion of the copied URL into your HOSTS file with a redirect to 127.0.0.1.
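Pulling the domain portion out of a copied ad URL is easy to get wrong by hand, so here is a small Python sketch of that step. The function name `hosts_entry_for` and the example URL are hypothetical; the sketch assumes the copied shortcut is a full URL that includes the http:// prefix.

```python
from urllib.parse import urlsplit

LOOPBACK = "127.0.0.1"


def hosts_entry_for(ad_url: str) -> str:
    """Build a HOSTS line that redirects the URL's host to the loopback address."""
    host = urlsplit(ad_url).hostname   # strips scheme, port, path, and query string
    if not host:
        raise ValueError("no host name found in URL: " + repr(ad_url))
    return f"{LOOPBACK}\t{host}"


# Hypothetical ad URL copied from a banner.
print(hosts_entry_for("http://ads.example.net:8080/banners/top.gif?id=42"))
```

Only the host name ends up in the entry. The path and query string are discarded, which is why this technique cannot block a single URL on a site you otherwise want to visit.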

HOSTS File Problems and Solutions

The HOSTS file trick is clever, but there are some potential problems with it. Ad-blocking HOSTS files can include entries that shouldn't be there, blocking access to sites you want to see. This occurs because some ad servers also provide other types of content. For example, the ad server akamai.com also provides streaming media for many web sites, including Microsoft, for which it handles Windows Update. If you block akamai.com, you won't be able to access Windows Update.

Then there's the aesthetic issue. Ideally, you'd see blank areas in place of ads, but in actual practice there are unattractive "Action canceled" error messages repeated wherever an ad would have been. There is a solution to this, as you'll see shortly.

And then there is the problem with delays. The idea behind the HOSTS file trick is to redirect ad-server requests to an IP address where there is no server. Internet Explorer will fail immediately if it can't find a server, but other browsers (notably, Opera) wait much longer before giving up.

The last two problems, error messages and delays, can both be solved by installing a small, single-purpose web server that does nothing but serve transparent bitmaps when requests are received on the loopback address. This replaces unsightly error messages with blank areas, and eliminates delays because the browser receives an immediate response. A free utility for this purpose will be described later in this article.
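To illustrate the idea of a single-purpose server, here is a minimal Python sketch using the standard library's http.server module. It is not the utility discussed in this article. The names `BlankImageHandler`, `make_server`, and `fetch_once` are hypothetical, the sketch answers with a 1x1 transparent GIF rather than any particular bitmap format, and the demonstration binds to a port chosen by the operating system instead of port 80.

```python
import base64
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# A 1x1 transparent GIF, decoded once at startup.
TRANSPARENT_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


class BlankImageHandler(BaseHTTPRequestHandler):
    """Answer every GET, whatever the path, with the same transparent image."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "image/gif")
        self.send_header("Content-Length", str(len(TRANSPARENT_GIF)))
        self.end_headers()
        self.wfile.write(TRANSPARENT_GIF)

    def log_message(self, format, *args):
        pass  # keep the console quiet; a blocked page can trigger many requests


def make_server(port: int) -> HTTPServer:
    """Create a server bound to the loopback address only."""
    return HTTPServer(("127.0.0.1", port), BlankImageHandler)


def fetch_once() -> bytes:
    """Start the server on a free port, request a fake ad, and shut down."""
    server = make_server(0)  # port 0 lets the operating system pick a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        # Bypass any configured proxy so the request really goes to loopback.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(f"http://127.0.0.1:{port}/ads/banner.gif", timeout=5) as resp:
            return resp.read()
    finally:
        server.shutdown()
        server.server_close()


print(len(fetch_once()), "bytes served")
```

To stand in for redirected ad servers, such a server would have to listen on port 80, since that is where browsers send plain HTTP requests, and binding to that port may require administrator rights.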

But there are other potential problems. If you are running a real web server on your computer such as Personal Web Server (PWS) or Internet Information Services (IIS), you'll get a dialog prompting for a network password each time you navigate to a site with redirected ads. This is because, by default, PWS and IIS are configured as the "default web site," responding to all IP addresses assigned to the computer that are not assigned to other sites. When the HOSTS file redirects your browser to the loopback address, an actual web server is there to answer. Since the request is for resources it can't find, it pops up an "Enter network password" dialog.

There are various things you can do to get around this, but all involve giving up something. If your computer is on a network, you can change the default IP setting of "(All Unassigned)" to the computer's network IP address, thus excluding 127.0.0.1. The PWS/IIS Help file warns against doing this because it can cause some server features to stop working. But if all you're using your web server for is testing sites before uploading, you may not care.

Another possibility is to redirect the ad servers in the HOSTS file to a non-existent IP address such as 0.0.0.0. This works with IE and Mozilla-based browsers, but Opera objects to the non-existent address and pops up an error message. Also, if you change the address from 127.0.0.1, you can't use a special-purpose web server to eliminate unsightly error messages and delays.

PWS and IIS are configured by default to use TCP port 80, which is standard for HTTP. Another way you can prevent the "Enter network password" popup is to change the port to something other than 80 (81, for example). But this will make your server invisible to anyone who doesn't know that the port must be specified in the URL.

The best solution if you're running a web server is to not use the HOSTS file for ad blocking at all, but instead to use a PAC file, which doesn't conflict with existing web servers. PAC files have other advantages as well. As mentioned earlier, HOSTS files can only block entire sites, not specific URLs within a site. PAC files can block specific URLs within a site, so you could, for example, block akamai.com ads without disabling Windows Update.

HOSTS files have to be large to block all the major ad servers because wildcards are not supported; you have to list the exact domain names. Very large HOSTS files slow your browser because of the time it takes to search a large, unindexed text file. PAC files are based on JavaScript and can specify URLs using shell expressions (Unix-style wildcard patterns, similar in spirit to regular expressions), so this problem is eliminated.
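The difference wildcards make can be sketched in Python, with the standard fnmatch module standing in for the shell-expression matching that PAC files offer. This is a model of the decision logic only, not a PAC file: in an actual PAC file the equivalent logic would be written in JavaScript, inside a function named FindProxyForURL that returns strings such as "DIRECT" or "PROXY host:port". The patterns, the example URLs, and the dead-end proxy address below are all hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns: one covers every host under a domain, one covers a URL path.
BLOCK_PATTERNS = [
    "*.adserver.example/*",
    "*/banners/*",
]

DIRECT = "DIRECT"                    # fetch the URL normally
BLACKHOLE = "PROXY 127.0.0.1:3421"   # hypothetical dead-end proxy on the local machine


def find_proxy_for_url(url: str) -> str:
    """Send URLs that match a block pattern to the dead-end proxy, others direct."""
    for pattern in BLOCK_PATTERNS:
        if fnmatchcase(url, pattern):    # '*' matches any run of characters
            return BLACKHOLE
    return DIRECT


print(find_proxy_for_url("http://img.adserver.example/x.gif"))
print(find_proxy_for_url("http://www.example.com/banners/top.gif"))
print(find_proxy_for_url("http://www.example.com/downloads/update.exe"))
```

Two patterns do the work of many exact HOSTS entries, and the second one blocks a path on a site whose other URLs still load, which a HOSTS file cannot do.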

Finally, ad-blocking HOSTS files cannot be used on systems using proxy servers because the HOSTS file is bypassed. Proxy servers are not a problem with PAC files.
