
Gnutella and the Transient Web
Pages: 1, 2, 3, 4

Contrasting the Transient and Permanent Webs

To better understand the transient Web, let's contrast it with the conventional permanent Web on some key points.

The Web is Web-like because of hyperlinks. Hyperlinks work under the assumptions that content remains accessible at a fixed URL and that a server specified by the URL is available to serve the content. Unfortunately, both assumptions fail on the transient Web. The machine at a given IP address may not be there tomorrow, or in one hour or five minutes or one second. For this reason, with a few exceptions, you cannot browse your way from the permanent Web to the transient Web, nor will you find transient Web sites indexed by conventional search engines.

In fact, the transient Web is at present largely devoid of the sense of "place" that dominates the vocabulary of the permanent Web. Normally, we "visit" sites specified by a "location" or "address." When we can't enter a fixed address or follow a static hyperlink, this sense of place vanishes. An alternative sense of "medium" takes its place.

Instead of laboring to locate a particular site carrying a sought-after piece of content, we currently turn to the transient Web primarily as a medium. A search goes out into the ether and answers come back. The simplicity of this is so desirable that search engines arose on the permanent Web to provide precisely the same experience, although the execution under the hood is rather different. On the transient Web instantiated by Gnutella, a search engine is built into the infrastructure. Without it, no one would find anything. Another way to find transient sites and their content would be to maintain a resource registry - essentially a dynamic DNS that provides a basis for location-independent URLs instead of location-dependent URLs (see FirstPeer and XDegrees for more on this approach). A single registry, however, implies centralization and an external dependency, both anathema to Gnutella, which opts for a more decentralized approach with minimal reliance on outside systems.
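To make the registry idea concrete, here is a minimal sketch of how a location-independent name might be mapped to a site's current address. It is not how FirstPeer or XDegrees actually work; the class name, the lease mechanism, and the example name and addresses are all assumptions made for illustration.

```python
import time


class ResourceRegistry:
    """Hypothetical registry: stable name -> a site's current address."""

    def __init__(self, lease_seconds=60.0, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def register(self, name, address):
        # A transient site re-registers whenever its address changes,
        # and periodically to renew its lease.
        self._entries[name] = (address, self.clock() + self.lease_seconds)

    def resolve(self, name):
        # Return the current address, or None if unknown or expired.
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires = entry
        if self.clock() >= expires:
            del self._entries[name]  # the site stopped renewing; forget it
            return None
        return address


registry = ResourceRegistry(lease_seconds=60.0)
registry.register("alice/music", "10.0.0.5:6346")   # made-up name and address
registry.register("alice/music", "10.0.0.99:6346")  # the site moved
print(registry.resolve("alice/music"))
```

The name stays fixed while the address behind it changes, which is what would let a hyperlink survive a site's comings and goings. The cost is the one noted above: every lookup depends on the registry being reachable.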

The transient Web's "sense of medium" has a profound effect on marketing and distribution. As many dot-coms discovered on the conventional Web, just because you build it doesn't mean that they'll come - and if by good fortune they do come, you may not be able to handle the load. Promotional expense is required to expose your address and bring people to your site, and a key promotional tactic is to be listed as prominently as possible in search engines. In a system dominated by a sense of place, you must distribute as many signposts as possible, and reach doesn't come cheap.

Gnutella's search scheme makes the query stream a publicly accessible resource. By tuning in to the query signal on the ether, anyone can hear a torrent of broadcast searches and route back responses with results advertising their own content. In short, on the transient Web enabled by Gnutella, reach is nearly free. Moreover, since content is more important than its source, users are willing to obtain it from almost any site. LimeWire's "Smart Downloader" feature even automates this process, retrying multiple sources of the same content item until a complete download succeeds. Users who download content can easily become re-distributors, leading to the phenomenon known as "superdistribution." The upshot: On the transient Web, distribution is almost free.
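The retry-across-sources behavior can be sketched in a few lines. This is not LimeWire's implementation, only an illustration of the idea; the function names, the stand-in `fake_fetch` transfer, and the host addresses are hypothetical.

```python
def download_from_any(sources, fetch):
    """Try each source in turn; return (source, data) for the first success."""
    last_error = None
    for source in sources:
        try:
            data = fetch(source)
        except OSError as err:
            last_error = err  # this source vanished or failed; try the next
            continue
        return source, data
    raise OSError("no source delivered a complete copy") from last_error


def fake_fetch(source):
    # Stand-in for a real transfer: only one made-up host is still online.
    if source != "10.0.0.7:6346":
        raise ConnectionError(f"{source} is no longer reachable")
    return b"complete file contents"


sources = ["10.0.0.3:6346", "10.0.0.5:6346", "10.0.0.7:6346"]
winner, data = download_from_any(sources, fake_fetch)
print(winner, len(data))
```

Because the caller cares only about the content, any host that answers is as good as any other, and a site disappearing mid-search costs one retry rather than a failed download.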

Gnutella's search capability is not perfect, of course. From the point of view of the searcher, there is no guarantee your query will reach the sites holding what you seek, and the results that you do receive will arrive in a jumble. From the point of view of the content provider, there is no guarantee you will hear every query you're interested in hearing; maximum possible reach might still take some effort.

A number of projects are addressing these shortcomings. For example, one search engine attempts to comprehensively track content on the transient Web. A search issued there first hits the engine's local database of transient site content listings; as a more time-consuming fallback, the engine simultaneously broadcasts the query to the Gnutella network. Because the engine holds relatively fresh addresses of transient sites, it is one exception to the rule that the transient Web cannot be reached from the permanent Web. Addressing the other side of the coin, at Clip2 we have continuously studied the distribution of query traffic across the network to inform strategies for connecting in ways that let a site hear as many queries as possible.

While Gnutella's search functionality makes queries public, it keeps them anonymous to a degree. Each query is assigned a unique ID at its source. As queries are handed from site to site across the network, each site keeps a temporary record in memory of which neighboring site handed it which query, but no record is passed of who originated the query. In this way, query responses have to route back through the chain, and only your immediate neighbors can correlate your IP address with your queries. The privacy of your queries is therefore dependent upon the hosts to which you are connected, which are likely to be operated by random users such as yourself. By contrast, on the permanent Web the privacy of your queries is dependent upon the policies of the search engine you use.
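The routing scheme just described can be sketched as a small in-memory simulation. It is a simplification under stated assumptions, not the Gnutella wire protocol: the `Node` class, its method names, the TTL value, and the file name are all made up for illustration, and real sites exchange network messages rather than direct method calls.

```python
import uuid


class Node:
    """Hypothetical Gnutella-style site (illustrative names only)."""

    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)
        self.neighbors = []
        self.own_queries = set()  # IDs of queries this site originated
        self.routes = {}          # query ID -> neighbor that handed it to us
        self.results = []         # hits that found their way back to us

    def connect(self, other):
        self.neighbors.append(other)
        other.neighbors.append(self)

    def search(self, keyword, ttl=4):
        query_id = uuid.uuid4().hex  # unique ID assigned at the source
        self.own_queries.add(query_id)
        for neighbor in self.neighbors:
            neighbor.on_query(query_id, keyword, ttl, sender=self)
        return query_id

    def on_query(self, query_id, keyword, ttl, sender):
        if query_id in self.routes or query_id in self.own_queries:
            return  # already seen this query; drop the duplicate
        # Remember only the neighbor that handed us the query.
        # Nothing in the message says who originated it.
        self.routes[query_id] = sender
        hits = sorted(f for f in self.files if keyword in f)
        if hits:
            self.on_hit(query_id, self.name, hits)
        if ttl > 1:
            for neighbor in self.neighbors:
                if neighbor is not sender:
                    neighbor.on_query(query_id, keyword, ttl - 1, sender=self)

    def on_hit(self, query_id, holder, hits):
        if query_id in self.own_queries:
            self.results.append((holder, hits))  # back at the origin
            return
        previous = self.routes.get(query_id)
        if previous is not None:
            previous.on_hit(query_id, holder, hits)  # one hop back


a, b, c = Node("a"), Node("b"), Node("c", files=["free_song.mp3"])
a.connect(b)
b.connect(c)
qid = a.search("song")
print(a.results)
print(c.routes[qid].name)  # c knows only that b handed it the query
```

In this sketch, site `c` answers a query it received from `b` without ever learning that `a` asked. Only `b`, the immediate neighbor, can tie the query to `a`.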

