

Peer-to-Peer Makes the Internet Interesting Again

by Andy Oram
09/22/2000

By the summer of 2000, it looked like the Internet had fallen into predictable patterns. Retail outlets had turned the Web into the newest mail order channel, while entertainment firms used it to rally fans. Portals and search engines presented a small slice of Internet offerings in the desperate struggle to win eyes for banner ads. The average user, stuck behind a firewall at work or burdened with usage restrictions on a home connection, settled down to sending e-mail and passive viewing.

In a word, boredom. Nothing much to write about, or for creative souls to look forward to. An Olympic sports ceremony that would go on forever.

At that moment a number of shocks exploded upon the public. The technologies were not precisely new, but a number of people realized for the first time that they were having a wide social impact. First, Napster (preceded by Scour) caused a ruckus over its demands on campus bandwidth, as well as its famous legal problems. A new generation of distributed file servers like Freenet and Gnutella were widely reported. The founders of a new chat service called Jabber declared that XML was more than a tool for B2B transaction processing, and in fact could structure information chosen by ordinary users. Shortly after that, Microsoft announced they were betting the house on the .NET initiative, in which Web clients and servers divide jobs among themselves.

Analysts trying to find the source of inspiration among these developments also noted that the computing model of the SETI@Home project, which exploited processing power that had gone to waste for years, was being commercialized by a number of ventures. And that a new world of sporadically connected Internet nodes was emerging in laptops, handhelds, and cell phones, with more such nodes promised for the future in the form of household devices.

What thread winds itself around all these developments? In various ways they return content, choice, and control to ordinary users. Tiny end points on the Internet, sometimes without even knowing each other, exchange information and form communities. There are no more clients and servers -- or at least, the servers retract themselves discreetly. Instead, the significant communication takes place between cooperating peers. And thus, starting around early July 2000, the new Internet model was dubbed peer-to-peer.

Mapping the terrain at the summit

In August, Tim O'Reilly surmised that peer-to-peer technology could evolve faster if key leaders, each of whom "had a hand on a piece of the elephant," started talking intensively to each other. Furthermore, he hoped they could emerge from a summit with a message to help the public understand the potential of peer-to-peer. In particular, he wanted to counteract the negative image often attached to the movement and its most visible expressions, such as Napster and Gnutella.

Organized by numerous departments across O'Reilly & Associates, the summit on September 18 in San Francisco was attended by some 20 people whose expertise ranged across the computer field. (See list in accompanying Web page.) They included technologists from major companies, visionary heads of startups, journalists, leaders of academic and experimental projects, and people who have just done a variety of interesting things in the information processing field and are always looking for the next interesting thing.

The meeting took place in a modest conference room in the Lone Mountain Center of the University of San Francisco, located atop a hill that offered gorgeous views in every direction. The day went fast, even though most participants looked tired at the end. Several expressed pleasure at being in the same room with such an aggressively innovative bunch of people. Our agenda broke down pretty much as described below.

The energy in the day rose and fell in a kind of bell curve. At first we were getting to know each other and setting the groundwork for discussion. It was notable how often someone would state what he felt to be a defining characteristic of peer-to-peer (such as "it can't do searches for poorly understood items") only to be vigorously contradicted by someone from another project. Being geeks to some extent, the group could become highly animated over minor technical points, such as whether instant messaging passes text through a central server (the answer is sometimes) where a snoop could theoretically monitor content. We also got incensed over the threat of software patents.

But in the descending rays of afternoon sun, when we tried to extract a simple set of principles from our experience, we found out how new and unformed the field is. Simple generalities couldn't hold up to dispassionate observation. At the end of the day, literally, we had to be content with listing the early successes of peer-to-peer and suggesting a vision that many of us are fashioning.

Still, some themes emerge from a number of peer-to-peer projects.

Making galaxies out of dark matter

Clay Shirky, a partner at The Accelerator Group and a writer for Business 2.0, who came up with many of the catchy epigrams of the day, pointed out that peer-to-peer exploits the "dark matter of the Internet." Just as interstellar space contains huge amounts of inert dust that we can't track or measure -- but that may be forming galaxies all the time -- personal computers offer enormous amounts of cycles consumed by busy waits, along with disk space waiting to be filled and communication lines that lie idle in between keystrokes. Peer-to-peer creates constellations of cooperating users out of these wasted resources. As explained by Dave Stutz, a software architect at Microsoft, these three key resources -- CPU power, storage, and communications capacity -- are the prerequisites for each new leap in computer applications. Or, to quote again from Shirky, "With peer-to-peer the computer is no longer just a life-support system for your browser."

Performance concerns many observers. Does peer-to-peer cause unnecessary processing and Internet traffic? Napster is not valid evidence for these concerns, because exchanging large files would stress networks regardless of the technology used to administer the transaction. Many expect peer-to-peer to actually conserve bandwidth when used appropriately.

One such observer is Bob Knighten, peer-to-peer evangelist for Intel. His company hopes that they can avoid the crippling effect of users downloading training videos by letting files relocate near groups of systems that request them. Comparable to caching by Internet hubs or multicasting, this technique is distinguished by its use of regular, underutilized computers instead of specialized, expensive file servers. The model of shifting content to where it's wanted is also the basis of Freenet.
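
A rough sketch of that relocation idea, assuming a purely hypothetical group-level cache: the first request for a file goes upstream to the origin, and later requests from the same group of machines are served by a nearby peer that kept a copy. The class name, capacity, and eviction policy are illustrative only, not Intel's or Freenet's actual design.

    # Content drifts toward demand: fetch once from far away, then serve
    # subsequent requests from a copy held near the requesters.
    from collections import OrderedDict

    class GroupCache:
        def __init__(self, capacity=100):
            self.capacity = capacity
            self.files = OrderedDict()           # filename -> content

        def fetch(self, name, origin):
            if name in self.files:               # already held by a nearby peer
                self.files.move_to_end(name)
                return self.files[name]
            content = origin[name]               # expensive trip across the WAN
            self.files[name] = content
            if len(self.files) > self.capacity:  # drop the least-wanted file
                self.files.popitem(last=False)
            return content

    origin = {"training-video.mpg": b"...many megabytes..."}
    cache = GroupCache()
    cache.fetch("training-video.mpg", origin)    # first request goes upstream
    cache.fetch("training-video.mpg", origin)    # later requests stay local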

But there's no doubt that peer-to-peer will challenge the architecture of current Internet services. Nelson Minar, cofounder of a distributed computing startup named Popular Power, says that peer-to-peer redefines the assumptions behind asymmetric service (like ADSL and cable modems). Michael Tiemann, CTO of Red Hat, adopts a positive attitude and goes so far as to say, "Peer-to-peer may be the critical enabling technology that makes broadband possible."

Flexibility offers a key opportunity

Traditional search services may offer a variety of advanced options, but you can't tell the service how to organize its data. Napster is child's play for finding pop songs, but its rigid classification service doesn't accommodate classical music well.

Peer-to-peer raises the possibility for people interested in a topic to create their own language for talking about it. While different communities may all share an underlying infrastructure, like Jabber's chat service or Gnutella file sharing, the structure of the users' data can emerge directly from the users.

Metadata, which describes each file and the elements within it, holds the key to self-organization. XML is a good foundation -- but only a foundation, because it just offers a syntax. Building on the XML foundation, schemas hold some promise for structuring both content and users' reactions to the content. One slogan we considered was, "Publish my taste, not just my music files."

Schemas can be handed down by standards committees, but they can also be thrown out on the Net by individuals or communities for widespread consideration. Each community will settle on the ones it likes. We called this principle "schema agnosticism."
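
As a toy illustration of what "publish my taste, not just my music files" might look like, the following sketch builds a small XML record that mixes descriptive metadata with one listener's reaction. The element names are invented for this example; under schema agnosticism, each community would settle on its own vocabulary.

    # A toy record combining a track's metadata with a listener's reaction.
    import xml.etree.ElementTree as ET

    track = ET.Element("track")
    ET.SubElement(track, "title").text = "Prelude in C minor"
    ET.SubElement(track, "composer").text = "J. S. Bach"
    ET.SubElement(track, "performer").text = "Glenn Gould"

    reaction = ET.SubElement(track, "reaction", listener="andyo")
    ET.SubElement(reaction, "rating").text = "5"
    ET.SubElement(reaction, "mood").text = "late night"

    print(ET.tostring(track, encoding="unicode"))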

While we all paid homage to the glories of metadata, we remained skeptical that people would understand the need for it and take the trouble to contribute information. Rael Dornfest, creator of the RSS-based Meerkat service for the O'Reilly Network, said that a little metadata goes a long way, adding logarithmically to the value and uses of the data it describes.

Any service or tool that provides either automated or manual access to reading and writing metadata offers an opening for an intrepid explorer to map the data and thus create an innovative service for other users. (Invisible Worlds is promoting a generalized approach to such APIs through the Simple Exchange Protocol, currently an Internet Draft.) This is a good reason to ask services to expose their data schemas and publish open APIs to their services. It's also a reason to deplore attempts to shut off innovation through bans on deep linking, restrictions on the amount of data one user can retrieve, or practices that force developers to resort to "screen-scraping" and other fragile data-retrieval tricks.

The message

Peer-to-peer is so new that we have trouble attaching categorical statements to it. Yet it is also the oldest model in the world of communications. Telephones are peer-to-peer, as are Fidonet and the old-style UUCP implementation of Usenet. IP routing, the basis of the Internet, is peer-to-peer, even now when the largest access points raise themselves above the rest.

Until the past decade, every Internet-connected system hosted both servers and clients; aside from dial-up users, nothing like the second-class status of today's browser-bound PCs existed. One of our messages, therefore, is:

Peer-to-peer is fundamental to the architecture of the Internet.

Another element of our message is that peer-to-peer brings people together; it builds communities in a way that eludes all the portals that so earnestly try to build them from the top down. Jabber, by providing chatters with an easy way to categorize their content, hopes to help them organize themselves. Weblogs, Wiki, and Meerkat let people follow and comment on the goings-on of other people who interest them. Even a passive sharing of resources, like Popular Power or SETI@Home, makes people feel they're part of something big -- and they like the feeling, according to Minar.

When people collaborate, each has the opportunity -- though not the requirement -- of being a producer as well as a consumer. Those who produce may be relatively few: For instance, a recent study found that only 2% of Gnutella users contribute content, and even on Usenet News the ratio of posters to total readers is only about 7%. But that is enough to keep the system afloat. The important thing is to give everybody the opportunity. Thus our slogans:

Peer-to-peer is the end of the read-only Web.

Peer-to-peer allows you to participate in the Internet again.

Peer-to-peer: steering the Internet away from TV.

Since each peer represents a person (and in fact, a person can appear in many different guises on one or more peer-to-peer systems), these systems can lead to emergent, self-organizing communities.

Some peer-to-peer systems deal in fungible resources, such as the use of idle computing power by SETI@Home or Popular Power. In these systems, resources are valuable because they're interchangeable. But in many systems, where the goal is to share files or metadata, the diversity of the peers is what makes the system valuable. Resources like disk space may start out fungible but then develop differences as users add unique content.

Limitations

Ray Ozzie, the developer of Lotus Notes and more recently the founder of Groove Networks, asked us to come up with a clear marketing slogan that would let a busy system designer decide whether peer-to-peer was right for his system. We didn't succeed at finding a pithy summary, but we did make a list of applications that raise a warning flag. There are situations where peer-to-peer is contraindicated.

Peer-to-peer is probably not appropriate when it's important for everybody to know all the current data at the same time. You probably can't run a stock market with peer-to-peer, for instance, because fair pricing fundamentally depends on everybody knowing the asking price and seeing the impact of all bids. An airline reservation system wouldn't be appropriate either, unless each buyer is willing to sit on somebody else's lap. We did not achieve consensus on whether peer-to-peer applications are good for rapidly changing data.

Peer-to-peer systems seem to contain an inherent fuzziness. Gnutella, for instance, doesn't promise you'll find a file that's hosted on its system; it has a "horizon" beyond which you can't see what's there. Knighten said, "There are a lot of things in the peer-to-peer world that we can take advantage of only if we give up certitude." Most computers come and go on the Internet, so that the host for a particular resource may be there one minute and gone the next.
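
The horizon falls out of the way Gnutella-style queries are flooded to neighbors with a limited time-to-live. Here is a small illustrative sketch (the topology and TTL are made up) showing how a peer holding a file can sit just past the edge of what a search will ever reach.

    # Queries stop spreading when the TTL runs out, so distant peers are never
    # asked, even if they hold the file.
    def search(peers, start, filename, ttl):
        """peers maps a peer name to (list of neighbors, set of files held)."""
        found, frontier, seen = [], [(start, ttl)], {start}
        while frontier:
            node, hops_left = frontier.pop(0)
            if filename in peers[node][1]:
                found.append(node)
            if hops_left == 0:
                continue                         # the horizon: stop forwarding
            for neighbor in peers[node][0]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    frontier.append((neighbor, hops_left - 1))
        return found

    network = {
        "A": (["B"], set()),
        "B": (["A", "C"], set()),
        "C": (["B", "D"], {"song.mp3"}),
        "D": (["C"], {"song.mp3"}),
    }
    print(search(network, "A", "song.mp3", ttl=2))   # finds C; D lies past the horizon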

Furthermore, peer-to-peer seems to reward large communities. If very few people ask for a file on Freenet, it might drop off the caches of the various participating systems. Steve Burbeck, a Senior Technical Staff Member at IBM, pointed out that self-organizing file sharing systems like Napster, Gnutella and Freenet are affected by the popularity of files, and hence may be susceptible to the tyranny of the majority.

Some participants suggested that peer-to-peer is good if you have a strong idea what you're looking for (an author, for instance) but have no idea where it is. A centralized system like Yahoo! is better if you have only a vague idea what you're looking for, but possess some confidence that you can find it at a particular site.

Darren New of Invisible Worlds explained that the value of knowing what you want in advance gives the advantage to peer-to-peer systems where a lot of information is available to potential users out-of-band. This rather abstract statement becomes meaningful when one considers Napster. No one would be able to find much on Napster if they couldn't obtain the name of a band and its songs beforehand. Ironically, many people depend on traditional sales channels like Amazon.com to obtain names of songs before they troop over to Napster and download them. Thus, Napster requires a pre-existing, outside infrastructure.

Furthermore, data is easy to find if it's easy to categorize. Tim O'Reilly pointed out that people would have an easier time searching a peer-to-peer system for a pop song than for a particular combination of performers singing an opera number -- and they'd have even more trouble searching for a chapter in an O'Reilly book. Metadata, once again, proves key to the future of peer-to-peer.

Peer-to-peer does offer an alternative strength, though: connecting people. Even if you don't know what you want, you might be able to find it by contacting a knowledgeable member of your online cohort. You might even find your request met by an infobot (which records frequently asked questions along with their answers, and spits back answers to newsgroup users), which is a functioning application developed by Kevin Lenzo of Carnegie Mellon University.
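
A toy version of that idea, not Lenzo's actual implementation, might look like the following: the bot remembers statements of the form "X is Y" and replays them when someone later asks "what is X?". The pattern matching is deliberately crude and purely illustrative.

    # A minimal infobot sketch: learn facts from the channel, answer questions.
    import re

    class Infobot:
        def __init__(self):
            self.facts = {}

        def hear(self, line):
            ask = re.match(r"(?i)what is (.+?)\??$", line)
            if ask:
                return self.facts.get(ask.group(1).strip().lower(), "no idea")
            teach = re.match(r"(?i)(.+?) is (.+)", line)
            if teach:
                self.facts[teach.group(1).strip().lower()] = teach.group(2).strip()
            return None

    bot = Infobot()
    bot.hear("Jabber is an XML-based chat service")
    print(bot.hear("what is Jabber?"))    # -> "an XML-based chat service"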

Historically, the balance between centralized systems and peer-to-peer shifts back and forth. Many systems, like Usenet and IP routing, started off as pure peer-to-peer but developed some degree of hierarchy as they got bigger. Other systems started out centralized but got too big for it, like the Domain Name System (which had to give up the single, centrally maintained hosts file). Because of peer-to-peer's scalability, some systems are clearly going to require it. To quote IBM's Burbeck, "When you get up to a billion cellular phones and other devices, you have to be decentralized."

Barriers to peer-to-peer

As with any new technology, peer-to-peer has to grapple with old assumptions, which have been embodied in policies ranging in size from a site administrator's firewall configuration to acts of Congress.

Bandwidth limitations

These have already been mentioned. Low-cost technologies can deliver high bandwidth downstream, but cheat the user who wants to publish as well as read. License restrictions such as bans on running a service out of the home, while justifiable from the viewpoint of maintaining adequate throughput on multiple systems, also amputate the possibilities of peer-to-peer.

Firewalls and address translation

When it comes to IP addresses, IPv6 may eliminate the scarcity but not the scares. Most firewall administrators keep their users from getting fixed IP addresses out of fear that they make intrusion easier. The sense of security may be illusory (because people monitoring their own ports have reported intrusion attempts within a few minutes of bringing up their network software), but the practice makes it much harder for an end user's system to host a service.

Even if each user gets a fixed IP address, the Domain Name System is a barrier to end-to-end addressing. Domain names have become a significant expense over the years, and ICANN is not improving the situation. (For instance, they require anyone who wants a new top level domain to pay $50,000 just for the privilege of having the request considered.) For that reason, new services like AOL Instant Messenger and Napster bypass DNS and use their own addressing.

Unfortunately, such ad-hoc addressing solutions lead to problems of their own. AOL tries to shut out competitors by denying them its addresses. Systems that use proprietary systems of addressing can't interoperate. Most ad-hoc namespaces are flat and crude.

Finally, packet filtering, while it plays an important role in securing a LAN, also leads to "port 80 pollution" as everybody tries to use the Web for services that would be more appropriate with a different protocol. The solution to this problem, unfortunately, will not be in sight until operating systems become more secure and IP security can be set up with a few mouse clicks (or vocal commands).

Regulation

Government restrictions have not yet affected peer-to-peer technologies, but could have a chilling effect in the future. Scott Miller, lead U.S. developer for Freenet, pointed out that early peer-to-peer systems happen to be associated with violations of intellectual property (at least in the minds of the copyright holders) but that such activities were neither the purpose of peer-to-peer nor the most likely types of mature applications to emerge. Nevertheless, a couple of participants reported that peer-to-peer has been branded with a scarlet I for Infringement in the minds of investors, courts, and the general public.

Software patents could also stifle technology. We spent some time discussing ways to use prior art searches of patents defensively.

Barriers to entry

Finally, things that currently make it hard for users to start a peer-to-peer service need to be attacked by the peer-to-peer developers themselves. If every new peer-to-peer service requires the user to download, install, and configure a new program, it will remain a craft shop for tinkerers. Shirky pointed out that Java was supposed to provide the universal client that allowed everything else to run seamlessly, but Java on the client didn't pan out. We need something as easy as the update service offered by Windows and some other products.

Even after users install a client, they may abort their involvement if they are forced to learn a system of metadata and apply a lot of tags. Jon Udell, programming consultant and Byte Magazine columnist, said, "When everybody's a publisher, responsibility goes along with that." Dornfest called for local vocabularies to provide just the right depth of metadata, combined with clients that allow its application with just a mouse click, a keystroke, or automagically.

Napster made the registration of users' MP3 files nearly automatic. Janelle Brown, senior writer at Salon, reminded us of the sad history of Hotline, a service similar to Napster that appeared much earlier but never took off. Among the various diagnoses the group offered in post mortem, it was pointed out that prospective users had to upload material before doing any downloads. That restriction may have turned them off. Nevertheless, another quid-pro-quo system -- where you have to provide a resource in order to use a resource -- is being tried by the new venture MojoNation.
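
A bare-bones sketch of such a quid-pro-quo scheme, with invented starting balances and costs and no claim to reflect MojoNation's actual accounting: peers earn credit by serving resources and spend it when they consume them.

    # Peers that only consume eventually run out of credit; peers that serve
    # accumulate it. All numbers here are illustrative.
    class Ledger:
        def __init__(self, starting_credit=10):
            self.starting_credit = starting_credit
            self.balances = {}

        def balance(self, peer):
            return self.balances.setdefault(peer, self.starting_credit)

        def transfer(self, provider, consumer, cost):
            """Move credit from consumer to provider if the consumer can pay."""
            if self.balance(consumer) < cost:
                return False                     # freeloaders eventually run dry
            self.balances[consumer] -= cost
            self.balances[provider] = self.balance(provider) + cost
            return True

    ledger = Ledger()
    ledger.transfer(provider="alice", consumer="bob", cost=3)  # bob fetches a file from alice
    print(ledger.balance("alice"), ledger.balance("bob"))      # 13 7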

Trust

The ability to trust the system and other users is critical. Balancing the privacy of the individual with the need to authenticate users and kick malicious disrupters off the system is a difficult feat.

Future developments

In this article I've tried to summarize the most interesting themes from the September 2000 peer-to-peer summit. Leaders in the field are exploring more such themes in the book Peer-to-Peer, to be released by O'Reilly & Associates in February, and at the O'Reilly Peer-To-Peer Conference in San Francisco, which takes place February 14-16. I hope you'll stick around -- and don't be afraid to participate.

Andy Oram is an editor for O'Reilly Media, specializing in Linux and free software books, and a member of Computer Professionals for Social Responsibility. His web site is www.praxagora.com/andyo.

