Dissecting Web 2.0 Examples: Chapter 3 - Web 2.0 Architectures

Britannica Online and Wikipedia

A disruptive technology can do more than cost a business money. Sometimes the disruption extends so deep that the virtues of the business’s past become problems, and techniques that would previously have been vices suddenly become virtues. The emergence of Wikipedia and its overshadowing of the Encyclopedia Britannica is one case where the rules changed decisively in favor of an upstart challenger.

Applicable Web 2.0 Patterns

The collaborative encyclopedia approach ushered in by Wikipedia capitalizes on several Web 2.0 patterns:

  • Software as a Service

  • Participation-Collaboration

  • Rich User Experience

  • The Synchronized Web

  • Collaborative Tagging

You can find more information on these patterns in Chapter 7, Specific Patterns of Web 2.0.

From a Scholarly to a Collaborative Model

The Encyclopedia Britannica was originally published in 1768 as a three-volume set, emerging from the intellectual churn of Edinburgh. It grew quickly, reaching 21 volumes by 1801, and over the next two centuries, it solidified its reputation as a comprehensive reference to the world. Producing the printed tomes was a complex and expensive enterprise, requiring editors to judge how long to leave an edition in print, how much to change between editions, what new material to cover, and who should cover it.

The possibility of an electronic edition was in many ways a relief at first. The Encyclopedia Britannica took huge strides during the computer revolution to survive a changing world. In the mid-1990s, the static book publisher tried bundling an Encyclopedia Britannica CD with some PCs. That experiment was short-lived, as it soon became obvious that any publishing effort in the new digital age had to be dynamic. The company then migrated its entire encyclopedia set to the Web, where it was free of many of the edition-by-edition obstacles to updating that had limited its print and CD editions.

Although this was a daring move, and Britannica continues to sell its content online, the model behind the encyclopedia’s creation now faced a major challenge from newcomer Wikipedia. Whereas Encyclopedia Britannica had relied upon experts and editors to create its entries, Wikipedia threw the doors open to anyone who wanted to contribute. While it seemed obvious to many that an encyclopedia created by volunteers—many of them non-experts, many of them anonymous, and some of them actually out to cause trouble—just had to be a terrible idea, Wikipedia has thrived nonetheless. Even Wikipedia’s founders didn’t quite know what they were getting into—Wikipedia was originally supposed to feed into a much more formal, peer-reviewed Nupedia.

In Wikipedia, rather than one authority (typically a committee of scholars) centrally defining all subjects and content, people all over the world who are interested in a certain topic can collaborate asynchronously to create a living, breathing work. Wikipedia combines the collaborative aspects of wiki sites (websites that let visitors add, remove, edit, and change content) with the presentation of authoritative content built on rich hyperlinks between subjects to facilitate ultra-fast cross-references of facts and claims.

Wikipedia does have editors, but everyone is welcome to edit. Volunteers emerge over time, editing and re-editing articles that interest them. Consistency and quality improve as more people participate, though the content isn’t always perfect when first published. Anonymous visitors often make edits to correct typos or other minor errors. Defending the site against vandals (or just people with agendas) can be a challenge, especially on controversial topics, but so far the site seems to have held up. Wikipedia’s openness allows it to cover nearly anything, which has created some complications when editors delete pages they don’t consider worthy of inclusion. It’s always a conversation.

The shift from a top-down editorial approach to a bottom-up approach is a painful reversal for people who expect only expert advice when they look up something—and perhaps an even harder reversal for people who’ve built their careers on being experts or editors. Businesses facing this kind of competition need to study whether their business models are sustainable, and whether it is possible to incorporate the bottom-up approach into their own work.

Personal Websites and Blogs

The term blog is short for weblog, a personal log (or diary) that is published on the Internet. In many cases, blogs are what personal websites were initially meant to be. Many early website gurus preached the idea that online content should always be fresh and new to keep traffic coming back. That concept holds just as true now as it did then—the content has just shifted form.

Applicable Web 2.0 Patterns

Many blogs embrace a variety of the core patterns discussed in Chapter 7, Specific Patterns of Web 2.0, such as:

  • Participation-Collaboration

  • Collaborative Tagging

  • Declarative Living and Tag Gardening

  • Software as a Service

  • Asynchronous Particle Update (the pattern behind AJAX)

  • The Synchronized Web

  • Structured Information (Microformats)

Shifting to Blogs and Beyond

Static personal websites were, like most websites, intended to be sources of information about specific subjects. The goal of a website was to pass information from its steward to its consumers. Some consumers might visit certain websites (personal or otherwise) only once to retrieve the information they sought; however, certain groups of users might wish to visit again to receive updated information.

In some ways, active blogs are simply personal websites that are regularly updated, though most blog platforms support features that illustrate different patterns of use. There are no hard rules for how frequently either a blog or a personal website should be updated, so update frequency alone doesn’t cleanly separate the two. A few key points, however, do differentiate blogs:

  • Blogs are built from posts—often short posts—which are usually displayed in reverse chronological order (newest first) on an organizing front page. Many blogs also support some kind of archive for older posts.

  • Personal websites and blogs are both published in HTML. Blog publishing, however, usually uses a slightly different model from traditional HTML website publishing. Most blog platforms don’t require authors to write HTML, letting them simply enter text for the blog in an online form. Blog hosting also generally requires users to know less about the underlying infrastructure than classic HTML publishing does. Blogs’ ease of use makes them attractive to Internet users who want a web presence but have not yet bothered to learn about HTML, scripts, HTTP, FTP, and other technologies.

  • Blogs often include some aspects of social networking. Mechanisms such as a blogroll (a list of other blogs to which the blog owner wishes to link) create mini-communities of like-minded individuals. A blogroll is a great example of the Declarative Living pattern documented in Chapter 7, Specific Patterns of Web 2.0. Comment threads can also help create small communities around websites.

  • Blogs support mechanisms for publishing information that can be retrieved via multiple patterns (like Search and Retrieve, Push, or Direct Request). Instead of readers having to request the page via HTTP GETs, they can subscribe to feeds (including Atom and RSS) to receive new posts in a different form and on a schedule more convenient to them.

Standard blog software (e.g., Blogger or WordPress) has evolved well beyond simple tools for presenting posts. The software allows readers to add their own content, tag content, create blogrolls, and host discussion forums. Some blog management software lets readers register to receive notifications when there are updates to various sections of a blog. The syndication functionality of RSS (or Atom) has become a core element of many blogs. Many blogs are updated on a daily basis, yet readers might not want to have to reload the blog page over and over until a new post is made (most blog authors do not post exact schedules listing the times their blogs are updated). It is much more efficient if the reader can register interest and then receive a notification whenever new content is published. RSS also describes the content so that readers can decide whether they want to view the actual blog.
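
To make the feed model concrete, here is a minimal sketch of a polling feed reader, assuming the third-party feedparser library and a hypothetical feed URL; a real aggregator would add conditional GETs (ETag/Last-Modified headers) and a polite polling schedule.

    import feedparser  # third-party library: pip install feedparser

    FEED_URL = "http://example.com/blog/atom.xml"  # hypothetical feed address

    def fetch_new_posts(feed_url, seen_ids):
        """Poll a feed once and return only entries we haven't seen before."""
        feed = feedparser.parse(feed_url)
        new_posts = []
        for entry in feed.entries:
            entry_id = entry.get("id") or entry.get("link")
            if entry_id not in seen_ids:
                seen_ids.add(entry_id)
                new_posts.append((entry.get("title", ""), entry.get("link", "")))
        return new_posts

    seen = set()
    for title, link in fetch_new_posts(FEED_URL, seen):
        print(title, "->", link)

This is the subscription pattern in miniature: the reader registers interest once (the feed URL and the set of seen entries) and afterward receives only what is new, instead of reloading the page on speculation.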

Blogs are also moving away from pure text and graphics. All kinds of blog mutations are cropping up, including mobile blogs (known as moblogs), video blogs, and even group blogs.

Developers are adding tools that emphasize patterns of social interactions surrounding blogs. MyBlogLog.com has software that uses an AJAX widget to place the details of readers of a blog on the blog page itself so that you can see who else has been reading a specific blog. Figure 3.18, “Screenshot of MyBlogLog.com blog widget” shows the latest readers of the Technoracle blog at the time of this writing.[39]

Figure 3.18. Screenshot of MyBlogLog.com blog widget


Most blog software also offers the ability to socially network with like-minded bloggers by adding them to your blogroll. Having your blog appear on other people’s blogrolls helps to elevate your blog’s status in search engines, as well as in blog directories such as Technorati that track blog popularity. It also makes a statement about your personality and your stance on a variety of subjects. Figure 3.19, “Example of a blogroll from Technoracle.blogspot.com” shows an example of a blogroll.

Figure 3.19. Example of a blogroll from Technoracle.blogspot.com


A blogroll is a good example of the Declarative Living and Tag Gardening pattern, as the list of fellow bloggers in some ways tags the person who posts it. By making a statement regarding whose blogs they encourage their readers to read, blog owners are declaring something about themselves. Blog readers can learn more about blog writers by looking at who they have on their blogrolls. For example, in Figure 3.19, “Example of a blogroll from Technoracle.blogspot.com”, knowing that John Lydon is in fact Johnny Rotten, the singer for the Sex Pistols, may imply to a reader that the owner of Technoracle has a somewhat disruptive personality and will try to speak the truth, even if it’s unpopular.

Blogs lowered the technical barrier for getting a personal presence on the Internet, making it much easier for many more people to join the conversation. Blogs have also changed the patterns of dissemination of information. Rather than simply reading a news story on a particular topic, interested readers can also find related blogs and find out what the average person thinks about that topic. Blogs represent a new kind of media and offer an alternative source for people who want more than news headlines.

More recently, blogs have evolved beyond their basic form. Blogs have become one of many components in social networking systems like MySpace and Facebook: one component in pages people use to connect with others, not merely to present their own ideas. Going in a different direction, Twitter has stripped blogging down to a 140-character minimalist approach, encouraging people to post tiny bits of information on a regular basis and providing tools for following people’s feeds.

Screen Scraping and Web Services

Even in the early days of the Web, developers looked for ways to combine information from multiple sites. Back then, this meant screen scraping—writing code to dig through loosely structured HTML and extract the vital pieces—which was often a troublesome process. As Web 2.0 emerged, more and more of that information became available through web services, which presented it in a much more structured and more readily usable form.

Applicable Web 2.0 Patterns

These two types of content grabbing illustrate the following patterns:

  • Service-Oriented Architecture

  • Collaborative Tagging

You can find more information on these patterns in Chapter 7, Specific Patterns of Web 2.0.

Intent and Interaction

In the earliest days of the Web, screen scraping often meant capturing information from the text-based interfaces of terminal applications to repurpose it for use in web applications, but the same technology was quickly turned to websites themselves. HTML is, after all, a text-based format, if a loosely (sometimes even chaotically) structured one. Web services, on the other hand, are protocols and standards from various standards bodies that allow programmatic access to resources in a predictable way. XML enabled the web services revolution by making it easy to create structured, labeled, and portable data.

Note

There is no specific standardized definition of web services that explains the exact set of protocols and specifications that make up the stack, but there is a set that is generally accepted. It’s important to examine the web services architecture document from the W3C to get a feel for what is meant by “web services.” When this book refers to “web services,” it doesn’t specifically mean SOAP over HTTP, although this is one popular implementation. RESTful services available via the Web are just as relevant.

One major difference between the two types of interactions is intent. Most owners of resources that have been screen-scraped did not intend to allow their content to be repurposed. Many were probably open to others using their resources; otherwise, they wouldn’t have posted the content on the Internet in the first place. However, designing resources for automated consumption, rather than human consumption, requires planning ahead and implementing a different, or even parallel, infrastructure.

A classic example of the shift from screen scraping to services is Amazon.com. Amazon provides a tremendous amount of information about books in a reasonably structured (though sometimes changing) HTML format. It even contains a key piece of information, the Amazon sales rank, that isn’t available anywhere else. As a result, many developers have written programs that scrape the Amazon site.
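
To make the brittleness concrete, here is a hedged sketch of what such a scraper looks like, using the third-party requests and BeautifulSoup libraries; the element id is hypothetical, since real product pages change their markup without notice.

    import requests                # third-party: pip install requests
    from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

    def scrape_sales_rank(product_url):
        """Fetch a product page and dig a sales rank out of loosely structured HTML."""
        html = requests.get(product_url, timeout=10).text
        soup = BeautifulSoup(html, "html.parser")
        # Hypothetical anchor point; when the site redesigns, this silently
        # breaks, which is exactly why screen scraping is so troublesome.
        node = soup.find(id="SalesRank")
        return node.get_text(strip=True) if node else None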

Rather than fighting this trend, Amazon realized that it had an opportunity. Its network of Amazon Associates (people and companies that help Amazon sell goods in exchange for a commission) could use the information that others were scraping from the site. Amazon set out to build services to make it easier for its associates to get to this information—the beginning of a process that has led Amazon to offer a variety of web services that go far beyond its product information.

Most web services work falls under the Service-Oriented Architecture (SOA) pattern described in Chapter 7, Specific Patterns of Web 2.0. SOA itself doesn’t depend on the web services family of technologies and standards, nor is it limited to the enterprise realm, where it is most ubiquitous. Web services are built on a set of standards and technologies that support programmatic sharing of information. These usually include XML as a foundation, though JSON has lately proven popular for lightweight sharing. Many web services are built using SOAP and the Web Services Description Language (WSDL), though others take a RESTful approach. Additional useful specifications include the SOAP processing model,[40] the XML Infoset[41] (the abstract model behind XML), and the OASIS Reference Model for SOA (the abstract model behind services deployed across multiple domains of ownership).
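
By contrast, consuming a structured service is mostly a matter of reading labeled fields. The sketch below calls a hypothetical RESTful endpoint returning JSON; the URL and field names are invented for illustration, not any provider’s actual API.

    import json
    import urllib.request

    def lookup_product(product_id):
        """Call a hypothetical REST service and receive structured data back."""
        url = "https://api.example.com/products/" + product_id  # assumed endpoint
        with urllib.request.urlopen(url, timeout=10) as response:
            data = json.load(response)
        # No HTML digging: the fields arrive already named and structured.
        return data["title"], data["sales_rank"]  # assumed field names

The difference from the scraping sketch earlier is the contract: when the provider commits to a machine-readable interface, consumers stop breaking every time the presentation layer changes.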

While web services and SOA are often thought of as technologies used inside of enterprises, rather than publicly on the Internet, the reality is that there is a wide spectrum of uses in both public and private environments. Open public services are typically simpler, while services used internally or for more specific purposes than information broadcast and consumption often support a richer set of capabilities. Web services now include protocol support for expressing policies, reliable messaging features, secure messaging, a security context, domains of trust, and several other key features. Web services have also spawned an industry for protocols and architectural models that make use of services such as Business Process Management (BPM), composite services, and service aggregation. The broader variety of web services standards has been documented in many other books, including Web Services Architecture and Its Specifications by Luis Felipe Cabrera and Chris Kurt (Microsoft Press).

Content Management Systems and Wikis

As the Web evolved from the playground of hobbyists to the domain of commercial users, the difficulty of maintaining sites capable of displaying massive amounts of information escalated rapidly. Content management systems (CMSs) such as Vignette leaped into the gap to help companies manage their sites. While CMSs remain a common component of websites today, the model they use is often one of outward publication: a specific author or organization creates content, and that content is then published to readers (who may be able to comment on it). Wikis take a different approach, using the same system to both create and publish information, thereby allowing readers to become writers and editors.

Applicable Web 2.0 Patterns

The patterns illustrated in this discussion focus on collaboration:

  • Participation-Collaboration

  • Collaborative Tagging

You can find more information on these patterns in Chapter 7, Specific Patterns of Web 2.0.

Participation and Relevance

Publishing is often a unilateral action whereby content is made available and further modifications to the content are minimal. Those who consume the content participate only as readers.

Wikis may look like ordinary websites presenting content, but the presence of an edit button indicates a fundamental change. Users can modify the content by providing comments (much like blog comments), use it to create new works (mashups), and, in some cases, create specialized versions of the original. Their participation gives the content wider relevance, because collective intelligence generally provides a more balanced result than the input of one or two minds.

The phrases “web of participation” and “harnessing collective intelligence” are often used to explain Web 2.0. Imagine you owned a software company and you had user manuals for your software. If you employed a static publishing methodology, you would write the manuals and publish them based on a series of presumptions about, for example, the level of technical knowledge of your users and their semantic interpretations of certain terms (i.e., you assume they will interpret the terms the same way you did when you wrote the manuals).

A different way to publish the help manuals would be to use some form of website—not necessarily a wiki, but something enabling feedback—that lets people make posts on subjects pertaining to your software in your online user manuals. Trusting users to apply their intelligence and participate can yield manuals full of information and text you might never have written yourself. The collective knowledge of your experienced software users can be instrumental in helping new users of your software. For an example of this pattern in use, visit http://livedocs.adobe.com and see how Adobe Systems trusts its users to contribute to published software manuals.

Directories (Taxonomy) and Tagging (Folksonomy)

Directories are built by small groups of experts to help people find information they want. Tagging lets people create their own classifications.

Applicable Web 2.0 Patterns

The following patterns are illustrated in this discussion:

  • Participation-Collaboration

  • Collaborative Tagging

  • Declarative Living and Tag Gardening

  • Semantic Web Grounding

  • Rich User Experience

You can find more information on these patterns in Chapter 7, Specific Patterns of Web 2.0.

Supporting Dynamic Information Publishing and Finding

Directory structures create hierarchies of resource descriptions to help users navigate to the information they seek. The terms used to divide the hierarchy create a taxonomy of subjects (metadata keywords) that searchers can use as guideposts to find what they’re looking for. Library card catalogs are the classic example, though taxonomies come in many forms. Within a book, tables of contents and especially indexes often describe taxonomies.

Navigation mechanisms within websites also often describe taxonomies, with layers of menus and links in the place of tables of contents and a full-text search option in place of an index. These resources can help users within a site, but users’ larger problem on the Web has often been one of finding the site they want to visit. As the number of sites grew exponentially in the early days of the Web, the availability of an incredible amount of information was often obscured by the difficulty of finding what you wanted. The scramble for domain names turned into a gold rush and advertisers rushed to include websites in their contact information—but many people arrived on the Web looking for information on a particular subject, not a particular advertiser.

The answer, at least at the beginning, was directories. Directory creators developed taxonomic classification systems for websites, helping users find their way to roughly the right place. Online directories usually started with a classification system with around 8 to 12 top-level subjects. Each subject was further classified until the directory browser got to a level where most of the content was very specialized. The Yahoo! directory was probably the most used directory in the late 1990s, looking much like Figure 3.20, “The Yahoo! directory”. (You can still find it at http://dir.yahoo.com.)

Figure 3.20. The Yahoo! directory


Each category, of course, has further subcategories. Clicking on “Regional,” for example, provided users with the screen in Figure 3.21, “Subcategories under the Regional category”.

Figure 3.21. Subcategories under the Regional category


Similarly, clicking on “Countries” in the subcategory listing shown in Figure 3.21, “Subcategories under the Regional category” yielded an alphabetical list of countries, which could be further decomposed into province/state, city, community, and so on, until you reached a very small subset of specific results.

Directories have numerous problems. First and foremost, it is very difficult for a small group—even a small group of directory specialists—to develop terms and structures that readers will consistently understand. Additionally, there is the challenge of placing information in the directory. When web resource owners add pages to the Yahoo! directory, they navigate to the nodes where they think the pages belong and then add their resources from there. However, other people won’t necessarily go to the same place when looking for that content.

Say, for example, you had a rental car company based in Vancouver, British Columbia, Canada. Would you navigate to the node under Regional→Countries→Canada→Provinces→British Columbia→Cities→Vancouver, and then add your content? Or would you instead add it under Recreation & Sports→Travel→Transportation→Commuting, or perhaps Business & Economy→Shopping and Services→Automotive→Rentals? Taxonomists have solved this problem by creating polyhierarchies, where an item can be classified under more than one node in the tree. However, many Internet directories are still implemented as monohierarchies, where only one node can be used to classify any specific object. While polyhierarchies are more flexible, they can also be confusing to implement.

Another problem concerns terminology. Although terms such as “vehicles for hire” and “automobiles for lease” are equally relevant to your rental car company, users searching for those terms will not be led to your website. Adding non-English-speaking users to the mix presents a whole new crop of problems. Taxonomists can solve these problems too, using synonyms and other tools; it just requires an ever-greater investment in taxonomy development and infrastructure.

Hierarchical taxonomies are far from the only approach to helping users find data, however. More and more users simply perform searches. Searches work well for textual content but often turn up false matches and don’t apply easily to pictures and multimedia. As was demonstrated in our earlier discussion of Flickr, tagging offers a much more flexible approach—one that grows along with a library of content.
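
A minimal sketch shows why tags sidestep the placement problem described earlier: instead of filing each item under a single node, an inverted index maps every tag to the items carrying it, so the Vancouver rental car company is reachable from “vancouver,” “car,” and “rental” alike. The data here is invented for illustration.

    from collections import defaultdict

    # Inverted tag index: each tag maps to the set of items that carry it.
    tag_index = defaultdict(set)

    def tag_item(item, *tags):
        for tag in tags:
            tag_index[tag.lower()].add(item)

    def find(*tags):
        """Return items matching all of the given tags."""
        sets = [tag_index[tag.lower()] for tag in tags]
        return set.intersection(*sets) if sets else set()

    # One item, many independent entry points; no single "right" node.
    tag_item("Acme Rentals", "vancouver", "rental", "car", "travel")
    tag_item("BC Ferries", "vancouver", "travel", "ferry")

    print(find("vancouver", "travel"))  # {'Acme Rentals', 'BC Ferries'}
    print(find("car", "rental"))        # {'Acme Rentals'}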

Sites such as Slashdot.org have implemented this type of functionality to let readers place semantic tags alongside content. Figure 3.22, “Screenshot from Slashdot.org showing user tags (the tags appear in the oval)” shows an example of the tagging beta on a typical Slashdot.org web page. The tags appear just below the article.

Figure 3.22. Screenshot from Slashdot.org showing user tags (the tags appear in the oval)


The most effective tagging systems are those created by lots of people who want to make it easier for themselves (rather than others) to find information. This might seem counterintuitive, but if a large number of people apply their own terms to a few items, reinforcing classification patterns emerge more rapidly than they do if a few people try to categorize a large number of items in the hopes of helping other people find them. For those who want to extract and build on folksonomies, selfish tagging can be tremendously useful, because people are often willing to share their knowledge about things in return for an immediate search benefit.

Delicious, which acts as a gigantic bookmark store, expects its users to create tags for their own searching convenience. As items prove popular, the number of tags for those items grows and they become easier to find. It may also be useful for the content creators to provide an initial set of tags that operate primarily as seed tags—that is, a way of encouraging other users to add their own tags.

More Hints for Defining Web 2.0

Tim’s examples illustrate the foundations of Web 2.0, but that isn’t the end of the conversation. Another way to look at these concepts is through a meme (pronounced “meem”) map. A meme map is an abstract artifact for showing concepts and their relationships. These maps are, by convention, ambiguous. For example, if two concepts are connected via a line, you can’t readily determine what type of relationship exists between them in tightly defined ontological terms. Figure 3.23, “Meme map for Web 2.0” depicts the meme map for Web 2.0, as shown on the O’Reilly Radar website.

Figure 3.23. Meme map for Web 2.0


This map shows a lot of concepts and suggests that there are “aspects” and “patterns” of Web 2.0, but it doesn’t offer a single definition of Web 2.0. The logic captured in the meme map is less than absolute, yet it declares some of the core concepts inherent in Web 2.0. This meme map, along with the Web 2.0 examples discussed earlier in the chapter, was part of the conversation that yielded the patterns outlined in Chapter 7, Specific Patterns of Web 2.0. Concepts such as “Trust your users” are primary tenets of the Participation-Collaboration and Collaborative Tagging patterns. “Software that gets better the more people use it” is a key property of the Collaborative Tagging pattern (a.k.a. folksonomy). “Software above the level of a single device” is also represented with the Software as a Service and Mashup patterns.

Reductionism

Figure 3.24, “Reductionist view of Web 2.0” shows a reductionist view of Web 2.0. Reductionism holds that complex things can always be reduced to simpler, more fundamental things, and that the whole is nothing more than the sum of those simpler parts. The Web 2.0 meme map, by contrast, is a largely holistic analysis. Holism, the opposite of reductionism, says that the properties of any given system cannot be described as the mere sum of its parts.

Figure 3.24. Reductionist view of Web 2.0


In a small but important way, this division captures an essential aspect of the debates surrounding Web 2.0 and the next generation of the Web in general: one set of thinkers attempts to explain what’s happening on the Web by exploring its fundamental precepts, while another set seeks to explain it in terms of the things we’re actually seeing happen on the Web (online software as a service, self-organizing communities, Wikipedia, BitTorrent, Salesforce, Amazon Web Services, etc.). Neither view is complete, of course, though combining them could help.

In the next part of the book, we’ll delve into greater detail, some of it more technical, and try to distill core patterns that will be applicable in a range of scenarios.



