P2P Profiles

Jibe: Building distributed databases that standardize product searches


P2P Profiles is an ongoing column that takes an in-depth look at companies in the peer-to-peer space. The profiles will be published in O'Reilly's P2P 2001 Industry Overview, available this July.

Distributed searching -- where a user sends a query to multiple sites at once, and the owners of the data return results from their own data stores -- is one of the Holy Grails of P2P. It was one of the early goals of Gnutella developers, embodied in Gene Kan's InfraSearch project, and is now a part of Sun Microsystems' JXTA. Distributing a search improves the overall efficiency of a query (because many sites can handle it in parallel), while eliminating the need for data to be translated and uploaded to some large central server.

But one cannot drink from this chalice merely by tying together a lot of sites and asking them to offer up their databases. A workable solution must also integrate legacy data that may be stored in different formats with different column headings.

Jibe's product makes the most of industry standards -- including SOAP and Java -- on the inside, while providing wrappers that the company claims make the process efficient and flexible for both the users doing searches and the participating vendors.

Of course, Jibe hides the complexities of product classification, distributed queries and formatting from the end-user. This end-user might simply type "motorcycle frames" into the GO FIND box and let Jibe do the retrieval, ranking, sorting and display. However, a sophisticated user who understands standard product classifications can use that metadata to browse the data.

Distributed Searching -- the Supply Side

Many industries have standardized product codes so customers researching products from different companies can feel like they're comparing apples to apples. Computer products, for instance, often follow a product code called the UN/SPSC. Jibe does not expect companies to store data according to any particular code, but it does provide different classifications -- or taxonomies, in their terminology -- and makes it as easy as possible for companies to supply data according to a chosen taxonomy.

The Jibe storage format, as one might guess, is XML. Taxonomies are defined in XML. "Jibe believes the future is in distributed services using XML-based messaging (i.e. SOAP)," writes CEO Greg Schmitzer. But they're also planning support for JXTA.

At this point, many readers may be asking about the recently finalized specification for XML Schemas -- isn't it tailored for this kind of industry classification? Jibe, of course, has to develop a solution that can be used right now, and it will be a long time before software products implement XML schemas or before each industry settles on a schema. But Jibe is open to using schemas and can adapt if companies start using them. Essentially, a company that wants to use a schema can use it to classify data before inserting it into the Jibe system. Most of the important services provided by Jibe will remain the same.

The data stored by each site (such as name, quantity, description, and price) is referenced in a Jibe XML "lens." One such lens might use the UN/SPSC classifications, another might use a product classification appropriate to another industry, and another might involve an entirely different field of research, such as human resources. Jibe allows users to search through multiple lenses at once.
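To make the idea of a lens concrete, here is a minimal sketch of what reading one might look like. The article does not show Jibe's actual lens format, so the element names, attribute names, and field types below are invented purely for illustration; only the field list (name, quantity, description, price) comes from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical lens document. The <lens>/<field> layout and the "type"
# attribute are assumptions, not Jibe's real format.
LENS_XML = """
<lens name="Product/UNSPSC" taxonomy="UN/SPSC">
  <field name="name" type="string"/>
  <field name="quantity" type="integer"/>
  <field name="description" type="string"/>
  <field name="price" type="decimal"/>
</lens>
"""


def load_lens(xml_text):
    """Parse a lens document into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.get("name"),
        "taxonomy": root.get("taxonomy"),
        # Map each field name to its declared type, in document order.
        "fields": {f.get("name"): f.get("type") for f in root.findall("field")},
    }


lens = load_lens(LENS_XML)
print(lens["name"], list(lens["fields"]))
```

A second lens for another industry or for human resources would simply be another such document with its own field list.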

In a sense, by licensing Jibe software, customers join a gigantic and potentially industry-wide database. A Jibe customer can also set up a private, secure P2P hub with its suppliers or partners. Any storage system that can be accessed through JDBC or ODBC can participate, including Microsoft Access and Excel spreadsheets. For clients without any such datastore, Jibe even goes so far as to give them Access and Excel templates. A wizard allows them to organize their data in just three steps into a format that Jibe can query:

  • Bind to the source of the data (usually a product database).

  • Choose a desired taxonomy (such as UN/SPSC) or a custom schema.

  • Map the appropriate columns in the database to the standard fields in the taxonomy (for instance, what the company might call "inventory" in its database might be called "availability" in the taxonomy).

The site's data now "jibes" with the related data from all other sites.
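The third step, mapping local columns to taxonomy fields, is the heart of the process, and it can be sketched in a few lines. This is not Jibe's wizard or its API; the taxonomy field set, the vendor's column names, and the helper to_taxonomy are all hypothetical, apart from the "inventory" to "availability" example taken from the text.

```python
# Hypothetical set of standard fields defined by the chosen taxonomy.
TAXONOMY_FIELDS = {"name", "description", "availability", "price"}

# Hypothetical vendor mapping: local column name -> taxonomy field name.
COLUMN_MAP = {
    "item": "name",
    "blurb": "description",
    "inventory": "availability",
    "unit_cost": "price",
}


def to_taxonomy(row, column_map, taxonomy_fields):
    """Rename a vendor row's columns to the taxonomy's field names.

    Columns with no mapping are dropped; a mapping that targets a field
    outside the taxonomy is rejected.
    """
    unknown = set(column_map.values()) - taxonomy_fields
    if unknown:
        raise ValueError(f"mapping targets unknown fields: {sorted(unknown)}")
    return {column_map[col]: value for col, value in row.items() if col in column_map}


vendor_row = {
    "item": "test tube",
    "blurb": "10 ml glass",
    "inventory": 240,
    "unit_cost": 0.35,
    "internal_id": 7,  # no mapping, so it never leaves the vendor's site
}
mapped = to_taxonomy(vendor_row, COLUMN_MAP, TAXONOMY_FIELDS)
print(mapped)
```

Because the mapping is applied at query time, the vendor's database keeps its own column headings; only the results are expressed in the shared vocabulary.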

Distributed Searching -- the Demand Side

For users who wish to search for products, Jibe provides a Java servlet that can interpret the XML for each taxonomy and present a Web form to the user doing a search. The user first selects a lens to search through (such as the Product/UNSPSC lens) and then enters a search string (such as "test tubes"). A Jibe application can run standalone, or a company licensing Jibe can store a single servlet on an internal Web server and let its employees do searches through their browsers.

At its central server, Jibe stores a limited subset of information indicating which company sites have what types of products. User queries are then directed to those sites, which delve down and return detailed information.
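The routing described here, a small central index that points queries at the sites able to answer them, might be sketched roughly as follows. The index contents, site names, and in-memory data stores are made up for the example, and a thread pool stands in for the network calls (presumably SOAP messages) that a real deployment would make.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical central index: product category -> sites that carry it.
CENTRAL_INDEX = {
    "lab glassware": ["site_a", "site_b"],
    "motorcycle parts": ["site_c"],
}

# Stand-ins for each site's own data store; in practice these stay remote.
SITE_DATA = {
    "site_a": [{"name": "test tube", "price": 0.35}],
    "site_b": [{"name": "test tube rack", "price": 4.00},
               {"name": "beaker", "price": 2.10}],
    "site_c": [{"name": "motorcycle frame", "price": 310.00}],
}


def query_site(site, term):
    """Simulate one site searching its own store for a term."""
    return [dict(item, site=site) for item in SITE_DATA[site] if term in item["name"]]


def distributed_search(category, term):
    """Look up candidate sites centrally, then query them in parallel."""
    sites = CENTRAL_INDEX.get(category, [])
    if not sites:
        return []
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        per_site = pool.map(lambda s: query_site(s, term), sites)
        results = [item for batch in per_site for item in batch]
    # Merge the per-site answers into one list, cheapest first.
    return sorted(results, key=lambda item: item["price"])


results = distributed_search("lab glassware", "test tube")
for item in results:
    print(item["site"], item["name"], item["price"])
```

The central index never holds prices or descriptions, only enough to decide where to send the query; the detailed answers come from the sites themselves.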

Jibe, the Company

Jibe is located at the domain name They started with the goal of helping companies access relevant product information directly from one another. Along the way, they realized this technology could be applied to a variety of content issues, including real-time data analysis and CRM. They are VC-funded and are offering a beta release this summer.

Andy Oram is an editor for O'Reilly Media, specializing in Linux and free software books, and a member of Computer Professionals for Social Responsibility. His web site is
