ONLamp.com

An OpenLDAP Update

by Marty Heyman
09/13/2007

OpenLDAP is the de facto Open Source reference implementation of the Internet standard Lightweight Directory Access Protocol (LDAP, see RFC 4510). The standard is now in Version 3 and was recently republished to clarify some points. Ever since experimenters began developing bridges between internet applications and pre-internet X.500 directories, they have relied on a reference implementation to validate their approaches and to verify that the standard would be robust and complete.

The original reference implementation was written at the University of Michigan, where much of the LDAP innovation took place. When the Michigan team joined Netscape, the code languished for a while. The OpenLDAP project was then formed to continue the Open Source work as the standard evolved over the past decade.

This article highlights the major changes in OpenLDAP since the project picked up the University of Michigan code in 1998.

Utilities and Libraries

The University of Michigan team provided a comprehensive source package that was in need of some work. In the package, the line-mode commands were useful both for scripting routine maintenance tasks and as examples of how to issue many of the requests through the C language interface. Many developers have used those line-mode commands as a reference when developing libraries for Java, Perl, Python, and many other languages. The OpenLDAP team has maintained those programs and enhanced them to conform to new requirements, to make them more robust, and to improve their performance. Several useful new commands have been added. The net effect of all that work is that the commands continue to work much as they always have, and as they should.

The same can be said of the LDAP client libraries. A great deal of work has gone into improving the libraries, but most of it is invisible to users. The changes were, however, essential for preparing this Open Source LDAP directory services package for production use in the enterprise.

The most visible changes to the commands, utilities, and libraries are those that enable the SSL/TLS and SASL features required by LDAPv3. Numerous optional parameters have been added so that enterprises can take advantage of these security features. Without TLS/SSL and SASL, neither LDAPv3 nor OpenLDAP would have gained significant enterprise usage: they provide the level of security that most production enterprise directories require.
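As a rough illustration of those optional parameters, the sketch below assembles an ldapsearch command line that insists on StartTLS (-ZZ) and optionally selects a SASL mechanism (-Y). The helper name, host name, base DN, and filter are hypothetical values chosen for the example; the script only builds and prints the command and does not contact a server.

```python
import shlex


def ldapsearch_command(uri, base_dn, search_filter, sasl_mech=None, bind_dn=None):
    """Build an ldapsearch argument list that requires StartTLS to succeed."""
    # -H selects the server URI, -ZZ demands a successful StartTLS, -b sets the search base.
    args = ["ldapsearch", "-H", uri, "-ZZ", "-b", base_dn]
    if sasl_mech:
        # -Y selects the SASL mechanism to use for the bind.
        args += ["-Y", sasl_mech]
    else:
        # -x falls back to simple authentication instead of SASL.
        args.append("-x")
        if bind_dn:
            # -D names the bind DN, -W prompts for its password.
            args += ["-D", bind_dn, "-W"]
    args.append(search_filter)
    return args


# Hypothetical server and directory names, for illustration only.
cmd = ldapsearch_command(
    "ldap://ldap.example.com",
    "dc=example,dc=com",
    "(uid=jdoe)",
    sasl_mech="GSSAPI",
)
print(shlex.join(cmd))
```

Building the argument list rather than a single string keeps the filter, which usually contains parentheses, safe from shell interpretation when the list is passed to a process launcher.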

The OpenLDAP Database

The University of Michigan LDAP team was committed to Open Source and relied heavily on the GNU project for tools and libraries. The team used the ldbm database manager for its storage management. ldbm is a University of Michigan database framework that supports several different database packages through its API. ldbm was very tightly integrated into slapd and the framework lacked many capabilities that the OpenLDAP team wanted.

In 1999, our company (Symas Corporation) was working on a proprietary piece of software based on OpenLDAP. Our Chief Architect (Howard Chu) was a member of the OpenLDAP team and realized that the structural dependency on ldbm was in the way of a number of desirable innovations. He implemented a database backend interface for slapd, largely completed for OpenLDAP 1.2 and enhanced for OpenLDAP 2.0 (August 2000). The version delivered in 2.0 allowed backends to be dynamically loadable modules. The database backend interface made it much easier to use other database technologies as the storage mechanisms for OpenLDAP. Version 2.0 came with experimental backends for SQL and LDAP, which evolved over time to production-ready status. It also brought OpenLDAP up to the LDAPv3 standard, and the rest of the commands and utilities were enhanced with TLS/SSL and SASL support in the subsequent months.

A native BDB backend, which was part of OpenLDAP 2.1, was released in June of 2002. This replaced the ldbm BDB support previously available. Two database backends based on BDB have been released for production: back-bdb and back-hdb. back-bdb was the original high-performance database backend with ACID support (Atomicity, Consistency, Isolation, and Durability: the properties that guarantee that database transactions are processed reliably). This provided a database with the reliability characteristics required for enterprise production-level directories.

It is worth noting that IBM took the restructured OpenLDAP 2.1 code and implemented their directory product (now called Tivoli Directory Server) by building a backend for DB2 and adding code to the rest of the package. In spite of IBM's periodic participation in the OpenLDAP project, they do not appear to have updated their product to any of the subsequent releases of OpenLDAP.

back-hdb (introduced in October 2003 in OpenLDAP 2.2) builds on back-bdb and implements a true hierarchical database. back-hdb is the only database available that implements the subtree rename operation standardized in LDAPv3.
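To see why a hierarchical layout makes subtree rename practical, consider the toy comparison below. It is a minimal sketch under simplifying assumptions and does not reproduce back-hdb's actual on-disk structures: the first store keys every entry by its full DN, while the second stores only each entry's own RDN and a pointer to its parent. All names and DNs are hypothetical.

```python
# Flat layout: every entry is keyed by its full DN.
flat = {
    "ou=people,dc=example,dc=com": {},
    "uid=ann,ou=people,dc=example,dc=com": {},
    "uid=bob,ou=people,dc=example,dc=com": {},
}


def flat_rename(store, old_dn, new_dn):
    """Rename a subtree in a DN-keyed store; return how many keys were rewritten."""
    touched = 0
    for dn in list(store):  # iterate over a snapshot because the dict is modified
        if dn == old_dn or dn.endswith("," + old_dn):
            new_key = dn[: len(dn) - len(old_dn)] + new_dn
            store[new_key] = store.pop(dn)
            touched += 1
    return touched


# Hierarchical layout: each entry stores its own RDN and its parent's ID.
tree = {
    1: {"rdn": "dc=example,dc=com", "parent": None},
    2: {"rdn": "ou=people", "parent": 1},
    3: {"rdn": "uid=ann", "parent": 2},
    4: {"rdn": "uid=bob", "parent": 2},
}


def tree_dn(store, entry_id):
    """Reconstruct a full DN by walking parent pointers up to the root."""
    parts = []
    while entry_id is not None:
        node = store[entry_id]
        parts.append(node["rdn"])
        entry_id = node["parent"]
    return ",".join(parts)


def tree_rename(store, entry_id, new_rdn):
    """Rename a subtree by changing one RDN; descendants are untouched."""
    store[entry_id]["rdn"] = new_rdn
    return 1


flat_touched = flat_rename(flat, "ou=people,dc=example,dc=com", "ou=staff,dc=example,dc=com")
tree_touched = tree_rename(tree, 2, "ou=staff")
print(flat_touched, tree_touched, tree_dn(tree, 3))
```

In the flat store the cost of a rename grows with the size of the subtree, whereas the hierarchical store changes a single record and every descendant's DN follows automatically.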

OpenLDAP now ships with quite a large number of database backends that help enterprises make non-directory data available through LDAP. There is ongoing discussion of adding more backends to either improve access to legacy data (internal interfaces to PostgreSQL or MySQL instead of the higher-level SQL interface) or to support smaller directories with a lighter-weight database engine (such as tdb). These discussions will yield code if customers and project members show interest. The OpenLDAP database backend interfaces have proven themselves the most flexible and cost-effective way to address many requirements including integration of meta-directories, virtual directories, and production directories using older data sources.

Overlays

The database wasn't the only part of OpenLDAP that needed work. There are many application-specific extensions that have been developed for various directory services products. It was obvious that OpenLDAP would greatly benefit from implementations of similar extensions. However, the code did not provide clean interfaces for the addition of new capabilities.

The team worked on another refactoring of the slapd code to create an internal API for the implementation of extensions, which they called overlays. The overlay interface was released in March of 2005 as part of OpenLDAP 2.2. Two overlays (Dynamic Groups and proxy cache) were released at that time in an experimental state. Since then the overlays have been extensively tested, and many of them are in production use by major enterprises. Current overlays include:

  • Password policy
  • Referential integrity
  • Attribute uniqueness
  • Value sorting
  • Translucency
  • Proxy cache
  • Access logging
  • Audit logging
  • Dynamic lists
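The general idea of an overlay, a layer that intercepts an operation before it reaches the database backend, can be sketched as follows. This is a conceptual model only, not the slapd C interface: the class names, method names, and result strings are all hypothetical.

```python
class Backend:
    """Hypothetical storage backend holding entries keyed by DN."""

    def __init__(self):
        self.entries = {}

    def add(self, dn, attrs):
        self.entries[dn] = dict(attrs)
        return "success"

    def values_of(self, attr):
        return [e[attr] for e in self.entries.values() if attr in e]


class Overlay:
    """Base overlay: passes every operation through to the layer below."""

    def __init__(self, inner):
        self.inner = inner

    def add(self, dn, attrs):
        return self.inner.add(dn, attrs)

    def values_of(self, attr):
        return self.inner.values_of(attr)


class UniqueOverlay(Overlay):
    """Rejects an add when the chosen attribute value already exists."""

    def __init__(self, inner, attr):
        super().__init__(inner)
        self.attr = attr

    def add(self, dn, attrs):
        value = attrs.get(self.attr)
        if value is not None and value in self.values_of(self.attr):
            return "constraint violation"
        return super().add(dn, attrs)


class AuditOverlay(Overlay):
    """Records each add request together with its result."""

    def __init__(self, inner):
        super().__init__(inner)
        self.log = []

    def add(self, dn, attrs):
        result = super().add(dn, attrs)
        self.log.append((dn, result))
        return result


# Stack the overlays on top of the backend, outermost first.
stack = AuditOverlay(UniqueOverlay(Backend(), "mail"))
first = stack.add("uid=ann,dc=example,dc=com", {"mail": "ann@example.com"})
second = stack.add("uid=bob,dc=example,dc=com", {"mail": "ann@example.com"})
print(first, second, stack.log)
```

Because each layer exposes the same operations as the backend, new behavior can be added or removed without touching the storage code.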

HP funded the development of some of the original overlays and developed others itself. This led to HP's adoption of OpenLDAP for their Enterprise Directory world-wide in 2006. Currently, HP is accessing their OpenLDAP directories fifty million times a day, or over a billion and a half times a month. The overlay capability provided the mechanism for Symas and the OpenLDAP project to address key HP requirements quickly and at modest cost.

Replication and Redundancy

The University of Michigan developers implemented a directory replication capability called slurpd. slurpd pushed changes out to replica servers as updates were made to the master directory. Replication lets enterprises place copies (complete or partial) in various parts of their network to optimize for performance and reliability. The original University of Michigan approach has worked reasonably well for many enterprise production directories but it has a number of inherent problems.

In the late 1990s, a new feature called Content Synchronization (see RFC 4533) offered a new basis for replication. In OpenLDAP 2.2, the project introduced synchronization replication (syncrepl) based on persistent search. syncrepl uses change sequence numbering and is a pull approach driven by the replica server. It is a much more robust replication approach and is more forgiving when replica servers lose connectivity. syncrepl was enhanced in OpenLDAP 2.3 with a change that replicates only changed attributes, which reduces network load and improves performance.
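The pull model with change sequence numbers can be illustrated with a deliberately simplified simulation. It is a sketch under stated assumptions, not the syncrepl protocol itself: sequence numbers are plain integers, deletions are ignored, and the class and method names are hypothetical.

```python
class Provider:
    """Toy master directory that stamps every write with a sequence number."""

    def __init__(self):
        self.csn = 0
        self.entries = {}  # dn -> (csn of last change, attributes)

    def write(self, dn, attrs):
        self.csn += 1
        self.entries[dn] = (self.csn, dict(attrs))

    def changes_since(self, cookie):
        """Return entries changed after the given number, plus the newest number."""
        changed = {
            dn: dict(attrs)
            for dn, (csn, attrs) in self.entries.items()
            if csn > cookie
        }
        return changed, self.csn


class Consumer:
    """Toy replica that remembers the last sequence number it has seen."""

    def __init__(self):
        self.cookie = 0
        self.entries = {}

    def pull(self, provider):
        changed, new_cookie = provider.changes_since(self.cookie)
        self.entries.update(changed)
        self.cookie = new_cookie
        return len(changed)


provider, replica = Provider(), Consumer()
provider.write("uid=ann,dc=example,dc=com", {"mail": "ann@example.com"})
provider.write("uid=bob,dc=example,dc=com", {"mail": "bob@example.com"})
first_pull = replica.pull(provider)

# The replica is "offline" while two more writes happen on the provider.
provider.write("uid=ann,dc=example,dc=com", {"mail": "ann@new.example.com"})
provider.write("uid=cy,dc=example,dc=com", {"mail": "cy@example.com"})
second_pull = replica.pull(provider)
print(first_pull, second_pull, replica.cookie)
```

Because the replica asks for everything newer than the number it last recorded, a lost connection costs nothing more than a later, larger pull. The provider does not need to track what each replica has missed.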

Replication supports excellent scaling out of a search load. It also minimizes the impact of an outage on any single server. However, it does not provide protection for the Master Directory Server that applications use to update the database. This has been a serious concern for many larger enterprises.

Symas added MirrorMode capability to the OpenLDAP source libraries to provide a hot-standby replica ready to take over the duties of Master. This support was added to the OpenLDAP source repository but not integrated into the OpenLDAP Releases. It has been available through Symas's distributions.

An alternate Master server is set up as a full replica of the Master. In MirrorMode, an LDAP request load-balancer is put in front of the Master and its fail-over backup. The load-balancer monitors the performance of the Master and switches the update load to the backup Master if it detects a failure. The other replicas are set up to switch over as well. This feature is in production in several enterprise directories.
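The fail-over decision made by the load-balancer can be reduced to a few lines. The sketch below is illustrative only: the function name, host names, and the dictionary standing in for a health check are hypothetical, and a real load-balancer would probe the servers over the network.

```python
def pick_write_target(servers, is_healthy):
    """Return the first healthy server, trying them in priority order."""
    for server in servers:
        if is_healthy(server):
            return server
    raise RuntimeError("no healthy directory server available")


# Hypothetical hosts in priority order; the dict stands in for a real health probe.
health = {"master.example.com": True, "mirror.example.com": True}
before = pick_write_target(list(health), health.get)

# Simulate a failure of the Master: updates now go to the mirror.
health["master.example.com"] = False
after = pick_write_target(list(health), health.get)
print(before, after)
```

Only one server receives updates at any moment, which is what distinguishes this arrangement from full Multi-Master operation.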

The final element of redundancy that customers ask about is full Multi-Master updating. The OpenLDAP team has been skeptical of the real value of Multi-Master updating, but customers consistently look for the feature. Multi-Master is still an invitation to irreconcilable (and frequently even undetectable) conflicts. Its real value is in terms of fast fail-over (i.e., MirrorMode). OpenLDAP 2.4 (now in beta) will introduce Multi-Master updating.

Configuration

The University of Michigan slapd and early versions of OpenLDAP used old-school flat file configuration. slapd had to be restarted to effect changes to the configuration of the directory. Other directory providers implemented various mechanisms to update at least some of the directory configuration in real time without service interruptions. OpenLDAP 2.3 introduced a configuration backend using back-ldif (an LDAP directory using the LDAP Data Interchange Format, a text-form for directory entries).

Dynamic configuration for OpenLDAP lets you manage the entire configuration of an OpenLDAP directory through LDAP itself. The schema can be extended and indexes added in real time. If you add an index, the attributes are indexed automatically in the background and the index is transparently activated when complete. Databases can be added and access controls changed.

Dynamic configuration lets you use the same browser/editor tools you use to enter and correct entry data to edit the configuration data. The configuration data is visible as entries and attributes in <cn=config>. Provisions are made to retain attribute value ordering where necessary to preserve the ordering dependency of configuration directives.
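As an example of managing configuration through LDAP, the following sketch generates the LDIF for adding an index to a database entry under cn=config by way of the olcDbIndex attribute. The database DN used here (olcDatabase={1}hdb,cn=config) and the helper name are assumptions for illustration; the actual DN depends on how a given server is configured.

```python
def add_index_ldif(database_dn, attribute, index_types):
    """Return an LDIF modify record that adds one olcDbIndex value."""
    lines = [
        f"dn: {database_dn}",
        "changetype: modify",
        "add: olcDbIndex",
        f"olcDbIndex: {attribute} {','.join(index_types)}",
        "-",
    ]
    return "\n".join(lines) + "\n"


# Assumed database DN; check the entries under cn=config on the target server.
ldif = add_index_ldif("olcDatabase={1}hdb,cn=config", "mail", ["eq", "sub"])
print(ldif)
```

The resulting text can be supplied to a tool such as ldapmodify, or the same change can be made interactively with an LDAP browser.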

Dynamic configuration is initiated either on slapd startup or during setup by one of the utility programs. Configuration data is either initially specified in an old-school slapd.conf flat file or loaded from LDIF data. This makes slapd its own administration server and opens the door to development of custom administration applications via the popular web programming languages.

Performance

The performance of OpenLDAP is many times better than that of the University of Michigan code and even the early releases of OpenLDAP. Some measurements indicate performance hundreds of times faster. The conversion to BDB initially yielded substantial slowdowns, because BDB's transaction locking (ACID support) introduced quite a bit of code that reduced performance. To counteract that, the team has obsessively profiled the code and attacked performance bottlenecks. The result is striking.

Recent benchmarks on typical server configurations show that, with sufficient RAM to cache active data, OpenLDAP delivers twelve to fourteen thousand queries a second. Symas published several benchmarks run on popular platforms. There are some comparison benchmarks for the other Open Source directory projects (Fedora DS and OpenDS). There is also a comparative benchmark on a similar configuration to a published Sun Microsystems benchmark. In all cases, OpenLDAP significantly outperforms the competition.

We had an opportunity to benchmark OpenLDAP 2.3 on a large SGI Altix configuration (480GB of RAM, 32 processors, lots of fast disks). The benchmark used a customer-defined workload, schema, and data generation. It was a series of runs against tens of millions, around a hundred million, and then one hundred and fifty million entries. The entries were over three thousand bytes each, making it a huge database. OpenLDAP peaked at over twenty-two thousand queries a second. It delivered over six thousand updates a second at peak and over four thousand at steady state. Performance was essentially the same in all three directory size tests, and it appears that performance would remain the same as the directory grew, assuming sufficient RAM were made available.
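The scale of that largest run can be checked with simple arithmetic. The calculation below uses only the figures quoted above, treats a gigabyte as 10^9 bytes, and takes three thousand bytes as a lower bound on entry size. Indexes and database overhead are not included, so the result is only a floor on the raw entry data.

```python
ENTRIES = 150_000_000      # largest directory size in the benchmark
BYTES_PER_ENTRY = 3_000    # "over three thousand bytes", so a lower bound
RAM_GB = 480               # RAM in the SGI Altix configuration

raw_bytes = ENTRIES * BYTES_PER_ENTRY
raw_gb = raw_bytes / 1e9   # decimal gigabytes

print(f"raw entry data: at least {raw_gb:.0f} GB against {RAM_GB} GB of RAM")
```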

As a side-note, someone accidentally bumped a memory card during the benchmark. The system panicked. So did the customer. On their current directory that would probably have resulted in a corrupted database. We didn't have time to do the reload, which would have taken many hours. The directory was back up in seconds after the system was back online, demonstrating the value of ACID support. There was no data corruption and the testing was only halted for a few minutes total.

That experience is consistent with the reliability our other customers report. Outages are primarily administrative (generally administration of the platform that interrupts service) or due to hardware maintenance. We virtually never hear of outages from software failures.

Portability and Integration

The OpenLDAP Project has been particularly attentive to portability. OpenLDAP compiles and runs correctly on all of the major Linux and UNIX systems. It also runs on Apple's Mac OS X and Microsoft Windows. Symas provided a client with a port to IBM's z/OS, a testament to the code's portability. IBM is now reselling that port.

Customers need to install OpenLDAP, BDB, OpenSSL or GnuTLS, and Cyrus SASL to get a fully LDAPv3-compliant directory server running. The project verifies proper operation against releases of those packages. Symas offers convenience binary distributions for free trial and fully maintained distributions for subscribers.

Summary

The OpenLDAP project picked up where the original University of Michigan team left off. Very substantial improvements in directory capability have been added and OpenLDAP continues to be the most internet standards-compliant directory server available. It is, demonstrably, the fastest. It is also, demonstrably, the most reliable. And it is the simplest to administer, integrate, and extend.

Enterprise-grade commercial support is available from Symas (the primary corporate development contributor to the project), from HP (supported by Symas), and from other corporate project contributors like Suretec in the UK and SysNet in Italy. This is an important consideration for many enterprises looking to convert to Open Source solutions but requiring sophisticated professional support services.

It is clear that the OpenLDAP project has made huge strides. The level of innovation is still very high and the quality of the resulting code is excellent.

Marty Heyman is the President of Symas Corporation. He previously worked as Marketing and Development Executive at IBM, Locus Computing, Palyn-Gould Group and two software start-ups since acquired by Microsoft and EMC.

