We’re migrating our whole authentication system to LDAP: we’ve got a
master OpenLDAP machine hidden away, with each of the public access
servers (i.e. webservers and mailservers) using syncrepl to keep its own
copy up to date. The failover design is pretty good, but it comes at a
cost: we need to use OpenLDAP 2.2, and for that we need to manually
build the RPMs for RHEL3, RHEL4, RHEL2.1 and RHL7.3.
Needless to say, this sucks.
So it’s 5pm Friday night, last night, and I’m fighting my way through
a backport of our backported RHEL3 SRPM to RHL7.3, and the versioned
dependencies are killing me. Worst of all are the build system
dependencies, GAR! Backporting libtool just to build the fucking
thing? Uncool.
I bitched to Benno, and we got to thinking. Tridge has written an
LDAP-like library; it’s cool, it’s fast, but it’s not a server. So what
would you need on top of that? Samba 4? :-) Well, it’d be nice to have a
software package designed to be buildable on legacy operating system
versions, so that I can get my job done. Python’d be good for
prototyping (but in order to get mindshare in the ricer demographic
we’d need to write the final version in C, for maximal -funroll-loops
impact :-).
You’d also want compatibility with OpenLDAP: communication over the
ldapi://, ldap://, and ldaps:// protocols, and an ABI-compatible
client library so you could just drop it in on top. Strict schema
checking only, with a collection of excellent (and non-conflicting!)
schemas one can use.
SSL mode would absolutely need to do bidirectional certificate
checking. No exceptions.
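To make "bidirectional" concrete, here is a minimal sketch using today's Python ssl module rather than anything from our prototype. The function names and the optional file arguments are made up for illustration; the paths are optional only so the sketch can be poked at without real certificates.

```python
import ssl

def server_context(ca_file=None, cert_file=None, key_file=None):
    """Server side: refuse any client that does not present a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # The server default is CERT_NONE, so this line is what makes it bidirectional.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    return ctx

def client_context(ca_file=None, cert_file=None, key_file=None):
    """Client side: verify the server and offer our own certificate."""
    # PROTOCOL_TLS_CLIENT already turns on CERT_REQUIRED and hostname checking.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    return ctx
```

With both contexts in place the handshake fails unless each end can validate the other against the CA it was given, which is the "no exceptions" part.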
I’d like such a server to carry the same pride that small, safe,
secure daemons like dovecot and vsftpd have: using secure programming
techniques, simple configuration, and sticking to the desired
featureset.
Plenty of tests in the test suite :-) Write it using a
single-threaded state machine model; no threads whatsoever.
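Roughly what I mean by a state machine with no threads, as a sketch only: every connection carries its own state, and one loop feeds it one request at a time. The states, the request names and the "bind before search" policy here are invented for the example, and in a real server the events would come off select() or poll() rather than a list.

```python
from collections import deque

# Hypothetical connection states and request names, for illustration only.
ANONYMOUS, BOUND, CLOSED = "anonymous", "bound", "closed"

# (current state, request) -> next state; anything missing is rejected.
TRANSITIONS = {
    (ANONYMOUS, "bind"): BOUND,
    (ANONYMOUS, "unbind"): CLOSED,
    (BOUND, "bind"): BOUND,
    (BOUND, "search"): BOUND,
    (BOUND, "unbind"): CLOSED,
}

class Connection:
    def __init__(self, name):
        self.name = name
        self.state = ANONYMOUS

    def handle(self, request):
        """Advance the machine by one request, or reject it and stay put."""
        next_state = TRANSITIONS.get((self.state, request))
        if next_state is None:
            return "error"
        self.state = next_state
        return "ok"

def run(events):
    """Single-threaded loop: process one (connection, request) pair at a time."""
    pending = deque(events)
    log = []
    while pending:
        conn, request = pending.popleft()
        log.append((conn.name, request, conn.handle(request)))
    return log
```

Because all the state lives in the Connection objects and the table, interleaving many clients needs no locking at all, and the transition table is easy to cover exhaustively in the test suite.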
Finally, what’s the point of a new product if it doesn’t provide
advantages over the existing competitors? Out of the box multi-master
capabilities would be gold. I chatted to Benno about this at the
time; he even suggested multi-master, but conceded it’s probably going
to be the hardest thing to do.
So, the night turned into a prototyping-fest where Benno and I
attacked the problem on paper and in emacs, eventually coming up
with a simple algorithm that looked to us (though we haven’t been able
to prove it :-) like it would work for multi-master cliques of any size N.
For tested N from 1 up to 25, we found that the number of message
passes needed to reach synchronisation grew almost as N^2 with the size
of the clique. This sucked when we injected update messages during the
update, but our thought experiments suggest that
- people are unlikely to have more than a handful of servers in a single cluster
- real life updates are going to be coming less frequently than they
were in our test.
The test code lives in arch at
jaq@spacepants.org--2004/almanac--prototype--0 (thanks to
thesaurus.com for synonyms of “directory”). You can check out Node.py
for the master update algorithm, a simple three-conditional
check and a global time counter.
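For a feel of the shape of such a thing, here is a toy reconstruction. It is not the code in Node.py: the Node class, the three branches and the simulate() helper are my guesses at what "three conditionals and a global counter" could look like, and it only models a single write with nothing else in flight.

```python
from collections import deque
from itertools import count

class Node:
    def __init__(self, name):
        self.name = name
        self.peers = []
        self.value = None
        self.stamp = 0

    def receive(self, sender, value, stamp):
        """Return the (recipient, value, stamp) messages to send in response."""
        if stamp > self.stamp:
            # Newer than ours: adopt it and pass it on to everyone but the sender.
            self.value, self.stamp = value, stamp
            return [(p, value, stamp) for p in self.peers if p is not sender]
        if stamp < self.stamp:
            # Older than ours: tell the sender what we have.
            return [(sender, self.value, self.stamp)]
        # Same stamp: already in sync, nothing to send.
        return []

def simulate(n):
    """One write at node 0 in a full mesh of n nodes; returns (messages, converged)."""
    clock = count(1)  # the global time counter (an assumption of this sketch)
    nodes = [Node(i) for i in range(n)]
    for node in nodes:
        node.peers = [p for p in nodes if p is not node]
    origin = nodes[0]
    origin.value, origin.stamp = "new", next(clock)
    queue = deque((origin, p, origin.value, origin.stamp) for p in origin.peers)
    sent = 0
    while queue:
        sender, recipient, value, stamp = queue.popleft()
        sent += 1
        for target, v, s in recipient.receive(sender, value, stamp):
            queue.append((recipient, target, v, s))
    converged = all(node.stamp == origin.stamp for node in nodes)
    return sent, converged
```

In this toy a single write costs (N-1)^2 messages: N-1 from the origin, then each of the other N-1 nodes forwards to its N-2 remaining peers. simulate(25) returns (576, True), for example. That quadratic shape is at least the same flavour as what we measured, though our real test also had updates landing mid-flight.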
Benno thinks we need a vector clock to reduce the number of messages
passed, and I think he’s right. But currently the algorithm works well
enough; I think it’s enough to start building the rest of the server
on top of it :-)
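For reference, a vector clock is just one counter per node, compared element by element. This is the textbook structure, not anything we have written yet, and the function names are my own. How exactly it would cut our message count is still a guess, but being able to tell "already seen" from "genuinely concurrent" seems like the obvious place to start.

```python
def tick(clock, node):
    """Return a copy of clock with node's own counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new

def merge(a, b):
    """Element-wise maximum: the clock after learning everything in a and b."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

def compare(a, b):
    """Classify a relative to b: 'equal', 'before', 'after' or 'concurrent'."""
    keys = a.keys() | b.keys()
    a_le_b = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    b_le_a = all(b.get(k, 0) <= a.get(k, 0) for k in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"
```

A node receiving an update whose clock compares as "before" or "equal" to its own has nothing new to learn from it, while "concurrent" flags a real conflict between two masters that a single global counter cannot distinguish from plain ordering.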