I’ve been looking at object storage systems (think Amazon S3, Swift, “cloud storage”) lately to see if they solve some storage problems, and I’m not particularly pleased with where any of them are right now. I realize it’s still early days for them, but all the ones I’ve looked at have some development time in store for them. Most of them eschew traditional storage systems, opting instead to use raw hard disks, replicate objects to protect against hardware issues, and keep track of objects via an internal database or hashing mechanism.
The thing I can’t get past is that compared to traditional storage (RAIDed, high availability or clustered) and traditional databases (MySQL, linter, filesystems like xfs, ext3), they all seem to want to reinvent the wheel. The object stores aren’t necessarily better for it; to get performance and protection you have to scale up the number of servers and hard drives to the point where it’s reasonable to ask whether it makes sense just to buy a traditional storage system. With the usual three replicas, an object store gives you 33% storage efficiency; a traditional RAID6 group with four data disks and two parity disks gives you 67%.
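The 33% and 67% figures are simple arithmetic. Here is a rough sketch of it, assuming three full replicas on the object store side and a six-disk RAID6 group (four data disks, two parity disks) on the traditional side. The function names and the six-disk group size are assumptions for illustration; a wider RAID6 group would come out more efficient.

```python
def replication_efficiency(replicas: int) -> float:
    """Usable fraction of raw capacity when every object is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("need at least one copy")
    return 1 / replicas


def raid6_efficiency(disks: int) -> float:
    """Usable fraction of raw capacity for one RAID6 group (two disks' worth of parity)."""
    if disks < 4:
        raise ValueError("RAID6 needs at least four disks")
    return (disks - 2) / disks


if __name__ == "__main__":
    # Assumed configurations: 3 replicas versus a 4+2 RAID6 group.
    print(f"3 replicas:   {replication_efficiency(3):.0%}")
    print(f"RAID6 (4+2):  {raid6_efficiency(6):.0%}")
```

Put differently, under these assumptions you need three raw terabytes per usable terabyte with replication, versus one and a half with RAID6.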
Separately, I was reading about the git source control software and got to thinking: it looks a heck of a lot like a simple object store, where the files are stored as objects, and ‘versions’ or ‘branches’ of a project are just different lists of objects.
If you take Swift as an example, and instead of deploying it on raw hard drives, use some kind of RAIDed and clustered storage (e.g. NetApp, BlueArc, EMC, maybe even Isilon or Lustre) where the data protection (and recovery) mechanism is offloaded from Swift, you should be able to turn the number of object replicas down to one, instead of three. Storage protection is then handled by the backend system; and if a Swift node (either storage or proxy) dies, you spin up a new server and point it at the storage system.
At that point, Swift is effectively storing files, calculating their checksums and returning a key to the client to use as an object key, keeping metadata (tags) if you like, and serving the files back on request. You can turn off the replication mechanism, since the storage is taking care of that for you. Keeping the scrubber running to make sure objects match their checksum probably isn’t a bad idea if you’re paranoid, but also shouldn’t be necessary (again, the storage takes care of that for you). But whether Swift is running on protected storage or raw hard drives, you can’t (easily) provide an index of the available files, or replicate them geographically (from, say, the New York server to the Los Angeles server).
Bring in git. git stores all files as objects, named by the SHA1 checksum of their contents (plus a short type-and-length header), and keeps lists (“trees”) of these objects and of sub-trees to represent a directory structure. Update a file or a few, and a new tree is created to represent the new structure and include the new objects.
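To make the naming scheme concrete, here is a small sketch of how git derives a file object's name: it hashes the header `blob <size>` followed by a NUL byte and then the file contents. The `snapshot` helper is a deliberately simplified stand-in for a tree (a plain mapping of file name to object name), not git's real tree format, and both function names are made up for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 name git gives a file ("blob") with these contents."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def snapshot(files: dict) -> dict:
    """Simplified 'tree': map each file name to the name of its blob object."""
    return {name: git_blob_id(data) for name, data in files.items()}


if __name__ == "__main__":
    # The empty file has a well-known object name in every git repository.
    print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

    v1 = snapshot({"a.txt": b"one\n", "b.txt": b"two\n"})
    v2 = snapshot({"a.txt": b"one\n", "b.txt": b"two, edited\n"})
    # Only the changed file points at a new object; a.txt is shared by both versions.
    print(v1["a.txt"] == v2["a.txt"], v1["b.txt"] == v2["b.txt"])
```

The point of the second half is that a new version costs only the objects that actually changed, plus a new list.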
Git apparently isn’t very good at large repositories of files because of how it stats every file in a tree during certain operations, while Swift uses a hashing mechanism to keep it quick. But if you serve up the git objects via a web server (perhaps with a module to decompress the objects, strip out the git header and look inside git packfiles) and primarily store or retrieve objects, you can use Basic or Digest authentication to protect the content (for GETs or PUTs), and (in theory) can use ‘git clone’ and ‘git pull’ to keep remote replicas up to date. If you had a mechanism for dealing with conflicts (or just never allowed file modification) you might be able to use this mechanism to allow multimaster replication between sites. And git handles full-file deduplication by keeping only one copy of an object: if a second copy is added, the SHA1 checksum already exists, so there is no need to write a second file.
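As a rough illustration of the store-and-retrieve part, here is a toy content-addressed store that mimics git's loose-object layout: the object name is the SHA1 of header plus contents, the file on disk is the zlib-compressed header plus contents, and it lives under a directory named after the first two hex digits. `ToyObjectStore` and its methods are hypothetical names for this sketch. It ignores packfiles, authentication and replication entirely, and it is not how git or Swift is actually implemented.

```python
import hashlib
import zlib
from pathlib import Path


class ToyObjectStore:
    """Minimal content-addressed store using a git-like loose-object layout."""

    def __init__(self, root):
        self.root = Path(root)

    def _path(self, key: str) -> Path:
        # Same fan-out as git: objects/ab/cdef... for key "abcdef..."
        return self.root / key[:2] / key[2:]

    def put(self, content: bytes):
        """Store content; return (key, written) where written is False on a duplicate."""
        header = b"blob %d\x00" % len(content)
        key = hashlib.sha1(header + content).hexdigest()
        path = self._path(key)
        if path.exists():
            # Deduplication: the checksum already exists, so nothing is written.
            return key, False
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(zlib.compress(header + content))
        return key, True

    def get(self, key: str) -> bytes:
        """Return the original bytes, with the git-style header stripped."""
        raw = zlib.decompress(self._path(key).read_bytes())
        _header, _sep, body = raw.partition(b"\x00")
        return body


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        store = ToyObjectStore(tmp)
        key, written = store.put(b"some file contents\n")
        print(key, written)
        print(store.put(b"some file contents\n"))  # same key, nothing written
        print(store.get(key))
```

The `get` method is the piece a web server module would have to do on each request: decompress, drop everything up to the first NUL byte, and return the rest.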
Richard Anderson at Stanford has already taken a look at this and listed out what he’s found when considering git as an object store, with some good detail on the strengths and weaknesses.
Git and Swift both have their strengths and weaknesses for specific applications; but maybe the Swift folks can take some direction from git, or maybe someone can address some of the issues with creating a git-based object store. Perforce is also good at holding very large source trees; they keep a database of the repository and use a proprietary client/server protocol, but maybe they’d be interested in seeing if a p4-based object store makes sense?