7. Release early. Release often. And listen to your customers.
– Eric S. Raymond, The Cathedral and the Bazaar
First of all, my apologies for the nearly month-long absence. Most of that time has been spent wrapping my head around binary packing, zlib inflation/decompression (in fact the distinction still eludes me…) and playing with Granite, the product of all this tortuous work.
Granite is a pure PHP library for Git. I managed to find a few existing libraries, the most useful of which was Glip, but each of them had issues.
Glip’s developer has no intention of supporting ‘push/pull/sync’ operations [1], although an alternative implementation exists [2] that permits pushing over “smart” HTTP. However, the Upliner Glip fork [2] hasn’t been updated for a year, and Glip [1] appears to have stalled.
Ideally, Granite will provide a simple, up-to-date implementation of Git reading and writing, with smart HTTP push and pull support. I’d like to use this library for more than just ownCloud, for example a locally installable version of GitHub.
ownCloud
So how does this fit into ownCloud versioning? Granite can read any object from the repository, including tags/branches (refs). Once I’ve developed some classes to represent each of the major objects, I can start writing a PHP StreamWrapper implementation. This should allow me to ‘tweak’ one of the existing OC_Filestorage providers, allowing access to a Git repository as if it were a local directory.
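To give a feel for what "reading an object" involves, here is a minimal sketch in Python (not Granite's PHP code) of the loose object format: the bytes "type size", a NUL byte, then the content, all zlib-compressed, with the SHA-1 of the uncompressed bytes as the object ID. The function names are made up for illustration, and packed objects are not handled here.

```python
import hashlib
import zlib


def encode_loose_object(obj_type: str, content: bytes) -> tuple[str, bytes]:
    """Return (SHA-1 hex ID, zlib-compressed bytes) for a loose Git object."""
    # Header is "<type> <size>" followed by a single NUL byte.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    raw = header + content
    return hashlib.sha1(raw).hexdigest(), zlib.compress(raw)


def decode_loose_object(compressed: bytes) -> tuple[str, bytes]:
    """Inflate a loose object and split it into (type, content)."""
    raw = zlib.decompress(compressed)
    header, _, content = raw.partition(b"\x00")
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type, content


if __name__ == "__main__":
    object_id, data = encode_loose_object("blob", b"hello\n")
    print(object_id, decode_loose_object(data))
```

In a real repository the compressed bytes live under .git/objects/, in a directory named after the first two hex characters of the ID, with the remaining 38 characters as the file name.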
I’m still fuzzy on the details for ownCloud. I don’t much want to go changing the user interface, since that’s somebody else’s code. Ideally, you would enable the ‘Git Integration’ (or whatever) application, mount a repository or two (or add existing folders to new repositories) and then be able to browse the current HEAD. For version rollback/recovery, the application should use a configuration value that points to HEAD by default and that the user can override to view previous versions of files.
Active or Passive Versioning?
That’s a poor choice of heading, but one of my major concerns is how often to make a new ‘version’. I don’t want it to be too granular (i.e. virtually every write results in a new commit) as repository history can get quite large. On the other hand, I don’t want to leave too long between changes, for obvious reasons.
What do you think? (I know people are reading this!) Client-side projects tend to use file change notification systems (SpiderOak, for example [3]) but that’s not really applicable here. Apple’s Time Machine seems to go for an hourly approach [4]. I’ll look more into the choices made by other projects.
Testing
The code is a bit of a shambles at the moment: I’m tired, and it’s just started working, so I want to share it with the world! Beyond all that though, I want to put Granite through its paces before using it to integrate with ownCloud. So please, download it, test it, break it, and tell me all about it. Hopefully the unit tests make sense to people; they should run regardless of whether a repository has packed objects or not.
[1] http://lists.fimml.at/glip-devel/0014.html
[2] https://github.com/Upliner/gitphp-glip/wiki/
[3] https://spideroak.com/blog/20091204132500-spideroak-releases-lightweight-filesystem-change-notification-utilities-for-windows-os-x-and-linux-gplv3
[4] https://discussions.apple.com/thread/1949414?start=0&tstart=0
Great to see progress being made.
I would suggest committing changes when the file changes, with a (configurable) maximum frequency (say, 1 commit every 10 minutes).
Repo history can always be squashed if it gets too large.
I like that; I’ll try to implement it as an app/admin panel option. I’d like to avoid rebasing if possible, but I guess there’s no other way of shrinking the history. Perhaps a nice big warning about synchronisation issues in that case.
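As a rough sketch of the rate-limited idea suggested above (in Python rather than PHP, and not actual Granite or ownCloud code), something like the following could decide when a write should turn into a commit. The class name, the method names and the 600-second default are assumptions for illustration only.

```python
import time


class CommitThrottle:
    """Allow at most one commit per `min_interval_seconds` (hypothetical helper)."""

    def __init__(self, min_interval_seconds=600.0, clock=time.monotonic):
        self.min_interval = min_interval_seconds
        self.clock = clock          # injectable so the logic can be tested
        self.last_commit = None     # time of the last commit, if any
        self.dirty = False          # True when there are uncommitted writes

    def file_changed(self):
        """Record a write; return True if a commit should be made now."""
        self.dirty = True
        return self.flush()

    def flush(self):
        """Return True if pending writes may be committed at this moment."""
        if not self.dirty:
            return False
        now = self.clock()
        too_soon = (
            self.last_commit is not None
            and now - self.last_commit < self.min_interval
        )
        if too_soon:
            return False            # keep the writes pending for a later flush
        self.last_commit = now
        self.dirty = False
        return True


if __name__ == "__main__":
    throttle = CommitThrottle(min_interval_seconds=600)
    print(throttle.file_changed())  # first write: commit immediately
    print(throttle.file_changed())  # second write straight after: held back
```

Writes that arrive inside the interval stay pending, so something (a cron job, or the next request) would still need to call flush() later to pick them up.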