DVCS-Autosync

Executive summary

dvcs-autosync is a project to create an open source replacement for Dropbox/Wuala/Box.net/etc. based on distributed version control systems (DVCS). It offers nearly instantaneous mutual updates when a file is added or changed on one side, with the added benefit of (local, distributed) versioning. It does not rely on a centralized service provider and can be used with any DVCS hosting option, including a completely separate server, so your data remains your own.

Synchronization of directories is based on DVCS repositories. Git is used for main development and is being tested most thoroughly as the backend storage, but other DVCS such as Mercurial are also supported. dvcs-autosync is comparable to SparkleShare in terms of overall aim, but takes a more minimalistic approach. A single Python script monitors the configured directory for live changes, commits these changes to the DVCS (such as git) and synchronizes with other instances using XMPP messages.

It is based on the following components:

  • Git (or any other DVCS) is used as the backend for synchronizing changes and keeping a version history of all changes.
  • The inotify API of the Linux kernel (and in the future, the respective OS-dependent APIs) is used to monitor the repository that should be kept in sync for local changes.
  • XMPP messages are sent between multiple instances of dvcs-autosync to notify all other repositories of changes on any of the synchronized hosts.

A nice introductory article on dvcs-autosync was published on Linux Weekly News.

Download

dvcs-autosync is distributed as open source under the terms of the GNU GPL (v2 or v3). Development currently happens in a GitHub project, where you can get the most current source tree. I also maintain Debian/Ubuntu packages of stable snapshots to make installation easier.

Issue/bug tracker

Please report all bugs or wishlist items in the GitHub issue tracker.

Synchronization procedure

  1. Set up desktop notifications (for these nice bubble-style popups when anything happens) and log into a Jabber/XMPP account specified in the config file.
  2. Monitor a specific path (and its subdirectories) for changes with inotify.
    At the moment, only one path is supported; to monitor multiple disjoint paths, run one script instance per path. This path is assumed to be (part of) a repository. Currently tested with git, but most DVCS should work (the config file allows specifying the DVCS commands called when interacting with the repository).
    Optionally, an [ignores] file is read with one exclusion pattern per line, and files matching any of the patterns are ignored. This will typically be the .gitignore file that already exists in the git tree.
  3. When changes are detected, check them into the repository that is being monitored (or delete, or move, etc.).
    Patterns listed in .gitignore are ignored automatically, and the config file allows excluding other directories (e.g. repositories within the main repository).
  4. Wait for a configurable time. When nothing else changes in between, commit.
  5. Wait a few seconds longer (again configurable) and, if nothing else is committed, initiate a push to the server.
  6. After the push has finished, send an XMPP message to self (that is, to all clients logged in with the same account) to notify the other instances of the push.

At any time in between, when receiving a proper XMPP message, pull from the repository.
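
The two waiting periods in steps 4 and 5 can be sketched as a small state machine. This is only an illustration of the timing logic described above, not the code used by dvcs-autosync: the class name SyncScheduler, the parameter names commit_delay and push_delay, and the action strings are all hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
class SyncScheduler(object):
    """Hypothetical sketch: decide when a commit and a push are due."""

    def __init__(self, commit_delay=10.0, push_delay=5.0):
        self.commit_delay = commit_delay  # quiet time required before a commit
        self.push_delay = push_delay      # extra quiet time required before a push
        self.last_change = None           # time of the latest uncommitted change
        self.last_commit = None           # time of the latest unpushed commit

    def file_changed(self, now):
        """Record a local change; this restarts the commit wait."""
        self.last_change = now

    def tick(self, now):
        """Return the actions that are due at time `now`."""
        if self.last_change is not None:
            if now - self.last_change >= self.commit_delay:
                # Nothing changed for long enough: commit.
                self.last_change = None
                self.last_commit = now
                return ["commit"]
            return []  # still waiting; a pending change also holds back the push
        if self.last_commit is not None and now - self.last_commit >= self.push_delay:
            # Nothing else was committed: push, then notify the other instances.
            self.last_commit = None
            return ["push", "notify"]
        return []
```

A caller would invoke file_changed() from the file monitor and call tick() periodically, mapping "commit", "push" and "notify" to the configured DVCS commands and the XMPP message.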

Dependencies

  • Linux kernel with inotify enabled
  • Python 2.6
  • Pyinotify (better performance with version >= 0.9)
  • A patched JabberBot (>= 0.9) (included in this repository, the patch allows reception of messages from its own XMPP id and will be included in the next upstream JabberBot version)
  • xmpppy
  • [recommended] Pyinotify

Installation

Package installation

  • Either install the Debian package (generated by dpkg-buildpackage from the source tree) or use the Arch Linux package
  • or (on other systems) simply execute the following (to install to /usr/local/bin and /usr/share/dvcs-autosync):
    1. python setup.py build
    2. sudo python setup.py install

Manual installation

  • Copy dvcs-autosync to a location in $PATH and jabberbot.py to a location in $PYTHONPATH (quick and dirty: keep both in the same directory and run ./dvcs-autosync later)

Initial repository set-up

  1. Create the repository and do an initial push
    1. [on the server used to host the central git repository]
       $ git init --bare autosync.git
    2. [on the first host using that repository]
       $ cd ~ && git clone <server>:autosync.git autosync && cd autosync
       $ [ populate initial contents and add to index ]
       $ git commit -m 'Initial commit'
       $ git push origin master
    3. [on each additional host]
       $ git clone <server>:autosync.git autosync
  2. Configure autosync on all hosts
    • Create an XMPP/Jabber account (for example on jabber.org, or set up your own server)
    • Copy the included .autosync-example config file to ~/.autosync (or wherever you want)
    • Change it to your needs
  3. Run autosync.py on all hosts
     $ autosync.py [config file] # config defaults to ~/.autosync
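
The optional config file argument shown above could be resolved roughly as follows. This is a minimal sketch of the stated default only; the function name resolve_config_path and the constant DEFAULT_CONFIG are hypothetical, and the actual script may handle its arguments differently.

```python
import os

DEFAULT_CONFIG = "~/.autosync"  # default location named in this document


def resolve_config_path(argv):
    """Return the config path: the first argument if given, else the default."""
    path = argv[1] if len(argv) > 1 else DEFAULT_CONFIG
    # Expand a leading "~" to the current user's home directory.
    return os.path.expanduser(path)
```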

Potential pitfalls

  • For Jabber login, there probably needs to be a _xmpp-client._tcp.<domain name of jabber account> SRV entry in DNS so that the Python XMPP module can look up the server and port to use. Without such an SRV entry, Jabber login may fail even if the account details are correct and the server is reachable.
  • If errors of the form ERROR:pyinotify:add_watch: cannot watch ... appear on startup, either a file or directory name is invalid and cannot be watched for changes, or the number of files a user may watch concurrently through the kernel inotify interface has reached its limit. In the latter case, raise the limit by setting the sysctl variable fs.inotify.max_user_watches to a sufficient value (e.g. 500000).
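
To judge whether the inotify limit is likely to be the cause, the number of directories under the synchronized path can be compared with the current limit. The sketch below assumes that recursive monitoring needs one watch per directory; the helper names are hypothetical. The limit applies per user across all processes, so a result of True is not a guarantee.

```python
import os


def count_directories(root):
    """Count root and every directory below it (symlinks are not followed)."""
    return sum(1 for _dirpath, _dirnames, _filenames in os.walk(root))


def read_watch_limit(path="/proc/sys/fs/inotify/max_user_watches"):
    """Read the current value of fs.inotify.max_user_watches (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())


def watches_fit(root, limit):
    """Return True if one watch per directory under root stays within limit."""
    return count_directories(root) <= limit
```

On a Linux host, watches_fit(os.path.expanduser("~/autosync"), read_watch_limit()) gives a rough first check before changing the sysctl value.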

Development contributors

  • Dieter Plaetinck: documentation and bug fixes, Arch Linux packaging
  • René ‘Necoro’ Neumann: improvements for embedded Jabberbot with regards to disconnects, bug fixes
  • Philipp Tölke: Windows port
René Mayrhofer