Real time file synchronization like Dropbox via Unison

[Dropbox](https://www.dropbox.com/referrals/NTkxMTE4MDk) is a very nice tool for real-time synchronization. It works very well to keep files from multiple devices (computers, phones, etc.) in sync. I use it mainly as a cloud-based backup for some of my files. However, it has been in the headlines recently due to [security](http://lifehacker.com/5813861/dropbox-accidentally-unlocked-all-accounts-for-4-hours) and [privacy](http://gizmodo.com/5817885/dropboxs-new-tcs-pushes-users-into-a-panic-over-privacy-and-ownership) concerns, leading to calls for [encrypting](http://lifehacker.com/5794486/how-to-add-a-second-layer-of-encryption-to-dropbox) your files prior to syncing with Dropbox.

I’ve long considered running my own Dropbox-like service to have yet another safe backup of my files. Besides knowing exactly where my data are stored, I have (in theory) an unlimited amount of space. [This](http://fak3r.com/geek/howto-build-your-own-open-source-dropbox-clone/) post and [this](http://www.cerebralmastication.com/2011/04/fast-two-way-sync-in-ubuntu/) post outline solutions based on open source tools such as [OpenSSH](http://www.openssh.com/) (for encrypted file transfer), [lsyncd](http://code.google.com/p/lsyncd/) (for monitoring files), and [Unison](http://www.cis.upenn.edu/~bcpierce/unison/) (an [rsync](http://rsync.samba.org/)-like tool). I attempted [this](http://www.cerebralmastication.com/2011/04/details-of-two-way-sync-between-two-ubuntu-machines/) setup, but failed to get things working with lsyncd (see the extensive discussion with the author in the comments).

I stumbled upon [this](http://www.digitalplayground.at/?p=1) post that outlines a solution based on the bleeding edge version of Unison, which includes the `-repeat watch` option for monitoring files. However, the author outlined a solution for Mac OS X. I played around with the new Unison and arrived at a solution I am pretty satisfied with for my Ubuntu machines (easily extended to Mac and Windows, I’m sure). I will outline my setup in this post. Note that I have [password-less ssh](http://blog.nguyenvq.com/2009/06/19/passwordless-ssh/) set up so that I can ssh into my server without typing in the password. Also, I am using Unison version 2.44.2, which I downloaded via svn around 7/16/2011.

## Installing Unison
The same version of Unison must be installed on both the client and the server. Both my client and server run Ubuntu (11.04 and 10.04 server). On the client, the folder I would like to sync is `/home/vinh/Documents`; the server’s destination is `/home/vinh/Backup/Documents`.

    sudo apt-get install ocaml python-pyinotify
    ## install the .deb file from http://packages.ubuntu.com/search?keywords=python-pyinotify via `dpkg -i` if python-pyinotify is not in your repository
    svn checkout https://webdav.seas.upenn.edu/svn/unison
    ## the checkout above lands in ./unison, so trunk is one level down
    cd unison/trunk
    make NATIVE=true UISTYLE=text
    ## `make install` installs into $HOME/bin/
    sudo cp src/unison /usr/local/bin/
    sudo cp src/fsmonitor.py /usr/local/bin/

Everything following is done on the client computer.

## Scripts

`unisonNetworkOnPortForward`:

    #! /bin/bash

    ## http://ubuntuforums.org/showpost.php?p=6679437&postcount=4
    ## can't have extension in filename http://www.duncanelliot.com/blog/?p=28

    # ssh username@server.ip -f -N -L 9922:server.ip:22 ## minimal
    sudo -u local.username ssh username@server.ip -Y -C -f -N -L 9922:server.ip:22

    ## multiple instances can run in case of disconnect and reconnect

This script forwards my local port 9922 to the server’s port 22 via ssh. That way, I can `ssh username@localhost -p 9922` if I want to connect to the server. I do this so that file synchronization can resume after a disconnect and reconnect (changed files do not get synced after a reconnect if I connect to the remote server directly).

Run `sudo cp unisonNetworkOnPortForward /etc/network/if-up.d/` on Debian or Ubuntu. By doing this, the script will be executed whenever the computer connects to a network (this will be different for non-Debian-based distros). Note that multiple instances of this port forwarding will be present if the network is disconnected and reconnected multiple times. This makes things a little ugly, but I haven’t noticed any problems so far. Also note that the script name cannot have a file extension or things will [not](http://www.duncanelliot.com/blog/?p=28) work.
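
If the duplicate forwards bother you, one option is to check for an existing tunnel before starting another. The following is a minimal sketch rather than part of the setup above: `tunnel_is_up` and `start_tunnel_once` are made-up helper names, it assumes `pgrep` is installed, and the `ServerAliveInterval` option is an assumption added so that a dead tunnel exits on its own and does not block a reconnect.

```shell
#!/bin/bash
# Sketch only: start the port forward unless one is already running.
# tunnel_is_up and start_tunnel_once are hypothetical helper names.

LOCAL_PORT=9922
FORWARD="${LOCAL_PORT}:server.ip:22"

tunnel_is_up() {
    # pgrep -f matches the pattern against each process's full command line.
    pgrep -f "ssh .*-L ${FORWARD}" > /dev/null
}

start_tunnel_once() {
    if tunnel_is_up; then
        echo "port forward already running"
    else
        # ServerAliveInterval (assumption, not in the original script) lets
        # ssh notice a dead connection and exit instead of lingering.
        sudo -u local.username ssh -o ServerAliveInterval=30 \
            -C -f -N -L "${FORWARD}" username@server.ip
    fi
}

# The script in /etc/network/if-up.d/ would end by calling:
#   start_tunnel_once
```

The trade-off is that a hung `ssh` process can delay the new forward until it times out, which is the situation the original script avoids by simply starting another instance.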

`unisonMonitor.sh`:

    #! /bin/bash

    ## in /etc/rc.local, add:
    ## sudo -u local.username /path/to/unisonMonitor.sh &

    unison default ~/Documents ssh://username@localhost:9922//home/vinh/Backup/Documents -repeat watch -times -logfile /tmp/unison.log
    # -times: sync timestamps
    # -repeat watch: real-time synchronization via pyinotify

Add to `/etc/rc.local` before the last line:

    sudo -u local.username /path/to/unisonMonitor.sh &

This turns on unison sync at startup (unison will keep trying to connect to the server if it is disconnected). Again, the implementation will be different for non-Debian-based distros.

`unisonSync.sh`:

    #! /bin/bash

    unison -batch -times ~/Documents ssh://username@localhost:9922//home/vinh/Backup/Documents -logfile /tmp/unison.log

Run `unisonSync.sh` when you want to manually sync the two folders. I also add the following line to `cron` (`crontab -e`) to run a full sync every day at 12:30pm:

    30 12 * * * /path/to/unisonSync.sh

I set up this cron job because `unisonMonitor.sh` will only sync files that have changed while the unison process is running. This daily backup makes sure all my files are in sync at least once a day.

`unisonKill.sh`:

    #! /bin/bash

    ps aux | grep unison | awk '{print $2}' | xargs kill -9

I run this script on the client or server when I want to clean up unison processes. The one drawback of unison’s monitor feature at the moment is that the `unison -server` and `fsmonitor.py` processes on the server are not killed when the unison process stops on the client side. After multiple connects, this will leave a lot of unison processes running on the server. Although I haven’t seen any issues with this, the `unisonKill.sh` script should make cleaning up the processes easier.
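
Note that `grep unison` also matches the `grep` process itself and `unisonKill.sh`, since both have "unison" in their command lines. A more selective variant could look like the sketch below. It is only an illustration: `unison_pids` and `kill_unison` are hypothetical helper names, and it assumes `pgrep` is available and that `fsmonitor.py` shows up by name in the Python process's command line.

```shell
#!/bin/bash
# Sketch only: collect PIDs more selectively than grepping `ps aux`.
# unison_pids and kill_unison are hypothetical helper names.

unison_pids() {
    # -x matches the process name exactly, so this script (unisonKill.sh)
    # and unrelated commands that merely mention "unison" are not selected.
    pgrep -x unison
    # fsmonitor.py runs under the Python interpreter, so match its command line.
    pgrep -f fsmonitor.py
}

kill_unison() {
    local pids
    pids=$(unison_pids)
    if [ -z "$pids" ]; then
        echo "no unison processes found"
        return 0
    fi
    # Send the default TERM signal first; fall back to `kill -9` by hand
    # only if a process refuses to exit.
    kill $pids
}

# The script would end by calling:
#   kill_unison
```

Sending TERM before resorting to `-9` gives unison a chance to shut down cleanly.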

## Start the service
Once these scripts are in their correct locations, first run `unisonSync.sh` to have the initial sync. Then restart the computer. You should see a `unison` and `fsmonitor.py` process by executing `ps aux | grep unison` on the client and server. Also, you should see an `ssh` process corresponding to the port forwarding by executing `ps aux | grep ssh`. Run `touch foo.txt` in the directory that you are watching and see if it appears on the server. Remove it and see if it gets deleted. Good luck!

What are some drawbacks of this setup compared to Dropbox? Well, I can’t revert to versions of files from a previous date, and I don’t have a dedicated Android app for accessing the files. To solve the former, you can set up another cron job that syncs to a different location on your server every few days, giving you access to files that are a few days old. To solve the latter, I’m sure there are Android apps that allow you to access files via the `sftp` protocol.
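
The "older copy" cron job could be sketched roughly as follows. This is an illustration under assumptions, not something I have tested against this setup: the script name `unisonSnapshot.sh`, the `DocumentsSnapshot` folder, and the `snapshot_sync` function are all hypothetical, and it uses Unison's `-force` option so that the copy only ever flows from the client to the snapshot.

```shell
#!/bin/bash
# Sketch only: one-way copy of ~/Documents into a separate snapshot folder.
# SNAPSHOT_ROOT, snapshot_sync and unisonSnapshot.sh are hypothetical names.
#
# Example crontab entry (03:00 on every third day of the month):
#   0 3 */3 * * /path/to/unisonSnapshot.sh

LOCAL_ROOT="$HOME/Documents"
SNAPSHOT_ROOT="ssh://username@localhost:9922//home/vinh/Backup/DocumentsSnapshot"

snapshot_sync() {
    # -force makes the local replica win every difference, so stale files in
    # the snapshot are never propagated back to the client.
    unison -batch -times -force "$LOCAL_ROOT" \
        "$LOCAL_ROOT" "$SNAPSHOT_ROOT" -logfile /tmp/unison-snapshot.log
}

# The cron script would end by calling:
#   snapshot_sync
```

Between runs the snapshot folder lags behind the live backup by up to a few days, which is what makes it useful for recovering an older version of a file.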

8 Comments

  • Duane
    July 26, 2011 - 2:04 am | Permalink

    Hi.
    For your last issue check out Back in Time. It keeps snapshots based on only files that have changed.
    I am interested in this project. I want a low powered media/file system that keeps files synced across multiple computers. When a file is branched then it notifies users for correction. Also when a new PC is added to the system the files are seemlessly synced in idle time or when in need.
    Will keep an eye in your progress.
    Cheers.

  • Alessandro
    December 6, 2011 - 3:06 am | Permalink

    Great implementation!

    I am new to syncing, but I think this is one of the best solutions. You have all the functionalities of Dropbox, without privacy and cost issues!

    But I have a question.
    How you cope with conflicts?

  • Jay Janssen
    December 6, 2011 - 9:35 am | Permalink

    Re: backups. I think you missed the ‘-backups’ option in unison?

  • December 6, 2011 - 9:40 am | Permalink

    @Alessandro what conflicts? I don’t usually have conflicts. If it arises somehow, I just do a manual sync and then restart my real-time sync.

    @Jay What does the ‘-backups’ argument do?

  • Alessandro
    December 6, 2011 - 11:36 am | Permalink

    A conflict happen when both sides change before having a connection again.
    It can be even more hard if you use more than one computer.

    But you are right, I guess it is enough to just run a manual sync, I will try
