Background: git-annex supports storing data in various special remotes. The git-annex assistant will make it easy to configure these, and easy configurators have already been built for a few: removable drives, rsync.net, locally paired systems, and remote servers with rsync.

Help me prioritize my work: What special remote would you most like to use with the git-annex assistant?

[[!poll open=yes 16 "Amazon S3 (done)" 12 "Amazon Glacier (done)" 9 "Box.com (done)" 71 "My phone (or MP3 player)" 23 "Tahoe-LAFS" 10 "OpenStack SWIFT" 31 "Google Drive"]]

This poll is ordered with the options I consider easiest to build listed first, mostly because git-annex already supports them and they only need an easy configurator. The ones at the bottom are likely to need significant work. See cloud for detailed discussion.

Have another idea? Absolutely need two or more? Post comments.

I've been looking at Ceph for various reasons at work. It supports a Swift interface as well as its own RESTful API, so +1 for Swift (and any S3-compatible API).
Comment by Jimmy Thu Sep 13 04:39:59 2012
Swift has its own API but offers an S3 compatibility layer. Last I tried that layer, it did not work.
Comment by Richard Thu Sep 13 05:07:02 2012

Here are a couple of things which are (I think) unique about the Tahoe-LAFS special remote:

  1. encryption ; All of the data is encrypted before leaving your local system and heading for the server (or for the clouds). This is true even though you don't (I think) have to enter an encryption key into git-annex to access your data.

(Note: the above implies that you're in danger of permanently losing access to your data, by losing the last copy of the encryption key, if your local git-annex state is lost. This deserves careful consideration.)

  2. erasure-coding ; You can configure Tahoe-LAFS to spread the data out in a RAID-like way across multiple remote storage servers, where each server holds only, say, 1/3 of the data, but there are, say, 10 different servers, and any 3 of them are sufficient to give you full access to your data. Does that make sense? It uses less bandwidth and storage space than replication (i.e. putting a complete replica of your data on each of 4 or 5 or 10 different storage servers), but it is more robust than sharding (i.e. putting 1/3 of your data on each of three different servers so that if any one of them goes down you lose 1/3 of your data).
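
To put rough numbers on that comparison, here is a small sketch of the arithmetic. It is not Tahoe-LAFS code and does not call any Tahoe-LAFS or git-annex API; the function names and the returned fields are made up for illustration, and it assumes an idealized scheme where each server holds exactly 1/k of the data and any k servers suffice.

```python
from fractions import Fraction

# Hypothetical helpers for illustration only; they model the arithmetic,
# not how Tahoe-LAFS actually stores shares.

def erasure_coding(k, n):
    """k-of-n coding: n servers each hold 1/k of the data; any k rebuild it."""
    return {
        "total_storage": Fraction(n, k),   # in multiples of the original size
        "per_server": Fraction(1, k),
        "failures_tolerated": n - k,       # servers that can vanish without data loss
    }

def replication(copies):
    """Full replication: every server holds a complete copy."""
    return {
        "total_storage": Fraction(copies),
        "per_server": Fraction(1),
        "failures_tolerated": copies - 1,
    }

def sharding(shards):
    """Plain sharding: the data is split once, with no redundancy."""
    return {
        "total_storage": Fraction(1),
        "per_server": Fraction(1, shards),
        "failures_tolerated": 0,           # losing any server loses part of the data
    }

if __name__ == "__main__":
    schemes = [
        ("3-of-10 erasure coding", erasure_coding(3, 10)),
        ("replication on 4 servers", replication(4)),
        ("sharding across 3 servers", sharding(3)),
    ]
    for name, s in schemes:
        print(name, float(s["total_storage"]), s["failures_tolerated"])
```

Under these assumptions, the 3-of-10 layout uploads and stores about 3.3 times the original size and survives the loss of any 7 servers, while 4 full replicas cost 4 times the size and survive only 3 losses, and plain sharding costs nothing extra but survives none.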
Comment by Zooko Fri Oct 12 14:17:42 2012
A Dropbox-like folder which syncs with Google Music. Google Music allows uploading up to 20K songs. Also, using git-annex, I can ensure that I don't need to store all the duplicate MP3s on my local drive. Only one copy of "older music" is "archived" on GMusic, whereas more recent songs are on my local drive.
Comment by Pankaj Wed Oct 31 02:07:32 2012
Comments on this page are closed.