Sparkleshare and dvcs-autosync are tools to automatically commit your changes to git and keep them in sync with other repositories. Unlike git-annex, they don't store the file content on the side, but directly in the git repository. Great for small files, less good for big files.

Here's how to use the git-annex assistant to do the same thing, but even better!


First, get git-annex version 4.20130329 or newer.


Let's suppose you're developing a video game, written in C. You have source code, and some large game assets. You want to ensure the source code is stored in git -- that's what git's for! And you want to store the game assets in the git annex -- to avoid bloating your git repos with possibly enormous files, but still version control them.

All you need to do is configure git-annex to treat your C files as small files. And treat any file larger than, say, 100kb as a large file that is stored in the annex.

git config annex.largefiles "largerthan=100kb and not (include=*.c or include=*.h)"
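
To get a feel for which files that expression would send to the annex, the same rule can be approximated with plain find. This is only a rough sketch, not how git-annex itself evaluates the expression: the helper name preview_largefiles is made up for illustration, "100kb" is assumed to mean 100000 bytes, and -name only looks at the file name rather than the whole path.

```shell
# Hypothetical helper (not part of git-annex): list the files under a
# directory that the expression
#   largerthan=100kb and not (include=*.c or include=*.h)
# would roughly select. Assumes "100kb" means 100000 bytes.
preview_largefiles() {
    dir=${1:-.}
    # Skip git's own directory, then keep regular files larger than
    # 100000 bytes whose names do not end in .c or .h.
    find "$dir" -path "$dir/.git" -prune -o \
        -type f -size +100000c ! -name '*.c' ! -name '*.h' -print
}
```

Running preview_largefiles . at the top of the working tree prints one path per line; anything it does not print would stay in git under this rule.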

Now if you run git annex add, it will only add the large files to the annex. You can git add the small files directly to git.
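
Those two steps can be wrapped up as a small sketch. The function name commit_split is hypothetical, and it assumes you are inside a repository where git annex init has been run and annex.largefiles is configured as above.

```shell
# Hypothetical helper: stage everything in the current repository,
# large files via git-annex and the rest via plain git, then commit.
commit_split() {
    msg=${1:-"add files"}
    git annex add . &&   # files matching annex.largefiles go to the annex
    git add . &&         # whatever is still unstaged goes straight into git
    git commit -q -m "$msg"
}
```

After commit_split "import assets", git annex lookupkey some-file should print a key for annexed files and fail for files that were stored directly in git.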

Better, if you run git annex assistant, it will automatically add the large files to the annex, and store the small files in git. It'll notice every time you modify a file, and immediately commit it, too. And sync it out to other repositories you configure using git annex webapp.


It's also possible to disable the use of the annex entirely, and just have the assistant always put every file into git, no matter its size:

git config annex.largefiles "exclude=*"
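
Since the setting lives in the repository's git config, it can be inspected and undone with git config itself. A minimal sketch follows; the two function names are made up for illustration, and it assumes that with nothing configured git-annex falls back to treating every file as large.

```shell
# Show the current annex.largefiles expression, or say that none is set.
show_largefiles() {
    git config --get annex.largefiles || echo "(annex.largefiles not set)"
}

# Remove the setting again, returning to git-annex's default behavior.
reset_largefiles() {
    git config --unset annex.largefiles
}
```
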
I think you probably meant at least version 4.20130323 ;-)
Comment by http://hands.com/~phil/ Sun Mar 31 17:30:34 2013
I meant 4.20130329
Comment by http://joeyh.name/ Sun Mar 31 18:50:35 2013

I just gave this feature a try, but it seems it doesn't work as expected or maybe I don't understand it:

~/annex/largefilestest % git init
~/annex/largefilestest (git)-[master] % git annex init "test repo"
~/annex/largefilestest (git)-[master] % git config annex.largefiles "not include=*.txt"

Now I copy two files to this directory and add both to the annex

~/annex/largefilestest (git)-[master] % ll
total 100
-rw-rw-r-- 1 tobru tobru 93709 Oct 19 16:14 dpkg-get-selections.txt
-rw-rw-r-- 1 tobru tobru  7256 Jan  6 15:52 x3400.jpg
~/annex/largefilestest (git)-[master] % git annex add .
add x3400.jpg (checksum...) ok
(Recording state in git...)
~/annex/largefilestest (git)-[master] % git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#       new file:   x3400.jpg
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       dpkg-get-selections.txt
~/annex/largefilestest (git)-[master] % ll
total 96
-rw-rw-r-- 1 tobru tobru 93709 Oct 19 16:14 dpkg-get-selections.txt
lrwxrwxrwx 1 tobru tobru   192 Jan  6 15:52 x3400.jpg -> .git/annex/objects/vf/QX/SHA256E-s7256--60e5b69ade5619e37f7fcaa964626da9c415959d861241aa13e2516fffc2dddf.jpg/SHA256E-s7256--60e5b69ade5619e37f7fcaa964626da9c415959d861241aa13e2516fffc2dddf.jpg

So the picture is added to the annex as expected. But the .txt file is not added to git. Do I have to manually add this to git? And why is the picture seen as a new file by git?

The second question could be answered by: "run git annex sync". Is this correct? Because after running this command, git does not see this file as a new file anymore:

~/annex/largefilestest (git)-[master] % git annex sync
commit  
[master (root-commit) a0afb14] git-annex automatic sync
 1 file changed, 1 insertion(+)
 create mode 120000 x3400.jpg
ok
git-annex: no branch is checked out
~/annex/largefilestest (git)-[master] % git status
# On branch master
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#       dpkg-get-selections.txt
nothing added to commit but untracked files present (use "git add" to track)

Like it says in the tip, git annex add will add the large files to the annex. You can add the small files with git add; git-annex won't do that for you.

To automatically add both sorts of files, you can use the git annex watch or git annex assistant daemons. The latter also keeps files in sync between repositories automatically.

(Why did the picture show up as a new file in git? Because you hadn't committed it. This is the same as when you git add a file: it's only staged in the index, and git status will show it as new until you git commit.)

Comment by http://joeyh.name/ Sun Apr 14 18:37:50 2013
Does annex.largefiles support MIME types? For example: git config annex.largefiles "not mimetype=text/plain"

I was wondering whether the annex.largefiles feature is compatible with direct mode?

annex.largefiles does not support mime types. I agree it would be a useful addition.

annex.largefiles can be used with direct mode. I would only recommend using it this way with the assistant, which will keep straight which files are which and commit them appropriately.

Comment by http://joeyh.name/ Thu Sep 19 18:03:29 2013