Work on git-annex is crowdfunded. Joey blogs about his progress here on a semi-daily basis.
My last day before Thanksgiving, getting caught up with some recent bug reports, and in quite a rush to get a lot of fixes in. Adding to the fun, wintery weather means very limited power today.
It was a very productive day, especially for Android, which hopefully has XMPP working again (at least it builds..), halved the size of the package, etc.
Fixed a stupid bug in the automatic v5 upgrade code; annex.version was not being set to 5, and so every git annex command was actually re-running the upgrade.
Fixed another bug I introduced last Friday, which the test suite luckily caught, that broke using some local remotes in direct mode.
Tracked down a behavior that makes git annex sync quite slow on filesystems that don't support symlinks. I need to switch direct mode to not using git commit at all, and use plumbing to make commits there. Will probably work on this over the holiday.
Worked to get git-remote-gcrypt included in every git-annex autobuild bundle. (Except Windows; running a shell script there may need some work later..)
Next I want to work on making the assistant easily able to create encrypted git repositories on removable drives. Which will involve a UI to select which gpg key to use, or creating (and backing up!) a gpg key.
But, I got distracted chasing down some bugs on Windows. These were quite ugly; more direct mode mapping breakage which resulted in files not being accessible. Also fsck on Windows failed to detect and fix the problem. All fixed now. (If you use git-annex on Windows, you should certainly upgrade and run git annex fsck.)
As with most bugs in the Windows port, the underlying cause turned out to be stupid: isSymlink always returned False on Windows. Which makes sense from the perspective of Windows not quite having anything entirely like symlinks. But failed when that was being used to detect when files in the git tree being merged into the repository had the symlink bit set..
Did bug triage. Backlog down to 32 (mostly messages from August).
Upgrades should be working on OSX Mavericks, Linux, and sort of on Android. This needs more testing, so I have temporarily made the daily builds think they are an older version than the last git-annex release. So when you install a daily build, and start the webapp, it should try to upgrade (really downgrade) to the last release. Tests appreciated.
Looking over the whole upgrade code base, it took 700 lines of code to build the whole thing, of which 75 are platform specific (and mostly come down to just 3 or 4 shell commands). Not bad..
Last night, added support for quvi 0.9, which has a completely changed command line interface from the 0.4 version.
Plan to spend tomorrow catching up on bug reports etc and then low activity for rest of the week.
I've been investigating ways to implement a direct mode guard.
Preventing a stray git commit -a or git add doing bad things in a direct mode repository seems increasingly important.
First, considered moving .git, so git won't know it's a git repository. This doesn't seem too hard to do, but there will certainly be unexpected places that assume .git is the directory name.
I dislike it more and more as I think about it though, because it moves direct mode git-annex toward being entirely separate from git, and I don't want to write my own version control system. Nor do I want to complicate the git ecosystem with tools needing to know about git-annex to work in such a repository.
So, I'm happy that one of the other ideas I tried today seems quite promising. Just set core.bare=true in a direct mode repository. This nicely blocks all git commands that operate on the working tree from doing anything, which is just what's needed in direct mode, since they don't know how to handle the direct mode files. But it lets all git commands and other tools that don't touch the working tree continue to be used. You can even run git log file in such a repository (surprisingly!)
It also gives an easy out for anyone who really wants to use git commands that operate on the work tree of their direct mode repository, by just passing -c core.bare=false. And it's really easy to implement in git-annex too -- it can just notice if a repo has core.bare and annex.direct both set, and pass that parameter to every git command it runs. I should be able to get by with only modifying 2 functions to implement this.
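A minimal sketch of that check, in Python rather than git-annex's own Haskell. The `config` dict and the `git_command` helper are hypothetical stand-ins for however the repository's configuration is actually read; only the `-c core.bare=false` parameter comes from the description above.

```python
def git_command(config, args):
    """Build the argv for a git invocation in a possibly guarded repository.

    `config` is a plain dict standing in for the repository's git config
    (an assumption; a real implementation would read it from git itself).
    """
    argv = ["git"]
    guarded = (config.get("core.bare") == "true"
               and config.get("annex.direct") == "true")
    if guarded:
        # Lift the guard for this one command only; the stored config
        # keeps core.bare=true, so stray git commands stay blocked.
        argv += ["-c", "core.bare=false"]
    return argv + list(args)
```

With both settings present, `git_command(config, ["status"])` yields `["git", "-c", "core.bare=false", "status"]`; in any other repository the command line is left untouched.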
Yesterday I spent making a release, and shopping for a new laptop, since this one is dying. (Soon I'll be able to compile git-annex fast-ish! Yay!) And thinking about wishlist: dropping git-annex history.
Today, I added the git annex forget command. It's currently been lightly tested, seems to work, and is living in the forget branch until I gain confidence with it. It should be perfectly safe to use, even if it's buggy, because you can use git reflog git-annex to pull out and revert to an old version of your git-annex branch. So if you've been wanting this feature, please beta test!
I actually implemented something more generic than just forgetting git history. There's now a whole mechanism for git-annex doing distributed transitions of whatever sort is needed.
There were several subtleties involved in distributed transitions:
First is how to tell when a given transition has already been done on a branch. At first I was thinking that the transition log should include the sha of the first commit on the old branch that got rewritten. However, that would mean that after a single transition had been done, every git-annex branch merge would need to look up the first commit of the current branch, to see if it's done the transition yet. That's slow! Instead, transitions are logged with a timestamp, and as long as a branch contains a transition with the same timestamp, it's been done.
A really tricky problem is what to do if the local repository has transitioned, but a remote has not, and changes keep being made to the remote. What it does so far is incorporate the changes from the remote into the index, and re-run the transition code over the whole thing to yield a single new commit. This might not be very efficient (once I write the more full-featured transition code), but it lets the local repo keep up with what's going on in the remote, without directly merging with it (which would revert the transition). And once the remote repository has its git-annex upgraded to one that knows about transitions, it will finish up the transition on its side automatically, and the two branches will once again merge.
Related to the previous problem, we don't want to keep trying to merge from a remote branch when it's not yet transitioned. So a blacklist is used, of untransitioned commits that have already been integrated.
One really subtle thing is that when the user does a transition more complicated than git annex forget, like the git annex forget --dead that I need to implement to forget dead remotes, they're not just telling git-annex to forget whatever dead remotes it knows right now. They're actually telling git-annex to perform the transition one time on every existing clone of the repository, at some point in the future. Repositories with unfinished transitions could hang around for years, and at some future point when git-annex runs in the repository again, it would merge in the current state of the world, and re-do the transition. So you might tell it to forget dead remotes today, and then the very repository you ran that in later becomes dead, and a long-slumbering repo wakes up and forgets about the repo that started the whole process! I hope users don't find this massively confusing, but that's how the implementation works right now.
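The timestamp check and the blacklist of already-integrated commits can be sketched roughly as follows. This is Python for illustration only, not the real Haskell code; the log representation (a list of name and timestamp pairs), the transition name, and all function names are assumptions.

```python
def transitions_done(branch_log):
    """Set of (transition name, timestamp) pairs recorded on a branch."""
    return set(branch_log)

def needs_transition(local_log, remote_log):
    """True when the local branch records a transition the remote lacks.

    Comparing timestamps avoids looking up the first commit of a branch
    on every merge.
    """
    return bool(transitions_done(local_log) - transitions_done(remote_log))

def should_integrate(remote_commit, local_log, remote_log, integrated):
    """Decide whether a remote branch's changes still need folding in."""
    if not needs_transition(local_log, remote_log):
        return True  # both sides agree, an ordinary merge is safe
    # Remote is untransitioned: integrate each of its commits only once,
    # using the blacklist of commits that were already handled.
    return remote_commit not in integrated
```

For example, with `local_log = [("ForgetGitHistory", 1383000000)]` and an empty remote log, the remote commit is integrated once and then skipped on later attempts, until the remote finishes the same transition.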
I think I have at least two more days of work to do to finish up this feature.
I still need to add some extra features like forgetting about dead remotes, and forgetting about keys that are no longer present on any remote.
After git annex forget, git annex sync will fail to push the synced/annex branch to remotes, since the branch is no longer a fast-forward of the old one. I will probably fix this by making git annex sync do a fallback push of a unique branch in this case, like the assistant already does. Although I may need to adjust that code to handle this case, too..
For some reason the automatic transitioning code triggers a "(recovery from race)" commit. This is certainly a bug somewhere, because you can't have a race with only 1 participant.
Today's work was sponsored by Richard Hartmann.
Completely finished up with making the assistant detect when git-annex's binary has changed and handling the restart.
It's a bit tricky because during an upgrade there can be two assistant daemons running at the same time, in the same repository. Although I disable the watcher of the old one first. Luckily, git-annex has long supported running multiple concurrent git-annex processes in the same repository.
The surprisingly annoying part turned out to be how to make the webapp redirect the browser to the new url when it's upgraded. Particularly needed when automatic upgrades are enabled, since the user will not then be taking any action in the webapp that could result in a redirect. My solution to this feels like overkill; the webapp does ajax long polling until it gets an url, and then redirects to it. Had to write javascript code and ugh.
But, that turned out to also be useful when manually restarting the webapp (removed some horrible old code that ran a shell script to do it before), and also when shutting the webapp down.
Getting back to upgrades, I have the assistant downloading the upgrade, and running a hook action once the key is transferred. Now all I need is some platform-specific code to install it. Will probably be hairy, especially on OSX where I need to somehow unmount the old git-annex dmg and mount the new one, from within a program running on the old dmg.
Today's work was sponsored by Evan Deaubl.
The difference picking the right type can make! Last night, I realized that where I had a distributionSha256sum :: String, I should instead use distributionKey :: Key. This means that when git-annex is eventually downloading an upgrade, it can treat it as just another Key being downloaded from the web. So the webapp will show that transfer along with all the rest, and I can leverage tons of code for a new purpose. For example, it can simply fsck the key once it's downloaded to verify its checksum.
Also, built a DistributionUpdate program, which I'll run to generate the info files for a new version. And since I keep git-annex releases in a git-annex repo, this too leverages a lot of git-annex modules, and ended up being just 60 easy lines of code. The upgrade notification code is tested and working now.
And, I made the assistant detect when the git-annex program binary is replaced or modified. Used my existing DirWatcher code for that. The plan is to restart the assistant on upgrade, although I need to add some sanity checks (eg, reuse the lsof code) first. And yes, this will work even for apt-get upgrade!
Today's work was sponsored by Paul Tötterman
Still working on the git repair code. Improved the test suite, which found some more bugs, and so I've been running tests all day and occasionally going and fixing a bug in the repair code. The hardest part of repairing a git repo has turned out to be reliably determining which objects in it are broken. Bugs in git don't help (but the git devs are going to fix the one I reported).
But the interesting new thing today is that I added some upgrade alert code to the webapp. Ideally everyone would get git-annex and other software as part of an OS distribution, which would include its own upgrade system -- But the survey tells me that a quarter of installs are from the prebuilt binaries I distribute.
So, those builds are going to be built with knowledge of an upgrade url, and will periodically download a small info file (over https) to see if a newer version is available, and show an alert.
I think all that's working, though I have not yet put the info files in place and tested it. The actual upgrade process will be a manual download and reinstall, to start with, and then perhaps I'll automate it further, depending on how hard that is on the different platforms.
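The core of the upgrade alert is a version comparison between the running build and the version named in the downloaded info file. A rough Python sketch, assuming dotted numeric version strings; the function names and the example versions are illustrative, and the info file's actual format is not shown here.

```python
def parse_version(version):
    """Split a dotted numeric version string into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))

def upgrade_available(installed, advertised):
    """True when the advertised version is newer than the running one."""
    return parse_version(advertised) > parse_version(installed)
```

Comparing tuples rather than raw strings keeps the ordering numeric, so a build that reports an older version than the advertised release is always offered that release.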
Pushed out a minor release of git-annex today, mostly to fix build problems on Debian. No strong reason to upgrade to it otherwise.
Continued where I left off with the Git.Destroyer. Fixed quite a lot of edge cases where git repair failed due to things like a corrupted .git/HEAD file (this makes git think it's not in a git repository), corrupt git objects that have an unknown object type and so crash git hard, and an interesting failure mode where git fsck wants to allocate 116 GB of memory due to a corrupted object size header. Reported that last to the git list, as well as working around it.
At the end of the day, I ran a test creating 10000 corrupt git repositories, and all of them were recovered! Any improvements will probably involve finding new ways to corrupt git repositories that my code can't think of. ;)
Finished the direct mode guard, including the new git annex status command.
Spent the rest of the day working on various bug fixes. One of them turned into rather a lot of work to make the webapp's UI better for git remotes that do not have an annex.uuid.
Annoyingly, the Android 4.3 fix breaks git-annex on Android 4.0 (probably through 4.2), so I now have two separate builds of the Android app.
Worked on Windows porting today. I've managed to get the assistant and watcher (but not yet webapp) to build on Windows. The git annex transferrer interface needs POSIX stuff, and seems to be the main thing that will need porting for Windows for the assistant to work, besides of course file change detection. For that, I've hooked up Win32-notify.
So the watcher might work on Windows. At least in theory. Problem is, while all the code builds ok, it fails to link:
ghc.exe: could not execute: C:\Program Files (x86)\Haskell Platform\2012.4.0.0\lib/../mingw/bin/gcc.exe
I wonder if this is a case of too many parameters being passed?
This happens both on the autobuilder and on my laptop, so I'm stuck here. Oh well, I was not planning to work on this anyway until February...
Finally found the root cause of the Android 4.3/4.4 trouble, and a fix is now in place!
As a bonus, it looks like I've fixed a problem accessing the environment on Android that had been worked around in an ugly way before.
Big thanks to my remote hands Michael Alan, Sören, and subito. All told they ran 19 separate tests to help me narrow down this tricky problem, often repeating long command lines on software keyboards.
The user survey is producing some interesting and useful results!
Added two more polls: using with and blocking problems
(There were some load issues so if you were unable to vote yesterday, try
again..)
Worked on getting the autobuilder for OS X Mavericks set up. Eventually succeeded, after patching a few packages to work around a cpp that thinks it should parse haskell files as if they're C code. Also, Jimmy has resuscitated the OS X Lion autobuilder.
A not too bad bug in automatic merge conflict resolution has been reported, so I will need to dig into that tomorrow. Didn't feel up to it today, so instead have been spending the remaining time finishing up a branch that switches the test suite to use the tasty test framework.
Started by tracking down a strange bug that was apparently Ubuntu-specific and caused git-annex branch changes to get committed to master. Root cause turned out to be a failure to recover from an exception. I'm kicking myself about that, because I remember looking at the code where the bug was at least twice before and thinking "hmm, should add exception handling here? nah..". Exceptions are horrible.
Made a release with a fix for that and a few minor other accumulated changes since last Friday's release. The main point of this release is to fix building without the webapp (so it will propagate to Debian testing, etc). This release does not include the direct mode guard, so I'll have a few weeks until the next release to get that tested.
Fixed the test suite in directguard. This branch is now nearly ready to merge to master, but one command that is badly needed in guarded direct mode is "git status". So I am planning to rename "git annex status" to "git annex info", and make "git annex status" display something similar to "git status".
Also took half an hour and added optional EKG support to git-annex. This is a Haskell library that can add a terrific monitoring console web UI to any program in 2 lines of code. Here we can see the git-annex webapp using resources at startup, followed in a few seconds by the assistant's startup scan of the repository.
BTW, Kevin tells me that the machine used to build git-annex for OSX is going to be upgraded to 10.9 soon. So, hopefully I'll be making autobuilds of that. I may have to stop the 10.8.2 autobuilds though.
Today's work was sponsored by Protonet.
Wrote some evil code you don't want to run today. Git.Destroyer randomly generates Damage, and applies it to a git repository, in a way that is reproducible -- applying the same Damage to clones of the same git repo will always yield the same result.
This let me build a test harness for git-repair, which repeatedly clones, damages, and repairs a repository. And when it fails, I can just ask it to retry after fixing the bug and it'll re-run every attempt it's logged.
This is already yielding improvements to the git-repair code. The first randomly constructed Damage that it failed to recover turned out to be a truncated index file that hid some other corrupted object files from being repaired.
[Damage Empty (FileSelector 1),
Damage Empty (FileSelector 2),
Damage Empty (FileSelector 3),
Damage Reverse (FileSelector 3),
Damage (ScrambleFileMode 3) (FileSelector 5),
Damage Delete (FileSelector 9),
Damage (PrependGarbage "\SOH\STX\ENQ\f\a\ACK\b\DLE\n") (FileSelector 9),
Damage Empty (FileSelector 12),
Damage (CorruptByte 11 25) (FileSelector 6),
Damage Empty (FileSelector 5),
Damage (ScrambleFileMode 4294967281) (FileSelector 14)
]
I need to improve the ranges of files that it damages -- currently QuickCheck seems to only be selecting one of the first 20 or so files. Also, it's quite common that it will damage .git/config so badly that git thinks it's not a git repository anymore. I am not sure if that is something git-repair should try to deal with.
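The reproducibility property is the interesting part, and it can be illustrated with a small Python sketch. This is not Git.Destroyer itself: the in-memory "repository" (a dict of file names to bytes), the three damage actions, and the function names are all simplifying assumptions.

```python
import random

def generate_damage(seed, count, max_selector=20):
    """Produce a reproducible list of (action, file selector) pairs."""
    rng = random.Random(seed)  # seeded, so the same seed gives the same list
    actions = ["empty", "reverse", "delete"]
    return [(rng.choice(actions), rng.randrange(max_selector))
            for _ in range(count)]

def apply_damage(files, damage):
    """Apply damage to a copy of `files`, a dict of name -> bytes."""
    result = dict(files)
    for action, selector in damage:
        names = sorted(result)  # fixed ordering keeps selection deterministic
        if not names:
            break
        target = names[selector % len(names)]
        if action == "empty":
            result[target] = b""
        elif action == "reverse":
            result[target] = result[target][::-1]
        elif action == "delete":
            del result[target]
    return result
```

Because the selector is resolved against a sorted file list, applying one damage list to two identical clones always produces identical wreckage, which is what lets a failed repair attempt be logged and replayed after a fix.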
Today's work was sponsored by the WikiMedia Foundation.
Fixed two difficult bugs with direct mode. One happened (sometimes) when a file was deleted and replaced with a directory by the same name and then those changes were merged into a direct mode repository.
The other problem was that direct mode did not prevent writes to .git/annex/objects the way that indirect mode does, so when a file in the repository was not currently present, writing to the dangling symlink would follow it and write into the object directory.
Hmm, I was going to say that it's a pity that direct mode still has so many bugs being found and fixed, but the last real bug fix to direct mode was made last May! Instead, I probably have to thank Tim for being a very thorough tester.
Finished switching the test suite to use the tasty framework, and prepared tasty packages for Debian.
Release today, right on bi-weekly schedule. Rather startled at the size of the changelog for this one; along with the direct mode guard, it adds support for OS X Mavericks, Android 4.3/4.4, and fixes numerous bugs.
Posted another question in the survey, http://git-annex-survey.branchable.com/polls/2013/roadmap/.
Spun off git-repair as an independent package from git-annex. Of course, most of the source code is shared with git-annex. I need to do something with libraries eventually..
Been chipping away at my backlog of messages, and it's down to 23 items.
Finally managed to get ghc to build with a newer version of the NDK. This might mean a solution to git-annex on Android 4.2. I need help with testing.
One of my goals for this month is to get a better sense of how git-annex is being used, how it's working out for people, and what areas need to be concentrated on. To start on that, I am doing the 2013 git-annex user survey, similar to the git user surveys. I will be adding some less general polls later (suggestions for topics appreciated!), but you can go vote in any or all of 10 polls now.
Found a workaround for yesterday's Windows build problem. Seems that only cabal runs gcc in a way that fails, so ghc --make builds it successfully.
However, the watcher doesn't quite work on Windows. It does get events when
files are created, but it seems to then hang before it can add the file to
git, or indeed finish printing out a debug log message about the event.
This looks like it could be a problem with the threaded ghc runtime on
Windows, or something like that.
Main work today was improving the git repository repair to handle corrupt index files. The assistant can now start up, detect that the index file is corrupt, and regenerate it all automatically.
About half way done with a gcrypt special remote. I can initremote it (the hard part to get working), and can send files to it. Can't yet get files back, or remove files, and only local repositories work so far, but this is enough to know it's going to be pretty nice!
Did find one issue in gcrypt that I may need to develop a patch for: https://github.com/blake2-ppc/git-remote-gcrypt/issues/3
Being still a little unsure of the UI and complexity for configuring gcrypt on ssh servers, I thought I'd start today with the special case of gcrypt on rsync.net. Since rsync.net allows running some git commands, gcrypt can be used to make encrypted git repositories on it.
Here's the UI I came up with. It's complicated a bit by needing to explain the tradeoffs between the rsync and gcrypt special remotes.
This works fine, but I did not get a chance to add support for enabling existing gcrypt repos on rsync.net. Anyway, most of the changes to make this work will also make it easier to add general support for gcrypt on ssh servers.
Also spent a while fixing a bug in git-remote-gcrypt. Oddly gpg --list-keys --fast-list --fingerprint does not show the fingerprints of some keys.
Today's work was sponsored by Cloudier - Thomas Djärv.
Did various bug fixes and followup today. Amazing how a day can vanish that way. Made 4 actual improvements.
I still have 46 messages in unanswered backlog. Although only 8 of them are from this month.
Got well caught up on bug fixes and traffic. Backlog is down to 40.
Made the assistant wait for a few seconds before doing the startup scan when it's autostarted, since the desktop is often busy starting up at that same time.
Fixed an ugly bug with chunked webdav and directory special remotes that caused it to not write a "chunkcount" file when storing data, so it didn't think the data was present later. I was able to make it recover nicely from that mistake, by probing for what chunks are actually present.
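Recovering from the missing "chunkcount" file amounts to probing for consecutive chunks until one is absent. A Python sketch of that idea; `has_chunk` is a hypothetical callback that asks the remote whether chunk number n exists, and the safety limit is an arbitrary assumption.

```python
def probe_chunk_count(has_chunk, limit=10000):
    """Count consecutive chunks (numbered from 1) present on a remote.

    `has_chunk(n)` is a hypothetical callback returning True when chunk n
    of the key is stored; probing stops at the first missing chunk.
    """
    count = 0
    while count < limit and has_chunk(count + 1):
        count += 1
    return count
```

A result of 0 means no chunks were found, so the data really is absent; any other result can stand in for the chunk count that was never written.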
Several people turn out to have had problems with git annex sync not working because receive.denyNonFastForwards is enabled. I made the webapp not enable it when setting up a ssh repository, and I made git annex sync print out a hint about this when it's failed to push. (I don't think this problem affects the assistant's own syncing.)
Made the assistant try to repair a damaged git repository without prompting. It will only prompt when it fails to fetch all the lost objects from remotes.
Glad to see that others have managed to get git-annex to build on Mac OS X 10.9. Now I just need someone to offer up a ssh account on that OS, and I could set up an autobuilder for it.
Now the webapp can set up encrypted repositories on removable drives.
This UI needs some work, and the button to create a new key is not wired up. Also if you have no gpg agent installed, there will be lots of password prompts at the console.
Forked git-remote-gcrypt to fix a bug. Hopefully my patch will be merged; for now I recommend installing my forked version.
Today's work was sponsored by Romain Lenglet.
Spent basically all of today getting the assistant to be able to handle gcrypt special remotes that already exist when it's told to add a USB drive. This was quite tricky! And I did have to skip handling gcrypt repos that are not git-annex special remotes.
Anyway, it's now almost easy to set up an encrypted sneakernet using a USB drive and some computers running the webapp. The only part that the assistant doesn't help with is gpg key management.
Plan is to make a release on Friday, and then try to also add support for encrypted git repositories on remote servers. Tomorrow I will try to get through some of the communications backlog that has been piling up while I was head down working on gcrypt.
I decided to keep gpg key generation very simple for now. So it generates a special-purpose key that is only intended to be used by git-annex. It hardcodes some key parameters, like RSA and 4096 bits (maximum recommended by gpg at this time). And there is no password on the key, although you can of course edit it and set one. This is because anyone who can access the computer to get the key can also look at the files in your git-annex repository. Also because I can't rely on gpg-agent being installed everywhere. All these simplifying assumptions may be revisited later, but are enough for now for someone who doesn't know about gpg (so doesn't have a key already) and just wants an encrypted repo on a removable drive.
Put together a simple UI to deal with gpg taking quite a while to generate a key ...
Then I had to patch git-remote-gcrypt again, to have a per-remote signingkey setting, so that these special-purpose keys get used for signing their repo.
Next, need to add support for adding an existing gcrypt repo as a remote (assuming it's encrypted to an available key). Then, gcrypt repos on ssh servers..
Also dealt with build breakage caused by a new version of the Haskell DNS library.
Today's work was sponsored by Joseph Liu.
Worked on making the assistant able to merge in existing encrypted git repositories from rsync.net.
This had two parts. First, making the webapp UI where you click to enable a known special remote work with these encrypted repos. Secondly, handling the case where a user knows they have an encrypted repository on rsync.net, so enters in its hostname and path, but git-annex doesn't know about that special remote. The second case is important, for example, when the encrypted repository is a backup and you're restoring from it. It wouldn't do for the assistant, in that case, to make a new encrypted repo and push it over top of your backup!
Handling that was a neat trick. It has to do quite a lot of probing, including downloading the whole encrypted git repo so it can decrypt it and merge it, to find out about the special remote configuration used for it. This all works with just 2 ssh connections, and only 1 ssh password prompt max.
Next, on to generalizing this rsync.net specific code to work with arbitrary ssh servers!
Today's work was made possible by RMS's vision 30 years ago.
Productive day, but I'm wiped out. Backlog down to 51.
Low activity the past couple of days. Released a new version of git-annex yesterday. Today fixed three bugs (including a local pairing one that was pretty complicated) and worked on getting caught up with traffic.
All command line stuff today..
Added --want-get and --want-drop, which can be used to test preferred content settings of a repository. For example git annex find --in . --want-drop will list the same files that git annex drop --auto would try to drop. (Also renamed git annex content to git annex wanted.)
Finally laid to rest problems with git annex unannex when multiple files point to the same key. It's a lot slower, but I'll stop getting bug reports about that.
I try hard to keep this devblog about git-annex development and not me. However, it is a shame that what I wanted to be the beginning of my first real month of work funded by the new campaign has been marred by my home's internet connection being taken out by a lightning strike, and by illness. Nearly back on my feet after that, and waiting for my new laptop to finally get here.
Today's work: Finished up the git annex forget feature and merged it in. Fixed the bug that was causing the commit race detection code to incorrectly fire on the commit made by the transition code. Few other bits and pieces.
Solid day of working on repository recovery. Got git recover-repository --force working, which involves fixing up branches that refer to missing objects. Mostly straightforward traversal of git commits, trees, blobs, to find when a branch has a problem, and identify an old version of it that predates the missing object. (Can also find them in the reflog.)
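The traversal can be pictured with a toy object graph. In this Python sketch, `commits` maps a commit id to its parents and the trees and blobs it needs, and `present` is the set of objects that still exist; both structures and the function names are assumptions made for illustration, and only first parents are followed when rewinding.

```python
def is_intact(commit, commits, present, cache):
    """A commit is intact if it, its objects, and all ancestors exist."""
    if commit in cache:
        return cache[commit]
    info = commits.get(commit)
    ok = (commit in present
          and info is not None
          and all(obj in present for obj in info["objects"])
          and all(is_intact(p, commits, present, cache)
                  for p in info["parents"]))
    cache[commit] = ok
    return ok

def last_good_version(tip, commits, present):
    """Walk back from a branch tip to the newest fully intact commit."""
    cache = {}
    commit = tip
    while commit is not None:
        if is_intact(commit, commits, present, cache):
            return commit
        info = commits.get(commit)
        parents = info["parents"] if info else []
        commit = parents[0] if parents else None
    return None  # nothing recoverable by walking history
```

When no intact ancestor is found, the result is `None`, which is the point where falling back to the reflog becomes worthwhile.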
The main complication turned out to be that git branch -D and git show-ref don't behave very well when the commit objects pointed to by refs are themselves missing. And git has no low-level plumbing that avoids falling over these problems, so I had to write it myself.
Testing has turned up one unexpected problem: Git's index can itself refer to missing objects, and that will break future commits, etc. So I need to find a way to validate the index, and when it's got problems, either throw it out, or possibly recover some of the staged data from it.
Implemented git annex forget --drop-dead, which is finally a way to remove all references to old repositories that you've marked as dead. I've still not merged in the forget branch, because I developed this while slightly ill, and have not tested it very well yet.
Finished up the automatic recovery from stale lock files. Turns out git has quite a few lock files; the assistant handles them all.
Improved URL and WORM keys so the filenames used for them will always work on FAT (which has a crazy assortment of illegal characters). This is a tricky thing to deal with without breaking backwards compatibility, so it's only dealt with when creating new URL or WORM keys.
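One way such escaping could look, as a rough Python sketch. The set of illegal characters and the percent-style escape are assumptions chosen for illustration, not necessarily the scheme git-annex uses.

```python
# Characters assumed to be unusable in FAT filenames (illustrative set).
FAT_ILLEGAL = set('"*/:<>?\\|')

def fat_safe(name):
    """Escape characters that FAT rejects, keeping the result reversible."""
    out = []
    for ch in name:
        # '%' is escaped too, so an escaped name can be decoded unambiguously.
        if ch in FAT_ILLEGAL or ch == "%" or ord(ch) < 32:
            out.append("%{:02X}".format(ord(ch)))
        else:
            out.append(ch)
    return "".join(out)
```

Names that contain no problem characters pass through unchanged.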
I think my next step in this disaster recovery themed month will be adding periodic incremental fsck to the assistant. git annex fsck can already do an incremental fsck, so this should mostly involve adding a user interface to the webapp to configure when it should fsck. For example, you might choose to run it for up to 1 hour every night, with a goal of checking all your files once per month. Also will need to make the assistant do something useful when fsck finds a bad file (ie, queue a re-download).
Fixed a lot of bugs in the assistant's fsck handling today, and merged it into master. There are some enhancements that could be added to it, including fscking ssh remotes via git-annex-shell and adding the ability to schedule events to run every 30 days instead of on a specific day of the month. But enough on this feature for now.
Today's work was sponsored by Daniel Brockman.
Got git annex sync working with gcrypt. So went ahead and made a release today. Lots of nice new features!
Unfortunately the linux 64 bit daily build is failing, because my build host only has 2 gb of memory and it is no longer enough. I am looking for a new build host, ideally one that doesn't cost me $40/month for 3 gb of ram and 15 gb of disk. (Extra special ideally one that I can run multiple builds per day on, rather than the current situation of only building overnight to avoid loading the machine during the day.) Until this is sorted out, no new 64 bit linux builds..
Spent most of the day building some generic types for scheduling recurring events. Not sure if rolling my own was a good idea, but that's what I did.
In the incrementalfsck branch, I have hooked this up in git-annex vicfg
,
which now accepts and parses scheduled events like
"fsck self every day at any time for 60 minutes" and
"fsck self on day 1 of weeks divisible by 2 at 3:45 for 120 minutes", and
stores them in the git-annex branch. The exact syntax is of course subject
to change, but also doesn't matter a whole lot since the webapp will have
a better interface.
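To make the shape of those schedule strings concrete, here is a minimal Python sketch that parses just the two examples quoted above. The grammar, the `ScheduledFsck` type and the `parse_schedule` function are all hypothetical and invented for illustration; git-annex's real parser is written in Haskell, accepts more forms, and its syntax may change.

```python
import re
from dataclasses import dataclass

# Hypothetical grammar covering only the two example strings quoted above;
# this is not git-annex's actual schedule syntax definition.
SCHEDULE_RE = re.compile(
    r"^fsck (?P<target>\S+) "
    r"(?P<recurrence>.+?) "
    r"at (?P<time>any time|\d{1,2}:\d{2}) "
    r"for (?P<minutes>\d+) minutes$"
)


@dataclass(frozen=True)
class ScheduledFsck:
    target: str      # what to fsck, e.g. "self"
    recurrence: str  # kept as free text in this sketch
    time: str        # "any time" or "H:MM"
    minutes: int     # how long the fsck is allowed to run


def parse_schedule(line: str) -> ScheduledFsck:
    """Parse one schedule line, raising ValueError if it does not match."""
    m = SCHEDULE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised schedule: {line!r}")
    return ScheduledFsck(m["target"], m["recurrence"], m["time"], int(m["minutes"]))
```

The recurrence is deliberately left as an opaque string here; turning it into a structured value is the part the generic scheduling types take care of.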
I think that git-recover-repository is ready now. Made it deal with the index file referencing corrupt objects. The best approach I could think of for that is to just remove those objects from the index, so the user can re-add files from their work tree after recovery.
Now to integrate this git repository repair capability into the git-annex
assistant. I decided to run git fsck
as part of a scheduled
repository consistency check. It may also make sense for the assistant to
notice when things are going wrong, and suggest an immediate check. I've
started on the webapp UI to run a repository repair when fsck detects
problems.
Did I say it would be easy to make the webapp detect when a gcrypt repository already existed and enable it? Well, it wasn't exactly hard, but it took over 300 lines of code and 3 hours..
So, gcrypt support is done for now. The glaring omission is gpg key management for sharing gcrypt repositories between machines and/or people. But despite that, I think it's solid, and easy to use, and covers some great use cases.
Pushed out a release.
Now I really need to start thinking about disaster recovery.
Today's work was sponsored by Dominik Wagenknecht.
Long, long day coding up the direct mode guard today. About 90% of the fun
is dealing with receive.denyCurrentBranch
not preventing pushes that
change the current branch, now that core.bare is set in direct mode.
My current solution to this involves using a special branch when using
direct mode, which nothing will ever push to (hopefully). A much nicer
solution would be to use an update
hook to deny pushes of the current
branch -- but there are filesystems where repos cannot have git hooks.
The test suite is falling over, but the directguard
branch otherwise
seems usable.
Today's work was sponsored by Carlo Matteo Capocasa.
A long day of bugfixing. Split into two major parts. First I got back to a bug I filed in August to do with the assistant misbehaving when run in a subdirectory of a git repository, and did a nice type-driven fix of the underlying problem (that also found and fixed some other related bugs that would not normally occur). Then, spent 4 hours in Windows purgatory working around crazy path separator issues.
Built everything needed to run a fsck when a remote gets connected. Have not tested it; only testing is blocking merging the incrementalfsck branch now.
Also updated the OSX and Android builds to use a new gpg release (denial of service security fix), and updated the Debian backport, and did a small amount of bug fixing. I need to do several more days of bug fixing once I get this incremental fsck feature wrapped up before moving on to recovery of corrupt git repositories.
Now I can build git-annex twice as fast! And a typical incremental build is down to 10 seconds, from 51 seconds.
Spent a productive evening working with Guilhem to get his encryption patches reviewed and merged. Now there is a way to remove revoked gpg keys, and there is a new encryption scheme available that uses public key encryption by default rather than git-annex's usual approach. That's not for everyone, but it is a good option to have available.
I've started a new page for my devblog, since I'm not focusing extensively on the assistant and so keeping the blog here increasingly felt wrong. Also, my new year of crowdfunded development formally starts in September, so a new blog seemed good.
Started work on gcrypt support.
The first question is, should git-annex leave it up to gcrypt to transport the data to the encrypted repository on a push/pull? gcrypt hooks into git nicely to make that just work. However, if I go this route, it limits the places the encrypted git repositories can be stored to regular git remotes (and rsync). The alternative is to somehow use gcrypt to generate/consume the data, but use the git-annex special remotes to store individual files. Which would allow for a git repo stored on S3, etc. For now, I am going with the simple option, but I have not ruled out trying to make the latter work. It seems it would need changes to gcrypt though.
Next question: Given a remote that uses gcrypt, how do I determine the
annex.uuid of that repository? I found a nice solution to this. gcrypt has
its own gcrypt-id, and I convert it to a UUID in a
reproducible, and even standards-compliant way. So
the same encrypted remote will automatically get the same annex.uuid
wherever it's used. Nice. Does mean that git-annex cannot find a uuid
until git pull or git push has been used, to let gcrypt get the
gcrypt-id. Implemented that.
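One standards-compliant way to get a reproducible UUID from an arbitrary string is a name-based (version 5, SHA-1) UUID as defined in RFC 4122. The Python sketch below shows that technique only; whether it matches git-annex's conversion exactly is an assumption, and the namespace and the sample gcrypt-id are made up for illustration.

```python
import uuid

# Hypothetical namespace, chosen only for this sketch; the namespace
# git-annex really uses is not stated in the post.
EXAMPLE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "http://example.com/gcrypt")


def uuid_from_gcrypt_id(gcrypt_id: str) -> uuid.UUID:
    """Derive a name-based (version 5) UUID from a gcrypt-id string."""
    return uuid.uuid5(EXAMPLE_NAMESPACE, gcrypt_id)


if __name__ == "__main__":
    # The same input always yields the same UUID, on any machine.
    print(uuid_from_gcrypt_id("example-gcrypt-id"))
```

Because the result depends only on the namespace and the gcrypt-id, every clone that sees the same encrypted remote arrives at the same UUID without any coordination.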
The next step is actually making git-annex store data on gcrypt remotes.
And it needs to store it encrypted of course. It seems best to avoid
needing a git annex initremote
for these gcrypt remotes, and just have
git-annex automatically encrypt data stored on them. But I don't
know. Without initializing them like a special remote is, I'm limited to
using the gpg keys that gcrypt is configured to encrypt to, and cannot use
the regular git-annex hybrid encryption scheme. Also, I need to generate
and store a nonce anyway to HMAC encrypt keys. (Or modify gcrypt
to put enough entropy in gcrypt-id that I can use it?)
Another concern I have is that gcrypt's own encryption scheme is simply to use a list of public keys to encrypt to. It would be nicer if the full set of git-annex encryption schemes could be used. Then the webapp could use shared encryption to avoid needing to make the user set up a gpg key, or hybrid encryption could be used to add keys later, etc.
But I see why gcrypt works the way it does. Otherwise, you can't make an encrypted repo with a friend set as one of the participants and have them be able to git clone it. Both hybrid and shared encryption store a secret inside the repo, which is not accessible if it's encrypted using that secret. There are use cases where not being able to blindly clone a gcrypt repo would be ok. For example, you use the assistant to pair with a friend and then set up an encrypted repo in the cloud for both of you to use.
Anyway, for now, I will need to deal with setting up gpg keys etc in the assistant. I don't want to tackle full gpgkeys yet. Instead, I think I will start by adding some simple stuff to the assistant:
- When adding a USB drive, offer to encrypt the repository on the drive so that only you can see it.
- When adding a ssh remote make a similar offer.
- Add a UI to add an arbitrary git remote with encryption. Let the user paste in the url to an empty remote they have, which could be to eg github. (In most cases this won't be used for annexed content..)
- When the user has no gpg key, prompt to set one up. (Securely!)
- Maybe have an interface to add another gpg key that can access the gcrypt repo. Note that this will need to re-encrypt and re-push the whole git history.
So close to being done with gcrypt support.. But still not quite there.
Today I made the UI changes to support gcrypt when setting up a repository on a ssh server, and improved the probing and data types so it can tell which options the server supports. Fairly happy with how that is turning out.
Have not yet hooked up the new buttons to make gcrypt repos. While I was testing that my changes didn't break other stuff, I found a bug in the webapp that caused it to sometimes fail to transfer one file to/from a remote that was just added, because the transferrer process didn't know about the new remote yet, and crashed (and was restarted knowing about it, so successfully sent any other files). So got sidetracked on fixing that.
Also did some work to make the gpg bundled with git-annex on OSX be compatible with the config files written by MacGPG. At first I was going to hack it to not crash on the options it didn't support, but it turned out that upgrading to version 1.4.14 actually fixed the problem that was making it build without support for DNS.
Today's work was sponsored by Thomas Hochstein.
Finally got the assistant to repair git repositories on removable drives,
or other local repos. Mostly this happens entirely automatically: whatever
data in the git repo on the drive has been corrupted can just be copied
to it from ~/annex/.git.
And, the assistant will launch a git fsck of such a repo whenever it fails to sync with it, so the user does not even need to schedule periodic fscks. Although it's still a good idea, since some git repository problems don't prevent syncing from happening.
Watching git annex heal problems like this is quite cool!
One thing I had to defer till later is repairing corrupted gcrypt
repositories. I don't see a way to do it without deleting all the objects
in the gcrypt repository, and re-pushing everything. And even doing that
is tricky, since the gcrypt-id
needs to stay the same.
Last night, built this nice user interface for configuring periodic fscks:
Rather happy that that whole UI needed only 140 lines of code to build. Though rather more work behind it, as seen in this blog..
Today I added some support to git-annex for smart fscking of remotes. So far only git repos on local drives, but this should get extended to git-annex-shell for ssh remotes. The assistant can also run periodic fscks of these.
Still need to test that, and find a way to make a removable drive's fsck job run when the drive gets plugged in. That's where picking "any time" will be useful; it'll let you configure fscking of removable drives when they're available, as long as they have not been fscked too recently.
Today's work was sponsored by Georg Bauer.
Long day, but I did finally finish up with gcrypt support. More or less.
Got both creating and enabling existing gcrypt repositories on ssh servers working in the webapp. (But I ran out of time to make it detect when the user is manually entering a gcrypt repo that already exists. Should be easy so maybe tomorrow.)
Fixed several bugs in git-annex's gcrypt support that turned up in testing.
Made git-annex ensure that a gcrypt repository does not have
receive.denyNonFastForwards set, because gcrypt relies on always forcing
the push of the branch it stores its manifest on. Fixed a bug in
git-annex-shell recvkey
when it was receiving a file from an annex in
direct mode.
Also had to add a new git-annex-shell gcryptsetup
command, which is
needed to make setting up a gcrypt repository work when the assistant
has set up a locked-down ssh key that can only run git-annex-shell. Painted
myself into a bit of a corner there.
And tested, tested, tested. So many possibilities and edge cases in this part of the code..
Today's work was sponsored by Hendrik Müller Hofstede.
Goal for the rest of the month is to build automatic recovery from git repository corruption. Spent today investigating how to do it and came up with a fairly detailed design. It will have two parts, first to handle repository problems that can be fixed by fetching objects from remotes, and secondly to recover from problems where data never got sent to a remote, and has been lost.
In either case, the assistant should be able to detect the problem and
automatically recover well enough to keep running. Since this also affects
non-git-annex repositories, it will also be available in a standalone
git-recover-repository
command.
Spent today reviewing my plans for the month and filling in a couple of missing pieces.
Noticed that I had forgotten to make repository repair clean up any stale git locks, despite writing that code at the beginning of the month, and added that in.
Made the webapp notice when a repository that is being used does not have any consistency checks configured, and encourage the user to set up checks. This happens when the assistant is started (for the local repository), and when removable drives containing repositories are plugged in. If the reminders are annoying, they can be disabled with a couple clicks.
And I think that just about wraps up the month. (If I get a chance, I would still like to add recovery of git-remote-gcrypt encrypted git repositories.)
My roadmap has next month dedicated to user-driven features and polishing and bugfixing.
Some neat stuff is coming up, but today was a pretty blah day for me.
I did get the Cronner tested and working (only had a few little bugs). But
I got stuck for quite a while making the Cronner stop git-annex fsck
processes it was running when their jobs get removed. I had some code to do
this that worked when run standalone, but not when run from git-annex.
After considerable head-scratching, I found out this was due to
forkProcess masking async exceptions, which seems to be probably
a bug. Luckily was able to
work around it. Async exceptions continue to strike me as the worst part of
the worst part of Haskell (the worst part being exceptions in general).
Was more productive after that.. Got the assistant to automatically queue re-downloads of any files that fsck throws out due to having bad contents, and made the webapp display an alert while fscking is running, which will go to the page to configure fsck schedules. Now all I need to do is build the UI of that page.
The webapp now fully handles repairing damage to the repository.
Along with all the git repository repair stuff already built, I added
additional repairs of the git-annex branch and git-annex's index file.
That was pretty easy actually, since git-annex already handles merging
git-annex branches that can sometimes be quite out of date. So when git repo
repair has to throw away recent changes to the git-annex branch, it just
effectively becomes out of date. Added a git annex fsck --fast
run to
ensure that the git-annex branch reflects the current state of the
repository.
When the webapp runs a repair, it first stops the assistant from committing new files. Once the repair is done, that's started back up, and it runs a startup scan, which is just what is needed in this situation; it will add any new files, as well as any old files that the git repository damage caused to be removed from the index.
Also made git annex repair
run the git repository repair code,
for those with a more command-line bent. It can be used in non-git-annex
repos too!
So, I'm nearly ready to wrap up working on disaster recovery. Lots has been accomplished this month. And I have put off making a release for entirely too long!
The big missing piece is repair of git remotes located on removable drives. I may make a release before adding that, but removable drives are probably where git repository corruption is most likely to occur, so I certainly need to add that.
Today's work was sponsored by Scott Robinson.
Made a release on Friday. But I had to rebuild the OSX and Linux standalone builds today to fix a bug in them.
Spent the past three days redoing the whole Android build environment. I've been progressively moving from my first hacked up Android build env to something more reproducible and sane. Finally I am at the point where I can run a shell script (well, actually, 3 shell scripts) and get an Android build chroot. It's still not immune to breaking when new versions of haskell libs are uploaded, but this is much better, and should be maintainable going forward.
This is a good starting point for getting git-annex into the F-Droid app store, or for trying to build with a newer version of the Android SDK and NDK, to perhaps get it working on Android 4.3. (Eventually. I am so sick of building Android stuff right now..)
Friday was all spent struggling to get ghc-android to build. I had not built it successfully since February. I finally did, on Saturday, and I have made my own fork of it which builds using a known-good snapshot of the current development version of ghc. Building this in a Debian stable chroot means that there should be no possibility that upstream changes will break the build again.
With ghc built, I moved on to building all the haskell libs git-annex needs. Unfortunately my build script for these also has stopped working since I made it in April. I failed to pin every package at a defined version, and things broke.
So, I redid the build script, and updated all the haskell libs to the newest versions while I was at it. I have decided not to pin the library versions (at least until I find a foolproof way to do it), so this new script will break in the future, but it should break in a way I can fix up easily by just refreshing a patch.
The new ghc-android build has a nice feature of at least being able to
compile Template Haskell code (though still not run it at compile time).
This made the patching needed in the Haskell libs quite a lot less. Offset
somewhat by me needing to make general fixes to lots of libs to build with
ghc head. Including some fun with ==# changing its type from Bool to
Int#. In all, I think I removed around 2.5 thousand lines of patches!
(Only 6 thousand lines to go...)
Today I improved ghc-android some more so it cross builds several C libraries that are needed to build several haskell libraries needed for XMPP. I had only ever built those once, and done it by hand, and very hackishly. Now they all build automatically too.
And, I put together a script that builds the debian stable chroot and installs ghc-android.
And, I hacked on the EvilSplicer (which is sadly still needed) to work with the new ghc/yesod/etc.
At this point, I have git-annex successfully building, including the APK!
In a bored hour waiting for a compile, I also sped up git annex add
on OSX by I think a factor of 10. Using cryptohash for hash calculation
now, when external hash programs are not available. It's still a few
percentage points slower than external hash programs, or I'd use it by
default.
This period of important drudgery was sponsored by an unknown bitcoin user, and by Bradley Unterrheiner and Andreas Olsson.
Finished moving the Android autobuilder over to the new clean build environment. Tested the Android app, and it still works. Whew!
There's a small chance that the issue with the Android app not working on Android 4.3 has been fixed by this rebuild. I doubt it, but perhaps someone can download the daily build and give it another try..
I have 7 days left in which I'd like to get remote gcrypt repositories working in the assistant. I think that should be fairly easy, but a prerequisite for it is making git-annex-shell support being run on a gcrypt repository. That's needed so that the assistant's normal locked down ssh key setup can also be used for gcrypt repositories.
At the same time, not all gcrypt endpoints will have git-annex-shell installed, and it seems to make sense to leave in the existing support for running raw rsync and git push commands against such a repository. So that's going to add some complication.
It will also complicate git-annex-shell to support gcrypt repos. Basically, everything it does in git-annex repos will need to be reimplemented in gcrypt repositories. Generally in a more simple form; for example it doesn't need to (and can't) update location logs in a gcrypt repo.
I also need to find a good UI to present the three available choices (unencrypted git, encrypted git, encrypted rsync) when setting up a repo on a ssh server. I don't want to just remove the encrypted rsync option, because it's useful when using xmpp to sync the git repo, and is simpler to set up since it uses shared encryption rather than gpg public keys.
My current thought is to offer just 2 choices, encrypted and non-encrypted. If they choose encrypted, offer a choice of shared encryption or encrypting to a specific key. I think I can word this so it's pretty clear what the tradeoffs are.
gcrypt is fully working now. Most of the examples in fully encrypted git repositories with gcrypt should work.
A few known problems:
- git annex sync refuses to sync with gcrypt remotes. Some url parsing issue.
- Swapping two drives with gcrypt repositories on the same mount point doesn't work yet.
- http urls are not supported
While I said I was done with fsck scheduling yesterday, I ended up adding one more feature to it today: Full anacron style scheduling. So a fsck can be scheduled to run once per week, or month, or year, and it'll run the fsck the next time it's available after that much time has passed. The nice thing about this is I didn't have to change Cronner at all to add this, just improved the Recurrance data type and the code that calculates when to run events.
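The anacron-style rule described above boils down to one comparison: run when the repository is available and at least one interval has passed since the last run. A minimal Python sketch of that rule follows; the function name and parameters are invented for illustration and are not taken from git-annex.

```python
from datetime import datetime, timedelta
from typing import Optional


def anacron_due(
    last_run: Optional[datetime],
    interval: timedelta,
    now: datetime,
    available: bool = True,
) -> bool:
    """Return True when a job should run under anacron-style scheduling.

    The job runs if its target is available and it has either never run
    or last ran at least `interval` ago. There is no fixed time of day.
    """
    if not available:
        return False
    return last_run is None or now - last_run >= interval
```

Unlike a fixed calendar slot, a run missed while the machine was off or the drive was unplugged is simply picked up the next time this check is evaluated.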
Rest of the day I've been catching up on some bug reports. The main bug I
fixed caused git-annex on Android to hang when adding files. This turns out
to be because it's using a new (unreleased) version of git, and
git check-attr -z
output format has changed in an incompatible way.
I am currently 70 messages behind, which includes some ugly looking bug reports, so I will probably continue with this over the next couple days.
Woke up with a pretty solid plan for gcrypt. It will be structured as a
separate special remote, so initremote
will be needed, with a gitrepo=
parameter (unless the remote already exists). git-annex will then set up
the git remote, including pushing to it (needed to get a gcrypt-id).
Didn't feel up to implementing that today. Instead I expectedly spent the day doing mostly Windows work, including setting up a VM on my new laptop for development. Including a ssh server in Windows, so I can script local builds and tests on Windows without ever having to touch the desktop. Much better!
I've been out sick. However, some things kept happening. Mesar contributed a build host, and the linux and android builds are now happening, hourly, there. (Thanks as well to the two other people who also offered hosting.) And I made a minor release to fix a bug in the test suite that I was pleased three different people reported.
Today, my main work was getting git-annex to notice when a gcrypt remote located on some removable drive mount point is not the same gcrypt remote that was mounted there before. I was able to finesse this so it re-configures things to use the new gcrypt remote, as long as it's a special remote it knows about. (Otherwise it has to ignore the remote.) So, encrypted repos on removable drives will work just as well as non-encrypted repos!
Also spent a while with rsync.net tech support trying to work out why someone's git-annex apparently opened a lot of concurrent ssh connections to rsync.net. Have not been able to reproduce the problem though.
Also, a lot of catch-up to traffic. Still 63 messages backlogged however, and still not entirely well..
Fixed a typo that broke automatic youtube video support in addurl.
Now there's an easy way to get an overview of how close your repository is to meeting the configured numcopies settings (or when it exceeds them).
    # time git annex status .
    [...]
    numcopies stats:
        numcopies +0: 6686
        numcopies +1: 3793
        numcopies +3: 3156
        numcopies +2: 2743
        numcopies -1: 1242
        numcopies -4: 1098
        numcopies -3: 1009
        numcopies +4: 372
This does make git annex status
slow when run on a large directory tree,
so --fast disables that.
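Each line in that output counts the files whose number of copies differs from the numcopies setting by the given amount. A small Python sketch of that tally, assuming the per-file copy counts are already known (the function name and input shape are hypothetical, not git-annex's code):

```python
from collections import Counter
from typing import Iterable, List


def numcopies_stats(copies_per_file: Iterable[int], numcopies: int) -> List[str]:
    """Tally how far each file is above or below the numcopies setting.

    Lines are ordered by file count, largest first, as in the output above.
    """
    variance = Counter(copies - numcopies for copies in copies_per_file)
    return [
        f"numcopies {delta:+d}: {count}"
        for delta, count in variance.most_common()
    ]
```

Gathering the copy count for every file is the expensive part, which is why skipping this tally speeds things up on a large tree.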
Started the day by getting the builds updated for yesterday's release. This included making it possible to build git-annex with Debian stable's version of cryptohash. Also updated the Debian stable backport to the previous release.
The roadmap has this month devoted to improving git-annex's support for recovering from disasters, broken repos, and so on. Today I've been working on the first thing on the list, stale git index lock files.
It's unfortunate that git uses simple files for locking, and does not use fcntl or flock to prevent the stale lock file problem. Perhaps they want it to work on broken NFS systems? The problem with that line of thinking is that it means all non-broken systems end up broken by stale lock files. Not a good tradeoff IMHO.
There are actually two lock files that can end up stale when using
git-annex; both .git/index.lock and .git/annex/index.lock. Today I
concentrated on the latter, because I saw a way to prevent it from ever
being a problem. All updates to that index file are done by git-annex when
committing to the git-annex branch. git-annex already uses fcntl locking
when manipulating its journal. So, that can be extended to also cover
committing to the git-annex branch, and then the git index.lock
file
is irrelevant, and can just be removed if it exists when a commit is
started.
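The difference between the two locking styles can be seen in a few lines of Python. This sketch uses flock rather than fcntl record locks, purely so that the conflict can be demonstrated inside a single process; the helper name and the lock file path are hypothetical. The point is that a kernel-held lock vanishes when its holder closes the file or dies, while the mere existence of the file on disk means nothing.

```python
import fcntl
import os
import tempfile


def try_lock(path: str):
    """Try to take an exclusive advisory lock without blocking.

    Returns the open file object holding the lock, or None if another
    open file description already holds it.
    """
    f = open(path, "a")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        f.close()
        return None
    return f


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "journal.lck")  # hypothetical name
    holder = try_lock(path)
    print("second attempt while held:", try_lock(path))  # None
    holder.close()  # the kernel drops the lock; the file itself stays behind
    print("file still exists:", os.path.exists(path))
    print("lock can be retaken:", try_lock(path) is not None)
```

With this kind of lock there is nothing to clean up after a crash, which is what makes it safe to treat a leftover index.lock as irrelevant once the kernel lock is held.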
To ensure this makes sense, I used the type system to prove that the journal
locking was in effect everywhere I need it to be. Very happy I was able to
do that, although I am very much a novice at using the type system for
interesting proofs. But doing so made it very easy to build up to a point
where I could unlink the .git/annex/index.lock
and be sure it was safe to do
that.
What about stale .git/index.lock
files? I don't think it's appropriate
for git-annex to generally recover from those, because it would change
regular git command line behavior, and risks breaking something. However, I
do want the assistant to be able to recover if such a file exists when it
is starting up, since that would prevent it from running. Implemented that
also today, although I am less happy with the way the assistant detects
when this lock file is stale, which is somewhat heuristic (but should work
even on networked filesystems with multiple writing machines).
Today's work was sponsored by Torbjørn Thorsen.
Added support for gcrypt remotes to git-annex-shell. Now gcrypt special remotes probe when they are set up to see if the remote system has a suitable git-annex-shell, and if so all commands are sent to it. Kept the direct rsync mode working as a fallback.
It turns out I made a bad decision when first adding gcrypt support to
git-annex. To make implementation marginally easier, I decided to not
put objects inside the usual annex/objects
directory in a gcrypt remote.
But that lack of consistency would have made adding support to
git-annex-shell a lot harder. So, I decided to change this. Which
means that anyone already using gcrypt with git-annex will need to
manually move files around.
Today's work was sponsored by Tobias Nix.
Lots of progress from yesterday's modest start of building data types for
scheduling. Last night I wrote the hairy calendar code to calculate when
next to run a scheduled event. (This is actually quite superior to cron,
which checks every minute to see if it should run each event!) Today I
built a "Cronner" thread that handles spawning threads to handle each
scheduled event. It even notices when changes have been made to its
schedule and stops/starts event threads appropriately.
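The idea of calculating the next run time up front, rather than polling, can be sketched for the simplest case of a daily event at a fixed clock time. This Python example is only an illustration of the approach; the function names are invented, it uses naive local datetimes, and the real calendar code handles far more kinds of recurrence.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now: datetime, at: time) -> datetime:
    """Return the first moment strictly after `now` whose clock time is `at`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's slot has already passed, so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


def seconds_until(now: datetime, at: time) -> float:
    """How long a scheduler thread could sleep before the next run."""
    return (next_daily_run(now, at) - now).total_seconds()
```

A thread can then sleep for `seconds_until(...)` and wake exactly once per event. After waking it still makes sense to compare the clock against the expected time, since the machine may have been asleep past it.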
Everything is hooked up, building, and there's a good chance it works without too many bugs, but while I've tested all the pure code (mostly automatically with quickcheck properties), I have not run the Cronner thread at all. And there is some tricky stuff in there, like noticing that the machine was asleep past when it expected to wake up, and deciding if it should still run a scheduled event, or should wait until next time. So tomorrow we'll see..
Today's work was sponsored by Ethan Aubin.
John Millikin came through and fixed that haskell-gnutls segfault on OSX that I developed a reproducible test case for the other day. It's a bit hard to test, since the bug doesn't always happen, but the fix is already deployed for Mountain Lion autobuilder.
However, I then found another way to make haskell-gnutls segfault, more reliably on OSX, and even sometimes on Linux. Just entering the wrong XMPP password in the assistant can trigger this crash. Hopefully John will work his magic again.
Meanwhile, I fixed the sync-after-forget problem. Now sync always forces its push of the git-annex branch (as does the assistant). I considered but rejected having sync do the kind of uuid-tagged branch push that the assistant sometimes falls back to if it's failing to do a normal sync. It's ugly, but worse, it wouldn't work in the workflow where multiple clients are syncing to a central bare repository, because they'd not pull down the hidden uuid-tagged branches, and without the assistant running on the repository, nothing would ever merge their data into the git-annex branch. Forcing the push of synced/git-annex was easy, once I satisfied myself that it was always ok to do so.
Also factored out a module that knows about all the different log files
stored on the git-annex branch, which is all the support infrastructure
that will be needed to make git annex forget --drop-dead
work. Since this
is basically a routing module, perhaps I'll get around to making it use
a nice bidirectional routing library like
Zwaluw one day.
Spent a few hours improving gcrypt in some minor ways, including adding a --check option that the assistant can use to find out if a given repo is encrypted with gcrypt, and also tell if the necessary gpg key is available to decrypt it. Also merged in a fix to support subkeys, developed by a git-annex user who is the first person I've heard from who is using gcrypt. I don't want to maintain gcrypt, so I am glad its author has shown up again today.
Got mostly caught up on backlog. The main bug I was able to track down today is git-annex using a lot of memory in certain repositories. This turns out to have happened when a really large file was committed right into the git repository (by mistake or on purpose). Some parts of git-annex buffer file contents in memory while trying to work out if they're git-annex keys. Fixed by making it first check if a file in git is marked as a symlink. Which was really hard to do!
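Git records a symlink as a tree entry with mode 120000, so the check can be made from tree metadata alone, before any file content is read. The Python sketch below parses one line of `git ls-tree` output in its documented `<mode> <type> <object>` TAB `<path>` form. The helper names are hypothetical, path quoting is ignored, and this is not how git-annex's Haskell code is structured.

```python
from typing import Tuple

SYMLINK_MODE = "120000"  # git's tree-entry mode for symbolic links


def parse_ls_tree_line(line: str) -> Tuple[str, str, str, str]:
    """Split one `git ls-tree` line into (mode, type, object, path)."""
    meta, path = line.split("\t", 1)
    mode, objtype, sha = meta.split(" ")
    return mode, objtype, sha, path


def worth_reading(line: str) -> bool:
    """Only entries marked as symlinks need their content inspected."""
    mode, _objtype, _sha, _path = parse_ls_tree_line(line)
    return mode == SYMLINK_MODE
```

A multi-gigabyte regular file committed straight into git shows up with mode 100644 and is skipped without its blob ever being loaded.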
At least 4 people ran into this bug, which makes me suspect that lots of
people are messing up when using direct mode (probably due to not reading
the documentation, or having git commit -a hardwired into their fingers),
and forcing git to commit large files into their repos, rather than having
git-annex manage them. Implementing direct mode guard seems more
urgent now.
Today's work was sponsored by Amitai Schlair.
Built a git-recover-repository
command today. So far it only does the
detection and deletion of corrupt objects, and retrieves them from remotes
when possible. No handling yet of missing objects that cannot be recovered
from remotes.
Here's a couple of sample runs where I do bad things to the git repository and it fixes them:
    joey@darkstar:~/tmp/git-annex>chmod 644 .git/objects/pack/*
    joey@darkstar:~/tmp/git-annex>echo > .git/objects/pack/pack-a1a770c1569ac6e2746f85573adc59477b96ebc5.pack
    joey@darkstar:~/tmp/git-annex>~/src/git-annex/git-recover-repository
    Running git fsck ...
    git fsck found a problem but no specific broken objects. Perhaps a corrupt pack file?
    Unpacking all pack files.
    fatal: early EOF
    Unpacking objects: 100% (148/148), done.
    Unpacking objects: 100% (354/354), done.
    Re-running git fsck to see if it finds more problems.
    Re-running git fsck to see if it finds more problems.
    Initialized empty Git repository in /home/joey/tmp/tmprepo.0/.git/
    Trying to recover missing objects from remote origin
    Successfully recovered repository!
    You should run "git fsck" to make sure, but it looks like everything was recovered ok.
    joey@darkstar:~/tmp/git-annex>chmod 644 .git/objects/00/0800742987b9f9c34caea512b413e627dd718e
    joey@darkstar:~/tmp/git-annex>echo > .git/objects/00/0800742987b9f9c34caea512b413e627dd718e
    joey@darkstar:~/tmp/git-annex>~/src/git-annex/git-recover-repository
    Running git fsck ...
    error: unable to unpack 000800742987b9f9c34caea512b413e627dd718e header
    error: inflateEnd: stream consistency error (no message)
    error: unable to unpack 000800742987b9f9c34caea512b413e627dd718e header
    error: inflateEnd: stream consistency error (no message)
    git fsck found 1 broken objects.
    Unpacking all pack files.
    removing 1 corrupt loose objects
    Re-running git fsck to see if it finds more problems.
    Re-running git fsck to see if it finds more problems.
    Initialized empty Git repository in /home/joey/tmp/tmprepo.0/.git/
    Trying to recover missing objects from remote origin
    Successfully recovered repository!
    You should run "git fsck" to make sure, but it looks like everything was recovered ok.
Works great! I need to move this and git-union-merge
out of the git-annex
source tree sometime.
Today's work was sponsored by Francois Marier.