Journal

2018-06-04 Collaborating with non-Git-users - Git hook

In the first part, I showed how I can make Git add information about the commit in every file I want it to in such a way that it is updated every time I Git-archive the project. The second part was devoted to integrating this workflow in Emacs and mu4e. Today I finish with showing my Git hook protecting me from accidentally messing up.

One issue with this setup is that I can accidentally commit the values (like the commit hash etc.) I got from one of the other authors instead of the placeholder. Using Magit’s magit-discard on a region is easy, but forgetting to do it is even easier. So, git hooks to the rescue – this is perfectly suited to a Git pre-commit hook:

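#!/bin/bash

# Check the working-tree copy of every staged file for the placeholder.
git diff --name-only --cached | while read -r filename; do
    if ! grep -q "\$Format:" <(head -n3 "$filename")
    then
        echo "File $filename contains the commit data, please revert to the placeholders!" >&2
        # This only leaves the subshell running the loop; the hook still fails
        # because the pipeline is the last command in the script.
        exit 1
    fi
done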

Notice that I used a few tricks here:

  • git diff --name-only --cached gives us the list of files about to be committed,
  • we pipe it to a bash while loop, which reads it line by line,
  • we use grep -q to quiet grep and only make it return a non-zero code when nothing is found,
  • we use ! to negate the exit status of grep,
  • we use head to search only the first few lines of each file (three here; without -n, head prints 10),
  • and we use the famous <(...) bash trick to avoid fussing around with temp files ourselves.

Notice also that the above is not very robust: a plain read strips leading and trailing whitespace from filenames (use IFS= read to avoid that), and Git quotes filenames containing unusual characters unless it is given the -z option. In my case, this is not a concern, so I didn’t bother.
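To make the caveat above concrete, here is a small sketch (not part of my hook) comparing the three ways of reading names. The function names are made up for this example; the last variant assumes NUL-separated input, such as the output of git diff --name-only --cached -z.

```shell
#!/bin/bash
# Hypothetical helper names; none of this is part of the original hook.

# Plain `read -r` strips leading and trailing whitespace from each line.
trimmed() {
    while read -r filename; do
        printf '[%s]\n' "$filename"
    done
}

# `IFS= read -r` keeps each line exactly as it was written.
verbatim() {
    while IFS= read -r filename; do
        printf '[%s]\n' "$filename"
    done
}

# With NUL-separated input, even a newline inside a name survives.
nul_separated() {
    while IFS= read -r -d '' filename; do
        printf '[%s]\n' "$filename"
    done
}

printf ' my file.tex \n' | trimmed         # prints [my file.tex]
printf ' my file.tex \n' | verbatim        # prints [ my file.tex ]
printf ' my file.tex \0' | nul_separated   # prints [ my file.tex ]
```

Note that a space in the middle of a name is preserved in all three cases; only the surrounding whitespace differs.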

The only remaining thing to do is to ensure that I won’t accidentally commit things from other people as myself. I know two methods of doing this: one is an extremely dirty hack, and the other one relies on a not-so-well-documented feature of Git. Go figure.

The dirty hack is to set the user name in the per-repository Git configuration to an empty string. This makes it impossible to commit anything (even if you use git commit --author), since Git complains: “empty ident name (for <email@example.com>) not allowed”. The trick is to also set the environment variables GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL (for some very strange reason, they override even the local Git settings). Now, if you don’t specify the author data, the commit will fail, but if you do, everything will be fine.
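A minimal sketch of this hack might look as follows. It assumes that only the name is blanked (which is what the quoted error message suggests); the function name and the identities in the comments are placeholders.

```shell
#!/bin/bash
# Sketch of the "dirty hack"; blank_local_name is a made-up helper name.

# Set user.name to an empty string in the repository-local configuration,
# so that Git cannot build an author or committer name on its own.
blank_local_name() {
    git config --local user.name ""
}

# Run from inside the repository:
#   blank_local_name
#   export GIT_COMMITTER_NAME="Your Name"
#   export GIT_COMMITTER_EMAIL="you@example.com"
#   git commit --author="Collaborator1 <coll1@example.com>"
```

The exported variables would typically live in a shell startup file or a wrapper script, so that only the author has to be given by hand.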

The more elegant way is to use a client-side Git hook. In this case, the pre-commit hook is again our best bet. The problem is, it does not get any parameters. Happily for us, it has an undocumented feature: it has access to some environment variables, one of them being GIT_AUTHOR_EMAIL. (I learned about it here; it is actually described in the docs, but that seems not to imply that the pre-commit hook is going to get it.) So, we can use this pre-commit hook:

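#!/bin/bash

# The hook sees the author's address in GIT_AUTHOR_EMAIL.
# -q keeps grep quiet; the dot is escaped so that it matches only a literal dot.
if ! grep -q '@allowed\.domain$' <(echo "$GIT_AUTHOR_EMAIL")
then
    echo "Please correctly set the commit author!" >&2
    exit 1
fi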

(Of course, I merged both the above scripts into one pre-commit script.)
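For illustration, a merged hook could be organized roughly like this. It is a sketch, not my actual script: the function names are made up, and allowed.domain is the same stand-in as above. One thing to watch when merging is that an exit inside a loop fed by a pipe only leaves the subshell running that loop, so this version reads from <(...) and uses return instead.

```shell
#!/bin/bash
# Sketch of a merged pre-commit hook; all function names are hypothetical.

# Succeed only if the address ends in "@<domain>" (literal match, no regex).
author_allowed() {
    local email=$1 domain=$2
    [[ $email == *"@$domain" ]]
}

# Succeed only if the placeholder shows up in the first three lines of the file.
has_placeholder() {
    head -n 3 "$1" | grep -qF '$Format:'
}

# Check the author first, then every staged file; fail on the first problem.
run_checks() {
    local filename
    if ! author_allowed "$GIT_AUTHOR_EMAIL" "allowed.domain"; then
        echo "Please correctly set the commit author!" >&2
        return 1
    fi
    # Reading from <(...) keeps the loop in the current shell, so "return"
    # really leaves the function. -z gives NUL-separated, unquoted names;
    # --diff-filter=ACM skips deleted files.
    while IFS= read -r -d '' filename; do
        if ! has_placeholder "$filename"; then
            echo "File $filename contains the commit data, please revert to the placeholders!" >&2
            return 1
        fi
    done < <(git diff --cached --name-only -z --diff-filter=ACM)
}

# The last line of .git/hooks/pre-commit would then be:
#   run_checks || exit 1
```

Like the original, this looks at the working-tree copy of each file, not at the staged content.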

Problem solved!

CategoryEnglish, CategoryBlog, CategoryGit


2018-05-28 Collaborating with non-Git-users - Emacs support

In the previous part, I showed how I can make Git add information about the commit in every file I want it to in such a way that it is updated every time I Git-archive the project.

Actually, I have written an Elisp function to do this for me; it prepares a zip file with git archive and then writes a boilerplate email in mu4e, with the recipients’ addresses, attachment and a minimal body already in place. Here it is (with the personal data stripped out). (If you don’t use mu4e contexts, you can safely delete all parts relating to them, of course. I just didn’t want this command to ask me for context every time.)

(require 'gnus-dired)                   ; provides `gnus-dired-attach'

(defvar auto-git-archive-repo-dir "~/repo/directory/"
  "Directory of the repository to archive and send.")

(defun git-archive-and-prepare-email ()
  "Prepare a message to the rest of people with an update."
  (interactive)
  (let ((default-directory auto-git-archive-repo-dir)
        (mu4e-compose-context-policy nil)
        (mu4e-context nil))
    ;; Create the zip file inside the repository directory.
    (shell-command "git archive -o current-version.zip master")
    (mu4e-context-switch nil "context")
    (gnus-dired-attach (list (expand-file-name "current-version.zip"
                                               auto-git-archive-repo-dir)))
    (message-goto-to)
    (insert "recipient1 <rec1@example.com>, recipient2 <rec2@example.com>")
    (message-goto-subject)
    (insert "Project update")
    (message-goto-body)
    (insert "Dear all,\n\nI enclose the current state of our project.\n\nBest,\n")
    ;; Move point two lines up and to the end of that line
    ;; (`previous-line' is meant for interactive use only).
    (forward-line -2)
    (end-of-line)))

I can now just run this command and press C-c C-c (or add some text to the email first). Again, Elisp’s ability to write programs mimicking the actions of a human editor turns out to be a great thing.

Another place where Emacs can help me with my workflow is entering author information in Magit. I could probably somehow plug into Magit’s pop-up menu, but instead I went for a simpler version, which is the following hydra:

(defhydra hydra-git-project (:exit t)
  "Simple hydra for the Git project"
  ("m" (insert "Marcin Borkowski <mbork@mbork.pl>") "mbork")
  ("1" (insert "Collaborator1 <coll1@example.com>") "col1")
  ("2" (insert "Collaborator2 <coll2@example.com>") "col2"))

(global-set-key (kbd "s-b") 'hydra-git-project/body)

I can now just press =A s-b before committing to display the menu with the collaborator list and select the right one with one keystroke.

In the next and final part I am going to describe how I configured a Git hook to prevent me from accidentally committing a wrong thing or the right thing with a wrong author. Stay tuned!

CategoryEnglish, CategoryBlog, CategoryGit, CategoryEmacs


2018-05-20 Collaborating with non-Git-users - workflow and basic setup

This is the first post in a three-part series describing my setup designed to work flawlessly and almost automatically when collaborating with people not using Git. It describes the premise and the basic elements of the machinery I use.

I am now engaged in a project involving a few collaborators. It turned out that I am the only one who knows how to use a version control system and is willing to do so.

Reminder to myself: never, ever work with people who insist on sending intermediate files via email and not learning a VCS.

(It’s not that I insist on Git. The fact is – as we will see later – Git is sometimes horrible. But it has Magit, and after some time using it, going back to the command line feels sort of like going back to a cave dwelling.)

So, there we are, a bunch of people who write emails of the form “Hey, I added something to the file such-and-such, here’s the new version, what d’ya think?”

Of course, every one of us works with different frequency (and has different sleep patterns – for instance, I started writing this blog post before 5:30, when most people are still sleeping). And while technically we could just number our versions, and lock files (like in RCS, only in a human version – “please do not touch this file, I’m working on it now”) – this just doesn’t make sense in the twenty-first century.

So, I have a Git repo on my computer (backed up regularly and pushed to a server just in case), and I commit each and every change I get over email.

That alone doesn’t solve the problem yet. Assume that I send out version 5 today at 6:00. Then I move on to other projects for the rest of the day. Meanwhile, person X makes some edits (call them version 6x). Tomorrow at 5:20, I sit down to do some more work, and commit my version 6. Person X does a similar thing later (v7x), and only then sends me an email with a new version of some file. (They didn’t send me anything earlier, since there’s no point in sending work in progress, right?) So, here is what we should have in our repo:

      person-x
      6x---7x
     /
    /
---5---6
   master

(I assume that I am on the master branch, since I am the great Git mastermind here;-).)

Now I know that I should commit v7x somewhere, but how do I know where exactly? From my point of view, the situation looks like this:

  person-x
     7x

---5---6
   master

In other words, I got v7x, and it’s “floating in the air” – remembering the point at which I sent the email to the other people is troublesome, and attaching v7x in the wrong place might result in data loss! If I add something in v6, copy v7x’s files on top of v6 and then commit, then we will lose my changes. (This is not a problem when every commit is a small one. But in a culture of “let’s not send this WiP to everyone until I do a lot of work”, which is the opposite of Git good practices, carefully reviewing each and every line of a 100-line commit is no fun.)

I can see two possible ways of solving this problem (assuming that making people use Git is not a viable solution). One is to create branches corresponding to every person in the project and use them to track which version I send to whom. (The mess is made worse by the fact that if I collaborate on some part – which would be a feature branch in a normal workflow – with person Y, I might not even want to bother person X with emails regarding parts they are not very interested in.)

This, however, is not optimal, since it requires manual work, and manual work leads to errors. Also, you have no guarantee that people will start editing with the latest version you sent them – they may have already started with one of the previous ones (this actually happened to me the other day).

Happily, it turns out Git has you covered. Here is what I came up with.

First, I wanted each copy of each file to have some kind of stamp which would tell me which version it originated from.

Initially, I wanted Git to perform some trick so that the current commit SHA could itself be part of the commit (I think SVN does something similar with revision numbers). This is of course impossible (or at least extremely hard – and I mean getting-the-Turing-award-and-breaking-Git-for-everyone-level hard), since it would entail knowing the SHA beforehand, and making a commit containing this very SHA somehow hash to it. (Similar things have been accomplished, although they are way easier, since zipping was never meant to be a cryptographically secure hash function.)

So, let’s forget about the fully automated way and do something less convenient, but possible.

Git has something called attributes. They can do quite a few things (filters are especially interesting, although, as mentioned above, they couldn’t solve my problem). There is, however, an attribute called export-subst. It takes effect when using git archive, a Git command used seldom enough that there is no corresponding Magit command (!). With it, you can put a comment line containing the string $Format:...$ somewhere in a file, and use git-log placeholders in the ellipsis part.

So, I decided to put this at the beginning of every file in the repo:

%
% Please never remove or alter the following line!
% $Format:This file is based on commit %h, authored by %an on %ai.$

and then edited .git/info/attributes to contain the line

*.tex export-subst
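The same edit can be scripted; the following is only a sketch, in which the function name and paper.tex are placeholders. It should be run from the top level of the working tree.

```shell
#!/bin/bash
# Sketch only; enable_export_subst is a made-up helper name.

# Append the rule to the repository-local attributes file,
# unless exactly that line is already there.
enable_export_subst() {
    local attributes=.git/info/attributes
    mkdir -p .git/info
    grep -qxF '*.tex export-subst' "$attributes" 2>/dev/null ||
        echo '*.tex export-subst' >> "$attributes"
}

# Afterwards, Git can report whether the attribute applies to a given file:
#   git check-attr export-subst -- paper.tex
# and the substitution can be previewed without creating a zip file:
#   git archive --format=tar master | tar -xOf - paper.tex | head -n 3
```

Since .git/info/attributes is not versioned, the rule stays local to this clone, which is fine as long as I am the only one running git archive.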

Of course, I now have to remember to use git archive every time I send files to other people in the project. This is not a problem, however, since it is no worse than manually zipping them.
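When a file comes back, the stamp tells me where its history should be attached. A minimal sketch of reading it follows; stamp_commit is a made-up name, and the pattern assumes the exact wording of the $Format:...$ line shown above.

```shell
#!/bin/bash
# Sketch; stamp_commit is a hypothetical helper, not part of Git.

# Print the abbreviated commit hash found in the first three lines of the
# given file; print nothing if the stamp is missing or was never expanded.
stamp_commit() {
    head -n 3 "$1" |
        sed -n 's/.*This file is based on commit \([0-9a-f]\{4,40\}\), authored by.*/\1/p'
}

# Example, once a collaborator's file arrives (paths are placeholders):
#   base=$(stamp_commit received/paper.tex)
#   git checkout -b person-x "$base"
```

With the branch created at the right commit, the received files can be copied in and committed there, and the usual merge takes care of the rest.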

In the next part, I will describe my Emacs setup which helps me with this, as well as committing stuff on people’s behalf.

CategoryEnglish, CategoryBlog, CategoryGit

