Git for the real world

13 Jul 2008

Now that we’ve been using git at Twitter for a couple of months, we’ve overcome several crippling problems and misunderstandings about how to use it properly. There are dozens of “intros” and “tutorials” to git online, but at some point you need to know more than just the basics of DVCS and the map to svn commands – you need to know practical considerations of real-world usage. None of the intros or tutorials had this stuff, so I thought I’d share what we learned.

Git’s command-line interface is hands-down the worst of any DVCS (except the archaic tla). There are inconsistencies: Some commands will expect you to type “origin/master”, while others will want “origin master”. Other commands should never ever be used, but are presented in the documentation as if they’re part of a normal usage pattern. Some commands are useless in their default form and need several command-line options to make them work right.

I ended up writing a wrapper script to cover up a lot of these flaws, which I consider an “ultimate fail” for a UI. But I’m still not sure the script is a good idea, since it may make me forget all the quirks I need to keep in mind when the script isn’t around.

Don’t change history

Two commands you should avoid: git rebase and git reset. Some of the tutorials will tell you that rebase is one of the first commands you should learn. Lies! rebase is a way to trick you into creating merge conflicts.

When you rebase, you are erasing every local commit you’ve made, and turning them into patches (as if you were back on CVS). After syncing your repository up with the remote one, your patches are re-applied one by one as brand-new commits. Presto! You’ve changed history.
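To see what that means in practice, here is a minimal sketch in a throwaway repository. The branch name, file names, and demo identity are all made up for illustration. It records the ID of a local commit, rebases it onto new work on master, and shows that the "same" commit comes back with a different ID:

```shell
set -e
# Throwaway identity and repository; nothing here touches your real config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name used below

echo base > base.txt
git add base.txt
git commit -q -m "base"

# One local commit on a side branch...
git checkout -q -b feature
echo local > local.txt
git add local.txt
git commit -q -m "L1"
before=$(git rev-parse feature)

# ...and one commit that lands on master in the meantime.
git checkout -q master
echo remote > remote.txt
git add remote.txt
git commit -q -m "R1"

# Rebase replays L1 on top of R1, producing a new commit object.
git checkout -q feature
git rebase -q master
after=$(git rev-parse feature)

echo "L1 before rebase: $before"
echo "L1 after rebase:  $after"
```

The message and the diff are the same, but the commit has a new parent and therefore a new ID. Anyone who already had the old L1 now has a commit that your branch no longer contains.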

The only reason I can think of for doing this is if you’re not comfortable doing merges. But DVCS is all about merges, so you should just get used to doing them. A merge provides a little signpost to everyone else about your branch. Don’t fear the merge – love it! It records exactly how your local work should be rectified with remote changes, without requiring you to keep tweaking your patch.

reset is even worse. It erases commits from your history, which will very likely make your local repository different from everyone else’s, and guarantee conflicts later or even an inability to push. Some people will say you should learn reset so you can use it in a panic situation, but if you’re panicking, you’re more likely to make things worse, so stop. Calm down. You have time to think and solve the problem in a rational way.

My problem with these two commands is that they violate a core philosophy of DVCS: Everyone has their own view of the repository, but these views obey entropy and flow in only one time direction. When they meet, they merge. Doing a rebase or reset goes back in time and changes the past. They should be in a separate tool, like “git-fix” or “git-hack”.

The story matters more than the chronology

Have you ever read a history book that said “In 1812, the British empire shelled the tiny new American capital. Meanwhile, Napoleon marched across Europe. In China, …”? Hopefully not; that would suck. Telling a thread of the story from beginning to end is more important than placing every single event in its exact chronological order. By default, git log lists commits newest-first by their commit date and time, regardless of which branch they were made on, so you need to be aware of that and not get confused.

Say, for example, you made a local branch, and made 3 local commits: L1, L2, and L3. Meanwhile, someone else is working on a different feature on their branch, and does commits R1 and R2. After you merge (M1), git is likely to show you a history like this:

M1 -> R2 -> L3 -> L2 -> R1 -> L1

Huh? What? Why are my local commits intermixed with my co-worker’s commits? The merge must have messed up! Crap! Time to git reset and destroy everything, right? No! Stop! Don’t do it. It’s a trap! Git is lying by omission – it’s telling the literal, actual truth, but it’s telling it to you in a way that makes it confusing. Git is re-ordering the history to make sure every commit is shown in its actual time order, not the story order.

You should probably just go ahead and alias log to:

git log --topo-order --decorate

That tells git to show things in “topological” (story) order, and to also mark where various branches are sitting. I usually find it useful to take that one step further:

git log --topo-order --decorate --first-parent

That tells git to show things in story order and to tell that story from my point of view. It’s sometimes interesting to see every commit that one of your coworkers did in their branch, but often you just want to see the merge-commit and move on. "--first-parent" tells git to skip over the details of every branch that isn’t a linear parent of yours. Generally this means you’ll see a simplified history of what’s been going on, without the intricacies of what happened on forked branches while they were forked off.
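Here is a rough sketch that rebuilds the L1/L2/L3, R1/R2, M1 example in a scratch repository, so the three views can be compared side by side. The timestamps, the branch names "mine" and "theirs", the set_clock helper, and the "story" alias are all inventions for the demo. One caveat: git ignores aliases that shadow a built-in command, so the alias needs a name other than "log".

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Hypothetical helper: pin the commit clock to <minute> past an arbitrary base time.
set_clock() {
    export GIT_AUTHOR_DATE="$((1215943200 + $1 * 60)) +0000"
    export GIT_COMMITTER_DATE="$GIT_AUTHOR_DATE"
}

set_clock 0; git commit -q --allow-empty -m base
git branch theirs

# My branch: L1, L2, L3 at minutes 1, 3, 4.
git checkout -q -b mine
set_clock 1; git commit -q --allow-empty -m L1
set_clock 3; git commit -q --allow-empty -m L2
set_clock 4; git commit -q --allow-empty -m L3

# My co-worker's branch: R1, R2 at minutes 2 and 5.
git checkout -q theirs
set_clock 2; git commit -q --allow-empty -m R1
set_clock 5; git commit -q --allow-empty -m R2

# Merge their work into mine at minute 6.
git checkout -q mine
set_clock 6; git merge -q -m M1 theirs

# Default view: newest first by date, so the two branches interleave.
by_date=$(git log --format=%s | xargs)
echo "default:      $by_date"      # M1 R2 L3 L2 R1 L1 base

# Story order: each branch's commits stay together.
by_topo=$(git log --topo-order --format=%s | xargs)
echo "topo-order:   $by_topo"

# A repository-local alias under a name that does not collide with "log".
git config alias.story "log --topo-order --decorate --first-parent"
by_first_parent=$(git story --format=%s | xargs)
echo "first-parent: $by_first_parent"   # M1 L3 L2 L1 base
```

I have left the topo-order output uncommented on purpose: the branches stay contiguous, but which one is listed first is up to git.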

If you want to see all the threads of history intertwined, I suggest using a graphical tool like gitk instead of git-log.

Don’t fast-forward – live every moment

This one is pretty confusing. And it sucks, because this concept doesn’t even exist in other DVCS. I think it’s another symptom of “fear of merge”. Basically, sometimes when you ask git to merge branch A into branch B, it will decide that it doesn’t want to merge and it will instead turn A and B into clones of each other.

For example, let’s pretend you made a branch of “master” called “feature” and did a few commits on it, and are now ready to merge it back into master. If no other work has happened on the master branch, git will try to out-clever you. It thinks: “Well, nobody else has worked on the master branch, so I could just make the feature branch become the new master branch and that would be logically equivalent.” So after the merge, you’ll see every single commit you made, as if you had done them directly on the master branch. Git has cloned your feature branch into the new master branch.

This might not be so bad if there are only a couple of people working on the project, but there are a few side effects: Your branch has effectively vanished from history. There is no longer any indication that you were working on a side branch; it looks like you were working directly on master. And if it turns out that there were bugs in your new feature (which, you know, sometimes happens), you can’t reverse the merge-commit because there is no merge-commit. You will have to reverse every single commit you made, in reverse order, or worse.

So really, you want git to always create a merge-commit when you do a merge. For this, you have to ask it nicely:

git merge --no-ff feature

(Git calls this history-erasing shortcut “fast-forwarding”.)
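As a sketch of why the merge-commit is worth having, the following builds a tiny "feature" branch in a scratch repository, merges it with --no-ff, and then backs the whole feature out in one step. The file names and commit messages are made up, and using git revert -m 1 to reverse the merge is my assumption about how you would do the undo, not the only way:

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Two commits on a feature branch; master does not move in the meantime,
# so a plain "git merge feature" would fast-forward.
git checkout -q -b feature
echo one > one.txt
git add one.txt
git commit -q -m "feature: part one"
echo two > two.txt
git add two.txt
git commit -q -m "feature: part two"

# Force a real merge-commit even though a fast-forward is possible.
git checkout -q master
git merge -q --no-ff -m "Merge branch 'feature'" feature

# The tip of master is now a commit with two parents.
set -- $(git log -1 --format=%P)
parent_count=$#
echo "parents of merge-commit: $parent_count"

# Undo the entire feature with a single revert of the merge-commit.
# "-m 1" means: keep the first parent's side (master as it was before the merge).
git revert --no-edit -m 1 HEAD > /dev/null
```

After the revert, one.txt and two.txt are gone from master, and the history still shows that a branch was merged and then backed out.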

A few other things

To remove a branch from a remote repository after it’s been merged and deployed, you have to push the branch with a colon in front of the branch name. This has become a running joke in the office: “Colon means delete.” Look, don’t ask me, I’m not Linus. That’s just how it works.

git push origin :stale_completed_branch
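For what it's worth, the colon does have a logic to it: what you give git push is a refspec of the form source:destination, and an empty source means "push nothing to that destination", which deletes it. Here is a small sketch against a local bare repository standing in for the remote. The directory names are invented for the demo:

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# A bare repository plays the part of the shared remote.
git init -q --bare origin.git

git init -q clone
cd clone
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"
git commit -q --allow-empty -m "base"
git push -q origin master

# Publish a branch, then pretend it has been merged and deployed.
git checkout -q -b stale_completed_branch
git push -q origin stale_completed_branch

# Empty source, colon, destination: delete the destination on the remote.
git push -q origin :stale_completed_branch

echo "branches left on the remote:"
git ls-remote --heads origin
```

Newer versions of git also accept git push origin --delete stale_completed_branch, which does the same thing with less mystery.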

When other people remove branches, they won’t be removed from your local copy of the repository. To take care of this housekeeping, you need to express a fruit preference:

git remote prune origin

Again, don’t ask. I don’t know why. That’s just how it is.
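If you want to watch the housekeeping happen, this sketch sets up two clones of a scratch remote. The names "alice" and "bob" are made up. Alice deletes a branch on the remote, and Bob's clone keeps a stale remote-tracking branch until he prunes:

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

# Alice publishes master and a second branch.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"
git commit -q --allow-empty -m "base"
git push -q origin master
git push -q origin master:stale_completed_branch
cd "$work"

# Bob clones while the branch still exists on the remote.
git clone -q origin.git bob

# Alice removes the branch from the remote.
(cd alice && git push -q origin :stale_completed_branch)

# Bob still sees origin/stale_completed_branch until he prunes.
cd bob
before=$(git branch -r | grep -c stale_completed_branch || true)
git remote prune origin
after=$(git branch -r | grep -c stale_completed_branch || true)

echo "stale remote branches before prune: $before"
echo "stale remote branches after prune:  $after"
```

Prune only removes the remote-tracking reference (origin/stale_completed_branch); any local branch Bob created from it is left alone. As far as I know, git fetch --prune does the same cleanup as part of a fetch.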

I have a few ranty topics on how git is implemented and used, and how that compares with the older DVCS (especially bazaar), but I’ll save that for some other time. If you’re using git, hopefully this information is useful.

