Now that we’ve been using git at Twitter for a couple of months, we’ve overcome several crippling problems and misunderstandings about how to use it properly. There are dozens of “intros” and “tutorials” to git online, but at some point you need to know more than just the basics of DVCS and the map to svn commands – you need to know practical considerations of real-world usage. None of the intros or tutorials had this stuff, so I thought I’d share what we learned.
Git’s command-line interface is hands-down the worst of any DVCS (except the archaic tla). There are inconsistencies: Some commands will expect you to type “origin/master”, while others will want “origin master”. Other commands should never ever be used, but are presented in the documentation as if they’re part of a normal usage pattern. Some commands are useless in their default form and need several command-line options to make them work right.
I ended up writing a wrapper script to cover up a lot of these flaws, which I consider an “ultimate fail” for a UI. But I’m still not sure the script is a good idea, since it may make me forget all the quirks I need to keep in mind when the script isn’t around.
Don’t change history
Two commands you should avoid: git rebase and git reset. Some of the tutorials will tell you that rebase is one of the first commands you should learn. Lies! rebase is a way to trick you into creating merge conflicts.
When you rebase, you are erasing every local commit you’ve made, and turning them into patches (as if you were back on CVS). After syncing your repository up with the remote one, your patches are re-applied one by one, each as a brand-new commit. Presto! You’ve changed history.
The only reason I can think of for doing this is if you’re not comfortable doing merges. But DVCS is all about merges, so you should just get used to doing them. A merge provides a little signpost to everyone else about your branch. Don’t fear the merge – love it! It records exactly how your local work should be rectified with remote changes, without requiring you to keep tweaking your patch.
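To see the difference for yourself, here is a minimal sketch in a throwaway repository. The branch names (work, work-merged), the file names, and the commit_file helper are made up for the demo. It merges one copy of a branch and rebases the other, then compares commit IDs.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # same branch name on any git version
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# commit_file <file> <message>: hypothetical helper, one new file per commit.
commit_file() {
  echo "$1" > "$1"
  git add "$1"
  git commit -q -m "$2"
}

commit_file base.txt "base"
git checkout -q -b work
commit_file mine.txt "my local commit"
before=$(git rev-parse work)

git checkout -q master
commit_file theirs.txt "someone else's commit"

# Option 1: merge, on a copy of the branch. My commit is kept as-is.
git checkout -q -b work-merged work
git merge -q -m "Merge master into work" master
kept=$(git rev-parse 'work-merged^1')

# Option 2: rebase. My commit is re-applied as a new commit.
git checkout -q work
git rebase -q master
after=$(git rev-parse work)

echo "original commit:       $before"
echo "after merge (parent):  $kept"
echo "after rebase:          $after"
```

The first two IDs match and the third does not: the merge left the commit alone, and the rebase replaced it with a different one.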
reset is even worse. It erases commits from your history, which will very likely make your local repository different from everyone else’s, and guarantee future conflicts or even an inability to push. Some people will say you should learn reset so you can use it in a panic situation, but if you’re panicking, you’re more likely to make things worse, so stop. Calm down. You have time to think and solve the problem in a rational way.
My problem with these two commands is that they violate a core philosophy of DVCS: Everyone has their own view of the repository, but these views obey entropy and flow in only one time direction. When they meet, they merge. Doing a rebase or reset goes back in time and changes the past. They should be in a separate tool, like “git-fix” or “git-hack”.
The story matters more than the chronology
Have you ever read a history book that said “In 1812, the British empire shelled the tiny new American capital. Meanwhile, Napoleon marched across Europe. In China, …”? Hopefully not, that would suck. Telling a thread of the story from beginning to end is more important than placing every single event in its exact chronological order. The default format for git log reorders commits by their exact date and time, so you need to be aware of that and not get confused.
Say, for example, you made a local branch, and made 3 local commits: L1, L2, and L3. Meanwhile, someone else is working on a different feature on their branch, and does commits R1 and R2. After you merge (M1), git is likely to show you a history like this:
M1 -> R2 -> L3 -> L2 -> R1 -> L1
Huh? What? Why are my local commits intermixed with my co-worker’s commits? The merge must have messed up! Crap! Time to git reset and destroy everything, right? No! Stop! Don’t do it. It’s a trap! Git is lying by omission – it’s telling the literal, actual truth, but it’s telling it to you in a way that makes it confusing. Git is re-ordering the history to make sure every commit is shown in its actual time order, not the story order.
You should probably just go ahead and alias log to:
git log --topo-order --decorate
That tells git to show things in “topological” (story) order, and to also mark where various branches are sitting. I usually find it useful to take that one step further:
git log --topo-order --decorate --first-parent
That tells git to show things in story order and to tell that story from my point of view. It’s sometimes interesting to see every commit that one of your coworkers did in their branch, but often you just want to see the merge-commit and move on. The --first-parent option tells git to skip over the details of every branch that isn’t a linear parent of yours. Generally this means you’ll see a simplified history of what’s been going on, without the intricacies of what happened on forked branches while they were forked off.
If you want to see all the threads of history intertwined, I suggest using a graphical tool like gitk instead of git log.
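The L1/R1 example can be reproduced in a scratch repository, which is a decent way to convince yourself that nothing is broken. This is only a sketch: the branch names (mine, theirs), the fixed timestamps, the commit_at and one_line helpers, and the alias name story are all invented for the demo. One catch when setting up the alias: git ignores aliases that shadow its own commands, so a git alias literally named log has no effect. You need a new name (or a shell alias).

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# commit_at <label> <unix-time>: empty commit with a fixed timestamp.
commit_at() {
  GIT_AUTHOR_DATE="$2 +0000" GIT_COMMITTER_DATE="$2 +0000" \
    git commit -q --allow-empty -m "$1"
}

# one_line: join the commit subjects onto a single line.
one_line() { paste -sd ' ' -; }

commit_at base 1700000000
git branch theirs                 # the co-worker's branch starts here too
git checkout -q -b mine
commit_at L1 1700000100
git checkout -q theirs
commit_at R1 1700000200
git checkout -q mine
commit_at L2 1700000300
commit_at L3 1700000400
git checkout -q theirs
commit_at R2 1700000500
git checkout -q mine
GIT_AUTHOR_DATE="1700000600 +0000" GIT_COMMITTER_DATE="1700000600 +0000" \
  git merge -q -m M1 theirs

by_date=$(git log --format=%s | one_line)
by_story=$(git log --topo-order --format=%s | one_line)
mine_only=$(git log --topo-order --first-parent --format=%s | one_line)

echo "default:        $by_date"
echo "--topo-order:   $by_story"
echo "--first-parent: $mine_only"

# Repo-local alias under a new name; add --global to have it everywhere.
git config alias.story "log --topo-order --decorate --first-parent"
git story --format=%s | one_line
```

The default order comes out as M1 R2 L3 L2 R1 L1 base, and the --first-parent view as M1 L3 L2 L1 base. With --topo-order each branch stays in one contiguous run, though git doesn’t promise which of the two branches it lists first.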
Don’t fast-forward – live every moment
This one is pretty confusing. And it sucks, because this concept doesn’t even exist in other DVCS. I think it’s another symptom of “fear of merge”. Basically, sometimes when you ask git to merge branch A into branch B, it will decide that it doesn’t want to merge and it will instead turn A and B into clones of each other.
For example, let’s pretend you made a branch of “master” called “feature” and did a few commits on it, and are now ready to merge it back into master. If no other work has happened on the master branch, git will try to out-clever you. It thinks: “Well, nobody else has worked on the master branch, so I could just make the feature branch become the new master branch and that would be logically equivalent.” So after the merge, you’ll see every single commit you made, as if you had done them directly on the master branch. Git has cloned your feature branch into the new master branch.
This might not be so bad if there are only a couple of people working on the project, but there are a few side effects: Your branch has effectively vanished from history. There is no longer any indication that you were working on a side branch; it looks like you were working directly on master. And if it turns out that there were bugs in your new feature (which, you know, sometimes happens), you can’t reverse the merge-commit because there is no merge-commit. You will have to reverse every single commit you made, in reverse order, or worse.
So really, you want git to always create a merge-commit when you do a merge. For this, you have to ask it nicely:
git merge --no-ff
(Git calls the history erasing “fast-forwarding”.)
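Here is a small sketch of why the merge-commit is worth having, again in a throwaway repository with made-up file names and a hypothetical commit_file helper. It merges a two-commit feature branch with --no-ff, checks that the result really has two parents, and then backs the whole feature out with a single revert.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# commit_file <file> <message>: hypothetical helper, one new file per commit.
commit_file() {
  echo "$1" > "$1"
  git add "$1"
  git commit -q -m "$2"
}

commit_file base.txt "base"
git checkout -q -b feature
commit_file one.txt "feature: part 1"
commit_file two.txt "feature: part 2"

# Nothing else happened on master, so a plain merge would fast-forward.
git checkout -q master
git merge -q --no-ff -m "Merge branch 'feature'" feature

# rev-list prints the commit followed by its parents: three words for a merge.
words=$(git rev-list --parents -n 1 HEAD | wc -w)
parent_count=$((words - 1))
echo "parents of the merge commit: $parent_count"

# Undo the entire feature in one step, keeping master's side (parent 1).
git revert --no-edit -m 1 HEAD >/dev/null
ls
```

After the revert only base.txt is left in the working tree. If you’d rather not type the flag every time, setting merge.ff to false in your git config makes git merge behave this way by default.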
A few other things
To remove a branch from a remote repository after it’s been merged and deployed, you have to push the branch with a colon in front of the branch name. This has become a running joke in the office: “Colon means delete.” Look, don’t ask me, I’m not Linus. That’s just how it works.
git push origin :stale_completed_branch
When other people remove branches, they won’t be removed from your local copy of the repository. To take care of this housekeeping, you need to express a fruit preference:
git remote prune origin
Again, don’t ask. I don’t know why. That’s just how it is.
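Both housekeeping steps can be tried safely against a local stand-in for the remote. In this sketch the bare repository origin.git and the two clones (mine, theirs) are invented for the demo; only the branch name comes from the example above.

```shell
set -e
top=$(mktemp -d)
cd "$top"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A bare repository plays the part of the shared remote.
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

git clone -q origin.git mine 2>/dev/null   # hides the "empty repository" warning
cd mine
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "base"
git branch stale_completed_branch
git push -q origin master stale_completed_branch
cd ..

# A co-worker clones while the branch still exists on the remote.
git clone -q origin.git theirs

# "Colon means delete." Newer versions of git also accept the spelled-out form:
#   git push origin --delete stale_completed_branch
git -C mine push -q origin :stale_completed_branch

# The co-worker's clone still remembers the dead branch until it is pruned.
before_prune=$(git -C theirs branch -r | grep -c stale_completed_branch || true)
git -C theirs remote prune origin >/dev/null
after_prune=$(git -C theirs branch -r | grep -c stale_completed_branch || true)

echo "stale remote branches before prune: $before_prune, after: $after_prune"
```

If you fetch regularly anyway, git fetch --prune does the fetch and the pruning in one go.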
I have a few ranty topics on how git is implemented and used, and how that compares with the older DVCS (especially bazaar), but I’ll save that for some other time. If you’re using git, hopefully this information is useful.