Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


version control - Git and Mercurial - Compare and Contrast

For a while now I've been using Subversion for my personal projects.

More and more I keep hearing great things about Git and Mercurial, and DVCS in general.

I'd like to give the whole DVCS thing a whirl, but I'm not too familiar with either option.

What are some of the differences between Mercurial and Git?

Note: I'm not trying to find out which one is "best" or even which one I should start with. I'm mainly looking for key areas where they are similar, and where they are different, because I am interested to know how they differ in terms of implementation and philosophy.


1 Answer


Disclaimer: I use Git, follow Git development on the Git mailing list, and even contribute a bit to Git (mainly gitweb). I know Mercurial from its documentation and from some discussion on the #revctrl IRC channel on FreeNode.

Thanks to all the people on the #mercurial IRC channel who provided help about Mercurial for this writeup.



Summary

Here it would be nice to have some syntax for table, something like in PHPMarkdown / MultiMarkdown / Maruku extension of Markdown

  • Repository structure: Mercurial doesn't allow octopus merges (with more than two parents), nor tagging non-commit objects.
  • Tags: Mercurial uses versioned .hgtags file with special rules for per-repository tags, and has also support for local tags in .hg/localtags; in Git tags are refs residing in refs/tags/ namespace, and by default are autofollowed on fetching and require explicit pushing.
  • Branches: In Mercurial basic workflow is based on anonymous heads; Git uses lightweight named branches, and has special kind of branches (remote-tracking branches) that follow branches in remote repository.
  • Revision naming and ranges: Mercurial provides revision numbers, local to repository, and bases relative revisions (counting from tip, i.e. current branch) and revision ranges on this local numbering; Git provides a way to refer to revision relative to branch tip, and revision ranges are topological (based on graph of revisions)
  • Mercurial uses rename tracking, while Git uses rename detection to deal with file renames
  • Network: Mercurial supports SSH and HTTP "smart" protocols, and static HTTP protocol; modern Git supports SSH, HTTP and GIT "smart" protocols, and HTTP(S) "dumb" protocol. Both have support for bundle files for off-line transport.
  • Mercurial uses extensions (plugins) and established API; Git has scriptability and established formats.
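The difference between the two kinds of revision range in the summary can be illustrated with a small sketch. This is only a toy model: the commit names, the `PARENTS` graph and the helper functions are made up for illustration, and neither tool is implemented this way.

```python
# Toy history: each commit maps to its list of parents.
# Commit names and the shape of the graph are illustrative only.
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["b"],        # main line: a - b - c
    "x": ["a"],        # side line branching off a
    "y": ["x"],
    "m": ["c", "y"],   # merge of both lines
}

def ancestors(commit, parents=PARENTS):
    """Return the commit itself plus everything reachable via parent links."""
    seen = set()
    stack = [commit]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen

def topo_range(exclude, include, parents=PARENTS):
    """Topological range: reachable from `include` but not from `exclude`."""
    return ancestors(include, parents) - ancestors(exclude, parents)

def numeric_range(local_order, start, end):
    """Numbering-based range: a slice of one repository's local commit order."""
    return set(local_order[start:end + 1])
```

With this graph, `topo_range("c", "m")` contains only the side line and the merge, while a numeric range depends on the order in which commits happened to arrive in one particular repository.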

There are a few things that distinguish Mercurial from Git, but there are other things that make them similar. Both projects borrow ideas from each other. For example, the hg bisect command in Mercurial (formerly the bisect extension) was inspired by the git bisect command in Git, while the idea of git bundle was inspired by hg bundle.

Repository structure, storing revisions

In Git there are four types of objects in its object database: blob objects, which contain the contents of a file; hierarchical tree objects, which store directory structure, including file names and relevant parts of file permissions (executable permission for files, being a symbolic link); commit objects, which contain authorship info, a pointer to the snapshot of the state of the repository at the revision represented by the commit (via the tree object of the top directory of the project) and references to zero or more parent commits; and tag objects, which reference other objects and can be signed using PGP / GPG.
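As a rough illustration of what "object database" means here: in classic SHA-1 repositories an object's name is the SHA-1 of a short header (`<type> <size>` followed by a NUL byte) plus the object's content. The function name below is made up for this sketch, and repositories using the newer SHA-256 object format would hash differently.

```python
import hashlib

def git_object_id(obj_type: str, payload: bytes) -> str:
    """Compute the SHA-1 object name for the given object type and raw content."""
    # Header is "<type> <size in bytes>" terminated by a NUL byte.
    header = f"{obj_type} {len(payload)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + payload).hexdigest()

# The same content always gets the same name, wherever it appears in history.
print(git_object_id("blob", b"hello world\n"))
```

The printed value should match what `git hash-object` reports for a file with the same content.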

Git uses two ways of storing objects: loose format, where each object is stored in a separate file (those files are written once, and never modified), and packed format, where many objects are stored delta-compressed in a single file. Atomicity of operations is provided by the fact that a reference to a new object is written (atomically, using the create + rename trick) after writing the object.
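The "create + rename trick" is a general technique, not something specific to Git. A minimal sketch, assuming a POSIX-like filesystem where renaming within one directory is atomic (the `atomic_write` helper is hypothetical and is not Git's own code):

```python
import os
import tempfile

def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` so readers see either the old or the new content, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename
    # does not cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # The rename is the single step that publishes the new content.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

An interrupted write leaves at most a stray temporary file; the published file is never half-written.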

Git repositories require periodic maintenance using git gc (to reduce disk space and improve performance), although nowadays Git does that automatically. (This method provides better compression of repositories.)

Mercurial (as far as I understand it) stores the history of a file in a filelog (together, I think, with extra metadata like rename tracking, and some helper information); it uses a flat structure called the manifest to store directory structure, and a structure called the changelog which stores information about changesets (revisions), including the commit message and zero, one or two parents.

Mercurial uses a transaction journal to provide atomicity of operations, and relies on truncating files to clean up after a failed or interrupted operation. Revlogs are append-only.

Looking at repository structure in Git versus in Mercurial, one can see that Git is more like an object database (or a content-addressed filesystem), and Mercurial more like a traditional fixed-field relational database.
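The idea of combining append-only files with a journal and truncation can be sketched as follows. This is a simplified model with made-up names; it keeps the journal in memory, whereas a real implementation would have to persist it, and it does not reflect Mercurial's actual file formats.

```python
import os

class AppendOnlyTransaction:
    """Record each file's size before appending, so a rollback can truncate."""

    def __init__(self):
        self.journal = []  # list of (path, size_before_append)

    def append(self, path, data):
        size_before = os.path.getsize(path) if os.path.exists(path) else 0
        # Journal the old size first, then append the new data.
        self.journal.append((path, size_before))
        with open(path, "ab") as f:
            f.write(data)

    def commit(self):
        # Nothing to undo any more: the appended data is now permanent.
        self.journal.clear()

    def rollback(self):
        # Undo in reverse order by cutting each file back to its old size.
        for path, size_before in reversed(self.journal):
            with open(path, "r+b") as f:
                f.truncate(size_before)
        self.journal.clear()
```

Because data is only ever appended, restoring the previous state needs no copy of the old content, only the old length.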

Differences:
In Git the tree objects form a hierarchical structure; in Mercurial the manifest file is a flat structure. In Git a blob object stores one version of the contents of a file; in Mercurial a filelog stores the whole history of a single file (if we do not take into account here any complications with renames). This means that there are different areas of operations where Git would be faster than Mercurial, all other things considered equal (like merges, or showing history of a project), and areas where Mercurial would be faster than Git (like applying patches, or showing history of a single file). This issue might not be important for the end user.
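To make the "flat versus hierarchical" distinction concrete, here is a small sketch that turns a flat path-to-id mapping into nested per-directory mappings. The data layout and the `manifest_to_tree` name are illustrative assumptions, not either tool's on-disk format.

```python
def manifest_to_tree(manifest):
    """Convert a flat {path: file_id} mapping into nested per-directory dicts."""
    tree = {}
    for path, file_id in manifest.items():
        *dirs, name = path.split("/")
        node = tree
        for d in dirs:
            # Descend into the subdirectory, creating it on first use.
            node = node.setdefault(d, {})
        node[name] = file_id
    return tree

flat = {
    "README": "f1",
    "src/main.c": "f2",
    "src/lib/util.c": "f3",
}
print(manifest_to_tree(flat))
```

In the nested form an unchanged subdirectory can be shared between revisions as a whole, while the flat form lists every file of the revision in one place.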

Because of the fixed-record structure of Mercurial's changelog structure, commits in Mercurial can have only up to two parents; commits in Git can have more than two parents (so called "octopus merge"). While you can (in theory) replace octopus merge by a series of two-parent merges, this might cause complications when converting between Mercurial and Git repositories.
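Replacing an octopus merge by a series of two-parent merges can be sketched like this. The function and the intermediate commit names (`merge-step1` and so on) are hypothetical; a real converter would also have to decide what tree and metadata each intermediate commit gets, which is where the complications mentioned above come from.

```python
def split_octopus(parents, merge_name="merge"):
    """Rewrite one N-parent merge as a chain of two-parent merges.

    Returns a list of (commit_name, [parent, parent]) pairs in creation order.
    """
    if len(parents) <= 2:
        # Already representable with at most two parents.
        return [(merge_name, list(parents))]
    commits = []
    current = parents[0]
    for i, parent in enumerate(parents[1:], start=1):
        is_last = i == len(parents) - 1
        # Only the final merge keeps the original name.
        name = merge_name if is_last else f"{merge_name}-step{i}"
        commits.append((name, [current, parent]))
        current = name
    return commits

print(split_octopus(["a", "b", "c", "d"]))
```

The conversion adds commits that did not exist in the original history, so a round trip between the two systems does not give back an identical graph.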

As far as I know Mercurial doesn't have an equivalent of annotated tags (tag objects) from Git. A special case of annotated tags are signed tags (with PGP / GPG signature); the equivalent in Mercurial can be done using GpgExtension, which is distributed along with Mercurial. You can't tag a non-commit object in Mercurial like you can in Git, but that is not very important, I think (some Git repositories use a tagged blob to distribute the public PGP key used to verify signed tags).

References: branches and tags

In Git references (branches, remote-tracking branches and tags) reside outside DAG of commits (as they should). References in refs/heads/ namespace (local branches) point to commits, and are usually updated by "git commit"; they point to the tip (head) of branch, that's why such name. References in refs/remotes/<remotename>/ namespace (remote-tracking branches) point to commit, follow branches in remote repository <remotename>, and are updated by "git fetch" or equivalent. References in refs/tags/ namespace (tags) point usually to commits (lightweight tags) or tag objects (annotated and signed tags), and are not meant to change.

Tags

In Mercurial you can give persistent name to revision using tag; tags are stored similarly to the ignore patterns. It means that globally visible tags are stored in revision-controlled .hgtags file in your repository. That has two consequences: first, Mercurial has to use special rules for this file to get current list of all tags and to update such file (e.g. it reads the most recently committed revision of the file, not currently checked out version); second, you have to commit changes to this file to have new tag visible to other users / other repositories (as far as I understand it).
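As a rough sketch of why a versioned tags file needs special reading rules: assuming each line of the file has the form `<node id> <tag name>` and that a later line for the same name overrides an earlier one, a reader could look like this. The function name is made up, and the rules about which revisions of the file are consulted are deliberately left out.

```python
def parse_hgtags(text):
    """Parse '<node> <tag name>' lines; later entries override earlier ones."""
    tags = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first space only, so the tag name may contain spaces.
        node, name = line.split(" ", 1)
        tags[name] = node
    return tags

example = "\n".join([
    "1111111111111111111111111111111111111111 v1.0",
    "2222222222222222222222222222222222222222 v1.1",
    "3333333333333333333333333333333333333333 v1.0",  # tag moved later
])
print(parse_hgtags(example)["v1.0"])
```

The node ids above are placeholders, not real changesets.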

Mercurial also supports local tags, stored in .hg/localtags, which are not visible to others (and of course are not transferable).

In Git tags are fixed (constant) named references to other objects (usually tag objects, which in turn point to commits) stored in the refs/tags/ namespace. By default, when fetching a set of revisions, Git automatically fetches tags which point to the revisions being fetched; pushing tags has to be requested explicitly. You can also control to some extent which tags are fetched or pushed.

Git treats lightweight tags (pointing directly to commits) and annotated tags (pointing to tag objects, which contain a tag message, optionally include a PGP signature, and in turn point to a commit) slightly differently; for example, by default it considers only annotated tags when describing commits using "git describe".

Git doesn't have a strict equivalent of local tags in Mercurial. Nevertheless Git best practices recommend setting up a separate public bare repository, into which you push ready changes, and from which others clone and fetch. This means that tags (and branches) that you don't push are private to your repository. On the other hand you can also use a namespace other than heads, remotes or tags, for example local-tags for local tags.

Personal opinion: In my opinion tags should reside outside the revision graph, as they are external to it (they are pointers into the graph of revisions). Tags should be non-versioned, but transferable. Mercurial's choice of using a mechanism similar to the one for ignoring files means that it either has to treat .hgtags specially (a file in-tree is transferable, but ordinarily it is versioned), or have tags which are local only (.hg/localtags is non-versioned, but untransferable).

Branches

In Git local branch (branch tip, or branch head) is a named reference to a commit, where one can grow new commits. Branch can also mean active line of development, i.e. all commits reachable from branch tip. Local branches reside in refs/heads/ namespace, so e.g. fully qualified name of 'master' branch is 'refs/heads/master'.

Current branch in Git (meaning checked out branch, and branch where new commit will go) is the branch which is referenced by the HEAD ref. One can have HEAD pointing directly to a commit, rather than being symbolic reference; this situation of being on an anonymous unnamed branch is called detached HEAD ("git branch" shows that you are on '(no branch)').
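The two states of HEAD can be modelled in a few lines. This sketch assumes HEAD is given as text that is either a symbolic reference of the form `ref: <ref name>` or a bare commit id; the `resolve_head` function and the `refs` dictionary are illustrative stand-ins, not Git internals.

```python
def resolve_head(head_content, refs):
    """Return (branch_ref, commit_id); branch_ref is None for a detached HEAD."""
    head_content = head_content.strip()
    if head_content.startswith("ref: "):
        # Symbolic reference: HEAD names a branch, and the branch names a commit.
        ref_name = head_content[len("ref: "):]
        return ref_name, refs.get(ref_name)
    # Detached HEAD: the content is the commit id itself.
    return None, head_content

refs = {"refs/heads/master": "abc123"}
print(resolve_head("ref: refs/heads/master\n", refs))
print(resolve_head("abc123\n", refs))
```

In the first case a new commit would move `refs/heads/master`; in the second there is no branch to move, which is why the state counts as being on an anonymous branch.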

In Mercurial there are anonymous branches (branch heads), and one can use bookmarks (via bookmark extension). Such bookmark branches are purely local, and those names were (up to version 1.6) not transferable using Mercurial. You can use rsync or scp to copy the .hg/bookmarks file to a remote repository. You can also use hg id -r <bookmark> <url> to get the revision id of a current tip of a bookmark.

Since 1.6 bookmarks can be pushed/pulled. The BookmarksExtension page has a section on Working With Remote Repositories. There is a difference in that in Mercurial bookmark names are global, while definition of 'remote' in Git describes also mapping of branch names from

