vcssync: support for linearizing a Git repo (bug 1322769); r?glob draft
author Gregory Szorc <gps@mozilla.com>
Tue, 24 Jan 2017 09:49:44 -0800
changeset 10222 c7abbb778086bbf0ec13202850e7641c6a1db563
parent 10219 bf11cd292483c3aa7bcfc4ce546fdcd86b122748
child 10223 eccdcb581828bfcea59f9affa3a1ec44f4cc4fd3
push id 1480
push user bmo:gps@mozilla.com
push date Wed, 25 Jan 2017 19:14:03 +0000
reviewers glob
bugs 1322769
vcssync: support for linearizing a Git repo (bug 1322769); r?glob

At Mozilla, we like linear history. It is simple and easy to reason about. Contrast this with history containing lots of merges, which can look like a spiderweb in graphical form.

Unfortunately, the world (and arguably many tools) does not share our rather strong opinion that linear history is preferred. Git and GitHub are rife with non-linear history. Standard Git workflows rely heavily on merge commits and shy away from history rewriting. GitHub reinforces this with its default "Merge Pull Request" workflow, which takes commits on a branch and merges them into the destination repository. It wasn't until a few months ago that GitHub added a mechanism to rebase (not merge) commits to facilitate a linear history. By then, the damage was done: nearly every repository active on GitHub contains an ugly cobweb of history with merge commits galore.

We want to import the Servo project into mozilla-central with history. The Servo Git repository relies on merge commits. In fairness, its history is *very* clean compared to most Git[Hub] projects. Pretty much every commit in the first-parent ancestry of the repository is a merge commit for a pull request, and the non-first-parent ancestry has very few merge commits. But as relatively clean as the history is, it still isn't linear.

Furthermore, importing the non-first-parent ancestry of Servo into mozilla-central would violate mozilla-central's goal of being commit-level bisectable, meaning that every commit in the repo can be built and tested. Unfortunately, the prevalent Git[Hub] commit authoring model doesn't necessarily uphold this. There are often tons of e.g. "fixup" commits. I like to call these "how the sausage is made" commits. These break commit-level bisection and provide little value in the context of mozilla-central.

With that bit of history lesson out of the way, this commit introduces functionality for "linearizing" a Git repository. You give it a path to a Git repo and a ref, and it iterates through the first-parent ancestry of that ref and rewrites commits so they are not merges. It stores the rewritten commits alongside the originals, tracked by a different ref.

As part of supporting the conversion of the Servo repository, this rewriting also has features that will facilitate import of the commits into mozilla-central with minimal additional rewriting. These features include:

* Ability to exclude directories from history (Servo has 100,000+ WPT files which we don't wish to import... yet).
* Ability to prefix the commit message with a string.
* Ability to annotate the commit message with the original source repo location and commit.
* Ability to rewrite Reviewable.io Markdown boilerplate into something more concise.

The code was written in support of vendoring Servo. But it can be used against *any* Git repository and mostly "just works." There are parts of the code that could likely be abstracted or extended a bit better, such as commit message rewriting (that Reviewable.io rewriting has no place in a generic function IMO).

A lot of work went into making this code fast because tweaking things and waiting several minutes to see results was extremely frustrating. Initial attempts at using `git filter-branch` yielded 4+ hour execution times. I got this down to ~2 hours by optimizing as much as I could. After profiling and realizing most of the time was spent in Git index operations, I bailed on the approach and reimplemented the rewriting in Python using Dulwich. The epic is described in /docs/vcssync.rst. The tl;dr is the Python+Dulwich approach takes ~10s instead of ~2 hours. (Yes, you read that correctly.)

MozReview-Commit-ID: z7lwEGXYfv
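The first-parent walk and single-parent rewrite described above can be sketched without Git at all. This is a simplified model, not the code in this changeset: `Commit`, `first_parent_chain`, `linearize` and the `lin-` ID scheme are hypothetical names invented for illustration, and a real implementation would hash actual commit objects (here via Dulwich) instead of relabeling IDs.

```python
from collections import namedtuple

# Hypothetical in-memory stand-in for a Git commit; not Dulwich's Commit class.
Commit = namedtuple("Commit", ["id", "parents", "message"])


def first_parent_chain(commits, head, stop_at=None):
    """Return commits reachable from ``head`` via first parents, oldest first.

    The walk ends at a root commit, or when ``stop_at`` (the last commit
    converted by an earlier run) is reached.
    """
    chain = []
    commit_id = head
    while commit_id is not None and commit_id != stop_at:
        commit = commits[commit_id]
        chain.append(commit)
        # Only the first parent is followed; other parents are ignored.
        commit_id = commit.parents[0] if commit.parents else None
    chain.reverse()
    return chain


def linearize(commits, head, prefix=""):
    """Rewrite the first-parent chain so each commit has at most one parent."""
    rewritten = []
    previous = None
    for commit in first_parent_chain(commits, head):
        new_id = "lin-" + commit.id  # assumption: real Git would hash the new object
        parents = [previous] if previous else []
        rewritten.append(Commit(new_id, parents, prefix + commit.message))
        previous = new_id
    return rewritten


# Toy history: "m" merges the feature commit "f" into the mainline after "a".
history = {
    "a": Commit("a", [], "initial"),
    "f": Commit("f", ["a"], "fixup"),
    "m": Commit("m", ["a", "f"], "Merge pull request #1"),
}
linear = linearize(history, "m", prefix="servo: ")
for c in linear:
    print(c.id, c.parents, c.message)
```

In this toy history the "fixup" commit sits only on the second-parent side of the merge, so it never appears in the rewritten chain. Passing `stop_at` to `first_parent_chain` mirrors the incremental mode, where only commits newer than the last converted one are visited.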
test-requirements.txt
vcssync/mozvcssync/cli.py
vcssync/mozvcssync/gitrewrite/__init__.py
vcssync/mozvcssync/gitrewrite/linearize.py
vcssync/setup.py
vcssync/tests/test-linearize-git-author-map.t
vcssync/tests/test-linearize-git-basic.t
vcssync/tests/test-linearize-git-committer.t
vcssync/tests/test-linearize-git-exclude-dirs.t
vcssync/tests/test-linearize-git-message-rewrite.t
vcssync/tests/test-linearize-git-multiple-roots.t
vcssync/tests/test-linearize-git-not-fast-forward.t
vcssync/tests/test-linearize-git-p2-author.t
vcssync/tests/test-linearize-git-record-original-commit.t
vcssync/tests/test-linearize-git-reflog.t
--- a/test-requirements.txt
+++ b/test-requirements.txt
@@ -58,16 +58,19 @@ django-storages==1.4.1 \
     --hash=sha256:0ad7049caa7148b846906a7e114e5d245dba714a7a1ef895150234ae25788c46
 
 docker-py==1.10.6 \
     --hash=sha256:35b506e95861914fa5ad57a6707e3217b4082843b883be246190f57013948aba
 
 docker-pycreds==0.2.1 \
     --hash=sha256:58d2688f92de5d6f1a6ac4fe25da461232f0e0a4c1212b93b256b046b2d714a9
 
+dulwich==0.16.1 \
+    --hash=sha256:470d0feec9d4e7aba091c02f62db7f9cc6549ffe3f623a8039f96f584159da05
+
 enum34==1.1.1 \
     --hash=sha256:9d4a9220e4ebabd7ff60d853e69c3dd89debad5ddeb9ac5e768af811ece7708e
 
 factory-boy==2.7.0 \
     --hash=sha256:36c949d5c7adefb02d25323b7a5a97dc698e58ef84c4654845ecb2e34bee9a23
 
 fake-factory==0.7.2 \
     --hash=sha256:62a9b211c1eea951f63c992de305c31977768f042210df443885444683528173
@@ -75,16 +78,19 @@ fake-factory==0.7.2 \
 flake8==2.6.2 \
     --hash=sha256:7ac3bbaac27174d95bc4734ed23a07de567ffbcf4fc7e316854b4f3015d4fd15
 
 feedparser==5.1.3 \
     --hash=sha256:7f6507d400d07edfd1ea8205da36808009b0c539f5b8a6e0ab54337b955e6dc3 \
     --hash=sha256:ad543639e89d43685e2f1d3b6e48711562eec3be379e6958a920fbeaf4c63bce \
     --hash=sha256:a49ec89ebdb4234de473ad36792bf8da3a8640b8a263afda2eac510ff4908c92
 
+github3.py==0.9.6 \
+    --hash=sha256:650d31dbc3f3290ea56b18cfd0e72e00bbbd6777436578865a7e45b496f09e4c
+
 jsmin==2.1.1 \
     --hash=sha256:582f70f5fef561c8d561271206f45258d0c420eec31a8628914e87c73a2192e1
 
 idna==2.0 \
     --hash=sha256:9b2fc50bd3c4ba306b9651b69411ef22026d4d8335b93afc2214cef1246ce707
 
 ipaddress==1.0.16 \
     --hash=sha256:935712800ce4760701d89ad677666cd52691fd2f6f0b340c8b4239a3c17988a5
@@ -196,13 +202,19 @@ snowballstemmer==1.2.0 \
     --hash=sha256:6d54f350e7a0e48903a4e3b6b2cabd1b43e23765fbc975065402893692954191
 
 Sphinx==1.3.3 \
     --hash=sha256:3ad4cb89ab4baa5f9bb99548a9a1fb5127b3ee83a5213b83da4de7578afdc891
 
 sphinx-rtd-theme==0.1.9 \
     --hash=sha256:3c38d037713bd78043486eea5bf771d71ed697ec25c09e16f49e44887f7fe184
 
+uritemplate==3.0.0 \
+    --hash=sha256:1b9c467a940ce9fb9f50df819e8ddd14696f89b9a8cc87ac77952ba416e0a8fd
+
+uritemplate.py==3.0.2 \
+    --hash=sha256:a0c459569e80678c473175666e0d1b3af5bc9a13f84463ec74f808f3dd12ca47
+
 websocket-client==0.37.0 \
     --hash=sha256:678b246d816b94018af5297e72915160e2feb042e0cde1a9397f502ac3a52f41
 
 Whoosh==2.6.0 \
     --hash=sha256:7de7bc4d00a6d051dbb360b48eb7f3cd002373d87252fb0b284a3c9c453a7677
new file mode 100644
--- /dev/null
+++ b/vcssync/mozvcssync/cli.py
@@ -0,0 +1,112 @@
+# This Source Code Form is subject to the terms of the Mozilla Public
+# License, v. 2.0. If a copy of the MPL was not distributed with this
+# file, You can obtain one at http://mozilla.org/MPL/2.0/.
+
+from __future__ import absolute_import, unicode_literals
+
+import argparse
+import logging
+import sys
+
+from .gitrewrite import (
+    RewriteError,
+)
+from .gitrewrite.linearize import (
+    linearize_git_repo,
+)
+
+
+logger = logging.getLogger(__name__)
+
+
+LINEARIZE_GIT_ARGS = [
+    (('--exclude-dir',), dict(action='append', dest='exclude_dirs',
+                              help='Directory to exclude from rewritten '
+                                   'history')),
+    (('--summary-prefix',), dict(help='String to prefix commit message '
+                                      'summary line with')),
+    (('--reviewable-key',), dict(help='Commit message key to replace '
+                                      'Reviewable Markdown blocks with')),
+    (('--remove-reviewable',), dict(action='store_true',
+                                    help='Remove Reviewable.io Markdown blocks')),
+    (('--source-repo-key',), dict(help='Commit message key that source '
+                                       'repository should be recorded under')),
+    (('--source-revision-key',), dict(help='Commit message key that original '
+                                           'source revision should be stored '
+                                           'under')),
+    (('--normalize-github-merge-message',), dict(
+        action='store_true',
+        help='Rewrite commit messages for GitHub pull request merges '
+             'to be more sensible for linearized repos')),
+    (('--committer-action',), dict(choices={'keep', 'use-author',
+                                             'use-committer'},
+                                    help='What to do with committer field in '
+                                         'Git commits')),
+    (('--author-map',), dict(help='File containing mapping of old to new '
+                                  'commit author/committer values')),
+    (('--use-p2-author',), dict(action='store_true',
+                                help='Use the author of the 2nd parent for '
+                                     'merge commits')),
+    (('--github-username',), dict(help='Username to use for GitHub API '
+                                       'requests')),
+    (('--github-token',), dict(help='GitHub API token to use for GitHub API '
+                                    'requests')),
+]
+
+
+def get_git_linearize_kwargs(args):
+    kwargs = {}
+    for k in ('exclude_dirs', 'summary_prefix', 'reviewable_key',
+              'remove_reviewable', 'source_repo_key',
+              'source_revision_key', 'normalize_github_merge_message',
+              'committer_action', 'use_p2_author',
+              'github_username', 'github_token'):
+        v = getattr(args, k)
+        if v is not None:
+            kwargs[k] = v
+
+    if args.author_map:
+        author_map = {}
+        with open(args.author_map, 'rb') as fh:
+            for line in fh:
+                line = line.strip()
+                if not line or line.startswith(b'#'):
+                    continue
+
+                old, new = line.split(b'=', 1)
+                author_map[old.strip()] = new.strip()
+
+        kwargs['author_map'] = author_map
+
+    return kwargs
+
+
+def configure_logging():
+    root = logging.getLogger()
+    handler = logging.StreamHandler(sys.stdout)
+    root.addHandler(handler)
+
+
+def linearize_git():
+    parser = argparse.ArgumentParser()
+    for args, kwargs in LINEARIZE_GIT_ARGS:
+        parser.add_argument(*args, **kwargs)
+
+    parser.add_argument('--source-repo',
+                        help='URL of repository being converted')
+    parser.add_argument('git_repo', help='Path to Git repository to linearize')
+    parser.add_argument('ref', help='ref to linearize')
+
+    args = parser.parse_args()
+
+    configure_logging()
+
+    kwargs = get_git_linearize_kwargs(args)
+
+    if args.source_repo:
+        kwargs['source_repo'] = args.source_repo
+
+    try:
+        linearize_git_repo(args.git_repo, args.ref, **kwargs)
+    except RewriteError as e:
+        logger.error('abort: %s' % str(e))
new file mode 100644
--- /dev/null
+++ b/vcssync/mozvcssync/gitrewrite/__init__.py
@@ -0,0 +1,298 @@
+# This Source Code Form is subject to the terms of the Mozilla Public
+# License, v. 2.0. If a copy of the MPL was not distributed with this
+# file, You can obtain one at http://mozilla.org/MPL/2.0/.
+
+from __future__ import absolute_import, unicode_literals
+
+import collections
+import json
+import os
+import re
+import stat
+import uuid
+
+import github3.pulls
+
+class RewriteError(Exception):
+    """Represents an error that occurred during rewriting."""
+
+
+def prune_directories(object_store, tree_id, directories):
+    """Remove directories from a tree, writing the new trees to the store.
+
+    An existing Git Tree object defined by ``tree_id`` will be examined for
+    directories in the ``directories`` iterable. If a directory exists, new tree
+    objects will be created as necessary.
+
+    The ``dulwich.objects.Tree`` instance for the rewritten tree will be
+    returned.
+    """
+    directories = set(d.strip(b'/') for d in directories)
+    assert b'' not in directories
+
+    class TreeDict(collections.defaultdict):
+        def __missing__(self, key):
+            path = b''
+            tree = self[path]
+            for d in key.split(b'/'):
+                if path:
+                    path += b'/%s' % d
+                else:
+                    path = d
+
+                new_tree = self.get(path)
+                if not new_tree:
+                    new_tree = object_store[tree[d][1]]
+                    self[path] = new_tree
+
+                tree = new_tree
+
+            return tree
+
+    # Our strategy is to iterate directories, find the tree object for its
+    # parent directory/tree, then rewrite parent tree objects, if necessary.
+    #
+    # A cache of path to Tree is maintained so lookups are fast and so we can
+    # wait until the end to write any objects.
+
+    # Maps path to possibly rewritten tree.
+    trees = TreeDict()
+    trees[b''] = object_store[tree_id]
+    dirty = set()
+
+    for directory in sorted(directories):
+        if b'/' in directory:
+            parent_path, basename = directory.rsplit(b'/', 1)
+        else:
+            parent_path, basename = b'', directory
+
+        # Parent directory doesn't exist.
+        try:
+            tree = trees[parent_path]
+        except KeyError:
+            continue
+
+        # This path doesn't exist.
+        try:
+            entry = tree[basename]
+        except KeyError:
+            continue
+
+        # Path isn't a directory.
+        if not stat.S_ISDIR(entry[0]):
+            continue
+
+        # Remove the entry and rewrite parent trees.
+        del tree[basename]
+        dirty.add(parent_path)
+
+        # Special case where we're already at the root.
+        if parent_path == b'':
+            continue
+
+        while b'/' in parent_path:
+            parent_path, basename = parent_path.rsplit(b'/', 1)
+            parent_tree = trees[parent_path]
+            parent_tree[basename] = (parent_tree[basename][0], tree.id)
+            tree = parent_tree
+            dirty.add(parent_path)
+
+        # And handle the root tree.
+        parent_tree = trees[b'']
+        parent_tree[parent_path] = (entry[0], tree.id)
+        dirty.add(b'')
+
+    for t in sorted(dirty, key=len, reverse=True):
+        object_store.add_object(trees[t])
+
+    return trees[b'']
+
+
+RE_REVIEWABLE = re.compile(r'''
+<!--\sReviewable:start\s-->
+# Fast forward to start of (URL) expression
+[^\(]+\(
+(?P<url>[^\)]+)
+\).*<!--\sReviewable:end\s-->
+''', re.DOTALL | re.VERBOSE)
+
+
+RE_GITHUB_MERGE_PR = re.compile(r'''
+^Merge\spull\srequest\s\#(?P<number>\d+)\sfrom\s(?P<where>[^\s]+)
+''', re.VERBOSE)
+
+
+RE_GITHUB_MERGE_PR2 = re.compile(r'''
+[aA]uto\smerge\sof\s
+\#(?P<number>\d+)
+\s[:-]\s
+(?P<where>[^,]+),
+\sr=(?P<reviewer>[^\s]+)
+''', re.VERBOSE)
+
+
+def rewrite_commit_message(message, summary_prefix=None, reviewable_key=None,
+                           remove_reviewable=False,
+                           normalize_github_merge=False,
+                           github_client=None,
+                           github_org=None,
+                           github_repo=None,
+                           github_cache_dir=None):
+    """Rewrite a Git commit message.
+
+    ``summary_prefix`` can prefix the summary line of the commit message
+    with a string.
+
+    ``reviewable_key`` replaces Reviewable.io Markdown with a ``<key>: <URL>``
+    string.
+
+    ``remove_reviewable`` will remove a Reviewable.io Markdown block.
+
+    ``normalize_github_merge`` will reformat the commit message generated
+    by performing a merge in GitHub. It replaces the somewhat uninformative
+    "Merge pull request #N" with the pull request title, which is extracted
+    from a subsequent line in the commit message.
+    """
+    if reviewable_key and remove_reviewable:
+        raise Exception('cannot specify both reviewable_key and remove_reviewable')
+
+    result = {}
+
+    if remove_reviewable:
+        message = RE_REVIEWABLE.sub(b'', message)
+
+    if reviewable_key:
+        # We can't plug reviewable_key into sub() because it may contain special
+        # characters. So, we replace with a unique value during substitution
+        # then do a literal replace.
+        unique = str(uuid.uuid1())
+        message = RE_REVIEWABLE.sub(b'%s: \\g<url>' % unique, message)
+        message = message.replace(unique, reviewable_key)
+
+    def get_summary():
+        for i, line in enumerate(message.splitlines()[1:]):
+            if line.strip():
+                return line, i + 1
+
+        return None, None
+
+    def get_user(login):
+        if github_cache_dir:
+            path = os.path.join(github_cache_dir, 'user-%s.json' % login)
+
+            if os.path.exists(path):
+                with open(path, 'rb') as fh:
+                    data = json.load(fh, encoding='utf-8')
+
+                return github3.users.User(data, github_client)
+
+        if github_client:
+            user = github_client.user(login)
+            if user and github_cache_dir:
+                with open(path, 'wb') as fh:
+                    json.dump(user.to_json(), fh, encoding='utf-8',
+                              indent=2, sort_keys=True)
+
+            return user
+
+        return None
+
+    def get_pull_request(number):
+        pr = None
+
+        if github_cache_dir:
+            path = os.path.join(github_cache_dir, 'pr-%s.json' % number)
+
+            if os.path.exists(path):
+                with open(path, 'rb') as fh:
+                    data = json.load(fh, encoding='utf-8')
+
+                pr = github3.pulls.PullRequest(data, github_client)
+
+        if not pr and github_client and github_org and github_repo:
+            pr = github_client.pull_request(github_org, github_repo, number)
+            if pr and github_cache_dir:
+                with open(path, 'wb') as fh:
+                    json.dump(pr.to_json(), fh, encoding='utf-8',
+                              indent=2, sort_keys=True)
+
+        user = None
+        if pr and pr.head.user:
+            user = get_user(pr.head.user.login)
+
+        return pr, user
+
+    if normalize_github_merge:
+        # This is the GitHub default message.
+        m = RE_GITHUB_MERGE_PR.match(message)
+        if m:
+            lines = message.splitlines()
+
+            pr, user = get_pull_request(m.group('number'))
+            result['pull_request'] = pr
+            result['pull_request_user'] = user
+
+            pr_summary, pr_line = get_summary()
+            if pr_summary:
+                where = b':'.join(m.group('where').rsplit(b'/', 1))
+                summary = b'Merge #%s - %s (from %s)' % (
+                    m.group('number'),
+                    pr_summary.rstrip().rstrip(b'.'),
+                    where)
+
+                message = b'\n'.join([summary] + lines[pr_line + 1:])
+
+        # This convention is used by Servo / Bors.
+        m = RE_GITHUB_MERGE_PR2.match(message)
+        if m:
+            lines = message.splitlines()
+
+            where = m.group('where')
+            pr, user = get_pull_request(m.group('number'))
+            result['pull_request'] = pr
+            result['pull_request_user'] = user
+            if pr:
+                if pr.head.label:
+                    where = pr.head.label.encode('utf-8')
+
+                title = pr.title.encode('utf-8')
+
+                summary = b'Merge #%s - %s (from %s); r=%s' % (
+                    m.group('number'),
+                    title.rstrip().rstrip(b'.'),
+                    where,
+                    m.group('reviewer'))
+
+                # Some commits have the PR title as the first non-summary line.
+                # If so, remove that line and a blank line that follows.
+                if lines[2:4] in ([title, b''], [title]):
+                    lines = lines[0:2] + lines[4:]
+
+                message = b'\n'.join([summary] + lines[1:])
+
+            # No GitHub API data. Rely on the commit message itself.
+            else:
+                pr_summary, pr_line = get_summary()
+                if pr_summary:
+                    summary = b'Merge #%s - %s (from %s); r=%s' % (
+                        m.group('number'),
+                        pr_summary.rstrip().rstrip(b'.'),
+                        m.group('where'),
+                        m.group('reviewer'))
+                    message = b'\n'.join([summary] + lines[pr_line + 1:])
+                else:
+                    summary = b'Merge #%s (from %s); r=%s' % (
+                        m.group('number'),
+                        m.group('where'),
+                        m.group('reviewer'))
+                    message = b'\n'.join([summary] + lines[1:])
+
+    if summary_prefix:
+        message = b'%s %s' % (summary_prefix, message)
+
+    message = b'%s\n' % message.rstrip()
+
+    assert isinstance(message, str)
+    result['message'] = message
+
+    return result
new file mode 100644
--- /dev/null
+++ b/vcssync/mozvcssync/gitrewrite/linearize.py
@@ -0,0 +1,313 @@
+# This Source Code Form is subject to the terms of the Mozilla Public
+# License, v. 2.0. If a copy of the MPL was not distributed with this
+# file, You can obtain one at http://mozilla.org/MPL/2.0/.
+
+"""Functionality for "linearizing" a Git repository.
+
+This takes a repository with merge commits and rewrites commits to remove
+the merges, yielding a clean, linear history.
+"""
+
+from __future__ import absolute_import, print_function, unicode_literals
+
+import logging
+import os
+import re
+import subprocess
+
+import dulwich.repo
+import github3
+
+from . import (
+    prune_directories,
+    rewrite_commit_message,
+    RewriteError,
+)
+
+
+logger = logging.getLogger(__name__)
+
+
+def linearize_git_repo(git_repo, ref, exclude_dirs=None,
+                       summary_prefix=None,
+                       reviewable_key=None,
+                       remove_reviewable=False,
+                       normalize_github_merge_message=False,
+                       source_repo_key=None, source_repo=None,
+                       source_revision_key=None,
+                       committer_action='keep',
+                       author_map=None,
+                       use_p2_author=False,
+                       github_username=None, github_token=None):
+    """Linearize a ref in a Git repository.
+
+    The commits in the ref will be rewritten to only include first parent
+    ancestry. i.e. all merge commits will be removed. All commits from the
+    non-first-parent ancestry will be dropped.
+
+    As a side-effect of conversion, the refs ``refs/convert/source/<ref>`` and
+    ``refs/convert/dest/<ref>`` will be written containing pointers to the
+    last converted commit (identical to ``ref``) and its new, converted
+    commit, respectively. Reflog entries will be written to indicate movement
+    of these refs.
+
+    The original ``ref`` is untouched.
+
+    Subsequent invocations of this function will perform an incremental
+    conversion and only convert commits introduced since the last conversion.
+
+    ``exclude_dirs`` is an iterable of directories to exclude from history.
+
+    ``summary_prefix`` allows prefixing the summary line of the commit message
+    with a string. A space separates the prefix from the original message.
+
+    ``reviewable_key`` if set will replace Reviewable.io Markdown in the
+    commit message with a string of the form ``<reviewable_key>: <URL>``.
+
+    ``remove_reviewable`` will remove Reviewable.io Markdown in the commit
+    message.
+
+    ``source_repo_key`` and ``source_repo`` rewrite the commit message to
+    contain metadata listing the source repository in the form
+    ``<source_repo_key>: <source_repo>``. ``source_repo`` should presumably
+    be a URL.
+
+    ``source_revision_key`` if specified will rewrite the commit message
+    to contain a line of the form ``<source_revision_key>: COMMIT`` where
+    ``COMMIT`` is the original Git commit ID.
+
+    ``committer_action`` specifies how to handle the ``committer`` field in
+    the Git commit object. Possible values are ``keep`` (the default) to
+    not modify the field, ``use-author`` to copy the ``author`` field to the
+    ``committer`` field, or ``use-committer`` to copy the ``committer`` field
+    to the ``author`` field.
+
+    ``author_map`` is a dict mapping old author/committer values to new
+    ones.
+
+    ``use_p2_author`` indicates whether to use the author of the 2nd parent
+    on merge commits. By default, the author of the merge commit is used.
+
+    Returns a dict describing what rewrites were performed. The dict has the
+    following keys:
+
+    source_commit
+        Git commit hash corresponding to ``ref``.
+    dest_commit
+        The converted commit hash corresponding to converted ``ref``.
+    commit_map
+        Dict mapping old commit IDs to converted commit IDs. Only contains
+        commits that were converted as part of this evaluation.
+    source_ref
+        The ref holding the original commit that was last converted (points
+        to ``source_commit``).
+    dest_ref
+        The ref holding the converted commit that was last converted (points
+        to ``dest_commit``).
+    """
+    if committer_action not in ('keep', 'use-author', 'use-committer'):
+        raise ValueError('committer_action must be one of keep, use-author, '
+                         'or use-committer')
+
+    author_map = author_map or {}
+
+    repo = dulwich.repo.Repo(git_repo)
+    head = repo[b'refs/%s' % ref].id
+
+    # Look for state from previous conversion.
+    source_ref = b'refs/convert/source/%s' % ref
+    dest_ref = b'refs/convert/dest/%s' % ref
+
+    if source_ref in repo.refs and dest_ref not in repo.refs:
+        raise Exception('convert source ref without dest ref %s' % dest_ref)
+    if dest_ref in repo.refs and source_ref not in repo.refs:
+        raise Exception('convert dest ref without source ref %s' % source_ref)
+
+    try:
+        source_commit_id = repo[source_ref].id
+    except KeyError:
+        source_commit_id = None
+
+    try:
+        dest_commit_id = repo[dest_ref].id
+    except KeyError:
+        dest_commit_id = None
+
+    result = {
+        'source_commit': head,
+        'dest_commit': dest_commit_id,
+        'commit_map': {},
+        'source_ref': source_ref,
+        'dest_ref': dest_ref,
+    }
+
+    # Walk the p1 ancestry to find commits to convert, stopping when we found
+    # the commit that was converted last. On first run, this will walk all the
+    # way to a root commit.
+    #
+    # This walk also verifies the last converted commit is in the ancestry.
+    # If it isn't, a force push / reset has occurred. While we could support
+    # non-fast-forward conversions, we choose not to at this time.
+    source_commits = []
+    commit = repo[head]
+    source_commit_found = False
+    while True:
+        if commit.id == source_commit_id:
+            source_commit_found = True
+            break
+
+        source_commits.append(commit)
+
+        if not commit.parents:
+            break
+
+        commit = repo[commit.parents[0]]
+
+    if source_commit_id and not source_commit_found:
+        raise RewriteError('source commit %s not found in ref %s; refusing to '
+                           'convert non-fast-forward history' % (
+                           source_commit_id, ref))
+
+    if not source_commits:
+        logger.warn('no new commits to linearize; not doing anything')
+        return result
+
+    source_commits = list(reversed(source_commits))
+
+    logger.warn('linearizing %d commits from %s (%s to %s)' % (
+        len(source_commits), ref, source_commits[0].id, source_commits[-1].id))
+
+    github_client = None
+    if github_username and github_token:
+        github_client = github3.login(username=github_username,
+                                      token=github_token)
+
+    github_org, github_repo = None, None
+    github_cache_dir = os.path.join(git_repo, 'github-cache')
+
+    if source_repo and source_repo.startswith(b'https://github.com/'):
+        orgrepo = source_repo[len(b'https://github.com/'):]
+        github_org, github_repo = orgrepo.split(b'/')
+
+    if github_client and github_repo and not os.path.exists(github_cache_dir):
+        os.mkdir(github_cache_dir)
+
+    for i, source_commit in enumerate(source_commits):
+        logger.warn('%d/%d %s %s' % (
+            i + 1, len(source_commits), source_commit.id,
+            source_commit.message.splitlines()[0].decode('utf-8', 'replace')))
+
+        dest_commit = source_commit.copy()
+
+        # If we're pruning directories, we need to rewrite tree objects.
+        if exclude_dirs:
+            dest_commit.tree = prune_directories(repo.object_store,
+                                                 dest_commit.tree,
+                                                 exclude_dirs).id
+
+        if use_p2_author and len(source_commit.parents) == 2:
+            c = repo[source_commit.parents[1]]
+            author = c.author
+            committer = c.committer
+        else:
+            author = source_commit.author
+            committer = source_commit.committer
+
+        # Replace parents list with our single parent from the last conversion.
+        dest_commit.parents = [dest_commit_id] if dest_commit_id else []
+
+        dest_commit.author = author_map.get(author, author)
+        dest_commit.committer = author_map.get(committer, committer)
+
+        if committer_action == 'use-author':
+            dest_commit.committer = dest_commit.author
+            dest_commit.commit_time = dest_commit.author_time
+            dest_commit.commit_timezone = dest_commit.author_timezone
+        elif committer_action == 'use-committer':
+            dest_commit.author = dest_commit.committer
+            dest_commit.author_time = dest_commit.commit_time
+            dest_commit.author_timezone = dest_commit.commit_timezone
+        else:
+            assert committer_action == 'keep'
+
+        # Basic commit message rewriting.
+        # TODO consider factoring this into a callback to make it extensible.
+        if summary_prefix or reviewable_key or remove_reviewable or normalize_github_merge_message:
+            message_result = rewrite_commit_message(
+                dest_commit.message,
+                summary_prefix=summary_prefix,
+                reviewable_key=reviewable_key,
+                remove_reviewable=remove_reviewable,
+                normalize_github_merge=normalize_github_merge_message,
+                github_client=github_client,
+                github_org=github_org,
+                github_repo=github_repo,
+                github_cache_dir=github_cache_dir,
+            )
+
+            dest_commit.message = message_result['message']
+
+        # Record source repository and revision annotations in commit message
+        # if requested.
+        if source_repo_key or source_revision_key:
+            lines = dest_commit.message.rstrip().splitlines()
+
+            # Insert a blank line if previous line isn't a "metadata" line.
+            if not re.match(r'^[a-zA-Z-]+: \S+$', lines[-1]) or len(lines) == 1:
+                lines.append(b'')
+
+            if source_repo_key:
+                lines.append(b'%s: %s' % (source_repo_key, source_repo))
+            if source_revision_key:
+                lines.append(b'%s: %s' % (source_revision_key,
+                                          source_commit.id))
+
+            dest_commit.message = b'%s\n' % b'\n'.join(lines)
+
+        # Our commit object is fully transformed. Write it.
+        repo.object_store.add_object(dest_commit)
+
+        dest_commit_id = dest_commit.id
+        result['commit_map'][source_commit.id] = dest_commit_id
+
+    result['dest_commit'] = dest_commit_id
+
+    # Store refs to the converted source and dest commits. We use
+    # ``git update-ref`` so reflogs are written (Dulwich doesn't appear
+    # to write reflogs).
+    reflog_commands = []
+    if source_ref in repo:
+        reflog_commands.append(b'update %s\0%s\0%s' % (
+            source_ref, head, repo[source_ref].id))
+    else:
+        reflog_commands.append(b'create %s\0%s' % (source_ref, head))
+
+    if dest_ref in repo:
+        reflog_commands.append(b'update %s\0%s\0%s' % (
+            dest_ref, dest_commit_id, repo[dest_ref].id))
+    else:
+        reflog_commands.append(b'create %s\0%s' % (dest_ref, dest_commit_id))
+
+    p = subprocess.Popen([b'git', b'update-ref',
+                          b'--create-reflog',
+                          b'-m', b'linearize %s' % ref,
+                          b'--stdin', b'-z'],
+                         stdin=subprocess.PIPE,
+                         cwd=git_repo)
+    p.stdin.write(b'\0'.join(reflog_commands))
+    p.stdin.close()
+    res = p.wait()
+    if res:
+        raise Exception('failed to update refs')
+
+    logger.warning('%s converted; original: %s; rewritten: %s' % (
+                   ref, head, repo[dest_ref].id))
+
+    # Perform a garbage collection so we don't have potentially thousands
+    # of loose objects sitting around, as performance will suffer and Git
+    # will complain otherwise.
+    subprocess.check_call([b'git',
+                           b'-c', b'gc.autodetach=false',
+                           b'gc', b'--auto'], cwd=git_repo)
+
+    return result
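
The ref bookkeeping at the end of the function above can be sketched in isolation. The helpers below are hypothetical (`ref_update_command` and `build_ref_stream` are not part of mozvcssync). They only build the byte stream fed to `git update-ref --stdin -z`, using the `create <ref> NUL <newvalue> NUL` and `update <ref> NUL <newvalue> NUL [<oldvalue>] NUL` forms from the git-update-ref documentation. Unlike the patch, which joins commands with NUL, this sketch terminates every field with NUL.

```python
def ref_update_command(ref, new_id, old_id=None):
    """Build one ``git update-ref --stdin -z`` command as bytes.

    Hypothetical helper for illustration; not part of mozvcssync.
    """
    if old_id is None:
        # Ref does not exist yet: create <ref> NUL <newvalue> NUL
        return b'create %s\0%s\0' % (ref, new_id)
    # Ref exists: update <ref> NUL <newvalue> NUL <oldvalue> NUL
    return b'update %s\0%s\0%s\0' % (ref, new_id, old_id)


def build_ref_stream(updates):
    """Concatenate commands for an iterable of (ref, new_id, old_id)."""
    return b''.join(ref_update_command(*u) for u in updates)
```

Passing the old value on `update` lets Git reject the transaction if the ref moved in the meantime, which is presumably why the patch supplies it.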
--- a/vcssync/setup.py
+++ b/vcssync/setup.py
@@ -9,10 +9,15 @@ setup(
     author_email='dev-version-control@lists.mozilla.org',
     license='MPL 2.0',
     classifiers=[
         'Development Status :: 4 - Beta',
         'Intended Audience :: Developers',
         'Programming Language :: Python :: 2.7',
     ],
     packages=find_packages(),
-    install_requires=['Mercurial>=4.0'],
+    entry_points={
+        'console_scripts': [
+            'linearize-git=mozvcssync.cli:linearize_git',
+        ],
+    },
+    install_requires=['dulwich>=0.16', 'github3.py>=0.9.6', 'Mercurial>=4.0'],
 )
new file mode 100644
--- /dev/null
+++ b/vcssync/tests/test-linearize-git-author-map.t
@@ -0,0 +1,57 @@
+  $ . $TESTDIR/vcssync/tests/helpers.sh
+
+  $ git init grepo0
+  Initialized empty Git repository in $TESTTMP/grepo0/.git/
+
+  $ cd grepo0
+  $ echo 0 > foo
+  $ git add foo
+  $ git commit -m initial
+  [master (root-commit) dbd62b8] initial
+   1 file changed, 1 insertion(+)
+   create mode 100644 foo
+
+  $ echo 1 > foo
+  $ git add foo
+  $ GIT_AUTHOR_NAME='Old Author' GIT_AUTHOR_EMAIL=old-author@example.com GIT_COMMITTER_NAME='Old Committer' GIT_COMMITTER_EMAIL=old-committer@example.com git commit -m 1
+  [master cb229ea] 1
+   Author: Old Author <old-author@example.com>
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+Author map works
+
+  $ cat > author_map << EOF
+  > # This is a comment followed by an empty line
+  > 
+  > Old Author <old-author@example.com> = New Author <new-author@example.com>
+  > Old Committer <old-committer@example.com> = New Committer <new-committer@example.com>
+  > # This is another comment
+  > EOF
+
+  $ linearize-git --author-map author_map . heads/master
+  linearizing 2 commits from heads/master (dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf to cb229eaf293faecf0580d4b911000425a3338150)
+  1/2 dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf initial
+  2/2 cb229eaf293faecf0580d4b911000425a3338150 1
+  heads/master converted; original: cb229eaf293faecf0580d4b911000425a3338150; rewritten: cc01022a784bb7973b3ec8c5167c41302426286c
+
+  $ git log convert/dest/heads/master
+  commit cc01022a784bb7973b3ec8c5167c41302426286c
+  Author: New Author <new-author@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      1
+  
+  commit dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf
+  Author: test <test@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      initial
+
+
+  $ git cat-file -p convert/dest/heads/master
+  tree a229c158b3d5560cc44ad3dec6ff5d13a47e11cf
+  parent dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf
+  author New Author <new-author@example.com> 0 +0000
+  committer New Committer <new-committer@example.com> 0 +0000
+  
+  1
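
The author map format exercised by this test (comments, blank lines, and `old = new` pairs) could be parsed along the following lines. This is a minimal sketch: the parser actually used by `linearize-git` is not shown in this part of the patch, and the names `parse_author_map` and `map_ident` are assumptions made for illustration.

```python
def parse_author_map(text):
    """Parse ``old = new`` lines, skipping comments and blank lines.

    Hypothetical parser for illustration; the real one may differ.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        old, sep, new = line.partition('=')
        if not sep:
            raise ValueError('missing "=" in author map line: %r' % line)
        mapping[old.strip()] = new.strip()
    return mapping


def map_ident(mapping, ident):
    # Unknown identities pass through unchanged, mirroring the
    # ``author_map.get(author, author)`` lookup in the linearizer.
    return mapping.get(ident, ident)
```

Falling back to the original identity keeps unmapped commits untouched, which matches the `initial` commit in the log output above still being attributed to `test <test@example.com>`.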
new file mode 100644
--- /dev/null
+++ b/vcssync/tests/test-linearize-git-basic.t
@@ -0,0 +1,145 @@
+  $ . $TESTDIR/vcssync/tests/helpers.sh
+
+Create a Git repo with a simple merge
+
+  $ git init grepo0
+  Initialized empty Git repository in $TESTTMP/grepo0/.git/
+  $ cd grepo0
+  $ touch foo
+  $ git add foo
+  $ git commit -m initial
+  [master (root-commit) a547cc0] initial
+   1 file changed, 0 insertions(+), 0 deletions(-)
+   create mode 100644 foo
+  $ git checkout -b head1
+  Switched to a new branch 'head1'
+  $ touch file0
+  $ git add file0
+  $ git commit -m 'add file0'
+  [head1 48fba69] add file0
+   1 file changed, 0 insertions(+), 0 deletions(-)
+   create mode 100644 file0
+  $ git checkout master
+  Switched to branch 'master'
+  $ touch file1
+  $ git add file1
+  $ git commit -m 'add file1'
+  [master c37ea67] add file1
+   1 file changed, 0 insertions(+), 0 deletions(-)
+   create mode 100644 file1
+  $ git merge head1
+  Merge made by the 'recursive' strategy.
+   file0 | 0
+   1 file changed, 0 insertions(+), 0 deletions(-)
+   create mode 100644 file0
+
+  $ git log --graph --format=oneline
+  *   9127cbf8ed74dd362cf28e37e8df7864df3057e3 Merge branch 'head1'
+  |\  
+  | * 48fba69d25d8ec2d06c8d0a00851d109acd7d986 add file0
+  * | c37ea67cfc02a686d402594235bcba334fb727af add file1
+  |/  
+  * a547cc07d30f025e022b27310c713705158c21b4 initial
+
+  $ git for-each-ref
+  48fba69d25d8ec2d06c8d0a00851d109acd7d986 commit	refs/heads/head1
+  9127cbf8ed74dd362cf28e37e8df7864df3057e3 commit	refs/heads/master
+
+Linearized repo should have no merges
+
+  $ linearize-git . heads/master
+  linearizing 3 commits from heads/master (a547cc07d30f025e022b27310c713705158c21b4 to 9127cbf8ed74dd362cf28e37e8df7864df3057e3)
+  1/3 a547cc07d30f025e022b27310c713705158c21b4 initial
+  2/3 c37ea67cfc02a686d402594235bcba334fb727af add file1
+  3/3 9127cbf8ed74dd362cf28e37e8df7864df3057e3 Merge branch 'head1'
+  heads/master converted; original: 9127cbf8ed74dd362cf28e37e8df7864df3057e3; rewritten: 4a8e25bc50dc5e927f209e1cbac8a7c0346b72b7
+
+  $ git log --graph --format=oneline convert/dest/heads/master
+  * 4a8e25bc50dc5e927f209e1cbac8a7c0346b72b7 Merge branch 'head1'
+  * c37ea67cfc02a686d402594235bcba334fb727af add file1
+  * a547cc07d30f025e022b27310c713705158c21b4 initial
+
+Original refs should be untouched, new tracking refs should be added
+
+  $ git for-each-ref
+  4a8e25bc50dc5e927f209e1cbac8a7c0346b72b7 commit	refs/convert/dest/heads/master
+  9127cbf8ed74dd362cf28e37e8df7864df3057e3 commit	refs/convert/source/heads/master
+  48fba69d25d8ec2d06c8d0a00851d109acd7d986 commit	refs/heads/head1
+  9127cbf8ed74dd362cf28e37e8df7864df3057e3 commit	refs/heads/master
+
+Linearize with no changes should no-op
+
+  $ linearize-git . heads/master
+  no new commits to linearize; not doing anything
+
+Add more commits to the source repository
+
+  $ touch file2
+  $ git add file2
+  $ git commit -m 'add file2'
+  [master 622273f] add file2
+   1 file changed, 0 insertions(+), 0 deletions(-)
+   create mode 100644 file2
+  $ touch file3
+  $ git add file3
+  $ git commit -m 'add file3'
+  [master e6c4fa0] add file3
+   1 file changed, 0 insertions(+), 0 deletions(-)
+   create mode 100644 file3
+
+Incremental linearize should only convert new commits and graft them on top of the existing conversion
+
+  $ git for-each-ref
+  4a8e25bc50dc5e927f209e1cbac8a7c0346b72b7 commit	refs/convert/dest/heads/master
+  9127cbf8ed74dd362cf28e37e8df7864df3057e3 commit	refs/convert/source/heads/master
+  48fba69d25d8ec2d06c8d0a00851d109acd7d986 commit	refs/heads/head1
+  e6c4fa028c4bbb545d8b72667cf224e2141d88e7 commit	refs/heads/master
+
+  $ linearize-git . heads/master
+  linearizing 2 commits from heads/master (622273f903fba1c0fabe939ec34a61e804fa66cf to e6c4fa028c4bbb545d8b72667cf224e2141d88e7)
+  1/2 622273f903fba1c0fabe939ec34a61e804fa66cf add file2
+  2/2 e6c4fa028c4bbb545d8b72667cf224e2141d88e7 add file3
+  heads/master converted; original: e6c4fa028c4bbb545d8b72667cf224e2141d88e7; rewritten: dd15e055c3525362c7d61d09f0e71be97d730415
+
+  $ git for-each-ref
+  dd15e055c3525362c7d61d09f0e71be97d730415 commit	refs/convert/dest/heads/master
+  e6c4fa028c4bbb545d8b72667cf224e2141d88e7 commit	refs/convert/source/heads/master
+  48fba69d25d8ec2d06c8d0a00851d109acd7d986 commit	refs/heads/head1
+  e6c4fa028c4bbb545d8b72667cf224e2141d88e7 commit	refs/heads/master
+
+  $ git log --graph --format=oneline refs/heads/master
+  * e6c4fa028c4bbb545d8b72667cf224e2141d88e7 add file3
+  * 622273f903fba1c0fabe939ec34a61e804fa66cf add file2
+  *   9127cbf8ed74dd362cf28e37e8df7864df3057e3 Merge branch 'head1'
+  |\  
+  | * 48fba69d25d8ec2d06c8d0a00851d109acd7d986 add file0
+  * | c37ea67cfc02a686d402594235bcba334fb727af add file1
+  |/  
+  * a547cc07d30f025e022b27310c713705158c21b4 initial
+
+  $ git log --graph --format=oneline refs/convert/dest/heads/master
+  * dd15e055c3525362c7d61d09f0e71be97d730415 add file3
+  * 70d34f749be26b16519ea65aac6d9851040d1bd8 add file2
+  * 4a8e25bc50dc5e927f209e1cbac8a7c0346b72b7 Merge branch 'head1'
+  * c37ea67cfc02a686d402594235bcba334fb727af add file1
+  * a547cc07d30f025e022b27310c713705158c21b4 initial
+
+  $ git cat-file -p refs/convert/dest/heads/master^
+  tree ba95d78c0a2301f6c6d095af7cbb5e0ee2254de3
+  parent 4a8e25bc50dc5e927f209e1cbac8a7c0346b72b7
+  author test <test@example.com> 0 +0000
+  committer test <test@example.com> 0 +0000
+  
+  add file2
+
+  $ git log --graph --format=oneline convert/dest/heads/master
+  * dd15e055c3525362c7d61d09f0e71be97d730415 add file3
+  * 70d34f749be26b16519ea65aac6d9851040d1bd8 add file2
+  * 4a8e25bc50dc5e927f209e1cbac8a7c0346b72b7 Merge branch 'head1'
+  * c37ea67cfc02a686d402594235bcba334fb727af add file1
+  * a547cc07d30f025e022b27310c713705158c21b4 initial
+
+Should no-op again
+
+  $ linearize-git . heads/master
+  no new commits to linearize; not doing anything
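
The commit selection this test exercises is consistent with a first-parent walk: the merge and `add file1` are kept, while `add file0` on the second parent is dropped. A minimal sketch follows, assuming the history is available as a plain dict from commit id to parent list. That shape and the name `first_parent_chain` are illustrative assumptions, not the Dulwich-based code in the patch.

```python
def first_parent_chain(parents, head):
    """Return commits from root to ``head`` following only first parents.

    ``parents`` maps a commit id to its list of parent ids (assumed shape).
    """
    chain = []
    node = head
    while node is not None:
        chain.append(node)
        node_parents = parents[node]
        # Merge commits contribute only their first parent; a root
        # commit has no parents and ends the walk.
        node = node_parents[0] if node_parents else None
    chain.reverse()
    return chain
```

For the repository built above, the walk from the merge visits `9127cbf`, `c37ea67`, and `a547cc0`, which after reversal matches the 1/3, 2/3, 3/3 order printed by `linearize-git`.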
new file mode 100644
--- /dev/null
+++ b/vcssync/tests/test-linearize-git-committer.t
@@ -0,0 +1,79 @@
+  $ . $TESTDIR/vcssync/tests/helpers.sh
+
+  $ export GIT_AUTHOR_NAME='Git Author'
+  $ export GIT_AUTHOR_EMAIL='author@example.com'
+  $ export GIT_COMMITTER_NAME='Git Committer'
+  $ export GIT_COMMITTER_EMAIL='committer@example.com'
+  $ export GIT_COMMITTER_DATE='Fri Jan 6 00:00:00 2017 +0000'
+
+  $ git init grepo0
+  Initialized empty Git repository in $TESTTMP/grepo0/.git/
+
+  $ cd grepo0
+
+  $ echo 0 > foo
+  $ git add foo
+  $ git commit -m initial
+  [master (root-commit) 81ceece] initial
+   Author: Git Author <author@example.com>
+   1 file changed, 1 insertion(+)
+   create mode 100644 foo
+  $ git branch keep
+  $ git branch use-author
+  $ git branch use-committer
+
+Git committer should be retained by default
+
+  $ linearize-git . heads/master
+  linearizing 1 commits from heads/master (81ceece45bfdbe831a28eb6b90d196aea1330184 to 81ceece45bfdbe831a28eb6b90d196aea1330184)
+  1/1 81ceece45bfdbe831a28eb6b90d196aea1330184 initial
+  heads/master converted; original: 81ceece45bfdbe831a28eb6b90d196aea1330184; rewritten: 81ceece45bfdbe831a28eb6b90d196aea1330184
+
+  $ git cat-file -p refs/convert/dest/heads/master
+  tree 2d2675b9e90bde3e722e6ef55faee52aec2e3857
+  author Git Author <author@example.com> 0 +0000
+  committer Git Committer <committer@example.com> 1483660800 +0000
+  
+  initial
+
+--committer-action keep is the default behavior
+
+  $ linearize-git --committer-action keep . heads/keep
+  linearizing 1 commits from heads/keep (81ceece45bfdbe831a28eb6b90d196aea1330184 to 81ceece45bfdbe831a28eb6b90d196aea1330184)
+  1/1 81ceece45bfdbe831a28eb6b90d196aea1330184 initial
+  heads/keep converted; original: 81ceece45bfdbe831a28eb6b90d196aea1330184; rewritten: 81ceece45bfdbe831a28eb6b90d196aea1330184
+
+  $ git cat-file -p refs/convert/dest/heads/keep
+  tree 2d2675b9e90bde3e722e6ef55faee52aec2e3857
+  author Git Author <author@example.com> 0 +0000
+  committer Git Committer <committer@example.com> 1483660800 +0000
+  
+  initial
+
+use-author copies author to committer
+
+  $ linearize-git --committer-action use-author . heads/use-author
+  linearizing 1 commits from heads/use-author (81ceece45bfdbe831a28eb6b90d196aea1330184 to 81ceece45bfdbe831a28eb6b90d196aea1330184)
+  1/1 81ceece45bfdbe831a28eb6b90d196aea1330184 initial
+  heads/use-author converted; original: 81ceece45bfdbe831a28eb6b90d196aea1330184; rewritten: 42591cc3c328b9e9c0ee9ae6e4573894b17ba691
+
+  $ git cat-file -p refs/convert/dest/heads/use-author
+  tree 2d2675b9e90bde3e722e6ef55faee52aec2e3857
+  author Git Author <author@example.com> 0 +0000
+  committer Git Author <author@example.com> 0 +0000
+  
+  initial
+
+use-committer copies committer to author
+
+  $ linearize-git --committer-action use-committer . heads/use-committer
+  linearizing 1 commits from heads/use-committer (81ceece45bfdbe831a28eb6b90d196aea1330184 to 81ceece45bfdbe831a28eb6b90d196aea1330184)
+  1/1 81ceece45bfdbe831a28eb6b90d196aea1330184 initial
+  heads/use-committer converted; original: 81ceece45bfdbe831a28eb6b90d196aea1330184; rewritten: 610b4fec36ab4c55172fa95cfa9c462323ad335f
+
+  $ git cat-file -p refs/convert/dest/heads/use-committer
+  tree 2d2675b9e90bde3e722e6ef55faee52aec2e3857
+  author Git Committer <committer@example.com> 1483660800 +0000
+  committer Git Committer <committer@example.com> 1483660800 +0000
+  
+  initial
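
The three `--committer-action` modes tested above reduce to a small transformation on the author and committer fields. The sketch below models a commit's identity as a namedtuple. `Ident` and `apply_committer_action` are hypothetical names, and the timezone fields that the real code also copies are omitted for brevity.

```python
from collections import namedtuple

# Simplified stand-in for the identity fields of a commit object.
Ident = namedtuple('Ident', 'author author_time committer commit_time')


def apply_committer_action(ident, action):
    """Return a new Ident with the requested committer action applied."""
    if action == 'use-author':
        # Committer fields are overwritten with the author's.
        return ident._replace(committer=ident.author,
                              commit_time=ident.author_time)
    if action == 'use-committer':
        # Author fields are overwritten with the committer's.
        return ident._replace(author=ident.committer,
                              author_time=ident.commit_time)
    if action == 'keep':
        return ident
    raise ValueError('unknown committer action: %r' % (action,))
```

Raising on an unknown action, rather than asserting as the patch does, is a choice made for this sketch so that invalid input still fails when assertions are disabled.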
new file mode 100644
--- /dev/null
+++ b/vcssync/tests/test-linearize-git-exclude-dirs.t
@@ -0,0 +1,181 @@
+  $ . $TESTDIR/vcssync/tests/helpers.sh
+
+Create a Git repo with files we wish to prune
+
+  $ git init repo0
+  Initialized empty Git repository in $TESTTMP/repo0/.git/
+  $ cd repo0
+  $ touch foo
+  $ git add foo
+  $ git commit -m initial
+  [master (root-commit) a547cc0] initial
+   1 file changed, 0 insertions(+), 0 deletions(-)
+   create mode 100644 foo
+  $ git checkout -b head1
+  Switched to a new branch 'head1'
+  $ mkdir dir0 dir1 dir2
+  $ touch dir0/file0 dir1/file0 dir2/file0
+  $ git add -A
+  $ git commit -m 'add file0s'
+  [head1 db8c4de] add file0s
+   3 files changed, 0 insertions(+), 0 deletions(-)
+   create mode 100644 dir0/file0
+   create mode 100644 dir1/file0
+   create mode 100644 dir2/file0
+  $ git checkout master
+  Switched to branch 'master'
+  $ mkdir dir0 dir1 dir2
+  $ touch dir0/file1 dir1/file1 dir2/file1
+  $ git add -A
+  $ git commit -m 'add file1s'
+  [master 0ac77c9] add file1s
+   3 files changed, 0 insertions(+), 0 deletions(-)
+   create mode 100644 dir0/file1
+   create mode 100644 dir1/file1
+   create mode 100644 dir2/file1
+  $ touch dir0/file2
+  $ git add -A
+  $ git commit -m 'add dir0/file2'
+  [master b7b3abc] add dir0/file2
+   1 file changed, 0 insertions(+), 0 deletions(-)
+   create mode 100644 dir0/file2
+  $ git checkout master
+  Already on 'master'
+  $ git merge head1
+  Merge made by the 'recursive' strategy.
+   dir0/file0 | 0
+   dir1/file0 | 0
+   dir2/file0 | 0
+   3 files changed, 0 insertions(+), 0 deletions(-)
+   create mode 100644 dir0/file0
+   create mode 100644 dir1/file0
+   create mode 100644 dir2/file0
+
+  $ git log --graph --format=oneline
+  *   e9fb4537517445c07d491482211919591e4dae45 Merge branch 'head1'
+  |\  
+  | * db8c4dec7798ea623eeb989c3112e9e96767a722 add file0s
+  * | b7b3abcd50597761f65c0a11846de6ebc98cc5b7 add dir0/file2
+  * | 0ac77c9293242a70f71defcee37a74659207b19e add file1s
+  |/  
+  * a547cc07d30f025e022b27310c713705158c21b4 initial
+
+  $ git for-each-ref
+  db8c4dec7798ea623eeb989c3112e9e96767a722 commit	refs/heads/head1
+  e9fb4537517445c07d491482211919591e4dae45 commit	refs/heads/master
+
+Directories can be excluded when linearizing
+
+  $ linearize-git --exclude-dir dir2 . heads/master
+  linearizing 4 commits from heads/master (a547cc07d30f025e022b27310c713705158c21b4 to e9fb4537517445c07d491482211919591e4dae45)
+  1/4 a547cc07d30f025e022b27310c713705158c21b4 initial
+  2/4 0ac77c9293242a70f71defcee37a74659207b19e add file1s
+  3/4 b7b3abcd50597761f65c0a11846de6ebc98cc5b7 add dir0/file2
+  4/4 e9fb4537517445c07d491482211919591e4dae45 Merge branch 'head1'
+  heads/master converted; original: e9fb4537517445c07d491482211919591e4dae45; rewritten: d017a118a5429ca800345e6f14e1a61f6f613b57
+
+  $ git show -m refs/convert/dest/heads/master
+  commit d017a118a5429ca800345e6f14e1a61f6f613b57
+  Author: test <test@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      Merge branch 'head1'
+  
+  diff --git a/dir0/file0 b/dir0/file0
+  new file mode 100644
+  index 0000000..e69de29
+  diff --git a/dir1/file0 b/dir1/file0
+  new file mode 100644
+  index 0000000..e69de29
+
+--exclude-dir works multiple times
+
+  $ git update-ref -d refs/convert/source/heads/master
+  $ git update-ref -d refs/convert/dest/heads/master
+  $ linearize-git --exclude-dir dir0 --exclude-dir dir1 . heads/master
+  linearizing 4 commits from heads/master (a547cc07d30f025e022b27310c713705158c21b4 to e9fb4537517445c07d491482211919591e4dae45)
+  1/4 a547cc07d30f025e022b27310c713705158c21b4 initial
+  2/4 0ac77c9293242a70f71defcee37a74659207b19e add file1s
+  3/4 b7b3abcd50597761f65c0a11846de6ebc98cc5b7 add dir0/file2
+  4/4 e9fb4537517445c07d491482211919591e4dae45 Merge branch 'head1'
+  heads/master converted; original: e9fb4537517445c07d491482211919591e4dae45; rewritten: d8230193bc11a2745bec8258c94b95324f3c4955
+  $ git log --graph --format=oneline refs/convert/dest/heads/master
+  * d8230193bc11a2745bec8258c94b95324f3c4955 Merge branch 'head1'
+  * 8a2c50c762f3483c5b3d26947d81a0cbe2ba8e69 add dir0/file2
+  * 925f1eab825ed50a1f80058c6a1f220c009a8bfd add file1s
+  * a547cc07d30f025e022b27310c713705158c21b4 initial
+  $ git show -m refs/convert/dest/heads/master
+  commit d8230193bc11a2745bec8258c94b95324f3c4955
+  Author: test <test@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      Merge branch 'head1'
+  
+  diff --git a/dir2/file0 b/dir2/file0
+  new file mode 100644
+  index 0000000..e69de29
+
+  $ cd ..
+
+Excluding an intermediate directory works
+
+  $ git init repo1
+  Initialized empty Git repository in $TESTTMP/repo1/.git/
+  $ cd repo1
+  $ mkdir -p dir0/subdir0 dir0/subdir1 dir1 dir2/subdir0 dir2/subdir1
+  $ touch dir0/subdir0/file0
+  $ touch dir0/file0
+  $ touch dir0/subdir1/file0
+  $ touch dir1/file0
+  $ touch dir2/file0
+  $ touch dir2/subdir0/file0
+  $ touch dir2/subdir1/file0
+
+  $ git add --all
+  $ git commit -m initial
+  [master (root-commit) 1190a97] initial
+   7 files changed, 0 insertions(+), 0 deletions(-)
+   create mode 100644 dir0/file0
+   create mode 100644 dir0/subdir0/file0
+   create mode 100644 dir0/subdir1/file0
+   create mode 100644 dir1/file0
+   create mode 100644 dir2/file0
+   create mode 100644 dir2/subdir0/file0
+   create mode 100644 dir2/subdir1/file0
+  $ git branch master2 HEAD
+
+  $ linearize-git --exclude-dir dir0/subdir0 . heads/master
+  linearizing 1 commits from heads/master (1190a970be8401aac3e4773332dd10f78e4141f2 to 1190a970be8401aac3e4773332dd10f78e4141f2)
+  1/1 1190a970be8401aac3e4773332dd10f78e4141f2 initial
+  heads/master converted; original: 1190a970be8401aac3e4773332dd10f78e4141f2; rewritten: 2c092d0f01a4a443e2120c897bc7f1fa3b94c3c5
+
+  $ git ls-tree -r -t refs/convert/dest/heads/master
+  040000 tree a5a64b4a01d3e32ff0050e6323ff8abcbce0ded7	dir0
+  100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	dir0/file0
+  040000 tree 09767bd3484e22b41138116992cc1cb5bc45fb7f	dir0/subdir1
+  100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	dir0/subdir1/file0
+  040000 tree 09767bd3484e22b41138116992cc1cb5bc45fb7f	dir1
+  100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	dir1/file0
+  040000 tree 871a0c072ebc416415cc682bbda94e7948c8f568	dir2
+  100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	dir2/file0
+  040000 tree 09767bd3484e22b41138116992cc1cb5bc45fb7f	dir2/subdir0
+  100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	dir2/subdir0/file0
+  040000 tree 09767bd3484e22b41138116992cc1cb5bc45fb7f	dir2/subdir1
+  100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	dir2/subdir1/file0
+
+  $ linearize-git --exclude-dir dir1 --exclude-dir dir2/subdir0 . heads/master2
+  linearizing 1 commits from heads/master2 (1190a970be8401aac3e4773332dd10f78e4141f2 to 1190a970be8401aac3e4773332dd10f78e4141f2)
+  1/1 1190a970be8401aac3e4773332dd10f78e4141f2 initial
+  heads/master2 converted; original: 1190a970be8401aac3e4773332dd10f78e4141f2; rewritten: d46fb3759f69a9ed2c56395653cf3e61fad6f5e7
+
+  $ git ls-tree -r -t refs/convert/dest/heads/master2
+  040000 tree 871a0c072ebc416415cc682bbda94e7948c8f568	dir0
+  100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	dir0/file0
+  040000 tree 09767bd3484e22b41138116992cc1cb5bc45fb7f	dir0/subdir0
+  100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	dir0/subdir0/file0
+  040000 tree 09767bd3484e22b41138116992cc1cb5bc45fb7f	dir0/subdir1
+  100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	dir0/subdir1/file0
+  040000 tree a5a64b4a01d3e32ff0050e6323ff8abcbce0ded7	dir2
+  100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	dir2/file0
+  040000 tree 09767bd3484e22b41138116992cc1cb5bc45fb7f	dir2/subdir1
+  100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	dir2/subdir1/file0
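
The effect of `--exclude-dir` on the file listing can be approximated with a path-prefix filter. The real implementation operates on Git tree objects, so the sketch below, which works on a flat list of paths and uses the hypothetical names `is_excluded` and `prune_paths`, only illustrates the matching rule implied by the test output.

```python
def is_excluded(path, exclude_dirs):
    """True if ``path`` is an excluded directory or lies beneath one."""
    for d in exclude_dirs:
        d = d.rstrip('/')
        # Compare on a path-component boundary so that excluding
        # "dir1" does not also exclude "dir10".
        if path == d or path.startswith(d + '/'):
            return True
    return False


def prune_paths(paths, exclude_dirs):
    """Return ``paths`` with everything under ``exclude_dirs`` removed."""
    return [p for p in paths if not is_excluded(p, exclude_dirs)]
```

Applied to the seven files committed in `repo1`, excluding `dir1` and `dir2/subdir0` leaves the five blobs listed for `master2` above.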
new file mode 100644
--- /dev/null
+++ b/vcssync/tests/test-linearize-git-message-rewrite.t
@@ -0,0 +1,259 @@
+  $ . $TESTDIR/vcssync/tests/helpers.sh
+
+  $ git init grepo0
+  Initialized empty Git repository in $TESTTMP/grepo0/.git/
+
+  $ cd grepo0
+  $ echo 0 > foo
+  $ git add foo
+  $ git commit -m initial
+  [master (root-commit) dbd62b8] initial
+   1 file changed, 1 insertion(+)
+   create mode 100644 foo
+  $ echo 1 > foo
+  $ git add foo
+  $ git commit -m 'commit 1'
+  [master f3dcf0e] commit 1
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+--summary-prefix adds prefix to the summary line of commit message
+
+  $ linearize-git --summary-prefix my-prefix: . heads/master
+  linearizing 2 commits from heads/master (dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf to f3dcf0ea970616078b22c97ff104fa368b61973c)
+  1/2 dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf initial
+  2/2 f3dcf0ea970616078b22c97ff104fa368b61973c commit 1
+  heads/master converted; original: f3dcf0ea970616078b22c97ff104fa368b61973c; rewritten: 10874c20986a49df5dd96f35017858fc3e52fe70
+
+  $ git cat-file -p 10874c20986a49df5dd96f35017858fc3e52fe70
+  tree a229c158b3d5560cc44ad3dec6ff5d13a47e11cf
+  parent e532d0c9cf2e5662401c8821f9eedb37356201f9
+  author test <test@example.com> 0 +0000
+  committer test <test@example.com> 0 +0000
+  
+  my-prefix: commit 1
+
+  $ cd ..
+
+Reviewable Markdown can be rewritten to a <key>: <URL> pattern
+
+  $ git init grepo1
+  Initialized empty Git repository in $TESTTMP/grepo1/.git/
+
+  $ cd grepo1
+
+  $ echo 0 > foo
+  $ git add foo
+  $ git commit -m initial
+  [master (root-commit) dbd62b8] initial
+   1 file changed, 1 insertion(+)
+   create mode 100644 foo
+  $ echo 1 > foo
+  $ git add foo
+  $ cat >> message << EOF
+  > Auto merge of #14737 - UK992:package-prefs, r=Wafflespeanut
+  > 
+  > Package: Various improvements
+  > 
+  > Fixes https://github.com/servo/servo/issues/11966
+  > Fixes https://github.com/servo/servo/issues/12707
+  > 
+  > <!-- Reviewable:start -->
+  > ---
+  > This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/14737)
+  > <!-- Reviewable:end -->
+  > EOF
+
+  $ git commit -F message
+  [master 9ccde32] Auto merge of #14737 - UK992:package-prefs, r=Wafflespeanut
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+  $ git branch master2 master
+
+  $ linearize-git --reviewable-key Reviewable-URL . heads/master
+  linearizing 2 commits from heads/master (dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf to 9ccde32cb7cc412d2c797a0fea52c258be9b76f2)
+  1/2 dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf initial
+  2/2 9ccde32cb7cc412d2c797a0fea52c258be9b76f2 Auto merge of #14737 - UK992:package-prefs, r=Wafflespeanut
+  heads/master converted; original: 9ccde32cb7cc412d2c797a0fea52c258be9b76f2; rewritten: 58b5ec5252ed2d3d8ab73d6abae4f6253b88674f
+
+  $ git log convert/dest/heads/master
+  commit 58b5ec5252ed2d3d8ab73d6abae4f6253b88674f
+  Author: test <test@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      Auto merge of #14737 - UK992:package-prefs, r=Wafflespeanut
+      
+      Package: Various improvements
+      
+      Fixes https://github.com/servo/servo/issues/11966
+      Fixes https://github.com/servo/servo/issues/12707
+      
+      Reviewable-URL: https://reviewable.io/reviews/servo/servo/14737
+  
+  commit dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf
+  Author: test <test@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      initial
+
+Reviewable.io Markdown can be removed
+
+  $ linearize-git --remove-reviewable . heads/master2
+  linearizing 2 commits from heads/master2 (dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf to 9ccde32cb7cc412d2c797a0fea52c258be9b76f2)
+  1/2 dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf initial
+  2/2 9ccde32cb7cc412d2c797a0fea52c258be9b76f2 Auto merge of #14737 - UK992:package-prefs, r=Wafflespeanut
+  heads/master2 converted; original: 9ccde32cb7cc412d2c797a0fea52c258be9b76f2; rewritten: e7fa11e1edfada45a007d36941b4d919f4b7fe5d
+
+  $ git log convert/dest/heads/master2
+  commit e7fa11e1edfada45a007d36941b4d919f4b7fe5d
+  Author: test <test@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      Auto merge of #14737 - UK992:package-prefs, r=Wafflespeanut
+      
+      Package: Various improvements
+      
+      Fixes https://github.com/servo/servo/issues/11966
+      Fixes https://github.com/servo/servo/issues/12707
+  
+  commit dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf
+  Author: test <test@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      initial
+
+  $ cd ..
+
+GitHub pull request commit message rewriting works
+
+  $ git init grepo2
+  Initialized empty Git repository in $TESTTMP/grepo2/.git/
+  $ cd grepo2
+  $ echo 0 > foo
+  $ git add foo
+  $ git commit -m initial
+  [master (root-commit) dbd62b8] initial
+   1 file changed, 1 insertion(+)
+   create mode 100644 foo
+
+  $ echo 1 > foo
+  $ git add foo
+
+First non-blank line is the summary line
+
+  $ cat > message << EOF
+  > Merge pull request #376 from servo/foo-feature
+  > 
+  > Removed reference to cairo from servo-gfx/font.rs
+  > 
+  > More data below
+  > EOF
+
+  $ git commit -F message
+  [master 026e845] Merge pull request #376 from servo/foo-feature
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+  $ echo 2 > foo
+  $ git add foo
+  $ cat > message << EOF
+  > Merge pull request #653 from foo/bar
+  > No blank line after summary
+  > 
+  > More content here
+  > EOF
+  $ git commit -F message
+  [master 2e52fe5] Merge pull request #653 from foo/bar No blank line after summary
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+Servo style commit message syntax rewriting works
+
+  $ echo 3 > foo
+  $ git add foo
+  $ cat > message << EOF
+  > Auto merge of #6532 - servo/bar-feature, r=gps
+  > 
+  > This is the PR summary line
+  > 
+  > Extra content here
+  > EOF
+  $ git commit -F message
+  [master fa522c7] Auto merge of #6532 - servo/bar-feature, r=gps
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+  $ echo 4 > foo
+  $ git add foo
+  $ cat > message << EOF
+  > auto merge of #4690 : indygreg/servo/some-feature, r=bholley
+  > 
+  > Summary line w/o PR JSON
+  > EOF
+  $ git commit -F message
+  [master cf1c79b] auto merge of #4690 : indygreg/servo/some-feature, r=bholley
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+  $ echo 5 > foo
+  $ git add foo
+  $ cat > message << EOF
+  > Auto merge of #5700 - Ms2ger:content, r=jdm
+  > 
+  > 
+  > 
+  > <!-- Reviewable:start -->
+  > [<img src="https://reviewable.io/review_button.png" height=40 alt="Review on Reviewable"/>](https://reviewable.io/reviews/servo/servo/5700)
+  > <!-- Reviewable:end -->
+  > EOF
+  $ git commit -F message
+  [master a7332b8] Auto merge of #5700 - Ms2ger:content, r=jdm
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+  $ linearize-git --normalize-github-merge-message --remove-reviewable . heads/master
+  linearizing 6 commits from heads/master (dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf to a7332b8424dc931df611b8feab5fa6840218bfa1)
+  1/6 dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf initial
+  2/6 026e845f3f1293fb53ea2ee98cbc349120993c7c Merge pull request #376 from servo/foo-feature
+  3/6 2e52fe53a63dc972ef737702f0467ea0575d0392 Merge pull request #653 from foo/bar
+  4/6 fa522c79808c18641e57fff1e3a7d67ae802fa04 Auto merge of #6532 - servo/bar-feature, r=gps
+  5/6 cf1c79b916bc61fa77a215acb13f00c770d2ac9e auto merge of #4690 : indygreg/servo/some-feature, r=bholley
+  6/6 a7332b8424dc931df611b8feab5fa6840218bfa1 Auto merge of #5700 - Ms2ger:content, r=jdm
+  heads/master converted; original: a7332b8424dc931df611b8feab5fa6840218bfa1; rewritten: 804b27caa81d7ac94b2ad48e23f1b152c21c5490
+
+  $ git log refs/convert/dest/heads/master
+  commit 804b27caa81d7ac94b2ad48e23f1b152c21c5490
+  Author: test <test@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      Merge #5700 (from Ms2ger:content); r=jdm
+  
+  commit 1924605259679ed7f3115d6d316ecf0f5664d286
+  Author: test <test@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      Merge #4690 - Summary line w/o PR JSON (from indygreg/servo/some-feature); r=bholley
+  
+  commit a41d8edf9850a8d7372a9cbe0b544ea552752432
+  Author: test <test@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      Merge #6532 - This is the PR summary line (from servo/bar-feature); r=gps
+      
+      Extra content here
+  
+  commit 97edfbc3c88a6fd32e67b88428384e700eda0cfb
+  Author: test <test@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      Merge #653 - No blank line after summary (from foo:bar)
+      
+      More content here
+  
+  commit e852a645a35166cfe99ed9457a63fd0a9c2d0f38
+  Author: test <test@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      Merge #376 - Removed reference to cairo from servo-gfx/font.rs (from servo:foo-feature)
+      
+      More data below
+  
+  commit dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf
+  Author: test <test@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      initial
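
Trailing `Key: value` lines such as `Reviewable-URL: ...` matter when the linearizer appends its own source annotations: a blank separator is added only if the message does not already end in such a line. The sketch below restates that rule as a standalone function. `append_metadata` is a hypothetical name, the key names in the usage are illustrative, and the extra guard for an empty message is an addition not present in the patch.

```python
import re

# Matches a trailing "Key: value" line, e.g. b"Reviewable-URL: https://...".
METADATA_RE = re.compile(br'^[a-zA-Z-]+: \S+$')


def append_metadata(message, pairs):
    """Append ``key: value`` lines to a commit message given as bytes."""
    lines = message.rstrip().splitlines()
    # A message that is only a summary line (or empty) always gets a
    # blank separator; otherwise one is added unless the last line
    # already looks like metadata.
    if len(lines) <= 1 or not METADATA_RE.match(lines[-1]):
        lines.append(b'')
    for key, value in pairs:
        lines.append(b'%s: %s' % (key, value))
    return b'\n'.join(lines) + b'\n'
```

For example, `append_metadata(b'initial\n', [(b'Source-Repo', b'https://example.com/repo')])` yields a summary line, a blank line, and then the annotation.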
new file mode 100644
--- /dev/null
+++ b/vcssync/tests/test-linearize-git-multiple-roots.t
@@ -0,0 +1,145 @@
+  $ . $TESTDIR/vcssync/tests/helpers.sh
+
+Create a Git repo with multiple heads
+
+  $ git init grepo
+  Initialized empty Git repository in $TESTTMP/grepo/.git/
+  $ cd grepo
+
+  $ touch foo
+  $ git add foo
+  $ git commit -m initial
+  [master (root-commit) a547cc0] initial
+   1 file changed, 0 insertions(+), 0 deletions(-)
+   create mode 100644 foo
+
+  $ echo 0 > file0
+  $ echo 0 > file1
+  $ echo 0 > file2
+  $ git add file0 file1 file2
+  $ git commit -m 'add file0 file1 file2'
+  [master 14ed61b] add file0 file1 file2
+   3 files changed, 3 insertions(+)
+   create mode 100644 file0
+   create mode 100644 file1
+   create mode 100644 file2
+
+  $ git checkout -b head1
+  Switched to a new branch 'head1'
+  $ echo h1c1 > file1
+  $ git add file1
+  $ git commit -m h1c1
+  [head1 ab003f0] h1c1
+   1 file changed, 1 insertion(+), 1 deletion(-)
+  $ echo h2c2 > file1
+  $ git add file1
+  $ git commit -m h1c2
+  [head1 cf9ad69] h1c2
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+  $ git checkout master
+  Switched to branch 'master'
+  $ echo mc1 > file0
+  $ git add file0
+  $ git commit -m 'master c1'
+  [master bcd2192] master c1
+   1 file changed, 1 insertion(+), 1 deletion(-)
+  $ git checkout -b head2
+  Switched to a new branch 'head2'
+  $ echo h2c1 > file2
+  $ git add file2
+  $ git commit -m h2c1
+  [head2 019d621] h2c1
+   1 file changed, 1 insertion(+), 1 deletion(-)
+  $ echo h2c2 > file2
+  $ git add file2
+  $ git commit -m h2c2
+  [head2 8de7644] h2c2
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+  $ git checkout master
+  Switched to branch 'master'
+  $ echo mc2 > file0
+  $ git add file0
+  $ git commit -m 'master c2'
+  [master 824ed6b] master c2
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+  $ git log --graph --format=oneline --all
+  * cf9ad694e235b1cdc386f05e7a177c364de926ee h1c2
+  * ab003f0dcf722f60b12e1d88eb169294419afc1e h1c1
+  | * 8de7644ef74338499cc06d361abcada458d63ae0 h2c2
+  | * 019d621ccabb0fa19da01c2d2f6c6911f75fa80a h2c1
+  | | * 824ed6bd9a20abbdfc2f30d51697fb38aaeed77f master c2
+  | |/  
+  | * bcd219215eeef8329b848347a4df596c97637c8d master c1
+  |/  
+  * 14ed61bb65666fab453c2c73779776b45a82ed1c add file0 file1 file2
+  * a547cc07d30f025e022b27310c713705158c21b4 initial
+
+  $ git for-each-ref
+  cf9ad694e235b1cdc386f05e7a177c364de926ee commit	refs/heads/head1
+  8de7644ef74338499cc06d361abcada458d63ae0 commit	refs/heads/head2
+  824ed6bd9a20abbdfc2f30d51697fb38aaeed77f commit	refs/heads/master
+
+  $ linearize-git --summary-prefix prefix: . heads/master
+  linearizing 4 commits from heads/master (a547cc07d30f025e022b27310c713705158c21b4 to 824ed6bd9a20abbdfc2f30d51697fb38aaeed77f)
+  1/4 a547cc07d30f025e022b27310c713705158c21b4 initial
+  2/4 14ed61bb65666fab453c2c73779776b45a82ed1c add file0 file1 file2
+  3/4 bcd219215eeef8329b848347a4df596c97637c8d master c1
+  4/4 824ed6bd9a20abbdfc2f30d51697fb38aaeed77f master c2
+  heads/master converted; original: 824ed6bd9a20abbdfc2f30d51697fb38aaeed77f; rewritten: a252594d0435ec401a688422fc9d5d8609411b31
+
+  $ git log --graph --format=oneline convert/dest/heads/master
+  * a252594d0435ec401a688422fc9d5d8609411b31 prefix: master c2
+  * 43297b4d3719db354cd672eda3ccba002083fb51 prefix: master c1
+  * adc692e75b22584d802c3e586719f16759126267 prefix: add file0 file1 file2
+  * a8164f0194857c91f13cefc9ade0378488e89502 prefix: initial
+
+  $ echo mc3 > file0
+  $ git add file0
+  $ git commit -m 'master c3'
+  [master 9f1866b] master c3
+   1 file changed, 1 insertion(+), 1 deletion(-)
+  $ git merge head1
+  Merge made by the 'recursive' strategy.
+   file1 | 2 +-
+   1 file changed, 1 insertion(+), 1 deletion(-)
+  $ git merge head2
+  Merge made by the 'recursive' strategy.
+   file2 | 2 +-
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+  $ git log --graph --format=oneline
+  *   c2aa3459b5fb3528e9808b6229a67165b4a3b565 Merge branch 'head2'
+  |\  
+  | * 8de7644ef74338499cc06d361abcada458d63ae0 h2c2
+  | * 019d621ccabb0fa19da01c2d2f6c6911f75fa80a h2c1
+  * |   8a64d1d9fabfd12eb7c8c2876b3a09b80a60657f Merge branch 'head1'
+  |\ \  
+  | * | cf9ad694e235b1cdc386f05e7a177c364de926ee h1c2
+  | * | ab003f0dcf722f60b12e1d88eb169294419afc1e h1c1
+  * | | 9f1866ba6011fb3621d68dcaa917d8d3044d7ccd master c3
+  * | | 824ed6bd9a20abbdfc2f30d51697fb38aaeed77f master c2
+  | |/  
+  |/|   
+  * | bcd219215eeef8329b848347a4df596c97637c8d master c1
+  |/  
+  * 14ed61bb65666fab453c2c73779776b45a82ed1c add file0 file1 file2
+  * a547cc07d30f025e022b27310c713705158c21b4 initial
+
+  $ linearize-git --summary-prefix prefix: . heads/master
+  linearizing 3 commits from heads/master (9f1866ba6011fb3621d68dcaa917d8d3044d7ccd to c2aa3459b5fb3528e9808b6229a67165b4a3b565)
+  1/3 9f1866ba6011fb3621d68dcaa917d8d3044d7ccd master c3
+  2/3 8a64d1d9fabfd12eb7c8c2876b3a09b80a60657f Merge branch 'head1'
+  3/3 c2aa3459b5fb3528e9808b6229a67165b4a3b565 Merge branch 'head2'
+  heads/master converted; original: c2aa3459b5fb3528e9808b6229a67165b4a3b565; rewritten: 5d54e9062c565acba8fe3b7dda7e7fd1c29e550c
+
+  $ git log --graph --format=oneline refs/convert/dest/heads/master
+  * 5d54e9062c565acba8fe3b7dda7e7fd1c29e550c prefix: Merge branch 'head2'
+  * b0923b63a71946878539e5a82b2430d2ececc0f2 prefix: Merge branch 'head1'
+  * 7de4a0e3a1dec279df9f62590c2be67103957b97 prefix: master c3
+  * a252594d0435ec401a688422fc9d5d8609411b31 prefix: master c2
+  * 43297b4d3719db354cd672eda3ccba002083fb51 prefix: master c1
+  * adc692e75b22584d802c3e586719f16759126267 prefix: add file0 file1 file2
+  * a8164f0194857c91f13cefc9ade0378488e89502 prefix: initial
new file mode 100644
--- /dev/null
+++ b/vcssync/tests/test-linearize-git-not-fast-forward.t
@@ -0,0 +1,72 @@
+  $ . $TESTDIR/vcssync/tests/helpers.sh
+
+  $ git init grepo0
+  Initialized empty Git repository in $TESTTMP/grepo0/.git/
+
+  $ cd grepo0
+  $ echo 0 > foo
+  $ git add foo
+  $ git commit -m initial
+  [master (root-commit) dbd62b8] initial
+   1 file changed, 1 insertion(+)
+   create mode 100644 foo
+  $ echo 1 > foo
+  $ git add foo
+  $ git commit -m 'commit 1'
+  [master f3dcf0e] commit 1
+   1 file changed, 1 insertion(+), 1 deletion(-)
+  $ echo 2 > foo
+  $ git add foo
+  $ git commit -m 'commit 2'
+  [master 2a57f45] commit 2
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+  $ linearize-git . heads/master
+  linearizing 3 commits from heads/master (dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf to 2a57f453609d9dffe0dad9a0544b792a09d4b234)
+  1/3 dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf initial
+  2/3 f3dcf0ea970616078b22c97ff104fa368b61973c commit 1
+  3/3 2a57f453609d9dffe0dad9a0544b792a09d4b234 commit 2
+  heads/master converted; original: 2a57f453609d9dffe0dad9a0544b792a09d4b234; rewritten: 2a57f453609d9dffe0dad9a0544b792a09d4b234
+
+Simulate a force push by doing a hard reset + new commit
+
+  $ git reset --hard f3dcf0ea970616078b22c97ff104fa368b61973c
+  HEAD is now at f3dcf0e commit 1
+  $ echo 2.new > foo
+  $ git add foo
+  $ git commit -m 'commit 3 (reset)'
+  [master 280dddb] commit 3 (reset)
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+Attempting an incremental conversion that isn't a fast-forward will result in
+an error.
+
+  $ linearize-git . heads/master
+  abort: source commit 2a57f453609d9dffe0dad9a0544b792a09d4b234 not found in ref heads/master; refusing to convert non-fast-forward history
+
+And again after resetting to HEAD~2
+
+  $ git reset --hard dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf
+  HEAD is now at dbd62b8 initial
+  $ echo 1.new > foo
+  $ git add foo
+  $ git commit -m 'commit 2 (reset)'
+  [master ea5210e] commit 2 (reset)
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+  $ linearize-git . heads/master
+  abort: source commit 2a57f453609d9dffe0dad9a0544b792a09d4b234 not found in ref heads/master; refusing to convert non-fast-forward history
+
+Resetting back to the original commit allows conversion to resume
+
+  $ git reset --hard 2a57f453609d9dffe0dad9a0544b792a09d4b234
+  HEAD is now at 2a57f45 commit 2
+  $ echo 3 > foo
+  $ git add foo
+  $ git commit -m 'commit 3'
+  [master 10b4510] commit 3
+   1 file changed, 1 insertion(+), 1 deletion(-)
+  $ linearize-git . heads/master
+  linearizing 1 commits from heads/master (10b45106160cb14fd510875c844f38ea26b559c6 to 10b45106160cb14fd510875c844f38ea26b559c6)
+  1/1 10b45106160cb14fd510875c844f38ea26b559c6 commit 3
+  heads/master converted; original: 10b45106160cb14fd510875c844f38ea26b559c6; rewritten: 10b45106160cb14fd510875c844f38ea26b559c6
new file mode 100644
--- /dev/null
+++ b/vcssync/tests/test-linearize-git-p2-author.t
@@ -0,0 +1,65 @@
+  $ . $TESTDIR/vcssync/tests/helpers.sh
+
+  $ git init grepo0
+  Initialized empty Git repository in $TESTTMP/grepo0/.git/
+
+  $ cd grepo0
+  $ echo 0 > foo
+  $ git add foo
+  $ git commit -m initial
+  [master (root-commit) dbd62b8] initial
+   1 file changed, 1 insertion(+)
+   create mode 100644 foo
+  $ git branch feature-branch
+
+  $ echo 1 > foo
+  $ git add foo
+  $ git commit -m 1
+  [master 3859ebb] 1
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+  $ git checkout feature-branch
+  Switched to branch 'feature-branch'
+  $ echo 2 > bar
+  $ git add bar
+  $ GIT_AUTHOR_NAME='Another Author' GIT_AUTHOR_EMAIL='another@example.com' git commit -m 2
+  [feature-branch d2b9537] 2
+   Author: Another Author <another@example.com>
+   1 file changed, 1 insertion(+)
+   create mode 100644 bar
+
+  $ git checkout master
+  Switched to branch 'master'
+  $ git merge feature-branch
+  Merge made by the 'recursive' strategy.
+   bar | 1 +
+   1 file changed, 1 insertion(+)
+   create mode 100644 bar
+
+Using the author of the second parent (p2) for the rewritten merge commit works
+
+  $ linearize-git --use-p2-author . heads/master
+  linearizing 3 commits from heads/master (dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf to 1d7609530bb5efe1b11c2be19368669f9892e055)
+  1/3 dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf initial
+  2/3 3859ebb89b4a8ef66d455f7f0d550a8a609154da 1
+  3/3 1d7609530bb5efe1b11c2be19368669f9892e055 Merge branch 'feature-branch'
+  heads/master converted; original: 1d7609530bb5efe1b11c2be19368669f9892e055; rewritten: 534e8c7588c35783ba75712dc4549852a65ed720
+
+  $ git log convert/dest/heads/master
+  commit 534e8c7588c35783ba75712dc4549852a65ed720
+  Author: Another Author <another@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      Merge branch 'feature-branch'
+  
+  commit 3859ebb89b4a8ef66d455f7f0d550a8a609154da
+  Author: test <test@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      1
+  
+  commit dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf
+  Author: test <test@example.com>
+  Date:   Thu Jan 1 00:00:00 1970 +0000
+  
+      initial
new file mode 100644
--- /dev/null
+++ b/vcssync/tests/test-linearize-git-record-original-commit.t
@@ -0,0 +1,80 @@
+  $ . $TESTDIR/vcssync/tests/helpers.sh
+
+Create a Git repo with linear history and several branches
+
+  $ git init grepo0
+  Initialized empty Git repository in $TESTTMP/grepo0/.git/
+  $ cd grepo0
+  $ echo 0 > foo
+  $ git add foo
+  $ git commit -m initial
+  [master (root-commit) dbd62b8] initial
+   1 file changed, 1 insertion(+)
+   create mode 100644 foo
+  $ echo 1 > foo
+  $ git add foo
+  $ cat >> message << EOF
+  > commit 1
+  > 
+  > Reviewable-URL: https://example.com/foo
+  > EOF
+  $ git commit -F message
+  [master 4064f3a] commit 1
+   1 file changed, 1 insertion(+), 1 deletion(-)
+  $ git branch master2
+  $ git branch master3
+  $ git branch master4
+
+Source repo annotations work
+
+  $ linearize-git --source-repo https://github.com/example/repo.git --source-repo-key Source-Repo . heads/master2
+  linearizing 2 commits from heads/master2 (dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf to 4064f3a8845ed27962b26096cfae39610ea97c8e)
+  1/2 dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf initial
+  2/2 4064f3a8845ed27962b26096cfae39610ea97c8e commit 1
+  heads/master2 converted; original: 4064f3a8845ed27962b26096cfae39610ea97c8e; rewritten: 08225afa188929f3b1b5b06d2dff1e0a6dbbd707
+
+  $ git cat-file -p 08225afa188929f3b1b5b06d2dff1e0a6dbbd707
+  tree a229c158b3d5560cc44ad3dec6ff5d13a47e11cf
+  parent c7a2854e7d8d1f3e6b1abc8bd7cf8a6a1a225f9f
+  author test <test@example.com> 0 +0000
+  committer test <test@example.com> 0 +0000
+  
+  commit 1
+  
+  Reviewable-URL: https://example.com/foo
+  Source-Repo: https://github.com/example/repo.git
+
+  $ linearize-git --source-revision-key Source-Revision . heads/master3
+  linearizing 2 commits from heads/master3 (dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf to 4064f3a8845ed27962b26096cfae39610ea97c8e)
+  1/2 dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf initial
+  2/2 4064f3a8845ed27962b26096cfae39610ea97c8e commit 1
+  heads/master3 converted; original: 4064f3a8845ed27962b26096cfae39610ea97c8e; rewritten: f7fabf46f67fae5f49e2776b72307a7d17cd560f
+
+  $ git cat-file -p f7fabf46f67fae5f49e2776b72307a7d17cd560f
+  tree a229c158b3d5560cc44ad3dec6ff5d13a47e11cf
+  parent 1c8e8b4b1c0c0eca6b0452241f05fe983c6f3b52
+  author test <test@example.com> 0 +0000
+  committer test <test@example.com> 0 +0000
+  
+  commit 1
+  
+  Reviewable-URL: https://example.com/foo
+  Source-Revision: 4064f3a8845ed27962b26096cfae39610ea97c8e
+
+  $ linearize-git --source-repo https://github.com/example/repo.git --source-repo-key Source-Repo --source-revision-key Source-Revision . heads/master4
+  linearizing 2 commits from heads/master4 (dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf to 4064f3a8845ed27962b26096cfae39610ea97c8e)
+  1/2 dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf initial
+  2/2 4064f3a8845ed27962b26096cfae39610ea97c8e commit 1
+  heads/master4 converted; original: 4064f3a8845ed27962b26096cfae39610ea97c8e; rewritten: e39b36eab045450d8cf25e77532aa5c062da792d
+
+  $ git cat-file -p e39b36eab045450d8cf25e77532aa5c062da792d
+  tree a229c158b3d5560cc44ad3dec6ff5d13a47e11cf
+  parent d59910e04b7469ebc5f93299632836c79c5a0aff
+  author test <test@example.com> 0 +0000
+  committer test <test@example.com> 0 +0000
+  
+  commit 1
+  
+  Reviewable-URL: https://example.com/foo
+  Source-Repo: https://github.com/example/repo.git
+  Source-Revision: 4064f3a8845ed27962b26096cfae39610ea97c8e
new file mode 100644
--- /dev/null
+++ b/vcssync/tests/test-linearize-git-reflog.t
@@ -0,0 +1,51 @@
+  $ . $TESTDIR/vcssync/tests/helpers.sh
+
+  $ git init grepo0
+  Initialized empty Git repository in $TESTTMP/grepo0/.git/
+  $ cd grepo0
+  $ echo 0 > foo
+  $ git add foo
+  $ git commit -m initial
+  [master (root-commit) dbd62b8] initial
+   1 file changed, 1 insertion(+)
+   create mode 100644 foo
+
+Need to set the committer date because reflog entries are timestamped with it
+
+  $ export GIT_COMMITTER_DATE='Fri Jan 6 00:00:00 2017 +0000'
+
+  $ linearize-git --summary-prefix prefix: . heads/master
+  linearizing 1 commits from heads/master (dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf to dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf)
+  1/1 dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf initial
+  heads/master converted; original: dbd62b82aaf0a7a05665d9455a9b4d490d52ddaf; rewritten: 6f15a738f983e864bdfac5088fcb9c4c0e339757
+
+Refs tracking converted commits have a reflog entry
+
+  $ git reflog show convert/source/heads/master
+  dbd62b8 convert/source/heads/master@{0}: linearize heads/master
+
+  $ git reflog show convert/dest/heads/master
+  6f15a73 convert/dest/heads/master@{0}: linearize heads/master
+
+Performing an incremental conversion will create a new reflog entry
+
+  $ echo 1 > foo
+  $ git add foo
+  $ git commit -m initial
+  [master ccdbd02] initial
+   1 file changed, 1 insertion(+), 1 deletion(-)
+
+  $ export GIT_COMMITTER_DATE='Fri Jan 6 00:00:01 2017 +0000'
+
+  $ linearize-git --summary-prefix prefix: . heads/master
+  linearizing 1 commits from heads/master (ccdbd027bb70f567e4e21296450e0c991ee52d4b to ccdbd027bb70f567e4e21296450e0c991ee52d4b)
+  1/1 ccdbd027bb70f567e4e21296450e0c991ee52d4b initial
+  heads/master converted; original: ccdbd027bb70f567e4e21296450e0c991ee52d4b; rewritten: e800cfd3c722caab882652b26f98a9e568582a60
+
+  $ git reflog show convert/source/heads/master
+  ccdbd02 convert/source/heads/master@{0}: linearize heads/master
+  dbd62b8 convert/source/heads/master@{1}: linearize heads/master
+
+  $ git reflog show convert/dest/heads/master
+  e800cfd convert/dest/heads/master@{0}: linearize heads/master
+  6f15a73 convert/dest/heads/master@{1}: linearize heads/master