hghooks: hook to trigger cache population (bug 1358239); r?glob draft
author Gregory Szorc <gps@mozilla.com>
Fri, 09 Jun 2017 15:44:38 -0700
changeset 11192 0b25a097b8485bfc2dd2f6f4405b32569b867718
parent 11189 aef5b2f44147a6e221f7fb6951a3a67e0bcf4ea7
push id 1703
push user gszorc@mozilla.com
push date Fri, 09 Jun 2017 22:45:20 +0000
reviewers glob
bugs 1358239
Mozilla has historically had problems with the tags cache causing performance problems. Most of these problems were resolved a few years ago by a rewritten tags cache implementation in Mercurial. However, for that cache to work, it needs to be populated. And the cache isn't populated until something attempts to resolve a symbol that could be a tag.

This is normally not a problem. On hgweb, the tags cache will be updated on most page views. However, it can be problematic on the master server. Nothing in the regular push flow appears to access tags data. This means that the tags cache may not be written on the master server.

It so happens that we have processes running on the master server that don't have write privileges to repos. So while they may trigger tags cache population, the cache I/O fails and the cache is never written. As long as the cache is never written, these processes have to compute more and more tags data and the amount of CPU required balloons over time. Enough time passes and alerts start to fire because processes that should be quick are spending dozens of seconds computing tags data.

This commit solves that problem by implementing a pretxnclose (pre transaction close) hook that accesses tags data, thus ensuring the tags cache is up to date. Other processes should never have to compute tags data again.

Furthermore, `hg pull` will transfer tags cache data. So populating the cache before replication ensures that hgweb machines also don't have to populate the cache. Finally, the bundles generated for clone bundles should have current tags caches, ensuring that people who clone from them don't need to regenerate the data. So ensuring the tags cache is populated on the master server is full of wins.

The new hook is globally installed on the master server so all repos benefit from tags cache generation.

Unless we pre-populate the tags cache on all repos, the first push to a repo after this is enabled may be slow. All subsequent pushes may slow down because of this hook. The tags cache population time is proportional to the number of files in a repo and the number of heads being pushed. However, the common case is 1 head per push and Mercurial should cache the manifest for that head, thus ensuring rapid tags cache generation.

The tags cache generation time is recorded in blackbox logs. So if the logs show this hook causes too much of a perf drain, we can look into alternate mechanisms for populating the tags cache.

MozReview-Commit-ID: 5q9TuqO4JaS
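
As a rough illustration of what the hook does, the sketch below runs the same two calls against a stand-in object. `FakeRepo` is a hypothetical stub invented here so the example runs without Mercurial; it only records which view `tags()` was called on. The view names mirror the `tags2-served` and `tags2` cache files seen in the blackbox output of the tests, on the assumption that those files correspond to the filtered and unfiltered repo views.

```python
class FakeRepo(object):
    """Hypothetical stand-in for a Mercurial repo; not part of the patch."""

    def __init__(self, view="served", calls=None):
        self.view = view
        # Shared list so calls made through unfiltered() are recorded too.
        self.calls = [] if calls is None else calls

    def tags(self):
        # A real repo would resolve tags here and write its tags cache.
        self.calls.append(self.view)
        return {}

    def unfiltered(self):
        # Return a second view of the same repo, sharing the call log.
        return FakeRepo("unfiltered", self.calls)


def hook(ui, repo, **kwargs):
    # Same body as mozhghooks.populate_caches.hook in this patch.
    repo.tags()
    repo.unfiltered().tags()


if __name__ == "__main__":
    repo = FakeRepo()
    hook(None, repo)
    print(repo.calls)
```

The stub shows only the call pattern: one tags lookup per repo view, which would explain why the tests log two separate cache writes for a single hook invocation.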
ansible/roles/hg-ssh/templates/hgrc.j2
hghooks/mozhghooks/populate_caches.py
hghooks/tests/test-populate-caches.t
hgserver/tests/test-push-basic.t
--- a/ansible/roles/hg-ssh/templates/hgrc.j2
+++ b/ansible/roles/hg-ssh/templates/hgrc.j2
@@ -17,16 +17,18 @@ changegroup.a_recordlogs = /var/hg/versi
 changegroup.push_printurls = python:mozhghooks.push_printurls.hook
 changegroup.z_advertize_upgrade = python:mozhghooks.advertise_upgrade.hook
 #pretxnchangegroup.renamecase = python:mozhghooks.prevent_case_only_renames.hook
 # Disabled because too many people are running into issues. Need more
 # granular checking for now. Bug 787620.
 #pretxnchangegroup.author_format = python:mozhghooks.author_format.hook
 pretxnchangegroup.single_root = python:mozhghooks.single_root.hook
 
+pretxnclose.populate_caches = python:mozhghooks.populate_caches.hook
+
 [extensions]
 blackbox =
 clonebundles =
 
 obsolescencehacks = /var/hg/version-control-tools/hgext/obsolescencehacks
 pushlog = /var/hg/version-control-tools/hgext/pushlog
 serverlog = /var/hg/version-control-tools/hgext/serverlog
 readonly = /var/hg/version-control-tools/hgext/readonly
new file mode 100644
--- /dev/null
+++ b/hghooks/mozhghooks/populate_caches.py
@@ -0,0 +1,10 @@
+# This software may be used and distributed according to the terms of the
+# GNU General Public License version 2 or any later version.
+
+from __future__ import absolute_import
+
+
+def hook(ui, repo, **kwargs):
+    # Trigger tags cache generation.
+    repo.tags()
+    repo.unfiltered().tags()
new file mode 100644
--- /dev/null
+++ b/hghooks/tests/test-populate-caches.t
@@ -0,0 +1,65 @@
+  $ hg init server
+  $ cat >> server/.hg/hgrc << EOF
+  > [extensions]
+  > blackbox =
+  > [blackbox]
+  > track = *
+  > EOF
+
+  $ hg -q clone server client
+  $ cd client
+
+  $ touch foo
+  $ hg -q commit -A -m initial
+  $ hg push
+  pushing to $TESTTMP/server
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 1 changesets with 1 changes to 1 files
+
+No tags cache should exist because there is no .hgtags file
+
+  $ [ -f ../server/.hg/cache/hgtagsfnodes1 ]
+  [1]
+
+Pushing a tag should not populate the tags cache without the hook
+
+  $ hg tag initial
+  $ hg -q push
+
+  $ [ -f ../server/.hg/cache/hgtagsfnodes1 ]
+  [1]
+
+  $ cat ../server/.hg/blackbox.log
+  *> updated base branch cache in * seconds (glob)
+  *> wrote base branch cache with 1 labels and 1 nodes (glob)
+  *> 1 incoming changes - new heads: 96ee1d7354c4 (glob)
+  *> updated base branch cache in * seconds (glob)
+  *> wrote base branch cache with 1 labels and 1 nodes (glob)
+  *> 1 incoming changes - new heads: 5e849d85a748 (glob)
+
+Activating the hook causes tags cache to get populated
+
+  $ cat >> ../server/.hg/hgrc << EOF
+  > [hooks]
+  > pretxnclose.populate_caches = python:mozhghooks.populate_caches.hook
+  > EOF
+
+  $ hg tag newtag
+  $ hg -q push
+
+  $ [ -f ../server/.hg/cache/hgtagsfnodes1 ]
+
+  $ tail -10 ../server/.hg/blackbox.log
+  *> 1 incoming changes - new heads: 5e849d85a748 (glob)
+  *> updated base branch cache in * seconds (glob)
+  *> wrote base branch cache with 1 labels and 1 nodes (glob)
+  *> writing 72 bytes to cache/hgtagsfnodes1 (glob)
+  *> 0/1 cache hits/lookups in * seconds (glob)
+  *> writing .hg/cache/tags2-served with 2 tags (glob)
+  *> 1/1 cache hits/lookups in * seconds (glob)
+  *> writing .hg/cache/tags2 with 2 tags (glob)
+  *> pythonhook-pretxnclose: mozhghooks.populate_caches.hook finished in * seconds (glob)
+  *> 1 incoming changes - new heads: cf120f74c0ec (glob)
--- a/hgserver/tests/test-push-basic.t
+++ b/hgserver/tests/test-push-basic.t
@@ -60,16 +60,19 @@ Blackbox logging recorded appropriate en
   * user1@example.com @0000000000000000000000000000000000000000 (*)> pythonhook-pretxnopen: hgext_vcsreplicator.pretxnopenhook finished in * seconds (glob)
   * user1@example.com @0000000000000000000000000000000000000000 (*)> pythonhook-prechangegroup: hgext_readonly.prechangegrouphook finished in * seconds (glob)
   * user1@example.com @0000000000000000000000000000000000000000 (*)> pythonhook-pretxnchangegroup: mozhghooks.single_root.hook finished in * seconds (glob)
   * user1@example.com @0000000000000000000000000000000000000000 (*)> pythonhook-pretxnchangegroup: hgext_pushlog.pretxnchangegrouphook finished in * seconds (glob)
   * user1@example.com @0000000000000000000000000000000000000000 (*)> pythonhook-pretxnchangegroup: hgext_vcsreplicator.pretxnchangegrouphook finished in * seconds (glob)
   * user1@example.com @0000000000000000000000000000000000000000 (*)> updated base branch cache in * seconds (glob)
   * user1@example.com @0000000000000000000000000000000000000000 (*)> wrote base branch cache with 1 labels and 1 nodes (glob)
   * user1@example.com @0000000000000000000000000000000000000000 (*)> pythonhook-prepushkey: hgext_readonly.prepushkeyhook finished in * seconds (glob)
+  * user1@example.com @0000000000000000000000000000000000000000 (*)> writing .hg/cache/tags2-served with 0 tags (glob)
+  * user1@example.com @0000000000000000000000000000000000000000 (*)> writing .hg/cache/tags2 with 0 tags (glob)
+  * user1@example.com @0000000000000000000000000000000000000000 (*)> pythonhook-pretxnclose: mozhghooks.populate_caches.hook finished in * seconds (glob)
   * user1@example.com @0000000000000000000000000000000000000000 (*)> pythonhook-pretxnclose: hgext_vcsreplicator.pretxnclosehook finished in * seconds (glob)
   * user1@example.com @0000000000000000000000000000000000000000 (*)> pythonhook-pushkey: hgext_vcsreplicator.pushkeyhook finished in * seconds (glob)
   * user1@example.com @0000000000000000000000000000000000000000 (*)> pythonhook-txnclose: hgext_vcsreplicator.txnclosehook finished in * seconds (glob)
   * user1@example.com @0000000000000000000000000000000000000000 (*)> exthook-changegroup.a_recordlogs: /var/hg/version-control-tools/scripts/record-pushes.sh finished in * seconds (glob)
   * user1@example.com @0000000000000000000000000000000000000000 (*)> pythonhook-changegroup: mozhghooks.push_printurls.hook finished in * seconds (glob)
   * user1@example.com @0000000000000000000000000000000000000000 (*)> pythonhook-changegroup: mozhghooks.advertise_upgrade.hook finished in * seconds (glob)
   * user1@example.com @0000000000000000000000000000000000000000 (*)> pythonhook-changegroup: hgext_vcsreplicator.changegrouphook finished in * seconds (glob)
   * user1@example.com @0000000000000000000000000000000000000000 (*)> 1 incoming changes - new heads: 77538e1ce4be (glob)