Bug 1397503 - Vary cache name when using out-of-tree Docker images; r?dustin draft
author Gregory Szorc <gps@mozilla.com>
Wed, 06 Sep 2017 16:09:15 -0700
changeset 661123 9703b2c00c55cdc115adc2836db6578855c36705
parent 660427 5393c21a4ca0ad487d494c74bc9e251c9b248a53
child 730457 29277ff6642640f81c2c9ac44006abd48a5ca7f7
push id 78636
push user gszorc@mozilla.com
push date Thu, 07 Sep 2017 23:55:09 +0000
reviewers dustin
bugs 1397503
milestone 57.0a1
We currently vary the cache name for run-task tasks whenever run-task changes. This allows us to not worry about backwards or forwards compatibility of caches in run-task tasks.

This strategy doesn't work for out-of-tree Docker images because the content of run-task cannot be determined at Taskgraph time: the content of run-task was determined when that Docker image was built and there is no way to get that content efficiently during Taskgraph.

So, for out-of-tree Docker images we now vary the cache name by the Docker image value, which includes its name and a tag or hash. This means that out-of-tree run-task tasks will get separate caches for each distinct Docker image.

This isn't ideal. Ideally we would share caches if run-task doesn't vary between Docker images. But without any way of proving that at Taskgraph time, we take the safe road and force cache separation.

MozReview-Commit-ID: FMiQBqfvjqW
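The cache-naming scheme described in the commit message can be sketched roughly as follows. This is a minimal standalone illustration, not the in-tree code: the function name `cache_suffix`, its parameters, and the example image strings are hypothetical, and the `.encode('utf-8')` call is an assumption added so the sketch runs on Python 3.

```python
import hashlib


def cache_suffix(run_task_hash, out_of_tree_image=None):
    """Return the suffix appended to a cache name (illustrative only).

    run_task_hash: a string identifying the current run-task content
        (hypothetical stand-in for whatever the in-tree helper returns).
    out_of_tree_image: the Docker image value (name plus tag or hash),
        or None when the image is built in-tree.
    """
    # Every run-task cache varies with the content of run-task.
    suffix = '-%s' % run_task_hash

    if out_of_tree_image:
        # The run-task baked into an out-of-tree image is unknown at
        # graph-generation time, so also vary by the image value itself.
        image_hash = hashlib.sha256(out_of_tree_image.encode('utf-8')).hexdigest()
        suffix += image_hash[:12]

    return suffix


# Example values below are made up for illustration.
in_tree = cache_suffix('abc123')
image_a = cache_suffix('abc123', 'example/image-a:1.0')
image_b = cache_suffix('abc123', 'example/image-b:1.0')
```

With these inputs, `in_tree` carries only the run-task portion, while `image_a` and `image_b` share that portion but end in different 12-character image hashes, so tasks using the two images would never share a cache.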
taskcluster/taskgraph/transforms/task.py
--- a/taskcluster/taskgraph/transforms/task.py
+++ b/taskcluster/taskgraph/transforms/task.py
@@ -5,16 +5,17 @@
 These transformations take a task description and turn it into a TaskCluster
 task definition (along with attributes, label, etc.).  The input to these
 transformations is generic to any kind of task, but abstracts away some of the
 complexities of worker implementations, scopes, and treeherder annotations.
 """
 
 from __future__ import absolute_import, print_function, unicode_literals
 
+import hashlib
 import json
 import os
 import re
 import time
 from copy import deepcopy
 
 from mozbuild.util import memoize
 from taskgraph.util.attributes import TRUNK_PROJECTS
@@ -717,19 +718,36 @@ def build_docker_worker_payload(config, 
 
         # run-task knows how to validate caches.
         #
         # To help ensure new run-task features and bug fixes don't interfere
         # with existing caches, we seed the hash of run-task into cache names.
         # So, any time run-task changes, we should get a fresh set of caches.
         # This means run-task can make changes to cache interaction at any time
         # without regards for backwards or future compatibility.
-
+        #
+        # But this mechanism only works for in-tree Docker images that are built
+        # with the current run-task! For out-of-tree Docker images, we have no
+        # way of knowing their content of run-task. So, in addition to varying
+        # cache names by the contents of run-task, we also take the Docker image
+        # name into consideration. This means that different Docker images will
+        # never share the same cache. This is a bit unfortunate. But it is the
+        # safest thing to do. Fortunately, most images are defined in-tree.
+        #
+        # For out-of-tree Docker images, we don't strictly need to incorporate
+        # the run-task content into the cache name. However, doing so preserves
+        # the mechanism whereby changing run-task results in new caches
+        # everywhere.
         if run_task:
             suffix = '-%s' % _run_task_suffix()
+
+            if out_of_tree_image:
+                name_hash = hashlib.sha256(out_of_tree_image).hexdigest()
+                suffix += name_hash[0:12]
+
         else:
             suffix = ''
 
         skip_untrusted = config.params['project'] == 'try' or level == 1
 
         for cache in worker['caches']:
             # Some caches aren't enabled in environments where we can't
             # guarantee certain behavior. Filter those out.