--- a/taskcluster/docs/index.rst
+++ b/taskcluster/docs/index.rst
@@ -19,15 +19,16 @@ referring to the source where necessary.
particular goal in mind and would rather avoid becoming a task-graph expert,
check out the :doc:`how-to section <how-tos>`.
.. toctree::
taskgraph
loading
transforms
+ optimization
yaml-templates
docker-images
cron
how-tos
in-tree-actions
action-spec
reference
new file mode 100644
--- /dev/null
+++ b/taskcluster/docs/optimization.rst
@@ -0,0 +1,44 @@
+Optimization
+============
+
+The objective of optimization is to remove as many tasks from the graph as
+possible, as efficiently as possible, thereby delivering useful results as
+quickly as possible. For example, ideally if only a test script is modified in
+a push, then the resulting graph contains only the corresponding test suite
+task.
+
+A task is said to be "optimized" when it is either replaced with an equivalent,
+already-existing task, or dropped from the graph entirely.
+
+Optimization Functions
+----------------------
+
+During the optimization phase of task-graph generation, each task is optimized
+in post-order, meaning that each task's dependencies will be optimized before
+the task itself is optimized.
+
+Each task has a ``task.optimizations`` property describing the optimization
+methods that apply. Each entry is a list: the method name, then its arguments.
+For example::
+
+    task.optimizations = [
+        ['seta'],
+        ['files-changed', ['js/**', 'tests/**']],
+    ]
+
+These methods are defined in ``taskcluster/taskgraph/optimize.py``. They are
+applied in order, and the first to return a success value causes the task to
+be optimized.
+
+Each method returns a pair ``(optimized, task_id)``. A taskId means the task
+can be replaced by that existing task; ``(True, None)`` means it can be dropped.
+If a task on which others depend is dropped, task-graph generation will fail.
+
+Optimizing Target Tasks
+-----------------------
+
+In some cases, such as try pushes, tasks in the target task set have been
+explicitly requested and are thus excluded from optimization. In other cases,
+the target task set is almost the entire task graph, so targeted tasks are
+considered for optimization. This behavior is controlled with the
+``optimize_target_tasks`` parameter.
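
Reviewer sketch, not part of the patch: the post-order rule and the "dropped dependency" failure described in optimization.rst can be illustrated on a toy graph. The names `post_order` and `annotate` and the dictionary-based graph format are assumptions made for illustration only; the real traversal is done by `annotate_task_graph` and may differ in detail.

```python
def post_order(deps):
    """Return labels ordered so that dependencies precede their dependents.

    `deps` maps a label to the labels it depends on (hypothetical toy format).
    """
    seen = set()
    order = []

    def visit(label):
        if label in seen:
            return
        seen.add(label)
        for dep in sorted(deps.get(label, ())):
            visit(dep)
        order.append(label)

    for label in sorted(deps):
        visit(label)
    return order


def annotate(deps, decide):
    """Apply `decide(label) -> (optimized, task_id)` to each task in post-order.

    Fails when a task depends on one that was dropped, that is, optimized
    without a replacement taskId.
    """
    result = {}
    for label in post_order(deps):
        for dep in deps.get(label, ()):
            dep_optimized, dep_task_id = result[dep]
            if dep_optimized and dep_task_id is None:
                raise Exception(
                    "{} depends on {}, which was optimized away".format(label, dep))
        result[label] = decide(label)
    return result


# Toy graph: test -> build -> docker-image
deps = {'test': ['build'], 'build': ['docker-image'], 'docker-image': []}
order = post_order(deps)
annotations = annotate(
    deps,
    lambda label: (True, 'existing-id') if label == 'docker-image' else (False, None))
```

Replacing `docker-image` with an existing taskId is fine here because its dependents can point at the replacement; returning `(True, None)` for it instead would make `annotate` raise.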
--- a/taskcluster/docs/taskgraph.rst
+++ b/taskcluster/docs/taskgraph.rst
@@ -91,19 +91,19 @@ Graph generation, as run via ``mach task
#. For all kinds, generate all tasks. The result is the "full task set"
#. Create dependency links between tasks using kind-specific mechanisms. The
result is the "full task graph".
#. Filter the target tasks (based on a series of filters, such as try syntax,
tree-specific specifications, etc). The result is the "target task set".
#. Based on the full task graph, calculate the transitive closure of the target
task set. That is, the target tasks and all requirements of those tasks.
The result is the "target task graph".
-#. Optimize the target task graph based on kind-specific optimization methods.
+#. Optimize the target task graph using task-specific optimization methods.
The result is the "optimized task graph" with fewer nodes than the target
- task graph.
+ task graph. See :doc:`optimization`.
#. Create tasks for all tasks in the optimized task graph.
Transitive Closure
..................
Transitive closure is a fancy name for this sort of operation:
* start with a set of tasks
@@ -118,42 +118,16 @@ Then repeat: the test docker image task
tasks, but those build tasks depend on the build docker image task. So add
that build docker image task. Repeat again: this time, none of the tasks in
the set depend on a task not in the set, so nothing changes and the process is
complete.
And as you can see, the graph we've built now includes everything we wanted
(the test jobs) plus everything required to do that (docker images, builds).
-Optimization
-------------
-
-The objective of optimization to remove as many tasks from the graph as
-possible, as efficiently as possible, thereby delivering useful results as
-quickly as possible. For example, ideally if only a test script is modified in
-a push, then the resulting graph contains only the corresponding test suite
-task.
-
-A task is said to be "optimized" when it is either replaced with an equivalent,
-already-existing task, or dropped from the graph entirely.
-
-A task can be optimized if all of its dependencies can be optimized and none of
-its inputs have changed. For a task on which no other tasks depend (a "leaf
-task"), the optimizer can determine what has changed by looking at the
-version-control history of the push: if the relevant files are not modified in
-the push, then it considers the inputs unchanged. For tasks on which other
-tasks depend ("non-leaf tasks"), the optimizer must replace the task with
-another, equivalent task, so it generates a hash of all of the inputs and uses
-that to search for a matching, existing task.
-
-In some cases, such as try pushes, tasks in the target task set have been
-explicitly requested and are thus excluded from optimization. In other cases,
-the target task set is almost the entire task graph, so targetted tasks are
-considered for optimization. This behavior is controlled with the
-``optimize_target_tasks`` parameter.
Action Tasks
------------
Action Tasks are tasks which help you to schedule new jobs via Treeherder's
"Add New Jobs" feature. The Decision Task creates a YAML file named
``action.yml`` which can be used to schedule Action Tasks after suitably replacing
``{{decision_task_id}}`` and ``{{task_labels}}``, which correspond to the decision
--- a/taskcluster/taskgraph/optimize.py
+++ b/taskcluster/taskgraph/optimize.py
@@ -1,23 +1,31 @@
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
from __future__ import absolute_import, print_function, unicode_literals
+
import logging
import re
+import os
+import requests
from .graph import Graph
+from . import files_changed
from .taskgraph import TaskGraph
+from .util.seta import is_low_value_task
+from .util.taskcluster import find_task_id
from slugid import nice as slugid
logger = logging.getLogger(__name__)
TASK_REFERENCE_PATTERN = re.compile('<([^>]+)>')
+_optimizations = {}
+
def optimize_task_graph(target_task_graph, params, do_not_optimize, existing_tasks=None):
"""
Perform task optimization, without optimizing tasks named in
do_not_optimize.
"""
named_links_dict = target_task_graph.graph.named_links_dict()
label_to_taskid = {}
@@ -55,16 +63,31 @@ def resolve_task_references(label, task_
return TASK_REFERENCE_PATTERN.sub(repl, val['task-reference'])
else:
return {k: recurse(v) for k, v in val.iteritems()}
else:
return val
return recurse(task_def)
+def optimize_task(task, params):
+    """
+    Optimize a single task by running its optimizations in order until one
+    succeeds.
+    """
+    for opt in task.optimizations:
+        opt_type, args = opt[0], opt[1:]
+        opt_fn = _optimizations[opt_type]
+        optimized, task_id = opt_fn(task, params, *args)
+        if optimized or task_id:
+            return optimized, task_id
+
+    return False, None
+
+
def annotate_task_graph(target_task_graph, params, do_not_optimize,
named_links_dict, label_to_taskid, existing_tasks):
"""
Annotate each task in the graph with .optimized (boolean) and .task_id
(possibly None), following the rules for optimization and calling the task
kinds' `optimize_task` method.
As a side effect, label_to_taskid is updated with labels for all optimized
@@ -90,17 +113,17 @@ def annotate_task_graph(target_task_grap
if label in do_not_optimize:
optimized = False
# Let's check whether this task has been created before
elif existing_tasks is not None and label in existing_tasks:
optimized = True
replacement_task_id = existing_tasks[label]
# otherwise, examine the task itself (which may be an expensive operation)
else:
- optimized, replacement_task_id = task.optimize(params)
+ optimized, replacement_task_id = optimize_task(task, params)
task.optimized = optimized
task.task_id = replacement_task_id
if replacement_task_id:
label_to_taskid[label] = replacement_task_id
if optimized:
if replacement_task_id:
@@ -149,8 +172,69 @@ def get_subgraph(annotated_task_graph, n
(left, right, name)
for (left, right, name) in edges_by_taskid
if left in tasks_by_taskid and right in tasks_by_taskid
)
return TaskGraph(
tasks_by_taskid,
Graph(set(tasks_by_taskid), edges_by_taskid))
+
+
+def optimization(name):
+    def wrap(func):
+        if name in _optimizations:
+            raise Exception("multiple optimizations with name {}".format(name))
+        _optimizations[name] = func
+        return func
+    return wrap
+
+
+@optimization('index-search')
+def opt_index_search(task, params, index_path):
+    try:
+        task_id = find_task_id(
+            index_path,
+            use_proxy=bool(os.environ.get('TASK_ID')))
+
+        return True, task_id
+    except requests.exceptions.HTTPError:
+        pass
+
+    return False, None
+
+
+@optimization('seta')
+def opt_seta(task, params):
+    bbb_task = False
+
+    # no need to call SETA for build jobs
+    if task.task.get('extra', {}).get('treeherder', {}).get('jobKind', '') == 'build':
+        return False, None
+
+    # for bbb tasks we need to send in the buildbot buildername
+    if task.task.get('provisionerId', '') == 'buildbot-bridge':
+        label = task.task.get('payload').get('buildername')
+        bbb_task = True
+    else:
+        label = task.label
+
+    # return 'False, None' for a high-value task, which is not optimized;
+    # otherwise return 'True, None' to optimize the task away
+    if is_low_value_task(label,
+                         params.get('project'),
+                         params.get('pushlog_id'),
+                         params.get('pushdate'),
+                         bbb_task):
+        # Always optimize away low-value tasks
+        return True, None
+    else:
+        return False, None
+
+
+@optimization('files-changed')
+def opt_files_changed(task, params, file_patterns):
+    changed = files_changed.check(params, file_patterns)
+    if not changed:
+        logger.debug('no files found matching a pattern in `when.files-changed` for ' +
+                     task.label)
+        return True, None
+    return False, None
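
Reviewer sketch, not part of the patch: a standalone copy of the registry pattern used in optimize.py, to show how a new optimization would plug in and how the first success ends the search. `FakeTask`, `skip-if-label-prefix` and `never` are hypothetical names invented for this illustration; the patch itself registers only `index-search`, `seta` and `files-changed`.

```python
_optimizations = {}


def optimization(name):
    """Register a function under `name`, refusing duplicate names."""
    def wrap(func):
        if name in _optimizations:
            raise Exception("multiple optimizations with name {}".format(name))
        _optimizations[name] = func
        return func
    return wrap


def optimize_task(task, params):
    """Run the task's optimizations in order; the first success wins."""
    for opt in task.optimizations:
        # The first element names the method, the rest are its arguments.
        opt_type, args = opt[0], opt[1:]
        optimized, task_id = _optimizations[opt_type](task, params, *args)
        if optimized or task_id:
            return optimized, task_id
    return False, None


class FakeTask(object):
    """Hypothetical stand-in for a Task; carries only what the sketch needs."""
    def __init__(self, label, optimizations):
        self.label = label
        self.optimizations = optimizations


@optimization('skip-if-label-prefix')  # hypothetical optimization
def opt_skip_if_label_prefix(task, params, prefix):
    return (True, None) if task.label.startswith(prefix) else (False, None)


@optimization('never')  # hypothetical optimization that always declines
def opt_never(task, params):
    return False, None


task = FakeTask('docs-build', [['never'], ['skip-if-label-prefix', 'docs-']])
result = optimize_task(task, {})
```

Because arguments are passed with `*args`, each entry in `task.optimizations` must carry exactly the positional arguments its function expects after `task` and `params`.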
--- a/taskcluster/taskgraph/task/base.py
+++ b/taskcluster/taskgraph/task/base.py
@@ -1,94 +1,65 @@
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
from __future__ import absolute_import, print_function, unicode_literals
import abc
-import os
-import requests
-from taskgraph.util.taskcluster import find_task_id
class Task(object):
"""
Representation of a task in a TaskGraph. Each Task has, at creation:
- kind: the name of the task kind
- label; the label for this task
- attributes: a dictionary of attributes for this task (used for filtering)
- task: the task definition (JSON-able dictionary)
- - index_paths: index paths where equivalent tasks might be found for optimization
+ - optimizations: optimizations to apply to the task (see taskgraph.optimize)
- dependencies: tasks this one depends on, in the form {name: label}, for example
{'build': 'build-linux64/opt', 'docker-image': 'build-docker-image-desktop-test'}
And later, as the task-graph processing proceeds:
- task_id -- TaskCluster taskId under which this task will be created
- optimized -- true if this task need not be performed
A kind represents a collection of tasks that share common characteristics.
For example, all build jobs. Each instance of a kind is intialized with a
path from which it draws its task configuration. The instance is free to
store as much local state as it needs.
"""
__metaclass__ = abc.ABCMeta
def __init__(self, kind, label, attributes, task,
- index_paths=None, dependencies=None):
+ optimizations=None, dependencies=None):
self.kind = kind
self.label = label
self.attributes = attributes
self.task = task
self.task_id = None
self.optimized = False
self.attributes['kind'] = kind
- self.index_paths = index_paths or ()
+ self.optimizations = optimizations or []
self.dependencies = dependencies or {}
def __eq__(self, other):
return self.kind == other.kind and \
self.label == other.label and \
self.attributes == other.attributes and \
self.task == other.task and \
self.task_id == other.task_id and \
- self.index_paths == other.index_paths and \
+ self.optimizations == other.optimizations and \
self.dependencies == other.dependencies
- def optimize(self, params):
- """
- Determine whether this task can be optimized, and if it can, what taskId
- it should be replaced with.
-
- The return value is a tuple `(optimized, taskId)`. If `optimized` is
- true, then the task will be optimized (in other words, not included in
- the task graph). If the second argument is a taskid, then any
- dependencies on this task will isntead depend on that taskId. It is an
- error to return no taskId for a task on which other tasks depend.
-
- The default optimizes when a taskId can be found for one of the index
- paths attached to the task.
- """
- for index_path in self.index_paths:
- try:
- task_id = find_task_id(
- index_path,
- use_proxy=bool(os.environ.get('TASK_ID')))
-
- return True, task_id
- except requests.exceptions.HTTPError:
- pass
-
- return False, None
-
@classmethod
def from_json(cls, task_dict):
"""
Given a data structure as produced by taskgraph.to_json, re-construct
the original Task object. This is used to "resume" the task-graph
generation process, for example in Action tasks.
"""
return cls(
--- a/taskcluster/taskgraph/task/docker_image.py
+++ b/taskcluster/taskgraph/task/docker_image.py
@@ -1,22 +1,19 @@
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
from __future__ import absolute_import, print_function, unicode_literals
import logging
-import os
-import urllib2
from . import transform
from taskgraph.util.docker import INDEX_PREFIX
from taskgraph.transforms.base import TransformSequence, TransformConfig
-from taskgraph.util.taskcluster import get_artifact_url
from taskgraph.util.python_path import find_object
logger = logging.getLogger(__name__)
def transform_inputs(inputs, kind, path, config, params, loaded_tasks):
"""
Transform a sequence of inputs according to the transform configuration.
@@ -36,35 +33,16 @@ def transform_inputs(inputs, kind, path,
def load_tasks(kind, path, config, params, loaded_tasks):
return transform_inputs(
transform.get_inputs(kind, path, config, params, loaded_tasks),
kind, path, config, params, loaded_tasks)
class DockerImageTask(transform.TransformTask):
- def optimize(self, params):
- optimized, taskId = super(DockerImageTask, self).optimize(params)
- if optimized and taskId:
- try:
- # Only return the task ID if the artifact exists for the indexed
- # task.
- request = urllib2.Request(get_artifact_url(
- taskId, 'public/image.tar.zst',
- use_proxy=bool(os.environ.get('TASK_ID'))))
- request.get_method = lambda: 'HEAD'
- urllib2.urlopen(request)
-
- # HEAD success on the artifact is enough
- return True, taskId
- except urllib2.HTTPError:
- pass
-
- return False, None
-
@classmethod
def from_json(cls, task_dict):
# Generating index_paths for optimization
imgMeta = task_dict['task']['extra']['imageMeta']
image_name = imgMeta['imageName']
context_hash = imgMeta['contextHash']
index_paths = ['{}.level-{}.{}.hash.{}'.format(
INDEX_PREFIX, level, image_name, context_hash)
--- a/taskcluster/taskgraph/task/transform.py
+++ b/taskcluster/taskgraph/task/transform.py
@@ -3,21 +3,19 @@
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
from __future__ import absolute_import, print_function, unicode_literals
import logging
import itertools
from . import base
-from .. import files_changed
from ..util.python_path import find_object
from ..util.templates import merge
from ..util.yaml import load_yaml
-from ..util.seta import is_low_value_task
from ..transforms.base import TransformSequence, TransformConfig
logger = logging.getLogger(__name__)
def get_inputs(kind, path, config, params, loaded_tasks):
"""
@@ -79,56 +77,16 @@ class TransformTask(base.Task):
"""
Tasks of this class are generated by applying transformations to a sequence
of input entities. By default, it gets those inputs from YAML data in the
kind directory, but subclasses may override `get_inputs` to produce them in
some other way.
"""
def __init__(self, kind, task):
- self.when = task.get('when', {})
super(TransformTask, self).__init__(kind, task['label'],
task['attributes'], task['task'],
- index_paths=task.get('index-paths'),
+ optimizations=task.get('optimizations'),
dependencies=task.get('dependencies'))
- def optimize(self, params):
- bbb_task = False
-
- if self.index_paths:
- optimized, taskId = super(TransformTask, self).optimize(params)
- if optimized:
- return optimized, taskId
-
- elif 'files-changed' in self.when:
- changed = files_changed.check(
- params, self.when['files-changed'])
- if not changed:
- logger.debug('no files found matching a pattern in `when.files-changed` for ' +
- self.label)
- return True, None
-
- # no need to call SETA for build jobs
- if self.task.get('extra', {}).get('treeherder', {}).get('jobKind', '') == 'build':
- return False, None
-
- # for bbb tasks we need to send in the buildbot buildername
- if self.task.get('provisionerId', '') == 'buildbot-bridge':
- self.label = self.task.get('payload').get('buildername')
- bbb_task = True
-
- # we would like to return 'False, None' while it's high_value_task
- # and we wouldn't optimize it. Otherwise, it will return 'True, None'
- if is_low_value_task(self.label,
- params.get('project'),
- params.get('pushlog_id'),
- params.get('pushdate'),
- bbb_task):
- # Always optimize away low-value tasks
- return True, None
- else:
- return False, None
-
@classmethod
def from_json(cls, task_dict):
- # when reading back from JSON, we lose the "when" information
- task_dict['when'] = {}
return cls(task_dict['attributes']['kind'], task_dict)
--- a/taskcluster/taskgraph/test/test_optimize.py
+++ b/taskcluster/taskgraph/test/test_optimize.py
@@ -1,17 +1,17 @@
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
from __future__ import absolute_import, print_function, unicode_literals
import unittest
-from ..optimize import optimize_task_graph, resolve_task_references
+from ..optimize import optimize_task_graph, resolve_task_references, optimization
from ..optimize import annotate_task_graph, get_subgraph
from ..taskgraph import TaskGraph
from .. import graph
from .util import TestTask
class TestResolveTaskReferences(unittest.TestCase):
@@ -48,113 +48,112 @@ class TestResolveTaskReferences(unittest
"resolve_task_references raises a KeyError on reference to an invalid task"
self.assertRaisesRegexp(
KeyError,
"task 'subject' has no dependency named 'no-such'",
lambda: resolve_task_references('subject', {'task-reference': '<no-such>'}, {})
)
-class OptimizingTask(TestTask):
- # the `optimize` method on this class is overridden direclty in the tests
- # below.
- pass
-
-
class TestOptimize(unittest.TestCase):
kind = None
- def make_task(self, label, task_def=None, optimized=None, task_id=None):
+ @classmethod
+ def setUpClass(cls):
+ # set up some simple optimization functions
+ optimization('no-optimize')(lambda self, params: (False, None))
+ optimization('optimize-away')(lambda self, params: (True, None))
+ optimization('optimize-to-task')(lambda self, params, task: (True, task))
+ optimization('false-with-taskid')(lambda self, params: (False, 'some-taskid'))
+
+ def make_task(self, label, optimization=None, task_def=None, optimized=None, task_id=None):
task_def = task_def or {'sample': 'task-def'}
- task = OptimizingTask(label=label, task=task_def)
+ task = TestTask(label=label, task=task_def)
task.optimized = optimized
+ if optimization:
+ task.optimizations = [optimization]
+ else:
+ task.optimizations = []
task.task_id = task_id
return task
def make_graph(self, *tasks_and_edges):
- tasks = {t.label: t for t in tasks_and_edges if isinstance(t, OptimizingTask)}
- edges = {e for e in tasks_and_edges if not isinstance(e, OptimizingTask)}
+ tasks = {t.label: t for t in tasks_and_edges if isinstance(t, TestTask)}
+ edges = {e for e in tasks_and_edges if not isinstance(e, TestTask)}
return TaskGraph(tasks, graph.Graph(set(tasks), edges))
def assert_annotations(self, graph, **annotations):
def repl(task_id):
return 'SLUGID' if task_id and len(task_id) == 22 else task_id
got_annotations = {
t.label: (t.optimized, repl(t.task_id)) for t in graph.tasks.itervalues()
}
self.assertEqual(got_annotations, annotations)
def test_annotate_task_graph_no_optimize(self):
"annotating marks everything as un-optimized if the kind returns that"
- OptimizingTask.optimize = lambda self, params: (False, None)
graph = self.make_graph(
- self.make_task('task1'),
- self.make_task('task2'),
- self.make_task('task3'),
+ self.make_task('task1', ['no-optimize']),
+ self.make_task('task2', ['no-optimize']),
+ self.make_task('task3', ['no-optimize']),
('task2', 'task1', 'build'),
('task2', 'task3', 'image'),
)
annotate_task_graph(graph, {}, set(), graph.graph.named_links_dict(), {}, None)
self.assert_annotations(
graph,
task1=(False, None),
task2=(False, None),
task3=(False, None)
)
def test_annotate_task_graph_taskid_without_optimize(self):
"raises exception if kind returns a taskid without optimizing"
- OptimizingTask.optimize = lambda self, params: (False, 'some-taskid')
- graph = self.make_graph(self.make_task('task1'))
+ graph = self.make_graph(self.make_task('task1', ['false-with-taskid']))
self.assertRaises(
Exception,
lambda: annotate_task_graph(graph, {}, set(), graph.graph.named_links_dict(), {}, None)
)
def test_annotate_task_graph_optimize_away_dependency(self):
"raises exception if kind optimizes away a task on which another depends"
- OptimizingTask.optimize = \
- lambda self, params: (True, None) if self.label == 'task1' else (False, None)
graph = self.make_graph(
- self.make_task('task1'),
- self.make_task('task2'),
+ self.make_task('task1', ['optimize-away']),
+ self.make_task('task2', ['no-optimize']),
('task2', 'task1', 'build'),
)
self.assertRaises(
Exception,
lambda: annotate_task_graph(graph, {}, set(), graph.graph.named_links_dict(), {}, None)
)
def test_annotate_task_graph_do_not_optimize(self):
"annotating marks everything as un-optimized if in do_not_optimize"
- OptimizingTask.optimize = lambda self, params: (True, 'taskid')
graph = self.make_graph(
- self.make_task('task1'),
- self.make_task('task2'),
+ self.make_task('task1', ['optimize-away']),
+ self.make_task('task2', ['optimize-away']),
('task2', 'task1', 'build'),
)
label_to_taskid = {}
annotate_task_graph(graph, {}, {'task1', 'task2'},
graph.graph.named_links_dict(), label_to_taskid, None)
self.assert_annotations(
graph,
task1=(False, None),
task2=(False, None)
)
self.assertEqual
def test_annotate_task_graph_nos_do_not_propagate(self):
"a task with a non-optimized dependency can be optimized"
- OptimizingTask.optimize = \
- lambda self, params: (False, None) if self.label == 'task1' else (True, 'taskid')
graph = self.make_graph(
- self.make_task('task1'),
- self.make_task('task2'),
- self.make_task('task3'),
+ self.make_task('task1', ['no-optimize']),
+ self.make_task('task2', ['optimize-to-task', 'taskid']),
+ self.make_task('task3', ['optimize-to-task', 'taskid']),
('task2', 'task1', 'build'),
('task2', 'task3', 'image'),
)
annotate_task_graph(graph, {}, set(),
graph.graph.named_links_dict(), {}, None)
self.assert_annotations(
graph,
task1=(False, None),
@@ -236,21 +235,19 @@ class TestOptimize(unittest.TestCase):
self.assertEqual(sub.graph.edges, {(task2, task3, 'test')})
self.assertEqual(sub.tasks[task2].task_id, task2)
self.assertEqual(sorted(sub.tasks[task2].task['dependencies']), sorted([task3, 'dep1']))
self.assertEqual(sub.tasks[task2].task['payload'], 'http://dep1/' + task3)
self.assertEqual(sub.tasks[task3].task_id, task3)
def test_optimize(self):
"optimize_task_graph annotates and extracts the subgraph from a simple graph"
- OptimizingTask.optimize = \
- lambda self, params: (True, 'dep1') if self.label == 'task1' else (False, None)
input = self.make_graph(
- self.make_task('task1'),
- self.make_task('task2'),
- self.make_task('task3'),
+ self.make_task('task1', ['optimize-to-task', 'dep1']),
+ self.make_task('task2', ['no-optimize']),
+ self.make_task('task3', ['no-optimize']),
('task2', 'task1', 'build'),
('task2', 'task3', 'image'),
)
opt, label_to_taskid = optimize_task_graph(input, {}, set())
self.assertEqual(opt.graph, graph.Graph(
{label_to_taskid['task2'], label_to_taskid['task3']},
{(label_to_taskid['task2'], label_to_taskid['task3'], 'image')}))
--- a/taskcluster/taskgraph/transforms/job/__init__.py
+++ b/taskcluster/taskgraph/transforms/job/__init__.py
@@ -50,16 +50,17 @@ job_description_schema = Schema({
Optional('routes'): task_description_schema['routes'],
Optional('scopes'): task_description_schema['scopes'],
Optional('tags'): task_description_schema['tags'],
Optional('extra'): task_description_schema['extra'],
Optional('treeherder'): task_description_schema['treeherder'],
Optional('index'): task_description_schema['index'],
Optional('run-on-projects'): task_description_schema['run-on-projects'],
Optional('coalesce-name'): task_description_schema['coalesce-name'],
+ Optional('optimizations'): task_description_schema['optimizations'],
Optional('needs-sccache'): task_description_schema['needs-sccache'],
Optional('when'): task_description_schema['when'],
# A description of how to run this job.
'run': {
# The key to a job implementation in a peer module to this one
'using': basestring,
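
Reviewer sketch, not part of the patch: how the legacy `index-paths` and `when.files-changed` fields fold into the new `optimizations` list, following the `setup_optimizations` transform this patch adds to transforms/task.py. The sketch drops the `config` argument for brevity, and the task description below (label, index path, file pattern) is made up for illustration.

```python
def setup_optimizations(tasks):
    """Yield each task description with legacy fields folded into `optimizations`."""
    for task in tasks:
        # Explicitly listed optimizations are kept and stay first in the list.
        optimizations = task.setdefault('optimizations', [])
        optimizations.extend(
            ['index-search', idx] for idx in task.get('index-paths', []))
        optimizations.append(['seta'])
        if 'files-changed' in task.get('when', {}):
            optimizations.append(['files-changed', task['when']['files-changed']])
        yield task


# Hypothetical task description; none of these values come from the tree.
description = {
    'label': 'example-task',
    'index-paths': ['example.index.path'],
    'when': {'files-changed': ['docs/**']},
}
converted = next(setup_optimizations([description]))
```

Since optimizations are tried in list order, the resulting order (index search, then SETA, then files-changed) determines which method gets the first chance to optimize the task.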
--- a/taskcluster/taskgraph/transforms/task.py
+++ b/taskcluster/taskgraph/transforms/task.py
@@ -127,16 +127,28 @@ task_description_schema = Schema({
# See the attributes documentation for details.
Optional('run-on-projects'): [basestring],
# If the task can be coalesced, this is the name used in the coalesce key
# the project, etc. will be added automatically. Note that try (level 1)
# tasks are never coalesced
Optional('coalesce-name'): basestring,
+ # Optimizations to perform on this task during the optimization phase,
+ # specified in order. These optimizations are defined in
+ # taskcluster/taskgraph/optimize.py.
+ Optional('optimizations'): [Any(
+ # search the index for the given index namespace, and replace this task if found
+ ['index-search', basestring],
+ # consult SETA and skip this task if it is low-value
+ ['seta'],
+ # skip this task if none of the given file patterns match
+ ['files-changed', [basestring]],
+ )],
+
# the provisioner-id/worker-type for the task. The following parameters will
# be substituted in this string:
# {level} -- the scm level of this push
'worker-type': basestring,
# Whether the job should use sccache compiler caching.
Required('needs-sccache', default=False): bool,
@@ -334,18 +346,18 @@ task_description_schema = Schema({
Required('taskType'): basestring,
# Paths to the artifacts to sign
Required('paths'): [basestring],
}],
}),
# The "when" section contains descriptions of the circumstances
- # under which this task can be "optimized", that is, left out of the
- # task graph because it is unnecessary.
+ # under which this task should be included in the task graph. This
+ # will be converted into an element in the `optimizations` list.
Optional('when'): Any({
# This task only needs to be run if a file matching one of the given
# patterns has changed in the push. The patterns use the mozpack
# match function (python/mozbuild/mozpack/path.py).
Optional('files-changed'): [basestring],
}),
})
@@ -779,16 +791,27 @@ def add_files_changed(config, tasks):
if 'in-tree' in task['worker'].get('docker-image', {}):
task['when']['files-changed'].append('taskcluster/docker/{}/**'.format(
task['worker']['docker-image']['in-tree']))
yield task
@transforms.add
+def setup_optimizations(config, tasks):
+    for task in tasks:
+        optimizations = task.setdefault('optimizations', [])
+        optimizations.extend([['index-search', idx] for idx in task.get('index-paths', [])])
+        optimizations.append(['seta'])
+        if 'when' in task and 'files-changed' in task['when']:
+            optimizations.append(['files-changed', task['when']['files-changed']])
+        yield task
+
+
+@transforms.add
def build_task(config, tasks):
for task in tasks:
worker_type = task['worker-type'].format(level=str(config.params['level']))
provisioner_id, worker_type = worker_type.split('/', 1)
routes = task.get('routes', [])
scopes = task.get('scopes', [])
@@ -871,18 +894,17 @@ def build_task(config, tasks):
attributes = task.get('attributes', {})
attributes['run_on_projects'] = task.get('run-on-projects', ['all'])
yield {
'label': task['label'],
'task': task_def,
'dependencies': task.get('dependencies', {}),
'attributes': attributes,
- 'index-paths': task.get('index-paths'),
- 'when': task.get('when', {}),
+ 'optimizations': task['optimizations'],
}
# Check that the v2 route templates match those used by Mozharness. This can
# go away once Mozharness builds are no longer performed in Buildbot, and the
# Mozharness code referencing routes.json is deleted.
def check_v2_routes():
with open("testing/mozharness/configs/routes.json", "rb") as f: