Bug 1396022: combine docker docs, talk about hashes; r?garndt draft
author Dustin J. Mitchell <dustin@mozilla.com>
Fri, 01 Sep 2017 17:47:46 +0000
changeset 659075 853de5a91fa4bc9defa4e8f832b5970342eee140
parent 659065 3ecda4678c49ca255c38b1697142b9118cdd27e7
child 729869 90994ff9e92423d69eefb1bd88f9d19ebc5a6496
push id 77999
push user dmitchell@mozilla.com
push date Tue, 05 Sep 2017 13:07:27 +0000
Bug 1396022: combine docker docs, talk about hashes; r?garndt
MozReview-Commit-ID: A27Qoemw2T3
deleted file mode 100644
--- a/taskcluster/docker/README.md
+++ /dev/null
@@ -1,161 +0,0 @@
-# Docker Images for use in TaskCluster
-This folder contains various docker images used in [taskcluster](http://docs.taskcluster.net/) as well as other misc docker images which may be useful for
-hacking on gecko.
-## Organization
-Each folder describes a single docker image.  We have two types of images that can be defined:
-1. [Task Images (build-on-push)](#task-images-build-on-push)
-2. [Docker Images (prebuilt)](#docker-registry-images-prebuilt)
-These images depend on one another, as described in the [`FROM`](https://docs.docker.com/v1.8/reference/builder/#from)
-line at the top of the Dockerfile in each folder.
-Images could either be an image intended for pushing to a docker registry, or one that is meant either
-for local testing or being built as an artifact when pushed to vcs.
-### Task Images (build-on-push)
-Images can be uploaded as a task artifact, [indexed](#task-image-index-namespace) under
-a given namespace, and used in other tasks by referencing the task ID.
-Important to note, these images do not require building and pushing to a docker registry, and are
-build per push (if necessary) and uploaded as task artifacts.
-The decision task that is run per push will [determine](#context-directory-hashing)
-if the image needs to be built based on the hash of the context directory and if the image
-exists under the namespace for a given branch.
-As an additional convenience, and a precaution to loading images per branch, if an image
-has been indexed with a given context hash for mozilla-central, any tasks requiring that image
-will use that indexed task.  This is to ensure there are not multiple images built/used
-that were built from the same context. In summary, if the image has been built for mozilla-central,
-pushes to any branch will use that already built image.
-To use within an in-tree task definition, the format is:
-  type: 'task-image'
-  path: 'public/image.tar.zst'
-  taskId: '{{#task_id_for_image}}builder{{/task_id_for_image}}'
-##### Context Directory Hashing
-Decision tasks will calculate the sha256 hash of the contents of the image
-directory and will determine if the image already exists for a given branch and hash
-or if a new image must be built and indexed.
-Note: this is the contents of *only* the context directory, not the
-image contents.
-The decision task will:
-1. Recursively collect the paths of all files within the context directory
-2. Sort the filenames alphabetically to ensure the hash is consistently calculated
-3. Generate a sha256 hash of the contents of each file.
-4. All file hashes will then be combined with their path and used to update the hash
-of the context directory.
-This ensures that the hash is consistently calculated and path changes will result
-in different hashes being generated.
-##### Task Image Index Namespace
-Images that are built on push and uploaded as an artifact of a task will be indexed under the
-following namespaces.
-* docker.images.v2.level-{level}.{image_name}.latest
-* docker.images.v2.level-{level}.{image_name}.pushdate.{year}.{month}-{day}-{pushtime}
-* docker.images.v2.level-{level}.{image_name}.hash.{context_hash}
-Not only can images be browsed by the pushdate and context hash, but the 'latest' namespace
-is meant to view the latest built image.  This functions similarly to the 'latest' tag
-for docker images that are pushed to a registry.
-### Docker Registry Images (prebuilt)
-***Deprecation Warning: Use of prebuilt images should only be used for base images (those that other images
-will inherit from), or private images that must be stored in a private docker registry account.  Existing
-public images will be converted to images that are built on push and any newly added image should
-follow this pattern.***
-These are images that are intended to be pushed to a docker registry and used by specifying the
-folder name in task definitions.  This information is automatically populated by using the 'docker_image'
-convenience method in task definitions.
-  image: {#docker_image}builder{/docker_image}
-Each image has a hash and a version, given by its `HASH` and `VERSION` files.
-When rebuilding a prebuilt image the `VERSION` should be bumped. Once a new
-version of the image has been built the `HASH` file should be updated with the
-hash of the image.
-The `HASH` file is the image hash as computed by docker, this is always on the
-format `sha256:<digest>`. Note that Docker produces a numbre of hashes in this
-format; the hash used in this context is the one returned from `docker push`.
-In production images will be referenced by image hash.  This mitigates attacks
-against the registry as well as simplifying validate of correctness. The
-`VERSION` file only serves to provide convenient names, such that old versions
-are easy to discover in the registry (and ensuring old versions aren't deleted
-by garbage-collection).
-This way, older tasks which were designed to run on an older version of the image
-can still be executed in taskcluster, while new tasks can use the new version.
-Further more, this mitigates attacks against the registry as docker will verify
-the image hash when loading the image.
-Each image also has a `REGISTRY`, defaulting to the `REGISTRY` in this directory,
-and specifying the image registry to which the completed image should be uploaded.
-## Building images
-Generally, images can be pulled from the [registry](./REGISTRY) rather than
-built locally, however, for developing new images it's often helpful to hack on
-them locally.
-To build an image, invoke `mach taskcluster-build-image` with the name of the
-folder (without a trailing slash):
-./mach taskcluster-build-image <image-name>
-This is a tiny wrapper around `docker build -t $REGISTRY/$FOLDER:$VERSION`.
-Once a new version image has been built and pushed to the remote registry using
-`docker push $REGISTRY/$FOLDER:$VERSION` the `HASH` file must be updated for the
-change to effect in production.
-Note: If no "VERSION" file present in the image directory, the tag 'latest' will be used and no
-registry will be defined. The image is only meant to run locally and will overwrite
-any existing image with the same name and tag.
-## Adding a new image
-The docker image primitives are very basic building block for
-constructing an "image" but generally don't help much with tagging it
-for deployment so we have a wrapper (./build.sh) which adds some sugar
-to help with tagging/versioning... Each folder should look something
-like this:
-  - your_amazing_image/
-    - your_amazing_image/Dockerfile: Standard docker file syntax
-    - your_amazing_image/VERSION: The version of the docker file
-      (required* used during tagging)
-    - your_amazing_image/REGISTRY: Override default registry
-      (useful for secret registries)
-## Conventions
-In some image folders you will see `.env` files these can be used in
-conjunction with the `--env-file` flag in docker to provide a
-environment with the given environment variables. These are primarily
-for convenience when manually hacking on the images.
-You will also see a `system-setup.sh` script used to build the image.
-Do not replicate this technique - prefer to include the commands and options directly in the Dockerfile.
--- a/taskcluster/docs/docker-images.rst
+++ b/taskcluster/docs/docker-images.rst
@@ -3,20 +3,183 @@
 Docker Images
 TaskCluster Docker images are defined in the source directory under
 ``taskcluster/docker``. Each directory therein contains the name of an
 image used as part of the task graph.
-More information is available in the ``README.md`` file in that directory.
+Each folder describes a single docker image.  We have two types of images that can be defined:
+1. Task Images (build-on-push)
+2. Docker Images (prebuilt)
+These images depend on one another, as described in the `FROM
+<https://docs.docker.com/v1.8/reference/builder/#from>`_ line at the top of the
+Dockerfile in each folder.
+An image is either intended for pushing to a docker registry, or meant for
+local testing or for being built as an artifact when changes are pushed to
+vcs.
+Task Images (build-on-push)
+Images can be uploaded as a task artifact, indexed under a given namespace
+(see "Task Image Index Namespace" below), and used in other tasks by
+referencing the task ID.
+Note that these images do not need to be built and pushed to a docker
+registry; they are built per push (if necessary) and uploaded as task artifacts.
+The decision task that runs per push determines (see "Context Directory
+Hashing" below) whether the image needs to be built, based on the hash of the
+context directory and on whether the image already exists under the namespace
+for a given branch.
+As an additional convenience, and a precaution to loading images per branch, if an image
+has been indexed with a given context hash for mozilla-central, any tasks requiring that image
+will use that indexed task.  This ensures that multiple images are not built
+from the same context. In summary, if the image has been built for mozilla-central,
+pushes to any branch will use that already-built image.
+To use within an in-tree task definition, the format is:
+  type: 'task-image'
+  path: 'public/image.tar.zst'
+  taskId: '{{#task_id_for_image}}builder{{/task_id_for_image}}'
+Context Directory Hashing
+Decision tasks will calculate the sha256 hash of the contents of the image
+directory and will determine if the image already exists for a given branch and hash
+or if a new image must be built and indexed.
+Note: this is the contents of *only* the context directory, not the
+image contents.
+The decision task will:
+1. Recursively collect the paths of all files within the context directory.
+2. Sort the filenames alphabetically to ensure the hash is consistently calculated.
+3. Generate a sha256 hash of the contents of each file.
+4. Combine each file hash with its path and use the result to update the hash
+   of the context directory.
+This ensures that the hash is consistently calculated and path changes will result
+in different hashes being generated.
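The four steps above can be sketched in Python as follows. This is a minimal illustration, not the in-tree implementation: the function name `hash_context_directory` and the exact way each file hash is joined with its path (`"<hash> <path>\n"`) are assumptions made for the example.

```python
import hashlib
import os


def hash_context_directory(context_dir):
    """Return a sha256 hex digest covering every file path and its contents.

    Hypothetical helper: the separator used to combine hash and path is an
    assumption, chosen only to make the sketch concrete.
    """
    # Step 1: recursively collect the paths of all files, relative to the
    # context directory so the result does not depend on where it lives.
    relative_paths = []
    for root, _dirs, files in os.walk(context_dir):
        for name in files:
            full_path = os.path.join(root, name)
            rel = os.path.relpath(full_path, context_dir)
            relative_paths.append(rel.replace(os.sep, "/"))

    context_hash = hashlib.sha256()
    # Step 2: sort the paths so the order of traversal cannot change the hash.
    for rel in sorted(relative_paths):
        # Step 3: hash the contents of each file.
        with open(os.path.join(context_dir, rel), "rb") as fh:
            file_hash = hashlib.sha256(fh.read()).hexdigest()
        # Step 4: combine the file hash with its path and fold both into the
        # hash of the context directory.
        context_hash.update(f"{file_hash} {rel}\n".encode("utf-8"))
    return context_hash.hexdigest()
```

Because the path is folded in alongside each file hash, renaming a file without touching its contents still yields a different context hash.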
+Task Image Index Namespace
+Images that are built on push and uploaded as an artifact of a task will be indexed under the
+following namespaces.
+* docker.images.v2.level-{level}.{image_name}.latest
+* docker.images.v2.level-{level}.{image_name}.pushdate.{year}.{month}-{day}-{pushtime}
+* docker.images.v2.level-{level}.{image_name}.hash.{context_hash}
+Images can be browsed by pushdate and by context hash; in addition, the 'latest'
+namespace points to the most recently built image.  This functions similarly to
+the 'latest' tag for docker images that are pushed to a registry.
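As a rough illustration of how the three namespaces are assembled from their placeholders, the sketch below formats them for a single image. The function name `image_index_namespaces` and the string format of its date arguments are assumptions; the text above does not specify how `{pushtime}` is encoded.

```python
def image_index_namespaces(level, image_name, context_hash,
                           year, month, day, pushtime):
    """Return the three index namespaces for one build-on-push image.

    Hypothetical helper: the date components are passed through verbatim,
    since their exact formatting is not defined in this document.
    """
    prefix = f"docker.images.v2.level-{level}.{image_name}"
    return [
        f"{prefix}.latest",
        f"{prefix}.pushdate.{year}.{month}-{day}-{pushtime}",
        f"{prefix}.hash.{context_hash}",
    ]
```

A decision task could look up the `hash.{context_hash}` entry first and only schedule an image build when nothing is indexed there.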
+Docker Registry Images (prebuilt)
+**Warning:** Prebuilt images should only be used for base images (those that other images
+will inherit from), or private images that must be stored in a private docker registry account.
-Adding Extra Files to Images
+These are images that are intended to be pushed to a docker registry and used
+by specifying the docker image name in task definitions.  They are generally
+referred to by a ``<repo>@<repodigest>`` string:
+.. code-block:: none
+    image: taskcluster/decision:0.1.10@sha256:c5451ee6c655b3d97d4baa3b0e29a5115f23e0991d4f7f36d2a8f793076d6854
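To make the shape of the ``<repo>@<repodigest>`` string concrete, here is a small sketch that splits such a reference into its two parts. The helper name `split_image_reference` is hypothetical, and the check for a `sha256:` prefix is an assumption based on the example above.

```python
def split_image_reference(reference):
    """Split '<repo>@<repodigest>' into (repo, repodigest).

    Hypothetical helper: assumes the digest always carries a 'sha256:' prefix,
    as in the example reference in this document.
    """
    repo, separator, digest = reference.partition("@")
    if not separator or not repo or not digest.startswith("sha256:"):
        raise ValueError(f"not a <repo>@<repodigest> reference: {reference!r}")
    return repo, digest
```

Note that the repo part may still include a version tag (``taskcluster/decision:0.1.10`` in the example); the tag is a convenience name, while the digest pins the exact image.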
+Each image has a repo digest, an image hash, and a version. The repo digest is
+stored in the ``HASH`` file in the image directory and used to refer to the
+image as above.  The version is in ``VERSION``.  The image hash is used in
+chain-of-trust verification in `scriptworker
+The version file only serves to provide convenient names, such that old
+versions are easy to discover in the registry (and ensuring old versions aren't
+deleted by garbage-collection).
+Each image directory also has a ``REGISTRY``, defaulting to the ``REGISTRY`` in
+the ``taskcluster/docker`` directory, and specifying the image registry to
+which the completed image should be uploaded.
+Docker Hashes and Digests
+There are several hashes involved in this process:
+ * Image Hash -- the long version of the image ID; can be seen with
+   ``docker images --no-trunc`` or in the ``Id`` field in ``docker inspect``.
+ * Repo Digest -- hash of the image manifest; seen when running ``docker
+   push`` or ``docker pull``.
+ * Context Directory Hash -- see above (not a Docker concept at all)
+The use of hashes allows older tasks which were designed to run on an older
+version of the image to be executed in Taskcluster while new tasks use the new
+version.  Furthermore, this mitigates attacks against the registry as docker
+will verify the image hash when loading the image.
+(Re)-Building images
+Generally, images can be pulled from the Docker registry rather than built
+locally; however, for developing new images it's often helpful to hack on them
+locally.
+To build an image, invoke ``mach taskcluster-build-image`` with the name of the
+folder (without a trailing slash):
+.. code-block:: none
+    ./mach taskcluster-build-image <image-name>
+This is a wrapper around ``docker build -t $REGISTRY/$FOLDER:$VERSION``.
+It's a good idea to bump the ``VERSION`` early in this process, to avoid
+``docker push``-ing over any old tags.
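The ``$REGISTRY/$FOLDER:$VERSION`` tag can be derived from the files described earlier. The sketch below is an assumption about how that lookup might work, not the code behind ``mach taskcluster-build-image``: the helper names are hypothetical, and it assumes a ``VERSION`` file is present in the image directory.

```python
import os


def read_stripped(path):
    """Return the contents of a small text file without surrounding whitespace."""
    with open(path) as fh:
        return fh.read().strip()


def image_tag(docker_dir, folder):
    """Build '$REGISTRY/$FOLDER:$VERSION' for one image directory.

    Hypothetical helper: uses the image's own REGISTRY file when it exists and
    otherwise falls back to the REGISTRY file in the top-level docker directory.
    """
    image_dir = os.path.join(docker_dir, folder)
    version = read_stripped(os.path.join(image_dir, "VERSION"))
    registry_file = os.path.join(image_dir, "REGISTRY")
    if not os.path.exists(registry_file):
        registry_file = os.path.join(docker_dir, "REGISTRY")
    registry = read_stripped(registry_file)
    return f"{registry}/{folder}:{version}"
```

Since the tag depends directly on ``VERSION``, bumping that file before building gives the new image a fresh tag instead of reusing an old one.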
+For task images, test your image locally or push to try. This is all that is
+Docker Registry Images
+Landing docker registry images takes a little more care.
+Once a new version of the image has been built and tested locally, push it to
+the docker registry and make note of the resulting repo digest.  Put this value
+in the ``HASH`` file, and update any references to the image in the code or
+task definitions.
+The change is now safe to use in Try pushes.  However, if the image is used in
+building releases then it is *not* safe to land to an integration branch until
+the whitelists in `scriptworker
+have also been updated. These whitelists use the image hash, not the repo
+Special Dockerfile Syntax
 Dockerfile syntax has been extended to allow *any* file from the
 source checkout to be added to the image build *context*. (Traditionally
 you can only ``ADD`` files from the same directory as the Dockerfile.)
 Simply add the following syntax as a comment in a Dockerfile::
    # %include <path>
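As an illustration only, a build tool could discover these directives by scanning the Dockerfile for matching comment lines. The function name `find_includes` and the regular expression are assumptions for this sketch; the actual parser may accept a different set of spellings.

```python
import re

# Assumed form of the directive: a comment line '# %include <path>'.
INCLUDE_RE = re.compile(r"^#\s*%include\s+(\S+)\s*$")


def find_includes(dockerfile_text):
    """Return the paths named by '# %include' comments, in order of appearance."""
    paths = []
    for line in dockerfile_text.splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            paths.append(match.group(1))
    return paths
```

Each returned path would then be copied from the source checkout into the build context before ``docker build`` runs.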