Bug 1405408 - Pre: Handle query string and fragments in URL parameters. r=dustin
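This was simply an oversight. I ran into it using the
taskcluster-proxy /bewit interface, which returns a URL of the form
https://domain.net/short/path/to.file?bewit="several thousand
characters". Using the basename of that URL as the cache file name
leads to an IOError because the resulting path is too long. Let's
assume that such query strings and fragments are transient and drop
them from fetched URLs when deriving the file name written to disk.
The hash prefix is still computed from the full URL, so URLs that
differ only in their query string still get distinct cache names.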
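
For illustration, a minimal standalone sketch of the naming scheme
described above. It assumes Python 3 (urllib.parse instead of the
Python 2 urlparse module that the patch uses), and cache_fname is a
hypothetical helper name that does not exist in artifacts.py.

```python
import hashlib
import os.path
from urllib.parse import urlparse  # Python 2 spelling: urlparse.urlparse


def cache_fname(url):
    """Hypothetical helper: derive a HASH[:16]-basename cache file name."""
    # Hash the full URL (query string included) so that distinct URLs
    # sharing a basename still map to distinct file names.
    digest = hashlib.sha256(url.encode('utf-8')).hexdigest()[:16]
    # Keep only the path component; this drops '?query' and '#fragment'.
    basename = os.path.basename(urlparse(url).path)
    return digest + '-' + basename


# A bewit-style URL with a very long query string (made-up value).
url = 'https://domain.net/short/path/to.file?bewit=' + 'x' * 4000 + '#frag'
fname = cache_fname(url)
print(fname)
```

The printed name ends in -to.file regardless of how long the query
string is, so it stays well within typical file name length limits.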
MozReview-Commit-ID: FMJHMp7a3rA
--- a/python/mozbuild/mozbuild/artifacts.py
+++ b/python/mozbuild/mozbuild/artifacts.py
@@ -785,17 +785,19 @@ class ArtifactCache(object):
             binascii.unhexlify(fname)
         except TypeError:
             # We download to a temporary name like HASH[:16]-basename to
             # differentiate among URLs with the same basenames.  We used to then
             # extract the build ID from the downloaded artifact and use it to make a
             # human readable unique name, but extracting build IDs is time consuming
             # (especially on Mac OS X, where we must mount a large DMG file).
             hash = hashlib.sha256(url).hexdigest()[:16]
-            fname = hash + '-' + os.path.basename(url)
+            # Strip query string and fragments.
+            basename = os.path.basename(urlparse.urlparse(url).path)
+            fname = hash + '-' + basename
         path = os.path.abspath(mozpath.join(self._cache_dir, fname))
         if self._skip_cache and os.path.exists(path):
             self.log(logging.DEBUG, 'artifact',
                 {'path': path},
                 'Skipping cache: removing cached downloaded artifact {path}')
             os.remove(path)