Bug 1355630 - Inline specialized version of relpath(); r?chmanchester draft
author Gregory Szorc <gps@mozilla.com>
Tue, 11 Apr 2017 16:46:55 -0700
changeset 561606 9bc23f7645772eca410cfd48ddd8b279f2bca938
parent 561605 004466bd4b9a25151b73d9e52cfe00ec43a94c7e
child 624028 77d703049a6b3469297def15c7ca5510d29447bb
push id 53789
push user bmo:gps@mozilla.com
push date Wed, 12 Apr 2017 23:08:05 +0000
reviewers chmanchester
bugs 1355630
milestone 55.0a1
Bug 1355630 - Inline specialized version of relpath(); r?chmanchester

Profiling revealed that mozpath.relpath() accounted for a lot of CPU time when operating on an input of ~42,000 paths.

Due to the nature of the paths we're operating on, we don't need the full power of mozpath.relpath() here. Instead, we can implement a specialized version that works given already normalized paths and the knowledge that context paths must be ancestors of the current path being examined.

This change drops execution time of a mach command feeding ~42,000 paths to this function from ~90s to ~24s. On an input with 9131 paths, execution time dropped from ~8.8s to ~3.7s.

MozReview-Commit-ID: EGLiJa10Zj2
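The specialized computation described in the commit message can be sketched in isolation as follows. This is a minimal illustration, not the code from the patch: the function name fast_relpath is hypothetical, posixpath.relpath() stands in for mozpath.relpath() (which is not imported here), and the assertion is slightly stricter than the one in the patch in that it also requires a '/' after the ancestor prefix.

```python
import posixpath


def fast_relpath(path, ancestor):
    """Return path relative to ancestor.

    Assumes both arguments are already normalized, '/'-separated paths and
    that ancestor is either '' (the root) or a directory containing path.
    """
    if not ancestor:
        return path
    # Requiring the trailing separator keeps 'foobar/x' from being treated
    # as a child of 'foo'. This check is an addition for the sketch.
    assert path.startswith(ancestor + '/')
    # Skip the ancestor prefix and the '/' that follows it.
    return path[len(ancestor) + 1:]


# Compare against the general-purpose stdlib routine on a few inputs that
# satisfy the assumptions above.
for path, ancestor in [
    ('dom/base/nsDocument.cpp', 'dom'),
    ('dom/base/nsDocument.cpp', 'dom/base'),
    ('moz.build', ''),
]:
    expected = posixpath.relpath(path, ancestor or '.')
    assert fast_relpath(path, ancestor) == expected
```

The shortcut reduces the work to a prefix check and a slice, so it is only valid while the caller guarantees normalized paths and ancestor contexts; for arbitrary inputs the general relpath() is still needed.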
python/mozbuild/mozbuild/frontend/reader.py
--- a/python/mozbuild/mozbuild/frontend/reader.py
+++ b/python/mozbuild/mozbuild/frontend/reader.py
@@ -1383,23 +1383,37 @@ class BuildReader(object):
             if key not in defaults_cache:
                 defaults_cache[key] = self.test_defaults_for_path(ctxs)
 
             return defaults_cache[key]
 
         r = {}
 
         for path, ctxs in paths.items():
+            # Should be normalized by read_relevant_mozbuilds.
+            assert '\\' not in path
+
             flags = Files(Context())
 
             for ctx in ctxs:
                 if not isinstance(ctx, Files):
                     continue
 
-                relpath = mozpath.relpath(path, ctx.relsrcdir)
+                # read_relevant_mozbuilds() normalizes paths and ensures that
+                # the contexts have paths in the ancestry of the path. When
+                # iterating over tens of thousands of paths, mozpath.relpath()
+                # can be very expensive. So, given our assumptions about paths,
+                # we implement an optimized version.
+                ctx_rel_dir = ctx.relsrcdir
+                if ctx_rel_dir:
+                    assert path.startswith(ctx_rel_dir)
+                    relpath = path[len(ctx_rel_dir) + 1:]
+                else:
+                    relpath = path
+
                 pattern = ctx.pattern
 
                 # Only do wildcard matching if the '*' character is present.
                 # Otherwise, mozpath.match will match directories, which we've
                 # arbitrarily chosen to not allow.
                 if pattern == relpath or \
                         ('*' in pattern and mozpath.match(relpath, pattern)):
                     flags += ctx