Bug 1414168 - Change and move the relaxed calculation rule for small size classes. r?njn
First and foremost, the code and its accompanying comment disagreed
about what is going on.
The code checks:
RUN_MAX_OVRHD * (bin->mSizeClass << 3) <= RUN_MAX_OVRHD_RELAX
which is equivalent to:
(bin->mSizeClass << 3) <= RUN_MAX_OVRHD_RELAX / RUN_MAX_OVRHD
replacing constants:
(bin->mSizeClass << 3) <= 0x1800 / 0x3d
The left-hand side is just bin->mSizeClass * 8, and the right-hand side
is about 100, so this can be roughly summarized as:
bin->mSizeClass <= 12
The comment says the overhead constraint is relaxed for runs with a
per-region overhead greater than RUN_MAX_OVRHD / (mSizeClass << (3+RUN_BFP)).
On its own, that doesn't make sense: it translates to
61 / (mSizeClass * 32768), which even for a size class of 1 would mean
less than 0.2%, and the value only gets smaller for bigger classes.
The comment would make more sense with RUN_MAX_OVRHD_RELAX, but would
still not match what the code was doing.
So we change how the relaxed rule works, as per the comment in the new
code, and make it happen after the standard run overhead constraint has
been checked.
--- a/memory/build/mozjemalloc.cpp
+++ b/memory/build/mozjemalloc.cpp
@@ -509,35 +509,16 @@ static const size_t gRecycleLimit = 128_
// The current amount of recycled bytes, updated atomically.
static Atomic<size_t, ReleaseAcquire> gRecycledSize;
// Maximum number of dirty pages per arena.
#define DIRTY_MAX_DEFAULT (1U << 8)
static size_t opt_dirty_max = DIRTY_MAX_DEFAULT;
-// RUN_MAX_OVRHD indicates maximum desired run header overhead. Runs are sized
-// as small as possible such that this setting is still honored, without
-// violating other constraints. The goal is to make runs as small as possible
-// without exceeding a per run external fragmentation threshold.
-//
-// We use binary fixed point math for overhead computations, where the binary
-// point is implicitly RUN_BFP bits to the left.
-//
-// Note that it is possible to set RUN_MAX_OVRHD low enough that it cannot be
-// honored for some/all object sizes, since there is one bit of header overhead
-// per object (plus a constant). This constraint is relaxed (ignored) for runs
-// that are so small that the per-region overhead is greater than:
-//
-// (RUN_MAX_OVRHD / (reg_size << (3+RUN_BFP))
-#define RUN_BFP 12
-// \/ Implicit binary fixed point.
-#define RUN_MAX_OVRHD 0x0000003dU
-#define RUN_MAX_OVRHD_RELAX 0x00001800U
-
// Return the smallest chunk multiple that is >= s.
#define CHUNK_CEILING(s) (((s) + kChunkSizeMask) & ~kChunkSizeMask)
// Return the smallest cacheline multiple that is >= s.
#define CACHELINE_CEILING(s) \
(((s) + (kCacheLineSize - 1)) & ~(kCacheLineSize - 1))
// Return the smallest quantum multiple that is >= a.
@@ -3014,25 +2995,33 @@ arena_bin_run_size_calc(arena_bin_t* bin
} while (sizeof(arena_run_t) + (sizeof(unsigned) * (try_mask_nelms - 1)) >
try_reg0_offset);
// Don't allow runs larger than the largest possible large size class.
if (try_run_size > gMaxLargeClass) {
break;
}
- // This doesn't match the comment above RUN_MAX_OVRHD_RELAX.
- if (RUN_MAX_OVRHD * (bin->mSizeClass << 3) <= RUN_MAX_OVRHD_RELAX) {
- break;
- }
-
// Try to keep the run overhead below kRunOverhead.
if (Fraction(try_reg0_offset, try_run_size) <= arena_bin_t::kRunOverhead) {
break;
}
+
+ // The run header includes one bit per region of the given size. For sizes
+ // small enough, the number of regions is large enough that growing the run
+ // size barely moves the needle for the overhead because of all those bits.
+ // For example, for a size of 8 bytes, adding 4KiB to the run size adds
+ // close to 512 bits to the header, which is 64 bytes.
+  // With overhead like that, there is no way to reach the target overhead
+  // above, so we give up if the space required for regs_mask more than
+  // doubles the size of the run header.
+ if (try_mask_nelms * sizeof(unsigned) >= sizeof(arena_run_t)) {
+ break;
+ }
+
}
MOZ_ASSERT(sizeof(arena_run_t) + (sizeof(unsigned) * (good_mask_nelms - 1)) <=
good_reg0_offset);
MOZ_ASSERT((good_mask_nelms << (LOG2(sizeof(int)) + 3)) >= good_nregs);
// Copy final settings.
bin->mRunSize = good_run_size;