Bug 1385998 - Don't use atomic increments / decrements on stackPointer. r?froydnj
Only one thread ever modifies a PseudoStack, so we don't need to enforce
synchronization of writes from different threads. We can just read the old
value, add one to it, and then do an atomic store with the new value, because
we know that the current value of stackPointer can't have changed in the
meantime.
On its own, this patch actually seems to make things slower. But combined with
the next patch (which changes the memory ordering to ReleaseAcquire) it doesn't.
(I haven't checked whether the next patch on its own would give just as much
improvements with and without this patch.)
MozReview-Commit-ID: 3WIdyJC9kcj
--- a/js/public/ProfilingStack.h
+++ b/js/public/ProfilingStack.h
@@ -221,33 +221,51 @@ class PseudoStack
void pushCppFrame(const char* label, const char* dynamicString, void* sp, uint32_t line,
js::ProfileEntry::Kind kind, js::ProfileEntry::Category category) {
if (stackPointer < MaxEntries) {
entries[stackPointer].initCppFrame(label, dynamicString, sp, line, kind, category);
}
// This must happen at the end! The compiler will not reorder this
// update because stackPointer is Atomic.
- stackPointer++;
+ // Do the read and the write as two separate statements, in order to
+ // make it clear that we don't need an atomic increment, which would be
+ // more expensive on x86 than the separate operations done here.
+ // This thread is the only one that ever changes the value of
+ // stackPointer.
+ uint32_t oldStackPointer = stackPointer;
+ stackPointer = oldStackPointer + 1;
}
void pushJsFrame(const char* label, const char* dynamicString, JSScript* script,
jsbytecode* pc) {
if (stackPointer < MaxEntries) {
entries[stackPointer].initJsFrame(label, dynamicString, script, pc);
}
// This must happen at the end! The compiler will not reorder this
// update because stackPointer is Atomic.
- stackPointer++;
+ // Do the read and the write as two separate statements, in order to
+ // make it clear that we don't need an atomic increment, which would be
+ // more expensive on x86 than the separate operations done here.
+ // This thread is the only one that ever changes the value of
+ // stackPointer.
+ uint32_t oldStackPointer = stackPointer;
+ stackPointer = oldStackPointer + 1;
}
void pop() {
MOZ_ASSERT(stackPointer > 0);
- stackPointer--;
+ // Do the read and the write as two separate statements, in order to
+ // make it clear that we don't need an atomic decrement, which would be
+ // more expensive on x86 than the separate operations done here.
+ // This thread is the only one that ever changes the value of
+ // stackPointer.
+ uint32_t oldStackPointer = stackPointer;
+ stackPointer = oldStackPointer - 1;
}
uint32_t stackSize() const { return std::min(uint32_t(stackPointer), uint32_t(MaxEntries)); }
private:
// No copying.
PseudoStack(const PseudoStack&) = delete;
void operator=(const PseudoStack&) = delete;