Bug 1272782 - Wait longer and stop after "done" message; r?ahal draft
author Gregory Szorc <gps@mozilla.com>
Fri, 13 May 2016 14:05:17 -0700
changeset 366988 80987e39d92cf780d30902c7cbaf54827a7408d7
parent 366987 fcf58eaa378128613a5a671ad921cc5748fc7b90
child 366989 8f680eab4f03f16eda188fd130e21fb85a2d016c
push id 18105
push user bmo:gps@mozilla.com
push date Fri, 13 May 2016 21:11:59 +0000
reviewers ahal
bugs 1272782, 1239939
milestone 49.0a1
Before, we kept waiting for data in the pipe after receiving the "done" message. This didn't make much sense because the "done" message should be the final thing sent over the pipe!

e9113fd6cdb8 (bug 1239939) recently dropped the poll interval of the pipe from 1.0s to 0.1s. This appears to have introduced an intermittent failure in a test. The race condition was between the child process sending data and the parent process timing out (after only 0.1s) waiting for that data.

Increasing the timeout makes the failure reproduce less often, although technically the race condition is still present. I'm not inclined to fix it at this time, however.

The rationale for dropping the pipe timeout was that it was causing lag when terminating short-lived processes. Now that we abort the pipe reading/polling loop as soon as the "done" message is received, we no longer poll the pipe after receiving "done" and no longer have to worry about its timeout impacting shutdown time.

MozReview-Commit-ID: EeENQ95RAs1
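The polling pattern described above can be sketched in isolation. This is a minimal illustration, not the mozsystemmonitor code: the names send_samples and drain are hypothetical, the message shapes are made up, and a thread stands in for the child process so the example stays self-contained.

```python
import threading
from multiprocessing import Pipe


def send_samples(conn, samples):
    """Hypothetical sender: one message per sample, then a 'done' sentinel."""
    for sample in samples:
        conn.send(sample)
    conn.send(('done',))


def drain(conn, timeout=1.0):
    """Read messages until 'done' arrives or a poll times out."""
    received = []
    done = False
    # poll() returns False if nothing arrives within `timeout` seconds,
    # so a slow sender can still end the loop early. That is the race
    # the commit message mentions; a longer timeout only makes it rarer.
    while conn.poll(timeout):
        message = conn.recv()
        if message[0] == 'done':
            done = True
            # Nothing is expected after the sentinel, so stop here
            # instead of paying for one more poll timeout.
            break
        received.append(message)
    return received, done


parent_conn, child_conn = Pipe()
sender = threading.Thread(target=send_samples,
                          args=(child_conn, [(0, 1), (1, 2)]))
sender.start()
samples, done = drain(parent_conn)
sender.join()
```

Because the loop breaks on the sentinel, the timeout only matters when the sender is slow or never finishes; in the normal case shutdown takes no longer than the time needed to receive the last message.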
testing/mozbase/mozsystemmonitor/mozsystemmonitor/resourcemonitor.py
--- a/testing/mozbase/mozsystemmonitor/mozsystemmonitor/resourcemonitor.py
+++ b/testing/mozbase/mozsystemmonitor/mozsystemmonitor/resourcemonitor.py
@@ -285,23 +285,29 @@ class SystemResourceMonitor(object):
         self._pipe.send(('terminate',))
         self._running = False
         self._stopped = True
 
         self.measurements = []
 
         done = False
 
-        while self._pipe.poll(0.1):
+        # The child process will send each data sample over the pipe
+        # as a separate data structure. When it has finished sending
+        # samples, it sends a special "done" message to indicate it
+        # is finished.
+        while self._pipe.poll(1.0):
             start_time, end_time, io_diff, cpu_diff, cpu_percent, virt_mem, \
                 swap_mem = self._pipe.recv()
 
+            # There should be nothing after the "done" message so
+            # terminate.
             if start_time == 'done':
                 done = True
-                continue
+                break
 
             io = self._io_type(*io_diff)
             virt = self._virt_type(*virt_mem)
             swap = self._swap_type(*swap_mem)
             cpu_times = [self._cpu_times_type(*v) for v in cpu_diff]
 
             self.measurements.append(SystemResourceUsage(start_time, end_time,
                 cpu_times, cpu_percent, io, virt, swap))