Bug 1274584 - [mozprocess] Fix IO Completion Port failed to signal process shutdown, r?jgriffin draft
author Andrew Halberstadt <ahalberstadt@mozilla.com>
Mon, 30 May 2016 11:02:13 -0400
changeset 372926 825047f8e1bb30e9f0cb656ffdf9438a1de2ffa3
parent 372925 373d3a3e1b27b66faeea09747ce9404e9775ae3b
child 522285 fc50a221f3ed1acb62e30a96dc739191d15c9d53
push id 19636
push user ahalberstadt@mozilla.com
push date Mon, 30 May 2016 16:28:51 +0000
reviewers jgriffin
bugs 1274584
milestone 49.0a1
Bug 1274584 - [mozprocess] Fix IO Completion Port failed to signal process shutdown, r?jgriffin

Sometimes the IO completion port doesn't shut down child processes. When this happens, mozprocess will attempt to force kill the child processes manually. However, there is a bug here which causes an OSError to be raised. Although this fixes that bug, the original issue(s) which prevented the IOC port from signaling shutdown remain and are still undiagnosed.

MozReview-Commit-ID: L3DQPW0Is5v
testing/mozbase/mozprocess/mozprocess/processhandler.py
--- a/testing/mozbase/mozprocess/mozprocess/processhandler.py
+++ b/testing/mozbase/mozprocess/mozprocess/processhandler.py
@@ -390,22 +390,24 @@ falling back to not using job objects fo
                     if countdowntokill != 0:
                         diff = datetime.now() - countdowntokill
                         # Arbitrarily wait 3 minutes for windows to get its act together
                         # Windows sometimes takes a small nap between notifying the
                         # IO Completion port and actually killing the children, and we
                         # don't want to mistake that situation for the situation of an unexpected
                         # parent abort (which is what we're looking for here).
                         if diff.seconds > self.MAX_IOCOMPLETION_PORT_NOTIFICATION_DELAY:
+                            print >> sys.stderr, "WARNING | IO Completion Port failed to signal process shutdown"
                             print >> sys.stderr, "Parent process %s exited with children alive:" % self.pid
                             print >> sys.stderr, "PIDS: %s" %  ', '.join([str(i) for i in self._spawned_procs])
-                            print >> sys.stderr, "Attempting to kill them..."
+                            print >> sys.stderr, "Attempting to kill them, but no guarantee of success"
 
                             self.kill()
                             self._process_events.put({self.pid: 'FINISHED'})
+                            break
 
                     if not portstatus:
                         # Check to see what happened
                         errcode = winprocess.GetLastError()
                         if errcode == winprocess.ERROR_ABANDONED_WAIT_0:
                             # Then something has killed the port, break the loop
                             print >> sys.stderr, "IO Completion Port unexpectedly closed"
                             self._process_events.put({self.pid: 'FINISHED'})
@@ -467,17 +469,17 @@ falling back to not using job objects fo
                     self.returncode = winprocess.GetExitCodeProcess(self._handle)
                 else:
                     # Dude, the process is like totally dead!
                     return self.returncode
 
                 threadalive = False
                 if hasattr(self, "_procmgrthread"):
                     threadalive = self._procmgrthread.is_alive()
-                if self._job and threadalive:
+                if self._job and threadalive and threading.current_thread() != self._procmgrthread:
                     self.debug("waiting with IO completion port")
                     # Then we are managing with IO Completion Ports
                     # wait on a signal so we know when we have seen the last
                     # process come through.
                     # We use queues to synchronize between the thread and this
                     # function because events just didn't have robust enough error
                     # handling on pre-2.7 versions
                     err = None
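
The guard added in the second hunk (skip the queue wait when the caller is the manager thread itself) can be illustrated with a minimal sketch. This is not mozprocess code: `SketchHandler`, `_poll`, `manager_path` and the return values are hypothetical names invented for the illustration, and it assumes that `kill()` ends by calling `wait()`. It is written for Python 3, whereas the patched file uses Python 2 print syntax.

```python
import queue
import threading


class SketchHandler(object):
    """Hypothetical stand-in for a handler that owns a manager thread."""

    def __init__(self):
        self._events = queue.Queue()
        self.manager_path = None  # branch wait() took on the manager thread
        self._manager = threading.Thread(target=self._poll)

    def start(self):
        self._manager.start()

    def join(self):
        self._manager.join()

    def _poll(self):
        # Simulates the manager thread giving up on the completion port
        # and cleaning up on its own, then reporting that it is done.
        self.manager_path = self.kill()
        self._events.put("FINISHED")

    def kill(self):
        # Assumption: kill() finishes by calling wait(), so wait() may run
        # on the caller's thread or on the manager thread.
        return self.wait(timeout=1)

    def wait(self, timeout=5):
        on_manager = threading.current_thread() is self._manager
        if self._manager.is_alive() and not on_manager:
            # Only a different thread may block on the manager's signal.
            try:
                self._events.get(timeout=timeout)
            except queue.Empty:
                raise OSError("manager thread failed to signal shutdown")
            return "queue"
        # The manager thread cannot wait for a message that only it sends.
        return "direct"
```

In this sketch, dropping the `not on_manager` check would make the manager thread block on a queue that only it fills, so the `get()` would time out and raise `OSError`. That appears to be the failure mode the commit message describes, though the sketch does not reproduce the Windows job object or completion port machinery.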