serverlog: perform garbage collection on every request (bug 1443984); r?sheehan, glob

hgwebdir is currently leaking repository objects. This can lead to OOM
on hgweb machines in production.

The leaks are likely due to a reference cycle in repository objects.
Those leaks likely won't get fixed until Mercurial 4.6 at the earliest.

Since memory leaks are effectively a fact of life at this
juncture, let's mitigate them by forcing a garbage
collection at the end of every request.

A similar patch is proposed upstream. Worst case, both we and core
perform a collection. The second collection should be very fast.

We implement this in the serverlog extension because monkeypatching
hgweb is hard and this extension already does it.

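For illustration only, here is a minimal standalone sketch of the
mitigation, assuming a request handler that is a generator. LeakyRepo,
handle_request and collecting_wrapper are hypothetical names invented
for this sketch; they are not part of Mercurial or of this extension.

```python
import gc
import weakref


class LeakyRepo(object):
    """Hypothetical stand-in for a repo object caught in a reference cycle."""

    def __init__(self):
        # The self-reference means refcounting alone can never free this.
        self.selfref = self


def handle_request(tracker):
    """Hypothetical request handler that creates a cyclic object."""
    repo = LeakyRepo()
    tracker.append(weakref.ref(repo))
    yield b'response body'


def collecting_wrapper(gen):
    """Relay chunks from gen, then force a collection once the request ends."""
    try:
        for chunk in gen:
            yield chunk
    finally:
        # Runs whether the request completed or raised.
        gc.collect()


tracker = []
body = list(collecting_wrapper(handle_request(tracker)))
# tracker[0]() is expected to return None here, because the cyclic
# object was reclaimed by the time the wrapper finished.
```

Placing the call in a finally block means the collection also runs when
the request fails partway through, which matches where the patch puts it.
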
MozReview-Commit-ID: 3HDarYrDF3J
--- a/hgext/serverlog/__init__.py
+++ b/hgext/serverlog/__init__.py
@@ -162,16 +162,17 @@ The extension currently only uses syslog
The extension assumes only 1 thread is running per process. If multiple threads
are running, CPU time calculations will not be accurate. Other state may get
mixed up.
"""
from __future__ import absolute_import
+import gc
import inspect
import os
import resource
import syslog
import time
import uuid
from mercurial import (
@@ -272,16 +273,22 @@ class hgwebwrapped(hgweb_mod.hgweb):
             for what in super(hgwebwrapped, self)._runwsgi(req, repo):
                 sl['writecount'] += len(what)
                 yield what
                 if sl['writecount'] - lastlogamount > datasizeinterval:
                     logsyslog(sl, 'WRITE_PROGRESS', '%d' % sl['writecount'])
                     lastlogamount = sl['writecount']
         finally:
+            # It is easy to introduce cycles in localrepository instances.
+            # Versions of Mercurial up to and including 4.5 leak repo instances
+            # in hgwebdir. We force a GC on every request to help mitigate
+            # these leaks.
+            gc.collect()
+
             endtime = time.time()
             endusage = resource.getrusage(resource.RUSAGE_SELF)
             endcpu = endusage.ru_utime + endusage.ru_stime
             deltatime = endtime - starttime
             deltacpu = endcpu - startcpu
             logsyslog(sl, 'END_REQUEST', '%d' % sl['writecount'],