ansible/hg-web: increase network timeout from 60s to 120s (bug 1291926); r?fubar

This should make many of the Mercurial client failures reported in
this bug go away. We logged ~8000 "failed to proxy response to client"
errors in March, and the rate went up last week when we converted
various server repos to generaldelta. Given that volume, we should
know relatively quickly whether this change reduces the failure rate.

Currently, the load balancer is not enforcing an idle timeout on
connections. We should consider changing that. Once we do, we can
increase Timeout to effectively infinity: as the in-line comment
explains, idle time measured on this server does not reflect what the
client connection is doing, so enforcing it here adds little value.

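As a rough sketch of that possible follow-up (not part of this patch),
the directive could eventually look like the fragment below. The value
86400 is an arbitrary stand-in for "effectively infinity", and the
fragment assumes the load balancer is already enforcing its own idle
timeout.

```apache
# Hypothetical follow-up, not applied by this commit.
# Only reasonable once the load balancer enforces an idle timeout itself.
# 86400 seconds (one day) is an assumed placeholder for "effectively infinity".
Timeout 86400
```

Until the load balancer enforces idle, the 120s value in this patch
remains the only bound on a stalled connection.
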
MozReview-Commit-ID: AmsL7EZCnN6
--- a/ansible/roles/hg-web/templates/vhost.conf.j2
+++ b/ansible/roles/hg-web/templates/vhost.conf.j2
@@ -6,16 +6,35 @@ WSGIPythonHome /var/hg/venv_hgweb
# Default is 100. Mercurial encodes some arguments in HTTP request headers.
# Repos with large numbers of heads (namely Try) run into these limits.
LimitRequestFields 1000
<VirtualHost *:80>
ServerName hg.mozilla.org
DocumentRoot /repo_local/mozilla/webroot_wsgi
+ # Clients processing e.g. bundle data may consume data much slower than the
+ # server can emit. This can result in the network being idle for >60s.
+ # Compounding this problem is that the load balancer has its own buffer.
+ # So there may be network activity between the client and the load balancer
+ # but not between the load balancer and this server. That can lead to even
+ # longer periods of network idle. We increase the timeout from its default
+ # of 60s to mitigate this problem.
+ #
+ # Since the TCP connection between the load balancer and this server doesn't
+ # totally reflect what the TCP connection between the client and load
+ # balancer is doing, measuring idle on this server is not very useful.
+ # Instead, network idle should be measured (and enforced) on the load
+ # balancer.
+ #
+ # If this value is too small, entries like this will appear in the error log:
+ #
+ # The timeout specified has expired: ... mod_wsgi ... Failed to proxy response to client
+ Timeout 120
+
RewriteEngine on
RewriteRule ^/(.*)index.cgi/?(.*) https://hg.mozilla.org/$1$2
SetEnv HGENCODING UTF-8
SetEnv LC_TYPE UTF-8
WSGIDaemonProcess hg.mozilla.org processes={{ wsgi_processes }} threads=1 maximum-requests=20 deadlock-timeout=60 inactivity-timeout=300 user=hg group=hg display-name=hg.mozilla.org
WSGIProcessGroup hg.mozilla.org