Bug 1291035 - Increase HTTP connection pool capacity; r?dustin
I was looking at some decision task logs and noticed lines like:
  Connection pool is full, discarding connection: taskcluster
I also noticed lines like:
  Starting new HTTP connection (153): taskcluster
In this case, we had established 153 TCP connections to a server.
Looking at the requests source code, a requests.Session by default
creates a connection pool with a capacity of 10. There are actually 2
components to the capacity: idle connections and active connections.
What appeared to be happening was we could obtain an idle connection,
use it, and then it would be discarded when put back in the idle
pool because the idle pool was at capacity. It also appears that
requests were sitting around waiting for a TCP connection to be
established.
This commit uses a custom "adapter" with an increased pool size that
matches the concurrency level of the code issuing the HTTP requests.
This should increase the number of concurrent TCP connections / requests,
decrease the number of TCP connections being used overall, and make
decision tasks complete faster.
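To illustrate the discard behavior described above, here is a minimal
sketch of a toy pool, assuming the pool keeps idle connections in a bounded
non-blocking queue (which is how I read the source; this is not the actual
requests/urllib3 code). ModelPool and simulate are hypothetical names, and
the "connections" are just integers. Each round models every worker taking
a connection at once and then returning it.

```python
import queue


class ModelPool:
    """Toy model of a non-blocking connection pool (not the real library code)."""

    def __init__(self, maxsize):
        # Idle connections live in a bounded LIFO queue.
        self.idle = queue.LifoQueue(maxsize)
        self.created = 0
        self.discarded = 0

    def get(self):
        # Reuse an idle connection if one exists, else "open" a new one.
        try:
            return self.idle.get_nowait()
        except queue.Empty:
            self.created += 1
            return self.created  # integer stand-in for a new TCP connection

    def put(self, conn):
        # Return a connection; if the idle queue is full it is thrown away.
        try:
            self.idle.put_nowait(conn)
        except queue.Full:
            self.discarded += 1


def simulate(maxsize, workers, rounds):
    """Run `rounds` bursts in which `workers` connections are held at once."""
    pool = ModelPool(maxsize)
    for _ in range(rounds):
        in_use = [pool.get() for _ in range(workers)]
        for conn in in_use:
            pool.put(conn)
    return pool.created, pool.discarded


# (connections created, connections discarded) for each pool size
print(simulate(maxsize=10, workers=50, rounds=3))
print(simulate(maxsize=50, workers=50, rounds=3))
```

In this model, a pool of 10 shared by 50 workers keeps opening and
discarding connections on every round, while a pool sized to the
concurrency level opens 50 connections once and reuses them.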
MozReview-Commit-ID: 6NDbz78TM2y
--- a/taskcluster/taskgraph/create.py
+++ b/taskcluster/taskgraph/create.py
@@ -25,16 +25,24 @@ CONCURRENCY = 50
 def create_tasks(taskgraph, label_to_taskid):
     # TODO: use the taskGroupId of the decision task
     task_group_id = slugid()
     taskid_to_label = {t: l for l, t in label_to_taskid.iteritems()}
     session = requests.Session()
+    # Default HTTPAdapter uses 10 connections. Mount custom adapter to increase
+    # that limit. Connections are established as needed, so using a large value
+    # should not negatively impact performance.
+    http_adapter = requests.adapters.HTTPAdapter(pool_connections=CONCURRENCY,
+                                                 pool_maxsize=CONCURRENCY)
+    session.mount('https://', http_adapter)
+    session.mount('http://', http_adapter)
+
     decision_task_id = os.environ.get('TASK_ID')
     with futures.ThreadPoolExecutor(CONCURRENCY) as e:
         fs = {}
         # We can't submit a task until its dependencies have been submitted.
         # So our strategy is to walk the graph and submit tasks once all
         # their dependencies have been submitted.