Demystifying Database Connection Pools in AWS Lambda Functions

It’s common practice in traditional applications to maintain a pool of ready-to-use database connections instead of creating one from scratch every time a piece of code needs to talk to the database. But what happens if you do the same in Lambda functions? Each function instance is isolated from the others and cannot share a pool with them. How do connection pools behave then? Are they even required? How should you size them? This article explores these questions and explains in detail how database connection pools behave in Lambda functions.

Say you have a Node.js Lambda function talking to a MySQL database. Here’s how you would create a DB connection pool in this case:

const mysql = require('mysql');

// Create a pool of reusable MySQL connections
const pool = mysql.createPool({
    host     : 'host',
    database : 'database',
    user     : 'user',
    password : 'password'
});

First of all, you should always put this pool creation code outside of the function’s handler method. The handler is called for every invocation of your function, and you don’t want the pool to be recreated each time. Code outside the handler runs only once, when the Lambda service provisions a container to run your function in response to an incoming request. So creating the pool once at initialization and sharing it among all invocations of the function that reach this particular container/instance makes perfect sense!
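To make the split between initialization code and handler code concrete, here is a minimal sketch. It assumes a stand-in createPool function and a poolsCreated counter (both hypothetical, used only so the example runs without a database); in a real function you would call mysql.createPool at that point instead.

```javascript
// Hypothetical stand-in for mysql.createPool, so the sketch runs without a DB.
// It counts how many times a pool is created.
let poolsCreated = 0;
function createPool(config) {
  poolsCreated += 1;
  return {
    config,
    // Mimics a promise-based query call and returns a canned row.
    query: async (sql) => [{ sql }],
  };
}

// Initialization code: runs once, when the container/instance is set up.
const pool = createPool({
  host: 'host',
  database: 'database',
  user: 'user',
  password: 'password',
});

// Handler: runs on every invocation and reuses the pool created above.
async function handler(event) {
  const rows = await pool.query('SELECT 1');
  return { statusCode: 200, body: JSON.stringify(rows) };
}
```

However many times handler is invoked on this instance, poolsCreated stays at 1. In an actual Lambda deployment you would also export the handler (for example via exports.handler) so the runtime can find it.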

When the first request comes in for a function, a new instance of that function’s container is created and initialized, and its handler is invoked. If another request comes in for the same function while this instance is busy serving the first request, another instance is created, initialized and invoked in the same way. If, however, an instance has finished serving a request and is idle when another request comes in, no new instance is created; the existing idle instance’s handler is invoked with the new request.
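The reuse rule described above can be sketched as a small simulation. This is a simplified model, not how the Lambda service is implemented: the countInstances function and the start/duration request fields are made up for illustration, and the model ignores details such as idle instances eventually being reclaimed.

```javascript
// Simplified model: each request is { start, duration } in arbitrary time units.
// Returns how many instances had to be created to serve all the requests.
function countInstances(requests) {
  const busyUntil = []; // busyUntil[i] = time at which instance i becomes idle
  const ordered = [...requests].sort((a, b) => a.start - b.start);
  for (const { start, duration } of ordered) {
    // Look for an instance that is already idle when the request arrives.
    const idle = busyUntil.findIndex((t) => t <= start);
    if (idle === -1) {
      busyUntil.push(start + duration); // none idle: create a new instance
    } else {
      busyUntil[idle] = start + duration; // reuse the idle instance
    }
  }
  return busyUntil.length;
}

// Two overlapping requests need two instances.
const overlapping = countInstances([
  { start: 0, duration: 100 },
  { start: 50, duration: 100 },
]);

// Two back-to-back requests are served by one instance.
const sequential = countInstances([
  { start: 0, duration: 100 },
  { start: 200, duration: 100 },
]);
```

Each instance counted here would run its initialization code once, so the number of instances is also the number of connection pools that get created.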

So if you create a connection pool at the initialization stage of an instance of a function, it’s shared by all requests handled by that instance.

So a single pool can end up serving thousands of requests, one at a time. Set the pool’s maximum connection limit based on the DB’s capacity to handle simultaneous connections and the expected traffic to your Lambda functions. A simplified calculation of this number goes something like this: if your database can handle 100 connections at a time and you expect enough incoming traffic to keep 10 instances of a Lambda function running concurrently (and that is the only Lambda function in your application), then the pool size should be 100/10 = 10.
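The simplified calculation above can be written as a small helper. The function name poolSizePerInstance and its parameters are hypothetical; the result is the kind of value you would pass as the connectionLimit option of mysql.createPool.

```javascript
// Divides the DB's connection capacity evenly across the expected number of
// concurrently running instances of a single Lambda function.
function poolSizePerInstance(dbMaxConnections, expectedConcurrentInstances) {
  if (dbMaxConnections < 1 || expectedConcurrentInstances < 1) {
    throw new RangeError('both arguments must be at least 1');
  }
  // Round down so the pools together never exceed the DB's capacity.
  // A pool needs at least 1 connection; if there are more instances than
  // DB connections, the total can then exceed what the DB can handle.
  return Math.max(1, Math.floor(dbMaxConnections / expectedConcurrentInstances));
}

// The example from the text: 100 DB connections, 10 concurrent instances.
const poolSize = poolSizePerInstance(100, 10); // 10
```

If several Lambda functions share the same database, the same idea applies with the total number of concurrent instances across all of them as the divisor.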