Handling heavy loads with AWS Lambda
TL;DR: We send a lot of messages here at CityGro and have had to overcome some obstacles as we’ve grown. This article offers a behind-the-scenes look at one way we’ve leveraged technology to allow us to continue offering a great quality of service while handling heavy server loads.
Full Version: We at CityGro send a lot of messages on behalf of our customers — I suppose that’s to be expected from an Automated Marketing company. We love doing it and are always happy to handle more. Regardless, we have run into some challenges and growing pains in the past that we’ve learned from. In this article, I will outline one of the challenges we’ve faced recently and how we overcame it.
Sending transactional messages takes effort, and it takes even more effort to do quickly. Each message sent requires an HTTP request to a remote server, and these requests sometimes take a few seconds to complete, so running many requests concurrently is essential to getting more done in less time.
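To illustrate, here's a minimal sketch of that fan-out pattern in Python. The `send_message` function is a made-up stand-in for the real HTTP call to a message provider; the point is just how a thread pool lets slow requests overlap instead of running back to back.

```python
from concurrent.futures import ThreadPoolExecutor

def send_message(message):
    # Stand-in for the real HTTP POST to a message provider,
    # which can take a few seconds to come back.
    return {"id": message["id"], "status": "accepted"}

def send_batch(messages, max_workers=50):
    # Fan the requests out across a thread pool so the wait
    # times overlap rather than add up sequentially.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send_message, messages))
```

With real network I/O, a batch of N slow requests finishes in roughly N / max_workers round-trip times instead of N.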
Sending messages is only half of the battle. It’s also important to know if the message you sent actually got to its destination. For this reason, message providers “kindly” respond to every message we send (in their own time) with a status for that message. We call these “status receipts.” Typically we get one to three status receipts for every message, indicating acceptance and delivery statuses, which usually (but not always) arrive within a couple of seconds of sending the message.
Doing some math on this: if we were sending 1,000 messages per second, at up to three receipts per message that’s around 3,000 status receipts per second in steady state — and since receipts don’t arrive on a fixed schedule, we need to be able to handle bursts of up to 5,000 per second in order to not get bogged down. Processing a status receipt consists not only of marking a message as Delivered or Bounced, but also of aggregating data on the overall group of messages that message was part of, so you can answer questions like “How many messages in my campaign were delivered?” or “What is my bounce rate?”
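As a rough sketch of that two-level update (the field names and data shapes here are invented for illustration, not our actual schema), processing a receipt touches both the individual message record and the campaign-level counters:

```python
from collections import Counter, defaultdict

def process_receipt(receipt, message_store, campaign_stats):
    # Mark the individual message with its latest status...
    message = message_store[receipt["message_id"]]
    message["status"] = receipt["status"]
    # ...then roll that status up into the campaign-level counters
    # that answer "how many delivered?" / "what's my bounce rate?"
    campaign_stats[message["campaign_id"]][receipt["status"]] += 1

def bounce_rate(campaign_stats, campaign_id):
    stats = campaign_stats[campaign_id]
    total = stats["delivered"] + stats["bounced"]
    return stats["bounced"] / total if total else 0.0
```

In practice both stores would be database tables rather than in-memory dicts, which is part of why a burst of receipts is real work and not just counter increments.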
Rule of thumb: make sure you have enough processing power. We’ve done a decent job of following this rule by scaling up our status receipt processing power as we scale up our message sending power. For the most part this works OK, but it’s not ideal. Servers take time to warm up, it costs money to start them, and it costs even more money to keep them running for long periods of time. And what if our message provider is busy and holds back a huge load of status receipts until after our servers have already scaled back down? We either need to scale them back up or the system will act like it’s under a DDoS attack. This also costs money, as Amazon charges a minimum of one hour every time a server is even started. An even bigger issue is routing tasks to these servers and maximizing server usage: doing things this way really only utilizes 10 percent (or less) of a server’s potential, mainly due to architecture challenges. There’s got to be a better way…
Solution: AWS Lambda
AWS Lambda is a relatively new product offered by Amazon Web Services (AWS) that lets you write code for a task and run it as needed. This is actually pretty cool because AWS handles running the servers that the tasks run on, and they are damn good at it! This makes it very easy for us to just focus on writing the code for the task. Please don’t misread me on this… I think DevOps is a fascinating field, but the Dev can be so much cheaper (especially for proving a product) when one doesn’t have to focus as much on the Ops. In the long run, both need to be considered for better optimization, but we can get a lot done and save a lot of money by outsourcing our “Ops” to a specialist like Amazon for the time being.
Another reason AWS Lambda is so cool: we aren’t running our own dedicated server instances, so we don’t need to pay for them (or keep them running all the time). AWS Lambda runs as needed and easily scales up to 10,000 requests per second. Best of all, we only pay for what we use (per request, among other small fees).
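For a back-of-the-envelope feel for the per-request model, here's a sketch using an assumed, illustrative rate (not AWS's actual price — check their pricing page) and ignoring the separate compute-time charge:

```python
def lambda_request_cost(requests, price_per_million=0.20):
    # price_per_million is an assumed illustrative rate in dollars;
    # real Lambda pricing also bills for compute time per invocation.
    return requests / 1_000_000 * price_per_million

# A 10-minute burst at 5,000 receipts/second:
burst_requests = 5_000 * 60 * 10  # 3,000,000 requests
```

The point isn't the exact dollar figure — it's that a burst costs cents when it happens, and nothing when it doesn't, versus paying hourly for idle servers sized for the worst case.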
We utilize Lambda as an endpoint for our message providers to send us status receipts. It typically responds to our providers within 200 milliseconds and then queues the status receipt in our system to be processed when more resources are available. This is a win-win: it keeps our worker servers more fully utilized, and we no longer need to provision for unreasonably large, unpredictable bursts of traffic.
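Here's a minimal sketch of what that kind of handler can look like, assuming an API-Gateway-style event with a JSON body. The `enqueue` callable is a stand-in for whatever pushes onto the real work queue (SQS in a typical AWS setup) — injected here so the shape of the handler is clear without any AWS wiring:

```python
import json

def make_handler(enqueue):
    # enqueue is whatever pushes the raw receipt onto the work queue
    # (in production, something like an SQS send_message call).
    def handler(event, context=None):
        # Acknowledge fast: parse and queue, but do no processing here.
        receipt = json.loads(event["body"])
        enqueue(json.dumps(receipt))
        return {"statusCode": 200, "body": json.dumps({"queued": True})}
    return handler
```

Because the handler does nothing but validate and enqueue, it stays well inside a 200-millisecond response budget, and the heavy aggregation work drains from the queue at whatever rate the workers can sustain.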
Here’s an example of what I mean:
In the chart above, we’re having a good day when all of a sudden the server gets hit with over 20,000 requests. When this traffic was hitting our app servers, it would slow everything down for everyone. Now we don’t even notice these traffic spikes.
Overall, I’ve been very impressed with AWS Lambda as we’ve implemented it for this purpose. There are some quirks, which I’ll talk about (along with how to overcome them) in future articles, but for many use cases it’s a great and inexpensive fit.