Strategies to reduce AWS Lambda costs

Amit Singh Rathore
Nov 18, 2024

Here are the areas we should focus on to lower our AWS Lambda costs.

Understanding AWS Lambda Costs

Before we get to the optimization strategies, let us first understand Lambda’s pricing structure. AWS Lambda offers its services on a pay-per-use model, meaning we only pay for what we use. These charges fall into four broad categories:

  • Compute Charges: AWS charges us based on the memory provisioned for our function (in GB, which also determines the vCPU share it gets) and the execution time of our functions.
  • Request Charges: AWS Lambda charges for every one million requests. Thus, we must design our app to reduce unnecessary function invocations. We can do so by using an intelligent event-driven framework and efficient triggers.

The compute formula is: number of executions x average execution time (seconds) x provisioned memory (GB), billed per GB-second.

  • Data Transfer Costs: Costs are incurred for data transferred in and out of AWS Lambda. This includes data coming from external sources and other AWS regions or services. Optimizing data transfer by minimizing data size and frequency can help control costs.
  • Ephemeral Storage Cost: The price for ephemeral storage is $0.0000000309 per GB-second. For a Lambda function configured with 2 GB of ephemeral storage and running for 800,000 seconds in a month, the cost is: 2 GB - 0.5 GB (free tier) = 1.5 GB; 1.5 GB x 800,000 seconds = 1,200,000 GB-seconds; 1,200,000 x $0.0000000309 ≈ $0.0371.
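Putting the pricing formulas into numbers, a minimal sketch (the rates below are illustrative published us-east-1 prices and may change; check the current pricing page before relying on them):

```python
# Rough Lambda monthly cost estimate. Rates are illustrative us-east-1
# prices at the time of writing, not guaranteed current values.
GB_SECOND_RATE = 0.0000166667       # compute, x86_64, per GB-second
REQUEST_RATE = 0.20 / 1_000_000     # per request
STORAGE_RATE = 0.0000000309         # ephemeral storage, per GB-second

def compute_cost(invocations, avg_duration_s, memory_gb):
    # executions x avg execution time x memory = GB-seconds
    gb_seconds = invocations * avg_duration_s * memory_gb
    return gb_seconds * GB_SECOND_RATE + invocations * REQUEST_RATE

# 1M invocations, 200 ms average duration, 512 MB memory
print(round(compute_cost(1_000_000, 0.2, 0.5), 2))

# The ephemeral-storage example above: 2 GB configured, 0.5 GB free tier
billable_gb = 2 - 0.5
storage_cost = billable_gb * 800_000 * STORAGE_RATE
print(round(storage_cost, 4))  # 0.0371
```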

Inefficient function configurations, like over-provisioned memory or unnecessary invocations, can lead to ballooning costs. Even small inefficiencies compound over millions of executions, which is why tuning your Lambda setup is essential.

Cost reduction strategies

Runtime Choice

Different runtimes have different execution times for the same business logic, and hence different bills. JVM-based languages suffer from cold starts. Go and Rust have faster execution times and hence lower cost. TypeScript and Python perform decently in most cases.

For custom runtimes, we are billed for the init time as well. For the managed runtimes that Lambda supports, init time isn’t counted towards the billed duration.

Architecture Choice

With Lambda we can use AWS’s own Graviton processors, which are based on the Arm architecture. arm64 functions offer up to 34% better price-performance than the default x86_64 processors, with a per-GB-second rate about 20% lower.

Note: We may get better data parsing performance with x86_64 than with arm64, especially if our parsing library leverages specialized SIMD instructions.

Caching Lambda responses

If the function is stateless and pure, calling it with the same payload on the same endpoint will not change its output. We can enable caching to avoid such repeat calls. To enable caching for API Gateway, use the MethodSettings attribute CachingEnabled.
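As a sketch, API Gateway stage caching can be toggled with a patch operation via boto3 (the `rest_api_id` and `stage_name` values are placeholders; the wildcard path applies the setting to all methods):

```python
# Patch operations to enable response caching on an API Gateway stage.
# '/*/*/caching/enabled' targets every method; replace the wildcards with
# '/{resourcePath}/{httpMethod}/...' to target a single method.
patch_ops = [
    {"op": "replace", "path": "/*/*/caching/enabled", "value": "true"},
    {"op": "replace", "path": "/*/*/caching/ttlInSeconds", "value": "300"},
]

# In a real deployment script (rest_api_id / stage_name are placeholders):
# import boto3
# boto3.client("apigateway").update_stage(
#     restApiId=rest_api_id, stageName=stage_name, patchOperations=patch_ops
# )
print(len(patch_ops))
```

Note that stage caching also requires a cache cluster on the stage, which has its own hourly price, so it pays off only when the saved invocations outweigh that cost.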

Variable as local Cache

Lambda’s internal memory can be used as a cheap and fast caching mechanism. Anything loaded outside the handler function remains in memory for subsequent invocations of the same execution environment.

We can keep a copy of information retrieved from a database inside a global variable so that the data can be pulled from the Lambda internal memory in future requests.
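A minimal sketch of this pattern, with a module-level dictionary as the cache and a TTL chosen for illustration (`fake_db` stands in for a real database lookup):

```python
import time

# Module scope survives across warm invocations, so it doubles as a cache.
_cache = {}          # key -> (value, fetched_at)
CACHE_TTL = 300      # seconds; illustrative assumption

def get_user(user_id, fetch_fn):
    """Return a cached record, refetching only when the entry is stale."""
    entry = _cache.get(user_id)
    if entry and time.time() - entry[1] < CACHE_TTL:
        return entry[0]
    value = fetch_fn(user_id)          # e.g. a DynamoDB/RDS lookup
    _cache[user_id] = (value, time.time())
    return value

calls = []
def fake_db(user_id):
    calls.append(user_id)
    return {"id": user_id}

get_user("u1", fake_db)
get_user("u1", fake_db)   # served from memory; fake_db is not called again
print(len(calls))         # 1
```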

Using optimized packages for specific use cases

Using built-ins or code from your language’s standard library is a big asset in speeding up code.

Sometimes the standard library may not be fast enough for certain use cases; for those we should use specialized packages. For example:

  • With Go, use fastjson instead of the standard library’s encoding/json, to get 10x better performance.
  • With Rust, use simdjson instead of serde_json to get 3x better performance.

Instantiation

Variables declared outside any function are initialized once, during the function’s init phase, and reused across invocations. This reduces the startup work done on each request.
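A small sketch of the pattern: expensive setup runs at module scope, once per execution environment, while the handler stays lean (the SDK client is shown as a comment to keep the example dependency-free; `APP_CONFIG` is a hypothetical environment variable):

```python
import json
import os

# Initialized once per execution environment (the init phase), then reused
# by every warm invocation instead of being rebuilt per request.
CONFIG = json.loads(os.environ.get("APP_CONFIG", '{"retries": 3}'))

# The same pattern applies to AWS SDK clients:
# import boto3
# s3 = boto3.client("s3")   # created once, reused by every invocation

def handler(event, context):
    # No per-invocation parsing or client construction here.
    return {"retries": CONFIG["retries"]}

print(handler({}, None))
```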

Event Filtering

While creating an event source mapping for Lambda, we can put a filter on the incoming data before it reaches our handler. This reduces the number of invocations.
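A sketch of a filter criteria object for an SQS event source mapping, so the function is invoked only for records whose body carries `status == "ERROR"` (the `status` field and the ARNs in the comment are placeholders for your own payload shape and resources):

```python
import json

# Event-filtering pattern for an event source mapping: each Pattern is a
# JSON-encoded string following Lambda's event-filtering rule syntax.
filter_criteria = {
    "Filters": [
        {"Pattern": json.dumps({"body": {"status": ["ERROR"]}})}
    ]
}

# Applied when the mapping is created, e.g.:
# boto3.client("lambda").create_event_source_mapping(
#     EventSourceArn=queue_arn, FunctionName=fn_name,
#     FilterCriteria=filter_criteria,
# )
print(filter_criteria["Filters"][0]["Pattern"])
```

Records that do not match the pattern are dropped (or, for stream sources, skipped) without invoking the function, so no request or duration charge is incurred for them.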

Leverage AWS Lambda Power Tuning Tool

AWS Lambda Power Tuning is a powerful tool that helps visualize the cost-performance trade-offs by testing different memory configurations. With it, we can find the most efficient memory settings for our function without manually experimenting.

Note: the aws-lambda-powertools package below is Powertools for AWS Lambda, a separate library from the Power Tuning tool; it provides the structured logging, tracing, and metrics that make tuning decisions measurable.

pip install aws-lambda-powertools

# Structured logging, distributed tracing, metrics collection
from aws_lambda_powertools import Logger, Tracer, Metrics
# Event handler with built-in validation
from aws_lambda_powertools.event_handler import APIGatewayRestResolver
# Error handling
from aws_lambda_powertools.event_handler.exceptions import (
    BadRequestError,
    NotFoundError,
    InternalServerError,
)
from aws_lambda_powertools.event_handler.api_gateway import Response
from aws_lambda_powertools.utilities.typing import LambdaContext
from aws_lambda_powertools.metrics import MetricUnit

Steps to use AWS Lambda Power Tuning

  1. Set up the Power Tuning tool from the AWS Serverless Application Repository.
  2. Run a test across different memory configurations.
  3. Review the report to choose the optimal memory setting for your use case.

Refine & use conservative timeout settings

Every AWS Lambda function has a timeout setting that caps its running duration. While it is tempting to set it as high as possible, a conservative timeout limits how long a hung or runaway invocation can keep billing us.

We should allow only as much time as the function’s tasks need to complete. This way we cut off unnecessary runtime, improving efficiency and lowering costs.

Leverage Lambda Layers

With Lambda Layers, we can share dependencies and code across several functions. This reduces redundant deployment package size and optimizes storage, and it can also help reduce costs.

Direct integrations

If a Lambda function is not performing custom logic when it integrates with other AWS services, it may be unnecessary and could be replaced by a lower-cost direct integration.

Avoid idle wait time

Don’t add waits or event watchers in your Lambda code. Idle time costs money, as it is included in the billed duration.

Utilize Queues to Batch Lambda Invocations

  • Using batch size = 1 is approximately 200% more expensive than using batch size = 10.
  • Additionally, a low batch size can exhaust our account’s Lambda concurrency when message throughput is high.

Enable batch item failures so that only the failed records are retried instead of the whole batch, improving error handling and efficiency.
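A sketch of a batch-item-failures handler for SQS (requires the ReportBatchItemFailures setting on the event source mapping; `process` is a stand-in for your business logic):

```python
import json

def handler(event, context):
    """SQS batch handler that reports only the failed records, so a single
    bad record does not cause the whole batch to be retried."""
    failures = []
    for record in event["Records"]:
        try:
            payload = json.loads(record["body"])
            process(payload)            # your business logic
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}

def process(payload):
    # Stand-in logic: records flagged "bad" fail processing.
    if payload.get("bad"):
        raise ValueError("cannot process record")

event = {"Records": [
    {"messageId": "1", "body": '{"ok": true}'},
    {"messageId": "2", "body": '{"bad": true}'},
]}
print(handler(event, None))  # only messageId "2" is reported as failed
```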

Use multi-threading to speed up code

Lambda allocates vCPU in proportion to provisioned memory, so multi-threading pays off once the function has enough memory behind it. As discussed in AWS re:Invent 2020 Day 3: Optimizing Lambda Cost with Multi-Threading, allocating at least 640 MB of memory maximizes S3 read throughput, which is roughly 90 MB/sec.
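For I/O-bound work like S3 reads, threads overlap the wait time and shrink the billed wall-clock duration even though the total work is unchanged. A minimal sketch, with `fetch` as a stand-in for an S3 download:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(key):
    # Stand-in for an I/O-bound S3 GET; returns "bytes read".
    return len(key)

keys = [f"part-{i}" for i in range(8)]

# Threads let the downloads overlap instead of running back-to-back,
# cutting billed duration for I/O-heavy functions.
with ThreadPoolExecutor(max_workers=8) as pool:
    sizes = list(pool.map(fetch, keys))

print(sum(sizes))
```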

Consider using provisioned concurrency

Optimise function logic

  • Early exit
  • Avoid sequential loops
  • Remove unnecessary object references
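The early-exit point above can be sketched as: validate and return before any expensive work, so invalid requests are billed for milliseconds rather than the full pipeline.

```python
def handler(event, context):
    body = event.get("body")
    if not body:                       # early exit on bad input
        return {"statusCode": 400, "body": "missing body"}
    # ...expensive processing only runs for valid requests...
    return {"statusCode": 200, "body": body.upper()}

print(handler({}, None)["statusCode"])          # 400, no heavy work done
print(handler({"body": "ok"}, None)["body"])    # OK
```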

Reduce External service calls

Each call to an external service or API from within our Lambda function adds to the execution duration, and some services have their own associated charges.

Watch and reduce Errors & retries

AWS Lambda automatically retries failed asynchronous executions (twice by default), which can increase costs if failures are frequent. Monitor error rates and tune the retry settings for each function.
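A sketch of capping retries for async invocations and routing failures to a queue instead of re-running the function (the function name and queue ARN are placeholders):

```python
# Configuration for Lambda's put_function_event_invoke_config API, which
# controls async retry behavior. Names and ARNs below are placeholders.
invoke_config = {
    "FunctionName": "my-function",
    "MaximumRetryAttempts": 0,          # default is 2 for async invokes
    "MaximumEventAgeInSeconds": 300,    # drop events older than 5 minutes
    "DestinationConfig": {
        "OnFailure": {
            "Destination": "arn:aws:sqs:us-east-1:123456789012:my-dlq"
        }
    },
}

# Applied with:
# boto3.client("lambda").put_function_event_invoke_config(**invoke_config)
print(invoke_config["MaximumRetryAttempts"])
```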

Written by Amit Singh Rathore
Staff Data Engineer @ Visa — Writes about Cloud | Big Data | ML