Troubleshooting latency issues in Amazon DynamoDB

If your workload appears to experience high latency, analyze the CloudWatch SuccessfulRequestLatency metric, checking both the Average statistic and the median (the p50 percentile), to see whether the latency is related to DynamoDB. Some variability in the reported SuccessfulRequestLatency is normal, and occasional spikes (particularly in the Maximum statistic and high percentiles) should not be cause for concern. However, if the Average statistic or p50 (median) shows a sharp increase that persists, check the AWS Service Health Dashboard and your Personal Health Dashboard for more information. Latency also depends on the request itself: possible causes include the size of the item in your table (a 1 KB item and a 400 KB item will vary in latency) or the size of the query (10 items versus 100 items).
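As an illustration, the following Python sketch shows one way to retrieve these percentiles with the CloudWatch GetMetricStatistics API through boto3. The function names, the default three-hour window, and the 60-second period are assumptions made for this example, not recommended values. The call itself requires boto3 and AWS credentials.

```python
from datetime import datetime, timedelta, timezone


def latency_query_params(table_name, operation="GetItem", hours=3, period_seconds=60):
    """Build GetMetricStatistics parameters for SuccessfulRequestLatency percentiles."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/DynamoDB",
        "MetricName": "SuccessfulRequestLatency",
        "Dimensions": [
            {"Name": "TableName", "Value": table_name},
            {"Name": "Operation", "Value": operation},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        # Percentiles are requested through ExtendedStatistics. A single call
        # accepts either Statistics or ExtendedStatistics, not both.
        "ExtendedStatistics": ["p50", "p90", "p99"],
    }


def fetch_latency_percentiles(table_name, **kwargs):
    """Return datapoints sorted by timestamp. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the parameter builder works without it

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **latency_query_params(table_name, **kwargs)
    )
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

Each returned datapoint carries an ExtendedStatistics map, so point["ExtendedStatistics"]["p50"] gives the median latency in milliseconds for that period. To retrieve the Average statistic, make a second call that passes Statistics=["Average"] in place of ExtendedStatistics.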

The percentile metrics (p99, p90, etc.) can help you better understand your latency distribution. For example:

p50 (median) shows the typical latency for your workload.
p90 is the latency within which 90 percent of requests complete.
p99 helps identify the worst-case latency affecting 1 percent of requests.

High p99 values with normal p50 values might indicate sporadic issues affecting a small portion of requests, while consistently elevated p50 values might suggest some performance degradation.
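The reasoning above can be sketched in Python. The example computes nearest-rank percentiles from a list of latency samples and compares them with baseline values. The classify function, its labels, and the factor of 2 are hypothetical choices for illustration only. DynamoDB does not define such thresholds, so choose values that fit your own workload.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def classify(samples_ms, baseline_p50_ms, baseline_p99_ms, factor=2.0):
    """Label a window of latency samples (hypothetical thresholds)."""
    p50 = percentile(samples_ms, 50)
    p99 = percentile(samples_ms, 99)
    if p50 > factor * baseline_p50_ms:
        # The typical request is slower, not just the tail.
        return "sustained degradation"
    if p99 > factor * baseline_p99_ms:
        # The median is normal but a small share of requests is slow.
        return "sporadic tail latency"
    return "normal"
```

For example, a window in which 98 requests take 3 ms and two take 40 ms and 50 ms has a p50 of 3 ms and a p99 of 40 ms. Against baselines of 3 ms and 10 ms, it is labeled "sporadic tail latency".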

Some variation in latency metrics, particularly in higher percentiles, is expected. It can result from transient infrastructure issues or from DynamoDB-driven background operations that help maintain high availability and durability for the data stored in your DynamoDB tables.

If necessary, consider opening a support case with AWS Support, and continue to assess any available fall-back options for your application (such as evacuation of a Region if you have a multi-Region architecture) according to your runbooks. Log the request IDs of slow requests so that you can provide them to AWS Support when you open a support case.
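One possible way to capture request IDs is sketched below in Python. It assumes a boto3-style client method whose response includes ResponseMetadata with a RequestId. The wrapper name call_and_log_if_slow, the logger name, and the 50 ms threshold are hypothetical. The measured time is client-side, so it includes network time and any SDK retries, and the logged ID belongs to the final attempt.

```python
import logging
import time

logger = logging.getLogger("dynamodb.latency")


def call_and_log_if_slow(operation, threshold_ms=50.0, **params):
    """Call a boto3-style client method and log its request ID when it is slow."""
    start = time.perf_counter()
    response = operation(**params)
    elapsed_ms = (time.perf_counter() - start) * 1000
    # boto3 responses carry the service request ID in ResponseMetadata.
    request_id = response.get("ResponseMetadata", {}).get("RequestId")
    if elapsed_ms >= threshold_ms:
        logger.warning(
            "slow DynamoDB call: %.1f ms, request_id=%s", elapsed_ms, request_id
        )
    return response, elapsed_ms, request_id
```

With a boto3 DynamoDB client, a call might look like call_and_log_if_slow(client.get_item, TableName="Orders", Key={"pk": {"S": "order#1"}}), where the table and key names are placeholders.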

The SuccessfulRequestLatency metric measures only the latency that is internal to the DynamoDB service. Client-side activity and network round-trip times are not included. To learn more about overall latency for calls from your client to the DynamoDB service, you can enable latency metric logging in your AWS SDK.

Note

For most singleton operations (operations which apply to a single item by fully specifying the primary key's value), DynamoDB delivers single-digit millisecond Average SuccessfulRequestLatency. This value does not include the transport overhead for the caller code accessing the DynamoDB endpoint. For multi-item data operations, the latency will vary based on factors such as size of the result set, the complexity of the data structures returned, and any condition expressions and filter expressions applied. For repeated multi-item operations to the same data set with the same parameters, DynamoDB will provide highly consistent Average SuccessfulRequestLatency.

Consider one or more of the following strategies to reduce latency:

Adjust request timeout and retry behavior: The path from your client to DynamoDB traverses many components, each of which is designed with redundancy in mind. Think about the scope of network resiliency, TCP packet timeouts, and the distributed architecture of DynamoDB itself. The default SDK behaviors are designed to find the right balance for most applications. A request that is taking significantly longer than normal is less likely to ultimately succeed. If you fail fast and make a new request, the new request is likely to take a different path and may quickly succeed. Keep in mind that there can be downsides to being too aggressive in these settings. A helpful discussion on this topic can be found in Tuning AWS Java SDK HTTP request settings for latency-aware Amazon DynamoDB applications.
Reduce the distance between the client and DynamoDB endpoint: If you have globally dispersed users, consider using Global tables - multi-Region replication for DynamoDB. With global tables, you can specify the AWS Regions where you want the table to be available. Reading data from a local global tables replica can significantly reduce latency for your users. Also, consider using a DynamoDB gateway endpoint to keep your client traffic within your VPC.
Use caching: If your traffic is read heavy, consider using a caching service, such as In-memory acceleration with DynamoDB Accelerator (DAX). DAX is a fully managed, highly available, in-memory cache for DynamoDB that delivers up to a 10x performance improvement, from milliseconds to microseconds, even at millions of requests per second.
Reuse connections: DynamoDB requests are made via an authenticated session which defaults to HTTPS. Initiating the connection takes time, so the latency of the first request is higher than typical. Requests over an already initialized connection deliver DynamoDB's consistent low latency. For this reason, you may wish to make a "keep-alive" GetItem request every 30 seconds if no other requests are made, to avoid the latency of establishing a new connection.
Use eventually consistent reads: If your application doesn't require strongly consistent reads, consider using the default eventually consistent reads. Eventually consistent reads are lower cost and are also less likely to experience transient increases in latency. For more information, see DynamoDB read consistency.
Implement request hedging: For very low p99 latency requirements, consider implementing request hedging. With request hedging, if the initial request doesn't receive a response quickly enough, send a second equivalent request and let them race. For writes, use timestamp-based ordering to ensure hedged requests are treated as occurring at the time of the first attempt, preventing out-of-order updates. This improves tail latency at the cost of some extra requests. This approach has been discussed in Timestamp writes for write hedging in Amazon DynamoDB.
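The request hedging strategy can be sketched in Python for idempotent reads. The sketch uses only the standard library. The hedged_call name and the 10 ms hedge delay are assumptions for illustration, and the executor needs at least two worker threads. It is not suitable for writes as written, because it does not implement the timestamp-based ordering described above.

```python
import concurrent.futures
import time


def hedged_call(executor, request, hedge_after_s=0.01):
    """Run request(); if it has not finished within hedge_after_s, race a second attempt."""
    first = executor.submit(request)
    done, _ = concurrent.futures.wait([first], timeout=hedge_after_s)
    attempts = [first]
    if not done:
        # The first attempt is slow, so send an equivalent request and let them race.
        attempts.append(executor.submit(request))
    last_error = None
    for future in concurrent.futures.as_completed(attempts):
        try:
            return future.result()  # first attempt to succeed wins
        except Exception as error:
            last_error = error
    raise last_error  # every attempt failed
```

A read might be hedged with hedged_call(pool, lambda: client.get_item(TableName="Orders", Key=key)), where pool is a concurrent.futures.ThreadPoolExecutor and the client, table name, and key are placeholders. The losing attempt keeps running in the executor and its result is discarded, which is the extra request cost mentioned above.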
