AWS Kinesis Streams offers a scalable and efficient solution for processing real-time data streams. However, like any cloud service, it operates within certain constraints and limitations that users must be aware of to design robust and reliable applications. In this article, we’ll dive into the various limits of AWS Kinesis Streams, including throughput limits and data retention periods, and discuss strategies for working within these boundaries.
Throughput Limits in AWS Kinesis Streams
AWS Kinesis Streams limits throughput at two levels: each shard has a fixed maximum throughput, and the number of shards you can provision is capped. Together, these limits dictate the maximum capacity for ingesting and processing data within a given stream.
Maximum Throughput per Shard
Each shard in an AWS Kinesis Stream has a maximum ingest rate and a maximum egress rate, which determine the throughput capacity of the shard. Let’s consider an example to illustrate these limits:
- Maximum Ingest Rate: A shard accepts writes of up to 1,000 records per second (RPS) or 1 MB per second, whichever limit is reached first. If you attempt to push more than that into the shard, AWS Kinesis throttles the incoming data and rejects the excess writes with a ProvisionedThroughputExceededException.
- Maximum Egress Rate: Similarly, the same shard has a maximum egress rate of 2 MB per second, which is shared by all consumers polling the shard. If consumer applications together read data from the shard at a rate exceeding 2 MB per second, AWS Kinesis throttles the data retrieval.
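The per-shard limits above can be sketched as a simple check. This is a minimal illustration rather than anything from the AWS SDK: the constants assume the documented per-shard limits for a provisioned stream (1,000 records or 1 MB per second for writes, 2 MB per second for shared reads, with 1 MB taken as 1,048,576 bytes), and the function names `write_is_throttled` and `read_is_throttled` are hypothetical.

```python
# Assumed per-shard limits for a provisioned stream; verify them against the
# current AWS quotas page before relying on these numbers.
MAX_WRITE_RECORDS_PER_SEC = 1_000
MAX_WRITE_BYTES_PER_SEC = 1024 * 1024      # 1 MB per second
MAX_READ_BYTES_PER_SEC = 2 * 1024 * 1024   # 2 MB per second, shared by polling consumers


def write_is_throttled(records_per_sec: int, bytes_per_sec: int) -> bool:
    """Return True if a write workload exceeds either per-shard ingest limit."""
    return (
        records_per_sec > MAX_WRITE_RECORDS_PER_SEC
        or bytes_per_sec > MAX_WRITE_BYTES_PER_SEC
    )


def read_is_throttled(bytes_per_sec: int) -> bool:
    """Return True if the combined read rate exceeds the per-shard egress limit."""
    return bytes_per_sec > MAX_READ_BYTES_PER_SEC
```

Note that a workload of small records usually hits the record-count limit first, while a workload of large records hits the byte limit first, so both values need to be checked.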
Total Number of Shards per Stream
The total throughput capacity of an AWS Kinesis Stream is the sum of the throughput of all its shards. AWS caps the number of shards you can provision through a service quota, which is applied per account and Region and can be raised on request. This quota indirectly limits the overall throughput capacity. Consider the following example:
- Maximum Shards per Stream: Suppose your quota allows a stream to have at most 50 shards. If each shard has a maximum ingest rate of 1,000 RPS, the total maximum ingest rate for the stream would be 50 × 1,000 = 50,000 RPS.
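The arithmetic in this example can be sketched in both directions: the capacity of a given number of shards, and the number of shards a target workload needs. This is a rough sizing sketch under stated assumptions: the per-shard write limits of 1,000 records and 1 MB (taken as 1,048,576 bytes) per second are assumed, and the helper names `stream_write_capacity` and `shards_needed` are hypothetical.

```python
# Assumed per-shard write limits for a provisioned stream.
RECORDS_PER_SHARD = 1_000
BYTES_PER_SHARD = 1024 * 1024  # 1 MB per second


def stream_write_capacity(shard_count: int) -> tuple[int, int]:
    """Return (records per second, bytes per second) for a stream of shard_count shards."""
    return shard_count * RECORDS_PER_SHARD, shard_count * BYTES_PER_SHARD


def shards_needed(records_per_sec: int, bytes_per_sec: int) -> int:
    """Return the smallest shard count that satisfies both write limits."""
    # Integer ceiling division avoids floating-point rounding.
    by_records = -(-records_per_sec // RECORDS_PER_SHARD)
    by_bytes = -(-bytes_per_sec // BYTES_PER_SHARD)
    return max(1, by_records, by_bytes)


if __name__ == "__main__":
    print(stream_write_capacity(50))
    print(shards_needed(50_000, 10 * 1024 * 1024))
```

The sketch assumes records are spread evenly across shards. A skewed partition key can throttle a single hot shard well before the stream as a whole reaches its total capacity.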
Data Retention Periods in AWS Kinesis Streams
Another crucial aspect to consider is the data retention period of AWS Kinesis Streams, which defines how long data records persist within the stream before they expire and are automatically deleted. Two settings matter here:
- Default Retention Period: By default, AWS Kinesis Streams retains data records for 24 hours. After this period elapses, the records are automatically purged from the stream.
- Extended Retention Period: AWS offers the option to extend the data retention period up to 7 days (168 hours) for an additional fee, and beyond that up to 365 days with long-term retention, which is billed separately. This feature allows you to retain data for a longer duration, which can be beneficial for scenarios requiring prolonged data analysis or compliance purposes.
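Changing the retention period could look roughly like the sketch below. It assumes the boto3 Kinesis client and its `describe_stream_summary`, `increase_stream_retention_period`, and `decrease_stream_retention_period` operations, a 24-hour minimum, and a documented maximum of 8,760 hours (365 days). The helper names `retention_change` and `apply_retention` are hypothetical, and boto3 is imported lazily so the validation logic can be run without AWS credentials.

```python
DEFAULT_RETENTION_HOURS = 24   # default (and minimum) retention
MAX_RETENTION_HOURS = 8_760    # assumption: documented maximum of 365 days


def retention_change(current_hours: int, target_hours: int) -> str:
    """Return 'increase', 'decrease', or 'none' for the requested change."""
    for hours in (current_hours, target_hours):
        if not DEFAULT_RETENTION_HOURS <= hours <= MAX_RETENTION_HOURS:
            raise ValueError(f"retention must be 24 to 8760 hours, got {hours}")
    if target_hours > current_hours:
        return "increase"
    if target_hours < current_hours:
        return "decrease"
    return "none"


def apply_retention(stream_name: str, target_hours: int) -> str:
    """Apply the change to a real stream; requires boto3 and AWS credentials."""
    import boto3  # imported here so retention_change works without boto3

    client = boto3.client("kinesis")
    summary = client.describe_stream_summary(StreamName=stream_name)
    current = summary["StreamDescriptionSummary"]["RetentionPeriodHours"]
    action = retention_change(current, target_hours)
    if action == "increase":
        client.increase_stream_retention_period(
            StreamName=stream_name, RetentionPeriodHours=target_hours
        )
    elif action == "decrease":
        client.decrease_stream_retention_period(
            StreamName=stream_name, RetentionPeriodHours=target_hours
        )
    return action
```

For example, `apply_retention("my-stream", 168)` would extend a stream named `my-stream` (a placeholder) to 7 days. Decreasing the retention period makes records older than the new period inaccessible, so it is worth confirming that consumers have caught up first.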