NashTech Blog

Amazon EBS Explained: Your Key to Scalable Cloud Storage (Part 2)


Part 1, Amazon EBS Explained: Your Key to Scalable Cloud Storage, established the core concept, features, and advantages of block storage. This post delves deeper into Amazon EBS and explores its use cases and performance characteristics.

Amazon EBS Use Cases

The functionality and available performance options make Amazon EBS a good storage option for many workloads and use cases. This section discusses the most common use cases for Amazon EBS storage, including enterprise applications, relational databases, non-relational (NoSQL) databases, big data analytics, and file systems and media workflows.

Enterprise applications

Amazon EBS provides high availability and high durability block storage to reliably run mission-critical applications such as Oracle, SAP, Microsoft Exchange, Microsoft SharePoint, and VMware applications on VMware Cloud on AWS.

The design of Amazon EBS provides versatile storage services with:

  • Low latency and consistently high IOPS and throughput performance
  • Capacity and performance scalability without workload disruption
  • 99.999 percent availability
  • A volume annual failure rate (AFR) of 0.1 to 0.2 percent, which is low compared to on-premises systems
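
To put these figures in perspective, the following minimal sketch converts an availability percentage into the downtime it allows per year, and an AFR into an expected number of volume failures. The function names and the fleet size of 1,000 volumes are illustrative assumptions, not values from AWS.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime implied by an availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def expected_failures_per_year(volume_count: int, afr_percent: float) -> float:
    """Return the expected number of volume failures per year for a fleet."""
    return volume_count * afr_percent / 100


if __name__ == "__main__":
    # 99.999 percent availability, as listed above
    print(f"{downtime_minutes_per_year(99.999):.2f} minutes of downtime per year")
    # AFR range listed above, applied to a hypothetical fleet of 1,000 volumes
    for afr in (0.1, 0.2):
        failures = expected_failures_per_year(1000, afr)
        print(f"AFR {afr}%: about {failures:.0f} failures per 1,000 volumes")
```

Under these assumptions, 99.999 percent availability corresponds to a little over five minutes of downtime per year, and an AFR of 0.1 to 0.2 percent corresponds to roughly one to two failed volumes per 1,000 volumes per year.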

Lift and shift application workloads

Most organizations have on-premises applications burdened with high capital expenses, complex management, scalability challenges, and hardware that needs to be replaced every 3 to 5 years.

Maintaining existing on-premises infrastructure results in increased operational burden and drains IT budgets for organizations that are already budget and resource strapped. With these on-premises challenges, IT organizations want to move to the cloud and away from the traditional, costly lifecycle of buying, managing, and replacing on-premises hardware, software, services, and networking.

Most cloud migrations happen in phases to minimize risk and shorten time to production. The most common approach is to lift and shift an application and its data with as few changes as possible to similar services running in the cloud. This provides the fastest time to production. Once the application is on AWS, it is easier to modernize and rearchitect application elements, to take advantage of the cloud services and optimizations that provide the most significant benefits.

Relational databases

Amazon EBS scales with your performance needs, whether you are supporting millions of gaming customers or billions of e-commerce transactions. Databases such as SAP HANA, Oracle, Microsoft SQL Server, MySQL, and PostgreSQL are widely deployed on Amazon EBS.

For running relational databases on AWS, you have the following options:

  • You can refactor or rearchitect your database to Aurora for maximum cost savings. Aurora is a MySQL- and PostgreSQL-compatible relational database built for the cloud with the performance and availability of commercial-grade databases at one-tenth the cost.
  • You can re-platform your workload with a few optimizations to run on Amazon RDS, which lets you reduce the time you spend managing database instances. Amazon RDS is an AWS-managed service for running a fully featured relational database while offloading database administration tasks such as hardware provisioning, database setup, patching, and backups.
  • You can lift and shift, or rehost, your database using EC2 instances and EBS volumes. Lift and shift lets you migrate and implement legacy databases quickly in the AWS Cloud. With a lift-and-shift migration, you maintain full control over your database. Lift-and-shift migrations are useful if you have requirements that cannot be met by AWS native or managed services.
    • With lift and shift, you bring your database and host it on AWS with full security access privileges, choice of four different EBS volume types for capacity and performance needs, no license restrictions, and greater feature support.

NoSQL databases

EBS volumes provide consistent and low-latency performance for running NoSQL databases such as Cassandra, MongoDB, and CouchDB.

For running NoSQL database applications on AWS, you can select Amazon DynamoDB or use Amazon EC2 and Amazon EBS to host your Cassandra, MongoDB, CouchDB, or other NoSQL databases.

  • DynamoDB is a fully managed, multi-Region, multi-master database with built-in security, backup and restore, and in-memory caching for internet-scale applications.
  • Using Amazon EC2 and Amazon EBS, you can bring your database and host it on AWS with full security access privileges, choice of four different EBS volume types for capacity and performance needs, no license restrictions, and full feature support.

Big data analytics engines

Amazon EBS offers data persistence, dynamic performance adjustments, and the ability to detach and reattach volumes, allowing you to resize clusters for big data analytics engines such as Hadoop and Apache Spark.

Amazon EBS provides persistent, fast storage to address key components of your solution, including data warehouses, search and indexing, NoSQL databases, and streaming data.

For running big data analytics applications on AWS, you can use Amazon EMR or Amazon Managed Streaming for Apache Kafka (Amazon MSK), or you can use Amazon EC2 and Amazon EBS to host your Hadoop, Spark, or other big data analytics solutions.

  • Amazon EMR is an AWS-managed Hadoop framework to process vast amounts of data across dynamically scalable EC2 instances.
  • Amazon MSK is a fully managed, highly available, and secure Apache Kafka service.
  • Using Amazon EC2 and Amazon EBS, you can choose your big data framework and use flexible instance selection and persistent storage for large, throughput-oriented workloads.

Amazon EBS Performance

Several factors, including I/O characteristics and the configuration of your EC2 instances and volumes, can affect the performance of EBS volumes. If you follow the guidance on the Amazon EBS and Amazon EC2 product detail pages, typically you can achieve good performance with your initial configuration. However, in some cases, you may need to do some tuning to achieve peak performance on the platform.

On a given volume configuration, certain I/O characteristics drive the performance behavior for your EBS volumes.

  • SSD-backed volumes—General Purpose SSD (gp2 and gp3) and Provisioned IOPS SSD (io1 and io2)—deliver consistent performance for random or sequential I/O operations.
  • HDD-backed volumes—Throughput Optimized HDD (st1) and Cold HDD (sc1)—deliver optimal performance only when I/O operations are large and sequential.

IOPS

IOPS are a unit of measure representing I/O operations per second. The operations are measured in KiB, and the underlying drive technology determines the maximum amount of data that a volume type counts as a single I/O.

  • I/O size is capped at 256 kibibytes (KiB) for SSD volumes.
  • I/O size is capped at 1,024 KiB for HDD volumes.

SSD volumes handle small or random I/O much more efficiently than HDD volumes. When small I/O operations are physically contiguous, Amazon EBS attempts to merge them into a single I/O operation up to the maximum size.

  • Large sequential I/O operations are divided into separate I/O operations up to the maximum I/O size. A single 1,024 KiB operation would count as four operations on SSDs and one operation on HDDs.
  • Noncontiguous I/O operations are not merged and are handled as separate I/O operations.
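
The counting rules above can be sketched in a few lines. This is a simplified model based only on the size caps stated in this section; the function names are hypothetical, and the merge helper shows the best case because Amazon EBS only attempts to merge contiguous operations.

```python
import math

# Maximum size that counts as a single I/O, per the caps listed above.
MAX_IO_KIB = {"ssd": 256, "hdd": 1024}


def counted_io_operations(request_kib: int, media: str) -> int:
    """Return how many I/O operations one contiguous request counts as."""
    if request_kib <= 0:
        raise ValueError("request_kib must be positive")
    # A large request is split into chunks of at most the maximum I/O size.
    return math.ceil(request_kib / MAX_IO_KIB[media])


def merged_io_operations(small_io_kib: int, count: int, media: str) -> int:
    """Best case: `count` contiguous small I/Os are fully merged before counting."""
    return counted_io_operations(small_io_kib * count, media)


if __name__ == "__main__":
    print(counted_io_operations(1024, "ssd"))  # one 1,024 KiB request on SSD
    print(counted_io_operations(1024, "hdd"))  # the same request on HDD
    print(merged_io_operations(32, 8, "ssd"))  # eight contiguous 32 KiB I/Os on SSD
```

With these caps, a single 1,024 KiB request counts as four operations on SSD and one on HDD, matching the example above, while eight contiguous 32 KiB operations can merge into a single 256 KiB operation on SSD.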

Throughput

Throughput is a measurement of the volume of data transferred per unit of time. It is generally used to measure transfer performance for large sequential files. Each EBS volume type has both IOPS and throughput limits.

  • SSD-backed volumes with large I/O sizes may experience a smaller number of IOPS than you provisioned because you are hitting the throughput limit for the volume.
  • HDD-backed volumes with sequential I/O workloads may experience a higher than expected number of IOPS as measured from inside your EC2 instance. This happens when the instance operating system merges sequential I/Os and counts them in 1,024 KiB-sized units.
  • HDD-backed volumes used for small or random I/O workloads can experience lower throughput than expected. This is because Amazon EBS counts each random, nonsequential I/O toward the total IOPS count, which can cause you to hit the volume’s IOPS limit sooner than expected.
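
The interaction between the two limits follows from the relationship throughput = IOPS × I/O size. The sketch below is a simplified model of that arithmetic; the function names and the example limits (3,000 IOPS and 125 MiB/s) are assumptions chosen for illustration, not a statement about any particular volume type.

```python
def throughput_mib_s(iops: float, io_size_kib: float) -> float:
    """Return the throughput in MiB/s produced by a given IOPS and I/O size."""
    return iops * io_size_kib / 1024


def achievable_iops(provisioned_iops: float,
                    throughput_limit_mib_s: float,
                    io_size_kib: float) -> float:
    """Return the IOPS a volume can sustain once its throughput limit is applied."""
    # IOPS at which the throughput limit is reached for this I/O size
    throughput_bound_iops = throughput_limit_mib_s * 1024 / io_size_kib
    return min(provisioned_iops, throughput_bound_iops)


if __name__ == "__main__":
    # Hypothetical volume limits, for illustration only
    for io_size in (16, 256):
        iops = achievable_iops(3000, 125, io_size)
        print(f"{io_size} KiB I/O: {iops:.0f} IOPS, "
              f"{throughput_mib_s(iops, io_size):.1f} MiB/s")
```

In this example, 16 KiB operations reach the full 3,000 IOPS while using well under the throughput limit, whereas 256 KiB operations are held to 500 IOPS because they reach 125 MiB/s first.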

Burst Balance

For some SSD-backed and HDD-backed EBS volume types, you are able to burst your performance above your provisioned baseline limits.

  • When you operate within your normal baseline range, you accumulate burst credits.
  • When your workload uses IOPS or throughput above your baseline range, you use your accumulated burst credits.
  • If your burst balance is depleted, you are unable to burst, and operations are limited to your provisioned baseline limits.
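
The credit behavior described above resembles a token bucket. The following sketch simulates that idea under simplifying assumptions: one credit pays for one I/O, credits refill at the baseline rate, and all numbers are illustrative. The class and its fields are hypothetical and do not reproduce the exact accounting Amazon EBS uses.

```python
from dataclasses import dataclass


@dataclass
class BurstBucket:
    baseline_iops: float  # credits earned per second
    burst_iops: float     # maximum IOPS while credits remain
    capacity: float       # maximum credits the bucket can hold
    balance: float        # credits currently available

    def step(self, demanded_iops: float, seconds: float = 1.0) -> float:
        """Advance the simulation by `seconds` and return the IOPS delivered."""
        # Credits refill at the baseline rate, up to the bucket capacity.
        available = min(self.capacity, self.balance + self.baseline_iops * seconds)
        # Delivery is limited by the burst ceiling and by the credits on hand.
        max_ops = min(self.burst_iops * seconds, available)
        delivered_ops = min(demanded_iops * seconds, max_ops)
        self.balance = available - delivered_ops
        return delivered_ops / seconds


if __name__ == "__main__":
    # Illustrative values only
    bucket = BurstBucket(baseline_iops=100, burst_iops=1000,
                         capacity=5000, balance=5000)
    for second in range(1, 9):
        delivered = bucket.step(demanded_iops=1000)
        print(f"t={second}s delivered={delivered:.0f} balance={bucket.balance:.0f}")
```

In this model, a workload that stays below the baseline lets the balance grow, a workload above the baseline drains it, and once the balance reaches zero the delivered IOPS fall back to the baseline rate.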

Latency

Latency is the true round-trip time of an I/O operation: the elapsed time between sending an I/O to Amazon EBS and receiving an acknowledgement from Amazon EBS that the read or write operation is complete.

  • The expected average latency for SSD-backed volumes ranges from sub-1 millisecond to single-digit millisecond performance depending on the SSD volume type.
  • The expected average latency for HDD-backed volumes is two-digit millisecond performance. Latency for HDD volumes is highly dependent on the EBS volume type and the workload.

Volume queue length can affect latency. The volume queue length is the number of pending I/O requests for a device. Queue length must be correctly calibrated with I/O size and latency to avoid creating bottlenecks, either on the guest operating system or on the network link to Amazon EBS.

  • Optimal queue length varies for each workload depending on your particular application’s sensitivity to IOPS and latency. If your workload is not delivering enough I/O requests to fully use the performance available to your EBS volume, your volume might not deliver the IOPS or throughput that you have provisioned.
  • Transaction-intensive applications are sensitive to increased I/O latency and are well suited for SSD-backed volumes. You can maintain high IOPS while keeping latency down by maintaining a low queue length and a high number of IOPS available to the volume.
  • Throughput-intensive applications are less sensitive to increased I/O latency and are well suited for HDD-backed volumes. You can maintain high throughput to HDD-backed volumes by maintaining a high queue length when performing large, sequential I/O.
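
One common way to reason about the relationship between queue length, IOPS, and latency is Little's Law, which in steady state gives queue length ≈ IOPS × latency. The sketch below applies that approximation; it is an assumption brought in for illustration rather than a formula stated in this post, and the function names are hypothetical.

```python
def required_queue_length(target_iops: float, latency_ms: float) -> float:
    """Estimate the average queue length needed to sustain a target IOPS."""
    return target_iops * latency_ms / 1000


def iops_at_queue_length(queue_length: float, latency_ms: float) -> float:
    """Estimate the IOPS a workload can drive at a given average queue length."""
    return queue_length * 1000 / latency_ms


if __name__ == "__main__":
    # Illustrative numbers: 3,000 IOPS at 1 ms average latency
    print(required_queue_length(3000, 1))
    # A single outstanding I/O at 1 ms average latency
    print(iops_at_queue_length(1, 1))
```

Under this approximation, a workload that keeps only one I/O outstanding at 1 ms latency cannot exceed about 1,000 IOPS regardless of what the volume is provisioned for, which is why an under-filled queue can leave provisioned performance unused.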

Conclusion

Amazon EBS offers robust and scalable block storage solutions for diverse workloads across various industries. From mission-critical enterprise applications to big data analytics, EBS delivers the performance, flexibility, and security needed to thrive in the cloud. Explore the use cases highlighted in this post to discover how EBS can optimize your cloud storage strategy.

Hao Nguyen Tan
