Using AWS DynamoDB to build web apps

Welcome to this guide on using Amazon DynamoDB as the database for your web application. It covers hosting your web app with AWS App Runner, the fundamentals of Amazon DynamoDB, designing and building tables, and more advanced concepts such as reacting to database events.

The article is divided into five sections, each addressing different topics:

Section 1: Hosting Your Application with App Runner: In this section, you will learn about the powerful capabilities of AWS App Runner and how it can efficiently host your applications.

Section 2: Understanding DynamoDB Basics: The second section provides an in-depth understanding of the fundamental concepts of DynamoDB. It caters to both beginners and experts, delving into the inner workings of Amazon DynamoDB.

Section 3: Designing Your DynamoDB Table: This section introduces the Single Table Design method and guides you through the process of designing your DynamoDB table. You will learn effective strategies to create optimal table structures.

Section 4: Implementing DynamoDB in Your Application: In the fourth section, you will discover step-by-step instructions to implement the concepts learned in the previous sections into your own application. This practical approach enables you to apply the knowledge effectively.

Section 5: Adding Event-Driven Characteristics: The final section demonstrates how to incorporate event-driven characteristics into your DynamoDB application. You will explore mechanisms to react to database events and leverage them in your application's architecture.

By the end of this post, you will have gained a solid grasp of all the covered concepts and be well-equipped to determine if DynamoDB is the right database solution for your specific needs. Additionally, you will have the ability to design efficient access patterns and become familiar with AWS services like App Runner and DynamoDB.

This guide stands out for its level of detail, making it one of the most comprehensive resources available for mastering DynamoDB. Each section includes accompanying videos and code samples, allowing you to apply the concepts in a practical manner.

So, without further ado, let's dive into the world of Amazon DynamoDB and unlock its full potential!

Happy learning!

1. Hosting the web app using AWS App Runner

What is AWS App Runner?

AWS App Runner is a fully managed service that simplifies deploying and running web applications in the cloud. It removes the need to manage infrastructure, so you can focus on your code and ship it quickly. App Runner aims to combine the container flexibility of AWS Fargate with the hands-off operation of AWS Lambda, making it a good choice for developers who want to deploy to the cloud with minimal setup.

With App Runner, you can start with a GitHub repository or a container image in Amazon Elastic Container Registry (Amazon ECR). The service automatically builds and deploys your application, and any change to the code triggers the CI/CD pipeline, which rebuilds and redeploys the application. You can also use multiple branches for multiple environments and customize your application's settings, such as the number of vCPUs, the amount of memory, and how the application should scale.

Once your application is deployed, you will receive a public endpoint that you can use. App Runner takes care of the scaling and load balancing of the application, allowing you to focus on your code. The pay-as-you-go pricing system means that you only pay for what you need.

Creating an App Runner service comes down to choosing a source and configuring the compute settings. App Runner provisions and manages the underlying infrastructure for you.

For more information, you can visit the AWS App Runner page.

The following video goes into more detail on how App Runner works.

Service Source Types

AWS App Runner offers two different types of service sources: source code and source image. Regardless of the source type, App Runner takes care of starting, running, scaling, and load balancing your service. You can use the CI/CD capability of App Runner to track changes to your source image or code. When App Runner discovers a change, it automatically builds (for source code) and deploys the new version to your App Runner service.

To learn more about these two service types, please refer to the AWS App Runner documentation.

Source Code-Based Services

For this demo, we will focus on services based on source code. Source code is the application code that App Runner builds and deploys for you. You need to point App Runner to a source code repository with a supported runtime. App Runner will then build an image based on the base image of the runtime and your application code. Finally, it will start a service that runs a container based on this image. You don't need to provide container configuration or build instructions such as a Dockerfile.

Check the supported runtimes for more information.

Demo Instructions

In this demo, you can use a Node.js app stored in a GitHub repository. App Runner will automatically build, deploy, and host it in the cloud for you.

You have two options to perform this demo:

  1. Console: You can use the AWS Management Console to follow along. The video below demonstrates how to use the console for this demo.

  2. Infrastructure as Code: The video also demonstrates how to perform the demo using AWS CDK. You can access the Node.js application code from the GitHub repository here. You can find the AWS CDK infrastructure code used in the video here.

The following video shows how you can deploy and host a Node.js application using App Runner.

2. Amazon DynamoDB 101

Amazon DynamoDB is a fast, flexible, serverless NoSQL database service that delivers single-digit millisecond performance at any scale.

Data Models

DynamoDB supports two main data models: key-value and wide-column. These data models are highly efficient for retrieving items.

The key-value data model allows you to retrieve one item at a time using a primary key. It functions like a massive hash, enabling fast retrieval when the primary key is known.

For more complex access patterns, you can use the wide-column data model provided by DynamoDB. In this model, the hash is still required, but the value for each hash record is a B-tree. A B-tree is a data structure that enables quick element retrieval and range queries.
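The two models can be pictured with a small in-memory sketch. This is only an illustration, not how DynamoDB is implemented: the class name ToyTable is hypothetical, and a sorted Python list stands in for the B-tree. The outer dictionary plays the role of the hash on the partition key, and each partition keeps its items ordered by sort key so that range lookups stay cheap.

```python
from bisect import bisect_left, bisect_right


class ToyTable:
    """Hypothetical in-memory stand-in: a hash of partitions, each sorted by sort key."""

    def __init__(self):
        # partition key -> list of (sort_key, item) pairs, kept in sort-key order
        self._partitions = {}

    def put_item(self, pk, sk, item):
        rows = self._partitions.setdefault(pk, [])
        keys = [k for k, _ in rows]
        i = bisect_left(keys, sk)
        if i < len(rows) and rows[i][0] == sk:
            rows[i] = (sk, item)  # same primary key: overwrite the item
        else:
            rows.insert(i, (sk, item))

    def get_item(self, pk, sk):
        """Key-value access: fetch one item by its full primary key."""
        rows = self._partitions.get(pk, [])
        keys = [k for k, _ in rows]
        i = bisect_left(keys, sk)
        if i < len(rows) and rows[i][0] == sk:
            return rows[i][1]
        return None

    def query_between(self, pk, low, high):
        """Range access: items in one partition with low <= sort key <= high."""
        rows = self._partitions.get(pk, [])
        keys = [k for k, _ in rows]
        start, end = bisect_left(keys, low), bisect_right(keys, high)
        return [item for _, item in rows[start:end]]
```

With orders stored under a user's partition and the order date as sort key, get_item returns a single order, while query_between returns every order in a date range without touching other partitions.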

Main Components

DynamoDB organizes data into tables. You can create or delete a table with a simple API call. Each table contains items, and each item has attributes. Every item requires at least one attribute, which is the partition key. Optionally, you can define a sort key, which becomes another required attribute for each item.

The combination of the partition key and sort key (if defined) forms the primary key, which must be unique. Each item can also have additional attributes, and the set of attributes can vary from item to item.

DynamoDB also supports secondary indexes. While primary keys allow access to data, secondary indexes provide alternative ways to retrieve data efficiently. There are two types of secondary indexes: local secondary indexes and global secondary indexes.

Operations

DynamoDB offers three main types of operations:

  1. Item-based operations: These operations are performed on individual items in a table.

  2. Queries: Queries operate on a group of items that share the same partition key in a table or secondary index. This group is referred to as an item collection.

  3. Scans: Scans retrieve all items in a table, which can be useful when you need to analyze or process the entire dataset.

Pricing

DynamoDB charges based on the amount of data stored and the amount of data read and written. The main cost components are:

  • Storage costs: Based on the amount of data stored per month, measured in GB-month. This includes data and indexes.

  • Read capacity units (RCUs): Charged for data read from DynamoDB. One RCU covers one strongly consistent read per second of an item up to 4 KB. An eventually consistent read of the same size uses half an RCU, and a transactional read uses two.

  • Write capacity units (WCUs): Charged for data written to DynamoDB. One WCU covers one write per second of an item up to 1 KB, and a transactional write of the same size uses two WCUs.
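The capacity unit rules above can be turned into a small calculator. This is a minimal sketch, assuming item sizes are rounded up to the next 4 KB step for reads and the next 1 KB step for writes; the function names are hypothetical, and the Pricing page remains the reference for actual charges.

```python
import math


def read_capacity_units(item_size_bytes, consistency="strong"):
    """RCUs consumed by one read per second of an item of the given size."""
    # Reads are counted in 4 KB steps, rounded up.
    blocks = max(1, math.ceil(item_size_bytes / 4096))
    multiplier = {"eventual": 0.5, "strong": 1.0, "transactional": 2.0}[consistency]
    return blocks * multiplier


def write_capacity_units(item_size_bytes, transactional=False):
    """WCUs consumed by one write per second of an item of the given size."""
    # Writes are counted in 1 KB steps, rounded up.
    blocks = max(1, math.ceil(item_size_bytes / 1024))
    return blocks * (2 if transactional else 1)
```

For example, a 6,000-byte item spans two 4 KB steps, so a strongly consistent read uses 2 RCUs, and writing it spans six 1 KB steps, so a standard write uses 6 WCUs.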

For detailed information on DynamoDB pricing, refer to the Amazon DynamoDB Pricing page.

This video covers the most important DynamoDB concepts in more detail.

Amazon DynamoDB under the hood

In this section, you will delve into the inner workings of DynamoDB. Two Principal Engineers, Somu Perianayagam and Akshat Vig, provide insights into how DynamoDB functions.

The video begins by discussing the history of DynamoDB and highlights the distinctions between Dynamo and DynamoDB. It then delves into the architecture that powers DynamoDB and explains why this database is exceptionally fast. The video concludes by covering several essential features of DynamoDB, such as indexes, streams, backup and restore functionality, and transactions.

For a comprehensive understanding of DynamoDB's architecture and performance, you can refer to the following resources:

Dynamo is a NoSQL database that Amazon created for its website, amazon.com. It was developed because the existing database systems couldn't handle the load required by the site's shopping cart system. Dynamo became popular within Amazon due to its scalability and predictable performance. However, teams using Dynamo had to manage their own infrastructure, which was a drawback. In 2012, DynamoDB was launched as a fully managed service, providing the flexibility of NoSQL with the convenience of not having to worry about servers. It gained popularity and became the default standard for key-value stores internally at Amazon. Dynamo and DynamoDB have some differences: Dynamo is single-tenant and requires more configuration, while DynamoDB is multi-tenant and offers simplified APIs. The architecture of DynamoDB involves request routing, authentication, metadata lookup, and replication of data across multiple Availability Zones.

DynamoDB is a highly scalable database service that uses partitions to handle reads and writes. Each partition can handle a certain amount of reads and writes per second. The number of partitions needed depends on the requested throughput. In the past, admission control was done at the partition level, assuming uniform distribution of traffic. However, real-world workloads are often non-uniform, leading to hot partitions and throttling. To address this, admission control was moved to the table level, and smart placement algorithms were introduced to efficiently distribute partitions. DynamoDB Streams allows customers to consume changes in their tables in real-time, enabling features like event-driven programming, backup, and restore.
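The difference between partition-level and table-level admission control can be illustrated with a toy model. This is only a sketch under strong simplifications: the function names are hypothetical, capacity is split evenly across partitions in the first case, and the model ignores bursting and the real upper limit on what a single partition can serve.

```python
def throttled_per_partition(traffic, table_capacity):
    """Requests rejected when each partition gets an equal, fixed share of capacity."""
    share = table_capacity / len(traffic)
    return sum(max(0, demand - share) for demand in traffic.values())


def throttled_table_level(traffic, table_capacity):
    """Requests rejected when only the table-wide total is enforced."""
    return max(0, sum(traffic.values()) - table_capacity)
```

With 300 units of capacity over three partitions and per-second demand of 150, 50, and 50, the per-partition model rejects 50 requests on the hot partition even though the table as a whole is under its limit. The table-level model rejects none.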

Check out this interview with two DynamoDB engineers.

3. Designing Your DynamoDB Table

Is DynamoDB the right database for you?

When considering adopting DynamoDB as the database for your application, the first question you need to answer is whether DynamoDB is the right database for you.

At AWS, we believe in the concept of database freedom, which means choosing the right database for your specific data requirements, access model, and scalability needs. AWS offers 16 different types and engines of databases for various purposes. So, why should you consider DynamoDB?

In this beginner's guide, you'll explore the main reasons why DynamoDB can be a strong NoSQL choice for your next project. DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. It's well suited to applications that require low latency and high throughput at any scale.

Adopting DynamoDB

Now that you know DynamoDB is a good fit for your needs, let's discuss the steps you need to take for adopting and migrating to DynamoDB:

  1. Get yourself trained in DynamoDB characteristics: Familiarize yourself with the features and capabilities of DynamoDB. This will help you make informed decisions during the adoption process.

  2. Understand Single Table design: When building applications with DynamoDB, it's beneficial to learn about the concept of Single Table design. This design approach can simplify your data model and improve performance.

  3. Design and build DynamoDB tables and indexes: With the knowledge gained from the previous steps, you can now design and build your DynamoDB tables and indexes based on your application's data access patterns.

  4. Convert and migrate your data: If you are migrating from another database, you'll need to convert and migrate your data to DynamoDB. Depending on the data source, you may need to perform some preprocessing to prepare the data for migration.

  5. Code your application: Develop or modify your application code to interact with DynamoDB using the appropriate access patterns. This step may involve significant developer effort.

  6. Deploy and test the migrated application: Deploy your migrated application to the cloud and thoroughly test it. This step ensures that your database is provisioned correctly and that your application functions as expected.

By following these steps, you can successfully adopt DynamoDB for your application. This section covers step 1, and the following sections will walk you through the remaining steps in the adoption process.

For detailed guidance on determining if DynamoDB is appropriate for your needs and planning your migration, refer to the following resource:

Introduction to Single Table Design

The main concept behind single table design is to organize your data in a way that allows you to retrieve all the data you need to solve a specific access pattern with a single query. The goal is to design your table in such a way that you can fulfill all your access patterns with the least number of requests to DynamoDB, ideally just one.

Single table design is based on the idea of denormalization, where you store related data together in a single table rather than spreading it across multiple tables. By doing so, you eliminate the need for complex joins and enable efficient queries that retrieve all the required data at once.
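As a rough illustration of storing related data together, the sketch below puts a customer profile and that customer's orders in the same item collection. The key formats (CUSTOMER#..., ORDER#...), attribute names, and helper functions are assumptions made up for this example, and query_partition only emulates a query over a plain list.

```python
def customer_item(customer_id, name):
    """Profile item: lives in the customer's partition under a fixed sort key."""
    return {"PK": f"CUSTOMER#{customer_id}", "SK": "PROFILE", "type": "customer", "name": name}


def order_item(customer_id, order_date, order_id, total):
    """Order item: same partition as its customer, sorted by date then order id."""
    return {
        "PK": f"CUSTOMER#{customer_id}",
        "SK": f"ORDER#{order_date}#{order_id}",
        "type": "order",
        "total": total,
    }


def query_partition(items, pk, sk_prefix=""):
    """Emulates a query: one partition key, optional sort-key prefix, ordered by sort key."""
    matches = [i for i in items if i["PK"] == pk and i["SK"].startswith(sk_prefix)]
    return sorted(matches, key=lambda i: i["SK"])
```

One request for the partition CUSTOMER#42 returns the profile and all orders together, which is the role a join would play in a relational design. Passing the prefix ORDER# narrows the same request to orders only.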

To effectively implement single table design, you need to carefully consider your access patterns and design your table schema accordingly. Each access pattern should have a corresponding index or set of attributes that allow you to retrieve the required data efficiently.

By leveraging single table design, you can achieve high performance and scalability in DynamoDB while minimizing the number of requests and simplifying your application logic.

Implementing single table design requires careful planning and understanding of your application's requirements and access patterns. It is recommended to review the provided resources and explore further documentation to design your DynamoDB table effectively.

If you want to dive deeper into the single table design concept, you can check out this video.

Design a DynamoDB Table with Single Table Design

In this section, you will learn how to design a DynamoDB table using the single table design approach. By following these steps, you can efficiently model your data for maximum performance and scalability.

Here is an overview of the steps involved in designing a DynamoDB table with single table design:

  1. Create an Entity-Relationship Diagram (ERD): Start by creating an ERD that visualizes the relationships between your entities. This diagram will help you understand the structure of your data and identify the primary entities and their attributes.

  2. Identify Access Patterns: Determine all the access patterns required by your application. This involves identifying the different ways you need to retrieve and manipulate data from the table. Consider the types of queries you'll need to perform and the data you'll need to retrieve for each query.

  3. Model the Primary Key Structure: Design the primary key structure based on your identified access patterns. The primary key consists of a partition key and an optional sort key. Choose meaningful attribute(s) that will allow you to efficiently retrieve the required data for each access pattern. You may need to denormalize your data and duplicate it in different parts of the table to optimize query performance.

  4. Satisfy Additional Access Patterns: In some cases, you may have access patterns that cannot be efficiently satisfied with the primary key alone. In such cases, consider using secondary indexes or DynamoDB streams to address these additional access patterns. Secondary indexes allow you to query the table using different attributes, while streams enable capturing and processing changes to the table in real-time.
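Step 4 can be sketched by emulating a global secondary index in memory. The attribute names GSI1PK and GSI1SK and the "orders by status" access pattern are assumptions for illustration. Items that lack the index attributes are left out, which mirrors how a sparse index behaves.

```python
def build_index(items, pk_attr, sk_attr):
    """Emulates a secondary index: items regrouped under alternate key attributes."""
    index = {}
    for item in items:
        # Items missing either index attribute do not appear in the index.
        if pk_attr in item and sk_attr in item:
            index.setdefault(item[pk_attr], []).append(item)
    for rows in index.values():
        rows.sort(key=lambda i: i[sk_attr])
    return index
```

The base table stays keyed by customer, while the index answers a question the primary key cannot, such as listing shipped orders across all customers in date order.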

You can also refer to the following resource for more information:

By following these steps and considering your application's requirements, you can design a DynamoDB table using the single table design approach, resulting in a highly performant and scalable data model.

Remember to adapt these steps to your specific use case and consult the provided resources for more detailed guidance.

If you want to see how I design a table for an application using the four steps mentioned above, check out this video.

4. Implementing DynamoDB in Your Application

Provision Amazon DynamoDB Table and Indexes with AWS CDK

In this section, you will learn how to provision a DynamoDB table and global secondary indexes using AWS CDK (Cloud Development Kit). AWS CDK is an infrastructure-as-code framework that allows you to define and provision AWS resources using familiar programming languages.

Here are the steps to provision the DynamoDB table and indexes using AWS CDK:

  1. Install and configure AWS CDK: If you haven't already, install AWS CDK and configure it with your AWS credentials. You can refer to the official AWS CDK documentation for installation and setup instructions.

  2. Create a new CDK project: Start by creating a new AWS CDK project using your preferred programming language (e.g., TypeScript, Python, Java). This project will contain the infrastructure code for provisioning the DynamoDB table and indexes.

  3. Write the CDK code: Write the AWS CDK code that defines the DynamoDB table and indexes. You can use the DynamoDB constructs from the AWS CDK construct library for the programming language you're using. Define the table, primary key attributes, and any global secondary indexes you need. You can refer to the following resources for sample AWS CDK code:

  4. Deploy the infrastructure: Once you've written the AWS CDK code, you can deploy the infrastructure by running the cdk deploy command. This command creates the DynamoDB table and indexes in your AWS account based on the code you've written.

  5. Test the provisioned DynamoDB table: After the infrastructure is deployed, you can test the provisioned DynamoDB table by performing various read and write operations using the AWS SDK or other tools. Verify that the table and indexes are functioning as expected and serving your application's access patterns efficiently.

By following these steps, you can provision a DynamoDB table and global secondary indexes using AWS CDK. This allows you to define your infrastructure as code and easily manage and replicate your DynamoDB setup across different environments.
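To make concrete what a table definition has to specify, here is a sketch that builds the parameters of a low-level CreateTable request as a plain dictionary. This is not the AWS CDK code from the video: the function name, the generic PK/SK attribute names, and the GSI1 index name are assumptions, and on-demand billing is chosen only to keep the example short.

```python
def table_definition(table_name, gsi_names=("GSI1",)):
    """Builds CreateTable-style parameters for a PK/SK table with optional GSIs."""
    attribute_definitions = [
        {"AttributeName": "PK", "AttributeType": "S"},
        {"AttributeName": "SK", "AttributeType": "S"},
    ]
    indexes = []
    for name in gsi_names:
        # Every attribute used in a key schema must be declared with its type.
        attribute_definitions += [
            {"AttributeName": f"{name}PK", "AttributeType": "S"},
            {"AttributeName": f"{name}SK", "AttributeType": "S"},
        ]
        indexes.append({
            "IndexName": name,
            "KeySchema": [
                {"AttributeName": f"{name}PK", "KeyType": "HASH"},
                {"AttributeName": f"{name}SK", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        })
    params = {
        "TableName": table_name,
        "BillingMode": "PAY_PER_REQUEST",  # on-demand: no provisioned throughput to set
        "AttributeDefinitions": attribute_definitions,
        "KeySchema": [
            {"AttributeName": "PK", "KeyType": "HASH"},   # partition key
            {"AttributeName": "SK", "KeyType": "RANGE"},  # sort key
        ],
    }
    if indexes:
        params["GlobalSecondaryIndexes"] = indexes
    return params
```

Only key attributes are declared up front. All other attributes are schemaless and can vary from item to item, so they do not appear in the definition.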

Code your application

Once you have the infrastructure ready, it's time to start coding your application to support the new access patterns. This section will guide you through six steps that you can follow when writing your application.

  1. Configure the AWS SDK in your application: Set up the AWS SDK in your application to interact with DynamoDB. This includes providing your AWS credentials and configuring the SDK to connect to the correct AWS region.

  2. Configure and pick your test library: Choose a test library for your application and configure it to run tests against your DynamoDB infrastructure. This will allow you to test your application's interactions with the database.

  3. Create models: Define models for your application that map to the DynamoDB table and indexes. These models will help you work with data in a structured way and provide an abstraction layer between your application and the database.

  4. Modify controllers: Update your application's controllers to handle the new access patterns. This may involve making changes to the logic for retrieving, creating, updating, and deleting data based on the specific requirements of your application.

  5. Check routes: Review and update the routes in your application to ensure they align with the new access patterns. Verify that the routes are correctly handling requests and interacting with the DynamoDB table and indexes.

  6. Ensure middleware is working: Test and validate any middleware that you have in your application, such as authentication or request validation middleware. Make sure that the middleware is functioning correctly and properly integrating with your DynamoDB infrastructure.
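The "create models" step can be sketched as a small class that translates between application objects and table items. The Order model, its fields, and the key formats are hypothetical. The demo application is a Node.js app, so treat this Python version as an illustration of the idea rather than its actual code.

```python
from dataclasses import dataclass


@dataclass
class Order:
    customer_id: str
    order_id: str
    total_cents: int

    def to_item(self):
        """Maps the model onto the table's generic key attributes."""
        return {
            "PK": f"CUSTOMER#{self.customer_id}",
            "SK": f"ORDER#{self.order_id}",
            "type": "order",
            "total_cents": self.total_cents,
        }

    @classmethod
    def from_item(cls, item):
        """Rebuilds the model from an item, stripping the key prefixes."""
        return cls(
            customer_id=item["PK"].split("#", 1)[1],
            order_id=item["SK"].split("#", 1)[1],
            total_cents=int(item["total_cents"]),
        )
```

Keeping the key formats inside the model means controllers and routes work with customer_id and order_id and never build PK or SK strings themselves.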

Additionally, in this section, you will learn about DynamoDB expressions, which are powerful tools for working with the database:

  • Key condition expressions: These expressions select the items a query reads, based on the primary key attributes. The partition key must be matched with an equality condition, while the sort key also supports comparisons, ranges (BETWEEN), and prefix matches (begins_with).

  • Filter expressions: Filter expressions further refine the results of a query or scan operation. The filter is applied after the items are read, so it reduces the data returned to your application but not the read capacity consumed.

  • Condition expressions: Condition expressions are used to ensure that specific conditions are met before performing write operations (e.g., put, update, delete). They enable you to add conditional logic to your write operations.

  • Update expressions: Update expressions define how attributes of an item should be modified during an update operation. You can use update expressions to add, remove, or modify attributes of an item.

  • Projection expressions: Projection expressions determine which attributes should be returned in the query or scan results. You can use projection expressions to specify the attributes you need and optimize the data retrieval process.
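The expression types above can be seen together in the request parameters below, written in the shape used by the low-level DynamoDB API with typed attribute values. The table layout, attribute names, and function names are assumptions for illustration. The dictionaries would be passed to a Query and an UpdateItem call respectively.

```python
def query_orders_params(table_name, customer_id, min_total):
    """Query parameters: a customer's orders at or above a minimum total."""
    return {
        "TableName": table_name,
        # Key condition: equality on the partition key, prefix match on the sort key.
        "KeyConditionExpression": "PK = :pk AND begins_with(SK, :prefix)",
        # Filter: applied after the matching items have been read.
        "FilterExpression": "#total >= :min_total",
        # Projection: return only the attributes the caller needs.
        "ProjectionExpression": "SK, #total",
        # Name placeholders avoid clashes with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#total": "total"},
        "ExpressionAttributeValues": {
            ":pk": {"S": f"CUSTOMER#{customer_id}"},
            ":prefix": {"S": "ORDER#"},
            ":min_total": {"N": str(min_total)},
        },
    }


def ship_order_params(table_name, customer_id, order_id):
    """UpdateItem parameters: mark an order as shipped only if it is still pending."""
    return {
        "TableName": table_name,
        "Key": {
            "PK": {"S": f"CUSTOMER#{customer_id}"},
            "SK": {"S": f"ORDER#{order_id}"},
        },
        "UpdateExpression": "SET #status = :shipped",
        # Condition: the write fails if the item is missing or not pending.
        "ConditionExpression": "attribute_exists(PK) AND #status = :pending",
        "ExpressionAttributeNames": {"#status": "status"},
        "ExpressionAttributeValues": {
            ":shipped": {"S": "SHIPPED"},
            ":pending": {"S": "PENDING"},
        },
    }
```

Numbers travel as strings inside the N type, which is why min_total is converted with str before it is sent.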

By following these steps and understanding DynamoDB expressions, you can build an application that effectively interacts with your DynamoDB infrastructure and supports the desired access patterns.

How to Migrate Data to DynamoDB?

Once you have the infrastructure ready and your application coded, if you are performing a migration, it's time to start converting and migrating your data. The following section will provide you with a set of tips to help you build a migration plan.

AWS Database Migration Service (AWS DMS)

The recommended option for migrating your data is to use AWS Database Migration Service (DMS). This service is designed to facilitate database migrations and can be a valuable tool for your migration process.

However, if you are using the single table design approach, AWS DMS might not be the best fit, as it creates separate tables for each table in the source database. If you are migrating from MongoDB and still want to explore using AWS DMS, I have included a detailed blog post in the resources that explains how to perform a live migration from a MongoDB cluster to Amazon DynamoDB.

Custom Script

Because a table built with the single table design has a structure specific to your application, you will likely need to write your own custom migration script.

Your script will consist of two parts: retrieving the data from the original database and saving the items into the DynamoDB table.

Retrieving the data from the source

Most databases allow you to export data in formats such as JSON or CSV. If that is the case, you can store those exported items in Amazon S3 and then upload them to DynamoDB.

For databases with custom export formats, you may need to build a script that performs the data extraction for you.

The result of this step is having one file per entity in Amazon S3, and potentially some additional files for relationships if you are migrating from a relational database.
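As a sketch of the conversion between the export and the target table, the function below turns exported CSV rows into items shaped for a single table. The column names (order_id, customer_id, total_cents) and the key formats are hypothetical, so adapt them to your own export and key design.

```python
import csv
import io


def rows_to_items(csv_text):
    """Turns exported order rows into single-table items."""
    items = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        items.append({
            # The foreign key from the relational export becomes the partition key.
            "PK": f"CUSTOMER#{row['customer_id']}",
            "SK": f"ORDER#{row['order_id']}",
            "type": "order",
            "total_cents": int(row["total_cents"]),
        })
    return items
```

Relationships that lived in separate join tables would be folded into the keys in the same way, one transformation function per entity file.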

Importing the data to DynamoDB

You have a couple of options for importing the data:

  1. Create a table file in Amazon S3 and import it into DynamoDB.

  2. Process the items one by one and upload them to DynamoDB.

DynamoDB provides a feature that imports data from Amazon S3 into a table, handling capacity management and the data import for you. With this feature, you can import files stored in Amazon S3 in CSV, DynamoDB JSON, or Amazon Ion format. However, note that this feature requires creating a new table; you cannot import data into an existing table.

To build the table file for direct import into DynamoDB, or to iterate over the dump files from the database, you will need to create a process that handles it. You can leverage the AWS Step Functions Distributed Map state, which iterates over the items in a JSON or CSV file and runs a set of states for each one. A Distributed Map state can run up to 10,000 child workflow executions in parallel.
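Both import options need items in DynamoDB's typed JSON form. The sketch below marshals plain values, writes newline-delimited Item objects for an import file, and chunks items for the one-by-one route. The helper names are hypothetical, only strings, numbers, and booleans are handled, and the batch size of 25 reflects the BatchWriteItem limit on put and delete requests per call.

```python
import json


def to_attribute_value(value):
    """Marshals a plain Python value into DynamoDB JSON (strings, numbers, booleans only)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}
    if isinstance(value, str):
        return {"S": value}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def to_import_lines(items):
    """One Item object per line, the newline-delimited DynamoDB JSON layout."""
    return "\n".join(
        json.dumps({"Item": {k: to_attribute_value(v) for k, v in item.items()}})
        for item in items
    )


def batches(items, size=25):
    """Chunks items into groups small enough for a single batch write."""
    return [items[i:i + size] for i in range(0, len(items), size)]
```

A worker started by the Distributed Map state could call to_import_lines for its slice of the dump and write the result to Amazon S3, or call batches and send each chunk to the table directly.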

Deploy and Load Test

In this step, you deploy your migrated application to the cloud and run tests to confirm that your database is provisioned correctly. This section explains how to perform load tests to verify the configuration of your table.

Before you begin, make sure you have completed the necessary steps to deploy your application using AWS CDK. You can find the code in the resources.

To load test your application, you can utilize tools like Artillery.io, which allows you to run load tests against your serverless applications. I have included a video in the resources that demonstrates how to run Artillery in a Lambda function for load testing.

During the load testing process, it's important to monitor the performance and metrics of your application. Amazon CloudWatch provides a powerful monitoring solution, and you can use AWS CDK to build CloudWatch dashboards for visualizing and analyzing your application's metrics. The resources include a video that explains how to build Amazon CloudWatch dashboards using AWS CDK.

By deploying your application and conducting load tests while monitoring the performance metrics, you can ensure that your database is provisioned correctly and that your application is capable of handling the expected workload.

5. Adding Event-Driven Characteristics

The final section of this guide explores the addition of event-driven characteristics to your application by leveraging DynamoDB Streams.

What are DynamoDB Streams?

DynamoDB Streams is a powerful feature provided by DynamoDB that allows you to capture changes in a table. Whenever a record is inserted, modified, or deleted in a table, an event is emitted into the stream. These streams provide a time-ordered sequence of modifications within a table, and the data is stored for 24 hours.

With DynamoDB Streams, applications can consume these stream records in near real time and act on the events captured.
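A consumer of these records could look like the sketch below, which reduces a batch of stream records to a list of changes. The event layout follows the shape AWS Lambda passes to a function subscribed to a stream. The function name and the PK/SK key names are assumptions, and NewImage is only present when the stream's view type includes new images.

```python
def summarize_stream_event(event):
    """Reduces a batch of stream records to one summary per change, in arrival order."""
    changes = []
    for record in event.get("Records", []):
        keys = record["dynamodb"]["Keys"]
        changes.append({
            "action": record["eventName"],  # INSERT, MODIFY, or REMOVE
            "pk": keys["PK"]["S"],
            "sk": keys["SK"]["S"],
            # Missing on REMOVE, or when the stream does not include new images.
            "new_image": record["dynamodb"].get("NewImage"),
        })
    return changes
```

Attribute values arrive in the typed DynamoDB JSON form, which is why the keys are read through the S wrapper.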

In this section, you will gain an understanding of what DynamoDB Streams are, when to utilize them, and how to get started with DynamoDB Streams and AWS CDK.

You can learn more about DynamoDB Streams in this video.

Acting on stream events

There are multiple ways to take action on the events sent over DynamoDB Streams. One simple approach is to leverage Amazon EventBridge Pipes. Pipes connect sources to targets, reducing the need for specialized knowledge and integration code when building event-driven architectures. For example, you can easily connect DynamoDB Streams with an Amazon SNS email subscription to receive email notifications whenever changes occur in the application.

Check the following video to see how to do it.

Filtering subscription messages

By utilizing SNS message payload filtering, we can filter the messages received by the email subscription, optimizing our application for our specific use case without making changes to the consumer or the receiver.
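To show the idea behind payload filtering, the sketch below implements a deliberately simplified matcher. It covers exact-value matching on nested keys only. Real SNS filter policies support further operators such as prefix and numeric matching, which this sketch ignores. The policy contents and the function name are assumptions for illustration.

```python
def matches(policy, payload):
    """Exact-match subset of filter policy semantics: every policy key must match."""
    for key, rule in policy.items():
        if key not in payload:
            return False
        if isinstance(rule, dict):
            # Nested policy: descend into the matching part of the payload.
            if not isinstance(payload[key], dict) or not matches(rule, payload[key]):
                return False
        elif payload[key] not in rule:
            # Leaf rule: a list of accepted values.
            return False
    return True
```

A policy such as {"eventName": ["INSERT"]} would let only newly created items reach the email subscription, while the publisher keeps sending every change.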

Watch this video to see how you can use this feature from SNS.

Conclusion

Congratulations on reaching the end! This guide is quite extensive and not intended to be read all at once.

By now, you have gained knowledge about Amazon DynamoDB, learned how to design a single table, and implemented it both as infrastructure as code and in your application's codebase. You also discovered how to develop, test, and deploy an application that uses DynamoDB effectively. Lastly, you explored how to add event-driven features to your application using Streams and Pipes.

Well done on completing this guide!