Data modeling with Amazon DynamoDB

YOUTUBE DIQVJqiSUkE Alex DeBrie presents at AWS re:Invent, 2019

Alex maintains a guide online with great notes. dynamodbguide.com

Basics

1. Start with an ERD
2. Define your access patterns
3. Design your primary and secondary indexes

The main use case for Amazon DynamoDB is when you have so much data that you need a database that operates at high scale.

A secondary use case is ephemeral compute, where lots of small, short-lived connections need to access data concurrently in bursts.

Authorization is done with a session store over HTTPS as a cookie or header.

One to Many

One to many relationships are difficult. You need to understand how your data will be accessed and choose a strategy from there.

Recap of one to many strategies, at 29:06. Start of this topic is 22:45

De-normalize Data is a strategy which stores related entities directly in your document, e.g. storing a user's address in the users table.

Composite Primary Key is a technique for modelling one to many relationships. Related items use the same partition key (pk) but different sort keys (sk). This means they're stored together and can be queried quickly. This is very common.
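A minimal sketch of the idea in plain Python, with no DynamoDB involved: items that share a partition key form one collection kept in sort key order, so one lookup returns a parent's children. The item contents and the `build_table` and `query` helpers are hypothetical, invented for illustration.

```python
from collections import defaultdict

# Hypothetical items: a user profile and that user's orders share one pk.
ITEMS = [
    {"pk": "USER#alexdebrie", "sk": "ORDER#2019-11-02", "total": 30},
    {"pk": "USER#alexdebrie", "sk": "PROFILE", "name": "Alex DeBrie"},
    {"pk": "USER#alexdebrie", "sk": "ORDER#2019-10-15", "total": 12},
    {"pk": "USER#someoneelse", "sk": "PROFILE", "name": "Someone Else"},
]


def build_table(items):
    """Group items by partition key and keep each group sorted by sort key."""
    table = defaultdict(list)
    for item in items:
        table[item["pk"]].append(item)
    for collection in table.values():
        collection.sort(key=lambda item: item["sk"])
    return table


def query(table, pk, sk_prefix=""):
    """Return items in one partition whose sort key starts with sk_prefix."""
    return [item for item in table.get(pk, []) if item["sk"].startswith(sk_prefix)]


table = build_table(ITEMS)
orders = query(table, "USER#alexdebrie", "ORDER#")
print([item["sk"] for item in orders])
```

Prefixing sort keys with the entity type (`ORDER#`, `PROFILE`) is what lets one partition hold several kinds of item and still be sliced by type.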

Inverted Index is a technique where the sort key is the same across different items and you create an index that swaps the keys, using sk as the partition key and pk as the sort key, so you can query in the reverse direction.
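As a rough illustration in plain Python (not DynamoDB itself), the same items can be grouped twice: once by pk for the base table and once by sk for the inverted index. The membership items and the `build_index` helper are assumptions made up for this sketch.

```python
from collections import defaultdict

# Hypothetical membership items: several users point at the same sk.
ITEMS = [
    {"pk": "USER#alexdebrie", "sk": "ORG#acme"},
    {"pk": "USER#janedoe", "sk": "ORG#acme"},
    {"pk": "USER#janedoe", "sk": "ORG#globex"},
]


def build_index(items, partition_attr, sort_attr):
    """Group items by partition_attr, each group ordered by sort_attr."""
    index = defaultdict(list)
    for item in items:
        index[item[partition_attr]].append(item)
    for collection in index.values():
        collection.sort(key=lambda item: item[sort_attr])
    return index


base_table = build_index(ITEMS, "pk", "sk")  # look up by user
inverted = build_index(ITEMS, "sk", "pk")    # look up by org (keys swapped)

orgs_for_jane = [item["sk"] for item in base_table["USER#janedoe"]]
users_in_acme = [item["pk"] for item in inverted["ORG#acme"]]
print(orgs_for_jane, users_in_acme)
```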

Filtering

You can't query across partitions! Beware of "scan", as it reads the whole table and will bog things down. A single Query or Scan request reads at most 1 MB of data, and a filter expression is applied only after that read, so a scan with a filter expression can return incomplete results. I.e. filtering with scan may work against a small table on your local machine, but it won't in production.
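The sketch below simulates why a filtered scan can look incomplete. It is plain Python, not the DynamoDB API: `page_limit` is an item count standing in for the 1 MB read limit, and `scan_page` and `scan_all` are hypothetical helpers. In the real API the cursor is the `LastEvaluatedKey` returned by one request and passed as `ExclusiveStartKey` to the next.

```python
def scan_page(items, page_limit, predicate, start=0):
    """Read up to page_limit items, THEN filter them, like one Scan request."""
    page = items[start:start + page_limit]
    matches = [item for item in page if predicate(item)]
    next_start = start + page_limit
    cursor = next_start if next_start < len(items) else None
    return matches, cursor


def scan_all(items, page_limit, predicate):
    """Keep requesting pages until the cursor runs out."""
    results, cursor = [], 0
    while cursor is not None:
        matches, cursor = scan_page(items, page_limit, predicate, cursor)
        results.extend(matches)
    return results


def is_shipped(item):
    return item["status"] == "SHIPPED"


# Hypothetical table: every tenth order is shipped.
ITEMS = [{"id": n, "status": "SHIPPED" if n % 10 == 0 else "PLACED"} for n in range(100)]

first_page, cursor = scan_page(ITEMS, 25, is_shipped)
everything = scan_all(ITEMS, 25, is_shipped)
print(len(first_page), len(everything))
```

The first request reads 25 items but returns only the few that pass the filter, while the whole table still has to be read to find the rest.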

Access based filtering patterns, 33:18

Filter on the primary key: match the partition key exactly and filter the sort key with a prefix pattern ``` "PK = USER#alexdebrie AND begins_with(SK, 'ORDER#')" ```
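In a real request the literal values are usually supplied as placeholders. Here is a sketch of the parameters for such a query in the low-level attribute-value format; the table name `ExampleTable` and the helper `orders_for_user_query` are hypothetical. The block only builds a dictionary and never contacts AWS.

```python
def orders_for_user_query(table_name, username):
    """Build keyword arguments for a DynamoDB Query on one user's orders."""
    return {
        "TableName": table_name,
        # Exact match on the partition key, prefix match on the sort key.
        "KeyConditionExpression": "PK = :pk AND begins_with(SK, :prefix)",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"USER#{username}"},
            ":prefix": {"S": "ORDER#"},
        },
    }


params = orders_for_user_query("ExampleTable", "alexdebrie")
print(params["KeyConditionExpression"])

# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").query(**params)
```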

Composite sort key strategy requires you to create a new attribute which jams together different attributes, e.g. "OrderStatusDate" combines an order's status with a date ``` "PK = USER#alexdebrie AND begins_with(OrderStatusDate, 'SHIPPED#')" ```
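A small sketch of building such an attribute, in plain Python with made-up orders. The `order_status_date` helper and the `Status` and `Date` attribute names are assumptions. In DynamoDB the combined attribute would have to be the sort key of the table or of an index to be usable in a key condition.

```python
def order_status_date(status, iso_date):
    """Join status and date so a prefix match selects one status, date-ordered."""
    return f"{status}#{iso_date}"


# Hypothetical orders for one user.
ORDERS = [
    {"PK": "USER#alexdebrie", "OrderId": "1", "Status": "SHIPPED", "Date": "2019-11-02"},
    {"PK": "USER#alexdebrie", "OrderId": "2", "Status": "PLACED", "Date": "2019-11-05"},
    {"PK": "USER#alexdebrie", "OrderId": "3", "Status": "SHIPPED", "Date": "2019-10-15"},
    {"PK": "USER#alexdebrie", "OrderId": "4", "Status": "DELIVERED", "Date": "2019-09-01"},
]

for order in ORDERS:
    order["OrderStatusDate"] = order_status_date(order["Status"], order["Date"])

# Equivalent of begins_with(OrderStatusDate, 'SHIPPED#'), in sort key order.
shipped = sorted(
    (order for order in ORDERS if order["OrderStatusDate"].startswith("SHIPPED#")),
    key=lambda order: order["OrderStatusDate"],
)
print([order["OrderId"] for order in shipped])
```

Putting the status first and an ISO-formatted date second means the matching items come back in date order within that status.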

Sparse index pattern strategy allows you to query across different partitions. To do this, add an attribute to your records which is only present when the item should show up in your query, and create an index on that attribute. Items without the attribute are not written to the index. For instance, add a PlacedId to items in an orders table and create an index against it. This query looks like the above, but items have attributes that don't
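The effect of a sparse index can be sketched in plain Python: only items that carry the indexed attribute are copied into the index, whatever partition they live in. The orders, the `PlacedId` values, and the `build_sparse_index` helper are hypothetical.

```python
# Hypothetical orders in different partitions; only some carry PlacedId.
ORDERS = [
    {"PK": "USER#alexdebrie", "SK": "ORDER#1", "PlacedId": "p-001"},
    {"PK": "USER#alexdebrie", "SK": "ORDER#2"},  # attribute absent
    {"PK": "USER#janedoe", "SK": "ORDER#3", "PlacedId": "p-002"},
]


def build_sparse_index(items, index_key):
    """Keep only items that have index_key, ordered by its value."""
    return sorted(
        (item for item in items if index_key in item),
        key=lambda item: item[index_key],
    )


placed_index = build_sparse_index(ORDERS, "PlacedId")
print([item["SK"] for item in placed_index])
```

Removing the attribute from an item, for example once the order is no longer in the placed state, would drop it from the index.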

Terms

GSI - Global Secondary Index, an index with its own partition key and sort key that gives you another way to sort or search your data.

See Also

DAT301-R: Data modeling with Amazon DynamoDB in 60 minutes

DAT325: Amazon DynamoDB: Under the hood of a hyperscale database.