DynamoDB Dynomite Indexing

Important: These docs are for the outdated Jets 5 versions and below. For the latest Jets docs: docs.rubyonjets.com

Indexes Summary

DynamoDB has three types of indexes:

  1. Primary Key Index
  2. Local Secondary Index
  3. Global Secondary Index
  • Primary Key: A primary key is either a Partition Key only, or both a Partition Key and a Sort Key (also known as a Composite Key). It must be defined at table creation time and cannot be changed. The primary key is under .Table.KeySchema when you describe the table. The partition key has the KeyType HASH, and the sort key has the KeyType RANGE.
  • Local Secondary Indexes: An LSI must be a Composite Key and must have the same partition key as the primary key. It can use any other attribute as its Sort Key, and the Sort Key is required. LSIs must be created at table creation time and cannot be changed. There is a limit of 5 per table.
  • Global Secondary Indexes: GSIs are like Primary Keys. You can choose any attribute for the Partition Key and, optionally, any attribute for the Sort Key. You can create GSIs at any time. They can take quite a while to create, 8 minutes or so, but do not block other database operations during that time. GSIs are like copies of the table. There is a default limit of 20 per table.

By definition, a composite key is a key that has both a partition key and a sort key.
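
As a rough illustration of reading a key schema, the Ruby sketch below parses a hand-written JSON sample shaped like describe-table output and reports whether the primary key is composite. The sample data and the composite_key? helper are made up for this example; they are not part of Dynomite.

```ruby
require "json"

# Hand-written sample shaped like `aws dynamodb describe-table` output.
# The table and attribute names are illustrative assumptions.
DESCRIBE_TABLE_JSON = <<~JSON
  {
    "Table": {
      "TableName": "products",
      "KeySchema": [
        { "AttributeName": "category", "KeyType": "HASH" },
        { "AttributeName": "sku", "KeyType": "RANGE" }
      ]
    }
  }
JSON

# Hypothetical helper: a key schema is composite when it includes a RANGE (sort) key.
def composite_key?(key_schema)
  key_schema.any? { |element| element["KeyType"] == "RANGE" }
end

key_schema = JSON.parse(DESCRIBE_TABLE_JSON).dig("Table", "KeySchema")
partition_key = key_schema.find { |e| e["KeyType"] == "HASH" }["AttributeName"]

puts "partition key: #{partition_key}"
puts "composite key: #{composite_key?(key_schema)}"
```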

It’s All About Indexing

If you want fast queries, it’s all about indexing. This is why DynamoDB forces us to explicitly use separate scan and query operations. It forces us to acknowledge that we’re doing a slow scan or a fast query 🤣.

Dynomite prints a warning message about the slow scan operation if a scan occurs. You can then either change the querying logic or add an index. Adding the right index based on the columns your app is querying will keep your app fast.

Example of Indexes

To help understand the different types of indexes, here is an example:

products table:
primary key: category (partition_key), sku (sort_key)
LSI: category (partition_key), name (sort_key)
GSI: updated_at (partition_key)
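
For readers who think in terms of the AWS SDK, the products layout above could be written roughly as the parameter hash below, shaped for Aws::DynamoDB::Client#create_table. This is only a sketch: the index names, the string attribute types, and the on-demand billing mode are assumptions, and nothing is sent to AWS.

```ruby
# Parameters shaped for Aws::DynamoDB::Client#create_table (aws-sdk-dynamodb).
# Nothing is sent to AWS here; the hash is only built and inspected.
# Assumptions: every key attribute is a string ("S"), billing is on-demand,
# and the index names are invented for this example.
PRODUCTS_TABLE = {
  table_name: "products",
  billing_mode: "PAY_PER_REQUEST",
  attribute_definitions: [
    { attribute_name: "category",   attribute_type: "S" },
    { attribute_name: "sku",        attribute_type: "S" },
    { attribute_name: "name",       attribute_type: "S" },
    { attribute_name: "updated_at", attribute_type: "S" },
  ],
  # Primary key: category (partition key) + sku (sort key)
  key_schema: [
    { attribute_name: "category", key_type: "HASH" },
    { attribute_name: "sku",      key_type: "RANGE" },
  ],
  # LSI: same partition key as the table, different sort key
  local_secondary_indexes: [
    {
      index_name: "category-name-index",
      key_schema: [
        { attribute_name: "category", key_type: "HASH" },
        { attribute_name: "name",     key_type: "RANGE" },
      ],
      projection: { projection_type: "ALL" },
    },
  ],
  # GSI: any attribute as partition key, sort key optional
  global_secondary_indexes: [
    {
      index_name: "updated_at-index",
      key_schema: [{ attribute_name: "updated_at", key_type: "HASH" }],
      projection: { projection_type: "ALL" },
    },
  ],
}.freeze

table_pk = PRODUCTS_TABLE[:key_schema].first[:attribute_name]
lsi_pk = PRODUCTS_TABLE[:local_secondary_indexes].first[:key_schema].first[:attribute_name]
puts "LSI shares the table's partition key: #{lsi_pk == table_pk}"
```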

Here’s another example:

posts table:
primary key: id (partition_key)
GSI: author (partition_key), updated_at (sort_key)

Notice that for posts, the Primary Key consists only of the id Partition Key, which is unique on its own, so there’s no point in making the Primary Key a Composite Key with a Sort Key. Sort keys are only useful when the partition key acts as a “category”.

Primary Key: Partition Key Only or Composite Key with Sort Key

If you’re coming from ActiveRecord and the relational database world, a “primary key” is a single column that identifies uniqueness. DynamoDB also has a primary key, but you can choose between 2 types of primary keys:

Primary Key Types:

  1. Partition Key Only: One Field Only. This field value must be unique within the table.
  2. Composite Key: Partition Key + Sort Key. The combination of these fields must be unique within the table.

In general, I prefer creating tables with a Partition Key only because:

  • Partition Key Only primary keys are easier to understand. Conceptually, you use them the same way you would with a relational database. This reduced mental overhead makes a difference.
  • When querying with a Partition Key Only primary key, you only need to provide one value to get the item. With a Composite Key, you have to provide both field values. Even though it’s just one extra field, the interface is more cumbersome.
  • You can always create a GSI that is a Composite Key. You can use it to group items by “category” and add a sort column for filtering.
  • Associations work faster and better for tables with only a Partition Key. Only the Partition Key is stored in the “foreign key” column that Dynomite manages.

When you use a partition_key only, the partition_key identifies uniqueness.

With a “composite key”, the partition_key acts like a “category”. Both the partition_key and sort_key are needed to identify uniqueness. A “composite key” is an abstract concept and only exists as a result of specifying a sort_key during table creation. When you have a sort key, you will always have a composite key.

This article also has a useful explanation of DynamoDB’s partition, sort, and composite keys: DynamoDB Composite Key - The Ultimate Guide

When you have a composite key on your DynamoDB Table, that will be considered the primary key. In simpler terms, this means when you have a sort key set on your table, all Get, Update, Delete item commands must include both the partition key and sort key. This is because the partition key is no longer considered unique, the composite key is considered unique, and therefore you need both elements to identify the specific item you are referring to.
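
To make the uniqueness rules concrete, here is a toy in-memory model in plain Ruby. It is not DynamoDB or Dynomite; ToyTable and its put and get methods are hypothetical names used only to show that a partition-key-only table looks items up by one value, while a composite-key table needs both.

```ruby
# Toy in-memory model, not DynamoDB or Dynomite.
# Items are stored under their full primary key.
class ToyTable
  def initialize(partition_key:, sort_key: nil)
    @key_names = [partition_key, sort_key].compact
    @items = {}
  end

  # Stores the item, overwriting any item with the same primary key.
  def put(item)
    @items[key_for(item)] = item
  end

  # Looks up an item; every primary key attribute must be given.
  def get(key)
    @items[key_for(key)]
  end

  private

  # Builds the lookup key, raising if a primary key attribute is missing.
  def key_for(attrs)
    @key_names.map do |name|
      attrs.fetch(name) { raise ArgumentError, "missing key attribute: #{name}" }
    end
  end
end

# Partition key only: id alone identifies the item.
posts = ToyTable.new(partition_key: :id)
posts.put(id: "post-1", title: "Hello")
p posts.get(id: "post-1")

# Composite key: category groups items, sku tells them apart.
products = ToyTable.new(partition_key: :category, sort_key: :sku)
products.put(category: "Electronics", sku: "tv-1", name: "TV")
products.put(category: "Electronics", sku: "radio-1", name: "Radio")
p products.get(category: "Electronics", sku: "tv-1")

begin
  products.get(category: "Electronics") # sort key missing
rescue ArgumentError => e
  puts e.message
end
```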

Dynomite Smart Use of Indexes

With DynamoDB, you must usually be explicit about performing a fast operation with the query method and specifying an index. If an index is unavailable, you can still find items by explicitly performing a slow operation with the scan method. This makes sense from a performance perspective, but it’s non-ideal from a developer interface perspective.

If there’s an index of some form, i.e., the primary index, a local secondary index, or a global secondary index, Dynomite uses the index automatically and calls the fast query operation instead of the slower scan operation. Dynomite will print a warning message if the slow scan operation is being used. Here’s an example:

> Product.where(category: "Electronics").to_a
I, [2023-07-29T00:59:21.680888 #66166]  INFO -- : WARN: No index found. Using slow scan operation. Consider creating an index.
See: https://rubyonjets.com/docs/database/dynamodb/indexes/
=> [#<Product:0x00007fbce08672d0 >]
>

Note: We’re calling to_a to force the load since where returns a lazy Enumerator.

If there’s an index, the warning message does not get printed.

> Product.where(category: "Electronics").to_a
=> [#<Product:0x00007fbce08672d0 >]
>
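
The general idea of picking query over scan can be sketched in a few lines of Ruby. This is an illustration only and not Dynomite’s actual implementation: the Index struct and the choose_operation helper are hypothetical, and the sketch assumes every condition is an equality check.

```ruby
# Illustrative sketch only; this is not Dynomite's actual implementation.
Index = Struct.new(:name, :partition_key, :sort_key)

# Hypothetical helper: choose :query when some index's partition key appears
# in the where conditions, otherwise fall back to :scan.
def choose_operation(conditions, indexes)
  index = indexes.find { |i| conditions.key?(i.partition_key) }
  index ? [:query, index.name] : [:scan, nil]
end

indexes = [
  Index.new("primary", :id, nil),
  Index.new("category-index", :category, nil),
]

p choose_operation({ category: "Electronics" }, indexes) # => [:query, "category-index"]
p choose_operation({ price: 100 }, indexes)              # => [:scan, nil]
```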