Node.js – DynamoDB query on boolean key

amazonamazon-dynamodbamazon-web-servicesnode.jsnosql

I'm new to DynamoDB (and to noSQL in general) and am struggling a little to get my head round some of the concepts. One thing in particular is giving me some problems, which is around querying a table based on a boolean key.

I realize that I can't create a primary or secondary index on a boolean key, but I can't see how I should ideally index and query a table with the following structure;

reportId: string (uuid)
reportText: string
isActive: boolean
category: string

I would like to be able to complete the following searches:

  1. Access a specific report directly (a primary hash index of reportId)
  2. List reports of a specific category (a primary hash index on
    category)

These are both straightforward, but I would like to perform two other queries;

  1. List all reports that are marked as isActive = true
  2. List all reports of a specific category that are marked as isActive
    = true

My first approach would be to create a primary hashkey index on isActive, with a rangekey on category, but I'm only able to choose String, Number of Binary as the attribute type.

Storing isActive as a string (saved as 'true' rather than a boolean true) solves the problem, but its horrible using a string for a boolean property.

Am I missing something? Is there a simple way to query the table directly on a boolean value?

Any advice duly appreciated.

Thanks in advance.

Best Solution

My project includes this particular scenario and I've followed the DynamoDB best practice of using sparse indexes on both Local and Global Secondary Indexes. Here is what I would do with your example:

Table: reportId (string, hash key) || reportText (string) || isActive (string, marked as "x") || category (string)

ActiveReportsIndex (Local Secondary Index): reportID (hash key) || isActive (range key)

ActiveReportsByCategoryIndex (Global Secondary Index): category (hash key) || isActive (range key) || reportId

The idea behind sparse indexes is that only reports marked as isActive: "x" will show up in your indexes, so they should require less storage and processing than your main table. Instead of making the isActive attribute a boolean type, which will always store a true or false value, use use a string like "x" or anything else you want when the report is active and DELETE the attribute completely when the report is not active. Makes sense?

UPDATE: If you want a specific kind of sort when you query (e.g. chronological), use a number (e.g. a unix timestamp) instead of an "x" string.

Related Question