Python – App Engine Datastore IN Operator – how to use



I want to use:

:= IN

but am unsure how to make it work. Let's assume the following

from google.appengine.ext import db

class User(db.Model):
    name = db.StringProperty()

class UniqueListOfSavedItems(db.Model):
    str = db.StringProperty()
    datesaved = db.DateTimeProperty()

class UserListOfSavedItems(db.Model):
    name = db.ReferenceProperty(User, collection_name='user')
    str = db.ReferenceProperty(UniqueListOfSavedItems, collection_name='itemlist')

How can I do a query which gets me the list of saved items for a user? Obviously I can do:

q = db.GqlQuery("SELECT * FROM UserListOfSavedItems WHERE name = :1", user[0])

but that gets me a list of keys. I want to now take that list and get it into a query to get the str field out of UniqueListOfSavedItems. I thought I could do:

q2 = db.Gql("SELECT * FROM UniqueListOfSavedItems WHERE := str in q")

but something's not right…any ideas? Is it (am at my day job, so can't test this now):

q2 = db.Gql("SELECT * FROM UniqueListOfSavedItems __key__ := str in q)

side note: what a devilishly difficult problem to search on because all I really care about is the "IN" operator.

Best Solution

Since you have a list of keys, you don't need to do a second query - you can do a batch fetch, instead. Try this:

# saveditemkeys is the list of UniqueListOfSavedItems keys you already have;
# db.get fetches all of the items that a user saved in one batch call.
useritems = db.get(saveditemkeys)

(Note that you don't even need the guard clause: a db.get on 0 entities is short-circuited appropriately.)
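As a rough sketch of the step that produces the key list and hands it to a batch fetch: the helper name fetch_saved_items and its parameters are hypothetical, and the datastore calls are passed in as arguments so the example runs without the App Engine SDK. On App Engine, batch_get would be db.get, and key_of could plausibly be UserListOfSavedItems.str.get_value_for_datastore, which returns the stored key without dereferencing the reference.

```python
def fetch_saved_items(links, key_of, batch_get):
    """Fetch every item referenced by `links` with a single batch call.

    links     -- iterable of link entities (UserListOfSavedItems in the question)
    key_of    -- callable returning the referenced key for one link entity
    batch_get -- callable that takes a list of keys and returns the entities
                 (db.get on App Engine)
    """
    keys = [key_of(link) for link in links]
    # No guard clause here: an empty key list just produces an empty result.
    return batch_get(keys)


# Stand-in datastore so the sketch runs anywhere (hypothetical data).
FAKE_STORE = {"k1": "first saved item", "k2": "second saved item"}


def fake_get(keys):
    """Mimics a batch get: one call, results in the same order as the keys."""
    return [FAKE_STORE.get(k) for k in keys]


links = [{"str": "k1"}, {"str": "k2"}]
useritems = fetch_saved_items(links, lambda link: link["str"], fake_get)
```

The point of the shape is that the loop only gathers keys; the single batch_get call at the end is the only round trip for the items themselves.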

What's the difference, you may ask? Well, a db.get takes about 20-40ms. A query, on the other hand (GQL or not), takes about 160-200ms. But wait, it gets worse! The IN operator is implemented in Python, and translates to multiple queries, which are executed serially. So if you do a query with an IN filter for 10 keys, you're doing 10 separate 160ms-ish query operations, for a total of about 1.6 seconds of latency. A single db.get, in contrast, will have the same effect and take a total of about 30ms.
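The arithmetic behind that comparison, as a small sketch. The constants are the rough figures quoted above (the low end of the query range and the single figure given for db.get), not measurements, and the function names are made up for illustration.

```python
# Rough figures taken from the answer; they are estimates, not measurements.
QUERY_MS = 160      # low end of the quoted range for one datastore query
BATCH_GET_MS = 30   # quoted figure for a single db.get


def in_filter_latency_ms(num_keys, query_ms=QUERY_MS):
    """An IN filter runs one query per value, serially, so the costs add up."""
    return num_keys * query_ms


def batch_get_latency_ms(batch_get_ms=BATCH_GET_MS):
    """A batch get is a single call; the answer quotes about 30 ms for 10 keys."""
    return batch_get_ms


in_ms = in_filter_latency_ms(10)
get_ms = batch_get_latency_ms()
print("IN filter: ~%d ms, db.get: ~%d ms" % (in_ms, get_ms))
```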