ndb Query partial string matching

mehulkar

This seems like it should be an easy question. But the docs don't seem to answer it. Using the example from them, I want to do this:

Account.query(Account.title == "best")

Except I want to match partial strings as well. So in this scenario:

acct = Account(title="the best account in the world")

an ndb query with argument "best" would match the acct.

The only option I see at the moment is to loop through Account.query() and match each title with re.search in Python. This doesn't seem like a good solution.

Update: I am also looking at GQL. Doing this:

acct = ndb.gql("SELECT * FROM Account WHERE title LIKE '%best%'")

returns a Parse Error: Invalid WHERE Condition at symbol LIKE

google-app-engine app-engine-ndb djangoappengine

Answers

answered 6 years ago Shay Erlichmen #1

GQL doesn't support wildcard matching. To achieve that, you will need to use full text search.

answered 6 years ago Greg #2

For a (presumably) short field like a title, adding a repeated StringProperty that contains each word of the title (ignoring stop words, maybe) would allow you to match on words, and would be simpler than using the search API.
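A minimal sketch of how the word list for such a repeated property might be built. The helper name title_words and the stop-word list are assumptions made for illustration, not part of ndb. Note that this matches whole words only, so a query for "bes" would not find "best".

```python
import re

# Hypothetical stop-word list; tune it for your own data.
STOP_WORDS = {"the", "a", "an", "in", "of", "and"}

def title_words(title):
    """Return the distinct lower-cased words of a title, minus stop words."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return sorted(set(words) - STOP_WORDS)

# With ndb, this list would be stored in a repeated property, e.g.
#   title_words = ndb.StringProperty(repeated=True)
# and matched with an equality filter such as:
#   Account.query(Account.title_words == "best")
print(title_words("the best account in the world"))
```

The list would have to be recomputed whenever the title changes, for example before each put().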

answered 3 years ago Kris Subramanian #3

NOTE: This does not exactly answer the question, but someone looking for a "starts with" (prefix) match might find this answer useful.

NDB's String field is indexed in a way that lets you filter with greater than or equal to (>=) and less than (<). Assuming the following Person model:

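from google.appengine.ext import ndb

class Person(ndb.Model):
    name = ndb.StringProperty()
    # Lower-cased copy of name, stored and indexed so queries can be case-insensitive.
    name_lower = ndb.ComputedProperty(lambda self: self.name.lower())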

You can do the following:

def search_by_text(text):
    text = text.lower()
    # Exclusive upper bound: the prefix with its last character incremented ('kri' -> 'krj').
    limit = text[:-1] + chr(ord(text[-1]) + 1)
    return Person.query(Person.name_lower >= text, Person.name_lower < limit).fetch(50)

p = search_by_text('kri')

The limit variable in this example will contain the string 'krj', which becomes the exclusive upper bound of the search. The query returns all the people whose lower-cased name is greater than or equal to kri but less than krj, capped at the first 50 results. Because of the upper bound, names like kross and lark will be filtered out.
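To see why the two bounds select exactly the names that start with the prefix, here is a small pure-Python sketch of the same comparison, run against an in-memory list instead of the datastore. The helper names prefix_bounds and matches_prefix are made up for this illustration, and the sketch assumes a non-empty search text.

```python
def prefix_bounds(text):
    """Return (lower, upper) such that lower <= s < upper holds for strings starting with text."""
    text = text.lower()
    # Increment the last character to get the exclusive upper bound.
    upper = text[:-1] + chr(ord(text[-1]) + 1)
    return text, upper

def matches_prefix(names, text):
    """Apply the same range comparison the query uses, on lower-cased names."""
    lower, upper = prefix_bounds(text)
    return [n for n in names if lower <= n.lower() < upper]

names = ["Kris", "kristen", "kross", "lark", "krj"]
print(prefix_bounds("kri"))          # ('kri', 'krj')
print(matches_prefix(names, "kri"))  # ['Kris', 'kristen']
```

Here kross and lark sort after krj, and krj itself fails the strict less-than test, so only the names beginning with kri remain.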

Note: it's important that you have an ndb.ComputedProperty to contain a lower case version of the field you want to search on. Don't forget to add that!
