How to find index entries whose underlying documents expired?

I add existing documents to an index using FT.ADDHASH.
The documents expire after a certain timeout, let’s say 2 hours. My goal is now to clean up the index by deleting the entries of documents which expired.

Is there any way to search for documents which no longer exist? If I run a “wildcard” search:

FT.SEARCH my_index *

For expired documents, it returns only IDs, but no other properties. Is there a way to select all such documents?

Funny you should ask about this capability: we are currently developing the ability for RediSearch to follow hashes, which means that when a hash is deleted, it will be removed from the index as well. We hope to have it available in a future major release.

Until then, I suggest you check out RedisGears (https://oss.redislabs.com/redisgears/). Using RedisGears, you can define a registration (a kind of rule) that is triggered each time a key expires or is deleted or renamed, and you can update the index accordingly. And it's all doable in Python.

Interesting, it will be great to have it built-in, thanks.

I will have a look at RedisGears. More likely, for now I will just run a regular check of the oldest entries in the index and remove those whose document hashes no longer exist. I initially thought I would use a field containing an epoch timestamp for this (then I could calculate the expiry time fairly accurately), but I am reluctant to rely on it now after observing https://groups.google.com/forum/#!topic/redisearch/V5h3k69T5Qw.
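
For reference, a minimal sketch of such a periodic check. It makes several assumptions that are not from this thread: the client is any object with a redis-py style `execute_command` method, index document IDs are identical to the hash key names (as with FT.ADDHASH), and the function names `find_stale_ids` and `purge_stale` are hypothetical. For simplicity it pages through the whole index rather than only the oldest entries.

```python
def find_stale_ids(client, index, page_size=100):
    """Return IDs that are still in the index but whose hash key is gone.

    `client` is assumed to expose execute_command(*args), as redis-py does.
    """
    stale = []
    offset = 0
    while True:
        # NOCONTENT makes FT.SEARCH reply with the total count followed
        # by document IDs only, without loading any fields.
        reply = client.execute_command(
            "FT.SEARCH", index, "*", "NOCONTENT", "LIMIT", offset, page_size
        )
        ids = reply[1:]  # reply[0] is the total number of matches
        if not ids:
            break
        for doc_id in ids:
            # Assumption: the document ID is also the name of the hash key.
            if not client.execute_command("EXISTS", doc_id):
                stale.append(doc_id)
        offset += len(ids)
    return stale


def purge_stale(client, index, page_size=100):
    """Delete index entries whose underlying hash no longer exists."""
    # Collect first, delete afterwards, so that deletions do not shift
    # the paging offsets while the index is still being scanned.
    stale = find_stale_ids(client, index, page_size)
    for doc_id in stale:
        client.execute_command("FT.DEL", index, doc_id)
    return stale
```

With redis-py installed, something like `purge_stale(redis.Redis(), "my_index")` run from a cron job or a background thread would be one way to use it. A hash could still expire between the scan and the next run, so this only bounds how long a stale entry can stay in the index.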