FT.SEARCH O(n) query time

I have some questions about how the query engine works internally.

I changed my example to movies, but it can be applied to something else:

 ft.explaincli movie "@category:{1|2} @releaseYear:{2021} @isBlockbuster:{FALSE}" limit 0 60
 
  INTERSECT {
    TAG:@category {
      1
      2
    }
    TAG:@releaseYear {
      2021
    }
    TAG:@isBlockbuster {
      false
    }
   }

Is the order of query params important?

If, for example, I know that:

  1. category 1,2 would yield 3M rows
  2. releaseYear: 300K
  3. and isBlockBuster: 200
  • Would changing the position of the fields make any difference (placing the most limiting one first f.e.)?

  • Does RediSearch internally fetch a reference to all of these "tags" entries and need to loop over all of them to find the intersection?

  • Is there a way to have a combined index of fields for faster, less "heavy" queries?
    For example, a combined field entry in the inverted index, category_releaseYear_isBlockBuster, that would immediately point to the right doc ids,
    without the need for an intersection internally. I would probably have to create a new field myself if I want this behaviour?

  • Are there any other quick wins possible, such as extra grouping with () in the query, to make intersections behave differently or faster?

Thanks a lot!
Jayme

Hey @jrots

  • Definitely yes. We iterate over all the results of the first statement in the intersection, and for the others we use binary search to check whether they also contain the document. So if the first statement returns fewer results, the query should be faster. We are working to make this optimization automatic in future versions.
  • I think I answered this above: we loop only over the first statement; for the others we use binary search.
  • I guess you can do it yourself by defining such a field and storing the value with some separator (make sure not to use the TAG field separator, because you do not want RediSearch to tokenize it).
  • Notice that the first statement is an OR operation between category:1 and category:2, and OR is much less performant than AND. So if this can somehow be avoided, you might get even better performance. We have an optimization coming in a future version that should make OR performance much better.
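
To illustrate the first two points, here is a toy Python model of the behaviour described above: scan the first list, binary-search the others. It is only a sketch under the assumption that each clause yields a sorted list of doc ids; it is not RediSearch's actual code, and the function names and sample data are made up for the example.

```python
from bisect import bisect_left


def contains(sorted_ids, doc_id):
    """Binary search: True if doc_id is present in the sorted list."""
    i = bisect_left(sorted_ids, doc_id)
    return i < len(sorted_ids) and sorted_ids[i] == doc_id


def intersect(posting_lists):
    """Scan the FIRST list fully; probe every other list by binary search."""
    first, *others = posting_lists
    return [d for d in first if all(contains(o, d) for o in others)]


def intersect_smallest_first(posting_lists):
    """Same result, but the shortest list drives the scan."""
    return intersect(sorted(posting_lists, key=len))


# Hypothetical doc-id lists standing in for the three clauses.
category = list(range(0, 3000))          # least selective
release_year = list(range(0, 3000, 10))
blockbuster = [0, 30, 55, 500]           # most selective

slow = intersect([category, release_year, blockbuster])       # scans 3000 ids
fast = intersect_smallest_first([category, release_year, blockbuster])  # scans 4 ids
```

Both calls return the same documents; only the number of ids scanned differs. In query terms this corresponds to writing the most selective clause first, e.g. "@isBlockbuster:{FALSE} @releaseYear:{2021} @category:{1|2}".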

Makes sense, thanks a lot for your answers!
I will start by making a smarter combined tag field and putting that one first.
Regards
Jayme
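
A minimal sketch of how such a combined tag value could be built on the application side, assuming "_" as the joining character (so it differs from the default TAG separator, which is a comma) and a hypothetical field name catYearBlockbuster. Neither the field name nor the helper functions come from RediSearch; they are only for illustration.

```python
COMBINED_FIELD = "catYearBlockbuster"  # hypothetical field name


def combined_tag(category, release_year, is_blockbuster, sep="_"):
    """Build one tag value from the three fields.

    sep must not be the TAG field separator, otherwise the value
    would be split into several tags at indexing time.
    """
    return sep.join([str(category), str(release_year), str(is_blockbuster).lower()])


def combined_query(categories, release_year, is_blockbuster):
    """Build a single-clause tag query covering every requested category."""
    tags = [combined_tag(c, release_year, is_blockbuster) for c in categories]
    return "@%s:{%s}" % (COMBINED_FIELD, "|".join(tags))


value_to_store = combined_tag(1, 2021, False)
query = combined_query([1, 2], 2021, False)
print(value_to_store)
print(query)
```

The value would be written into the extra field when the document is stored, and the query string passed to FT.SEARCH. The query still contains an OR over the two categories, but each alternative already points at the narrow combined posting list rather than at the full category list.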