I read a cool thread on mid-word querying via the RediSearch autocomplete engine: Complex querying
I am trying to achieve this but I think my current solution is inefficient and was wondering how you guys would make it work.
Let’s stick to the original post and have a phrase: star wars trilogy
I create every permutation of its words and index each one as a suggestion, with the payload set to “star wars trilogy”.
In my index I will have:
star wars trilogy
wars star trilogy
trilogy star wars
star trilogy wars
wars trilogy star
trilogy wars star
This works fine, as it will suggest "star wars trilogy" for whichever word we enter, but as you can see, with a longer phrase the permutation count quickly adds up and we need to index hundreds of thousands of suggestions.
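To put a number on that growth, here is a minimal Python sketch that generates the permutations of a phrase and counts them. The helper name `phrase_permutations` is made up for illustration, and it assumes the phrase is simply split on whitespace; it is not how the autocompleter itself works.

```python
from itertools import permutations
from math import factorial


def phrase_permutations(phrase):
    """Return every word ordering of the phrase (hypothetical helper)."""
    words = phrase.split()
    return [" ".join(order) for order in permutations(words)]


variants = phrase_permutations("star wars trilogy")
print(len(variants))  # 6

# A phrase of n distinct words needs n! suggestions.
for n in range(3, 10):
    print(n, factorial(n))
```

With 9 distinct words a single phrase already needs 362880 entries, which is where the "hundreds of thousands" comes from.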
Looking forward to your ideas, thanks in advance!
PS: I am trying to imitate one of Algolia’s solutions, e.g. Algolia Places, where you can see that the order of the entered words doesn’t matter and you still get autocomplete results.
Can you explain the use case a bit more? I do not see why you would not just use a normal index with a TEXT field, which will tokenize your text into words so that you can search each word by prefix:
127.0.0.1:6379> FT.CREATE idx SCHEMA s TEXT
OK
127.0.0.1:6379> ft.add idx doc1 1.0 FIELDS s "star wars trilogy"
OK
127.0.0.1:6379> FT.SEARCH idx sta*
1) (integer) 1
2) "doc1"
3) 1) "s"
   2) "star wars trilogy"
127.0.0.1:6379> FT.SEARCH idx tril*
1) (integer) 1
2) "doc1"
3) 1) "s"
   2) "star wars trilogy"
The use case is basically the same as in the link above: an autocomplete input box where searching with a typo, e.g. triol, would still return the suggestion star wars trilogy.
The prefix solution you mentioned works perfectly as long as there is no typo, so I started using the autocompleter, as it had the fuzzy prefix search functionality.
Previously I tried using the full-text search engine with queries something like these:
%triol*%
triol*
%triol%
but of course the first is an invalid query, and the second and third return nothing because of the typo.
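For reference, a small sketch of why the fuzzy form misses here. It assumes (my reading of the RediSearch query syntax, not something stated in this thread) that `%term%` matches whole indexed terms within Levenshtein distance 1 of `term`. The `levenshtein` function is a plain textbook implementation written for illustration, not RediSearch's own code.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


print(levenshtein("triol", "trilogy"))  # 3
```

The typed fragment is compared against the whole word, so the missing letters count as edits on top of the typo itself.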
@meirsh Thanks for the suggestion, I see, I think I overcomplicated this a little bit.
Anyway, if the word is a long one, this would still return 0 results. E.g. given an indexed value: International Man of Mystery
Querying intr would most likely fail because of the high LD (Levenshtein distance), so I came up with this idea: if the input is shorter than 4 characters I use the * prefix search, and beyond that I use the LD match you mentioned.
Not sure if it's a good solution, but it seems to work for now :D
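The length-based switch described above could be sketched roughly like this in Python. The names `build_query`, `min_fuzzy_len` and `distance` are invented for illustration, and the sketch assumes the RediSearch fuzzy syntax of one `%` pair per unit of Levenshtein distance (up to 3, as far as I know). It does no escaping of special characters and never mixes `%` with `*`, since that combination was reported as invalid earlier in the thread.

```python
def build_query(term, min_fuzzy_len=4, distance=1):
    """Pick a prefix query for short input, a fuzzy query otherwise.

    Hypothetical helper: the threshold of 4 follows the post above.
    """
    term = term.strip().lower()
    if not term:
        raise ValueError("empty search term")
    if not 1 <= distance <= 3:
        raise ValueError("distance must be between 1 and 3")
    if len(term) < min_fuzzy_len:
        return term + "*"          # prefix search, e.g. tri*
    marks = "%" * distance
    return marks + term + marks    # fuzzy search, e.g. %triol%


print(build_query("tri"))    # tri*
print(build_query("triol"))  # %triol%
```

The returned string would then be passed as the query argument of FT.SEARCH idx, the same way as the sta* and tril* examples earlier in the thread.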