Recommendation needed FT.SEARCH vs HGET for multiple records

paulflo · January 4, 2021, 2:02pm

Hi,
I have a use case in which I am executing an FT.AGGREGATE which then returns a set of records that have a “Foreign Key” in different hashes. What would be the best way to retrieve these other records so that I can join them in my code for 5 records? What about for 1000?

1. Use a loop and execute HGETALL for each FK?
2. Create a TAG on the ID field and use FT.SEARCH to retrieve them by this tag?

In Redis there is currently no way to retrieve multiple hashes at the same time, is there? Something like MGETALL hash1 hash2, etc.

An example would be a collection of Flights, which have Airports and Aircraft. In my flight hash I am only storing AirportID and AircraftID, and once I have retrieved the flights I need, I also have to attach the Airports to them.

Thanks!

meirsh · January 4, 2021, 2:13pm

Hey @paulflo

So if your foreign key is the key name itself, you can run HGETALL on each key in a pipeline, which should make it very fast. Another option is to use Lua and do the FT.AGGREGATE + HGETALL inside the Lua script, returning only the airports. The downside of this approach is that it will not work on a cluster. If you want to do something similar on a cluster, you can use RedisGears (a programmable engine for data processing in Redis).
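To make the pipeline option concrete, here is a minimal sketch in Python. It assumes a redis-py style client (anything exposing `pipeline(transaction=False)`, `hgetall`, and `execute`), and it assumes airports are stored under keys like `airport:<AirportID>`; that key prefix and the helper names `fetch_hashes` and `attach_airports` are made up for illustration and are not from your schema.

```python
from typing import Any, Dict, List


def fetch_hashes(client: Any, keys: List[str]) -> Dict[str, Dict[Any, Any]]:
    """Queue one HGETALL per key on a pipeline and send them together."""
    # transaction=False asks for plain pipelining, without MULTI/EXEC.
    pipe = client.pipeline(transaction=False)
    for key in keys:
        pipe.hgetall(key)
    # Replies come back in the same order the commands were queued.
    results = pipe.execute()
    return dict(zip(keys, results))


def attach_airports(
    client: Any,
    flights: List[Dict[str, Any]],
    prefix: str = "airport:",  # hypothetical key prefix, adjust to your schema
) -> List[Dict[str, Any]]:
    """Join each flight with its airport hash, fetching each airport once."""
    # De-duplicate the foreign keys so each airport is requested only once.
    airport_ids = sorted({flight["AirportID"] for flight in flights})
    airports = fetch_hashes(client, [prefix + aid for aid in airport_ids])
    return [
        dict(flight, Airport=airports[prefix + flight["AirportID"]])
        for flight in flights
    ]
```

With redis-py this would be called as something like `attach_airports(redis.Redis(decode_responses=True), flights)`, where `flights` is the list of rows you parsed out of the FT.AGGREGATE reply. De-duplicating the IDs first matters when many flights share the same airport.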

Let me know which option you want to go with and I can help you further.

paulflo · January 4, 2021, 2:38pm

Thank you @meirsh!

I am using C# (the StackExchange.Redis library) and I went ahead and implemented pipelining/batching, as it seemed to be the easiest of the options provided, and it seems to work much better! So this should be much more efficient than using FT.SEARCH with tags, right?

My dev environment is running in the cloud as well, so it's a bit difficult to test performance at the moment. Is there a recommended number of records per batch I should not go over? I am assuming sending something like 100k at the same time would likely be a bad idea?

Thank you again.

meirsh · January 4, 2021, 5:29pm

Yes, it should be the fastest: it's direct access to a hash table instead of a search inside an inverted index.

Regarding pipeline size, definitely not 100K. It also depends on the network, the hardware, and the size of the results (such as how many fields are in each hash and the size of each field). You will have to test and see what gives the best results.
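One way to experiment with this is to split the keys into fixed-size batches and run one pipeline per batch, then time a few different sizes. The sketch below is in Python and assumes a redis-py style client; the helper names and the default `batch_size=500` are arbitrary placeholders to tune against your own network and data, not a recommendation.

```python
from typing import Any, Dict, Iterator, List


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_hashes_in_batches(
    client: Any,
    keys: List[str],
    batch_size: int = 500,  # arbitrary starting point; measure and adjust
) -> Dict[str, Dict[Any, Any]]:
    """Run HGETALL for every key, one pipeline round trip per batch."""
    found: Dict[str, Dict[Any, Any]] = {}
    for batch in chunked(keys, batch_size):
        # Plain pipelining, no MULTI/EXEC wrapper.
        pipe = client.pipeline(transaction=False)
        for key in batch:
            pipe.hgetall(key)
        # Replies are returned in the order the commands were queued.
        found.update(zip(batch, pipe.execute()))
    return found
```

Capping the batch size bounds how many replies have to be buffered at once on both the client and the server, which is the main reason a single 100K-command pipeline is a bad idea.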
