Filtering by float values of a numeric field is inconsistent

I have an index with a numeric field time_QUEUED.

> ft.search /h/t/idx * sortby time_QUEUED return 1 time_QUEUED
 1) (integer) 5
 2) "/h/t/d/922daf3d-04ab-41a0-b72e-93ac469cf3c1"
 3) 1) "time_QUEUED"
 2) "1583500901.55"
 4) "/h/t/d/eeba5082-4e33-48ff-a937-5f1a31069bf0"
 5) 1) "time_QUEUED"
 2) "1583501038.48"
 6) "/h/t/d/5faebed5-24d1-49c5-8fd9-bb4f76d811ac"
 7) 1) "time_QUEUED"
 2) "1583501038.47"
 8) "/h/t/d/343e2d27-1937-436d-9c1f-b110833d30bc"
 9) 1) "time_QUEUED"
 2) "1583501038.62"
10) "/h/t/d/db3b0302-a9b1-4b1a-ac43-f74b9bb7c256"
11) 1) "time_QUEUED"
 2) "1583501038.63"

Now I expect this to return the last record:

> ft.search /h/t/idx "@time_QUEUED:[1583501038.63 +inf]"
1) (integer) 0

And this to return the last 2 records:

> ft.search /h/t/idx "@time_QUEUED:[1583501038.62 +inf]"
1) (integer) 0
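
For reference, the results expected from these two queries can be sketched in plain Python, assuming the numeric range filter is an inclusive comparison on the stored float values. The `in_range` helper and the `docs` list are illustrative names only, not part of RediSearch; the values are copied from the first output above.

```python
import math

# time_QUEUED values copied from the first FT.SEARCH output above.
docs = [1583500901.55, 1583501038.48, 1583501038.47, 1583501038.62, 1583501038.63]

def in_range(values, lo, hi=math.inf):
    """Return the values v with lo <= v <= hi, sorted ascending.

    Assumes [lo hi] in the query means inclusive bounds on both sides.
    """
    return sorted(v for v in values if lo <= v <= hi)

print(in_range(docs, 1583501038.63))  # [1583501038.63]
print(in_range(docs, 1583501038.62))  # [1583501038.62, 1583501038.63]
```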

I only start getting results when the lower bound of the range drops to the integer part (floor) of the stored values:

> ft.search /h/t/idx "@time_QUEUED:[1583501038.0 +inf]" return 1 time_QUEUED
1) (integer) 4
2) "/h/t/d/db3b0302-a9b1-4b1a-ac43-f74b9bb7c256"
3) 1) "time_QUEUED"
 2) "1583501038.63"
4) "/h/t/d/eeba5082-4e33-48ff-a937-5f1a31069bf0"
5) 1) "time_QUEUED"
 2) "1583501038.48"
6) "/h/t/d/343e2d27-1937-436d-9c1f-b110833d30bc"
7) 1) "time_QUEUED"
 2) "1583501038.62"
8) "/h/t/d/5faebed5-24d1-49c5-8fd9-bb4f76d811ac"
9) 1) "time_QUEUED"
 2) "1583501038.47"

A float lower bound works too, as long as it is less than the floor of the stored values:

> ft.search /h/t/idx "@time_QUEUED:[1583501037.99 +inf]" return 1 time_QUEUED
1) (integer) 4
2) "/h/t/d/db3b0302-a9b1-4b1a-ac43-f74b9bb7c256"
3) 1) "time_QUEUED"
 2) "1583501038.63"
4) "/h/t/d/eeba5082-4e33-48ff-a937-5f1a31069bf0"
5) 1) "time_QUEUED"
 2) "1583501038.48"
6) "/h/t/d/343e2d27-1937-436d-9c1f-b110833d30bc"
7) 1) "time_QUEUED"
 2) "1583501038.62"
8) "/h/t/d/5faebed5-24d1-49c5-8fd9-bb4f76d811ac"
9) 1) "time_QUEUED"
 2) "1583501038.47"

Am I using a wrong syntax? Does it work like this by design? Is there a workaround to do an exact search?

Thanks

Hey Yuriy,

Can you please provide an rdb or steps to reproduce the issue?
I suspect you are encountering this issue: https://github.com/RediSearch/RediSearch/issues/1098, but I want to make sure.

Thanks

@Meir, I did more testing today and I hope the reason was that time_QUEUED field was not SORTABLE. I could not reproduce it after making it SORTABLE, although I did not try too hard.
Let me carry on with my changes and get back here if I see it happening again.

I saw https://github.com/RediSearch/RediSearch/issues/1098 earlier and it was the reason I started testing my indexes last week - because I also use epoch time as a sortable float field. But searches work fine in my tests except for the problem I reported which now seems to me a false positive.

Having said that, I still suspect there is a bug that causes non-sortable float fields to be filtered wrongly when the values share the same integer part. But I could not yet reproduce it on a separate non-sensitive dataset. Will update here if I get more info.

I keep having the issue I reported originally. SORTABLE was a red herring: it does not help.
Weirdly and very unfortunately, it disappears when the same data is reloaded from rdb. I.e. the filtering does not work, then I restart redis without any data changes and the same FT.SEARCH query works as expected.

Attaching the rdb, but it’s probably pointless. Just in case it gives you some hints on the possible cause of the issue. Unfortunately I can’t send you my whole codebase and I still could not reproduce it with a simpler code snippet.

This search returned all 10 records before the redis restart (as if searching with “*”), but returns only 6 after the restart, which is correct.

"FT.SEARCH" "/h/t/idx" "@time_QUEUED:[-inf 1583932226.83]" "SORTBY" "time_QUEUED" "DESC" "LIMIT" "0" "26" "RETURN" 1 "time_QUEUED"

I will look for a workaround for now (maybe converting the time to an integer field assuming this helps?), but will keep you posted if I find a better way to reproduce.
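
A minimal sketch of that integer workaround, assuming the timestamps are stored as whole milliseconds instead of float seconds. Whether integers actually avoid the filtering problem has not been verified in this thread, and the helper names `to_epoch_millis` and `range_query` are hypothetical.

```python
def to_epoch_millis(ts: float) -> int:
    """Convert epoch seconds (float) to whole milliseconds, rounding to nearest."""
    return round(ts * 1000)

def range_query(field: str, lo_ms: int, hi_ms: int) -> str:
    """Build a numeric range filter string with integer bounds."""
    return f"@{field}:[{lo_ms} {hi_ms}]"

ms = to_epoch_millis(1583932226.825489)
print(ms)                                   # 1583932226825
print(range_query("time_QUEUED", 0, ms))    # @time_QUEUED:[0 1583932226825]
```

The same conversion would have to be applied both when writing the field and when building query bounds, so that both sides use the same unit.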

dump.rdb (14.3 KB)

BTW, I wonder if this is related to the fact that float numbers in the index are truncated compared to the original field value.
E.g. in the db I attached earlier:

"FT.SEARCH" "/h/t/idx" "@time_QUEUED:[-inf 1583932226.83]" "SORTBY" "time_QUEUED" "DESC" "LIMIT" "0" "1" "RETURN" 1 "time_QUEUED"
1) (integer) 6
2) "/h/t/d/e2ad3448-7f9a-4fb8-b6be-710d35d880c5"
3) 1) "time_QUEUED"
 2) "1583932226.83"

but:

HGET "/h/t/d/e2ad3448-7f9a-4fb8-b6be-710d35d880c5" time_QUEUED
"1583932226.825489"
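
The shortened value is at least consistent with number formatting limited to 12 significant digits (as with C's `%.12g`). That RediSearch formats returned numeric values this way is an assumption here, not something confirmed in this thread. A quick check in Python, which supports the same `g` format:

```python
raw = 1583932226.825489  # value stored in the hash (from the HGET above)

# Format with 12 significant digits: 10 digits before the decimal point
# leave room for only 2 after it.
shown = format(raw, ".12g")
print(shown)  # 1583932226.83
```

Note that under this assumption the value is rounded rather than cut off, which would explain why the index shows .83 and not .82.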

The documents have been added to the index from a LUA script using FT.ADDHASH.

Sorry Yuriy, but I want to make sure I understand correctly: are you saying that reloading from the rdb fixes the issue, or causes it?
If it causes the issue, then analyzing the rdb will help; I just need you to give me the query that you believe does not return the correct results.

Reloading from rdb fixes the issue :(
The query is 2 posts above where the rdb is attached.

I’m now trying a different approach: log everything with MONITOR and replay it. It is not an exact replay, because I need to remove the blocking BRPOPs. But after removing the BRPOPs and replaying, it’s fixed again!
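
Roughly, the filtering step looks like the sketch below. It assumes the usual MONITOR line layout (timestamp, then `[db client]`, then double-quoted arguments) and uses `shlex` as an approximation of Redis quoting, so arguments with binary escapes may not round-trip exactly. The names `BLOCKING`, `parse_monitor_line` and `replayable_commands` are illustrative only.

```python
import shlex

# Commands dropped before replay because they would block the replaying client.
BLOCKING = {"BRPOP", "BLPOP", "BRPOPLPUSH"}

def parse_monitor_line(line):
    """Return the command arguments of one MONITOR line, or None if it has no prefix."""
    # Everything up to the first "] " is the timestamp and [db client] prefix.
    _, sep, rest = line.partition("] ")
    if not sep:
        return None  # e.g. the initial "OK" line
    return shlex.split(rest)

def replayable_commands(lines):
    """Yield argument lists for every logged command that is not blocking."""
    for line in lines:
        args = parse_monitor_line(line.strip())
        if args and args[0].upper() not in BLOCKING:
            yield args

log = [
    'OK',
    '1583932226.825489 [0 127.0.0.1:50000] "BRPOP" "queue" "0"',
    '1583932226.900000 [0 lua] "FT.SEARCH" "/h/t/idx" "@time_QUEUED:[-inf 1583932226.83]"',
]
commands = list(replayable_commands(log))
```

Each yielded list could then be sent to a fresh Redis instance with any client that accepts raw command arguments.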

So I’m out of ideas for now. Apparently it’s related to the exact sequence of events, possibly to specific timings. So far it looks most like a concurrency bug. Possibly RediSearch concurrency?

@Meir, would it help if I extract some low-level data from the index when it’s in the faulty state? Is it possible?

Thanks Yuriy,

I will take a look at the RDB to see if I can reproduce and understand the issue.

Thanks. When you have time, could you please also check my question above about the float values in the index?
Are they expected to be truncated? When I use e.g. FT.GET, RediSearch returns the full untruncated field.

Can it be related to the filtering issue?

Thought to ask this in a separate topic, but it would be difficult to explain without the context.

Thanks

I will check that as well, Yuriy. Is it possible to open a GitHub issue with all the details (including the rdb to reproduce) so we will have it all arranged in one place?

Thanks.

Meir, I thought we already have it pretty much in one place. Not sure how much I can improve it by copying the same details and the attachment to GitHub.
Would it work if I create 2 github issues with references to specific posts above? This would clearly separate the filter and float precision problems.

Yuriy, GitHub is where we put all the issues. I do not want this issue to be missed, which is why I am asking.
Opening issues with a link to this discussion is also fine.

@Meir, thanks for following up on this. I still did not find a way to reproduce this apart from running our exact application code consisting of several processes calling half of redis commands from LUA concurrently. As I said, replay of the same commands captured by MONITOR does not recreate the bug.

Hence, I have not created a GitHub issue, to avoid wasting your time on something that even I struggle to reproduce with sample data. I will keep trying when I have time and will register it when I have more info. For now I just have to avoid relying on searchable timestamp fields. It would help if RediSearch had a min/max filter on a text field: I would then use a text field instead of a numeric one.