Erroneous search results?

I’m getting weird search results for very simple queries in my project:

A simple query of:

FT.SEARCH myindex "olla @post_type:scheduled_chat_appointment" RETURN 2 post_type post_id

returns:

1) (integer) 4
2) "92410"
3) 1) post_type
   2) "test"
   3) post_id
   4) "92410"
4) "scheduled_chat_5d8379a64b93b"
5) 1) post_type
   2) "scheduled_chat_appointment"
   3) post_id
   4) "scheduled_chat_5d8379a64b93b"
6) "scheduled_chat_5d7f5f1a87f58"
7) 1) post_type
   2) "scheduled_chat_appointment"
   3) post_id
   4) "scheduled_chat_5d7f5f1a87f58"
8) "scheduled_chat_5d7f5d83df1ff"
9) 1) post_type
   2) "scheduled_chat_appointment"
   3) post_id
   4) "scheduled_chat_5d7f5d83df1ff"

At the same time:

FT.SEARCH myindex "@post_id:92410 @post_type:test" RETURN 2 post_id post_type

returns:

1) (integer) 1
2) "92410"
3) 1) post_id
   2) "92410"
   3) post_type
   4) "test"

and:

FT.SEARCH myindex "@post_id:92410 @post_type:scheduled_chat_appointment" RETURN 2 post_id post_type

returns:

1) (integer) 0

I can’t figure out why the first query returns doc id 92410, whose post_type is clearly different from the one specified in the query.

Can you please describe the reproduction steps, the FT.CREATE command you used to create the index, and the RediSearch version you are using?

The creation command is quite long:

FT.CREATE myindex SCHEMA post_title TEXT WEIGHT 5 SORTABLE post_name TEXT WEIGHT 1 post_content TEXT WEIGHT 1 post_type TEXT WEIGHT 1 SORTABLE post_excerpt TEXT WEIGHT 2 post_author TEXT WEIGHT 1 post_author_id NUMERIC post_id TEXT WEIGHT 1 SORTABLE menu_order NUMERIC SORTABLE post_status TEXT WEIGHT 1 SORTABLE post_date NUMERIC SORTABLE post_parent TEXT WEIGHT 1 SORTABLE search_index TEXT WEIGHT 1 taxonomy_category TAG SEPARATOR * taxonomy_id_category TAG SEPARATOR * taxonomy_post_tag TAG SEPARATOR * taxonomy_id_post_tag TAG SEPARATOR * taxonomy_nav_menu TAG SEPARATOR * taxonomy_id_nav_menu TAG SEPARATOR * taxonomy_link_category TAG SEPARATOR * taxonomy_id_link_category TAG SEPARATOR * taxonomy_post_format TAG SEPARATOR * taxonomy_id_post_format TAG SEPARATOR * taxonomy_language TAG SEPARATOR * taxonomy_id_language TAG SEPARATOR * taxonomy_post_translations TAG SEPARATOR * taxonomy_id_post_translations TAG SEPARATOR * taxonomy_term_language TAG SEPARATOR * taxonomy_id_term_language TAG SEPARATOR * taxonomy_term_translations TAG SEPARATOR * taxonomy_id_term_translations TAG SEPARATOR * taxonomy_target_group TAG SEPARATOR * taxonomy_id_target_group TAG SEPARATOR * taxonomy_topic TAG SEPARATOR * taxonomy_id_topic TAG SEPARATOR * taxonomy_forum TAG SEPARATOR * taxonomy_id_forum TAG SEPARATOR * taxonomy_qa_forum TAG SEPARATOR * taxonomy_id_qa_forum TAG SEPARATOR * taxonomy_location TAG SEPARATOR * taxonomy_id_location TAG SEPARATOR * taxonomy_media_category TAG SEPARATOR * taxonomy_id_media_category TAG SEPARATOR * active NUMERIC SORTABLE primary_topic TEXT WEIGHT 1 primary_target_group TEXT WEIGHT 1


We are running the official Docker image of version 1.6.2.

On Monday, 23 September 2019 at 9:29:28 UTC+2, Meir Shpilraien wrote:

Several suggestions:

  1. Try to run FT.EXPLAIN (or FT.EXPLAINCLI) on your query.

  2. Ensure that the document was not modified outside of RediSearch. Are you only using FT.ADD, or are you using FT.ADDHASH and then modifying the document?

  3. Can you try this with the 1.4 images? Do you get the same results?

If the RDB is small enough, can you attach it to the issue?

  1. Here’s the explanation:

INTERSECT {
  UNION {
    olla
    +olla(expanded)
  }
  @post_type:UNION {
    @post_type:scheduled_chat_appointment
    @post_type:+scheduled_chat_appoint(expanded)
    @post_type:scheduled_chat_appoint(expanded)
  }
}
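Read as a predicate, the plan above says a document must satisfy both branches of the INTERSECT, so a document whose post_type is "test" should never be returned. The following is a minimal Python sketch of that reading. It is a toy evaluator, not RediSearch code: the tuple-based node format and the `matches` helper are made up for illustration, the stemmed `(expanded)` variants are left out, tokens are split on whitespace only, and the `post_content` values are hypothetical (only the `post_type` values come from the thread).

```python
# Toy evaluator for the plan shape printed by FT.EXPLAIN; not RediSearch code.
# A node is ("TERM", field_or_None, token), ("UNION", [children])
# or ("INTERSECT", [children]). This node format is an assumption.

def matches(node, doc):
    kind = node[0]
    if kind == "TERM":
        _, field, token = node
        # An unscoped term (field is None) may match in any field.
        fields = [field] if field is not None else list(doc)
        return any(token in doc.get(f, "").lower().split() for f in fields)
    if kind == "UNION":
        return any(matches(child, doc) for child in node[1])
    if kind == "INTERSECT":
        return all(matches(child, doc) for child in node[1])
    raise ValueError(f"unknown node kind: {kind}")

# Simplified plan for: olla @post_type:scheduled_chat_appointment
plan = ("INTERSECT", [
    ("UNION", [("TERM", None, "olla")]),
    ("UNION", [("TERM", "post_type", "scheduled_chat_appointment")]),
])

# post_content values are hypothetical; only post_type comes from the thread.
doc_test = {"post_type": "test", "post_content": "olla"}
doc_chat = {"post_type": "scheduled_chat_appointment", "post_content": "olla"}

print(matches(plan, doc_test))  # False
print(matches(plan, doc_chat))  # True
```

Under this reading the plan itself looks correct, which suggests the unexpected hit comes from how the plan is executed rather than from how the query is parsed.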

  2. They are only created using FT.ADD. We have some functions that use FT.ADDHASH in our application, but this happens whether or not it has been used.

  3. We can’t; we use some of the 1.6-only features (mainly INFIELDS with FT.AGGREGATE), and this is actually in production already.

On Monday, 23 September 2019 at 13:29:49 UTC+2, Mark Nunberg wrote:

This clearly seems like a bug. Does it happen with other documents/queries too? Are you using replace/update/delete operations?
Is “test” an actual post type, or is it just a test entry? Does it show up if you search for scheduled_chat_appointment without olla?

By the way, the latest tag is 1.6.3, not that I think it should affect your issue.

I’d also suggest opening a ticket on GitHub, as it will make it easier for us to track.

Regards

Yeah, it seems to happen with other documents and queries as well, and also with other post_types. We are using replace operations, but they don’t affect this problem: it appears even when no replace operations have been made. “Test” is an actual post type, just named test. And yeah, it shows up without the keyword as well. I can try updating to 1.6.3, but I can’t do it right now, as the deploy process means downtime, which isn’t possible during the workday.

On Wednesday, 25 September 2019 at 9:43:04 UTC+2, Mark Nunberg wrote:

Is this a clean RDB, or was it upgraded from an older version? Is it possible to provide an RDB?

Miika, is it possible to get an RDB to check it out?

We have over 10 000 quite big documents in the database, so it’d be very difficult. I just ran some tests on this in a separate environment, indexing our content from scratch, and it didn’t affect the results. 1.6.1, 1.6.2 and 1.6.3 all have the same bug. I also ran 1.4.16 against the same database, and it returns the right results. So there clearly is a bug in 1.6.x.

On Wednesday, 25 September 2019 at 10:18:37 UTC+2, Meir Shpilraien wrote:

The content in the database is also not ours but our client’s.

On Wednesday, 25 September 2019 at 10:46:48 UTC+2, Miika Arponen wrote:

After doing some tests I could reproduce the issue using the following steps:

127.0.0.1:6379> FT.CREATE idx SCHEMA test1 TEXT test2 TEXT

OK

127.0.0.1:6379> FT.ADD idx doc1 1.0 FIELDS test1 laa1 test2 laa2

OK

127.0.0.1:6379> FT.ADD idx doc2 1.0 FIELDS test1 laa3 test2 laa1

OK

127.0.0.1:6379> FT.SEARCH idx "laa1 @test2:laa3"

1) (integer) 1
2) "doc2"
3) 1) test1
   2) "laa3"
   3) test2
   4) "laa1"
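For comparison, here is what the reproduction should return if the query is read as "laa1 in any field AND laa3 in test2". This is a minimal Python sketch using a toy per-field inverted index; it is not how RediSearch is implemented, and the helper names `any_field` and `in_field` are made up for illustration. Only the two documents come from the steps above.

```python
from collections import defaultdict

# Documents from the reproduction steps above.
docs = {
    "doc1": {"test1": "laa1", "test2": "laa2"},
    "doc2": {"test1": "laa3", "test2": "laa1"},
}

# postings[(field, token)] -> set of doc ids; a toy stand-in for an inverted index.
postings = defaultdict(set)
for doc_id, fields in docs.items():
    for field, text in fields.items():
        for token in text.lower().split():
            postings[(field, token)].add(doc_id)

def any_field(token):
    """Doc ids containing token in any field (an unscoped term)."""
    return {d for (f, t), ids in postings.items() if t == token for d in ids}

def in_field(field, token):
    """Doc ids containing token in one specific field (an @field:term clause)."""
    return set(postings.get((field, token), set()))

# "laa1 @test2:laa3" is the intersection of the two clauses.
expected = any_field("laa1") & in_field("test2", "laa3")
print(sorted(expected))  # []
```

No document has laa3 in test2, so the expected result is empty. doc2 only has laa3 in test1, which is consistent with the field restriction being dropped somewhere during execution.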

This really is a 1.6-only issue; it was caused by a bug in optimization code introduced in 1.6.
This PR should fix it: https://github.com/RediSearch/RediSearch/pull/917

It will be merged to master soon.

Thanks for reporting it.