GROUPBY/ REDUCE TOLIST help, can one filter on the list prior to returning the result?

Hey guys,
given the following data, what I’d like to do is do a GROUPBY with REDUCE TOLIST, but also filter on the LIST items prior to returning the result.

For instance:

FT.CREATE “books_hash_index_1” NOSCOREIDX SCHEMA “BookID” NUMERIC SORTABLE “Title” TEXT SORTABLE “Type” NUMERIC SORTABLE

FT.ADD books_hash_index_1 1 1 FIELDS “BookID” 1 “Title” “book1” “Type” 1

FT.ADD books_hash_index_1 3 1 FIELDS “BookID” 3 “Title” “book1” “Type” 2

FT.ADD books_hash_index_1 2 1 FIELDS “BookID” 2 “Title” “book2” “Type” 1

FT.ADD books_hash_index_1 4 1 FIELDS “BookID” 4 “Title” “book3” “Type” 2

“FT.AGGREGATE” “books_hash_index_1” “*” GROUPBY 1 @Title REDUCE TOLIST 1 @BookID AS BookID

``

returns:

  1. (integer) 3

    1. “Title”
  2. “book2”

  3. “BookID”

    1. “2”
    1. “Title”
  4. “book1”

  5. “BookID”

    1. “1”
  6. “3”

    1. “Title”
  7. “book3”

  8. “BookID”

    1. “4”

``

From this result set, I would only like to return the results that have a Type of 2 in the grouping, so in this case the first result would not be returned because BookID of 2 only has a Type of 1.

There is actually a FILTER keyword that does just that!

127.0.0.1:6379> “FT.AGGREGATE” “books_hash_index_1” “*” FILTER “(@Type==1)” GROUPBY 1 @Title REDUCE TOLIST 1 @BookID AS BookID

  1. (integer) 2

    1. “Title”

    2. “book2”

    3. “BookID”

      1. “2”
    1. “Title”

    2. “book1”

    3. “BookID”

      1. “1”

Mark Nunberg | Senior Software Engineer
Redis Labs - home of Redis

Email: mark@redislabs.com

Hey Mark,
that is close, I need @Type==2, but in the list of results, I would like to also return all of the other types, so in my example above the result returned should be:

  1. (integer) 2

    1. “Title”
  2. “book1”

  3. “BookID”

    1. “1” - type 1 but keep it since the one below is of Type 2

    2. “3” - type 2

    1. “Title”
  4. “book3”

  5. “BookID”

    1. “4”- type 2

``

Oh, you wish to return all the IDs of the title, if any of those books are of type 2. This would require an aggregation functionality which would (1) create a set of all types in the reduced entries, and (2) filter through this list, omitting any which don’t have type 1 as a member. Currently such a function doesn’t exist, but I guess it could be implemented if the use case were compelling enough. This does sound more like a use case for redis gears, though.

Mark Nunberg | Senior Software Engineer
Redis Labs - home of Redis

Email: mark@redislabs.com

Yup, that is exactly what I am trying to do. The problem is I am also implementing paging, so I need to be able to get the results in one query.

Can you think of another way to accomplish this? Would it be worth looking at scripting?

Thanks for your help.

Quite a puzzle. I think I captured what you’re trying to do.

I was able to accomplish this by doing a query like this:
“FT.AGGREGATE” “books_hash_index_1” “*” LOAD 3 @Title @Type @BookID APPLY “100^(@type)” As TypeNumber GROUPBY 1 @Title REDUCE TOLIST 1 @BookID AS BookID REDUCE SUM 1 @TypeNumber as SumType APPLY “@SumType >= 1000” as Type2 APPLY “(@SumType >= 100) && (@SumType < 1000)” As Type1 FILTER “@Type2 == 1”

The results look like this:

  1. (integer) 3

    1. “Title”

    2. “book1”

    3. “BookID”

      1. “1”

      2. “3”

    4. “SumType”

    5. “10100”

    6. “Type2”

    7. “1”

    8. “Type1”

  2. “0”

    1. “Title”

    2. “book3”

    3. “BookID”

      1. “4”
    4. “SumType”

    5. “10000”

    6. “Type2”

    7. “1”

    8. “Type1”

  3. “0”

I know this is a lot, but it’s really not that complicated. What I’m doing is creating a pseudo-bitmask out of the type field by using exponents. The later, I’m using some logic to check if it is a member of a particular type. You don’t really need to create fields, but I’m doing it for clarity. Finally, I’m filtering on the a particular type.

Hope this helps!

-K

Hey Kyle,
that does look like the desired result! The only problem I notice is the count is 3, instead of 2, do you think it’s counting the BookIDs instead of the result set?

If I could get this one last bit correct, I would be all set! :slight_smile:

Thanks for all of your help!

Paul