RediSearch - Release - 2.0.0-M1

DrewKreiger · July 29, 2020, 5:37pm

This is the first milestone release of RediSearch 2.0.

Headlines:

This milestone re-architects the way indices are kept in sync with the data. Instead of having to write data through the index (using FT.ADD), RediSearch will now follow the data written in hashes and automatically index it. Read more about it in this blog.

Details:
• The index no longer resides within the key space.
• Indexes need to be created with an IndexCondition (prefix matcher and filter).
• The deprecated FT.X commands were mapped to their Redis equivalent commands.
• The inverted index itself is no longer saved to the RDB.

Other notable features:
• #1097 Add Hindi snowball stemmer.

Bugfixes:
• #1282 Crash when querying with some combinations of optional terms.
• #1290 Crash when calling RediSearch_IndexAddDocument and NewAddDocumentCtx fails.
• #1313, #1330 Unexpected: “field does not support phonetics”.
• #1255 Change ForkGC threshold to 100 to reduce empty GC cycles.

Known issues in this milestone release:
• Upgrade is not possible from 1.X versions.

Notes:
• This is not the GA version of 2.0. The version inside Redis will be 19901 or 1.99.1 in semantic versioning. Since the version of a module in Redis is numeric, we could not add an M1 flag.
• Requires Redis v6 or above.
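
The numeric module version mentioned in the notes can be unpacked with simple arithmetic. The sketch below assumes the layout major * 10000 + minor * 100 + patch, which is consistent with the 19901 / 1.99.1 pair quoted above; the function names are made up for illustration.

```python
def decode_module_version(number: int) -> str:
    """Split a numeric module version into major.minor.patch.

    Assumption: number = major * 10000 + minor * 100 + patch,
    which matches 19901 <-> 1.99.1 from the release notes.
    """
    major, rest = divmod(number, 10000)
    minor, patch = divmod(rest, 100)
    return f"{major}.{minor}.{patch}"


def encode_module_version(major: int, minor: int, patch: int) -> int:
    """Inverse of decode_module_version under the same assumption."""
    return major * 10000 + minor * 100 + patch


print(decode_module_version(19901))     # 1.99.1
print(encode_module_version(1, 99, 1))  # 19901
```

Because the number has no room for a textual suffix, a milestone such as M1 cannot be expressed in it, which is why 1.99.1 is used instead.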

tgrall · July 30, 2020, 12:41pm

Hi,

Let me give you a small example of the new automatic indexing of hashes in RediSearch 2.0.

1- Start the RediSearch Docker container

> docker run -it --rm --name redis-search-2 \
     -p 6379:6379 \
     redislabs/redisearch:1.99.1

2- Add some data
For example, let’s create a few “movies”:


> HSET movies:1002 title "Star Wars: Episode V - The Empire Strikes Back" plot "After the Rebels are brutally overpowered by the Empire on the ice planet Hoth, Luke Skywalker begins Jedi training with Yoda, while his friends are pursued by Darth Vader and a bounty hunter named Boba Fett all over the galaxy."  release_year 1980 genre "Action" rating 8.7 nb_of_votes 1127635 imdb_id tt0080684

> HSET movies:1003 title "The Godfather" plot "The aging patriarch of an organized crime dynasty transfers control of his clandestine empire to his reluctant son."  release_year 1972 genre "Drama" rating 9.2 nb_of_votes 1563839 imdb_id tt0068646

> HSET movies:1004 title "Heat" plot "A group of professional bank robbers start to feel the heat from police when they unknowingly leave a clue at their latest heist."  release_year 1995 genre "Thriller" rating 8.2 nb_of_votes 559490 imdb_id tt0113277

The key and fields of the hashes are:

movie_id : The unique ID of the movie, internal to this database. It is part of the key (movies:<movie_id>).
title : The title of the movie.
plot : A summary of the movie.
genre : The genre of the movie. For now, a movie has only a single genre.
release_year : The year the movie was released, as a numerical value.
rating : The rating from the public, as a numerical value.
nb_of_votes : The number of votes.
imdb_id : The ID of the movie in IMDb.
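
As a rough illustration of how such a hash maps onto the HSET command, here is a small Python sketch that flattens a key and a dictionary of fields into the flat argument list a client would send. The helper name `hset_args` is hypothetical, and no Redis connection is involved.

```python
def hset_args(key, fields):
    """Build the flat argument list for HSET key field value [field value ...].

    Hypothetical helper for illustration; values are sent as strings.
    """
    args = ["HSET", key]
    for name, value in fields.items():
        args.extend([name, str(value)])
    return args


heat = {
    "title": "Heat",
    "release_year": 1995,
    "genre": "Thriller",
    "rating": 8.2,
    "nb_of_votes": 559490,
    "imdb_id": "tt0113277",
}

command = hset_args("movies:1004", heat)
print(command[:4])  # ['HSET', 'movies:1004', 'title', 'Heat']
```

Note that the numeric fields are stored as plain strings in the hash; it is the index schema, defined in the next step, that tells RediSearch to treat them as numbers.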

3- Create an Index

You now have 3 movies in your database. To search on title, release_year, rating and genre, create an index with the following command:

> FT.CREATE idx:movies ON hash PREFIX 1  "movies:" SCHEMA title TEXT SORTABLE release_year NUMERIC SORTABLE rating NUMERIC SORTABLE genre TAG SORTABLE

FT.CREATE : the command that allows you to create a new index
idx:movies : the name of the index
ON hash : the type of structure to be indexed. Note that in RediSearch 2.0 only hash structures are supported; this parameter will allow RediSearch to index other structures in the future.
PREFIX 1 "movies:" : the prefixes of the keys that should be indexed. This is a list, so since we only want to index movies:* keys, the count is 1. If you want to index movies and TV shows that have the same fields, you can use: PREFIX 2 "movies:" "tv_shows:"
SCHEMA ... : defines the schema (the fields and their types) to index. As you can see in the command, we are using the TEXT, NUMERIC and TAG types and the SORTABLE parameter; we will look at the details when running queries.
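
To make the PREFIX clause more concrete, the following Python sketch models the rule described above: a key is picked up by the index when it starts with one of the listed prefixes, and the number after PREFIX is simply the length of that list. This is only a model of the matching rule, not RediSearch code; `is_indexed` and `prefix_clause` are hypothetical names.

```python
def is_indexed(key, prefixes):
    """True when the key starts with at least one of the index prefixes."""
    return any(key.startswith(prefix) for prefix in prefixes)


def prefix_clause(prefixes):
    """Build the PREFIX part of FT.CREATE: the count, then the prefixes."""
    return ["PREFIX", str(len(prefixes)), *prefixes]


only_movies = ["movies:"]
movies_and_shows = ["movies:", "tv_shows:"]

print(is_indexed("movies:1002", only_movies))      # True
print(is_indexed("tv_shows:7", only_movies))       # False
print(is_indexed("tv_shows:7", movies_and_shows))  # True
print(prefix_clause(movies_and_shows))
# ['PREFIX', '2', 'movies:', 'tv_shows:']
```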

You can use the following commands to list the indices and inspect one of them:

> FT._LIST 

> FT.INFO "idx:movies"

If you look at the information returned by the FT.INFO command, you can see that the “movies:*” hashes have been indexed (num_docs = 3).

4- Querying/Searching the data.

You can find some query examples below:

> FT.SEARCH idx:movies "star war" RETURN 2 title release_year

> FT.SEARCH idx:movies * FILTER release_year 1970 1980  RETURN 2 title release_year
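
The bounds of FILTER are inclusive by default, so a movie released in 1980 matches FILTER release_year 1970 1980. FILTER also accepts -inf and +inf, and a leading ( makes a bound exclusive. The Python sketch below is a toy model of those rules applied to the three movies above; `parse_bound` and `in_range` are hypothetical helpers and do not reflect how RediSearch evaluates the filter internally.

```python
def parse_bound(token):
    """Return (value, exclusive) for a bound such as '1980', '(1980' or '-inf'."""
    exclusive = token.startswith("(")
    text = token[1:] if exclusive else token
    return float(text), exclusive  # float() understands '-inf' and '+inf'


def in_range(value, low, high):
    """Check a numeric value against a pair of FILTER-style bounds."""
    lo, lo_exclusive = parse_bound(low)
    hi, hi_exclusive = parse_bound(high)
    above = value > lo if lo_exclusive else value >= lo
    below = value < hi if hi_exclusive else value <= hi
    return above and below


years = {"movies:1002": 1980, "movies:1003": 1972, "movies:1004": 1995}

matches = sorted(k for k, year in years.items() if in_range(year, "1970", "1980"))
print(matches)                            # ['movies:1002', 'movies:1003']
print(in_range(1980, "1970", "(1980"))    # False, upper bound is exclusive
print(in_range(1995, "1970", "+inf"))     # True
```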

5- Add a new movie:

> HSET movies:1005 title "The Exorcist" plot "When a 12 year-old girl is possessed by a mysterious entity, her mother seeks the help of two priests to save her."  release_year 1973 genre "Horror" rating 8.0 nb_of_votes 352898 imdb_id tt0070047

Then query it:

> FT.SEARCH idx:movies * FILTER release_year 1970 1980  RETURN 2 title release_year SORTBY release_year

The new movie is automatically indexed.

6- Delete a movie:

Let’s delete the movie “The Godfather”:

> DEL movies:1003

> FT.SEARCH idx:movies * FILTER release_year 1970 1980  RETURN 2 title release_year SORTBY release_year

When the hash is deleted, the index is updated automatically.
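
Steps 5 and 6 can be summarized with a small in-memory model: every write or delete on a hash whose key matches the index prefix is reflected in the index, with no explicit FT.ADD. The class below is a hypothetical toy written for this explanation, not the RediSearch implementation, and it only supports the release_year range query used above.

```python
class ToyStore:
    """In-memory stand-in for Redis hashes plus one prefix-based index."""

    def __init__(self, prefixes):
        self.prefixes = tuple(prefixes)
        self.hashes = {}      # key -> dict of fields
        self.indexed = set()  # keys currently covered by the index

    def hset(self, key, **fields):
        """Write the hash; the index follows if the key matches a prefix."""
        self.hashes.setdefault(key, {}).update(fields)
        if key.startswith(self.prefixes):
            self.indexed.add(key)

    def delete(self, key):
        """Delete the hash; the index entry goes away with it."""
        self.hashes.pop(key, None)
        self.indexed.discard(key)

    def search_years(self, low, high):
        """Titles with low <= release_year <= high, sorted by release_year."""
        hits = [
            (self.hashes[k]["release_year"], self.hashes[k]["title"])
            for k in self.indexed
            if low <= self.hashes[k]["release_year"] <= high
        ]
        return [title for _, title in sorted(hits)]


store = ToyStore(["movies:"])
store.hset("movies:1002", title="Star Wars: Episode V - The Empire Strikes Back", release_year=1980)
store.hset("movies:1003", title="The Godfather", release_year=1972)
store.hset("movies:1004", title="Heat", release_year=1995)

store.hset("movies:1005", title="The Exorcist", release_year=1973)  # step 5
after_add = store.search_years(1970, 1980)

store.delete("movies:1003")                                         # step 6
after_delete = store.search_years(1970, 1980)

print(after_add)
print(after_delete)
```

In this model the first result list contains The Godfather, The Exorcist and the Star Wars movie in that order, and the second one no longer contains The Godfather.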

7- Delete the index.

Let’s now delete the index without deleting the hashes, using the new command FT.DELETE:

> FT.DELETE idx:movies

> FT._LIST

That’s it for this quick intro to RediSearch 2.0! You can look at the documentation and select the version in the right menu for more information about this new release and how to use it.

Regards
Tug

u66NH9MJQ1dv · September 1, 2020, 2:45am

This is working quite amazingly on my existing dataset. Any chance of sharing a rough estimate for a public release date?

Also, any chance anyone from the StackExchange.Redis team is here and could comment on the work being done on the NRediSearch package for the 2.0 release?

Lastly, would there be any issue with running RediSearch M3 in production if it’s not a core feature of my application and I could afford to either tweak my searches or unload the module and ignore search altogether if problems arise? Basically, I want to make sure there wouldn’t be any possibility of it corrupting my existing data. At the same time, I want to start building out my client and getting familiar with 2.0.

k-jo · September 1, 2020, 10:37am

Hi @u66NH9MJQ1dv

Thanks for the great feedback!
We’re still aiming to do a public preview this month.
The .NET client should be tagged as a milestone this week.

With regard to using milestones in production: the risk of it mutating your existing data (hashes) should be very low. However, since it’s a milestone, it will be less stable than a GA release, and we don’t put effort into validating upgrades from milestone releases.

Does this give you enough information?

u66NH9MJQ1dv · September 1, 2020, 5:03pm

Yeah that sounds great. Thanks a lot!
