Can't query with Chinese

Hello,
I have run into a problem: I can't query with Chinese text.

Environment:

centos6.10

redis-5.0.7

redisearch.so v1.4.19 (from https://github.com/RediSearch/RediSearch/releases/download/v1.4.19/redisearch.so)

Create index:

  FT.CREATE myIdx SCHEMA title TEXT body TEXT

I think that to query properly in a language other than English, you have to set the LANGUAGE option, both when adding the document and in the query itself.

See this part of the docs for more details: https://oss.redislabs.com/redisearch/Chinese/

@pacost

Indexing a Chinese document is different than indexing a document in most other languages because of how tokens are extracted. While most languages can have their tokens distinguished by separation characters and whitespace, this is not common in Chinese.

RediSearch uses the Friso Chinese tokenization library for this purpose. This is largely transparent to the user, and often no additional configuration is required.

If you wish to use a custom dictionary, you can do so at the module level when loading the module. The FRISOINI setting can point to the location of a friso.ini file which contains the relevant settings and paths to the dictionary files.

Note that there is no “default” friso.ini file location. RediSearch comes with its own friso.ini and dictionary files which are compiled into the module binary at build-time.

See whether the commands below work for you.

FT.CREATE idx SCHEMA txt TEXT
FT.ADD idx docCn 1.0 LANGUAGE chinese FIELDS txt "Redis支持主从同步。数据可以从主服务器向任意数量的从服务器上同步,从服务器可以是关联其他从服务器的主服务器。这使得Redis可执行单层树复制。从盘可以有意无意的对数据进行写操作。由于完全实现了发布/订阅机制,使得从数据库在任何地方同步树时,可订阅一个频道并接收主服务器完整的消息发布记录。同步对读取操作的可扩展性和数据冗余很有帮助。[8]"
FT.SEARCH idx "数据" LANGUAGE chinese HIGHLIGHT SUMMARIZE
# Outputs:
# <b>数据</b>?... <b>数据</b>进行写操作。由于完全实现了发布... <b>数据</b>冗余很有帮助。[8...
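
If you send these commands from a client rather than redis-cli, the same LANGUAGE clause has to be part of the argument list. Below is a minimal Python sketch of that idea, applied to the myIdx index from the question. The helper names `build_add_args` and `build_search_args`, the document id `doc1`, and the sample field values are hypothetical and only for illustration. The sketch only builds the argument lists and does not talk to a server.

```python
# Sketch: build RediSearch 1.x command arguments with an explicit LANGUAGE.
# build_add_args / build_search_args are hypothetical helpers, not part of
# any client library; "doc1" and the field values are made-up sample data.

def build_add_args(index, doc_id, fields, language="chinese", score=1.0):
    """Return the argument list for FT.ADD with a LANGUAGE clause."""
    args = ["FT.ADD", index, doc_id, str(score), "LANGUAGE", language, "FIELDS"]
    for name, value in fields.items():
        args.extend([name, value])
    return args


def build_search_args(index, query, language="chinese"):
    """Return the argument list for FT.SEARCH with a LANGUAGE clause."""
    return ["FT.SEARCH", index, query, "LANGUAGE", language]


add_args = build_add_args(
    "myIdx",
    "doc1",
    {"title": "Redis", "body": "数据可以从主服务器向任意数量的从服务器上同步"},
)
search_args = build_search_args("myIdx", "数据")

print(add_args)
print(search_args)
```

Assuming redis-py is installed and the server has the module loaded, the lists could then be sent with something like `redis.Redis(decode_responses=True).execute_command(*add_args)`, followed by the same call with `search_args`. Without LANGUAGE chinese on both sides, the text may not be split into the tokens you are searching for.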