ReBloom 2.2.0 size issues

Dear RedisLabs team,

We have been working with the RedisBloom filter for some time with a large amount of data, over 250 million records.

In this context, we have found an issue in the latest published version, 2.2.0. When we tried to reserve space for more than 300 million items using an error rate of 0.000001, the resulting filter was unexpectedly smaller than a reservation for 250 million. There is no error message; the only way to notice that something is wrong is after inserting some data, at which point almost every lookup reports a match and the results are erratic.
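
The behavior can be reproduced roughly along these lines (key names are just examples and we use redis-py here; the standard MEMORY USAGE command is only used to compare footprints):

    import redis

    r = redis.Redis()

    # Reserve two filters with the same error rate but different capacities.
    r.execute_command("BF.RESERVE", "filter:250m", "0.000001", 250_000_000)
    r.execute_command("BF.RESERVE", "filter:300m", "0.000001", 300_000_000)

    # The larger capacity should never report a smaller footprint,
    # yet in 2.2.0 the 300M filter comes out smaller.
    print(r.execute_command("MEMORY USAGE", "filter:250m"))
    print(r.execute_command("MEMORY USAGE", "filter:300m"))

    # After inserting data, lookups on the 300M filter behave erratically.
    r.execute_command("BF.ADD", "filter:300m", "some-item")
    print(r.execute_command("BF.EXISTS", "filter:300m", "some-other-item"))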

To avoid this issue, we had to downgrade the service and continue working with the previous version, 2.0.3. If you could take a look at this behavior, it would be much appreciated by our team.

Evidence 300M (erratic behavior):

Evidence 280M:

Many thanks for your help!

Best Regards,

Claudio

Hello Claudio,

Thank you for your message.

I looked at the code, and what you are experiencing is an integer overflow.
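
To illustrate the kind of failure (a simplified sketch, not the actual RedisBloom code): the total number of bits is roughly the capacity times the bits per entry, and for 300 million entries at an error rate of 0.000001 that product no longer fits in a 32-bit integer, so a 32-bit computation silently wraps around:

    from math import ceil, log

    capacity = 300_000_000
    error_rate = 0.000001

    # Textbook Bloom filter sizing: bits per entry = -ln(p) / (ln 2)^2, ~28.8 here.
    bits_per_entry = -log(error_rate) / (log(2) ** 2)
    total_bits = ceil(capacity * bits_per_entry)   # ~8.6e9, well above 2**32

    # Simulate keeping the result in a 32-bit integer: it wraps to a much
    # smaller value, which is why the filter ends up undersized.
    wrapped_bits = total_bits % (2 ** 32)
    print(total_bits, wrapped_bits)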

It will be fixed ASAP. Feel free to open an issue at https://github.com/RedisBloom/RedisBloom.

Regards,

Ariel

Many thanks, Ariel,

Anyway, regardless of the number, the behavior of the filter was erratic. I assume you will fix everything, right?

Best regards

Claudio H.

Hi Claudio,

A PR has been created and will soon be discussed within our team.

V2.2 brings several improvements, including a bug fix that corrects/reduces the number of hash functions used (and therefore lowers memory usage), and an option to make the filter non-scaling (if you know your total size up front), which reduces the memory footprint as well. If you are interested in discussing your use case in order to get some feedback, I will be happy to assist.
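
For a rough idea of the numbers involved (standard Bloom filter formulas, not the exact internal computation):

    from math import ceil, log, log2

    error_rate = 0.000001

    # Hash functions for a plain (non-scaling) filter: k = ceil(log2(1/p)) = 20 here.
    k = ceil(log2(1 / error_rate))

    # Bits per entry: -ln(p) / (ln 2)^2, i.e. about 1.44 * log2(1/p) ~= 28.8 here.
    bits_per_entry = -log(error_rate) / (log(2) ** 2)

    print(k, bits_per_entry)

A scalable filter pays for the ability to grow with one extra hash function and extra bits per entry, which is what the non-scaling option lets you avoid.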

Kind regards,

Ariel

Hello Claudio,

A fix was merged into the master branch of the RedisBloom repository.

The same issue existed prior to V2.2.0; the reason you were not aware of it is probably the lack of a detailed BF.INFO command. I would recommend you move to v2.2.1 (it will be tagged soon).

Kind regards,

Ariel

Many thanks, Ariel,

Sorry for the late reply. First of all, many thanks for the great application.

Regarding the Bloom filter bug, I haven't experienced any further issues so far with the same data. In fact, the way we noticed the issue was the amount of RAM taken by the process. In version 2.0.3 the app takes approximately 1.3 GB of RAM to allocate 250,000,000 records. However, in the new version that was not happening (it takes 100 MB at most).
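
As a rough sanity check on those numbers (textbook Bloom filter sizing, so only an approximation of what the module actually allocates):

    from math import log

    capacity = 250_000_000
    error_rate = 0.000001

    # ~28.8 bits per entry for a 0.000001 error rate.
    bits_per_entry = -log(error_rate) / (log(2) ** 2)
    size_gb = capacity * bits_per_entry / 8 / 1024 ** 3

    # ~0.84 GB of raw filter bits, so ~1.3 GB of process RAM is plausible,
    # while ~100 MB clearly is not.
    print(round(size_gb, 2))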

Please count on me for feedback on new releases in the future.

Many thanks again; I will be waiting for the new version.

Best regards,

Claudio H.

Hi Claudio,

V2.2.1 with the fix is out.

Please note that with this version you can use the NONSCALING flag with BF.RESERVE if you know the size of your filter. This will save you one hash function (20 vs. 21) and 1.44 bits per entry, since you are doing less hashing.
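
For example, with redis-py (the key name is just an example):

    import redis

    r = redis.Redis()

    # Reserve a fixed-size filter: NONSCALING means no sub-filters will ever be
    # added, so the filter uses one less hash function and fewer bits per entry.
    r.execute_command("BF.RESERVE", "filter:250m", "0.000001", 250_000_000, "NONSCALING")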

Regards,

Ariel