Batch size for model in RedisAI

We are optimizing GPU inference for person detection and want to use the BATCHSIZE and MINBATCHSIZE parameters for models in RedisAI. But after adding these parameters, we do not get the expected system behavior.

We use a stack of Redis, RedisGears, and RedisAI.


Redis is used to store stream and key-value structures.

One client writes the camera image to a key-value entry, then sends the key and other metadata to the stream 'detection:process' as a detection task.
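
A minimal sketch of this producer step, assuming a redis-py-style client object (anything that exposes `set` and `xadd`). The key layout `frame:<camera>:<frame>` and the field names `image_key` and `camera_id` are hypothetical and only serve the illustration.

```python
STREAM = 'detection:process'


def enqueue_frame(client, camera_id, frame_id, image_bytes):
    """Store one frame under its own key and push a detection task to the stream."""
    # Hypothetical key layout: one key per camera frame.
    image_key = f'frame:{camera_id}:{frame_id}'
    client.set(image_key, image_bytes)

    # The stream message carries only the key and metadata, not the image itself.
    fields = {'image_key': image_key, 'camera_id': str(camera_id)}
    client.xadd(STREAM, fields)
    return image_key, fields
```

Keeping the image out of the stream message keeps the stream entries small; the consumer fetches the image by key when it picks up the task.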


gb = GearsBuilder('StreamReader')

From each stream message we take the key, read the image from the key-value store, and then start the model prediction.
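
A small sketch of the first part of that step, assuming the StreamReader record layout described in the RedisGears documentation (a dict with `key`, `id` and `value`, where `value` holds the message fields). The field name `image_key` is hypothetical.

```python
def image_key_from_record(record):
    """Return the image key carried by one StreamReader record."""
    # record['key'] is the stream name, record['id'] the message id,
    # record['value'] a dict with the fields of the stream message.
    fields = record['value']
    return fields['image_key']  # hypothetical field name
```

Inside RedisGears a function like this would typically be passed to a `map` step of the builder before the step that runs the model.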


Uploading a frozen graph model to the TensorFlow backend in RedisAI.

cat model.pb | redis-cli -x AI.MODELSET detector:model TF GPU BATCHSIZE 32 MINBATCHSIZE 8 INPUTS image_arrays OUTPUTS detections BLOB

Running model prediction inside RedisGears.

import redisAI

def run_model(image_tensor):
    model_runner = redisAI.createModelRunner('detector:model')
    redisAI.modelRunnerAddInput(model_runner, 'image_arrays', image_tensor)
    redisAI.modelRunnerAddOutput(model_runner, 'detections')
    output = redisAI.modelRunnerRun(model_runner)
    return output

With BATCHSIZE and MINBATCHSIZE set, we expect the stream tasks to pause at the MODELRUN step, accumulate until MINBATCHSIZE is reached, and then run on the GPU as one batch of several images. Instead, the system does not wait for tasks to accumulate and keeps processing one image at a time.

How can we test this case, and how should the system behave when BATCHSIZE and MINBATCHSIZE are set?
Do you have examples using these parameters?

Hi @vladislav.s
The current implementation of the synchronous redisAI.modelRunnerRun in RedisGears does not follow the same rules as the "High Level API" (the regular RedisAI commands).
We merged the async variant into the master branch of both modules a few days ago. It should behave the same as the AI.MODELRUN command with respect to internal scheduling and batching.

The following recipe executes a simple multiplication model on two tensors in the keyspace with the new async API:

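    # Test excerpt: 'env' (the test environment), DEVICE and the 'os' import
    # are provided by the surrounding test module.
    script = '''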
import redisAI

async def RedisAIModelRun(record):
    keys = ['a{1}', 'b{1}']
    tensors = redisAI.mgetTensorsFromKeyspace(keys)
    modelRunner = redisAI.createModelRunner('m{1}')
    redisAI.modelRunnerAddInput(modelRunner, 'a', tensors[0])
    redisAI.modelRunnerAddInput(modelRunner, 'b', tensors[1])
    redisAI.modelRunnerAddOutput(modelRunner, 'mul')
    res = await redisAI.modelRunnerRunAsync(modelRunner)
    redisAI.setTensorInKey('c{1}', res[0])
    return "OK"


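# Register the function so that 'rg.trigger ModelRunAsyncTest' invokes it
GB("CommandReader").map(RedisAIModelRun).register(trigger="ModelRunAsyncTest")
    '''

    con = env.getConnection()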
    ret = con.execute_command('rg.pyexecute', script)
    env.assertEqual(ret, b'OK')

    test_data_path = os.path.join(os.path.dirname(__file__), 'test_data')
    model_filename = os.path.join(test_data_path, 'graph.pb')

    with open(model_filename, 'rb') as f:
        model_pb = f.read()

    ret = con.execute_command('AI.MODELSET', 'm{1}', 'TF', DEVICE,
                              'INPUTS', 'a', 'b', 'OUTPUTS', 'mul', 'BLOB', model_pb)
    env.assertEqual(ret, b'OK')

    con.execute_command('AI.TENSORSET', 'a{1}', 'FLOAT',
                        2, 2, 'VALUES', 2, 3, 2, 3)
    con.execute_command('AI.TENSORSET', 'b{1}', 'FLOAT',
                        2, 2, 'VALUES', 2, 3, 2, 3)

    ret = con.execute_command('rg.trigger', 'ModelRunAsyncTest')
    env.assertEqual(ret[0], b'OK')
    values = con.execute_command('AI.TENSORGET', 'c{1}', 'VALUES')
    env.assertEqual(values, [b'4', b'9', b'4', b'9'])
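
To check batching through the regular commands, one option is to issue several AI.MODELRUN calls at the same time from separate connections. The sketch below is only an illustration under assumptions: `make_client` is a hypothetical factory returning a redis-py-style connection, the model was stored with BATCHSIZE and MINBATCHSIZE as in the question, and an input tensor already exists under `input_key`.

```python
import threading


def run_concurrently(make_client, n_clients, model_key, input_key, output_prefix):
    """Issue one AI.MODELRUN per client from separate threads and collect the replies."""
    replies = [None] * n_clients

    def worker(i):
        # One connection per thread, so the server sees independent clients.
        client = make_client()
        replies[i] = client.execute_command(
            'AI.MODELRUN', model_key,
            'INPUTS', input_key,
            'OUTPUTS', f'{output_prefix}:{i}')

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return replies
```

According to the AI.MODELSET documentation, MINBATCHSIZE postpones runs until the batch reaches that size, so with MINBATCHSIZE 8 at least 8 concurrent requests are needed. Comparing the wall time of 8 concurrent runs against 8 sequential ones is one rough way to see whether batching is taking effect.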