Apply real matrix multiplication

I got a bipartite graph of user(~300K)-item(~300K) ratings :

:START_ID(User) :END_ID(Item) rating:INT
1234 8765 1
66 88 5

which can be represent as a csr matrix of real number R and it’s boolean version B & C.

image

And I try to get items similar to a specific item by querying

MATCH (:Item {id: 6})<-[:RATE]-(u:User)-[:RATE]->(i:Item)
RETURN i.id, count(u) AS similarity

// referece
// stackoverflow.com/questions/29066274/count-duplicated

gives the same answer as image. (but is it done like this?)

Q1:

How about image or image ?

How to apply matrix multiplication (float) with more explicit query?

Just like image

MATCH (o:Item {id: 6})<-[]-(:User)-[]->(:Item)<-[]-(:User)-[]->(i:Item)
RETURN o, i   // [,count(*)]

// referece
// www.slideshare.net/RedisLabs/redisconf18-lower-latency-graph-queries-in-cypher-with-redis-graph

does the chain matrix multiplication of boolean matrices . image

Q2:

I expect following query was done by image, where R' is R with case when replacement.
But query is slow when the item is rated by many user, and all GRAPH.EXPLAIN can tell me is “aggregate”.

MATCH (:Item {id: 6})<-[:RATE]-(:User)-[r:RATE]->(i:Item)
RETURN 
    i.id, 
    SUM(CASE r.rating
        WHEN 1 THEN 2
        ...
    END)

and another try

MATCH (:Item {id: 6})<-[:RATE]-(us:User)
WITH distinct(us) as u, count(*) as cnt 
MATCH (u)-[r:RATE]->(i:Item)
RETURN 
    i.id, 
    SUM(cnt*
    (CASE r.rating
          ...
    END)
    )

RedisGraph always performs multiplication over a boolean semiring; properties like id and rating are stored in a separate data structure.

Additionally, multiplication is performed in chunks (currently 16xN blocks of the left-hand matrix).

I don’t consider it generally safe to treat RedisGraph as an interface that exposes a matrix multiplication function; there is too much else going on behind the scenes.

With regard to Q1, we will chain multiple operands in a multiplication sequence. In this case, we will chain all of them; if intermediate operands have filters that must be validated, we will break the chain at those points to apply the relevant filters. We also store transposed copies of every adjacency matrix and substitute those into the multiplication instead of performing run-time transposes.

With regard to Q2, I’m not sure why this is slow. A GRAPH.PROFILE call may give you more information than GRAPH.EXPLAIN, as it will show the number of results each op produces and how much time each takes.

1 Like