
Most effective way to perform Matrix Multiplication on Polars

Considering the example data below, what is the correct way to perform matrix multiplication on data that is held in Polars?

In []: matrix_1 = pl.DataFrame({"col_1":[1,2,3],"col_2":[4,5,6], "col_3":[7,8,9]})
In []: matrix_2 = pl.DataFrame({"col_1":[9,8,7],"col_2":[6,5,4], "col_3":[3,2,1]})

So far I've used NumPy to do the computation:

In []: np.matmul(matrix_1, matrix_2)
Out[]: 
array([[ 30,  24,  18],
       [ 84,  69,  54],
       [138, 114,  90]])

In []: np.dot(matrix_1, matrix_2)
Out[]: 
array([[ 30,  24,  18],
       [ 84,  69,  54],
       [138, 114,  90]])

I was wondering whether there is a native way to do this that avoids copies, because in real life I'm working with much more data. It would also be more ergonomic not to have to convert data in and out of NumPy.

P.S.: It would also be great to be able to use the @ operator via __matmul__, which, if I'm not mistaken, is not implemented in the Polars API.

Igor Marcos Riegel asked Sep 19 '25 08:09



1 Answer

Polars already interoperates well with NumPy, as the link @jqurious posted in the comments shows.

That interoperability goes far enough that you can pass Polars DataFrames directly to np.dot.

It seems what you really want is a way to do the following and get a DataFrame back:

matrix_1.dot(matrix_2)

shape: (3, 3)
┌───────┬───────┬───────┐
│ col_1 ┆ col_2 ┆ col_3 │
│ ---   ┆ ---   ┆ ---   │
│ i64   ┆ i64   ┆ i64   │
╞═══════╪═══════╪═══════╡
│ 30    ┆ 84    ┆ 138   │
│ 24    ┆ 69    ┆ 114   │
│ 18    ┆ 54    ┆ 90    │
└───────┴───────┴───────┘

You can achieve this by writing a helper function and then monkey patching it onto pl.DataFrame:

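import polars as pl
import numpy as np

def dot(self, rightdf):
    # Multiply with NumPy, then rebuild a DataFrame using the right-hand column names.
    # Recent Polars releases take the names via `schema` (older ones used `columns`).
    return pl.from_numpy(np.dot(self, rightdf), schema=rightdf.columns)

pl.DataFrame.dot = dot  # monkey patch the helper onto the DataFrame class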

After that, matrix_1 and matrix_2 will have the dot method shown above.

Dean MacGregor answered Sep 21 '25 22:09
