
R's with() function in Python

Tags: python, pandas

In R, I can use with(obj, a + b + c + d) instead of obj$a + obj$b + obj$c + obj$d, where obj can be a list or data.frame.

Is there a similar function for dict, pandas.Series, or pandas.DataFrame in Python?

pe-perry asked Oct 23 '25 14:10


2 Answers

Not directly, but there are several somewhat similar alternatives. R's with() is quite versatile, so in Python you have to replace it case by case.

You could use itemgetter() for simple collections:

In [1]: d = dict(a=1, b=2, c=3, d=4)

In [2]: from operator import itemgetter

In [3]: sum(itemgetter('a', 'b', 'c', 'd')(d))
Out[3]: 10

Or attrgetter() for, again, simple objects:

In [4]: from collections import namedtuple

In [5]: from operator import attrgetter

In [8]: sum(attrgetter('a', 'b', 'c', 'd')(
        namedtuple('sdf', 'a b c d')(1, 2, 3, 4)))
Out[8]: 10

pandas DataFrames support selecting specific columns directly and applying operations to them. Summing is an easy example, since there is a built-in method for it:

In [6]: import pandas as pd

In [10]: df = pd.DataFrame({'A': range(10), 'B': range(10), 'C': range(10)})

In [21]: df[['A', 'B']].sum(axis=1)  # row sums
Out[21]: 
0     0
1     2
2     4
3     6
4     8
5    10
6    12
7    14
8    16
9    18
dtype: int64

There's also DataFrame.eval, which I think is closest to what you're after. From its documentation:

"Evaluate an expression in the context of the calling DataFrame instance."

In [9]: df.eval('(A + B) ** C')
Out[9]: 
0               1
1               2
2              16
3             216
4            4096
5          100000
6         2985984
7       105413504
8      4294967296
9    198359290368
dtype: int64
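
For a plain dict or a Series, something similar to DataFrame.eval can be sketched with the built-in eval(), using the object's items as the local namespace. The helper name with_ below is hypothetical (with itself is a reserved word in Python), and because eval() runs arbitrary code, this is only reasonable for expressions you write yourself:

```python
def with_(obj, expr):
    """Evaluate expr with the keys of obj visible as variable names.

    Hypothetical helper, not part of Python or pandas. It only assumes
    that obj has keys() and supports obj[key], which is true for dict
    and should also hold for pandas.Series and pandas.DataFrame.
    """
    namespace = {key: obj[key] for key in obj.keys()}
    # Empty __builtins__ keeps built-in names out of the expression;
    # this is not a security boundary, so never pass untrusted strings.
    return eval(expr, {"__builtins__": {}}, namespace)


d = dict(a=1, b=2, c=3, d=4)
total = with_(d, "a + b + c + d")
print(total)  # 10
```

Only keys that are valid Python identifiers can be referenced in the expression; other keys are simply unreachable by name.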
Ilja Everilä answered Oct 25 '25 02:10


Not really. R and Python have quite different philosophies here: in R, a function can receive its arguments as unevaluated expressions and decide how and where to evaluate them, whereas Python always evaluates arguments before the function is called. So in Python, this is impossible:

df = pd.DataFrame({'a':[1,2],'b':[3,4],'c':[5,6],'d':[7,8]})
with(df, a + b + c)
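
A minimal sketch of why: Python evaluates every argument before the call happens, so the names fail to resolve before any function could look them up in df. The function with_ here is a hypothetical stand-in (with is a reserved word, so the literal line above is a syntax error), and a plain dict stands in for the DataFrame:

```python
def with_(obj, value):
    # Never reached in the call below: the argument expression is
    # evaluated in the caller's scope before with_ is entered.
    return value


data = {"a": 1, "b": 3, "c": 5}  # stand-in for the DataFrame

try:
    with_(data, a + b + c)
except NameError as exc:
    message = str(exc)
    print(message)  # name 'a' is not defined
```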

However, this works:

sum(map(df.get, ('a','b','c'))) # gives Series([9,12])

If you wanted to apply other chained operations, you could implement support for something like this:

def chain(op, df, name, *names):
    # Fold op left to right over the named columns: op(op(a, b), c), ...
    res = df[name]
    for other in names:
        res = op(res, df[other])
    return res

Then you can do this:

from operator import truediv  # operator.div exists only in Python 2
chain(truediv, df, 'a', 'b', 'c')  # (a / b) / c
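
The same left-to-right fold can also be sketched with functools.reduce from the standard library. The name chain_reduce is hypothetical, and a plain dict stands in for the DataFrame here, since the function only relies on obj[name]:

```python
from functools import reduce
from operator import sub, truediv


def chain_reduce(op, obj, *names):
    """Apply op left to right over obj[name] for each name."""
    return reduce(op, (obj[name] for name in names))


row = {"a": 1, "b": 2, "c": 4}  # stand-in for a DataFrame
print(chain_reduce(truediv, row, "a", "b", "c"))  # (1 / 2) / 4 = 0.125
print(chain_reduce(sub, row, "a", "b", "c"))      # (1 - 2) - 4 = -5
```

With a DataFrame, each obj[name] would be a column, so the operator would apply element-wise.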
John Zwinck answered Oct 25 '25 02:10