Pipe or sequence of function in python pandas or Filter then summarize (as dplyr)

Question

For context: I'm a heavy R user who is currently switching to Python (with pandas). Let's say I have this data frame

import pandas as pd

data = {'participant': ['p1','p1','p2','p3'],
        'metadata': ['congruent_1','congruent_2','incongruent_1','incongruent_2'],
        'reaction': [22000,25000,27000,35000]
        }

df_s1 = pd.DataFrame(data, columns = ['participant','metadata', 'reaction'])
# DataFrame.append was removed in pandas 2.0; concatenating 16 copies yields the same 64 rows
df_s1 = pd.concat([df_s1]*16, ignore_index=True)
df_s1

and I want to reproduce what I can easily do in R with pipes, using something like:

df_s1[(df_s1.metadata == "congruent_1") | (df_s1.metadata == "incongruent_1")].df_s1["reaction"].mean()

This does not work. I only succeed when I split the code into separate steps/variables:

x = df_s1[(df_s1.metadata == "congruent_1") | (df_s1.metadata == "incongruent_1")]
x = x["reaction"].mean()
x

With dplyr, I'd write:

ds_s1 %>% 
  filter(metadata == "congruent_1" | metadata == "incongruent_1") %>% 
  summarise(mean(reaction))

Note: I would highly appreciate a pointer to a concise reference for translating my R code to Python. Plenty of material is available, but it comes in mixed formats and inconsistent styles.

Thanks

BENY · Accepted Answer

You can use .loc here:

df_s1.loc[(df_s1.metadata == "congruent_1") | (df_s1.metadata == "incongruent_1"), 'reaction'].mean()
Out[117]: 24500.0

As Quang mentioned, switching to isin shortens the line.

In base R:

mean(ds_s1$reaction[ds_s1$metadata%in%c('congruent_1','incongruent_1')])
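
For something that reads closer to the dplyr pipe, the filter and the summary can be chained in one expression. This is only a sketch under a few assumptions: it rebuilds the question's data with pd.concat (DataFrame.append no longer exists in pandas 2.0), and the names mean_query and mean_loc are made up for illustration.

```python
import pandas as pd

data = {
    "participant": ["p1", "p1", "p2", "p3"],
    "metadata": ["congruent_1", "congruent_2", "incongruent_1", "incongruent_2"],
    "reaction": [22000, 25000, 27000, 35000],
}
# 16 stacked copies, matching the 64 rows built in the question
df_s1 = pd.concat([pd.DataFrame(data)] * 16, ignore_index=True)

# filter(...) %>% summarise(mean(reaction)) written as a single chain
mean_query = (
    df_s1
    .query("metadata in ['congruent_1', 'incongruent_1']")["reaction"]
    .mean()
)

# Same idea with a callable inside .loc: the lambda receives the frame
# at that point in the chain, so no intermediate variable is needed
mean_loc = (
    df_s1
    .loc[lambda d: d["metadata"].isin(["congruent_1", "incongruent_1"]), "reaction"]
    .mean()
)

print(mean_query, mean_loc)
```

Wrapping the chain in parentheses lets each step sit on its own line, which is probably the nearest pandas gets to the look of %>%.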

Quang Hoang · Answer

Do you mean:

df_s1.loc[(df_s1.metadata == "congruent_1") | (df_s1.metadata == "incongruent_1"), "reaction"].mean()

Or simpler with isin:

df_s1.loc[df_s1.metadata.isin(["congruent_1", "incongruent_1"]), "reaction"].mean()

Out:

24500.0
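
If you want a literal pipe, pandas also has DataFrame.pipe, which passes the frame to any function as its first argument. A rough sketch, assuming a small helper of your own (keep_conditions is a hypothetical name, not part of pandas) and the four original rows without the replication step:

```python
import pandas as pd


def keep_conditions(frame, conditions):
    """Hypothetical helper: keep rows whose metadata is one of `conditions`."""
    return frame[frame["metadata"].isin(conditions)]


df_s1 = pd.DataFrame({
    "participant": ["p1", "p1", "p2", "p3"],
    "metadata": ["congruent_1", "congruent_2", "incongruent_1", "incongruent_2"],
    "reaction": [22000, 25000, 27000, 35000],
})

# .pipe hands the frame to keep_conditions; .agg plays the role of summarise()
summary = (
    df_s1
    .pipe(keep_conditions, ["congruent_1", "incongruent_1"])
    .agg({"reaction": "mean"})
)

print(summary["reaction"])
```

Here .agg returns a Series indexed by column name, so summary["reaction"] holds the mean. For a one-off filter the .loc line above is shorter; .pipe tends to pay off when the same step is reused across several chains.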

Tags: python, pandas, r, pipe, dplyr

Luis
