I have a dataframe like this:
>>>import pandas as pd
>>>df = pd.DataFrame({'x1':[20,25],'y1':[5,8],'x2':[22,27],'y2':[10,2]})
>>>df
x1 y1 x2 y2
0 20 5 22 10
1 25 8 27 2
>>>
X and Y pair together. I need to compare y1 and y2 and get the max in every row. And find the corresponding x. Hence the max of row [0] is y2 (=10), and the corresponding x is x2 (=22). The second row will be y1 (=8) and x1(=25). Expected result, new columns x and y:
x1 y1 x2 y2 x y
0 20 5 22 10 22 10
1 25 8 27 2 25 8
This is a simple dataframe I made to elaborate on the question. X and Y pairs, in my case, can be 30 pairs.
# get a hold on "y*" columns
y_cols = df.filter(like="y")
# get the maximal y-values' suffixes, and then add from front "x" to them
max_x_vals = y_cols.idxmax(axis=1).str.extract(r"(\d+)$", expand=False).radd("x")
# get the locations of those x* values
max_x_ids = df.columns.get_indexer(max_x_vals)
# now we have the indexes of x*'s in the columns; NumPy's indexing
# helps to get a cross section
df["max_xs"] = df.to_numpy()[np.arange(len(df)), max_x_ids]
# for y*'s, it's directly the maximum per row
df["max_ys"] = y_cols.max(axis=1)
to get
>>> df
x1 y1 x2 y2 max_xs max_ys
0 20 5 22 10 22 10
1 25 8 27 2 25 8
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With