The following:
import pandas as pd
q = pd.DataFrame([[1,2],[3,4]])
r = pd.DataFrame([[1,2],[5,6]], columns=['a','b'])
pd.merge(q, r, left_on=q.columns, right_on=r.columns, how='left')
raises an error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
The following doesn't:
q = pd.DataFrame([[1,2],[3,4]])
r = pd.DataFrame([[1,2],[5,6]], columns=['a','b'])
pd.merge(q, r, left_on=q.columns.tolist(), right_on=r.columns.tolist(), how='left')
Is this a bug?
It depends on what pandas considers array-like. It might also be a bug in the documentation.
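Pandas checks the type of the left_on and right_on parameters (see the _maybe_make_list function in the pandas source). Since neither is a tuple or a list (q.columns is a RangeIndex and r.columns is an Index), each one is wrapped in a single-element list, so pandas effectively evaluates:
[q.columns] == [r.columns]
rather than comparing the column labels themselves. Comparing the two lists compares the Index objects inside them, which produces a boolean array instead of a single True or False, and taking the truth value of that array raises the error.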
The documentation describes left_on as "label or list, or array-like". I couldn't find a definition of array-like in the pandas documentation, but in this case it appears to be limited to tuples and lists.
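To make the wrapping step concrete, here is a minimal pure-Python sketch of the check described above. It is not the pandas source: maybe_make_list is a hypothetical re-implementation based only on the behaviour described in this answer, as_key_list is a hypothetical caller-side helper, and range(2) merely stands in for q.columns as "a sequence that is neither a list nor a tuple".

```python
def maybe_make_list(obj):
    """Hypothetical re-implementation of the type check described above:
    anything that is not a tuple or list gets wrapped in a one-element list."""
    if obj is not None and not isinstance(obj, (tuple, list)):
        return [obj]
    return obj


def as_key_list(keys):
    """Hypothetical caller-side helper: turn any non-string iterable
    into a plain list before handing it to merge-like code."""
    if isinstance(keys, (str, bytes)) or not hasattr(keys, "__iter__"):
        return [keys]
    return list(keys)


# Stand-in for q.columns: a sequence, but not a list or tuple.
index_like = range(2)

# Wrapped as ONE key that happens to be a whole sequence.
print(maybe_make_list(index_like))               # [range(0, 2)]

# A real list passes through untouched, one key per label.
print(maybe_make_list(["a", "b"]))               # ['a', 'b']

# Converting first gives the intended per-label keys.
print(maybe_make_list(as_key_list(index_like)))  # [0, 1]
```

This is why converting the columns to a list before calling pd.merge, as in the second snippet of the question, avoids the problem: each label is then treated as its own key instead of the whole Index being treated as a single key.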