When using concat and then isin to drop all rows, I encounter:
ValueError: cannot compute isin with a duplicate axis.
If any rows are left in the DataFrame, there is no issue. Skipping concat also works in every case and returns an empty DataFrame gracefully. I'm trying to use concat because it is slightly faster in my use case.
Pandas 0.24.2 (latest).
import pandas as pd

df = pd.DataFrame()
for r in range(5):
    df = df.append({'type':'teine', 'id':r}, ignore_index=True)
# Problem line
df = pd.concat([df.reset_index(drop=True), pd.DataFrame({'type':'teine', 'id':5}, index=[0])], sort=True)
dfCopy = df.copy()
df.query("(type == 'teine')", inplace=True)
df = dfCopy[~dfCopy.isin(df).all(axis=1)]
You can avoid the duplicated-index problem by not duplicating the index in the first place, i.e. by giving the new row the next unused index label:
df = pd.DataFrame()
for r in range(5):
    df = df.append({'type':'teine', 'id':r}, ignore_index=True)
# Fixed line: the new row gets the next unused index label
df = pd.concat([df.reset_index(drop=True), pd.DataFrame({'type':'teine', 'id':5}, index=[max(df.index.values)+1])], sort=True)
dfCopy = df.copy()
df.query("(type == 'teine')", inplace=True)
df = dfCopy[~dfCopy.isin(df).all(axis=1)]
Or better yet, reset the index after concatenating:
df = pd.DataFrame()
for r in range(5):
    df = df.append({'type':'teine', 'id':r}, ignore_index=True)
# Fixed line: reset the index after concatenating
df = pd.concat([df, pd.DataFrame({'type':'teine', 'id':5}, index=[0])], sort=True).reset_index(drop=True)
dfCopy = df.copy()
df.query("(type == 'teine')", inplace=True)
df = dfCopy[~dfCopy.isin(df).all(axis=1)]
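To see why the original line triggers the error, it can help to look at the index that concat produces. The sketch below is only an illustration: the names base, extra and dup are made up here, and the frame is built directly instead of with DataFrame.append (which was removed in pandas 2.0), but the index labels end up the same as in the question.

```python
import pandas as pd

# Five rows with the default index 0..4, as produced by the append loop.
base = pd.DataFrame({"type": ["teine"] * 5, "id": range(5)})
# The extra row is given index label 0, exactly as in the problem line.
extra = pd.DataFrame({"type": "teine", "id": 5}, index=[0])

# concat keeps the original labels, so label 0 now appears twice.
dup = pd.concat([base, extra], sort=True)

labels = list(dup.index)
repeated = list(dup.index[dup.index.duplicated()])

# isin refuses to align against a frame whose index is not unique.
try:
    dup.isin(dup)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(labels)
print(repeated)
print(error_message)
```

Checking dup.index.is_unique before calling isin is a cheap way to catch this situation early.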
In the line:
df = pd.concat([df.reset_index(drop=True), pd.DataFrame({'type':'teine', 'id':5}, index=[0])], sort=True)
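there is no need to reset the index inside the concat (df.reset_index(drop=True)), because df already has a unique 0..4 index at that point. What you do need is to reset the index after the concat, since the appended row reuses label 0 and that duplicate is what causes your error. Here is what it looks like: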
df = pd.concat([df, pd.DataFrame({'type':'teine', 'id':5}, index=[0])], sort=True).reset_index(drop=True)
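As a possible alternative to chaining reset_index(drop=True), concat also accepts ignore_index=True, which renumbers the result in one step. This is not what the answers above use; it is a minimal sketch with made-up names (base, extra, fixed, matched, remaining), and it builds the frame directly rather than through the append loop.

```python
import pandas as pd

# Same data as in the question: five rows plus one extra row labelled 0.
base = pd.DataFrame({"type": ["teine"] * 5, "id": range(5)})
extra = pd.DataFrame({"type": "teine", "id": 5}, index=[0])

# ignore_index=True discards the incoming labels and assigns 0..n-1.
fixed = pd.concat([base, extra], sort=True, ignore_index=True)

# Select the rows to drop, then keep only rows that do not fully match them.
matched = fixed.query("type == 'teine'")
remaining = fixed[~fixed.isin(matched).all(axis=1)]

print(list(fixed.index))
print(len(remaining))
```

Because every row matches the query here, remaining comes back as an empty DataFrame instead of raising.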