filter dataframe rows based on length of column values

Tags:

pandas

I have a pandas dataframe as follows:

import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2], [np.nan, 1], ['test string1', 5]], columns=['A', 'B'])

df
              A  B
0             1  2
1           NaN  1
2  test string1  5

I am using pandas 0.20. What is the most efficient way to remove every row in which any column value has a length greater than 10?

len('test string1')  # 12

For the example above, I expect the following output:

df
              A  B
0             1  2
1           NaN  1

asked Jul 13 '17 19:07

D.prd

2 Answers

To filter based on column A only:

In [865]: df[~(df.A.str.len() > 10)]
Out[865]:
     A  B
0    1  2
1  NaN  1
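The negated form `~(... > 10)` matters here. The following is a minimal sketch, assuming the same mixed-type column as in the question, of why the seemingly equivalent `<= 10` test behaves differently: `.str.len()` yields NaN for cells that are not strings, and any comparison with NaN is False. The variable names are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2], [np.nan, 1], ["test string1", 5]], columns=["A", "B"])

# .str.len() gives NaN for non-string cells (the integer 1 and the NaN itself).
lengths = df["A"].str.len()

# NaN > 10 is False, so negating keeps the non-string rows.
keep_negated = df[~(lengths > 10)]

# NaN <= 10 is also False, so this form drops the non-string rows as well.
keep_direct = df[lengths <= 10]
```

With this data, `keep_negated` holds rows 0 and 1, while `keep_direct` comes back empty.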

To filter based on all columns:

In [866]: df[~df.applymap(lambda x: len(str(x)) > 10).any(axis=1)]
Out[866]:
     A  B
0    1  2
1  NaN  1
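Note that `DataFrame.applymap` was deprecated in pandas 2.1 in favour of `DataFrame.map`, so the line above may emit a warning on recent versions. Below is a sketch of one way to get the same result without either method, by casting the frame to strings and using the `.str` accessor column by column. It assumes the question's data, and `lengths` and `filtered` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2], [np.nan, 1], ["test string1", 5]], columns=["A", "B"])

# Convert every cell to its string form, then measure lengths per column.
lengths = df.astype(str).apply(lambda col: col.str.len())

# Keep only the rows in which no cell is longer than 10 characters.
filtered = df[~(lengths > 10).any(axis=1)]
```

As with the `applymap` version, this measures the text representation of each cell, so numbers are judged by how many characters they print as.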

answered Sep 22 '22 09:09

Zero

I had to cast to a string for Diego's answer to work:

df = df[df['A'].apply(lambda x: len(str(x)) <= 10)]
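One caveat with casting: `len(str(x))` measures the printed form of every value, so a long float can be dropped even though it is not a string. If only real strings should count, a type check is one option. This is just a sketch on made-up data, and `too_long` is a hypothetical helper, not part of pandas.

```python
import pandas as pd

# Illustrative data: a short string, a float that prints as 14 characters,
# and a string longer than 10 characters.
df = pd.DataFrame({"A": ["short", 1234567.891011, "a much longer string"]})

def too_long(value, limit=10):
    """Return True only for actual strings longer than `limit` characters."""
    return isinstance(value, str) and len(value) > limit

# Cast-based filter: the float is removed because its text form is too long.
cast_based = df[df["A"].apply(lambda x: len(str(x)) <= 10)]

# Type-aware filter: the float is kept, only the long string is removed.
type_aware = df[~df["A"].apply(too_long)]
```

Which behaviour is right depends on whether numeric values are meant to be subject to the length limit.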

answered Sep 25 '22 09:09

Elizabeth
