
How to limit rows in pandas dataframe?

Tags:

python

pandas

How can I limit the number of rows in a pandas DataFrame in Python? I need to keep only the last 1000 rows and delete the rest. For example, 1000 rows in the DataFrame should become 1000 rows in the CSV.

I tried df.iloc[:1000]

I need the DataFrame to be trimmed automatically so that only the last 1000 rows are kept and saved.

asked Oct 26 '25 08:10 by pozhilou


1 Answer

With df.iloc[:1000] you get the first 1000 rows.

Since you want the last 1000 rows, change the slice to start from the end: df_last_1000 = df.iloc[-1000:]

To save it as a CSV file, you can use pandas' to_csv() method: df_last_1000.to_csv("last_1000.csv")
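Putting the two steps together, a minimal sketch might look like the following. The example data and the column name "value" are made up for illustration, and index=False is an optional choice that leaves the row labels out of the CSV. The sketch writes to an in-memory buffer so it has no side effects; pass a path such as "last_1000.csv" to write to disk instead.

```python
import io
import pandas as pd

# Hypothetical example data: 2500 rows in a single column.
df = pd.DataFrame({"value": range(2500)})

# Keep only the last 1000 rows; rebinding df drops the reference to the older rows.
df = df.iloc[-1000:]

# Write the trimmed frame as CSV. A StringIO buffer stands in for a file here.
buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False omits the row-label column

csv_text = buffer.getvalue()
print(len(df))
```

If the DataFrame has fewer than 1000 rows, the slice simply returns all of them, so no length check is needed before trimming.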


Update - Speed Comparison:

Both .tail(1000) and .iloc[-1000:, :] return the last 1000 rows, so let's compare their performance:

import pandas as pd
import numpy as np
import timeit

# One million rows of random floats in five columns
df = pd.DataFrame(np.random.rand(1000000, 5), columns=['A', 'B', 'C', 'D', 'E'])

def tail_operation():
    _ = df.tail(1000)

def iloc_operation():
    _ = df.iloc[-1000:, :]

# timeit returns the total time for all 1000 calls, not the time per call
tail_time = timeit.timeit(tail_operation, number=1000)
iloc_time = timeit.timeit(iloc_operation, number=1000)

print(f"Execution time for tail operation: {tail_time} seconds")
print(f"Execution time for iloc operation: {iloc_time} seconds")

Execution time for tail operation: 0.0280200999986846 seconds

Execution time for iloc operation: 0.07651790000090841 seconds
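These timings come from a single run and will vary with the machine and pandas version. As a sanity check that the two calls really select the same rows, a small sketch along these lines could be used (the frame size of 5000 rows is arbitrary, chosen only to keep the check fast):

```python
import numpy as np
import pandas as pd

# Smaller frame than in the benchmark; the size is arbitrary.
df = pd.DataFrame(np.random.rand(5000, 5), columns=["A", "B", "C", "D", "E"])

via_tail = df.tail(1000)
via_iloc = df.iloc[-1000:, :]

# DataFrame.equals compares shape, labels, dtypes and values.
same = via_tail.equals(via_iloc)
print(same)  # True
```

Since both return identical results, the choice between them is mostly a matter of readability.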

answered Oct 28 '25 22:10 by DataJanitor


