 

Python Dataframe - iloc every nth rows

Tags:

python

pandas

I have a dataframe `Codes` with 3 columns and 36 rows.

I have successfully used iloc to select every nth row in my dataframe. The code below correctly allocates every 12th row, starting at row 0 for James, row 4 for Steve and row 8 for Gary, leaving each person with 3 rows.

James = Codes.iloc[0::12, :]
Steve = Codes.iloc[4::12, :]
Gary = Codes.iloc[8::12, :]

Going on from this, how can I amend my code to allocate the rows in between? That is, the first 4 rows (0 to 3) for James, the second 4 rows (4 to 7) for Steve and the third 4 rows (8 to 11) for Gary, repeating every 12 rows. Intuitively it should look like this:

James = Codes.iloc[0:3:12, :]
Steve = Codes.iloc[4:7:12, :]
Gary = Codes.iloc[8:11:12, :]

But this selects only 1 row per name, because each slice stops (at 3 for James, 7 for Steve and 11 for Gary) before the step of 12 can reach the next block. Working correctly, it would split my dataframe of 36 rows between the 3 people, giving each 12 rows in blocks of 4.
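(To illustrate why: the slice `0:3:12` stops at index 3 before a step of 12 can reach a second block, so only row 0 is selected. A quick sketch on a toy 36-row dataframe:)

```python
import pandas as pd

Codes = pd.DataFrame({"col": range(36)})  # toy stand-in for the real dataframe

# start=0, stop=3, step=12: the only index in [0, 3) reachable in steps of 12 is 0
print(len(Codes.iloc[0:3:12]))  # 1
```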

This script does the job:

James = Codes.iloc[[0,1,2,3, 12,13,14,15, 24,25,26,27]]
Steve = Codes.iloc[[4,5,6,7, 16,17,18,19, 28,29,30,31]]
Gary = Codes.iloc[[8,9,10,11, 20,21,22,23, 32,33,34,35]]

But it is long-winded and won't scale to the bigger datasets I plan to apply it to.
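(For what it's worth, NumPy's `np.r_` can concatenate slice ranges into a single index array, which expresses the same explicit lists more compactly. A sketch on a toy 36-row dataframe standing in for `Codes`:)

```python
import numpy as np
import pandas as pd

Codes = pd.DataFrame({"col": range(36)})  # toy stand-in

# np.r_ concatenates the slices into one flat index array
James = Codes.iloc[np.r_[0:4, 12:16, 24:28]]
Steve = Codes.iloc[np.r_[4:8, 16:20, 28:32]]
Gary = Codes.iloc[np.r_[8:12, 20:24, 32:36]]
print(list(James.index))  # [0, 1, 2, 3, 12, 13, 14, 15, 24, 25, 26, 27]
```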

Asked Dec 05 '25 by James


2 Answers

You can add a column called `name` to your dataframe and fill it with the name each row should be assigned to.

from itertools import cycle, islice

# repeat the 12-name pattern down the length of the dataframe
name_pattern = ["James"]*4 + ["Steve"]*4 + ["Gary"]*4
Codes["name"] = list(islice(cycle(name_pattern), Codes.shape[0]))

The last line repeats the pattern to cover the number of rows in the dataframe. If you still want these rows saved into separate dataframes, you can then do:

James = Codes[Codes["name"] == "James"]
Steve = Codes[Codes["name"] == "Steve"]
Gary = Codes[Codes["name"] == "Gary"]
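As a quick sanity check, assuming a toy 36-row dataframe in place of `Codes`, this assignment gives each person 12 rows in blocks of 4:

```python
from itertools import cycle, islice
import pandas as pd

Codes = pd.DataFrame({"col": range(36)})  # toy stand-in

name_pattern = ["James"] * 4 + ["Steve"] * 4 + ["Gary"] * 4
Codes["name"] = list(islice(cycle(name_pattern), Codes.shape[0]))

James = Codes[Codes["name"] == "James"]
print(len(James))             # 12
print(list(James.index[:5]))  # [0, 1, 2, 3, 12]
```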
Answered Dec 08 '25 by collinb9


One possible approach is to first get every nth row (in your case, every 12th) using `range` or numpy's `arange`, then use a list comprehension to take each such row plus the next 3 (4 rows per block). Note the 4 in the code below: both `range` and `arange` exclude the stop value.

Some of the generated indexes can fall outside the range of rows in your dataframe, so it's important to filter them out of the index list.

Using only built-in functions:

i = 0  # starting position
arange = range(i, df.shape[0] + 1, 12)  # every 12th index, starting from `i`
idx = sum([list(range(j, j + 4)) for j in arange], [])  # each start index plus the next 3
idx = list(filter(lambda x: x < df.shape[0], idx))  # drop indexes past the last row
df.iloc[idx]

Using numpy:

import numpy as np

i = 0  # starting position
arange = np.arange(i, df.shape[0] + 1, 12)  # every 12th index, starting from `i`
idx = np.array([list(range(j, j + 4)) for j in arange]).flatten()  # each start index plus the next 3
idx = idx[idx < df.shape[0]]  # drop indexes past the last row (strict <, or iloc can go out of bounds)
df.iloc[idx]

where `i` is your start position (0 for James, 4 for Steve, and 8 for Gary) and `df` is your dataframe (e.g. `Codes`).
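For example, with a toy 36-row dataframe in place of `Codes` and `i = 4` (Steve's starting position), the computed indexes come out as:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col": range(36)})  # toy stand-in for Codes
i = 4                                  # Steve's starting position
arange = np.arange(i, df.shape[0] + 1, 12)
idx = np.array([list(range(j, j + 4)) for j in arange]).flatten()
idx = idx[idx < df.shape[0]]           # strict < keeps iloc in bounds
print(idx.tolist())  # [4, 5, 6, 7, 16, 17, 18, 19, 28, 29, 30, 31]
```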


To make the code scalable, you can put it inside a function, like this:

def slice_df(df, startpos, gsize, nth):
    """
    Slice the data based on start position, group size and nth row
    df : pandas DataFrame
    startpos : int
        start position for the target person's first block
    gsize : int
        number of rows per block
    nth : int
        take a block every nth rows
    """
    arange = range(startpos, df.shape[0] + 1, nth)
    idx = sum([list(range(i, i + gsize)) for i in arange], [])
    idx = list(filter(lambda x: x < df.shape[0], idx))
    return df.iloc[idx]

then call:

James = slice_df(df, 0, 4, 12)
Steve = slice_df(df, 4, 4, 12)
Gary = slice_df(df, 8, 4, 12)
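A quick check on a toy 36-row dataframe (standing in for `Codes`) confirms the split: each person gets 12 rows in blocks of 4.

```python
import pandas as pd

def slice_df(df, startpos, gsize, nth):
    """Take gsize-row blocks every nth rows, starting at startpos."""
    starts = range(startpos, df.shape[0] + 1, nth)
    idx = sum([list(range(i, i + gsize)) for i in starts], [])
    idx = [x for x in idx if x < df.shape[0]]  # drop indexes past the last row
    return df.iloc[idx]

df = pd.DataFrame({"col": range(36)})  # toy stand-in for Codes

James = slice_df(df, 0, 4, 12)
Steve = slice_df(df, 4, 4, 12)
Gary = slice_df(df, 8, 4, 12)

print(len(James), len(Steve), len(Gary))  # 12 12 12
print(list(Gary.index))  # [8, 9, 10, 11, 20, 21, 22, 23, 32, 33, 34, 35]
```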
Answered Dec 08 '25 by Cainã Max Couto-Silva


