Pandas Insert a new row after every nth row

I have a DataFrame that looks like this:

 L_Type     L_ID    C_Type      E_Code
    0       1           1         9
    0       1           2         9
    0       1           3         9
    0       1           4         9
    0       2           1         2
    0       2           2         2
    0       2           3         2
    0       2           4         2
    0       3           1         3
    0       3           2         3
    0       3           3         3
    0       3           4         3
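
For anyone who wants to reproduce the problem, the table above can be built roughly like this. It is a minimal sketch that assumes all four columns hold plain integers, which the table itself does not state:

```python
import pandas as pd

# Sample data mirroring the table above: three L_ID groups of four rows each.
# Assumption: every column is an integer column.
df = pd.DataFrame({
    "L_Type": [0] * 12,
    "L_ID":   [1] * 4 + [2] * 4 + [3] * 4,
    "C_Type": [1, 2, 3, 4] * 3,
    "E_Code": [9] * 4 + [2] * 4 + [3] * 4,
})
print(df)
```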

I need to insert a new row after every 4th row. The new row should keep the values of the first two columns, increment the value in the third column (C_Type) by 1, and have no value in the last column, as in the table below:

 L_Type     L_ID    C_Type          E_Code
    0       1           1           9
    0       1           2           9
    0       1           3           9
    0       1           4           9
    0       1           5           
    0       2           1           2
    0       2           2           2
    0       2           3           2
    0       2           4           2
    0       2           5           
    0       3           1           3
    0       3           2           3
    0       3           3           3
    0       3           4           3
    0       3           5           

I have searched other threads but could not figure out the exact solution:

How to insert n DataFrame to another every nth row in Pandas?

Insert new rows in pandas dataframe

Baig asked Sep 07 '25 10:09


1 Answer

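You can select every 4th row by slicing, add 1 to column C_Type, and add 0.5 to the index so that each new row sorts directly after the row it was copied from. The fractional index keeps the result 100% correct: the default sorting method of DataFrame.sort_index is quicksort, which is not stable, so rows with duplicated index values would not be guaranteed to keep their order. Finally, join both frames with concat, sort the index, and create a default index with DataFrame.reset_index and drop=True: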

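import numpy as np
import pandas as pd

# make sure C_Type is numeric so it can be incremented
df['C_Type'] = df['C_Type'].astype(int)

# every 4th row (positions 3, 7, 11, ...): C_Type + 1, empty E_Code, index shifted by 0.5
df2 = (df.iloc[3::4]
         .assign(C_Type=lambda x: x['C_Type'] + 1, E_Code=np.nan)
         .rename(lambda x: x + .5))
# append, sort so each new row follows its source row, then rebuild the default index
df1 = pd.concat([df, df2], sort=False).sort_index().reset_index(drop=True)
print(df1)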
    L_Type  L_ID  C_Type  E_Code
0        0     1       1     9.0
1        0     1       2     9.0
2        0     1       3     9.0
3        0     1       4     9.0
4        0     1       5     NaN
5        0     2       1     2.0
6        0     2       2     2.0
7        0     2       3     2.0
8        0     2       4     2.0
9        0     2       5     NaN
10       0     3       1     3.0
11       0     3       2     3.0
12       0     3       3     3.0
13       0     3       4     3.0
14       0     3       5     NaN
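
If the same pattern is needed for a different n or other columns, the steps can be wrapped in a small function. The sketch below is only an illustration: the name insert_after_every_nth and its parameters are hypothetical, and it assumes the input has a default integer index (0, 1, 2, ...). The sample frame is rebuilt from the question's table so the block runs on its own.

```python
import numpy as np
import pandas as pd


def insert_after_every_nth(df, n, counter_col, blank_cols):
    """Return a new frame with one extra row after every n-th row.

    Hypothetical helper: each extra row copies the n-th row, adds 1 to
    counter_col, and sets every column listed in blank_cols to NaN.
    Assumes df has a default integer index (0, 1, 2, ...).
    """
    # Rows at positions n-1, 2n-1, ...; assign() returns a new frame.
    extra = df.iloc[n - 1::n].assign(**{col: np.nan for col in blank_cols})
    extra[counter_col] = extra[counter_col] + 1
    # A fractional index places each extra row right after its source row.
    extra.index = extra.index + 0.5
    return pd.concat([df, extra]).sort_index().reset_index(drop=True)


# Sample data taken from the question's table (integer columns assumed).
df = pd.DataFrame({
    "L_Type": [0] * 12,
    "L_ID":   [1] * 4 + [2] * 4 + [3] * 4,
    "C_Type": [1, 2, 3, 4] * 3,
    "E_Code": [9] * 4 + [2] * 4 + [3] * 4,
})

out = insert_after_every_nth(df, n=4, counter_col="C_Type", blank_cols=["E_Code"])
print(out)
```

Note that if the number of rows is not a multiple of n, the trailing incomplete block gets no extra row, and any blanked integer column becomes float because NaN cannot be stored in an integer column.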
jezrael answered Sep 09 '25 23:09