
How to replicate rows of a dataframe a fixed number of times?

Tags: python, pandas

I want to replicate the rows of a dataframe in preparation for adding a column. The dataframe contains a Year column, and I want to add a column of months. The idea is to replicate each year's rows exactly 12 times and then add a column holding the fixed values 1-12. My code is the following:

    import pandas as pd

    all_years = dataframe["Year"].unique().tolist()
    new_dataset = pd.DataFrame()
    for year in all_years:
        # Keep only this year's rows, then stack 12 copies of them
        rows_dataframe = pd.concat(
            [dataframe.where(dataframe["Year"] == year).dropna()] * 12,
            ignore_index=True)
        # Prepend this year's block to the rows collected so far
        new_dataset = pd.concat([rows_dataframe, new_dataset], ignore_index=True)

The results are correct, but can I avoid the for loop here and implement this in a more idiomatic pandas way?

EDIT: the expected result for one year (here 2012) is shown below. Note that the Months column is not added by my code; I included it here to show the final output.

+-------+--------+---------+
| Years | Months | SomeCol |
+-------+--------+---------+
| 2011  | 12     | val1    |
+-------+--------+---------+
| 2012  | 1      | val1    |
+-------+--------+---------+
| 2012  | 2      | val1    |
+-------+--------+---------+
| 2012  | 3      | val1    |
+-------+--------+---------+
| 2012  | 4      | val1    |
+-------+--------+---------+
| 2012  | 5      | ...     |
+-------+--------+---------+
| 2012  | 6      | ...     |
+-------+--------+---------+
| 2012  | 7      | val1    |
+-------+--------+---------+
| 2012  | 8      | val1    |
+-------+--------+---------+
| 2012  | 9      | val1    |
+-------+--------+---------+
| 2012  | 10     |         |
+-------+--------+---------+
| 2012  | 11     |         |
+-------+--------+---------+
| 2012  | 12     |         |
+-------+--------+---------+
| 2013  | 1      | ...     |
+-------+--------+---------+
Sam asked Sep 06 '25 08:09



1 Answer

Use a combination of pd.DataFrame.loc and pd.Index.repeat:

dataframe = dataframe.loc[dataframe.index.repeat(12)].reset_index(drop=True)
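
To also get the months column from the question, one option is to fill it with numpy.tile after repeating the rows. This is only a sketch under a few assumptions: the sample data, the `expanded` and `n_months` names, and the "Month" column name are made up for illustration, and the dataframe is assumed to have unique index labels.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names only mirror the question
dataframe = pd.DataFrame({
    "Year": [2011, 2012, 2013],
    "SomeCol": ["val1", "val2", "val3"],
})

n_months = 12

# Repeat every row 12 times; the copies of each row stay adjacent
expanded = dataframe.loc[dataframe.index.repeat(n_months)].reset_index(drop=True)

# Cycle 1..12 once per original row so each copy gets its own month
expanded["Month"] = np.tile(np.arange(1, n_months + 1), len(dataframe))

print(expanded.head(13))
```

If the index contains duplicate labels, .loc would return more rows than intended, so in that case it is probably safer to call reset_index(drop=True) before repeating.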
Shirin Yavari answered Sep 10 '25 05:09
