
Incremental id based on another column's value

Tags:

pandas

From this DataFrame:

car_id    month
93829     September
27483     April
48372     October
93829     December
93829     March
48372     February
27483     March

How can I add a third column that holds a new, incremental id for each car, like this:

car_id    month        new_incremental_car_id
93829     September    0
27483     April        1
48372     October      2
93829     December     0
93829     March        0
48372     February     2
27483     March        1

Currently I'm doing it by using groupby('car_id') to create a new DataFrame, adding an incremental column to it, and then joining it back to the original DataFrame with car_id as the join key.

Is there a less cumbersome, more direct method to achieve this goal?


EDIT

The code I'm currently using:

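import pandas as pd

# Build a lookup table with one row per distinct car_id
cars_id = pd.DataFrame(list(car_sales.groupby('car_id')['car_id'].groups))
cars_id['car_short_id'] = cars_id.index
cars_id.set_index(0, inplace=True)
# join returns a new DataFrame, so the result has to be assigned
car_sales = car_sales.join(cars_id, on='car_id', how='left')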
Jivan asked Mar 12 '26 23:03



1 Answer

Apart from pd.factorize, you can map a dict constructed from the unique values:

In [959]: df.car_id.map({x: i for i, x in enumerate(df.car_id.unique())})
Out[959]:
0    0
1    1
2    2
3    0
4    0
5    2
6    1
Name: car_id, dtype: int64

Or use the category dtype and its codes. Note that the codes follow the sorted order of the values, not the order of first appearance:

In [954]: df.car_id.astype('category').cat.codes
Out[954]:
0    2
1    0
2    1
3    2
4    2
5    1
6    0
dtype: int8
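
For completeness, here is a minimal sketch of the pd.factorize approach mentioned at the top of this answer. It assumes the question's DataFrame is named car_sales (the snippets above call it df) and reuses the column name new_incremental_car_id from the desired output in the question.

```python
import pandas as pd

# Sample data copied from the question
car_sales = pd.DataFrame({
    "car_id": [93829, 27483, 48372, 93829, 93829, 48372, 27483],
    "month": ["September", "April", "October", "December",
              "March", "February", "March"],
})

# pd.factorize returns (codes, uniques); codes are numbered in
# order of first appearance, which matches the requested output
codes, uniques = pd.factorize(car_sales["car_id"])
car_sales["new_incremental_car_id"] = codes

print(car_sales)
```

Like the dict-based map above, this numbers the ids in order of first appearance (0, 1, 2, 0, 0, 2, 1), and it does so without building an intermediate DataFrame or joining.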
Zero answered Mar 16 '26 04:03



