From this DataFrame:
car_id month
93829 September
27483 April
48372 October
93829 December
93829 March
48372 February
27483 March
How can I add a third column containing a new, incremental ID for each car, like this:
car_id month new_incremental_car_id
93829 September 0
27483 April 1
48372 October 2
93829 December 0
93829 March 0
48372 February 2
27483 March 1
Currently I use groupby('car_id') to build a new DataFrame, add an incremental column to it, and then join it back to the original DataFrame on car_id.
Is there a less cumbersome, more direct method to achieve this goal?
EDIT
The code I'm currently using:
cars_id = pd.DataFrame(list(car_sales.groupby('car_id')['car_id'].groups))
cars_id['car_short_id'] = cars_id.index
cars_id.set_index(0, inplace=True)
car_sales.join(cars_id, on='car_id', how='left')
Apart from pd.factorize, you can map a dict constructed from the unique values:
In [959]: df.car_id.map({x: i for i, x in enumerate(df.car_id.unique())})
Out[959]:
0 0
1 1
2 2
3 0
4 0
5 2
6 1
Name: car_id, dtype: int64
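For comparison, the pd.factorize route mentioned above can be sketched roughly as follows. The DataFrame is rebuilt here from the sample data in the question, and the column name new_incremental_car_id is taken from the desired output; treat this as an illustration rather than the only way to do it.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "car_id": [93829, 27483, 48372, 93829, 93829, 48372, 27483],
    "month": ["September", "April", "October", "December",
              "March", "February", "March"],
})

# factorize returns integer codes (numbered in order of first appearance)
# together with the unique values those codes refer to.
codes, uniques = pd.factorize(df["car_id"])
df["new_incremental_car_id"] = codes

print(df)
```

Like the dict-based map, this numbers the IDs in the order they first appear, so 93829 gets 0, 27483 gets 1 and 48372 gets 2.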
Or use the category dtype and its codes, although the codes then follow the sorted order of the IDs rather than the order of first appearance:
In [954]: df.car_id.astype('category').cat.codes
Out[954]:
0 2
1 0
2 1
3 2
4 2
5 1
6 0
dtype: int8
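If the category approach is preferred but the codes should match the order of first appearance, one possible sketch is to pass the categories explicitly. The variable names below (car_id, sorted_codes, appearance_codes) are made up for this example, and the data is the sample from the question.

```python
import pandas as pd

# car_id values copied from the question's sample data.
car_id = pd.Series([93829, 27483, 48372, 93829, 93829, 48372, 27483],
                   name="car_id")

# Default behaviour: categories are sorted, so codes follow sorted ID order.
sorted_codes = car_id.astype("category").cat.codes

# Passing the categories in order of first appearance makes the codes
# follow that order instead.
ordered = pd.Categorical(car_id, categories=car_id.unique())
appearance_codes = pd.Series(ordered.codes, index=car_id.index)

print(sorted_codes.tolist())
print(appearance_codes.tolist())
```

With the categories given explicitly, the result lines up with what map and pd.factorize produce for this data.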