I have two CSV files. Depending on the value of a cell in CSV file 1, I need to look that value up in a column of CSV file 2 and get the corresponding value from another column of CSV file 2. I am sorry if this is confusing; an illustration should make it clearer.
CSV file 1
Car Mileage
A 8
B 6
C 10
CSV file 2
Score Mileage(Min) Mileage(Max)
1 1 3
2 4 6
3 7 9
4 10 12
5 13 15
And my desired output CSV file is something like this
Car Mileage Score
A 8 3
B 6 2
C 10 4
Car A is given a score of 3 because its mileage, 8, falls in the range 7 to 9 in CSV file 2, and the score for that range is 3. Any help will be appreciated. Thanks in advance.
You can do this with pandas. As of writing this, the current stable release is v0.21.
To read your files, use pd.read_csv:
import pandas as pd
df0 = pd.read_csv('file1.csv')
df1 = pd.read_csv('file2.csv')
df0
Car Mileage
0 A 8
1 B 6
2 C 10
df1
Score Mileage(Min) Mileage(Max)
0 1 1 3
1 2 4 6
2 3 7 9
3 4 10 12
4 5 13 15
To find the Score, use pd.IntervalIndex by calling IntervalIndex.from_tuples. This should be really fast:
v = df1.loc[:, 'Mileage(Min)':'Mileage(Max)'].apply(tuple, 1).tolist()
idx = pd.IntervalIndex.from_tuples(v, closed='both') # you can also use `from_arrays`
# get_indexer returns row positions in df1, so select the column first and index it positionally
df0['Score'] = df1['Score'].iloc[idx.get_indexer(df0.Mileage.values)].values
df0
Car Mileage Score
0 A 8 3
1 B 6 2
2 C 10 4
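One thing worth knowing is that get_indexer returns -1 for a value that falls in no interval, and -1 used as a position silently picks the last row. Below is a minimal, self-contained sketch of the same lookup that masks such cases; the in-memory frames stand in for the two CSV files, and car D with mileage 99 is a made-up row added only to show the unmatched case.

```python
import pandas as pd

# Stand-ins for file1.csv and file2.csv (car D is hypothetical, for illustration)
cars = pd.DataFrame({"Car": ["A", "B", "C", "D"], "Mileage": [8, 6, 10, 99]})
bands = pd.DataFrame({
    "Score": [1, 2, 3, 4, 5],
    "Mileage(Min)": [1, 4, 7, 10, 13],
    "Mileage(Max)": [3, 6, 9, 12, 15],
})

# Build the intervals directly from the two bound columns
idx = pd.IntervalIndex.from_arrays(
    bands["Mileage(Min)"].values, bands["Mileage(Max)"].values, closed="both"
)

# Position of the matching interval for each mileage, or -1 if none matches
pos = idx.get_indexer(cars["Mileage"].values)

# Look up scores by position, then blank out the rows that had no match
scores = pd.Series(bands["Score"].values[pos], index=cars.index)
cars["Score"] = scores.where(pos >= 0)
```

Because the unmatched row becomes NaN, the Score column is stored as floats here; that is a side effect of this sketch, not of the lookup itself.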
Other methods of creating an IntervalIndex, such as from_arrays and from_breaks, are covered in the pandas documentation.
To write your result, use pd.DataFrame.to_csv:
df0.to_csv('file3.csv', index=False)  # index=False leaves out the row index, matching the desired output
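By default, to_csv also writes the row index as an unnamed first column; passing index=False omits it. A small sketch of the difference, using a hand-built frame in place of df0 and returning the CSV text as a string instead of writing a file:

```python
import pandas as pd

# Hand-built copy of the result frame, for illustration only
result = pd.DataFrame({"Car": ["A", "B", "C"],
                       "Mileage": [8, 6, 10],
                       "Score": [3, 2, 4]})

# With no path given, to_csv returns the CSV text
with_index = result.to_csv()
without_index = result.to_csv(index=False)

print(with_index.splitlines()[0])     # header starts with an empty index column
print(without_index.splitlines()[0])  # header is just the three column names
```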
Here's a high-level outline of what I've done here.
1. Use pd.IntervalIndex to build an interval index tree. So, searching is now logarithmic in complexity.
2. Use idx.get_indexer to find the index of each value in the tree.
3. Use that index to locate the Score value in df1, and assign this back to df0. Note that I call .values; otherwise, the values will be misaligned when assigning back.
For more information on IntervalIndex, take a look at this SO Q/A: Finding matching interval(s) in pandas Intervalindex
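To make the point about .values concrete, here is a small sketch of what happens without it. The positions list is written out by hand to match what get_indexer gives for mileages 8, 6 and 10, and the two new column names are made up for the comparison.

```python
import pandas as pd

cars = pd.DataFrame({"Car": ["A", "B", "C"], "Mileage": [8, 6, 10]})
bands = pd.DataFrame({"Score": [1, 2, 3, 4, 5]})

# Row positions in bands for mileages 8, 6 and 10 (hand-written here)
positions = [2, 1, 3]

# The selected Series keeps the labels 2, 1, 3 from bands
picked = bands["Score"].iloc[positions]

# Assigning the Series aligns on labels: row 0 finds no label 0 and gets NaN
cars["by_label"] = picked

# Assigning the raw array ignores labels and fills rows in order
cars["by_position"] = picked.values
```

Only the by_position column holds the intended scores (3, 2, 4); by_label ends up with a missing value for car A and shifted scores for the other two.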
Note that IntervalIndex is new in v0.20, so if you have an older version, make sure you update it with
pip install --upgrade pandas