Merging two CSV files by a common column in Python

Tags: python, csv

I am trying to merge two CSV files on a common id column and write the result to a new file. I have tried the following, but it gives me an error:

import csv
from collections import OrderedDict

filenames = "stops.csv", "stops2.csv"
data = OrderedDict()
fieldnames = []
for filename in filenames:
    with open(filename, "rb") as fp:  # python 2
        reader = csv.DictReader(fp)
        fieldnames.extend(reader.fieldnames)
        for row in reader:
            data.setdefault(row["stop_id"], {}).update(row)

fieldnames = list(OrderedDict.fromkeys(fieldnames))
with open("merged.csv", "wb") as fp:
    writer = csv.writer(fp)
    writer.writerow(fieldnames)
    for row in data.itervalues():
        writer.writerow([row.get(field, '') for field in fieldnames])

Both files have the "stop_id" column, but I get this error back: KeyError: 'stop_id'

Any help would be much appreciated.

Thanks

sgpbyrne asked Jan 28 '26 14:01


1 Answer

Here is an example using pandas (Python 3 syntax):

from io import StringIO
import pandas as pd

# Semicolon-separated sample data standing in for the first file.
TESTDATA = StringIO("""DOB;First;Last
2016-07-26;John;smith
2016-07-27;Mathew;George
2016-07-28;Aryan;Singh
2016-07-29;Ella;Gayau
""")

list1 = pd.read_csv(TESTDATA, sep=";")

# Second file: same kind of data, but with different column names.
TESTDATA = StringIO("""Date of Birth;Patient First Name;Patient Last Name
2016-07-26;John;smith
2016-07-27;Mathew;XXX
2016-07-28;Aryan;Singh
2016-07-20;Ella;Gayau
""")

list2 = pd.read_csv(TESTDATA, sep=";")

print(list2)
print(list1)

# how='left' keeps every row of list1; dropna() then discards the rows
# that found no match in list2, leaving only the rows common to both.
common = pd.merge(
    list1, list2, how='left',
    left_on=['Last', 'First', 'DOB'],
    right_on=['Patient Last Name', 'Patient First Name', 'Date of Birth'],
).dropna()
print(common)
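
For the stop_id files in the question, the csv-module approach can also be kept. A KeyError: 'stop_id' raised by row["stop_id"] means that no parsed header is exactly stop_id. The files are not shown, so the cause is a guess: possible candidates are a UTF-8 byte order mark at the start of the file, stray spaces around the header names, or a delimiter other than a comma. Below is a minimal Python 3 sketch that normalises the header names before merging. The sample data and the merge_on_key helper are hypothetical, made up for illustration.

```python
import csv
import io


def merge_on_key(sources, key="stop_id"):
    """Merge rows from several CSV text streams on a shared key column.

    Hypothetical helper: returns (fieldnames, rows), where rows are dicts.
    """
    data = {}        # key value -> merged row (dicts keep insertion order)
    fieldnames = []  # union of all column names, in first-seen order
    for fp in sources:
        reader = csv.DictReader(fp)
        # Normalise header names: drop a UTF-8 BOM and surrounding spaces.
        reader.fieldnames = [
            name.lstrip("\ufeff").strip() for name in reader.fieldnames
        ]
        if key not in reader.fieldnames:
            # Show what was actually parsed, which usually reveals the cause.
            raise KeyError(
                "%r not found; columns are %r" % (key, reader.fieldnames)
            )
        for name in reader.fieldnames:
            if name not in fieldnames:
                fieldnames.append(name)
        for row in reader:
            data.setdefault(row[key], {}).update(row)
    return fieldnames, list(data.values())


# Made-up sample data: the first header carries a BOM and a stray space.
stops = io.StringIO("\ufeffstop_id, stop_name\n1,Main St\n2,Oak Ave\n")
stops2 = io.StringIO("stop_id,zone\n1,A\n3,C\n")

fieldnames, rows = merge_on_key([stops, stops2])

# Write the merged rows; columns missing from a row are left empty.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames, restval="", extrasaction="ignore")
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

With real files, open each one with open(filename, newline="", encoding="utf-8-sig") and pass the file objects in place of the StringIO streams; the utf-8-sig codec removes a leading byte order mark if there is one. If the KeyError raised by the helper lists a single column containing semicolons or tabs, the delimiter is the likely culprit, and delimiter=";" (or "\t") can be passed to csv.DictReader.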

Shijo answered Jan 31 '26 03:01


