Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Determining Data Type From Value in Row of CSV File

I wrote a Python program that reads CSV files and spits them out as insert statements for MySQL. Now I want an additional program that is able to form the CREATE statements for each new table using the column headers. Due to MySQL formatting I need to be able to determine the type of each value in a column similar to the type() function in shell. My idea is to read the second row of a CSV file, after the column headers, and pull out values as separated by a , (i.e. one value from each column). Then take that value and run it through the type() function and return int, str, float, etc.. Later I will do some if statements with the value returned by type() to append the proper data type to the end of the column header for the CREATE statement.

So far I have managed to get as far as pulling each value out of a compiled list of the first row and printing the type() that it is. Only problem is that everything in a list is treated as a string so the only return is <class 'str'>. If I knew what type of value it was I would be able to convert from string to the correct type, but that defeats the whole purpose of the program. I am sure there is a better way to do it that doesn't spit the first row out as a list of strings, but I am not sure how to do that. Here is what I have so far as well as ideas with how to continue:

import csv, os

path = 'C:/Users/user/Desktop/file/test/'
for file in os.listdir(path):
    if file.endswith('.csv'):
        with open(path +file) as inFile:
            with open(path + file[:-4] + ".txt", "w") as outFile:
                csvFile = csv.reader(inFile)
                columnHeader = next(csvFile)
                firstRow = next(csvFile)
                i = 0
                for value in firstRow:
                    valueType = (type(value))
                    for header in columnHeader:
                        if valueType = class 'str': # Don't think this formatting is correct
                            columnHeader.append(' varchar (255)')
                        if valueType = class 'int':
                            columnHeader.append(' int')
                        if valueType = class 'float':
                            columnHeader.append(' float')

I have also looked into using this library but would prefer to do it myself.

Are the methods I chose to do this plausible/possible? Are there any better ways you can think of to do it? If so, how?

like image 608
ThoseKind Avatar asked Oct 14 '25 21:10

ThoseKind


1 Answers

This would be the simple way to do it:

def find_type(a):
    try:
        var_type = type(int(a))
    except ValueError:
        try:
            var_type = type(float(a))
        except ValueError:
            var_type = type(a)
    return var_type

a = ['123123', '11.21', 'Some Bank', '11/2/1995']

print([find_type(x) for x in a])
#[<class 'int'>, <class 'float'>, <class 'str'>, <class 'str'>]

I am pretty sure that there are more elegant ways though.

like image 92
Ma0 Avatar answered Oct 17 '25 11:10

Ma0