We are using the Java API to load a CSV file into Google BigQuery. Is there a way to detect the columns on load and auto-select the appropriate schema type?
For example, if a specific column contains only float values, BigQuery would assign that column the FLOAT type; if it is non-numeric, it would assign STRING. Is there a method to do this?
The roundabout way is to assign every column the STRING type by default when loading the CSV.
Then run a query on each column:
SELECT COUNT(columnname) - COUNT(FLOAT(columnname)) FROM dataset.table
(assuming I am only interested in isolating the columns that hold float values, which I can then use for math functions from my application).
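For reference, here is a minimal sketch of that all-STRING load using the google-cloud-bigquery Java client; the dataset, table, column names, and the gs://bucket/data.csv URI are placeholders, not values from the question:

import com.google.cloud.bigquery.*;

public class LoadCsvAsStrings {
    public static void main(String[] args) throws InterruptedException {
        BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

        // Declare every CSV column as STRING so the load never fails on mixed types.
        Schema schema = Schema.of(
                Field.of("columnname", LegacySQLTypeName.STRING),
                Field.of("othercolumn", LegacySQLTypeName.STRING));

        LoadJobConfiguration loadConfig =
                LoadJobConfiguration.newBuilder(
                        TableId.of("dataset", "table"), "gs://bucket/data.csv")
                .setFormatOptions(CsvOptions.newBuilder().setSkipLeadingRows(1).build())
                .setSchema(schema)
                .build();

        // Run the load job and wait for it to finish.
        Job job = bigquery.create(JobInfo.of(loadConfig)).waitFor();
        if (job == null || job.getStatus().getError() != null) {
            throw new RuntimeException("Load failed: "
                    + (job == null ? "job no longer exists" : job.getStatus().getError()));
        }
    }
}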
Any other method to solve this problem?
Right now, BigQuery does not support schema inference on load, so your options are essentially what you suggest: import every column as STRING, then detect and convert the numeric columns yourself.
Note that you can use the allowLargeResults feature to clean up and rewrite your imported data into a new, correctly typed table (but you'll be charged for the query, which will increase your data ingestion costs).
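As a rough sketch of that clean-up step with the same Java client, using legacy SQL (since FLOAT() and allowLargeResults are legacy-SQL features); the table and column names are again placeholders:

import com.google.cloud.bigquery.*;

public class RewriteWithTypes {
    public static void main(String[] args) throws InterruptedException {
        BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

        // Cast the string column to FLOAT and write the result into a new table.
        QueryJobConfiguration queryConfig = QueryJobConfiguration.newBuilder(
                "SELECT FLOAT(columnname) AS columnname FROM [dataset.table]")
            .setUseLegacySql(true)
            .setAllowLargeResults(true)  // requires a destination table
            .setDestinationTable(TableId.of("dataset", "table_typed"))
            .setWriteDisposition(JobInfo.WriteDisposition.WRITE_TRUNCATE)
            .build();

        Job job = bigquery.create(JobInfo.of(queryConfig)).waitFor();
        if (job == null || job.getStatus().getError() != null) {
            throw new RuntimeException("Query failed: "
                    + (job == null ? "job no longer exists" : job.getStatus().getError()));
        }
    }
}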