Google-BigQuery

Question

We are using the Java API to load a CSV file into Google BigQuery. Is there a way to detect the column types on load and automatically select the appropriate schema type for each column?

For example, if a column contains only float values, BigQuery would assign it the float type; if it contains non-numeric values, it would assign the string type. Is there a method to do this?

The roundabout way is to load every column as string by default.

Then run a query on each column:

SELECT count(columnname) - count(float(columnname)) FROM dataset.table

(I am only interested in isolating the columns that hold float values, so that my application can use them in math functions.)
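To make the per-column check less tedious, the query strings can be generated rather than typed by hand. Below is a minimal sketch; the class name FloatCheckQueries and the column names are hypothetical, and it only builds the query text (it does not call BigQuery). It assumes, as the question does, that float() yields NULL for values that are not numeric, so a result of 0 means every non-null value in the column converted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds one check query per column, in the same
// form as the query shown in the question.
final class FloatCheckQueries {

    // Query text whose result is the number of non-null values in the
    // column that did not convert to float (assuming float() gives NULL
    // for non-numeric strings).
    static String checkQuery(String table, String column) {
        return "SELECT count(" + column + ") - count(float(" + column + ")) FROM " + table;
    }

    // One query per column name, in the order given.
    static List<String> checkQueries(String table, List<String> columns) {
        List<String> queries = new ArrayList<>();
        for (String column : columns) {
            queries.add(checkQuery(table, column));
        }
        return queries;
    }

    public static void main(String[] args) {
        // "price" and "name" are placeholder column names.
        for (String q : checkQueries("dataset.table", Arrays.asList("price", "name"))) {
            System.out.println(q);
        }
    }
}
```

Each generated string still has to be submitted as its own query, so the cost grows with the number of columns.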

Is there any other way to solve this problem?

Jeremy Condit · Accepted Answer

Right now, BigQuery does not support schema inference, so as you suggest, your options are:

1. Provide the schema explicitly when loading data.
2. Load all data using the string type, and cast/convert at query time.
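If you go with the first option, one way to avoid writing the schema by hand is to guess the types on the client before loading. The following is only a sketch under stated assumptions: the class CsvSchemaSketch and its methods are hypothetical, the rows are assumed to be already split into cells, and a column is called FLOAT only when every non-empty sample parses with Java's Double.parseDouble. It emits the name:TYPE,name:TYPE shorthand that the bq command-line tool accepts for inline schemas; turning that into a schema object for the Java API is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical client-side type guesser; not part of any BigQuery library.
final class CsvSchemaSketch {

    // "FLOAT" if every non-empty sample parses as a double, else "STRING".
    // A column with no non-empty samples falls back to "STRING".
    static String inferType(List<String> samples) {
        boolean sawValue = false;
        for (String sample : samples) {
            String value = sample.trim();
            if (value.isEmpty()) {
                continue; // treat empty cells as missing values
            }
            sawValue = true;
            try {
                Double.parseDouble(value);
            } catch (NumberFormatException e) {
                return "STRING";
            }
        }
        return sawValue ? "FLOAT" : "STRING";
    }

    // Builds "name:TYPE,name:TYPE" from a header row and sample rows.
    static String inlineSchema(String[] header, List<String[]> rows) {
        StringBuilder sb = new StringBuilder();
        for (int c = 0; c < header.length; c++) {
            List<String> column = new ArrayList<>();
            for (String[] row : rows) {
                if (c < row.length) {
                    column.add(row[c]);
                }
            }
            if (c > 0) {
                sb.append(',');
            }
            sb.append(header[c].trim()).append(':').append(inferType(column));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Naive split on commas; real CSV with quoted fields needs a proper parser.
        String[] header = "name,price".split(",", -1);
        List<String[]> rows = Arrays.asList(
                "widget,1.5".split(",", -1),
                "gadget,2".split(",", -1));
        System.out.println(inlineSchema(header, rows));
    }
}
```

Two caveats: Double.parseDouble also accepts inputs such as "1e5", "NaN" and "1.0f", which BigQuery may not treat the same way, and a guess based on a sample of rows can be wrong for the rest of the file, so the load job may still reject rows.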

You can also use the allowLargeResults feature to run a query that cleans up your imported data and writes the result to a new table. You'll be charged for that query, though, which increases your effective data ingestion cost.

Google-BigQuery - schema parsing of CSV file

Tags:

csv

deepakd

