How to skip rows of a CSV file in the BigQuery load API

I am trying to load CSV data from a Cloud Storage bucket into a BigQuery table using the BigQuery API. My code is:

from google.cloud import bigquery
import uuid


def load_data_from_gcs(dataset_name, table_name, source):
    bigquery_client = bigquery.Client()
    dataset = bigquery_client.dataset(dataset_name)
    table = dataset.table(table_name)
    job_name = str(uuid.uuid4())

    job = bigquery_client.load_table_from_storage(
        job_name, table, source)
    job.sourceFormat = 'CSV'
    job.fieldDelimiter = ','
    job.skipLeadingRows = 2

    job.begin()
    job.result()  # Wait for job to complete

    print('Loaded {} rows into {}:{}.'.format(
        job.output_rows, dataset_name, table_name))

    wait_for_job(job)

It gives me this error:

400 CSV table encountered too many errors, giving up. Rows: 1; errors: 1.

This error occurs because the first two rows of my CSV file contain header information and are not supposed to be loaded. I have set job.skipLeadingRows = 2, but it is not skipping the first two rows. Is there another syntax for setting the rows to skip?

Please help with this.

Asked Dec 02 '25 by Shikha


1 Answer

You're spelling it wrong (using camelCase instead of underscores). It's skip_leading_rows, not skipLeadingRows. The same goes for field_delimiter and source_format.

Check out the Python sources here.
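
Applying that fix to the code from the question gives something like the following sketch (assuming the same legacy google-cloud-bigquery client the question uses, where load jobs are configured through snake_case properties on the job object):

from google.cloud import bigquery
import uuid


def load_data_from_gcs(dataset_name, table_name, source):
    bigquery_client = bigquery.Client()
    dataset = bigquery_client.dataset(dataset_name)
    table = dataset.table(table_name)
    job_name = str(uuid.uuid4())

    job = bigquery_client.load_table_from_storage(
        job_name, table, source)
    # Configuration properties use snake_case in the Python client:
    job.source_format = 'CSV'
    job.field_delimiter = ','
    job.skip_leading_rows = 2  # skip the two header rows

    job.begin()
    job.result()  # blocks until the job completes

    print('Loaded {} rows into {}:{}.'.format(
        job.output_rows, dataset_name, table_name))

Note that job.result() already waits for the job to finish, so the separate wait_for_job(job) call in the question's code is redundant and can be dropped.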

Answered Dec 05 '25 by Graham Polley


