I'm working with Google BigQuery from a Python application.
I have a dataframe with a column that contains lists; let's call it "keywords". I also have a BigQuery table whose keywords field has type STRING and mode REPEATED.
This is the schema of my BigQuery table:
SCHEMA = [
    bq.SchemaField("id", "STRING", mode="NULLABLE"),
    bq.SchemaField("fecha", "DATE", mode="NULLABLE"),
    bq.SchemaField("keywords", "STRING", mode="REPEATED"),
]
And this is my code:
import pandas as pd
from datetime import date
from google.cloud import bigquery as bq
df_dict = {
    "id": ["asdf173", "qwer783", "vcda619"],
    "fecha": [date(2019, 1, 15), date(2019, 1, 28), date(2019, 2, 12)],
    "keywords": [["a", "b"], ["c", "d", "e"], ["f"]],
}
df = pd.DataFrame(df_dict)
client = bq.Client()
dataset = client.dataset(dataset_name)
table_ref = dataset.table(table_name)
client.load_table_from_dataframe(df, table_ref).result()
I'm getting the following error when I try to upload the dataframe into the BigQuery table:
400 Provided Schema does not match Table project-id:dataset-name.table-name. Field keywords has changed type from STRING to RECORD.
How can I solve it?
Given this error message:
400 Provided Schema does not match Table project-id:dataset-name.table-name. Field keywords has changed type from STRING to RECORD.
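and the table schema you provided, you can see that the load is sending keywords as a RECORD, while the table defines it as a repeated STRING. Note that in BigQuery an ARRAY and a RECORD are not the same thing: an ARRAY is a field with mode REPEATED, and a RECORD is a STRUCT.
To solve your problem, the two definitions have to agree: either change the type of the keywords field in the table from STRING to RECORD so that it matches what is being loaded, or load the data in a way that sends keywords as a repeated STRING.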
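If you would rather keep keywords as a repeated STRING than change the table, one possible alternative is to load the rows as JSON with an explicit schema. This is only a minimal sketch, not the approach described in the answer above. It assumes the `load_table_from_json` method and `LoadJobConfig(schema=...)` of the google-cloud-bigquery client; `to_json_rows` and `load_rows` are hypothetical helper names introduced for illustration, and `table_ref` and `SCHEMA` are the objects from the question.

```python
from datetime import date


def to_json_rows(records):
    """Turn row dicts into JSON-serializable dicts.

    Dates become ISO strings (the format a DATE column accepts) and
    list values stay plain lists, which map to a REPEATED field.
    """
    rows = []
    for record in records:
        row = {}
        for key, value in record.items():
            if isinstance(value, date):
                row[key] = value.isoformat()
            elif isinstance(value, (list, tuple)):
                row[key] = list(value)
            else:
                row[key] = value
        rows.append(row)
    return rows


def load_rows(rows, table_ref, schema):
    """Hypothetical helper: load JSON rows using the table's own schema."""
    # Imported here so the conversion helper above works without the client library.
    from google.cloud import bigquery as bq

    client = bq.Client()
    job_config = bq.LoadJobConfig(schema=schema)
    return client.load_table_from_json(rows, table_ref, job_config=job_config).result()


# Same data as in the question, written as row dicts.
records = [
    {"id": "asdf173", "fecha": date(2019, 1, 15), "keywords": ["a", "b"]},
    {"id": "qwer783", "fecha": date(2019, 1, 28), "keywords": ["c", "d", "e"]},
    {"id": "vcda619", "fecha": date(2019, 2, 12), "keywords": ["f"]},
]
rows = to_json_rows(records)
```

With the dataframe from the question, the row dicts could come from `df.to_dict("records")`, and the load itself would be `load_rows(rows, table_ref, SCHEMA)`. Passing the schema explicitly means the load job does not have to infer the type of keywords from the data.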
