 

BigQuery Results to pandas DataFrame in Chunks

I am trying to save the results of a BigQuery query to a pandas DataFrame using bigquery.Client.query(...).to_dataframe().

This query can return millions of rows.

Given that pandas to BigQuery (DataFrame.to_gbq()) has a chunksize parameter, is there something similar for BigQuery to pandas, so rows can be added to the DataFrame incrementally without running the query multiple times with a limit and offset?

asked Sep 18 '25 by user1596707
1 Answer

You can use to_dataframe_iterable on the query result instead. The page_size you pass to job.result() controls how many rows each chunk holds.

from google.cloud import bigquery

client = bigquery.Client()

job = client.query(query)
# page_size sets the maximum number of rows per fetched page (and per chunk)
result = job.result(page_size=20)

for df in result.to_dataframe_iterable():
    # df is a pandas DataFrame with at most 20 rows
    print(df)
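Each chunk yielded by to_dataframe_iterable() is an ordinary pandas DataFrame, so a large result can be reduced incrementally rather than concatenated into one huge frame. A minimal sketch of that pattern, using a hypothetical stand-in generator of DataFrames in place of a real BigQuery result (no credentials needed; fake_chunks is an assumption for illustration):

```python
import pandas as pd

def fake_chunks():
    # Stand-in for result.to_dataframe_iterable(): any iterable of DataFrames
    yield pd.DataFrame({"category": ["a", "b"], "value": [1, 2]})
    yield pd.DataFrame({"category": ["a", "c"], "value": [3, 4]})

# Combine per-chunk aggregates instead of holding all rows in memory at once
totals = pd.Series(dtype="int64")
for df in fake_chunks():
    totals = totals.add(df.groupby("category")["value"].sum(), fill_value=0)

print(totals.astype(int).to_dict())  # {'a': 4, 'b': 2, 'c': 4}
```

The same loop body works unchanged with the real iterable: replace fake_chunks() with result.to_dataframe_iterable().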
answered Sep 21 '25 by Decko