
How to provide credentials in Apache Beam Python programmatically?

We are using Apache Beam through Airflow. The default Google Cloud service account is set with the environment variable GOOGLE_APPLICATION_CREDENTIALS. We don't want to change that environment variable, as doing so might affect other processes running at the same time. I couldn't find a way to change the Google Cloud Dataflow service account programmatically. We create the pipeline as follows: p = beam.Pipeline(argv=self.conf)

Is there an option, through argv or the pipeline options, where I can specify the location of the GCS credential file? I searched the documentation but didn't find much information.

srig asked Dec 29 '25 19:12


1 Answer

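You can specify a service account when you launch the job with a basic pipeline flag. In the Python SDK the flag is --service_account_email=my-service-account-name@my-project.iam.gserviceaccount.com (the Java SDK spells the same option --serviceAccount).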

That account needs the Dataflow Worker role attached, plus whatever other roles the job requires (GCS, BigQuery, etc.). See the Dataflow documentation on service accounts for details. You don't need a key file for the service account, stored in GCS or locally, to use it.
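Since the question builds the pipeline from an argv list, here is a minimal sketch of how the flag could be added to that list before it is passed to beam.Pipeline. The helper names with_service_account and run_pipeline are hypothetical, and the project, region, and bucket values are placeholders, not values from the question. The helper only recognises the "--flag=value" spelling of an existing flag.

```python
from typing import List, Sequence

FLAG = "--service_account_email"


def with_service_account(argv: Sequence[str], email: str) -> List[str]:
    """Return a copy of argv with --service_account_email set to `email`.

    Hypothetical helper: it drops an existing "--service_account_email=..."
    entry (only the "=" form is handled) and appends the new one.
    """
    kept = [arg for arg in argv if not arg.startswith(FLAG + "=")]
    return kept + [f"{FLAG}={email}"]


def run_pipeline(argv: Sequence[str]) -> None:
    """Build and run a trivial pipeline with the given argv (needs apache_beam)."""
    # Imported here so the helper above works without Beam installed.
    import apache_beam as beam

    with beam.Pipeline(argv=list(argv)) as p:
        p | beam.Create(["hello"]) | beam.Map(print)


# Placeholder options; replace with the values your job already uses.
base_argv = [
    "--runner=DataflowRunner",
    "--project=my-project",
    "--region=us-central1",
    "--temp_location=gs://my-bucket/tmp",
]
argv = with_service_account(
    base_argv, "my-service-account-name@my-project.iam.gserviceaccount.com"
)
```

Calling run_pipeline(argv) would then submit the job with the workers running as that service account. GOOGLE_APPLICATION_CREDENTIALS is left untouched, so the job is still launched with the default credentials; only the worker identity changes.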

FridayPush answered Dec 31 '25 09:12