I have created a Vertex AI pipeline similar to this.
Now the pipeline references a CSV file, so if this CSV file changes, the pipeline needs to be recreated.
Is there any way to pass a new CSV as a parameter to the pipeline when it is re-run, i.e. without recreating the pipeline from the notebook?
If not, is there a best-practice way of auto-updating the dataset, model, and deployment?
Have a look at that documentation.
You can define your pipeline like this:
import kfp

...

# Define the workflow of the pipeline.
@kfp.dsl.pipeline(
    name="automl-image-training-v2",
    pipeline_root=pipeline_root_path)
def pipeline(project_id: str):
    ...
(you have something very similar in your notebook sample)
Then, when you invoke your pipeline, you can pass parameter values:
import google.cloud.aiplatform as aip

# Create a run of the compiled pipeline, supplying runtime parameters.
job = aip.PipelineJob(
    display_name="automl-image-training-v2",
    template_path="image_classif_pipeline.json",
    pipeline_root=pipeline_root_path,
    parameter_values={
        'project_id': project_id
    }
)
job.submit()
You can see project_id as a key in the parameter_values dict, matching the project_id parameter of your pipeline function.
Do the same for your CSV file name!
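For example, a minimal sketch (the csv_uri parameter name and the gs:// path are placeholders I made up, not from your notebook): declare the CSV location as a pipeline parameter, then pass a new value on each run.

import kfp
import google.cloud.aiplatform as aip

# Declare the CSV location as a pipeline parameter instead of hard-coding it.
@kfp.dsl.pipeline(
    name="automl-image-training-v2",
    pipeline_root=pipeline_root_path)
def pipeline(project_id: str, csv_uri: str):
    # Components read csv_uri at runtime, so the compiled pipeline
    # does not need to be rebuilt when the file changes.
    ...

# Re-run the already-compiled template with a new CSV, no notebook needed.
job = aip.PipelineJob(
    display_name="automl-image-training-v2",
    template_path="image_classif_pipeline.json",
    pipeline_root=pipeline_root_path,
    parameter_values={
        'project_id': project_id,
        'csv_uri': 'gs://my-bucket/data/new_training_data.csv',  # hypothetical path
    }
)
job.submit()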