I am writing a Kubeflow component which reads an input query and creates a dataframe, roughly as:
from kfp.v2.dsl import component

@component(...)
def read_and_write():
    # read the input query
    # transform the result to a dataframe
    sql.to_dataframe()
I was wondering how I can pass this dataframe to the next step in my Kubeflow pipeline. Is this possible, or do I have to save the dataframe as a CSV (or in some other format) and pass its output path instead? Thank you
You need to use the concept of an Artifact. Quoting the Kubeflow Pipelines documentation:
Artifacts represent large or complex data structures like datasets or models, and are passed into components as a reference to a file path.
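For example, here is a minimal sketch of how the dataframe could be handed off through a Dataset artifact. The data source (pandas-gbq via read_gbq), the component names, and the parameters are illustrative assumptions, not part of your original code:

from kfp.v2.dsl import component, pipeline, Dataset, Input, Output

@component(packages_to_install=["pandas", "pandas-gbq"])
def read_and_write(query: str, df_out: Output[Dataset]):
    import pandas as pd
    # Hypothetical data source: run the query against BigQuery.
    df = pd.read_gbq(query)
    # Persist the dataframe at the artifact's managed path;
    # KFP passes this file reference to downstream components.
    df.to_csv(df_out.path, index=False)

@component(packages_to_install=["pandas"])
def consume(df_in: Input[Dataset]):
    import pandas as pd
    # Rehydrate the dataframe from the upstream artifact's path.
    df = pd.read_csv(df_in.path)
    print(df.head())

@pipeline(name="dataframe-passing-demo")
def my_pipeline(query: str):
    reader = read_and_write(query=query)
    # Wiring the output artifact into the next step creates the dependency.
    consume(df_in=reader.outputs["df_out"])

In other words, you are still writing the dataframe to a file and passing a path, but KFP generates and tracks that path for you: writing to df_out.path in one component and declaring Input[Dataset] in the next is what links the two steps.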