I want to establish a pipeline connection between the components by passing any kind of data, just so that the pipeline looks organized, like a flowchart with arrows. Right now it looks as shown below.
Irrespective of whether the Docker container generates output or not, I want to pass some sample data between the components. However, if any changes are required in the Docker container code or the .yaml files, please let me know.
KFP Code
import os
from pathlib import Path
import requests
import kfp
#Load the component
component1 = kfp.components.load_component_from_file('comp_typed.yaml')
component2 = kfp.components.load_component_from_file('component2.yaml')
component3 = kfp.components.load_component_from_file('component3.yaml')
component4 = kfp.components.load_component_from_file('component4.yaml')
#Use the component as part of the pipeline
@kfp.dsl.pipeline(name='Document Processing Pipeline', description='Document Processing Pipeline')
def data_passing():
    task1 = component1()
    task2 = component2(task1.output)
    task3 = component3(task2.output)
    task4 = component4(task3.output)
comp_typed.yaml code
name: DPC
description: This is an example
implementation:
  container:
    image: gcr.io/pro1-in-us/dpc_comp1@sha256:3768383b9cd694936ef00464cb1bdc7f48bc4e9bbf08bde50ac7346f25be15de
    command: [python3, /dpc_comp1.py,]
component2.yaml
name: Custom_Plugin_1
description: This is an example
implementation:
  container:
    image: gcr.io/pro1-in-us/plugin1@sha256:16cb4aa9edf59bdf138177d41d46fcb493f84ce798781125dc7777ff5e1602e3
    command: [python3, /plugin1.py,]
I tried this and this, but got nothing except errors. I am new to Python and Kubeflow. What code changes should I make to pass data between all 4 components using the KFP SDK? The data can be a file or a string.
Suppose component 1 downloads a .pdf file from a gs bucket: can I feed the same file into the next downstream component? Component 1 downloads the file to '/tmp/doc_pages' inside its own Docker container, which I believe is local to that particular container, so the downstream components cannot read it?
This notebook, which describes how to pass data between KFP components, may be useful. It distinguishes 'small data', which is passed directly, from 'large data', which you write to a file. In the large-data case, as shown in the example notebook, the paths for the input and output files are chosen by the system and are passed into the function as strings.
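To make the file-based pattern concrete, here is a minimal sketch in plain Python that runs without a cluster. The function names and the temporary directory are assumptions for illustration only; the temporary directory merely stands in for the paths that the system would choose and pass in.

```python
import tempfile
from pathlib import Path


def produce_document(output_path: str) -> None:
    """Upstream step: write the result to the path it was handed."""
    out = Path(output_path)
    # The parent directory of a system-chosen path may not exist yet.
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text("sample data from component 1")


def consume_document(input_path: str) -> str:
    """Downstream step: read whatever the upstream step wrote."""
    return Path(input_path).read_text()


def run_locally() -> str:
    # Stand-in for the orchestrator: it, not the steps, decides where the file lives.
    with tempfile.TemporaryDirectory() as workdir:
        handoff = str(Path(workdir) / "outputs" / "output_1" / "data")
        produce_document(handoff)
        return consume_document(handoff)
```

The point of the design is that neither step hard-codes a location such as '/tmp/doc_pages'; each one only reads from or writes to the path string it receives.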
If you don't need to pass data between steps, but want to specify a step ordering dependency (e.g. op2 doesn't run until op1 is finished) you can indicate this in your pipeline definition like so:
op2.after(op1)
In addition to Amy's excellent answer:
Your pipeline is correct. The best way to establish a dependency between components is to establish data dependency.
Let's look at your pipeline code:
task2 = component2(task1.output)
You're passing the output of task1 to component2. This should result in the dependency that you want. But there are a couple of problems (and your pipeline will show compilation errors if you try to compile it):
- component1 needs to have an output
- component2 needs to have an input
- component2 needs to have an output (so that you can pass it to component3)
- Etc.
Let's add them:
name: DPC
description: This is an example
outputs:
- name: output_1
implementation:
  container:
    image: gcr.io/pro1-in-us/dpc_comp1@sha256:3768383b9cd694936ef00464cb1bdc7f48bc4e9bbf08bde50ac7346f25be15de
    command: [python3, /dpc_comp1.py, --output-1-path, {outputPath: output_1}]
name: Custom_Plugin_1
description: This is an example
inputs:
- name: input_1
outputs:
- name: output_1
implementation:
  container:
    image: gcr.io/pro1-in-us/plugin1@sha256:16cb4aa9edf59bdf138177d41d46fcb493f84ce798781125dc7777ff5e1602e3
    command: [python3, /plugin1.py, --input-1-path, {inputPath: input_1}, --output-1-path, {outputPath: output_1}]
With these changes, your pipeline should compile and display the dependencies that you want.
Please check the tutorial about creating components from command-line programs.
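The component definitions above pass file paths on the command line, so the script inside each container has to accept those flags. A minimal sketch of what /plugin1.py might look like follows, assuming it simply copies its input through unchanged; the real processing logic is not shown in the question, so the body here is a placeholder.

```python
import argparse
from pathlib import Path


def main(argv=None):
    # Flag names match the command line in the component YAML above.
    parser = argparse.ArgumentParser(description="Hypothetical body of /plugin1.py")
    parser.add_argument("--input-1-path", required=True)
    parser.add_argument("--output-1-path", required=True)
    args = parser.parse_args(argv)

    # Read the file produced by the upstream component.
    data = Path(args.input_1_path).read_bytes()

    # The output directory may not exist yet, so create it before writing.
    out = Path(args.output_1_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    # Placeholder processing: pass the data through unchanged.
    out.write_bytes(data)
```

In the actual script you would end the file with a call to `main()` under the usual `if __name__ == "__main__":` guard. Component 1 would follow the same pattern with only the `--output-1-path` flag: after downloading the .pdf to '/tmp/doc_pages', it would copy the file to the path it received, which is what makes it available to the downstream component.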