
How do I add many CSV files to the catalog in Kedro?

Tags:

python

kedro

I have hundreds of CSV files that I want to process in the same way. For simplicity, assume they are all in ./data/01_raw/ (./data/01_raw/1.csv, ./data/01_raw/2.csv, etc.). I would rather not give each file its own name and keep track of them individually when building my pipeline. Is there a way to read all of them in bulk by specifying something in the catalog.yml file?

Srikiran asked Sep 14 '25 21:09


1 Answer

You are looking for PartitionedDataSet. In your example, the catalog.yml might look like this:

my_partitioned_dataset:
  type: "PartitionedDataSet"
  path: "data/01_raw"
  dataset: "pandas.CSVDataSet"
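
To show how a node might consume this entry, here is a minimal sketch. It relies on the fact that a PartitionedDataSet is loaded lazily: the node receives a dictionary mapping each partition id to a function that loads that partition when called. The function name load_partitions is hypothetical, and the partition ids shown in the comments (such as "1.csv") are an assumption about how the ids are derived from the file names under data/01_raw.

```python
from typing import Any, Callable, Dict


def load_partitions(partitions: Dict[str, Callable[[], Any]]) -> Dict[str, Any]:
    """Load every partition and return the results keyed by partition id.

    `partitions` is the dictionary a node receives for a partitioned input:
    each key is a partition id (assumed here to look like "1.csv") and each
    value is a zero-argument function that loads that partition on demand.
    """
    loaded = {}
    # Sort the ids so the files are processed in a deterministic order.
    for partition_id, load_partition in sorted(partitions.items()):
        # Nothing is read from disk until the load function is called.
        loaded[partition_id] = load_partition()
    return loaded
```

In a pipeline, this function would be wrapped in a node whose input is my_partitioned_dataset. With pandas.CSVDataSet as the underlying dataset, each loaded value would be a pandas DataFrame, so the values could then be combined with pandas.concat or processed one at a time.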
Lim H. answered Sep 17 '25 12:09