Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Use dbt to load JSON files into staging

Tags:

dbt

Is it possible to use dbt to copy data from files into the database?

I've got some files as data sources and I would like to load them into an staging layer before starting to process them.

Something such as this command in postgresql:

COPY staging.test (json_content) from 'C:\tmp\46002.json' 

But as far as I see in the documentation, the models require a SELECT statement and the COPY statement fails when running dbt run

like image 540
Devom Avatar asked Dec 06 '25 02:12

Devom


1 Answers

From raw local files? No.

Your two dbt direct data options are:

  1. Load some size limited datasets via dbt seeds which only supports csv's currently.
  2. load data from cloud hosted storage like s3 buckets via external-tables.

This is the best resource to explain why this application doesn't attempt to support the EL part of the ELT (Extract-Load-Transformation) process:

What is dbt - dbtLabs Blog

Without getting into tool recommendation, you'll need an additional component to your data stack to handle the extraction process for you. dbt presumes the data has been loaded into a "pre-processing" or "raw" area of your warehouse generally following Lewis Gavin's excellent architecture here:

How to architect the perfect Data Warehouse - Lewis Gavin


Edit: Additional notes on the seed data - currently comma separated is the only format supported. Some discussion in this issue about extending that to other delimiters and escape formats.

No current support for seeding json files. Also under discussion.

like image 188
sgdata Avatar answered Dec 09 '25 15:12

sgdata



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!