I know the fact that in ETL, we transform the data and Load in data warehouse.
In ELT, we Load the data into data warehouse and then do the transformation.
Apart from the above two lines, how would they exactly differ?
What's the advantage of one over the other?
In ELT, I see people telling, we leverage the power of data warehouse to do the transformation. Why transformation advantage does the warehouse provide?
In my use case, I have source data in source ADLS (source layer) and I ingest the data again into my ADLS (raw layer) as it is using azure Databricks, then take the raw layer data and transform the data using azure databricks and again store in ADLS (final layer). Now copy the final layer data from ADLS to azure sql server db as shown below.
Source ADLS --> Raw layer ADLS [copy everything using ADB (source is mounted)]
Raw layer ADLS --> Final layer ADLS [Transform using ADB]
Final layer ADLS --> SQL Server db [copy using ADF]
Is it an ETL or ELT?
The answer from @NickW answer is a good summation. All I'd add is that much will depend on what you have the talent to support. If most of your on-deck talent is SQL based, then ELT (with transforms done in stored procedures, etc.) can make a lot more sense as your team won't have to learn to code transforms in an ETL tool. If you've got the ETL talent, then it might make more sense to do the transformations there, which will offload that processing from your data warehouse server.
This architectural question probably has more to do with the talent and skillsets that you have available to you.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With