Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Maintainability of database integration testing

I am developing a ETL process that extract business data from one database to a data warehouse. The application is NOT using NHibinate, Linq to Sql or Entity Framework. The application has its own generated data access classes that generate the necessary SQL statements to perform CUID.

As one can image, developers who write code that generate custom SQL can easily make mistakes.

I would like to write a program that generate testing data (Arrange), than perform the ETL process (Act) and validate the data warehouse (Assert).

I don't think it is hard to write such program. However, what I worry is that in the past my company had attempt to do something similar, and ending up with a brunch of un-maintainable unit tests that constantly fail because of many new changes to the database schema as new features are added.

My plan is to write an integration test that runs on the build machine, and not any unit tests to ensures the ETL process works. The testing data cannot be totally random generate because of business logic on determine how data are loaded to the data warehouse. We have custom development tool that generates new data access classes when there is a change in the database definition.

I would love any feedback from the community on giving me advice on write such integration test that is easy to easy to maintain. Some ideas I have:

  1. Save a backup testing database in the version control (TFS), developers will need to modify the backup database when there are data changes to the source or data warehouse.

  2. Developers needs to maintain testing data though the testing program (C# in this case) manually. This program would have a basic framework for developer to generate their testing data.

  3. When the test database is initialize, it generate random data. Developers will need to write code to override certain randomly generated data to ensure the test passes.

I welcome any suggestions Thanks

like image 349
dsum Avatar asked Dec 07 '25 12:12

dsum


1 Answers

Hey dsum, allthough I don't really know your whole architecture of the ETL, I would say, that integration-testing should only be another step in your testing process.

Even if the unit-testing in the first encounter ended up in a mess, you should keep in mind, that for many cases a single unit-test is the best place to check. Or do you want to split the whole integration test for triple-way case or sth. other further deep down, in order to guarantee the right flow in every of the three conditions?

Messy unit-test are only the result of messy production code. Don't feel offended. That's just my opinion. Unit-tests force coders to keep a clean coding style and keep the whole thing much more maintainable.

So... my goal is, that you just think about not only to perform integration testing on the whole thing, because unit-tests (if they are used in the right way) can focus on problems in more detail.

Regards, MacX

like image 108
MacX Avatar answered Dec 09 '25 00:12

MacX



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!