 

SQL Server Data to Kafka in real time

I would like to stream real-time data from SQL Server to Kafka directly, and I found that Debezium provides a SQL Server connector: https://debezium.io/docs/connectors/sqlserver/

The documentation says that the connector creates one topic for each table. I am trying to understand the architecture, because I have 500 clients, which means 500 databases, each with 500 tables. Does that mean it will create 250,000 topics, or do I need a separate Kafka cluster for each client, where each cluster/node has 500 topics matching the number of tables in the database?

Is this the best way to send SQL Server data to Kafka, or should we instead publish an event to Kafka from application code whenever there is an insert/update/delete on a table?

asked Oct 24 '25 by Learn AspNet

1 Answer

With Debezium you are stuck with a one-table-to-one-topic mapping. However, there are creative ways to work around it.

Based on the description, it sounds like you have some sort of product with a SQL Server backend containing 500 tables, and that the product is used by 500 or more clients, each with their own instance of the database.

You can create a connector for one client that reads all 500 tables and publishes them to Kafka. At that point you will have 500 Kafka topics. You can then route the data from all the other database instances to the same 500 topics by creating a separate connector for each client / database instance. I am assuming that, since this is the backend database for a product, the table names, schema names, etc. are all the same, so the Debezium connectors will generate the same topic names for the corresponding tables. If that is not the case, you can use the topic routing SMT (single message transform), as in the sketch below.
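
For illustration, here is a minimal sketch (Python, against the Kafka Connect REST API) of registering one Debezium SQL Server connector per client and rewriting the per-client topic names onto a shared set with the stock RegexRouter SMT. The host names, credentials, and database names are placeholders, and exact Debezium property names vary between connector versions, so check the documentation for the release you run:

```python
# Hypothetical sketch: one Debezium connector per client, all publishing to a
# shared set of topics. Connector property names follow Debezium 1.x and may
# differ in other releases.
import requests

CONNECT_URL = "http://connect:8083/connectors"  # placeholder Connect worker


def register_client_connector(client_id: str, db_host: str) -> None:
    config = {
        "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
        "database.hostname": db_host,
        "database.port": "1433",
        "database.user": "debezium",
        "database.password": "********",
        "database.dbname": f"product_{client_id}",      # one database per client
        "database.server.name": f"client_{client_id}",  # default topic prefix
        "database.history.kafka.bootstrap.servers": "kafka:9092",
        "database.history.kafka.topic": f"dbhistory.client_{client_id}",
        # Default topics are named <server.name>.<schema>.<table>, e.g.
        # client_42.dbo.orders. Rewrite them so every client's connector
        # publishes to the same shared topic per table: product.dbo.orders.
        "transforms": "route",
        "transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter",
        "transforms.route.regex": "client_[^.]+\\.(.*)",
        "transforms.route.replacement": "product.$1",
    }
    resp = requests.post(
        CONNECT_URL,
        json={"name": f"sqlserver-client-{client_id}", "config": config},
    )
    resp.raise_for_status()


for cid, host in [("001", "db001.internal"), ("002", "db002.internal")]:
    register_client_connector(cid, host)
```

With this layout you stay at 500 topics in total no matter how many clients you onboard; only the number of connectors grows with the client count.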

You can differentiate each client's data in Kafka by adding a few metadata fields to the records. This can easily be done in the connector by adding SMTs. The metadata fields could be client_id, client_name, or something else; a sketch follows.
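
As an illustrative example, the stock InsertField SMT can stamp static fields onto every record. Chained after the router from the previous sketch, it might look like this; note that each InsertField instance adds a single static field, so two fields need two chained transforms, and the field names here are just one possible choice:

```python
# Illustrative only: extend the connector config from the previous sketch so
# every record carries the client it originated from. Each InsertField
# transform adds exactly one static field, hence the two chained instances.
def add_client_tags(config: dict, client_id: str, client_name: str) -> dict:
    config.update({
        "transforms": "route,tagId,tagName",
        "transforms.tagId.type": "org.apache.kafka.connect.transforms.InsertField$Value",
        "transforms.tagId.static.field": "client_id",
        "transforms.tagId.static.value": client_id,
        "transforms.tagName.type": "org.apache.kafka.connect.transforms.InsertField$Value",
        "transforms.tagName.static.field": "client_name",
        "transforms.tagName.static.value": client_name,
    })
    return config
```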

As for your other question,

Is this the best way to send SQL Server data to Kafka, or should we instead publish an event to Kafka from application code whenever there is an insert/update/delete on a table?

The answer is "it depends!". If it is a simple transactional application, I would simply write the data to the database and not worry about anything else.

The answer also depends on why you want to deliver data to Kafka. If you are looking to deliver data / business events to Kafka for downstream business processing that requires transactional integrity and strict SLAs, writing the data from the application may make sense. However, if you are publishing data to Kafka to make it available to others for analytics or other purposes, the Kafka Connect approach makes sense. A sketch of the application-side approach follows.
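
For comparison, here is a minimal sketch of the application-side approach using the kafka-python client. The topic name, event shape, and database call are made up for illustration; note that the database write and the Kafka publish are not atomic, which is exactly the gap that CDC (or an outbox pattern) closes:

```python
# Illustrative application-side publishing with kafka-python. The SQL and the
# event shape are placeholders for whatever your application actually does.
import json

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="kafka:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)


def save_order(db_conn, order: dict, client_id: str) -> None:
    # 1. Write to SQL Server as usual (depends on your data-access layer).
    db_conn.execute(
        "INSERT INTO dbo.orders (id, amount) VALUES (?, ?)",
        (order["id"], order["amount"]),
    )
    db_conn.commit()
    # 2. Publish the business event. A crash between steps 1 and 2 loses the
    #    event; this coupling problem is what CDC avoids entirely.
    producer.send("orders-events", {"client_id": client_id, **order})
```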

There is a licensed alternative, Qlik Replicate, which is capable of something very similar.

answered Oct 25 '25 by Prabodh Mhalgi


