
NiFi - ConvertCSVtoAVRO - how to capture the failed records?

When converting CSV to AVRO I would like to output all the rejections to a file (let's say error.csv).

A rejection is usually caused by a wrong data type - e.g. when a "string" value appears in a "long" field.

I am trying to do this with the "incompatible" output. However, instead of saving only the rows that failed to convert (2 in the example below), it saves the whole CSV file. Is it possible to filter out only the records that failed to convert? Does NiFi add some kind of marker to these records?

Both RouteOnAttribute and RouteOnContent route whole files. Does the "incompatible" leg of the flow mark individual records with something like an "error" attribute that is still available after splitting the file into rows? I cannot find this in any documentation.

[Image: NiFi flow]

michalrudko asked Nov 23 '25 21:11


1 Answer

I recommend placing a SplitText processor upstream of ConvertCSVToAvro, if you can, so that you convert only one record at a time. Each flowfile then holds a single record, which gives a clear context for what the errors attribute refers to on any flowfile sent to the incompatible output.
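To illustrate the idea outside NiFi, here is a minimal Python sketch of per-record conversion. It is not how ConvertCSVToAvro works internally; the schema, the sample data, and the helper names (`convert_row`, `partition_records`) are all hypothetical, and Python types stand in for Avro types. The point is only that converting one record at a time lets you set aside exactly the rows that fail, together with the reason.

```python
import csv
import io

# Hypothetical schema: column name -> Python type standing in for an Avro type.
SCHEMA = {"id": int, "name": str, "amount": float}


def convert_row(row, schema):
    """Cast every field of one CSV row; raises on a type mismatch."""
    return {field: cast(row[field]) for field, cast in schema.items()}


def partition_records(csv_text, schema):
    """Convert records one at a time and collect the ones that fail."""
    converted, rejected = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            converted.append(convert_row(row, schema))
        except (ValueError, TypeError, KeyError) as exc:
            # Keep the original row next to the error that rejected it.
            rejected.append((row, str(exc)))
    return converted, rejected


# Hypothetical sample: the second and third data rows have bad values.
SAMPLE = "id,name,amount\n1,alice,10.5\nabc,bob,3\n3,carol,oops\n"

converted, rejected = partition_records(SAMPLE, SCHEMA)
```

In this sketch `rejected` plays the role of error.csv: it contains only the failing rows rather than the whole input.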

Sending the entire failed file to the incompatible relationship appears to be a deliberate design choice. I assume it may be necessary when the CSV file is not well formed, especially when records are not neatly contained on one line (or properly escaped). If your data has records that span multiple lines, SplitText might make things worse by creating a fragmented set of failed lines.
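The multi-line caveat can be seen in a small Python sketch. The sample data here is hypothetical, and a plain line split only stands in for what a line-based splitter does; it is not SplitText itself. A quoted field containing a newline is one record to a CSV parser but two fragments to a line splitter.

```python
import csv
import io

# Hypothetical sample: the first data record has a quoted field with a newline.
SAMPLE = 'id,comment\n1,"first line\nsecond line"\n2,ok\n'

# Line-based splitting breaks the quoted record into two fragments.
naive_lines = SAMPLE.splitlines()

# A CSV-aware parser keeps the quoted record together.
parsed_rows = list(csv.reader(io.StringIO(SAMPLE)))
```

Here `naive_lines` has four entries (the header plus three fragments), while `parsed_rows` has three (the header plus two records). If your files can contain such records, splitting by line before conversion would turn one valid record into two invalid ones.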

James answered Nov 26 '25 09:11


