I am learning Talend and I need to a XML to CSV mapping that I cannot make work right.
I have a document like this (really simplified):
<Report>
<Payment>
<Reference Type="1" Value="wathever11"/>
<Reference Type="2" Value="wathever12"/>
<Amount value="222"/>
</Payment>
<Payment>
<Reference Type="1" Value="wathever22"/>
<Reference Type="3" Value="wathever23"/>
<Amount value="123242"/>
</Payment>
<Payment>
<Reference Type="3" Value="wathever33"/>
<Reference Type="2" Value="wathever32"/>
<Amount value="12976"/>
</Payment>
<Payment>
<Reference Type="1" Value="wathever41"/>
<Reference Type="4" Value="wathever44"/>
<Amount value="1456"/>
</Payment>
<Payment>
<Amount value="12976"/>
</Payment>
</Report>
So, I can have none to multiple References that can have up to 4 different Types.
I need to convert this to a CSV like this:
ReferenceType1, ReferenceType2, ReferenceType3, ReferenceType4, Amount
whatever11 , whatever12 , , , 222
, whatever22 , whatever23 , , 123242
, whatever32 , whatever33 , , 12976
whatever41 , , , whatever44 , 1456
, , , , 12976
To achieve this, I created this simple structure:

In the XMLFileInput this is the configuration:

Where Payment is a Document
In the tXMLMap ive set a document structure like this:
Report
└──Payment (loop) First case, see below
├── Reference (loop) Second case, see below
| ├── @Type
| └── @Value
└── Amount
└── @Value
In the output Table i create a column for each reference with a expression like this:
[row1.Payment:/Payment/Reference/@Type].equals("1")? [row1.Payment:/Payment/Reference/@Value]: ""
But, I am unable to map all the references in the same row. If a set the loop at Payment level, i get this:
ReferenceType1, ReferenceType2, ReferenceType3, ReferenceType4, Amount
whatever11 , , , , 222
, , , , 123242
, , , , 12976
whatever41 , , , , 1456
, , , , 12976
If the loop is set at reference level, I get a new line for every Reference, like this:
ReferenceType1, ReferenceType2, ReferenceType3, ReferenceType4, Amount
whatever11 , , , , 222
, whatever12 , , , 222
, whatever22 , , , 123242
, , whatever23 , , 123242
, , whatever33 , , 12976
, whatever32 , , , 12976
whatever41 , , , , 1456
, , , whatever44 , 1456
And no line for the element without Reference.
There must be a way to achieve the expected result. I tried to create an intermediate step, mapping each reference to a different Table, along with a sequence number, to further merge the columns to a single row.
Although I havent it finished, I think it could be a path to go. But the reference is no the only element what i need this for. I got a lot of elements where this happens in the real xml and the number of intermediate tables going like this would be big.
So I prefer to ask the experts opinion to try to find a more simple way to do this.
Any ideas?
Good news: this is possible with tExtractXMLField. Check out my proposed solution:


My output was based on your input xml, but it is different than your expected output, please double check carefully:
.-------------------+-------------------+-------------------+-------------------+------.
| tLogRow_1 |
|=------------------+-------------------+-------------------+-------------------+-----=|
|referenceTypeValue1|referenceTypeValue2|referenceTypeValue3|referenceTypeValue4|amount|
|=------------------+-------------------+-------------------+-------------------+-----=|
|wathever11 |wathever12 |null |null |222 |
|wathever22 |null |wathever23 |null |123242|
|null |wathever32 |wathever33 |null |12976 |
|wathever41 |null |null |wathever44 |1456 |
|null |null |null |null |12976 |
'-------------------+-------------------+-------------------+-------------------+------'
Bad news: It is not possible with tXMLMap in Talend 5.3.1 for me. I tried to put this into the expressions field in tXMLMap: [row2.doc:/Report/Payment/Reference[@Type=1]/@Value] But then my job wouldn't generate anymore because of the additional brackets
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With