Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

XSLT Performance Considerations

I am working on a project which uses following technologies. Java, XML, XSLs

There's heavy use of XMLs. Quite often I need to - convert one XML document into another - convert one XML document into another after applying some business logic.

Everything will be built into a EAR and deployed on an application server. As the number of user is huge, I need to take performance into consideration before defining coding standards.

I am not a very big fan of XSLs but I am trying to understand if using XSLs a better option in this scenario or should I stick of Java only. Note that I have requirements to convert XML into XML format only. I don't have requirements to convert XML into some other format like HTML etc.

From performance and manitainability point of view - isnt JAVA a better option than using XLST for XML to XML transformations?

like image 407
user825258 Avatar asked Sep 07 '25 02:09

user825258


2 Answers

From my previous experience of this kind of application, if you have a performance bottleneck, then it won't be the XSLT processing. (The only exception might be if the processing is very complex and the programmer very inexperienced in XSLT.) There may be performance bottlenecks in XML parsing or serialisation if you are dealing with large documents, but these will apply whatever technology you use for the transformation.

Simple transformations are much simpler to code in XSLT than in Java. Complex transformations are also usually simpler to code in XSLT, unless they make heavy use of functionality available for free in the Java class library (an example might be date parsing). Of course, that's only true for people equally comfortable with coding in both languages.

Of course, it's impossible to give any more than arm-waving advice about performance until you start talking concrete numbers.

like image 55
Michael Kay Avatar answered Sep 09 '25 23:09

Michael Kay


I agree with above responses. XSLT is faster and more concise to develop than performing transformations in Java. You can change XSLT without having to recompile the entire application (just re-create EAR and redeploy). Manual transformations should we always faster but the code might be much larger than XSLT due to XPATH and other technologies allowing very condensed and powerful expressions. Try several XSLT engines (java provided, saxon, xalan...) and try to debug and profile the XSLT, using tools like standalone IDE Altova XMLSpy to detect bottleneck. Try to load the XSLT transformation and reuse it when processing several XMLs that require the same transformation. Another option is to compile the XSLT to Java classes, allowing faster parsing (saxon seems to allow it), but changes are not as easy as you need to re-compile XSLT and classes generated.

We use XSLT and XSL-FO to generate invoices for a billing software. We extract the data from database and create an XML file, transform it with XSLT using XSL-FO and process the result XML (FO instructions) to generate a PDF using Apache FOP. When generating invoices of several pages, job is done in less than a second in a multi-user environment and on a user-request basis (online processing). We do also batch processing (billing cycles) and the job is done faster as reusing the XSLT transformation. Only for very-large PDF documents (>100 pages) we have some troubles (minutes) but the most expensive task is always processing XML with FO to PDF, not XML to XML with XSLT.

As always said, if you need more processing power, you can just "add" more processors and do the jobs in parallel easily. I think time saved using XSLT if you have some experience using it can be used to buy more hardware. It's the dichotomy of using powerful development tools to save development time and buy more hardware or do things "manually" in order to get maximum performance.

Integration tools like ESB are heavily based on XSLT transformations to adapt XML data from one system (sender) to another system (receiver) and usually can perform hundreds of "transactions" (data processing and integration) in a second.

like image 45
David Oliván Ubieto Avatar answered Sep 10 '25 01:09

David Oliván Ubieto