
How to use libtidy with tidyParseBuffer()?

Tags: c, html, stream, tidy

I'm trying to clean up some HTML with libtidy (in C). The problem is with tidyParseBuffer(), which I want to use to build a TidyDoc (a tree-like structure).

tidyParseFile() works without any problem. With tidyParseBuffer(), I'm sure that I read the file properly and that the TidyBuffer structure I pass to it is correctly filled.

Any ideas?

Here is the code:

    // declaration
    tidyInput = malloc(sizeof(TidyBuffer));
    tidyOutput = malloc(sizeof(TidyBuffer));
    do {
        len = fread(pbInputData, 1, nInputData, h->file);
        tidyBufAttach(tidyInput, (void*)pbInputData, len);
        tidyParseBuffer(h->doc, tidyInput);  // doc is the TidyDoc
    } while (len >= nInputData);
    tidyOptSetBool(h->doc, TidyForceOutput, yes);

    tidySaveFile(handler->doc, "C://test.xhtml");

I have simplified the code.


1 Answer

The problem stems from the fact that you are trying to parse the contents of a file in chunks, reading each chunk into a buffer and calling tidyParseBuffer() for each chunk.

The tidyParseXxx() functions parse the whole input in a single call, so each call in your loop treats its chunk as a complete document. To feed the parser from a stream instead, take a look at TidyInputSource and tidyParseSource().

Matthew Murdoch answered Jan 26 '26 09:01
