
How to read ISBN from eBooks on CHM or PDF files

Tags: c#, .net, vb.net, pdf, chm

I'm building a database to store my eBook collection.
Most of the books have the ISBN within the text of the book itself.
How can I access this content?
Is there any source code or DLL for doing that?

asked Oct 12 '25 19:10 by InfoStatus


1 Answer

I did this for an eBook library app. First, you need to extract the text from the CHM or PDF file. There are a lot of utilities and libraries that do this. There is an article on CodeProject on how to extract content from CHM files. For PDF files I used the pdftotext utility. Once you have the plain text of the eBook, parse it with a regular expression to find the ISBN-10 or ISBN-13 code.
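The two steps described in this answer (extract plain text, then search it with a regular expression) could look roughly like the sketch below. It is written in Python for brevity and is not the answerer's actual code. The function names `pdf_to_text`, `find_isbns`, `is_valid_isbn10` and `is_valid_isbn13` are hypothetical, the pattern assumes the number is preceded by the label "ISBN", and the checksum step is an addition to cut down on false positives. `pdf_to_text` assumes the `pdftotext` command-line tool is installed and on the PATH.

```python
import re
import subprocess

# Assumption: the number is introduced by an "ISBN" label (optionally
# "ISBN-10" or "ISBN-13"), and digit groups are separated by hyphens or spaces.
ISBN_CANDIDATE = re.compile(
    r"ISBN(?:-1[03])?:?\s*((?:97[89][- ]?)?(?:\d[- ]?){9}[\dXx])",
    re.IGNORECASE,
)


def pdf_to_text(pdf_path):
    """Run pdftotext and return the extracted text ("-" sends output to stdout)."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        capture_output=True, encoding="utf-8", check=True,
    )
    return result.stdout


def is_valid_isbn10(code):
    """ISBN-10 check: weights 10..1, sum divisible by 11; 'X' means 10, last place only."""
    if len(code) != 10:
        return False
    total = 0
    for i, ch in enumerate(code):
        if ch == "X":
            if i != 9:
                return False
            value = 10
        elif ch.isdigit():
            value = int(ch)
        else:
            return False
        total += (10 - i) * value
    return total % 11 == 0


def is_valid_isbn13(code):
    """ISBN-13 check: alternating weights 1 and 3, sum divisible by 10."""
    if len(code) != 13 or not code.isdigit():
        return False
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(code))
    return total % 10 == 0


def find_isbns(text):
    """Return checksum-valid ISBNs found in text, normalized, in order, without duplicates."""
    found = []
    for match in ISBN_CANDIDATE.finditer(text):
        code = re.sub(r"[- ]", "", match.group(1)).upper()
        valid = is_valid_isbn13(code) if len(code) == 13 else is_valid_isbn10(code)
        if valid and code not in found:
            found.append(code)
    return found
```

For a PDF, `find_isbns(pdf_to_text("book.pdf"))` would chain the two steps, where `book.pdf` is a placeholder path. The same pattern should work with .NET's `Regex` class using `RegexOptions.IgnoreCase`. Books that print the number without an "ISBN" label would need a looser pattern, at the cost of more false matches.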

answered Oct 14 '25 09:10 by aku


