How to extract author details from a PDF using Java [closed]

Tags:

java

pdf

I have thousands of PDF articles from which I need to extract only the author's name and related details such as address and email ID, as they appear in the content of the PDF itself. I don't want to rely on the PDF's metadata: I tried that and ended up with only a few fields such as author name, title, and other standard entries that I don't need.

I have gone through all the APIs I could find on the internet, but I still did not get a solution. I need to do it in Java.

Sathish Kumar k k asked Dec 03 '25 06:12



1 Answer

I don't think you can get this directly from any library. Use the iText library to read the PDF. Once you can read the text, find the author details using regular expressions.
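As a rough illustration of the regular-expression step only, here is a minimal sketch. It assumes the page text has already been extracted into a String by a PDF library such as iText, and it uses nothing beyond the standard library. The class name AuthorDetailExtractor, its two methods, and the idea that the author's name and address sit on the lines just above the email address are all assumptions made for this example, not something guaranteed by the PDF format or by any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: works on plain text that was already extracted from a PDF.
final class AuthorDetailExtractor {

    // Deliberately simple email pattern; it does not cover every valid address.
    private static final Pattern EMAIL = Pattern.compile(
            "[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\\.[A-Za-z0-9-]+)*\\.[A-Za-z]{2,}");

    private AuthorDetailExtractor() { }

    // Returns the distinct email addresses found in the text, in order of appearance.
    static List<String> findEmails(String text) {
        Set<String> found = new LinkedHashSet<>();
        Matcher matcher = EMAIL.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return new ArrayList<>(found);
    }

    // Heuristic (an assumption): author name and address often sit on the lines
    // just above the email. Returns the first line containing the email plus up
    // to linesBefore preceding lines, or an empty string if it is not found.
    static String contextBefore(String text, String email, int linesBefore) {
        String[] lines = text.split("\\R");
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].contains(email)) {
                int start = Math.max(0, i - linesBefore);
                return String.join("\n", Arrays.copyOfRange(lines, start, i + 1));
            }
        }
        return "";
    }
}
```

Calling findEmails on the extracted text of each article gives candidate addresses, and contextBefore with a small line count gives the block to inspect for the name and address. Layouts differ widely between journals, so the pattern and the line count would likely need tuning per source.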

Subhrajyoti Majumder answered Dec 05 '25 19:12
