 

Batch rename pdf files (IEEE articles)

Tags: rename, pdf, ieee

I have a very large number (thousands) of pdf files downloaded from IEEE Xplore.

Each filename contains only the article number, for example:

6215021.pdf

Now if you visit

http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=6215021

you can find all the information available about this article.

If you check the page source, you can find the section below:

        <meta name="citation_title" content="Decomposition-Based Distributed Control for Continuous-Time Multi-Agent Systems">
        <meta name="citation_date" content="Jan. 2013">
        <meta name="citation_volume" content="58">
        <meta name="citation_issue" content="1">
        <meta name="citation_firstpage" content="258">
        <meta name="citation_lastpage" content="264">
        <meta name="citation_doi" content="10.1109/TAC.2012.2204153">
        <meta name="citation_abstract_html_url" content="http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6215021' escapeXml='false'/>">
        <meta name="citation_pdf_url" content="http://ieeexplore.ieee.org/iel5/9/6384835/06215021.pdf?arnumber=6215021">
        <meta name="citation_issn" content="0018-9286">
        <meta name="citation_isbn" content="">
        <meta name="citation_language" content="English">
        <meta name="citation_keywords" content="
        Distributed control;
        Output feedback;
        Satellites;
        Stability criteria;
        Standards;
        State feedback;
        Upper bound;
        Distributed control;
        linear matrix inequality (LMI);
        multi-agent systems;
        robust control;">

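I would like to rename my files to "firstpage - citation_title.pdf". For the example above, that would be "258 - Decomposition-Based Distributed Control for Continuous-Time Multi-Agent Systems.pdf".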

My programming skills are limited (some C only, no clue about parsing) so I am counting on your help.

Thank all of you in advance!

asked Dec 08 '25 06:12 by BabylonX


1 Answer

You can compile the following C# code against the iTextSharp library. It renames every PDF file in a directory using the metadata embedded in the PDF itself: the title if it is present, otherwise the subject.

using System.IO;
using iTextSharp.text.pdf;

namespace BatchRename
{
    class Program
    {
        // Returns the trimmed value of a PDF Info entry (e.g. "Title"), or an empty string.
        private static string GetInfoValue(PdfReader reader, string key)
        {
            string value;
            reader.Info.TryGetValue(key, out value); // Reading PDF file's metadata
            return string.IsNullOrWhiteSpace(value) ? string.Empty : value.Trim();
        }

        // Replaces characters that are not allowed in file names (e.g. ':' or '/').
        private static string Sanitize(string name)
        {
            foreach (var c in Path.GetInvalidFileNameChars())
                name = name.Replace(c, '_');
            return name;
        }

        static void Main(string[] args)
        {
            // Directory that contains the PDF files; adjust to your own path.
            var dir = @"D:\Prog\1390\iTextSharpTests\BatchRename\bin\Release";

            foreach (var file in Directory.GetFiles(dir, "*.pdf"))
            {
                string title, subject;
                var reader = new PdfReader(file);
                try
                {
                    title = GetInfoValue(reader, "Title");
                    subject = GetInfoValue(reader, "Subject");
                }
                finally
                {
                    reader.Close(); // close the reader before renaming the file
                }

                // Prefer the title; fall back to the subject.
                var name = title.Length > 0 ? title : subject;
                if (name.Length == 0)
                    continue; // no usable metadata, keep the original name

                var newFile = Path.Combine(dir, Sanitize(name) + ".pdf");
                if (!File.Exists(newFile)) // skip name collisions instead of throwing
                    File.Move(file, newFile);
            }
        }
    }
}
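
The C# program above reads the metadata stored inside each PDF, so it cannot produce the "firstpage - citation_title.pdf" names the question asks for, because the first page number only appears in the <meta> tags of the abstract page. As a rough illustration of that other route, here is a minimal sketch in Python (standard library only) that builds the new file name from a page's HTML. The names CitationMetaParser, build_filename and rename_pdf are hypothetical helpers made up for this example, and it assumes the page still exposes citation_title and citation_firstpage as shown in the question.

```python
import re
from html.parser import HTMLParser
from pathlib import Path


class CitationMetaParser(HTMLParser):
    """Collects <meta name="citation_*" content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name") or ""
        content = attrs.get("content")
        if name.startswith("citation_") and content is not None:
            # Keep the first occurrence of each field.
            self.meta.setdefault(name, content.strip())


def build_filename(html):
    """Return 'firstpage - title.pdf', or None if either field is missing."""
    parser = CitationMetaParser()
    parser.feed(html)
    title = parser.meta.get("citation_title")
    first_page = parser.meta.get("citation_firstpage")
    if not title or not first_page:
        return None
    # Replace characters that are not allowed in file names.
    safe_title = re.sub(r'[\\/:*?"<>|]+', "_", title)
    # Collapse runs of whitespace (including line breaks) into one space.
    safe_title = re.sub(r"\s+", " ", safe_title).strip()
    return f"{first_page} - {safe_title}.pdf"


def rename_pdf(pdf_path, html):
    """Rename one PDF using the metadata found in its abstract page's HTML."""
    new_name = build_filename(html)
    if new_name is None:
        return None  # metadata missing, leave the file alone
    target = Path(pdf_path).with_name(new_name)
    if target.exists():
        return None  # never overwrite an existing file
    Path(pdf_path).rename(target)
    return target
```

Fetching the HTML is left out on purpose. One option is urllib.request.urlopen on the abstract URL from the question, with the article number taken from the file name (6215021.pdf gives arnumber=6215021). Whether the site still serves that URL and those tags is an assumption you would need to check, and with thousands of files it is worth pausing between requests.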
answered Dec 11 '25 13:12 by VahidN