This is my class for detecting MIME types. I am trying to add a new MIME type (for .properties files) and have it detected.
This is my class file:
/*
 * To change this license header, choose License Headers in Project Properties.
 * To change this template file, choose Tools | Templates
 * and open the template in the editor.
 */
package check_mime;
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.apache.tika.Tika;
import org.apache.tika.mime.MimeTypes;
public class TikaFileTypeDetector {
    private final Tika tika = new Tika();
    public TikaFileTypeDetector() {
        super();
    }
    public String probeContentType(Path path) throws IOException {
        // Check contents first
        String fileContentDetect = tika.detect(path.toFile());
        if (!fileContentDetect.equals(MimeTypes.OCTET_STREAM)) {
            return fileContentDetect;
        }
        // Try file name only if content search was not successful
        String fileNameDetect = tika.detect(path.toString());
        if (!fileNameDetect.equals(MimeTypes.OCTET_STREAM)) {
            return fileNameDetect;
        }
        return null;
    }
    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            printUsage();
            return;
        }
        Path path = Paths.get(args[0]);
        TikaFileTypeDetector detector = new TikaFileTypeDetector();
        String contentType = detector.probeContentType(path);
        System.out.println("File is of type - " + contentType);
    }
    public static void printUsage() {
        // Show the expected single argument: the path of the file to inspect
        System.out.println("Usage: java -classpath ... "
                + TikaFileTypeDetector.class.getName()
                + " <file>");
    }
}
Following the docs, I created a custom XML file:
<?xml version="1.0" encoding="UTF-8"?>
<mime-info>
  <mime-type type="text/properties">
    <glob pattern="*.properties"/>
  </mime-type>
</mime-info>
How do I add this to my program so that Tika reads it? Do I have to create a parser? I'm stuck here.
Another way to get the MIME type of a file is by reading its content. We can determine the MIME type from specific characteristics of the file content. For example, a JPEG file starts with the hex signature FF D8 and ends with FF D9.
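As a rough illustration of content-based detection, here is a minimal sketch that checks only the two JPEG markers mentioned above. The class name JpegSignatureCheck and the method looksLikeJpeg are made up for this example, and the check is deliberately naive: real detectors such as Tika look at more than these four bytes, and a file with extra bytes after FF D9 would be rejected here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Tika: checks the JPEG start and end markers only.
final class JpegSignatureCheck {

    // Returns true if the data starts with FF D8 and ends with FF D9.
    static boolean looksLikeJpeg(byte[] data) {
        if (data == null || data.length < 4) {
            return false;
        }
        int n = data.length;
        // Mask with 0xFF because Java bytes are signed.
        return (data[0] & 0xFF) == 0xFF
                && (data[1] & 0xFF) == 0xD8
                && (data[n - 2] & 0xFF) == 0xFF
                && (data[n - 1] & 0xFF) == 0xD9;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java JpegSignatureCheck <file>");
            return;
        }
        Path path = Paths.get(args[0]);
        // Reads the whole file for simplicity; fine for a sketch, wasteful for large files.
        byte[] data = Files.readAllBytes(path);
        System.out.println(path
                + (looksLikeJpeg(data) ? " looks like a JPEG" : " does not look like a JPEG"));
    }
}
```

A library like Tika is usually a better choice in practice, since it combines many such signatures with file name hints.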
Multiple MIME types can use one extension. For example, if your organization uses multiple versions of a program, you can define a MIME type for each version; however, file names of all versions use the same extension.
A media type (also known as a Multipurpose Internet Mail Extensions or MIME type) indicates the nature and format of a document, file, or assortment of bytes. MIME types are defined and standardized in IETF's RFC 6838.
This is covered in the Apache Tika 5 minute parser instructions. To add support for Java .properties files, you should first create a file called custom-mimetypes.xml and populate it with something like:
<?xml version="1.0" encoding="UTF-8"?>
<mime-info>
  <mime-type type="text/properties">
    <_comment>Java Properties</_comment>
    <glob pattern="*.properties"/>
    <sub-class-of type="text/plain"/>
  </mime-type>
</mime-info>
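Before packaging the file, it can help to confirm that it is well-formed XML and declares the type and glob you expect. The following is a minimal sketch that uses only the JDK's DOM parser, not Tika. The names MimeInfoCheck, readGlobs and SAMPLE_XML are made up for this example, and it does not validate the file against whatever rules Tika itself applies.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sanity check, not part of Tika.
final class MimeInfoCheck {

    // Same content as the custom-mimetypes.xml shown above.
    static final String SAMPLE_XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<mime-info>\n"
            + "  <mime-type type=\"text/properties\">\n"
            + "    <_comment>Java Properties</_comment>\n"
            + "    <glob pattern=\"*.properties\"/>\n"
            + "    <sub-class-of type=\"text/plain\"/>\n"
            + "  </mime-type>\n"
            + "</mime-info>\n";

    // Maps each declared MIME type to the glob patterns listed under it.
    static Map<String, List<String>> readGlobs(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Map<String, List<String>> result = new LinkedHashMap<>();
            NodeList types = doc.getElementsByTagName("mime-type");
            for (int i = 0; i < types.getLength(); i++) {
                Element type = (Element) types.item(i);
                List<String> patterns = new ArrayList<>();
                NodeList globs = type.getElementsByTagName("glob");
                for (int j = 0; j < globs.getLength(); j++) {
                    patterns.add(((Element) globs.item(j)).getAttribute("pattern"));
                }
                result.put(type.getAttribute("type"), patterns);
            }
            return result;
        } catch (Exception e) {
            // Covers malformed XML as well as parser configuration problems.
            throw new IllegalArgumentException("Not a readable mime-info document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readGlobs(SAMPLE_XML));
    }
}
```

If the file is malformed (for example, if there is whitespace before the XML declaration), readGlobs throws instead of returning a map, which is an early hint that Tika may not be able to load it either.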
Next, you need to put that file somewhere Tika can find it, with the right name. It must be stored as org/apache/tika/mime/custom-mimetypes.xml on your classpath. The easiest thing to do is to create that directory structure, move the new file into it, then add the root directory to your classpath. For deployment, you should wrap that up into a jar and put it on the classpath.
You can use the Tika App to check that your MIME type file was loaded, if you're careful. With your code packaged as a jar, run it as something like:
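A quick way to confirm the file really is visible at that path is to ask a class loader for it. This is a minimal sketch using only the JDK. The class name CustomMimeTypesLocator and the method locate are made up for this example, and it assumes the class loader that loaded this class sees the same classpath your Tika code will run with. It says nothing about how Tika resolves the file internally.

```java
import java.net.URL;

// Hypothetical classpath check, not part of Tika.
final class CustomMimeTypesLocator {

    // Resource path taken from the text above.
    static final String RESOURCE = "org/apache/tika/mime/custom-mimetypes.xml";

    // Returns the URL of the resource, or null if the loader cannot see it.
    static URL locate(ClassLoader loader, String resource) {
        return loader.getResource(resource);
    }

    public static void main(String[] args) {
        URL url = locate(CustomMimeTypesLocator.class.getClassLoader(), RESOURCE);
        if (url == null) {
            System.out.println("Not found on the classpath: " + RESOURCE);
        } else {
            System.out.println("Found at: " + url);
        }
    }
}
```

Run it with the same -classpath value you use for your own program. If it reports that the file was not found, the directory structure or the file name is the first thing to check.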
java -classpath tika-app-1.10-SNAPSHOT.jar:my-custom-mimetypes.jar org.apache.tika.cli.TikaCLI --list-supported-types | grep text/properties
Alternatively, if you have it in a local directory, try something like:
ls -l org/apache/tika/mime/custom-mimetypes.xml
# Check a file was found, with some content in it
java -classpath tika-app-1.10-SNAPSHOT.jar:. org.apache.tika.cli.TikaCLI --list-supported-types | grep text/properties
If that doesn't show your MIME type, then the path or filename is wrong, so double-check them.
(Alternatively, upgrade to a newer version of Apache Tika: since r1686315, Tika has a Java Properties MIME type built in!)