
Verify mime type of uploaded files in node.js

I'm using node and express to handle file uploads and I'm streaming them directly to conversion services using multiparty/busboy and request.

Is there a way to verify that the streams are of certain file types before sending them to the corresponding providers? I tried https://github.com/mscdex/mmmagic to get the MIME type from the first chunk(s), and it worked nicely. I was wondering whether the following workflow might work:

  • Buffer the file upload stream and check the incoming data for the MIME type.
  • Once the first few chunks have been checked and the MIME type is correct, flush the buffer into the request stream.
  • If the MIME type turns out to be wrong, send an error message and return.

I tried to get this working but I seem to have some stream compatibility issues (node 0.8.x vs. node 0.10.x streams, which are not supported by the request library).

Are there any best practices for solving this problem? Am I looking at it the wrong way?

EDIT: Thanks to Paul I came up with this code:

https://gist.github.com/chmanie/8520572

chmanie asked Oct 31 '25 13:10



1 Answer

Besides checking the Content-Type header of the client's request, I'm not aware of a better or more clever way to check MIME types.

You can implement the solution you described above using a Transform stream. In this example, the transform stream buffers some arbitrary amount of data, then sends it to your MIME checking library. If everything is fine, it re-emits data. The subsequent chunks will be emitted as-is.

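var stream = require('readable-stream'); // on Node 0.10+ the built-in 'stream' module also works
var mmm = require('mmmagic');

var MIN_CHUNKS = 10; // arbitrary number of chunks to buffer before checking

var mimeChecker = new stream.Transform();
mimeChecker.data = [];
mimeChecker.mimeFound = false;

// Run the buffered data through mmmagic, then release it downstream
mimeChecker._check = function (done) {
  var self = this;
  var buffered = Buffer.concat(self.data);
  new mmm.Magic(mmm.MAGIC_MIME_TYPE).detect(buffered, function (err, result) {
    // Passing the error to done() makes the stream emit 'error'
    if (err) return done(err);
    if (result !== 'text/plain') return done(new Error('Wrong MIME'));
    self.data.forEach(function (buf) { self.push(buf); });
    self.data = [];
    self.mimeFound = true;
    done();
  });
};

mimeChecker._transform = function (chunk, encoding, done) {
  // MIME type already verified: pass chunks through as-is
  if (this.mimeFound) {
    this.push(chunk);
    return done();
  }

  this.data.push(chunk);
  if (this.data.length < MIN_CHUNKS) return done(); // keep buffering
  this._check(done);
};

// Input ended before MIN_CHUNKS chunks arrived: check what was buffered
mimeChecker._flush = function (done) {
  if (this.mimeFound || this.data.length === 0) return done();
  this._check(done);
};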

You can then pipe this transform stream into any other stream, such as a request stream (which fully supports Node 0.10 streams, by the way).

// Usage example
var fs = require('fs');
fs.createReadStream('input.txt').pipe(mimeChecker).pipe(fs.createWriteStream('output.txt'));
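
To try the buffer-then-release pattern without installing mmmagic, here is a minimal sketch that only uses Node's built-in stream module. It assumes a recent Node.js version (for `Readable.from` and the simplified `Transform` constructor) and swaps `Magic.detect()` for a plain comparison against the 8-byte PNG signature. `createPngChecker` and `check` are hypothetical helper names, not part of any library.

```javascript
const { Transform, Readable } = require('stream');

// Every PNG file starts with these 8 bytes (the PNG signature).
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical helper: holds data back until enough bytes have arrived
// to compare against the signature, then releases or rejects it.
function createPngChecker() {
  let buffered = Buffer.alloc(0);
  let verified = false;

  return new Transform({
    transform(chunk, encoding, done) {
      if (verified) return done(null, chunk); // already checked: pass through

      buffered = Buffer.concat([buffered, chunk]);
      if (buffered.length < PNG_SIGNATURE.length) return done(); // keep buffering

      if (!buffered.subarray(0, PNG_SIGNATURE.length).equals(PNG_SIGNATURE)) {
        return done(new Error('Wrong MIME')); // fails the stream
      }
      verified = true;
      done(null, buffered); // release everything held back so far
    },
    flush(done) {
      // Input ended before there were enough bytes to decide.
      done(verified ? null : new Error('Wrong MIME'));
    },
  });
}

// Hypothetical helper: pipes chunks through the checker and reports
// either the bytes that came out or the error that was raised.
function check(chunks) {
  return new Promise((resolve) => {
    const out = [];
    Readable.from(chunks)
      .pipe(createPngChecker())
      .on('data', (c) => out.push(c))
      .on('error', (err) => resolve({ ok: false, message: err.message }))
      .on('end', () => resolve({ ok: true, bytes: Buffer.concat(out) }));
  });
}
```

Note that `.pipe()` does not forward errors to the destination, so the 'error' listener has to be attached to the checker itself; that is the place to send the error response to the client.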

Edit: To clarify the incompatibility you encountered between Node 0.8 and 0.10 streams: when you define a stream and attach a .on('data') listener to it, it switches into flowing mode (aka 0.8 streams), which means that it emits data even if the destination isn't listening. This is what can happen if you launch an asynchronous request to Magic.detect(): the data keeps flowing while you are still waiting for the result.
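
The mode switch can be observed directly. This is a minimal sketch, assuming a recent Node.js release where readable streams expose the `readableFlowing` property (it does not exist on the 0.8/0.10 streams discussed here); `source` and `received` are illustrative names.

```javascript
const { PassThrough } = require('stream');

const source = new PassThrough();

// No consumer attached yet: the stream is neither flowing nor paused.
const before = source.readableFlowing; // null

// Attaching a 'data' listener switches the stream into flowing mode.
const received = [];
source.on('data', (chunk) => received.push(chunk));
const afterListener = source.readableFlowing; // true

// pause() stops the flow until resume() is called.
source.pause();
const afterPause = source.readableFlowing; // false
```

A Transform stream avoids the problem because it does not accept the next chunk until the callback for the current one has been called.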

Paul Mougel answered Nov 03 '25 03:11



