
In Ruby, how can I tell what the extension is for an image file that's hosted on the internet?

Tags:

ruby

open-uri

I want to download some images from the web, but some of the URLs do not specify a file extension, such as:

http://sportslabs-webproxy.imgix.net/http%3A%2F%2Fkty-platform-prod.silverchalice.co%2Fv3%2Fimages%2Fcontents%2F55bbe945e4b073340d3851fb?fit=clip&h=532&w=800&s=61b00197aca130a83de011484841158e

I was going to use the method described in "How do I download a picture using Ruby?" to download the files, but I'm not sure how to tell the script which file extension to save each one with.

daveomcd asked Dec 08 '25 05:12


2 Answers

Look into the ruby-filemagic gem.

For example:

require 'open-uri'
require 'filemagic'

url = 'http://sportslabs-webproxy.imgix.net/http%3A%2F%2Fkty-platform-prod.silverchalice.co%2Fv3%2Fimages%2Fcontents%2F55bbe945e4b073340d3851fb?fit=clip&h=532&w=800&s=61b00197aca130a83de011484841158e'

# Kernel#open no longer opens URLs as of Ruby 3.0; use URI.open instead.
File.open('raw_file', 'wb') do |file|
  file << URI.open(url).read
end

puts FileMagic.new(FileMagic::MAGIC_MIME).file( 'raw_file' )
# => 'image/jpeg; charset=binary'

UPDATE: To find the extension to save the file with, you can use the mime-types gem:

content_type = FileMagic.new(FileMagic::MAGIC_MIME).file( 'raw_file' ).split( ';' ).first

require 'mime/types'
puts MIME::Types[content_type].first.extensions.first 
# => 'jpeg'

Eric Terry answered Dec 11 '25 02:12


You can use the Content-Type HTTP header. For the URL you provided, the headers are:

$ curl -I "http://sportslabs-webproxy.imgix.net/http%3A%2F%2Fkty-platform-prod.silverchalice.co%2Fv3%2Fimages%2Fcontents%2F55bbe945e4b073340d3851fb?fit=clip&h=532&w=800&s=61b00197aca130a83de011484841158e"
HTTP/1.1 200 OK
Cache-Control: public,no-transform,max-age=86400,s-maxage=86400
Last-Modified: Mon, 01 Feb 2016 20:08:08 GMT
Content-Length: 35176
Accept-Ranges: bytes
Connection: keep-alive
Content-Type: image/jpeg
...

Here, you can see that the image is a JPEG. You can use a MIME-type library, e.g. mime-types for Ruby, to determine which extension to use for a given content type.

The vast majority of servers will specify a Content-Type header. If it's not specified, you can use Eric's approach to infer the file type from the contents.
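When neither the header nor a libmagic binding is available, the first few bytes of the file are often enough to recognise common image formats. The following is a minimal sketch using only core Ruby. It is a hand-rolled signature check, not what ruby-filemagic does internally; `sniff_image_extension` is a hypothetical helper name, and it only knows four formats.

```ruby
# Hypothetical helper: guess an image extension from the file's leading bytes.
# Covers only JPEG, PNG, GIF and WebP; returns nil for anything else.
def sniff_image_extension(bytes)
  head = bytes.byteslice(0, 12).to_s.b # binary copy of the first 12 bytes

  return 'jpg'  if head.start_with?("\xFF\xD8\xFF".b)
  return 'png'  if head.start_with?("\x89PNG\r\n\x1A\n".b)
  return 'gif'  if head.start_with?('GIF87a'.b, 'GIF89a'.b)
  # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
  return 'webp' if head.start_with?('RIFF'.b) && head.byteslice(8, 4) == 'WEBP'

  nil
end
```

Only the start of the file is needed, so something like `sniff_image_extension(File.binread('raw_file', 12))` would avoid loading the whole image.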

If you want to stick with open-uri, you can call the content_type method on the opened file to get the Content-Type:

require 'open-uri'

url = 'http://sportslabs-webproxy.imgix.net/http%3A%2F%2Fkty-platform-prod.silverchalice.co%2Fv3%2Fimages%2Fcontents%2F55bbe945e4b073340d3851fb?fit=clip&h=532&w=800&s=61b00197aca130a83de011484841158e'
# URI.open replaces the bare open(url), which stopped handling URLs in Ruby 3.0.
URI.open(url) do |file|
  content_type = file.content_type
  # Determine extension, copy file to disk, ...
end
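
The placeholder comment in the block above could be filled in along these lines. This is a minimal sketch that assumes a small hand-written table rather than the mime-types gem; `EXTENSIONS`, `extension_for` and `save_image` are hypothetical names, and the table covers only a few common image types. A StringIO stands in for the download so the example runs without network access.

```ruby
require 'stringio'
require 'tmpdir'

# Hypothetical lookup table; the mime-types gem covers far more types.
EXTENSIONS = {
  'image/jpeg' => 'jpg',
  'image/png'  => 'png',
  'image/gif'  => 'gif',
  'image/webp' => 'webp'
}.freeze

# Map a Content-Type value to an extension. Parameters such as
# "; charset=binary" are ignored; unknown types fall back to 'bin'.
def extension_for(content_type)
  type = content_type.to_s.split(';').first.to_s.strip.downcase
  EXTENSIONS.fetch(type, 'bin')
end

# Copy any readable IO into dir/basename.<ext> and return the path written.
def save_image(io, content_type, basename, dir)
  path = File.join(dir, "#{basename}.#{extension_for(content_type)}")
  File.open(path, 'wb') { |out| IO.copy_stream(io, out) }
  path
end

# Usage with an in-memory stand-in for a download:
Dir.mktmpdir do |dir|
  fake_download = StringIO.new("\xFF\xD8\xFF".b)
  puts save_image(fake_download, 'image/jpeg', 'photo', dir)
end
```

With a real download, the object yielded by `URI.open` could be passed as `io` and its `content_type` as the second argument.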

Chris Schmich answered Dec 11 '25 01:12


