
ActiveStorage: how to prevent duplicate file uploads; find by filename

I am parsing email attachments and uploading them to ActiveStorage in S3.

We would like it to ignore duplicates, but I cannot seem to query by these attributes.

class Task < ApplicationRecord
  has_many_attached :documents
end

Then, in my email webhook job:

attachments.each do |attachment|
  tempfile = open(attachment[:url], http_basic_authentication: ["api", ENV.fetch("MAILGUN_API_KEY")])

  # I'd like to do something like this
  next if task.documents.where(filename: tempfile.filename, bytesize: tempfile.bytesize).exists?

  # This is what I'm currently doing
  task.documents.attach(
    io: tempfile,
    filename: attachment[:name],
    content_type: attachment[:content_type]
  )
end

Unfortunately, if someone forwards the same files, we end up with duplicates, and often more than two copies.

Edit with current solution:

tempfile = open(attachment[:url], http_basic_authentication: ["api", ENV.fetch("MAILGUN_API_KEY")])
md5_digest = Digest::MD5.file(tempfile).base64digest

# If a blob with this digest is already attached to the task, skip it.
next if ActiveStorage::Blob.joins(:attachments).where(
  checksum: md5_digest,
  active_storage_attachments: {name: 'documents', record_type: 'Task', record_id: task.id}
).exists?
Blair Anderson asked Jan 31 '26 18:01



1 Answer

Rails uses two tables for storing attachment data: active_storage_attachments and active_storage_blobs.

The active_storage_blobs table stores a checksum of each uploaded file (a base64-encoded MD5 digest). You can join against this table to check whether a file with the same content is already attached.
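To illustrate why the checksum is a better duplicate key than the filename, here is a minimal plain-Ruby sketch (no Rails needed). The sample hashes and the checksum_for helper are made up for illustration, and it assumes ActiveStorage's default MD5 checksum:

```ruby
require "digest"

# Hypothetical sample data: the same bytes arriving under two different filenames.
original  = { filename: "invoice.pdf",     body: "PDF bytes go here" }
forwarded = { filename: "Fwd-invoice.pdf", body: "PDF bytes go here" }
changed   = { filename: "invoice.pdf",     body: "different bytes" }

# Base64-encoded MD5 digest, the same encoding ActiveStorage keeps in
# active_storage_blobs.checksum (assuming its default MD5 setup).
def checksum_for(body)
  Digest::MD5.base64digest(body)
end

same_content   = checksum_for(original[:body]) == checksum_for(forwarded[:body])
same_name_only = checksum_for(original[:body]) == checksum_for(changed[:body])

puts "same content, different name: #{same_content}"   # true
puts "same name, different content: #{same_name_only}" # false
```

A forwarded email often renames the attachment, so matching on content catches duplicates that a filename match would miss, and it does not reject a genuinely new file that happens to reuse an old name.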

Going from @gustavo's answer I came up with the following:

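attachments.each do |attachment|
  # Download the attachment into a binary-mode tempfile so it can be hashed.
  tempfile = Tempfile.new
  tempfile.binmode
  tempfile.write URI.open(attachment[:url], http_basic_authentication: ["api", ENV.fetch("MAILGUN_API_KEY")]).read
  tempfile.flush
  # The blob checksum is a base64-encoded MD5 digest, so compute the same thing here.
  checksum = Digest::MD5.file(tempfile.path).base64digest

  # Skip the file if this task already has an attachment with the same checksum.
  if task.documents.joins(:blob).exists?(active_storage_blobs: {checksum: checksum})
    tempfile.close
    tempfile.unlink
    next
  end

  tempfile.rewind # so the attach call reads from the start of the file
  #... Your attachment saving code here
end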

Note: Remember to require 'tempfile' (plus 'open-uri' and 'digest', if they are not already loaded) in the file where you use this.
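One thing to watch out for: as far as I know, open-uri hands back a StringIO rather than a Tempfile for small responses, so anything that relies on a file path can break on small attachments. A rough sketch of a helper that hashes any readable IO instead; io_checksum and its chunk_size parameter are names I made up, not part of ActiveStorage:

```ruby
require "digest"
require "stringio"
require "tempfile"

# Hypothetical helper: base64 MD5 of any readable IO (StringIO, Tempfile, File).
def io_checksum(io, chunk_size: 16 * 1024)
  digest = Digest::MD5.new
  io.rewind
  # Read in chunks so large attachments are never loaded into memory at once.
  while (chunk = io.read(chunk_size))
    digest.update(chunk)
  end
  io.rewind # leave the IO positioned at the start for a later attach(io: ...) call
  digest.base64digest
end

small = StringIO.new("hello world")

large = Tempfile.new("attachment")
large.binmode
large.write("hello world")
large.flush

puts io_checksum(small) == io_checksum(large) # true: same bytes, same checksum
large.close!
```

Because the helper rewinds the IO when it is done, the same object can be passed straight to the attach call afterwards without uploading an empty file.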

Fabian de Pabian answered Feb 02 '26 08:02
