Extracting Metadata
Before a file is uploaded, Shrine automatically extracts metadata from it and
stores it in the Shrine::UploadedFile object.
uploaded_file = uploader.upload(file)
uploaded_file.metadata #=>
# {
#   "size" => 345993,
#   "filename" => "matrix.mp4",
#   "mime_type" => "video/mp4",
# }

Under the hood, Shrine#upload calls Shrine#extract_metadata, which you can
also use directly to extract metadata from any IO object:
uploader.extract_metadata(io) #=>
# {
#   "size" => 345993,
#   "filename" => "matrix.mp4",
#   "mime_type" => "video/mp4",
# }

The following metadata is extracted by default:
| Key | Default source |
|---|---|
| filename | extracted from io.original_filename or io.path |
| mime_type | extracted from io.content_type |
| size | extracted from io.size |
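
The default sources in the table can be pictured with a small plain-Ruby sketch. The `extract_basic_metadata` helper and the `FakeUpload` struct are hypothetical stand-ins written for illustration, not Shrine's actual implementation; the sketch only assumes the IO methods named in the table.

```ruby
require "stringio"

# Hypothetical helper (not part of Shrine): derives the three default
# metadata values from whichever methods the IO object responds to.
def extract_basic_metadata(io)
  filename =
    if io.respond_to?(:original_filename)
      io.original_filename
    elsif io.respond_to?(:path) && io.path
      File.basename(io.path)
    end

  {
    "filename"  => filename,
    "size"      => (io.size if io.respond_to?(:size)),
    "mime_type" => (io.content_type if io.respond_to?(:content_type)),
  }
end

# Hypothetical stand-in for an upload object that carries request headers.
FakeUpload = Struct.new(:original_filename, :content_type, :size)

from_upload = extract_basic_metadata(FakeUpload.new("matrix.mp4", "video/mp4", 345993))

# A bare StringIO knows its size, but has no filename or content type.
from_string_io = extract_basic_metadata(StringIO.new("hello"))
```

Values that the IO object cannot provide simply come back as nil, which is why plain IO objects tend to produce sparser metadata than request uploads.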
Accessing metadata
You can access the stored metadata in three ways:
# via methods (if they're defined)
uploaded_file.size
uploaded_file.original_filename
uploaded_file.mime_type
# via the metadata hash
uploaded_file.metadata["size"]
uploaded_file.metadata["filename"]
uploaded_file.metadata["mime_type"]
# via the #[] operator
uploaded_file["size"]
uploaded_file["filename"]
uploaded_file["mime_type"]

Controlling extraction
Shrine#upload takes a :metadata option, which accepts the following values:

* Hash – adds/overrides extracted metadata with the given hash

  uploaded_file = uploader.upload(file, metadata: { "filename" => "Matrix[1999].mp4", "foo" => "bar" })
  uploaded_file.original_filename #=> "Matrix[1999].mp4"
  uploaded_file.metadata["foo"]   #=> "bar"

* false – skips metadata extraction (useful in tests)

  uploaded_file = uploader.upload(file, metadata: false)
  uploaded_file.metadata #=> {}

* true – forces metadata extraction when a Shrine::UploadedFile is being uploaded (by default metadata is simply copied over)

  uploaded_file = uploader.upload(uploaded_file, metadata: true)
  uploaded_file.metadata # re-extracted metadata
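
The three option values can be summarized as a toy decision function. This is a sketch modeled only on the description above; `StoredFile`, `extract`, and `resolve_metadata` are hypothetical names, and Shrine's real logic may differ in details.

```ruby
require "stringio"

# Hypothetical stand-in for Shrine::UploadedFile: a file that was already
# uploaded and therefore carries previously extracted metadata.
class StoredFile
  attr_reader :size, :metadata

  def initialize(size, metadata)
    @size = size
    @metadata = metadata
  end
end

# Hypothetical extractor used in place of real metadata extraction.
def extract(io)
  { "size" => io.size }
end

# Toy model of how the :metadata option could be resolved.
def resolve_metadata(io, metadata: nil)
  result =
    if io.is_a?(StoredFile) && metadata != true
      io.metadata.dup # already uploaded: copy existing metadata over
    elsif metadata == false
      {}              # extraction skipped
    else
      extract(io)     # regular extraction, or forced with `true`
    end

  # A Hash is merged last, so it adds to or overrides extracted values.
  metadata.is_a?(Hash) ? result.merge(metadata) : result
end

io     = StringIO.new("abc")
stored = StoredFile.new(3, { "size" => 999 })

merged  = resolve_metadata(io, metadata: { "foo" => "bar" })
skipped = resolve_metadata(io, metadata: false)
copied  = resolve_metadata(stored)
forced  = resolve_metadata(stored, metadata: true)
```

Here `copied` keeps the stale stored value, while `forced` reflects a fresh extraction, which mirrors the difference between the default behaviour and `metadata: true`.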
MIME type
By default, the mime_type metadata is copied over from the
#content_type attribute of the input file (if present). However, since the
#content_type value comes from the Content-Type header of the upload
request, it's not guaranteed to hold the actual MIME type of the file (the
browser determines this header based on the file extension).
Moreover, only ActionDispatch::Http::UploadedFile, Shrine::RackFile, and
Shrine::DataFile objects have #content_type defined, so when uploading
objects such as File, the mime_type value will be nil by default.
To remedy that, Shrine comes with a
determine_mime_type plugin which is able to extract
the MIME type from IO content:
# Gemfile
gem "marcel", "~> 0.3"

Shrine.plugin :determine_mime_type, analyzer: :marcel

uploaded_file = uploader.upload StringIO.new("<?php ... ?>")
uploaded_file.mime_type #=> "application/x-php"

You can choose different analyzers, and even mix-and-match them. See the
determine_mime_type plugin docs for more details.
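
To make the idea of detecting the MIME type from content concrete, here is a minimal plain-Ruby sketch of magic-number sniffing. It is not how the determine_mime_type plugin or Marcel is implemented; `MAGIC_NUMBERS` and `sniff_mime_type` are hypothetical names, and the signature table is deliberately tiny.

```ruby
require "stringio"

# Hypothetical signature table covering only three formats.
MAGIC_NUMBERS = {
  "\x89PNG\r\n\x1A\n".b => "image/png",
  "\xFF\xD8\xFF".b       => "image/jpeg",
  "%PDF-".b              => "application/pdf",
}.freeze

# Reads the first bytes of the IO and compares them against known signatures.
def sniff_mime_type(io)
  header = io.read(8).to_s.b # nil at EOF becomes an empty string
  io.rewind                  # leave the IO ready for the actual upload

  MAGIC_NUMBERS.each do |signature, mime_type|
    return mime_type if header.start_with?(signature)
  end

  nil # unknown content
end

# The result depends on the bytes, not on any filename or request header.
png_io   = StringIO.new("\x89PNG\r\n\x1A\n".b + "rest of the image")
png_type = sniff_mime_type(png_io)
```

Because only the content is inspected, a file uploaded with a misleading extension or Content-Type header would still be identified by its leading bytes.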
Image Dimensions
Shrine comes with a store_dimensions plugin for
extracting image dimensions. It adds width and height metadata values, and
also adds #width, #height, and #dimensions methods to the
Shrine::UploadedFile object.
# Gemfile
gem "fastimage" # default analyzer

Shrine.plugin :store_dimensions

uploaded_file = uploader.upload(image)
uploaded_file.metadata["width"] #=> 1600
uploaded_file.metadata["height"] #=> 900

# convenience methods
uploaded_file.width #=> 1600
uploaded_file.height #=> 900
uploaded_file.dimensions #=> [1600, 900]

By default, the plugin uses FastImage to analyze dimensions, but you can also
have it use MiniMagick or ruby-vips. See the
store_dimensions plugin docs for more details.
Custom metadata
In addition to the built-in metadata, Shrine allows you to extract and store
any custom metadata, using the add_metadata plugin (which
internally extends Shrine#extract_metadata).
For example, you might want to extract EXIF data from images:
# Gemfile
gem "exiftool"

require "exiftool"

class ImageUploader < Shrine
  plugin :add_metadata

  add_metadata :exif do |io, context|
    Shrine.with_file(io) do |file|
      Exiftool.new(file.path).to_hash
    end
  end
end

uploaded_file = uploader.upload(image)
uploaded_file.metadata["exif"] #=> {...}
uploaded_file.exif #=> {...}

Or, if you're uploading videos, you might want to extract some video-specific metadata:
# Gemfile
gem "streamio-ffmpeg"

require "streamio-ffmpeg"

class VideoUploader < Shrine
  plugin :add_metadata

  add_metadata do |io, context|
    movie = Shrine.with_file(io) { |file| FFMPEG::Movie.new(file.path) }

    { "duration" => movie.duration,
      "bitrate" => movie.bitrate,
      "resolution" => movie.resolution,
      "frame_rate" => movie.frame_rate }
  end
end

uploaded_file = uploader.upload(video)
uploaded_file.metadata #=>
# {
#   ...
#   "duration" => 7.5,
#   "bitrate" => 481,
#   "resolution" => "640x480",
#   "frame_rate" => 16.72
# }

The yielded io object will not always be an object that responds to #path.
For example, with the data_uri plugin the io can be a StringIO wrapper,
while with the restore_cached_data or refresh_metadata plugins the io might
be a Shrine::UploadedFile object. That is why the examples use
Shrine.with_file: it ensures the block receives a file object.
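
The following plain-Ruby sketch approximates what a helper like Shrine.with_file has to do for IO objects without a path. `with_file_sketch` is a hypothetical name; the real method also handles Shrine::UploadedFile objects, which this sketch does not attempt.

```ruby
require "stringio"
require "tempfile"

# Hypothetical approximation: yield the IO itself when it is backed by a
# file on disk, otherwise copy its content into a temporary file first.
def with_file_sketch(io)
  return yield(io) if io.respond_to?(:path) && io.path

  Tempfile.create("with-file-sketch") do |tempfile|
    tempfile.binmode
    IO.copy_stream(io, tempfile)
    tempfile.flush  # make sure the content is on disk for path-based tools
    tempfile.rewind
    io.rewind       # leave the original IO ready for further reads
    yield tempfile
  end
end

path_seen = nil
source_io = StringIO.new("video bytes")

content = with_file_sketch(source_io) do |file|
  path_seen = file.path   # tools such as exiftool or ffmpeg need this path
  File.read(file.path)
end
```

The temporary file only exists for the duration of the block, so anything that needs the path has to run inside it, as the extractors above do.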
Adding metadata
If you wish to add metadata to an already attached file, you can do it as follows:
photo.image_attacher.add_metadata("foo" => "bar")
photo.image.metadata #=> { ..., "foo" => "bar" }
photo.save # persist changes

Metadata columns
If you want to write any of the metadata values into a separate database column
on the record, you can use the metadata_attributes plugin.
Shrine.plugin :metadata_attributes, :mime_type => :type

photo = Photo.new(image: file)
photo.image_type #=> "image/jpeg"

Direct uploads
When attaching files that were uploaded directly to the cloud or a tus server, Shrine won't automatically extract metadata from them; instead, it copies any existing metadata that was set on the client side. This is the default behaviour because metadata extraction requires (at least partially) retrieving file content from the storage, which can be expensive depending on the storage and the type of metadata being extracted.

# no additional metadata will be extracted in this assignment by default
photo.image = '{"id":"9e6581a4ea1.jpg","storage":"cache","metadata":{...}}'

Extracting on attachment
If you want metadata to be automatically extracted on assignment (which is
useful if you want to validate the extracted metadata or have it immediately
available for any other reason), you can load the restore_cached_data plugin:
Shrine.plugin :restore_cached_data # automatically extract metadata from cached files on assignment

photo.image = '{"id":"ks9elsd.jpg","storage":"cache","metadata":{}}' # metadata is extracted
photo.image.metadata #=>
# {
#   "size" => 4593484,
#   "filename" => "nature.jpg",
#   "mime_type" => "image/jpeg"
# }

Extracting in the background
A) Extracting with promotion
If you're using backgrounding, you can extract metadata during background
promotion using the refresh_metadata plugin (which the restore_cached_data
plugin uses internally):
Shrine.plugin :refresh_metadata # allow re-extracting metadata
Shrine.plugin :backgrounding
Shrine::Attacher.promote_block do
  PromoteJob.perform_async(self.class.name, record.class.name, record.id, name, file_data)
end

class PromoteJob
  include Sidekiq::Worker

  def perform(attacher_class, record_class, record_id, name, file_data)
    attacher_class = Object.const_get(attacher_class)
    record = Object.const_get(record_class).find(record_id) # if using Active Record

    attacher = attacher_class.retrieve(model: record, name: name, file: file_data)
    attacher.refresh_metadata! # extract metadata
    attacher.atomic_promote
  end
end

B) Extracting separately from promotion
You can also extract metadata in the background separately from promotion:
MetadataJob.perform_async(
  attacher.class.name,
  attacher.record.class.name,
  attacher.record.id,
  attacher.name,
  attacher.file_data,
)

class MetadataJob
  include Sidekiq::Worker

  def perform(attacher_class, record_class, record_id, name, file_data)
    attacher_class = Object.const_get(attacher_class)
    record = Object.const_get(record_class).find(record_id) # if using Active Record

    attacher = attacher_class.retrieve(model: record, name: name, file: file_data)
    attacher.refresh_metadata!
    attacher.atomic_persist
  end
end

Combining foreground and background
If you have some metadata that you want to extract in the foreground and some that you want to extract in the background, you can use the uploader context:
class VideoUploader < Shrine
  plugin :add_metadata

  add_metadata do |io, **options|
    next unless options[:background] # proceed only when `background: true` was specified

    # example of metadata extraction
    movie = Shrine.with_file(io) { |file| FFMPEG::Movie.new(file.path) }

    { "duration" => movie.duration,
      "bitrate" => movie.bitrate,
      "resolution" => movie.resolution,
      "frame_rate" => movie.frame_rate }
  end
end

class PromoteJob
  include Sidekiq::Worker

  def perform(attacher_class, record_class, record_id, name, file_data)
    attacher_class = Object.const_get(attacher_class)
    record = Object.const_get(record_class).find(record_id) # if using Active Record

    attacher = attacher_class.retrieve(model: record, name: name, file: file_data)
    attacher.refresh_metadata!(background: true) # specify the flag
    attacher.atomic_promote
  end
end

Now, triggering metadata extraction in the controller on attachment (using the
restore_cached_data or refresh_metadata plugin) will skip the video
metadata block, which will instead run later in the background job.
Optimizations
If you want to do both metadata extraction and file processing during
promotion, you can wrap both in an UploadedFile#open block to make
sure the file content is retrieved from the storage only once.
class PromoteJob
  include Sidekiq::Worker

  def perform(attacher_class, record_class, record_id, name, file_data)
    attacher_class = Object.const_get(attacher_class)
    record = Object.const_get(record_class).find(record_id) # if using Active Record

    attacher = attacher_class.retrieve(model: record, name: name, file: file_data)

    attacher.file.open do
      attacher.refresh_metadata!
      attacher.create_derivatives
    end

    attacher.atomic_promote
  end
end

If you're dealing with large files and have metadata extractors that use
Shrine.with_file, you might want to use the tempfile plugin to make sure
the same copy of the uploaded file is reused for both metadata extraction and
file processing.
Shrine.plugin :tempfile # load it globally so that it overrides `Shrine.with_file`

# ...
attacher.file.open do
  attacher.refresh_metadata!
  attacher.create_derivatives(attacher.file.tempfile)
end
# ...