I'm working with some large datasets and trying to improve performance. I need to determine whether an object is contained in an array. I was considering using either `index` or `include?`, so I benchmarked both.
    require 'benchmark'

    a = (1..1_000_000).to_a
    num = 100_000
    reps = 100

    Benchmark.bmbm do |bm|
      bm.report('include?') do
        reps.times { a.include?(num) }
      end
      bm.report('index') do
        reps.times { a.index(num) }
      end
    end
Surprisingly (to me), `index` was considerably faster.

                   user     system      total        real
    include?   0.330000   0.000000   0.330000 (  0.334328)
    index      0.040000   0.000000   0.040000 (  0.039812)
Since `index` provides more information than `include?`, I would have expected it to be slightly slower if anything, but that was not the case. Why is it faster?
(I know that `index` comes directly from the Array class, and `include?` is inherited from Enumerable. Might that explain it?)
Looking at the Ruby MRI source, it seems that `index` uses the optimized `rb_equal_opt` while `include?` uses `rb_equal`. This can be seen in `rb_ary_includes` and `rb_ary_index`. Here is the commit that made the change. It's not immediately clear to me why it's used in `index` and not `include?`.
You might also find it interesting to read the discussion of this feature.
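Since `rb_ary_includes` is Array's own C implementation, the Enumerable guess from the question can be checked directly from Ruby. The following is only a minimal sketch: `member_via_index?` is a hypothetical helper name introduced here for illustration, and the speed gap it might exploit depends on the Ruby version, so it is worth re-running the benchmark before relying on it.

```ruby
# Ask Ruby which class actually defines each method.
include_owner = Array.instance_method(:include?).owner
index_owner   = Array.instance_method(:index).owner

puts "include? is defined on #{include_owner}"
puts "index is defined on #{index_owner}"

# Hypothetical helper (not part of Ruby): a membership test built on index.
# index returns nil when nothing matches, so a non-nil result means "found".
# Note that an index of 0 is truthy in Ruby, but nil? makes the intent explicit.
def member_via_index?(array, obj)
  !array.index(obj).nil?
end

a = (1..1_000).to_a
puts member_via_index?(a, 1)      # first element, index 0
puts member_via_index?(a, 5_000)  # not present
```

Both methods compare elements with `==`, so the helper should agree with `include?` for any input; the only difference is which internal comparison function the interpreter calls.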