I'm working with some large datasets, and trying to improve performance. I need to determine whether an object is contained in an array. I was considering using either index or include?, so I benchmarked both.
require 'benchmark'

a = (1..1_000_000).to_a
num = 100_000
reps = 100

Benchmark.bmbm do |bm|
  bm.report('include?') do
    reps.times { a.include? num }
  end
  bm.report('index') do
    reps.times { a.index num }
  end
end
Surprisingly (to me), index was considerably faster.
               user     system      total        real
include?   0.330000   0.000000   0.330000 (  0.334328)
index      0.040000   0.000000   0.040000 (  0.039812)
Since index provides more information than include?, I would have expected it to be slightly slower, if anything. Why is it faster?
(I know that index comes directly from the Array class, and include? is inherited from Enumerable. Might that explain it?)
Looking at the Ruby MRI source, it seems that index uses the optimized rb_equal_opt while include? uses rb_equal. This can be seen in rb_ary_includes and rb_ary_index. Here is the commit that made the change. It's not immediately clear to me why it's used in index and not in include?.
You might also find it interesting to read the discussion of this feature.
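If the difference really comes from the equality check used inside each loop, the size of the gap may well depend on your Ruby version and on the type of the elements being compared. Here is a minimal sketch for re-running the comparison on your own interpreter with both Integer and String elements. Note that compare_lookup is a hypothetical helper written for this illustration, not part of Ruby or the Benchmark library, and the array sizes and targets are arbitrary choices.

```ruby
require 'benchmark'

# Hypothetical helper (not part of Ruby): runs `reps` lookups with each
# method and returns the elapsed wall-clock seconds keyed by method name.
def compare_lookup(array, target, reps: 100)
  {
    'include?' => Benchmark.realtime { reps.times { array.include?(target) } },
    'index'    => Benchmark.realtime { reps.times { array.index(target) } }
  }
end

# Arbitrary test data: the same values as Integers and as Strings.
ints = (1..100_000).to_a
strs = ints.map(&:to_s)

int_times = compare_lookup(ints, 50_000)
str_times = compare_lookup(strs, '50000')

puts "Ruby #{RUBY_VERSION}"
puts format('Integer  include?: %.6f  index: %.6f',
            int_times['include?'], int_times['index'])
puts format('String   include?: %.6f  index: %.6f',
            str_times['include?'], str_times['index'])
```

Timings will vary from run to run, so treat a single run as a rough indication only. If the two columns come out close on your version, the difference described above may no longer apply there.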