
Ruby Hash.new with a block need in-depth explanation

Tags: ruby, hash

I'm looking at the solution to a problem called duped_index and I'm not quite getting the concept of this particular Hash.new variable:

def duped_index(arr)
    result = Hash.new { |hash, key| hash[key] = [] }

    arr.each_with_index do |ele, idx|
        result[ele] << idx
    end

    result.select { |alphabet, indices| indices.length > 1 }
end

p duped_index(["a", "b", "c", "c", "b", "b", "c", "d", "b"]) #=> {"b"=>[1, 4, 5, 8], "c"=>[2, 3, 6]}

Could you explain to me what's going on in between the Hash.new block? Would there be a more efficient way to solve this exercise in general?

hjungj21o Avatar asked Jan 25 '26 04:01



1 Answer

TL;DR

Default values are a way to declare a static or dynamic value for Hash keys without having to explicitly assign to each key ahead of time. In practice, this is often used to ensure that looking up a new key returns some sensible value rather than nil.

Your code uses the block form of Hash initialization to set the default value. I explain the block form below, and then contrast it with two simpler examples.

Setting a Default Value with a Block

In Ruby, a Hash object can be instantiated in a number of different ways. One way is to pass a block to Hash.new. This block will be called, with the Hash itself and the requested key as arguments, whenever a key that isn't present in the Hash is looked up with #[].

Consider this related example:

# define a default value using a block
h = Hash.new { |hash, key| hash[key] = [] }

# block dynamically assigns empty array
# to new keys
h.has_key? 'foo' #=> false
h['foo']         #=> []
h.has_key? 'foo' #=> true

Here, h is assigned a new Hash object with a block. When a key that has no entry yet is looked up, the block stores a new empty Array under that key and returns it, so the empty Array acts as the "default value" for members of the Hash that are not given explicit values. In practice, this means the first lookup of a previously-unassigned key returns [], and later lookups return that same stored Array without calling the block again.
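
To connect this back to duped_index, here is a minimal sketch that counts how often the block runs. The calls counter and the shortened input are added purely for illustration and are not part of the original solution:

```ruby
calls = 0 # illustrative counter, not part of the original solution

result = Hash.new do |hash, key|
  calls += 1
  hash[key] = [] # store a fresh Array under the missing key
end

%w[a b b].each_with_index do |ele, idx|
  # On a miss, the block runs, stores [], and returns it, and << then
  # appends to that stored Array. On a hit, the block is skipped.
  result[ele] << idx
end

result                            #=> {"a"=>[0], "b"=>[1, 2]}
calls                             #=> 2
result["a"].equal?(result["b"])   #=> false
```

The block ran once per distinct key, and each key got its own Array, which is why the indices for "a" and "b" do not get mixed together.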

Now consider:

h = Hash.new { |hash, key| hash[key] = [] }

# block does nothing for assigned keys
h.has_key? 'bar' #=> false
h['bar'] = nil
h['bar']         #=> nil
h.has_key? 'bar' #=> true

Note how assigning a value (even nil) sets the expected value. The default value is really only used when making the first access to a key that doesn't already exist.
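
As a small sketch of which operations actually trigger the block: as far as I know, only a lookup through #[] does, while membership checks and #fetch leave the Hash untouched. The key 'baz' and the :none fallback are arbitrary illustrative values:

```ruby
h = Hash.new { |hash, key| hash[key] = [] }

h.key?('baz')          #=> false (key? never runs the block)
h.fetch('baz', :none)  #=> :none (fetch ignores the default block)
h.keys                 #=> []

h['baz']               #=> []    (only this lookup runs the block)
h.keys                 #=> ["baz"]
```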

Why Use a Block?

A block declaration is generally more useful when you want to calculate the default value at runtime, or when the default value for new keys should be dynamic. For example:

# define some variable that will change
@foo = 0

# declare a Hash that dynamically calculates
# its default value
h = Hash.new { @foo + 1 }

h['foo']  #=> 1
@foo += 1
h['bar']  #=> 2
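
One detail worth noticing in the example above is that the block never assigns to the Hash, so nothing is stored and the value is recomputed on every lookup. The sketch below contrasts that with a block that does assign. It uses a local variable named counter as a hypothetical stand-in for @foo:

```ruby
counter = 0 # hypothetical stand-in for @foo

h = Hash.new { counter + 1 } # computes a value but never assigns it

h['foo']      #=> 1
h.keys        #=> [] (nothing was stored)
counter += 1
h['foo']      #=> 2  (recomputed on every lookup)

# Assigning inside the block keeps the first computed value instead.
m = Hash.new { |hash, key| hash[key] = counter + 1 }
m['foo']      #=> 2
counter += 1
m['foo']      #=> 2  (stored, so the block is not run again)
```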

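If you don't need that flexibility, you can pass a single object to the constructor instead. Be careful with mutable objects such as an Array literal, though: every missing key then returns the same Array object, and nothing is stored in the Hash until you assign to the key yourself. For example:

# every missing key returns the same Array object;
# no block is invoked and no key is stored
h = Hash.new []

A single default object is often semantically clearer when the default is immutable (for example, 0 for counters) or when you always assign a new value rather than mutating the default. In code like yours, where result[ele] << idx mutates the value in place, the block form is what gives each key its own Array.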
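
One caveat is worth checking before swapping the block for an Array literal. The sketch below, using the illustrative names shared and safe, shows what happens when the single default Array is mutated with <<, and a non-mutating alternative:

```ruby
shared = Hash.new([]) # one Array object is the default for every key

shared['a'] << 1      # mutates the default itself; no key is stored
shared['b'] << 2

shared.keys                       #=> []
shared['zzz']                     #=> [1, 2]
shared['a'].equal?(shared['b'])   #=> true

# Non-mutating assignment avoids the problem with a literal default:
safe = Hash.new([])
safe['a'] += [1]      # builds a new Array and stores it under 'a'
safe['b'] += [2]
safe                  #=> {"a"=>[1], "b"=>[2]}
```

With << on a shared default, duped_index would return an empty Hash, because no key is ever stored.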

See Also: Hash#fetch

Another way to get behavior similar to a default value is to pass a default argument (or a block) to Hash#fetch. For example, given a Hash without a default value, you can still supply a default when you do a key lookup:

h = {}
h.fetch 'foo', []
#=> []

The semantics and use cases for #fetch are different from those of Hash.new: #fetch returns the default without storing it, so in an example like yours you would have to assign the result back to the key to get the same practical results. The approach you take will ultimately depend on what you're trying to express with your code.
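
As a rough sketch of how the exercise could be written without a default block, here are two variants. The method names duped_index_fetch and duped_index_group_by are hypothetical, and neither is claimed to be faster than the original; the second is simply more compact:

```ruby
# Hypothetical variant using fetch. fetch never stores its default,
# so the updated Array is assigned back explicitly.
def duped_index_fetch(arr)
  result = {}
  arr.each_with_index do |ele, idx|
    result[ele] = result.fetch(ele, []) + [idx]
  end
  result.select { |_ele, indices| indices.length > 1 }
end

# Hypothetical variant that groups indices by the element at each index.
def duped_index_group_by(arr)
  arr.each_index
     .group_by { |idx| arr[idx] }
     .select { |_ele, indices| indices.length > 1 }
end

letters = ["a", "b", "c", "c", "b", "b", "c", "d", "b"]
duped_index_fetch(letters)     #=> {"b"=>[1, 4, 5, 8], "c"=>[2, 3, 6]}
duped_index_group_by(letters)  #=> {"b"=>[1, 4, 5, 8], "c"=>[2, 3, 6]}
```

Note that the fetch variant builds a new Array on every iteration, so the original block-based version is likely the better choice for large inputs.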

Todd A. Jacobs Avatar answered Jan 28 '26 01:01
