Skip to content

Too easy to get inconsistent #hash and #eql? methods #59

@ms-ati

Description

@ms-ati

Problem

The Values gem in Ruby, much like case classes in Scala, makes a very sensible and pragmatic trade-off in its approach to enforcing immutability:

  • Value class instances are always frozen after construction
  • Actual values wrapped by the instances are not required to be frozen, however

Note: I 100% agree with this trade-off for a variety of reasons, no need to discuss here.

However, the current implementation has a problem in the case that a mutable wrapped value, such as an Array or String, is mutated after being wrapped by a value object. In such cases, the value object will have inconsistent #hash and #eql? methods, which violates Ruby language specification:

This function must have the property that a.eql?(b) implies a.hash == b.hash.

Example

Stats = Value.new(:totals)

x = Stats.new([1])
y = Stats.new([1])

[x == y, x.hash == y.hash]
#=> [true, true]

#
# EDGE CASE: mutate value of attribute of y
#
y.totals << 2
y
#=> #<Stats totals=[1, 2]>

[x == y, x.hash == y.hash]
#=> [false, true] <-- INCONSISTENCY

Analysis

As opposed to case classes in Scala, the Values gem currently pre-calculates the hash codes in all instance constructors. This enforces paying the cost of a call to #hash exactly once per instance, as a trade-off against paying it when used as a Hash key or as a Set element.

Because it is pre-calculated in all instances it cannot take mutations into account, but #eql? does take mutations into account.

Proposal

I'd already noticed that when wrapping sufficiently complex values, the time it takes to pre-calculate the hash code is already a significant drag on performance due to performing an expensive calculation during tight loops of functional code that produce an intermediate value object in each iteration. It appears that we could improve this gem, as well as fix this behavior, by introducing an opt-in way to pre-calculate certain methods in general, and then treat the #hash method as just one case of the more general phenomenon of wanting a "memoize this method" solution for immutable value objects.

Then, the general case could be that hash is no longer pre-calculated, speeding up general usage of Value objects when they are not to be used often as Hash keys or in Sets.

And in the cases where we do want the hash to be pre-calculated, we could use machinery such as:

# Returns a single instance with `#hash` pre-calculated
x2 = x.with_precalculated(:hash)

[x2 == x, x2.hash == x.hash]
#=> [true, true]

# Or, set a method as always pre-calculated at class level
Stats = Value.new(:totals) do
  precalculate :hash
end

This is an RFC, please review @tcrayford @michaeldiscala and other interested folks!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions