How does work for different sorting strategies?

Question

I'm going through some tutorials on CodeAcademy, and came across this scenario:

books = ["Charlie and the Chocolate Factory", "War and Peace", "Utopia", "A Brief History of Time", "A Wrinkle in Time"]

# To sort our books in ascending order, in-place
books.sort! { |firstBook, secondBook| firstBook <=> secondBook }


# Sort your books in descending order, in-place below
# this lin initially left blank
books.sort! {|firstBook, secondBook| secondBook <=> firstBook}

Instead of using if/else blocks, I gave this a shot, and it worked, but I don't know why. I assume it doesn't matter which order you place the items in the check (i.e., a <=> b vs. b <=> a). Could someone explain what's happening here?

Ivaylo Strandjev · Accepted Answer

If you reverse the elements in <=> you reverse its value. If the elements are equal this operator returns 0, but if the first one is smaller it returns a negative value, if the first is greater it returns a positive value. Thus if temp = a <=> b then b <=> a is -temp. So you reverse the order of sorting if you write the arguments in reverse order.

the Tin Man · Answer

Here's some simple visual ways of seeing what <=> does, and how reversing the order of the comparison variables affects the order of the output.

Starting with a basic Array:

foo = %w[a z b x]

We can do an ascending sort:

foo.sort { |i, j| i <=> j } # => ["a", "b", "x", "z"]

Or a descending sort by reversing the two variables being compared:

foo.sort { |i, j| j <=> i } # => ["z", "x", "b", "a"]

The <=> operator returns -1, 0 or 1, depending on whether the comparison is <, == or > respectively.

We can test that by negating the result of the comparison, which will reverse the order if the theory holds true.

foo.sort { |i, j| -(i <=> j) } # => ["z", "x", "b", "a"]
foo.sort { |i, j| -(j <=> i) } # => ["a", "b", "x", "z"]

By negating the result of the comparisons the order does reverse. But, for clarity in code, just reverse the order of the variables.

That all said, using sort, or its destructive sibling sort!, isn't always the fastest way to sort complex objects. Simple objects, like strings and characters, and numerics, sort extremely quickly because their classes implement the necessary methods to perform <=> tests quickly.

Some of the answers and comments mention sort_by, so let's go there.

Complex objects don't usually sort correctly, so we end up using getters/accessors to retrieve some value we want to compare against, and that action has a cost in CPU time. sort repeatedly compares the values so that retrieval would occur repeatedly, and add up as wasted time when sorting wasn't happening.

To fix that, a smart guy named Randall Schwartz, who's a major player in the Perl world, started using an algorithm that precomputes once the value to be used to sort; That algorithm is commonly called a Schwartzian Transform as a result. That value, and the actual object, are bundled together in a small sub-array, and then sorted. Because the sort occurs against the pre-computed value, it, and its associated object, are moved around in the ordering, until the sort completes. At that point, the actual objects are retrieved and returned as the result of the method. Ruby implements that type of sort using sort_by.

sort_by doesn't use <=> externally, so you can sort by simply telling it how to get at the value you want to compare against:

class Foo
  attr_reader :i, :c
  def initialize(i, c)
    @i = i
    @c = c
  end
end

Here's the array of objects. Note that they are in the order they were created, but not sorted:

foo = [[1,  'z'], [26, 'a'], [2,  'x'], [25, 'b'] ].map { |i, c| Foo.new(i, c) }
# => [#<Foo:0x007f97d1061d80 @c="z", @i=1>,
#     #<Foo:0x007f97d1061d58 @c="a", @i=26>,
#     #<Foo:0x007f97d1061d30 @c="x", @i=2>,
#     #<Foo:0x007f97d1061ce0 @c="b", @i=25>]

Sorting them by the integer value:

foo.sort_by{ |f| f.i } 
# => [#<Foo:0x007f97d1061d80 @c="z", @i=1>,
#     #<Foo:0x007f97d1061d30 @c="x", @i=2>,
#     #<Foo:0x007f97d1061ce0 @c="b", @i=25>,
#     #<Foo:0x007f97d1061d58 @c="a", @i=26>]

Sorting them by the character value:

foo.sort_by{ |f| f.c } 
# => [#<Foo:0x007f97d1061d58 @c="a", @i=26>,
#     #<Foo:0x007f97d1061ce0 @c="b", @i=25>,
#     #<Foo:0x007f97d1061d30 @c="x", @i=2>,
#     #<Foo:0x007f97d1061d80 @c="z", @i=1>]

sort_by doesn't respond as well to using a negated value as sort and <=>, so, based on some benchmarks done a while back on Stack Overflow, we know that using reverse on the resulting value is the fastest way to switch the order from ascending to descending:

foo.sort_by{ |f| f.i }.reverse
# => [#<Foo:0x007f97d1061d58 @c="a", @i=26>,
#     #<Foo:0x007f97d1061ce0 @c="b", @i=25>,
#     #<Foo:0x007f97d1061d30 @c="x", @i=2>,
#     #<Foo:0x007f97d1061d80 @c="z", @i=1>]

foo.sort_by{ |f| f.c }.reverse 
# => [#<Foo:0x007f97d1061d80 @c="z", @i=1>,
#     #<Foo:0x007f97d1061d30 @c="x", @i=2>,
#     #<Foo:0x007f97d1061ce0 @c="b", @i=25>,
#     #<Foo:0x007f97d1061d58 @c="a", @i=26>]

They're somewhat interchangable, but you have to remember that sort_by does have overhead, which is apparent when you compare its times against sort times when running against simple objects. Use the right method at the right time and you can see dramatic speed-ups.

How does <=> work for different sorting strategies?

Tags:

ruby

MrDuk

2 Answers

Ivaylo Strandjev

the Tin Man

Recent Activity

Donate For Us