Lets say my key is not a simple data type but a class, and I need to sort the keys by using a comparison function. In Scala, I can do this by using, new Ordering. How can I achieve the same functionality in Python? For example, what would be the equivalent code in Python?
implicit val someClassOrdering = new Ordering[SomeClass] {
override def compare(a: SomeClass, b: SomeClass) = a.compare(b)
}
You can pass keyfunc argument:
from numpy.random import seed, randint
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y'])
seed(1)
rdd = sc.parallelize(
(Point(randint(10), randint(10)), randint(100)) for _ in range(5))
Now, lets say you want to sort Points by y coordinate:
rdd.sortByKey(keyfunc=lambda p: p.y).collect()
and result is:
[(Point(x=5, y=0), 16),
(Point(x=9, y=2), 20),
(Point(x=5, y=2), 84),
(Point(x=1, y=7), 6),
(Point(x=5, y=8), 9)]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With