Lets say I have around 1,000,000 users. I want to find out what position any given user is in, and which users are around him. A user can get a new achievement at any time, and if he could see his standing update, that would be wonderful.
Honestly, every way I think of doing this would be horrendously expensive in time and/or memory. Ideas? My closest idea so far is to order the users offline and build percentile buckets, but that can't show a user his exact position.
Some code if that helps you django people :
class Alias(models.Model) :
    awards = models.ManyToManyField('Award', through='Achiever')
    @property
    def points(self) :
        p = cache.get('alias_points_' + str(self.id))
        if p is not None : return p
        points = 0
        for a in self.achiever_set.all() :
            points += a.award.points * a.count
        cache.set('alias_points_' + str(self.id), points, 60 * 60) # 1 hour
        return points
class Award(MyBaseModel):
    owner_points = models.IntegerField(help_text="A non-normalized point value. Very subjective but try to be consistent. Should be proporional. 2x points = 2x effort (or skill)")
    true_points = models.FloatField(help_text="The true value of this award. Recalculated with a cron job. Based on number of people who won it", editable=False, null=True)
    @property
    def points(self) :
        if self.true_points :
            # blend true_points into real points over 30 days
            age = datetime.now() - self.created
            blend_days = 30
            if age > timedelta(days=blend_days) :
                age = timedelta(days=blend_days)
            num_days = 1.0 * age.days / blend_days
            r = self.true_points * num_days + self.owner_points * (1 - num_days)
            return int(r * 10) / 10.0
        else :
            return self.owner_points
class Achiever(MyBaseModel):
    award = models.ForeignKey(Award)
    alias = models.ForeignKey(Alias)
    count = models.IntegerField(default=1)
I think Counterstrike solves this by requiring users to meet a minimum threshold to become ranked--you only need to accurately sort the top 10% or whatever.
If you want to sort everyone, consider that you don't need to sort them perfectly: sort them to 2 significant figures. With 1M users you could update the leaderboard for the top 100 users in real time, the next 1000 users to the nearest 10, then the masses to the nearest 1% or 10%. You won't jump from place 500,000 to place 99 in one round.
Its meaningless to get the 10 user context above and below place 500,000--the ordering of the masses will be incredibly jittery from round to round due to the exponential distribution.
Edit: Take a look at the SO leaderboard. Now go to page 500 out of 2500 (roughly 20th percentile). Is there any point to telling the people with rep '157' that the 10 people on either side of them also have rep '157'? You'll jump 20 places either way if your rep goes up or down a point. More extreme, is that right now the bottom 1056 pages (out of 2538), or the bottom 42% of users, are tied with rep 1. you get one more point, and you jumped up 1055 pages. Which is roughly a 37,000 increase in rank. It might be cool to tell them "you can beat 37k people if you get one more point!" but does it matter how many significant figures the 37k number has?
There's no value in knowing your peers on a ladder until you're already at the top, because anywhere but the top, there's an overwhelming number of them.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With