View Poll Results: Improve rating system by incorporating the following features

Voters
32
  • Correctly calculated four digit rating

    21 65.63%
  • Public rating displayed in server browser and in-game scores

    24 75.00%
  • Allow unrated games by callvote

    11 34.38%
Multiple Choice Poll.
Results 1 to 10 of 17

Thread: Elo Rating System

  1. #1
    *Zo
    Guest

    Elo Rating System

    The Elo system is a mathematical formula for estimating an individual player's strength. It seems that iD developed their own system to fit Quake Live, and in my opinion it's not nearly good enough.

    Non-skill-matched games can be avoided by publicly showing a correctly calculated rating for each and every player. Rating is a number in the 1000-3000 range, not 0-100. Rating should never change faster than the regular 32 points over five games, as opposed to the current 15 rating points over one game.

    Tier slumming can be combated by indicating the number of matches played by the new player. Rating is provisional until 20 games have been played. Rating is corrected if you play versus players with provisional rating once 20 games have been played. A provisional rating is flagged and displayed with a P, like ####P. So when you are about to face a tier slummer, there is a clear indication that the skill rating might be totally off. Ratings take weeks to build, just like Quake Live skill.
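The provisional flag described above could be sketched roughly like this. This is only an illustration: the function name `format_rating` and the constant `PROVISIONAL_GAMES` are made up here, and the only thing taken from the post is the 20-game threshold and the trailing P.

```python
PROVISIONAL_GAMES = 20  # threshold taken from the post; the name is hypothetical


def format_rating(rating: float, games_played: int) -> str:
    """Return the rating as text, with a trailing 'P' while it is provisional."""
    label = str(round(rating))
    if games_played < PROVISIONAL_GAMES:
        label += "P"  # still provisional: fewer than 20 rated games
    return label


print(format_rating(1500, 3))     # provisional player
print(format_rating(1723.4, 20))  # established player
```

A server browser or scoreboard could show this string directly, so a new account is recognizable at a glance.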

    I can't tell you exactly how to implement this correctly, but clearly you haven't because a lot of players are pretty upset. Players can help the system to work if they know how it is working. The current one is way too easy to exploit, you just create a new account to get easy matches.

    Rating is accumulated more like experience points alongside frags and wins. Losing a match 30 to -1 doesn't make you a very bad player in comparison to your opponent. You should never lose more than 32 regular points of rating. In the current system that would be ~2 rating points.
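For reference, a minimal sketch of the standard Elo update, assuming K = 32 as in the post. The function names are illustrative. It shows why the loss from a single game is capped at K points no matter how lopsided the frag count was: only win, draw or loss enters the formula.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expectation: probability-like score between 0 and 1."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def update(rating: float, opponent: float, score: float, k: float = 32.0) -> float:
    """New rating after one game; score is 1 for a win, 0.5 for a draw, 0 for a loss."""
    return rating + k * (score - expected_score(rating, opponent))


# Losing to an equally rated opponent costs half of K, whatever the frag margin.
print(update(1500, 1500, 0.0))
# Even a heavy favourite cannot lose more than K points in one game.
print(update(3000, 1000, 0.0))
```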

    With all this in mind, the skill rating system for team games and FFA must be corrected also, but that is even trickier because Elo rating can only be used for 1 vs 1 matches like Duels.

    When you create an account you are supposed to play 100 games in your favorite gametypes not just one or two to check out Quake Live.

    A few rules of thumb. When you join a game or a specific team, you should know what rating is required of you to make it a skill-matched game. When an opponent joins, you should know what will be expected from you to maintain the balance. Rating must be displayed in the web lobby and in-game scores for every player, not just a useless server average comparison of some sort.

    Read further about the math behind Elo rating on Wikipedia.

    How will the community repay you? Is the reward only in the deed? Making Quake Live as good as it possibly can be clearly must be worth something for a dedicated game developer. Players will notice if things work and use it more frequently.

    Vote for new public rating.
    Last edited by Zo; 10-03-2010 at 08:34 AM.

  2. #2
    Senior Member Lam
    Join Date
    Aug 2010
    Posts
    3,763
    Quote Originally Posted by forum
    Message

    You did not select an option to vote for. Please press back to return to the poll and choose an option before voting.
    I'd vote, but something is obviously broken. Either it's the forum or your poll, I can't decide.

    None of the options will make skill rating any better. I also don't believe that you know better how it should be calculated than the people who work on it.

    One thing I liked at first was showing who's on probation (having unstable, provisional rating). But on second thought... There's a category of players, let's call them "Campgrounds pros". They have a mental condition prohibiting them from entering a game they are expecting to lose. So not only will they refuse playing on any map besides 3 maps they know, they will also kick everyone that's not yet rated in a game mode out of fear that he'll end up with 5-60 KDR.

    I've seen votes for kicking someone like that between maps. I've been called names for voting against such kick. I've seen other players' harassment and ragequits after such vote couldn't pass. I'm scared of making it worse.

    So you can show the ratings on profile page (I can already see some hints about someone's skill there), but not in game, please.

  3. #3
    Senior Member quakestreme is on a distinguished road
    Join Date
    Aug 2010
    Posts
    1,611
    I just thought of an easy-peasy way ELO could work for FFA. Treat it as if you're playing a single game against every other player in the match (provided they were all there from the start). So if there are ten players, it would be as if you played nine 1 vs 1 games (unconnected with your Duel rating of course): you are regarded as beating the people you finished ahead of and as losing to the people you finished behind. That way the ELO system isn't modified at all and it's all done very easily.

    You could also get to see the ratings of people in a server before you join. PERHAPS a person that joins late shouldn't have his frags or deaths counted towards the ratings of others since he/she's not being rated him/herself.
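The pairwise idea in this post can be sketched as follows. It is only a rough illustration: the function names are made up, K = 32 is an assumption, and the caller is assumed to pass in only the players who were present from the start, as suggested above.

```python
from typing import Dict


def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expectation for one player against one opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def ffa_rating_changes(ratings: Dict[str, float],
                       scores: Dict[str, int],
                       k: float = 32.0) -> Dict[str, float]:
    """Treat an FFA result as one 1 vs 1 game against every other player."""
    changes = {player: 0.0 for player in ratings}
    for player in ratings:
        for other in ratings:
            if player == other:
                continue
            if scores[player] > scores[other]:
                actual = 1.0   # finished ahead: counts as a win
            elif scores[player] < scores[other]:
                actual = 0.0   # finished behind: counts as a loss
            else:
                actual = 0.5   # tied on score: counts as a draw
            changes[player] += k * (actual - expected_score(ratings[player], ratings[other]))
    return changes


ratings = {"a": 1500.0, "b": 1500.0, "c": 1500.0}
scores = {"a": 30, "b": 20, "c": 10}
print(ffa_rating_changes(ratings, scores))
```

Note that with nine opponents the total swing in one match can reach nine times K. Dividing each change by the number of opponents would be one way to keep it within K, but that is a design choice the post does not address.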

  4. #4
    *Zo
    Guest
    Your points are valid but they don't outright prove that my idea is bad or wrong.

    I was thinking about posting a thread with a poll that asked if the rating system was flawless but that would be kind of pointless.

    My idea isn't really my idea at all. It's by no means arbitrary. For several decades the chess community has had a very robust and accurate rating system with hundreds of thousands of players divided into separate classes, giving everyone very balanced levels of opponents to play against. Here is a European player list. With a little clever use of MS Excel you can sort it by rating to get a better picture of what is going on. These are all skilled chess players in Europe, publicly rated.

    iD Software have zero experience dealing with a rating system, they must accept that others may have more accurate and functioning ways of dealing with lots of players that need to play equally skilled opponents.

    You don't have to be afraid of being kicked because of too high rating. The system shouldn't allow you to be able to play at the same server to begin with. Kicking spectators should not be allowed and a number of dedicated spectator slots should always be present. Those Campground pros won't gain rating when they only play lesser opponents. When no rating is public they are just counting wins. With rating public they will still be considered lesser players because they never take on any real challenges.

    Elo is not an acronym, it's the actual surname of the person who invented the system way back. He was a professor of physics and also a chess master.

    In Free For All you can't see a game as a whole bunch of 1 vs 1 games. The most skillful FFA players know who is weak and put them out of their misery before someone else does. Rating must be determined in score vs deaths ratios. You can't clank down on a winner because he lost many lives to one player without fragging them back. Time played is also an important factor. Players joining late are at a disadvantage and should not suffer unjust rating losses that do not compensate for this. For every minute of the match you have to put together a player pool with individual rating and calculate a fraction of a rating change. After the match you change the rating with a proper K factor: 32 for newbies, 24 for intermediates and 16 for pros. As you can see, rating is a slow process and it has to be, or else it wouldn't be accurate or meaningful. With a Premium subscription you are expected to play for a year, and most players will reach their true rating after a couple of weeks, or days if you play several hours a day.
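A rough sketch of the two knobs mentioned in this paragraph: a K factor that depends on level, and a rating change scaled by time played. The K values 32/24/16 come from the post. The rating cut-offs (2100 and 2400) and both function names are assumptions for illustration only, and the time scaling is a simplification of the per-minute pooling described above.

```python
def k_factor(rating: float,
             intermediate_from: float = 2100.0,
             pro_from: float = 2400.0) -> float:
    """Pick K by level: 32 for newbies, 24 for intermediates, 16 for pros.

    The two cut-offs are hypothetical defaults, not values from the thread.
    """
    if rating >= pro_from:
        return 16.0
    if rating >= intermediate_from:
        return 24.0
    return 32.0


def time_weighted_change(full_change: float,
                         minutes_played: float,
                         match_minutes: float) -> float:
    """Scale a rating change by the fraction of the match actually played."""
    if match_minutes <= 0:
        raise ValueError("match_minutes must be positive")
    fraction = min(max(minutes_played / match_minutes, 0.0), 1.0)
    return full_change * fraction


print(k_factor(1500), k_factor(2200), k_factor(2500))
print(time_weighted_change(-16.0, 5, 10))  # late joiner who played half the match
```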

    Games should not be played separately. There should be tournaments where a lot of players of all skill levels gather and play in groups. It would be really good if there was a way to enlist for a number of games in the web lobby and get a bunch of skill-matched games in a row, fully automated. Ladders are very good since most players come and go as they like. Elo rating can be used to rank players provided that there are enough digits in the number to separate them all.

    If you want to deal with the Campgrounds pros, if you deem that absolutely necessary, then you have to let the system decide who plays who not the players.

    Having awards for wins and frags leads you down a path of hunting newbies. Real awards are for tourney wins at an equal level of competition.
    Last edited by Zo; 10-03-2010 at 08:34 AM.

  5. #5
    Senior Member quakestreme is on a distinguished road
    Join Date
    Aug 2010
    Posts
    1,611
    Quote Originally Posted by Zo View Post
    Rating must be determined in score vs deaths ratios.
    No. No. No! FFA skill has nothing to do with deaths. Deaths are NOT important in FFA. The K/D ratio has NOTHING to do with the game, they're not ranked. The K/D ratio means nothing, it is just a curiosity and may help people see how another person won. Let's forget about that idea immediately please, it would change the entire game. If you did that then the whole in-game ranking must be kill/death ratio as well.

    Quote Originally Posted by Zo View Post
    You can't clank down on a winner because he lost many lives to one player without fragging them back.
    It is irrelevant, all that matters is how much he scored. His ranking is a pure indicator of how well he did in that FFA. There may be statistical deviations and other factors involved, but I would be totally against taking into account if one player killed another etc. and especially how many times you were killed, because that's not the object of the game.

    Quote Originally Posted by Zo View Post
    Time played is also an important factor. Players joining late are at a disadvantage and should not suffer unjust rating losses that do not compensate for this.
    Of course, my suggestion would be that they aren't rated at all.

    Quote Originally Posted by Zo View Post
    For every minute of the match you have to put together a player pool with individual rating and calculate a fraction of a rating change. After the match you change the rating with a proper K factor. 32 for newbies, 24 for intermediates and 16 for pros. As you can see, rating is a slow process and it has to be or else it wouldn't be accurate or meaningful.
    Your "RD" or "rating deviation" is what's used in the fics elo rating system (and other ones): if it's high, one match will bring your rating up or down more than usual. If you are provisional or haven't played for a long time then your "RD" is huge, which means a single game can win or lose you a lot of points. On your first or second game, your RD is huge. The more you play, the more your RD goes down. After playing a few games, your RD should be low and stable again. But there is no need for a different "ratings change" depending on your level. If a pro beats a pro they can gain 8 points, just like if a beginner beats a beginner (after they've gotten a provisional rating). 90% of the time, the RD will be as low as it can be; only if you're not playing for a long time will it go up. You don't *have* to have an RD in a ratings system, but of course your rating should go up or down fast as it's settling when you're provisional.

    You don't have to know or care about the exact mathematical equations for an ELO system to have a really good idea of what will happen if you win/lose. On fics, if you play someone your level and your RD is low, then a win is +8 and a loss is -8 with a draw 0. If you play someone 100 points greater, it might be +11 for a win and -6 for a loss. It's not critical to know the equations for it, but it might be nice for id to allow us to see them.
    Last edited by quakestreme; 10-03-2010 at 09:10 AM.
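The numbers quoted in this post can be roughly reproduced with the plain Elo formula, assuming a fixed K of 16. That K value is an assumption chosen so that an even game gives +8/-8; a system with a rating deviation scales the change per player instead of using one constant, so real results will differ somewhat.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expectation for one player against one opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def rating_change(rating: float, opponent: float, score: float, k: float = 16.0) -> float:
    """Points gained or lost in one game; score is 1, 0.5 or 0."""
    return k * (score - expected_score(rating, opponent))


print(rating_change(1500, 1500, 1.0))  # win against an equal opponent
print(rating_change(1500, 1500, 0.5))  # draw against an equal opponent
print(rating_change(1500, 1600, 1.0))  # win against someone 100 points higher
print(rating_change(1500, 1600, 0.0))  # loss against someone 100 points higher
```

With these assumptions the even game comes out at exactly +8 for a win and 0 for a draw, and the game against a player 100 points higher comes out at about +10 for a win and about -6 for a loss, close to the figures in the post.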

  6. #6
    Member FiletOFish is an unknown quantity at this point
    Join Date
    Aug 2010
    Posts
    53
    An individual player's skill rating should only be a means to an end. The end being matching players of similar skill together for fair games.

    It doesn't matter how accurately the system calculates individual skill if the system for grouping players together for matches is screwed.

  7. #7
    *Zo
    Guest
    Current rating is not calculated accurately nor displayed properly anywhere.

    Tiers/Ladders should be included in this system, so everyone knows where they belong and play their neighboring players. Leaderboards sorted by rating instead of other boring stuff that might have been earned by playing newbies or win-joining all day long.

    Disallow unmatched rated games. Allow spectating everywhere.

    Why did iD change their skill system recently? Because they are listening to us. People want this because they really need a sense of progress, not just repeating old victories over lesser players. We all want to play better, measuring skill accurately through Elo is the best way of having a value telling us how good we are and allows us to compare ourselves to everyone else.

    For current awards to be meaningful they need to be of different colors. Let's say we divide all players into 7 Tiers. When an award is earned it shows up under the player's current Tier. They have to earn the same award again in a higher Tier if they want it. This could go for all awards or be made applicable to every specific Tier. All Tiers can still have the current Leaderboards, but limited to players of their own Tier. Of course the rating leaderboard will be the most important of them all. All players will be able to view and spectate other Tiers, but play is restricted to one Tier, and when the number of players is decreasing, nearby Tiers are allowed to face one another. Unrated play is always allowed across Tiers so real-life friends etc. can play one another on any server.

  8. #8
    Senior Member g0vn0r is on a distinguished road
    Join Date
    Aug 2010
    Posts
    161
    Interesting thread...

    Zo, I think that you overestimate the merits of the Elo metric.

    People at FIDE are considering alternatives, because the Elo system has its own flaws:
    - It doesn't handle handicap. In chess, winning with the black pieces should grant more credit, and yet this is not taken into account at all by Elo...
    - In addition, the system isn't robust wrt incorrect initial guesses.

    I'll let you relate these two points to what is needed in QL.

    The Elo rating was designed when computations were done by hand, they had to be simple. But this was at the expense of accuracy.

    Today there are more reliable methods. You might be interested in reading this paper for example.

    I agree with you that a robust rating should be implemented. And splitting the range into tiers would be a simple matter of thresholding. However:
    - why should this number be necessarily visible to anyone?
    - how can you tell that id knows nothing about rating systems?
    - there might be server-side performance issues
    - whatever the quality of ANY rating system, you couldn't prevent "some people" from playing badly on purpose in order to get their ratings down and enjoy some occasional noob farming
    - please don't go off-topic with rewards and spectator slots

  9. #9
    *Zo
    Guest
    I am not going off-topic, I am adding more ideas regarding the skill system which is based upon Elo numbers and explaining how various factors like awards make people play for the wrong reasons.

    Considering all the products iD Software has released over the years, where would they have gotten this very specific experience? I am not saying they don't know anything about it. Clearly they have studied the matter deeply, but they have chosen a different, obscure and redundant way of interpreting the math and function of it.

    Just because a well-known system is flawed by default in some ways doesn't mean that using it differently is any better. At least you have to have the standard measurement of progress. Experimenting with a different one with less experience on the subject as a whole is dangerous. Trying to work in all sorts of variables that Quake Live provides, like accuracy, frags, etc., isn't foolproof, because once a player realizes that he is outskilled or outplayed, all in-game stats will go south as garbage time commences. A way to resign a game early to save face and stop wasting time would be very convenient too. Simply callvote resign, and if the whole team agrees, the game ends. It's pretty easy to win such a vote in Duel.

    Visible rating could be optional, but then again, who would show it openly? We are not playing poker here, we are not to deceive our opponents into thinking that they are going to have an easy win, that is just nonsense. Unfair skill differences should be avoided by simply restricting rated play to skill matched games. When you are playing without rating, you are playing without purpose.

    There won't be performance issues if servers are calculating the rating when the match is done and everyone is waiting for the next match to start. It will basically be nothing more than reporting in the scores. Just one more column for each player named rating adjustment.

    Some players who give away their rating will do so to someone who will happily receive it. Since rating is respect, you would rather not part with it. A way to stop abuse is to force players to pay for their account or only have serious rating for Premium users. For instance, once you pass a certain mark you will only be able to gain rating from other Premium users. This can also easily be done in another way by having two different ratings: a free one that everyone has and a Premium one that is only calculated between Premium users.

    I will read that paper, might prove enlightening and useful.

  10. #10
    Senior Member Lorfa
    Join Date
    Aug 2010
    Location
    Kepler-22b
    Posts
    8,386
    Quote Originally Posted by Diplodok View Post
    There's a category of players, let's call them "Campgrounds pros". They have a mental condition prohibiting them from entering a game they are expecting to lose.
    It is such a relief to hear that someone else has encountered what I call the "dirty rotten dm6 only fakenickers".
