[Bug]: FFA rating is not zero sum #434

Open
jauggy opened this issue Aug 30, 2024 · 3 comments
Labels
bug Something isn't working low priority

Comments

@jauggy
Member

jauggy commented Aug 30, 2024

Describe the Bug

Three players who play FFA only against each other can gain rating seemingly out of nowhere. Each of these players starts with 25 skill, or 75 skill among them. If they only play each other, we would expect the total skill to stay at roughly 75 no matter how many games they play.

It does not.

Reproduce the bug

https://bar-rts.com/replays/b517af66db3fb90f3858f90d34bb06cd
TheAnnihilator 36
nanobot 10
iqwert1717 41
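
As a quick sanity check, the snippet below sums the three ratings listed above and compares the total with the expected 75. It is only a sketch: it assumes the numbers shown on the replay page are directly comparable to the 25 starting skill, and the variable names are made up for illustration.

```elixir
# Ratings copied from the replay listed above.
ratings = %{"TheAnnihilator" => 36, "nanobot" => 10, "iqwert1717" => 41}

# Assumption taken from the issue text: every player starts at 25 skill.
starting_skill = 25

total = ratings |> Map.values() |> Enum.sum()
expected = starting_skill * map_size(ratings)

IO.puts("total: #{total}, expected: #{expected}, surplus: #{total - expected}")
```

With these numbers the total is 87, which is 12 above the expected 75.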

These three players only play each other, yet their ratings add up to 87 (36 + 10 + 41), which is above the expected 75. TheAnnihilator's FFA games are listed at:
https://www.beyondallreason.info/replays?page=1&limit=24&preset=ffa&hasBots=false&endedNormally=true&players=TheAnnihilator

He only plays his friends.

Here is a game from July:
https://www.beyondallreason.info/replays?gameId=8ec88866bfc5ab84c52f8643294bb53a
where his rating is 0 and the other two players' ratings are 25 and 0.

Since then he has only played with his friends and now they all have higher ratings somehow. The ratings are coming from thin air.

Screenshots

No response

Additional context

See match_rating_lib


    # Build ratings into lists of tuples for the OpenSkill module to handle
    winner_ratings =
      winners
      |> Enum.map(fn membership ->
        rating = rating_lookup[membership.user_id] || BalanceLib.default_rating(rating_type_id)
        {membership.user_id, {rating.skill, rating.uncertainty}}
      end)

    # Now we want to get the best loser to use for the winner's win
    loser_ratings =
      losers
      |> Enum.group_by(
        fn %{team_id: team_id} -> team_id end,
        fn %{user_id: user_id} ->
          rating = rating_lookup[user_id] || BalanceLib.default_rating(rating_type_id)
          {user_id, {rating.skill, rating.uncertainty}}
        end
      )
      |> Map.values()


    # Run the winner calculation
    [win_result | _lose_result] = rate_with_ids([winner_ratings | loser_ratings])
    win_result = Map.new(win_result)

It seems that the winner's rating is calculated as if the match were a 1v2, i.e. the winner's ratings against all of the losers' ratings. Teifion's code comment says "Now we want to get the best loser to use for the winner's win", but the code does not pick the "best" loser; it passes all the losers.

    # If you lose you just count as losing against the winner
    loss_ratings =
      loser_ratings
      |> Enum.map(fn team_ratings ->
        lose_results = rate_with_ids([winner_ratings, team_ratings], as_map: true)

        team_ratings
        |> Enum.map(fn {user_id, _old_rating} ->
          rating_update = lose_results[user_id]

          user_rating = rating_lookup[user_id] || BalanceLib.default_rating(rating_type_id)
          ratiod_rating_update = apply_change_ratio(user_rating, rating_update, opponent_ratio)
          do_update_rating(user_id, match, user_rating, ratiod_rating_update)
        end)
      end)
      |> List.flatten()

However, each loser's rating is calculated as a 1v1, i.e. as if that loser had lost to the winner alone.

So if three friends A, B and C play and A wins, then:

Pretend A vs B+C and calculate A's new rating assuming a win.
Pretend B vs A and calculate B's new rating assuming a loss.
Pretend C vs A and calculate C's new rating assuming a loss.
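
The three steps above can be sketched as a list of rating calls. `FfaMatchupSketch` and `rating_calls/2` are hypothetical names used only for illustration; they are not part of match_rating_lib. The sketch lists which teams each call would receive and does not compute any ratings. Note that the quoted code groups losers by `team_id`, so in an FFA each loser is passed as a separate team; whether that is scored as a 1v2 or as a three-way match depends on `rate_with_ids`, which is not shown here.

```elixir
defmodule FfaMatchupSketch do
  # Hypothetical helper, not part of match_rating_lib.
  # Returns the team lists that would be passed to each rating call
  # under the interpretation described in this issue.
  def rating_calls(winner, losers) do
    # In an FFA every loser is a team of one.
    loser_teams = Enum.map(losers, fn loser -> [loser] end)

    # One call for the winner: the winner's team followed by every loser team.
    winner_call = {:winner_update, [[winner] | loser_teams]}

    # One call per loser: only the winner's team against that loser's team.
    loser_calls =
      Enum.map(loser_teams, fn team -> {:loser_update, [[winner], team]} end)

    [winner_call | loser_calls]
  end
end

FfaMatchupSketch.rating_calls("A", ["B", "C"])
|> Enum.each(&IO.inspect/1)
```

Because the winner's update and each loser's update come from different calls with different opponents, nothing forces the gain and the losses to cancel out.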

@geekingfrog is my interpretation of the code correct?

jauggy added the bug label Aug 30, 2024
@jauggy
Member Author

jauggy commented Aug 30, 2024

https://www.beyondallreason.info/leaderboards
iqwert1717 is currently ranked 28 on the leaderboard and he just plays with his two other friends.

@jauggy
Member Author

jauggy commented Aug 30, 2024

Need to first confirm if my analysis is correct. Then we can think of solutions from there. One avenue to pursue is to ask Vivek how he would handle FFA.

@L-e-x-o-n
Collaborator

FFA ratings are not used for balancing; they just give a rough estimate of how many games someone has played (and won). Because they don't affect balance, they can be calculated in any way, from something as simple as counting wins to something like the current approach, where opponent ratings are used as well. FFA itself is very unbalanced, from map starting positions and resources to your own starting position and neighbours.
