[Bug]: FFA rating is not zero sum #434

jauggy · 2024-08-30T10:10:23Z

Describe the Bug

Three players who only ever play FFA against each other can gain rating seemingly out of nowhere. Each player starts with 25 skill, so 75 skill in total among the three. If they only play each other over many games, we would expect the total skill to stay at roughly 75.

It does not.

Reproduce the bug

https://bar-rts.com/replays/b517af66db3fb90f3858f90d34bb06cd
TheAnnihilator 36
nanobot 10
iqwert1717 41
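
As a quick check of the totals, the three ratings listed above can be summed and compared with the 75 the players started with. This is only a minimal sketch that takes the numbers above at face value; the module `RatingSum` and its function are hypothetical names and not part of the codebase.

```elixir
defmodule RatingSum do
  # Hypothetical helper for illustration only, not part of match_rating_lib.
  @starting_skill 25

  # Returns how far the group's total is above (positive) or below (negative)
  # the total it would have if every player still had the starting skill.
  def surplus(ratings) when is_map(ratings) do
    total = ratings |> Map.values() |> Enum.sum()
    total - @starting_skill * map_size(ratings)
  end
end

ratings = %{"TheAnnihilator" => 36, "nanobot" => 10, "iqwert1717" => 41}
IO.inspect(RatingSum.surplus(ratings), label: "surplus over starting total")
```

The three ratings add up to 87, which is 12 more than the 75 the group started with.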

These three players only play each other and nobody else, yet their ratings sum to 87, well above the expected 75. TheAnnihilator's FFA games can be found via:
https://www.beyondallreason.info/replays?page=1&limit=24&preset=ffa&hasBots=false&endedNormally=true&players=TheAnnihilator

He only plays his friends.

Here is a game from July:
https://www.beyondallreason.info/replays?gameId=8ec88866bfc5ab84c52f8643294bb53a
where his rating is 0 and the other two players' ratings are 25 and 0.

Since then he has only played with his friends, and yet all three now have higher ratings. The ratings are coming from thin air.

Screenshots

No response

Additional context

See match_rating_lib


    # Build ratings into lists of tuples for the OpenSkill module to handle
    winner_ratings =
      winners
      |> Enum.map(fn membership ->
        rating = rating_lookup[membership.user_id] || BalanceLib.default_rating(rating_type_id)
        {membership.user_id, {rating.skill, rating.uncertainty}}
      end)

    # Now we want to get the best loser to use for the winner's win
    loser_ratings =
      losers
      |> Enum.group_by(
        fn %{team_id: team_id} -> team_id end,
        fn %{user_id: user_id} ->
          rating = rating_lookup[user_id] || BalanceLib.default_rating(rating_type_id)
          {user_id, {rating.skill, rating.uncertainty}}
        end
      )
      |> Map.values()


    # Run the winner calculation
    [win_result | _lose_result] = rate_with_ids([winner_ratings | loser_ratings])
    win_result = Map.new(win_result)

Here the winner's new rating appears to be calculated as if the match were a 1v2, i.e. the winner's ratings versus all of the losers' ratings. Teifion's comment in the code says "Now we want to get the best loser to use for the winner's win", but it is not the "best" loser that is used; it is all the losers.

    # If you lose you just count as losing against the winner
    loss_ratings =
      loser_ratings
      |> Enum.map(fn team_ratings ->
        lose_results = rate_with_ids([winner_ratings, team_ratings], as_map: true)

        team_ratings
        |> Enum.map(fn {user_id, _old_rating} ->
          rating_update = lose_results[user_id]

          user_rating = rating_lookup[user_id] || BalanceLib.default_rating(rating_type_id)
          ratiod_rating_update = apply_change_ratio(user_rating, rating_update, opponent_ratio)
          do_update_rating(user_id, match, user_rating, ratiod_rating_update)
        end)
      end)
      |> List.flatten()

However, each loser's rating is calculated as a 1v1, i.e. as if that loser had lost against the winner alone.

So if three friends A, B and C play an FFA and A wins, then:

Pretend A vs B+C and calculate A's new rating assuming a win.
Pretend B vs A and calculate B's new rating assuming a loss.
Pretend C vs A and calculate C's new rating assuming a loss.
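
The three steps above can be sketched as the list of teams that each player's update is computed from. This only illustrates the call pattern as I read it from the quoted code, where the winner's call receives every losing team and each loser's call receives only the winner and that loser's own team. `FfaMatchups` and `inputs_per_player/2` are hypothetical names, and no actual rating maths is run.

```elixir
defmodule FfaMatchups do
  # Hypothetical illustration, not code from match_rating_lib.
  # For each player, lists the teams that would be passed to the rating
  # call when that player's new rating is computed.
  def inputs_per_player(winner, losers) do
    # The winner's update sees the winner plus every losing team.
    loser_teams = Enum.map(losers, fn loser -> [loser] end)
    winner_entry = {winner, [[winner] | loser_teams]}

    # Each loser's update sees only the winner and that loser.
    loser_entries =
      Enum.map(losers, fn loser -> {loser, [[winner], [loser]]} end)

    Map.new([winner_entry | loser_entries])
  end
end

FfaMatchups.inputs_per_player("A", ["B", "C"])
|> IO.inspect()
```

In this sketch A's entry contains all three players, while B's and C's entries each contain only two. The winner and the losers are therefore rated from different views of the same match, which is the asymmetry I suspect lets the total drift.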

@geekingfrog is my interpretation of the code correct?

jauggy · 2024-08-30T10:11:53Z

https://www.beyondallreason.info/leaderboards
iqwert1717 is currently ranked 28 on the leaderboard and he just plays with his two other friends.

jauggy · 2024-08-30T10:16:18Z

We first need to confirm whether my analysis is correct. Then we can think of solutions from there. One avenue to pursue is to ask Vivek how he would handle FFA.

L-e-x-o-n · 2024-08-30T10:26:56Z

FFA ratings are not used for balancing; they just give a rough estimate of how many games someone has played (and won). Because they don't affect balance, they can be calculated in any way, from something as simple as counting wins to something like this, where opponent ratings are used as well. FFA itself is very unbalanced, from map starting positions and resources to your own starting position and neighbours.

jauggy added the bug (Something isn't working) label Aug 30, 2024

L-e-x-o-n added the low priority label Aug 30, 2024
