Team Rankings

Wins Based Systems

Historically and currently, most of our leagues use wins, or a points system based on wins (win = 3 points, draw = 1 point, loss = 0 points), as the primary sorting factor when ranking performance in a season. This is a simple metric that is generally easy to understand and compute, but the system has a few major flaws:

  • All wins count equally. If a highly skilled team beats another highly skilled team, it counts the same as if they beat a beginner team. Conversely, if an "average skill" team pulls off a big upset against a highly skilled opponent, it counts the same as if they beat a team of equal or lower skill.
  • Excused or planned absences hurt teams disproportionately. A game that isn't played can have a large negative effect on a team's rank, even when that game isn't counted as a loss (excused absence).
  • Schedule constraints mean that most leagues cannot have "perfect" schedules where all teams play each other equally. As a result, some teams will have easier schedules and some will have more difficult ones.
  • The small number of games played in a season means it's likely that multiple teams will end up with the same record, which requires multiple levels of "tie breaking".
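
As a rough illustration of the last point, the sketch below computes a points table using the 3/1/0 scoring described above. The team names, the six-game season and the `standings` helper are all made up for this example and are not taken from any real league.

```python
from collections import Counter

# Points per result, as described above: win = 3, draw = 1, loss = 0.
POINTS = {"W": 3, "D": 1, "L": 0}

def standings(results):
    """Return (team, points) pairs sorted from most to fewest points."""
    table = [(team, sum(POINTS[r] for r in games)) for team, games in results.items()]
    return sorted(table, key=lambda row: row[1], reverse=True)

# Hypothetical six-game season for four made-up teams.
season = {
    "Team A": ["W", "W", "W", "L", "L", "D"],
    "Team B": ["W", "W", "L", "W", "D", "L"],
    "Team C": ["W", "L", "L", "L", "D", "D"],
    "Team D": ["W", "L", "L", "L", "L", "D"],
}

table = standings(season)

# Point totals shared by more than one team would need a tie breaker.
tied = [pts for pts, count in Counter(p for _, p in table).items() if count > 1]

for team, pts in table:
    print(f"{team}: {pts}")
print("Point totals needing a tie breaker:", tied)
```

With only six games, two of the four teams already finish level on points, and nothing in the table says which of them faced the harder schedule.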

As a social sports league, one of our major considerations when planning a league and its schedule is providing a fun, competitive and safe atmosphere for all of our players, regardless of skill or experience. Matching highly skilled teams, whose players have multiple seasons or years of experience, against a corporate team looking for a fun team-building experience, a team of 30- and 40-somethings looking for a relaxing night out socializing with their buds, or a "Free Agent Team" consisting primarily of rookies who haven't played kickball since the 3rd grade works against this goal.

Groupings

To combat this, we have used various models of "groupings" to break leagues up either "evenly" (i.e. East/West Conferences) or into skill-based Competitive/Social Divisions.

Conferences (East West)

When building out East/West conferences, the league director generally evaluates the teams that have registered, ranks them by skill from 1 to N, and then attempts to split the league evenly. In the playoffs, the top teams from each conference face off to crown the "A-Bracket" champion.

Often, "cross-over" games or repeat matchups are required to fill out the regular season schedule. This can lead to an imbalance that positively or negatively affects the strength of schedule for some teams.

The rankings and the assignment of cross-over games also introduce subjectivity into the process that is impossible to avoid.

Divisions (Competitive/Social)

When registering, teams are given the option of joining a division based on their self-assessment of skill and desired level of game play. We frequently see a large percentage of teams opt to "play down" when given the choice, despite the "upper" division being a more suitable fit. We often struggle to find teams willing to be the "4th" or "6th" team in an "A-Bracket", knowing that they'll likely lose more games than if they were at the "top" of the "B-Bracket".

This often forces us to choose and assign the team we think best fits that last slot, a process that is highly subjective and generally not well received.

These divisions may not always be evenly sized, e.g. a competitive division may have 4 teams while a social division has 8. This may mean repeat matchups or "cross-over" matchups, with similar problems as outlined above.

The Elo System

The Elo rating system is a method for calculating the relative skill levels of teams within the same league.

The Elo system was invented by Arpad Elo, a physicist and master-level chess player, as an improved chess-rating system. It is also used as a rating system in association football (soccer), American football, baseball, the NBA, pool, table tennis, various board games and esports.

The difference in ratings between two teams serves as a predictor of the outcome of a match. More granularly, it tells us the share of points a team is expected to score.

We calculate this prediction using the following formula:

Team 1 Expectation = 1 / (1 + 10^((Team 2 Rating - Team 1 Rating) / 400))

For example, if both teams have a rating of 1400, our formula would appear as:

1/(1+10^((1400 - 1400)/400)), which simplifies to 1/2 or 50%. This means we'd expect each team to score about 50% of the points in a game. If these two "equal" teams were to play a series of games, we'd expect to see outcomes like: 2-1, 1-2, 1-1 or 2-0, 1-0, 0-3

A team with a 200 point advantage is predicted to score roughly 75% of the points in a match: 1/(1+10^((1400-1600)/400)) = 1/(1+10^-0.5) = 0.76 or 76%
Conversely, the team with a 200 point disadvantage is predicted to score roughly 25% of the points: 1/(1+10^((1600-1400)/400)) = 1/(1+10^0.5) = 0.24 or 24%
And if we saw these teams play a series of three games, we'd expect the outcomes to look like: 5-0, 6-1, 1-2 or 3-0, 4-0, 5-3
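
The prediction formula can be sketched in a few lines of Python. The function name `expected_score` is our own choice for this illustration and is not necessarily how the league's actual code is organized.

```python
def expected_score(team_rating, opponent_rating):
    """Expected share of the points for a team, using the prediction formula above."""
    return 1 / (1 + 10 ** ((opponent_rating - team_rating) / 400))

even = expected_score(1400, 1400)      # two equally rated teams
favorite = expected_score(1600, 1400)  # team with a 200 point advantage
underdog = expected_score(1400, 1600)  # team with a 200 point disadvantage

print(f"even: {even:.2f}, favorite: {favorite:.2f}, underdog: {underdog:.2f}")
```

The two expectations for any matchup always add up to 1, so the favorite's 76% and the underdog's 24% are two views of the same prediction.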

The second half of the algorithm moves from "predicted performance" to "actual performance", where a team's rating gets updated based on the match outcome using the formula:

R'(A) = R(A) + K * (Actual - Expected) where:
Actual = 1 if "A" won and 0 if they lost
Expected = the predicted value based on the formula above
K = the "K factor", which determines the maximum number of points any particular game is worth. The higher the value of K, the more a rating will fluctuate over time.
R(A) = the current rating of team "A"
R'(A) = the new rating of team "A"
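
As a minimal sketch of the update step, the code below applies the formula to an upset in which a 1400-rated team beats a 1600-rated team. The function names are hypothetical, and the default K of 32 is an assumption chosen only for illustration; the value our leagues actually use is not stated here.

```python
def expected_score(team_rating, opponent_rating):
    """Expected value for a team, using the prediction formula above."""
    return 1 / (1 + 10 ** ((opponent_rating - team_rating) / 400))

def updated_rating(rating, expected, actual, k=32):
    """R'(A) = R(A) + K * (Actual - Expected). k=32 is an assumed value."""
    return rating + k * (actual - expected)

underdog, favorite = 1400, 1600

# The underdog wins: Actual is 1 for the underdog and 0 for the favorite.
new_underdog = updated_rating(underdog, expected_score(underdog, favorite), actual=1)
new_favorite = updated_rating(favorite, expected_score(favorite, underdog), actual=0)

print(f"underdog: {underdog} -> {new_underdog:.1f}")
print(f"favorite: {favorite} -> {new_favorite:.1f}")
```

Because the two expectations sum to 1, whatever one team gains the other loses, so the total number of rating points in the league stays constant.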

Future Adjustments and Areas of Potential Improvement

Margin of Victory: Currently our algorithm does not take into account the margin of victory. If an underdog loses by 1 point when they were expected to lose by 5, then that loss should hurt less than if they lose by 10.
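
One possible way to account for margin, sketched below, is to replace the 1/0 value of Actual with the share of points the team actually scored. This is only an illustration of the idea and not something our algorithm does today; the function names, the K of 32 and the handling of a scoreless game are all assumptions.

```python
def expected_score(team_rating, opponent_rating):
    """Expected share of the points for a team."""
    return 1 / (1 + 10 ** ((opponent_rating - team_rating) / 400))

def actual_share(points_for, points_against):
    """Share of the game's points a team scored (assumed stand-in for Actual)."""
    total = points_for + points_against
    if total == 0:
        return 0.5  # assumption: treat a scoreless game as an even split
    return points_for / total

def updated_rating(rating, opponent_rating, points_for, points_against, k=32):
    """Rating update that uses the scored share instead of a 1/0 result."""
    expected = expected_score(rating, opponent_rating)
    return rating + k * (actual_share(points_for, points_against) - expected)

# A 1400-rated underdog loses to a 1600-rated team, first narrowly, then badly.
close_loss = updated_rating(1400, 1600, points_for=4, points_against=5)
blowout_loss = updated_rating(1400, 1600, points_for=0, points_against=10)

print(f"after a 4-5 loss:  {close_loss:.1f}")
print(f"after a 0-10 loss: {blowout_loss:.1f}")
```

Under this variant an underdog that loses narrowly can actually gain rating, since it outperformed its expected share. Whether that is desirable is an open design question.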

Seasonal Carry Over: Teams currently carry over their full Elo rating from season to season, which may unfairly weight past performance against current performance.

New Team Initial Rating: New teams generally get assigned the current league average as their initial Elo rating, placing them in the "middle" of the pack.
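
The league-average starting point could be computed along these lines. The ratings and team names are invented for the example, and the fallback of 1400 for an empty league is an assumption borrowed from the worked examples earlier, not a documented league setting.

```python
from statistics import mean

def initial_rating(current_ratings, default=1400):
    """League average for a new team; `default` (assumed) covers an empty league."""
    ratings = list(current_ratings)
    return mean(ratings) if ratings else default

# Hypothetical current ratings for three made-up teams.
league = {"Team A": 1520, "Team B": 1400, "Team C": 1310}

new_team_rating = initial_rating(league.values())
print(f"New team starts at {new_team_rating}")
```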