

One of my favorite models for paired competitions like baseball measures a team's strength by a parameter S. We assume that the 30 team strengths S1, …, S30 come from a normal distribution with mean 0 and standard deviation σ. If team A plays team B, then the probability that team A wins is given by the logistic model P(A wins) = exp(S_A - S_B) / (1 + exp(S_A - S_B)). Using data for a whole MLB season, one can estimate σ, the variation in team strengths. (By the way, I discussed fitting this type of model using an R package in an earlier post.)

Simulating Game Results

In the simulation below, I use a value of σ that seems representative of the estimates from recent seasons. I want to see if the observed standard deviation in team records is extreme relative to the predictive distribution.
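To make the idea concrete, here is a minimal sketch of this kind of predictive simulation in Python, using only the standard library. It is not the code behind this post. The value `SIGMA = 0.3`, the 14 rounds of games, and the random pairings that stand in for the real MLB schedule are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import random
import statistics

N_TEAMS = 30
SIGMA = 0.3    # assumed placeholder, NOT the value used in the post
N_ROUNDS = 14  # assumed: about two weeks, one game per team per round


def win_prob(s_a, s_b):
    """Probability that team A beats team B under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(s_a - s_b)))


def simulate_wins(rng, n_teams=N_TEAMS, sigma=SIGMA, n_rounds=N_ROUNDS):
    """Simulate one early-season stretch and return each team's win count."""
    # Draw team strengths from a normal(0, sigma) distribution.
    strengths = [rng.gauss(0.0, sigma) for _ in range(n_teams)]
    wins = [0] * n_teams
    teams = list(range(n_teams))
    for _ in range(n_rounds):
        # Random pairings are an assumption; the real schedule is not random.
        rng.shuffle(teams)
        for a, b in zip(teams[0::2], teams[1::2]):
            if rng.random() < win_prob(strengths[a], strengths[b]):
                wins[a] += 1
            else:
                wins[b] += 1
    return wins


def predictive_sds(n_sims=1000, seed=1, **kwargs):
    """Standard deviation of team wins in each of n_sims simulated stretches."""
    rng = random.Random(seed)
    return [statistics.stdev(simulate_wins(rng, **kwargs)) for _ in range(n_sims)]


def tail_fraction(observed_sd, sims):
    """Fraction of simulated standard deviations at least as large as observed_sd."""
    return sum(sd >= observed_sd for sd in sims) / len(sims)
```

Given the standard deviation of the 30 observed win totals, `tail_fraction(observed_sd, predictive_sds())` estimates how often the model produces a spread at least that large. A small fraction would suggest the early-season records are more variable than this model predicts.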

There are two reasons for variation in win/loss records. First, teams have different abilities, and we are pretty sure about the “good” teams that are likely to make it to the World Series and the “bad” teams that will have poor records.

In the early weeks of the 2018 baseball season, we’re observing some interesting team records. Looking at the standings after the games of April 14, I graph the number of wins of the 30 teams below. Several teams have remarkably good records: the Red Sox are 12-2, the Angels are 13-3, the Mets are 11-2, and the Diamondbacks are 11-3. In contrast, other teams are really struggling: the Reds are 2-12, and the Royals and Rays have won only 3 games. The Phillies are 8-5, which seems surprising for a rebuilding team. To me, these extreme records seem a bit surprising; there seems to be more variation in win/loss records than one would expect after two weeks of the baseball season.
