#1 (permalink) |
|
All Star Reserve
Join Date: Feb 2002
Location: Bangalore, India
Posts: 599
Thanks: 1
Thanked 0x in 0 posts
|
Park Factors and Their Negative Impact on Historical Replays (Looooong...)
When one does a Historical Replay with OOTP, he or she has three options when it comes to dealing with stadiums for the teams:
a) One stadium with 100 for all park factors (Old League Park).
b) Create a stadium for each league, with park factors for each league for each season.
c) Use actual stadiums with varying park factors for each season.
Option A uses the default Old League Park that comes with the game and has 100 for every park factor. This works if you don't care much about actual parks, but for me, watching a game at the Polo Grounds is more realistic than seeing the Yankees and Giants square off in the 1921 World Series at Old League Park.
Option B was, if I recall correctly, proposed by TigerFan in the days of OOTP 4 when he was doing his replays. It involves creating two parks, one used by all the American League teams and one used by all the National League teams. The park factors are changed each season depending on which league was more offensive that year. For example, if you were going to replay the 1920 season, the AL hit 879 home runs per 100,000 at-bats, while the NL hit 619. The HR park factor would be 82 for the NL park and 118 for the AL park. Using this in your league, you would see the AL hit more home runs than the NL, thus replicating real life. However, the flaw is that it assumes HRs were hit fairly evenly across all parks in each season. Of course, that is not the case: the Yankees hit 115 HRs themselves, and a lot of HRs were hit in New York's Polo Grounds that year.
Option C involves changing park.dat, parkconfig.dat, and weather.dat for each season. A great example of this is Soxman's stadium files. It is supposed to bring a lot more realism to your replay. If you look at his park and parkconfig dat files from 1901, for example, you will see that each stadium has different park factors for batting average for lefties and righties, home runs for lefties and righties, doubles, and triples.
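As a rough check on the Option B numbers, here is a minimal sketch. It assumes each league's HR factor is simply that league's HR rate divided by the average of the two league rates, times 100. That formula and the function name are assumptions for illustration, not necessarily how TigerFan derived his values. It gives about 117 and 83, close to the 118 and 82 quoted above.

```python
def league_hr_factors(al_rate, nl_rate):
    """Return (AL, NL) HR factors scaled so the two-league average is 100.

    Assumed formula: league rate / mean of both league rates * 100.
    """
    mean_rate = (al_rate + nl_rate) / 2
    return 100 * al_rate / mean_rate, 100 * nl_rate / mean_rate


# 1920 rates from the post: HR per 100,000 at-bats.
al_factor, nl_factor = league_hr_factors(879, 619)
print(round(al_factor), round(nl_factor))
```

Because both factors come from a league-wide rate, every park in a league gets the same value, which is exactly the limitation described for Option B.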
This option seems to be the best one and should be the one used if you want to do historical replays, including fictional ones using actual players. Or is it? Option C is actually the worst of the three options. Yes, the worst. And this is why:
The park factors for each stadium are calculated after a season of baseball has been played. No one knows what the park factors for Citizens Bank Park in Philadelphia are because MLB hasn't been played there yet. It's a new stadium. The park factors are calculated from players' performances at that stadium during a given year. For example, the 2001 HR park factor for Pac Bell Park in San Francisco is skewed because Barry Bonds played a majority of his games there and hit a lot of his 73 HRs there. Had he played for Colorado, it's likely that the HR park factor for Coors Field would be higher than it already is in the 2001 dat file.
Now, let's look at how park factors affect a player's performance in an OOTP replay. Let's replay the 1920 season using the Lahman DB and Soxman's stadium dat files. Assuming the overall statistical results are accurate, Babe Ruth will likely hit over 50 HRs, and quite possibly over 60, barring injuries. He hit 54 that season, so that is good. However, now replay the season and, instead of leaving Ruth on the Yankees, trade him to the Boston Braves. He will be lucky to reach the 30 HR mark. Replay the 1920 season again, but this time trade Ruth to the New York Giants. He will end up with around 45-55 HRs, no more and no less.
Why? Simple: the HR park factor for lefties at the New York Yankees' Polo Grounds is 147, while it is a mere 47 in Boston's park. But wait, the Giants also played in the Polo Grounds that year, so Ruth should have duplicated his Yankee performance there, right? Wrong. The HR park factor for lefties at the Giants' Polo Grounds in 1920 is 101. Does it make sense that one park would have two different values for the same park factor just because different teams play there? No.
Ruth played for the Yankees and they hit a lot of HRs that year; thus, their park factor is 147. Boston had no left-handed power hitter and the NL was hitting fewer HRs anyway, thus the very low value in Boston. It's a similar case for the Giants' park. This is why Option C is the worst option.
Another reason why Option C is a bad choice is a limitation in the Lahman DB and in how OOTP rates players. The Lahman database and its modified versions do not include park-neutral data for everyone. They contain actual stats, which means park effects are already in the players' stats and, by extension, their ratings. When we do a historical replay using Option C, we are modifying players' ratings and their performance again. There is no need to do this.
Let me explain more clearly. Barry Bonds hit 73 HRs in 2001 and will import with an 18 rating for HR. His 73 HRs are already park-affected. How and why? If Bonds had played in Comerica Park, he would have hit maybe 55 HRs and thus his HR rating would have been 14. Had he played in Coors Field, he would likely have hit over 80 HRs and thus his HR rating would be 20 or 21 or higher. When we import Bonds and he comes with an 18 rating for HRs, park effects have already modified his HR rating. By using park factors in OOTP, we would simply be modifying his HR rating, internally, twice. This is not needed and detracts from realism.
So what is the solution to getting realistic stats from historical replays? There are two solutions: one which we can use now while we continue enjoying OOTP, and one which would have to be incorporated into a future version of OOTP. The solution for the current OOTP is to use actual parks but with 100 for all park factors.
The second solution is to create a NEW Strength/Power rating based on how far a player can hit the ball on a consistent basis. True power hitters don't hit home runs by getting the ball just over the fence; they hit the ball way over the fence, a la McGwire, Bonds, Giambi, etc.
In real life, they won't be penalized much if you take them from their current home parks and put them in Detroit. However, if one were to use the park factors in OOTP, those players would be affected a lot more than they should be. Just because they didn't play there a lot, if at all, doesn't mean they should be affected by a park factor calculated from the real Tigers' performance. The real Tigers didn't have good players. Had they had Bonds, Giambi, or McGwire, the park ratings would be higher.
Good power hitters hit the ball harder than most players, on the ground or in the air, and thus hit lots of HRs. OOTP can incorporate this by having a strength/power rating tied to how far a player can hit the ball. For example, on a scale of 1-100, a rating of 100 would mean that a player will on average hit the ball in the air about 400 feet. A rating of 1 would mean that the player will on average hit the ball in the air about 100 feet. I believe this is a better representation of a player's true power-hitting skills than a simple HR rating. In my opinion, park factors are backward looking whereas ballpark dimensions are more forward looking. Using the strength rating would make those "what if" scenarios more fun and probably more accurate.
P.S.: Soxman, I am in no way against your work. I enjoy using your stadiums a lot and if it weren't for them, I would never have come up with this new rating, which is the main point of this posting...
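The proposed strength rating can be sketched roughly as follows. The post only fixes the two endpoints (a rating of 1 is about 100 feet, a rating of 100 is about 400 feet), so the linear scale in between, the function names, and the fence distances are all assumptions for illustration, not anything OOTP actually does.

```python
def rating_to_distance(rating, min_ft=100.0, max_ft=400.0):
    """Linearly map a 1-100 strength rating to an average fly-ball distance in feet."""
    if not 1 <= rating <= 100:
        raise ValueError("rating must be between 1 and 100")
    return min_ft + (rating - 1) * (max_ft - min_ft) / 99


def margin_over_fence(rating, fence_ft):
    """Feet by which the average fly ball clears (positive) or misses (negative) a fence."""
    return rating_to_distance(rating) - fence_ft


# Hypothetical fence distances, for illustration only.
for fence_ft in (330, 380):
    print(fence_ft, round(margin_over_fence(100, fence_ft), 1))
```

The point of the design is that the hitter's rating stays the same in every park; only the fence distance it is compared against changes.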
|
|
|
|
|
#2 (permalink) |
|
All Star Starter
Join Date: Dec 2001
Location: Newburgh, NY
Posts: 1,930
Thanks: 0
Thanked 2x in 2 posts
|
I have thought about this before, and players getting a second "bump" could be troublesome too (Helton imports with a 9 HR rating whether he's in Colorado or Safeco, but in Colorado he gets the added bump of the park factors).
I just bit the bullet in my historical league and use the park factors at baseballparks.com (I think that's the site), but I also modified them so they come pretty close to averaging 100 (a little more or less, but close). I also review them when changing eras, and may adjust the factors slightly based on the era average of the park.
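One way to pull a set of factors back toward an average of 100 is to rescale them all by the same amount. This is only a sketch of that idea; the park names and values are made up, and it is not necessarily the adjustment used in the league described above.

```python
def normalize_to_100(factors):
    """Rescale park factors proportionally so that their mean is exactly 100."""
    mean = sum(factors.values()) / len(factors)
    return {park: value * 100 / mean for park, value in factors.items()}


# Hypothetical HR factors, for illustration only.
raw = {"Park A": 110, "Park B": 95, "Park C": 104, "Park D": 88}
adjusted = normalize_to_100(raw)
for park, value in adjusted.items():
    print(park, round(value, 1))
```

Proportional scaling keeps the ranking of the parks and the ratios between them, so a hitter-friendly park stays hitter friendly.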
__________________
Well, I don't really think that the end can be assessed as of itself as being the end because what does the end feel like? It's like saying when you try to extrapolate the end of the universe, you say, if the universe is indeed infinite, then how - what does that mean? How far is all the way, and then if it stops, what's stopping it, and what's behind what's stopping it? So, what's the end, you know, is my question to you. |
|
|
|
|
|
#3 (permalink) |
|
Global Moderator
Join Date: Dec 2001
Location: Muscatine, IA
Posts: 8,275
Thanks: 2
Thanked 35x in 2 posts
|
I always thought that park factors were player neutral. Interesting. Eventually, to model things accurately, I think OOTP would have to have some kind of physics model. When a flyball was hit, the distance and trajectory would be calculated based on the player's ratings, weather, wind, etc. When the ball's "drop" location was determined, it would be compared to park dimensions and fielder range data to determine whether the ball would fall for a hit, whether a fielder would reach it, etc. But all of this is probably MANY versions off.
|
|
|
|
|
|
#4 (permalink) |
|
Hall Of Famer
Join Date: Jan 2002
Location: Indianapolis, Indiana
Posts: 5,193
Thanks: 61
Thanked 132x in 78 posts
|
Thanks for the insight Ankit.
Everything you say makes sense to me. I will use 100s from now until someone proves otherwise.
__________________
GM Indianapolis 500s- Classic Baseball Union The American Baseball Congress - a fictional/historical league dynasty currently in 1871 |
|
|
|
|
|
#5 (permalink) |
|
Hall Of Famer
Join Date: Apr 2003
Location: Where you live
Posts: 10,453
Thanks: 2
Thanked 49x in 40 posts
|
I don't think Ankit proved that option C is bad. He only proved that the option is bad when the setup is wrong.
__________________
Jonathan Haidt: Moral reasoning is really just a servant masquerading as a high priest. |
|
|
|
|
|
#6 (permalink) | |
|
Hall Of Famer
Join Date: Apr 2003
Location: Where you live
Posts: 10,453
Thanks: 2
Thanked 49x in 40 posts
|
Quote:
However, the problem Ankit mentioned about imported stats not being park neutral might be too hard to solve, unless someone can spend some effort to modify the Lahman database.
__________________
Jonathan Haidt: Moral reasoning is really just a servant masquerading as a high priest. |
|
|
|
|
|
|
#7 (permalink) |
|
Minors (Triple A)
Join Date: Dec 2001
Location: Christchurch, England
Posts: 275
Thanks: 5
Thanked 7x in 6 posts
|
Ankit,
No offense taken. Could you suggest anything I could do to amend the dat files in my set to give better figures? Should I just change everything back to 100 across the board? Let me know and I'll re-do the files and post an amended set on my site. (I've got to have something to do during the late nights watching my Bruins lose in the first round of the play-offs!!!)
Cheers,
Soxman
|
|
|
|
|
#9 (permalink) | |
|
All Star Reserve
Join Date: Feb 2002
Location: Bangalore, India
Posts: 599
Thanks: 1
Thanked 0x in 0 posts
|
That is exactly what I would suggest you do. I have been trying to do this myself because, after making these changes, you won't need a dat file for every year, only for years in which there were park changes or park name changes. Let me know if I can be of any help.
Quote:
|
|
|
|
|
|
|
#10 (permalink) | |
|
All Star Reserve
Join Date: Feb 2002
Location: Bangalore, India
Posts: 599
Thanks: 1
Thanked 0x in 0 posts
|
It would affect any league that uses players based on the Lahman DB, so even leagues trying to replay 2003 are affected. The only leagues the park effects don't have an effect on are those using purely fictional players, or any other league that doesn't want to do "replays" or recreate history.
Quote:
|
|
|
|
|
|
|
#11 (permalink) |
|
Hall Of Famer
Join Date: Mar 2002
Location: Troy, NY
Posts: 2,734
Thanks: 52
Thanked 16x in 14 posts
|
Well, some parks are easier to hit triples in (Forbes), and some are harder to hit a HR in for righties than for lefties (Yankee Stadium).
What I have done is tweak my league settings so that if, say, Ty Cobb leaves Bennett Park, his average may not be affected but his doubles and triples may be. One player I used to make my adjustments was Gavvy Cravath, who hit 5, 8, and 9 HRs with the Red Sox and Senators, then smacked 16 after going to the Baker Bowl.
I believe you, Aadik, that they may be way off. You show enough to back up your claims. But if you change every park to 100 across the board, what's really the difference from using Old League Park, except for a name? Maybe averaging the stats of the park over every year of its existence against the league average would do away with players affecting the numbers for the most part?
|
|
|
|
|
#12 (permalink) | |
|
All Star Reserve
Join Date: Feb 2002
Location: Bangalore, India
Posts: 599
Thanks: 1
Thanked 0x in 0 posts
|
Quote:
|
|
|
|
|
|
|
#13 (permalink) |
|
Hall Of Famer
Join Date: Feb 2002
Location: Scheduleslovakia
Posts: 10,232
Thanks: 4
Thanked 1,177x in 700 posts
|
A couple of thoughts come to mind:
First, for a given park, what about averaging the park factor values for all of the years that park was in use? And in the case of two teams sharing one park, averaging both teams' individual park factors together over the amount of time they each spent in that park. Might an average value based on many seasons' results be more indicative of the park itself rather than of the qualities of a team from any given individual year?
Second, it sounds like there's a research opportunity here. Might it be possible to take a park's dimensions and other factors and compare them to the park factors produced from a number of seasons to see if there is any correlation? There would have to be a fair number of values, of course: the distances to the outfield fences in as many areas as possible, the wall heights, the typical direction and speed of any wind in the park, and so forth. Offhand, it seems that if one had enough detailed info on the parks' layout and features, it ought to be possible to compare that to the teams' stats and see if certain park layouts produce consistent park factors...
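The averaging idea in the first point could be sketched like this. It weights each team's factor by the number of seasons it covers; the function name and the weighting scheme are assumptions for illustration. The two values are the 1920 lefty HR factors for the Polo Grounds quoted earlier in the thread (147 for the Yankees, 101 for the Giants).

```python
def weighted_park_factor(samples):
    """Average park factors, weighting each (factor, seasons) pair by its season count."""
    total_seasons = sum(seasons for _, seasons in samples)
    if total_seasons == 0:
        raise ValueError("need at least one season")
    return sum(factor * seasons for factor, seasons in samples) / total_seasons


# One season each for the two teams sharing the Polo Grounds in 1920.
polo_grounds = [(147, 1), (101, 1)]
print(weighted_park_factor(polo_grounds))  # 124.0
```

With more seasons in the list, one unusual roster (such as Ruth's 1920 Yankees) has less pull on the final number.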
__________________
. "We choose to go to the moon in this decade and do the other things not because they are easy, but because they are hard. Because that goal will serve to organize and measure the best of our abilities and skills, because that challenge is one we are willing to accept, one we are unwilling to postpone, and one which we intend to win." . |
|
|
|
|
|
#14 (permalink) | |
|
Global Moderator
Join Date: Dec 2001
Location: Muscatine, IA
Posts: 8,275
Thanks: 2
Thanked 35x in 2 posts
|
Quote:
|
|
|
|
|
|
|
#15 (permalink) |
|
Minors (Triple A)
Join Date: Dec 2001
Location: Christchurch, England
Posts: 275
Thanks: 5
Thanked 7x in 6 posts
|
I've started to equalise all factors to 100. I've completed the first 15 of the 70 dat file sets.
The revised set should be uploaded by the weekend. There's a limit to the number of times you can type 100 return in one night!! |
|
|
|
|
|
#17 (permalink) |
|
All Star Starter
Join Date: May 2002
Location: The Lonely Mountain
Posts: 1,873
Thanks: 3
Thanked 20x in 17 posts
|
A park-neutralized Lahman database would be far more work than even Ankit's mammoth undertaking. You might be able to do it for one team or one season, but there would have to be some way to automate the process to do the entire database. I am very slowly running a career replay from 1901 using Soxman's stadiums, Carlton's rosters, and the Ankit databases. I play most of the games out and use as-played schedules, so it will be a while before I can say if the results are skewed. Thanks to all three of you for these terrific additions. Will upgrading to OOTP6 affect my ability to use these utilities in my replay?
|
|
|
|
|
|
#19 (permalink) | |
|
Hall Of Famer
Join Date: Apr 2003
Location: Where you live
Posts: 10,453
Thanks: 2
Thanked 49x in 40 posts
|
Quote:
__________________
Jonathan Haidt: Moral reasoning is really just a servant masquerading as a high priest. |
|
|
|
|
|
|
#20 (permalink) | |
|
Hall Of Famer
Join Date: Apr 2003
Location: Where you live
Posts: 10,453
Thanks: 2
Thanked 49x in 40 posts
|
Quote:
Therefore, I don't think studying what was mentioned above would yield much useful data.
__________________
Jonathan Haidt: Moral reasoning is really just a servant masquerading as a high priest. |
|
|
|
|