|
|
#1 (permalink) |
|
All Star Starter
Join Date: Aug 2008
Posts: 1,660
Thanks: 412
Thanked 251x in 153 posts
|
OOTP 10 vs Puresim 2.45 season replay comparison
I'm kind of obsessed with how closely my baseball game matches the real-life performance of the players it models. I own OOTP and Puresim and did a little comparison. I played a 1986 replay 5 times with OOTP and Puresim. I let the computer run all the teams. I then compared the batters' performances in the game to their real-life stats in several categories. Note: These are based on the rate at which the stat occurs (i.e. HR / AB), not the absolute number of HRs. Below are the results.
1 standard deviation (1 SD) - The % of players within 1 SD of their real-life numbers. These players perform pretty darn close to reality. 68% is a good target for this group.
2 standard deviations (2 SD) - The % of players within 2 SD of their real-life numbers. Almost all players should be within this group. 95% is a good target for this group.
3 standard deviations (3 SD) - The % of players within 3 SD of their real-life numbers. Virtually all players should be within this group. 99.7% is a good target for this group.
More than 3 standard deviations (more than 3 SD) - The % of players more than 3 SD from their real-life numbers. These players performed very far from their real-life performance. 0.3% is a good target for this group.

1986 Replay comparison

Strikeouts
Puresim---Within 1 SD: 56.02%----Within 2 SD: 86.52%----Within 3 SD: 95.99%----More than 3 SD: 4.01%
OOTP------Within 1 SD: 60.42%----Within 2 SD: 89.68%----Within 3 SD: 97.51%----More than 3 SD: 2.50%
OOTP is slightly closer to reality in strikeouts.

Walks
Puresim---Within 1 SD: 57.21%----Within 2 SD: 87.69%----Within 3 SD: 97.54%----More than 3 SD: 2.46%
OOTP------Within 1 SD: 41.10%----Within 2 SD: 74.57%----Within 3 SD: 92.65%----More than 3 SD: 7.35%
Puresim is much closer to reality in walks.

Hits
Puresim---Within 1 SD: 64.61%----Within 2 SD: 91.84%----Within 3 SD: 97.99%----More than 3 SD: 2.01%
OOTP------Within 1 SD: 58.58%----Within 2 SD: 89.89%----Within 3 SD: 98.68%----More than 3 SD: 1.32%
Puresim is moderately closer to reality in hits, but does have a few more players that are significantly further from reality.

Doubles
Puresim---Within 1 SD: 74.38%----Within 2 SD: 95.98%----Within 3 SD: 99.29%----More than 3 SD: 0.71%
OOTP------Within 1 SD: 55.65%----Within 2 SD: 90.23%----Within 3 SD: 98.68%----More than 3 SD: 1.32%
Puresim is much closer to reality.
Triples
Puresim---Within 1 SD: 68.60%----Within 2 SD: 86.54%----Within 3 SD: 91.72%----More than 3 SD: 8.28%
OOTP------Within 1 SD: 53.21%----Within 2 SD: 80.11%----Within 3 SD: 89.82%----More than 3 SD: 10.18%
Puresim is much closer to reality.

Home Runs
Puresim---Within 1 SD: 66.72%----Within 2 SD: 89.13%----Within 3 SD: 94.11%----More than 3 SD: 5.89%
OOTP------Within 1 SD: 62.29%----Within 2 SD: 92.99%----Within 3 SD: 99.17%----More than 3 SD: 0.83%
Home runs are a toss-up. Puresim has more players that are very close to reality. OOTP has fewer players that are very far from reality.

I think Puresim makes a strong case for itself as being the more accurate sim for a season replay. An interesting comparison would be to play 10 seasons and compare how each game does at modeling a player over their career. But I think I am getting tired of doing comparisons for now. I would also like to get pitcher numbers. If you don't mind, I would like to get an idea of how many people are interested in these numbers. I appreciate any comments you have.

Another consideration in choosing a sim is the feature set. OOTP has an edge when it comes to a more engaging financial / contract model with arbitration, the Rule 5 draft and such. It also tracks some more stats. For people who like verbose play-by-play, OOTP has that too, along with a crazy amount of customization. To me, Puresim makes playing out your games more engaging by giving you some control over when your runners tag up and try for extra bases. I also believe Puresim models fielders better and gives feedback as to when a player made a great defensive play and when they didn't get to a ball they should have. Puresim's play-by-play is brief and to the point, which I like. I also don't need all the customization because I just want an MLB setup.

It is good that we have two fine baseball sims to choose from. Puresim has come a long way in the last 2 months. They don't make it easy to choose the best one.
Rob
Last edited by robc; 01-01-2010 at 10:54 PM. Reason: added info
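The bucketing described in the first post can be sketched roughly as follows. The post does not say how the standard deviation was computed, so this sketch assumes a binomial SD built from the player's real-life rate and his simulated at-bats; the function names and the sample players are hypothetical and for illustration only.

```python
import math

def rate_sd(p, n):
    """Binomial SD of an observed rate when the true rate is p over n at-bats (assumed model)."""
    return math.sqrt(p * (1.0 - p) / n)

def sd_bucket(real_rate, sim_rate, sim_ab):
    """Return 1, 2, or 3 if the sim rate is within that many SDs of the real rate, else 4."""
    sd = rate_sd(real_rate, sim_ab)
    if sd == 0:
        return 1 if sim_rate == real_rate else 4
    z = abs(sim_rate - real_rate) / sd
    for k in (1, 2, 3):
        if z <= k:
            return k
    return 4

def bucket_shares(players):
    """players: list of (real_rate, sim_rate, sim_ab).

    Returns the cumulative % within 1, 2, and 3 SD, plus the % beyond 3 SD.
    """
    counts = {1: 0, 2: 0, 3: 0, 4: 0}
    for real_rate, sim_rate, sim_ab in players:
        counts[sd_bucket(real_rate, sim_rate, sim_ab)] += 1
    total = len(players)
    within1 = 100.0 * counts[1] / total
    within2 = 100.0 * (counts[1] + counts[2]) / total
    within3 = 100.0 * (counts[1] + counts[2] + counts[3]) / total
    return within1, within2, within3, 100.0 - within3

# Hypothetical players: real HR/AB, simulated HR/AB, simulated AB.
sample = [
    (0.05, 0.050, 500),
    (0.05, 0.065, 500),
    (0.05, 0.075, 500),
    (0.05, 0.100, 500),
]
print(bucket_shares(sample))
```

If the SD were instead taken across the 5 replays of each player, only `rate_sd` would need to change; the bucketing itself stays the same.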
|
|
|
|
|
#2 (permalink) |
|
All Star Starter
Join Date: Jun 2004
Location: In the vicinity of Buffalo,NY
Posts: 1,627
Thanks: 2
Thanked 9x in 8 posts
|
You need to run hundreds upon hundreds of sims to come to any real conclusions...
|
|
|
|
| 2 thanks for this post: | Cryomaniac (01-02-2010), edddgar (01-02-2010) |
|
|
#3 (permalink) | |
|
All Star Starter
Join Date: Aug 2008
Posts: 1,660
Thanks: 412
Thanked 251x in 153 posts
|
Quote:
Thanks for your comment Matter2003. |
|
|
|
|
|
|
#5 (permalink) | |
|
All Star Starter
Join Date: Aug 2008
Posts: 1,660
Thanks: 412
Thanked 251x in 153 posts
|
Quote:
|
|
|
|
|
|
|
#6 (permalink) |
|
Hall Of Famer
Join Date: Aug 2003
Posts: 6,566
Thanks: 21
Thanked 149x in 67 posts
|
When you're talking about accuracy, standard deviation while using multiple passes is a great tool to use. With 5 passes, the data isn't grand, but it's not without some value/merit. Add more passes and the confidence you have in the conclusions is better, of course. In this case, each sim has hundreds of players. I might suggest the study should only include players with certain numbers of at bats because a guy with 50 AB based on random chance will be far more likely to be off than a guy with 500 AB. I also wonder if the stats/categories based on rates would be more interesting (K/AB, BB/PA, HR/AB, etc.).
It doesn't surprise me to know that OOTP does not match "real life" on the basis of individual players particularly well, though. The results engine is not designed specifically to do that. This is one of the reasons that I continue to think that OOTP's proper niche is with fictional players and career growth in natural ways (unburdened by a user's expectation due to the name associated with that particular player). I don't know much about Puresim's algorithms, so I can't compare based on personal experience. Last edited by RonCo; 01-01-2010 at 10:53 PM. |
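The point above about 50 AB versus 500 AB can be made concrete with a small sketch. It assumes each at-bat is an independent trial (a binomial model, which is a simplification), and the 0.04 HR/AB rate is a made-up illustrative value. The `eligible` helper mirrors a minimum-AB cutoff; its name and default are assumptions.

```python
import math

def rate_sd(p, n):
    """SD of an observed rate when the true rate is p over n at-bats (binomial assumption)."""
    return math.sqrt(p * (1.0 - p) / n)

def eligible(real_ab, sim_ab, min_ab=100):
    """Keep a player only if he reached the AB cutoff both in real life and in the sim."""
    return real_ab >= min_ab and sim_ab >= min_ab

hr_rate = 0.04  # hypothetical true HR/AB rate
for ab in (50, 100, 500):
    sd = rate_sd(hr_rate, ab)
    print(f"{ab:>3} AB: 1 SD = {sd:.4f} HR/AB, or about {sd * ab:.1f} HR")
```

Under this model the SD of the rate shrinks with the square root of the at-bats, so a 50 AB player's rate is a bit more than three times as noisy as a 500 AB player's.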
|
|
|
| Thank you for this post: | robc (01-01-2010) |
|
|
#7 (permalink) | |
|
All Star Starter
Join Date: Aug 2008
Posts: 1,660
Thanks: 412
Thanked 251x in 153 posts
|
Quote:
I do ignore players with fewer than 100 AB, either in real life or in the game. I am using the stat categories that you suggest (i.e. per AB). Sorry if I didn't specify that. Yes, it wouldn't work to compare # of HR in 200 AB vs 400 AB without normalizing the AB with each other or using the rate. Again, thanks for your insight. My 'dream' baseball game is one that lets you play in career mode like OOTP or Puresim, but still have statistical results that are true to life.
|
|
|
|
|
|
#8 (permalink) |
|
Global Moderator
Join Date: Nov 2002
Location: Vancouver
Posts: 7,623
Thanks: 282
Thanked 332x in 190 posts
|
How consistent are players within their own careers though? And if they're not very consistent, do you really want that from a baseball sim? Even if you only count the 3-5 year period around a player's peak, I'm not sure most would be all that consistent. Maybe with some stats. If you do want an exact replay then maybe you'd be more interested in Diamond Mind, as many say it's much closer to reality.

What does concern me are a league's totals, and as long as we can get them as close to what we want, same as reality or not, then I'm happy. I'd still like to see the full spectrum of different kinds of players, but that's a subtlety I don't think you're really looking at. It does concern me when you say "X is much closer to reality" because it makes me wonder whether OOTP is spreading the stats among the right players, but then I ask myself, "well, what should the %s really be?" To be perfectly honest, I'm pretty solidly in the OOTP camp and I doubt I'll ever be moved from it, so I couldn't care less if PureSim is closer or not.

The walks % is a good one to look at though. You say Puresim is much closer to reality and that indeed looks to be the case when looking at the 1 SD %s: 57 to 41. More important than who is doing better, though, 41 seems disconcertingly low. Then you say the same thing for doubles: 74 to 56. That statement is true as well, but 56 and 57 are awfully close, so shouldn't you be as bothered by PureSim's 57 for walks as OOTP's 56 for doubles? Maybe you are, but it doesn't sound like it. I guess what I'm trying to say is: if what you want is as close to reality as possible, then shouldn't you be more concerned with the #s than who is closer?

One last thought. Not too long ago someone started a thread challenging others to create a monster player. I, and I'm sure many others, thought of it as a bit of a joke because you could pretty easily adjust game settings to do just that. For example, you could turn off injuries, create an environment where HRs happen on almost every hit, etc. I'm not at all saying you did this, but what settings did you use? Were they the defaults? Were the settings, default or not, what a 1986 league really should have? What about PureSim's settings? I'm not a historical player, but maybe someone who's an expert on that playstyle would have something to say about whatever settings you used.
Interesting numbers nonetheless!
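The question of how far each sim sits from the target, rather than which sim is closer, can be checked directly against the within-1-SD figures from the first post. This is only a sketch: the 68% target is the one the first post names, the percentages are copied from that post, and the helper name `gap_to_target` is made up for illustration.

```python
TARGET_1SD = 68.0  # target share within 1 SD, as stated in the first post

# Within-1-SD percentages copied from the first post.
WITHIN_1SD = {
    "Strikeouts": {"Puresim": 56.02, "OOTP": 60.42},
    "Walks": {"Puresim": 57.21, "OOTP": 41.10},
    "Hits": {"Puresim": 64.61, "OOTP": 58.58},
    "Doubles": {"Puresim": 74.38, "OOTP": 55.65},
    "Triples": {"Puresim": 68.60, "OOTP": 53.21},
    "Home Runs": {"Puresim": 66.72, "OOTP": 62.29},
}

def gap_to_target(share, target=TARGET_1SD):
    """Percentage points above (+) or below (-) the target."""
    return round(share - target, 2)

for stat, sims in WITHIN_1SD.items():
    gaps = {sim: gap_to_target(share) for sim, share in sims.items()}
    print(f"{stat:<10} Puresim {gaps['Puresim']:+6.2f}   OOTP {gaps['OOTP']:+6.2f}")
```

Read this way, Puresim's walks and OOTP's doubles both fall more than 10 points short of the target, which is the comparison being asked for above.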
Last edited by kq76; 01-01-2010 at 11:03 PM.
|
|
|
| Thank you for this post: | robc (01-01-2010) |
|
|
#9 (permalink) | ||||||
|
All Star Starter
Join Date: Aug 2008
Posts: 1,660
Thanks: 412
Thanked 251x in 153 posts
|
Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
Well, thank you for all of your contributions to this conversation. I do appreciate it. |
||||||
|
|
|
|
|
#11 (permalink) |
|
All Star Starter
Join Date: Aug 2008
Posts: 1,660
Thanks: 412
Thanked 251x in 153 posts
|
Yes, I turned off injuries. Funnily enough, I didn't turn off trades, but after looking through all the transactions, there were no trades! Some players were optioned down to the minors or called up, but no trades. Development was turned off too.
Last edited by robc; 01-02-2010 at 09:17 AM. |
|
|
|
|
|
#12 (permalink) | |
|
Hall Of Famer
Join Date: Aug 2003
Posts: 6,566
Thanks: 21
Thanked 149x in 67 posts
|
Quote:
By that, I mean, I could consider the stats OOTP creates to be quite true to life. There is a great deal of luck (or random chance) in real-life baseball. Sometimes players get hurt a little, or they happen to face a string of power pitchers, or they go on the DL and miss a bunch of away games and play mostly at home, or they're a pitcher and they happen to face poor teams more often than good teams. Sometimes a scorer mis-calls an error into a hit, sometimes they don't. Blah, blah, blah... Who is to say that if you could go back and do 1957 all over again that it would turn out particularly close to how it did in 1957?

OOTP (and all games for that matter) takes the actual results and uses them as the baseline for each player. It then adjusts the possible outcomes of each hitter-pitcher matchup for the league environment (using league totals). This is the right way to make this algorithm work, in my opinion. But there are some built-in factors of error in it if you're trying to get a fairly precise replay as a result. Most (or at least many) are stated in the form of "who is to know if the results that each player created that year actually represent his baseline skill?"

This gets to be more important as OOTP adjusts things like playing time and location of games and who the competition is. I'm willing to bet that if you take any OOTP player and compare him to his real-life counterpart, he will have played more (or fewer) games in his home park, or he will have faced worse pitchers (or better) more often, or he will have been hurt less often (or more often) .... [note the injuries an OOTP player receives will have an odd effect on the results. If that player was hurt in real life and it degraded their performance, and he does not get hurt in OOTP, it could more severely influence things like career average. If that player was hurt in real life for a while, but it did not influence his output when he played...and the OOTP player does not get hurt, it will likewise influence season and career totals in a "too positive" direction.]

I can go on. What if the AI makes him bunt 10 times too often, or 10 times too few? What if he hits with guys on base 10% more often than he should have, or too rarely? What if...?

So, realize that the stats we see in the real-life history book are not the player's actual capabilities. They are just one "run" of the league, influenced by a thousand variables. It is highly unlikely that the "real life" results are actually a mean/average that represents true talents and skills.

So you're not asking for statistical accuracy, not really. What you're really doing when you say you want results that are "true to life" is asking for the results to match (within statistical significance) one single run of a league. I do understand this desire, but in my mind I admit that it's kind of missing the bigger point. I mean, Babe Ruth hit 714 homers. But if you change how often he played at Yankee Stadium and how often he faced left-handed pitchers, and changed his injury patterns a little, he would have wound up with a different total. So if he hits 550 in a sim, is that not representative of real life? How would you know?

The only way for OOTP to be spot-on with historical numbers for individual players is to get in and man-handle the results algorithm. I think that would be a bad thing. But that's just my opinion.
Last edited by RonCo; 01-02-2010 at 09:09 AM.
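The idea that a real season is just one "run" of the league can be illustrated with a toy simulation. This is a sketch under heavy assumptions, not how OOTP or Puresim work: it treats every plate appearance as an independent coin flip with a fixed home run probability, and the 0.05 rate, 600 plate appearances, and function name are all made up for illustration.

```python
import random
import statistics

def simulate_hr_totals(true_rate, plate_appearances, seasons, seed=0):
    """Replay the same season many times for a hitter whose true HR rate never changes."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    totals = []
    for _ in range(seasons):
        hr = sum(1 for _ in range(plate_appearances) if rng.random() < true_rate)
        totals.append(hr)
    return totals

# Hypothetical hitter: 5% HR per plate appearance, 600 plate appearances, 1000 replays.
totals = simulate_hr_totals(true_rate=0.05, plate_appearances=600, seasons=1000)
print("mean HR per season:", statistics.mean(totals))
print("SD of HR per season:", round(statistics.pstdev(totals), 1))
print("lowest and highest season:", min(totals), "and", max(totals))
```

Even with identical talent in every replay, the season totals spread out around the expected 30, so any single season (including the real one) is only one draw from that spread.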
|
|
|
|
| Thank you for this post: | ramblin_man (03-18-2010) |
|
|
#13 (permalink) | |
|
All Star Starter
Join Date: Aug 2008
Posts: 1,660
Thanks: 412
Thanked 251x in 153 posts
|
Quote:
I agree with what you say and understand these different factors that should affect the outcome (where they play, etc). I tried to minimize this type of thing by using the real schedules, turning off injuries, etc. Now I know that in real life some people did get injured and it probably affected their performance, but most players probably didn't. Also I know the game's AI will affect the stats, but in this case I guess I am also factoring this into the 'realism' results. Perhaps the game gets the ratings 100% spot on, but the AI sucks so the results are very different from real life. Well, then I would consider the game to be a poor choice for somebody who wants a game that is close to reality. Maybe a game that does a worse job at creating ratings but has a better AI produces results closer to a player's real-life performance.

You are right. These types of tests are flawed. But some less-than-perfect information may be better than none if getting very believable results is important. The thing that started me on my test was when the 1986 Don Mattingly struck out over 100 times in my Puresim league. Shaun adjusted the engine and I have not seen any strikeout results since that are far from believable. That type of player wasn't handled very well, but now it is much better. In the end, I think it helps to look at this type of stuff. It can lead to more 'true to life' results, especially if the developer of the game takes its forum participants to heart.
|
|
|
|
|
|
|
#14 (permalink) | |
|
All Star Starter
Join Date: May 2003
Location: New Jersey
Posts: 1,471
Thanks: 125
Thanked 493x in 204 posts
|
Quote:
First, you should be calculating your rate stats by plate appearances, not by at bats. If you have two players who both come to the plate 700 times and they both hit 25 home runs, they hit home runs at the same rate. It doesn't matter if one was walked 50 times and the other was walked 100 times, unless your goal was to see how often they hit home runs in plate appearances in which they weren't walked. I think you're better off basing the denominator of your rate on the most basic unit of measure for a hitter: how many times they came to the plate. At bats, although it seems like a basic unit, is really a derived statistic (PA - BB - HBP - SF - SH) that is best applied to things like batting average, in which you don't want to count a walk as a demerit or a merit for the batter.

Second, if you find reason to resim, you might want to give each team active 50-man rosters and turn off the ability for the AI to send players to the minors. That way, you have a more level playing field insofar as you can at least guarantee that the OOTP and PureSim teams have identical rosters.

Also, if you are concerned with accuracy, why even use the rate stats? Rey Quinones had an awful sub-.500 OPS in 1986. But he had 131 at bats for the Mariners. A reasonable AI (and any manager who had the crystal ball to know he'd have a .484 OPS) would maybe give him a handful of at bats in times where he happened to get a shot at the plate after coming into a game as a defensive switch. Wouldn't an accurate replay put Rey Quinones into enough games so that he got 131 at bats? Likewise, Lee Guetterman pitched over 100 innings, to a 7.34 ERA. A reasonable AI would never pitch him. Ron Davis had an 8.59 ERA after being the Twins closer for four seasons. Again, a reasonable AI would never pitch him (although the MLB manager had every reason to pitch him: he had a track record of being a capable reliever coming into 1986).
By having an AI never pitch Guetterman or Davis, you essentially make the pitching, as a whole, tougher across the board. By never batting Quinones, you essentially make the batting, as a whole, better across the board. Even if they even out somewhat, they will at least make an altered set of batters face an altered set of pitchers. Let's say (and I don't remember) that Rey Quinones's major redeeming feature was that he was far superior defensively to the everyday infielders on the Mariners. If he and other defensive specialists who couldn't hit a lick suddenly get 5% of the PA they got in real life, doesn't that affect how the pitchers and hitters will perform? How do you account for those expected changes?
Last edited by BMW; 01-02-2010 at 10:56 AM.
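The plate appearance versus at bat argument from this post can be worked through with its own numbers: two hitters with 700 PA and 25 HR, one walked 100 times and one walked 50 times. The function names are made up for illustration, and HBP, SF, and SH are assumed to be zero to keep the arithmetic simple.

```python
def at_bats(pa, bb, hbp=0, sf=0, sh=0):
    """At bats as a derived statistic: AB = PA - BB - HBP - SF - SH."""
    return pa - bb - hbp - sf - sh

def hr_rates(hr, pa, bb, hbp=0, sf=0, sh=0):
    """Return (HR per PA, HR per AB) for one hitter."""
    ab = at_bats(pa, bb, hbp, sf, sh)
    return hr / pa, hr / ab

patient = hr_rates(hr=25, pa=700, bb=100)      # walked 100 times
free_swinger = hr_rates(hr=25, pa=700, bb=50)  # walked 50 times

print(f"patient hitter: {patient[0]:.4f} HR/PA, {patient[1]:.4f} HR/AB")
print(f"free swinger:   {free_swinger[0]:.4f} HR/PA, {free_swinger[1]:.4f} HR/AB")
```

Per plate appearance the two hitters are identical, while per at bat the hitter who walks more looks like the better home run hitter, which is the distortion described above.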
|
|
|
|
| Thank you for this post: | robc (01-02-2010) |
|
|
#15 (permalink) | |
|
Hall Of Famer
Join Date: Apr 2005
Location: Minnesota
Posts: 3,078
Thanks: 151
Thanked 137x in 105 posts
|
Quote:
There is A LOT about this post I really like. I think it pretty much sums up why I don't like replay sims that emphasize getting numbers to match so closely that it takes away common sense. You nailed it with the Davis, Quinones, and Guetterman examples. If you knew reasonably well how they were going to do, why would you play these guys in the first place? There had to be a reason these guys were in the league, which OOTP accounts for while replay sims tend to miss the boat on it.
Last edited by jbergey22; 01-02-2010 at 11:31 AM.
|
|
|
|
|
|
#16 (permalink) |
|
Hall Of Famer
Join Date: Mar 2002
Location: Canada
Posts: 5,598
Thanks: 56
Thanked 442x in 307 posts
|
Did you have player development turned on or off?
Sorry, but I really don't get the point of this post. Interesting investigation I suppose, but we already know that OOTP is not a replay sim and hasn't been for a long long time.
__________________
It takes neither courage nor intelligence to cheer for a team only when that team wins. The true test of a fan's mettle is the same as it is for a player: Were you there when you were needed? |
|
|
|
|
|
#17 (permalink) | |||
|
All Star Starter
Join Date: Aug 2008
Posts: 1,660
Thanks: 412
Thanked 251x in 153 posts
|
Quote:
Quote:
Quote:
But if the engine doesn't produce results reasonably close to what happened in real life, it may as well be a fictional league. Maybe that is what you prefer, and that is OK, but I like using the real players. If Don Mattingly strikes out 100 times, it is not Don Mattingly. You brought up many good points that show the difficulty in trying to evaluate these sims.
|||
|
|
|
|
|
#18 (permalink) |
|
All Star Starter
Join Date: Aug 2008
Posts: 1,660
Thanks: 412
Thanked 251x in 153 posts
|
The point of the post is that I was trying to determine whether OOTP or Puresim suited my needs better. Since I enjoy the game more when the historical results are believable I was trying to evaluate the two against each other.
|
|
|
|
|
|
#19 (permalink) | ||
|
Hall Of Famer
Join Date: Aug 2003
Posts: 6,566
Thanks: 21
Thanked 149x in 67 posts
|
Quote:
This is a very large issue when it comes to game/player design. All Markus can really do is use the player's actual line as a baseline. But what if that player was injured much of the season, but played through it? Or what if that player was never injured, so got to play at full strength all the time against opponents often hindered by injury or fatigue? Fatigue opens another can of worms. What if a guy played 154 games, but really could have used a day off every now and again to be at peak performance--which would mean his true capability was, say, 10% better than his performance that year because he played more games than he was really capable of. So the game's design needs to take into account lots of things to make this replay work. Quote:
Regarding the use of PA or AB, I can go either way, but to some degree I actually prefer comparing BB, HBP, and SACB against PA and the rest against AB. I have some semi-valid reasons (in my opinion) for why, but I can see counter-arguments, too. At the end of the day my guess is you'll get adequate precision for your purposes whichever denominator you base it on.
Last edited by RonCo; 01-02-2010 at 11:44 AM.
||
|
|
|
|
|
#20 (permalink) | |
|
Hall Of Famer
Join Date: Apr 2005
Location: Minnesota
Posts: 3,078
Thanks: 151
Thanked 137x in 105 posts
|
Quote:
I played Action PC and just couldn't get into it because it didn't even feel like a baseball game I would ever watch in real life; it all seemed set up just to get to the bottom line. Although Action PC seemed head and shoulders ahead of Baseball Mogul in that respect, as BM is simply the worst baseball game I have ever played.
Last edited by jbergey22; 01-02-2010 at 11:51 AM.
|
|
|
|
|
|