06-05-2006, 12:20 AM   #7
DrArbiter
Minors (Double A)
 
Join Date: Feb 2006
Location: Observing
Posts: 157
Quote:
Originally Posted by Le Grande Orange
I know a couple of others tried to do it that way, but the type in the Guides was simply too small to OCR properly. Perhaps you'll have better luck, because if the stats can be scanned in via OCR then things should go reasonably quickly.
I'm convinced OCR simply won't work any time in the foreseeable future, at least not at a reasonable quality level. Getting 90% accuracy from OCR is amazing under the best of circumstances, but even if you achieve that, you still have to proof the whole darn thing to get results good enough for statistical purposes. You're better off just keying the thing in to start with.
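
To put rough numbers on that, here is a minimal sketch, assuming the 90% figure is a per-character accuracy and that errors on different characters are independent (both are assumptions for illustration, not measurements). The function name `field_accuracy` is made up for this example. It shows how the chance of reading a whole stat correctly drops as the stat gets more digits.

```python
def field_accuracy(char_accuracy: float, n_chars: int) -> float:
    """Chance that every character of an n-character field is read correctly.

    Assumes each character is right with probability char_accuracy and that
    errors are independent (an illustrative assumption, not a measured fact).
    """
    return char_accuracy ** n_chars


if __name__ == "__main__":
    # Stat fields of one to four digits, e.g. "7", "42", "128", "1982".
    for n_digits in (1, 2, 3, 4):
        chance = field_accuracy(0.90, n_digits)
        print(f"{n_digits}-digit stat read correctly: {chance:.1%}")
```

Under those assumptions a three-digit stat comes through intact only about 73% of the time, so every line would still have to be checked by hand.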

But I don't think it's that hard, really. For a separate project, I developed a complete database of all minor leagues from 1982. I find I can input a league-season completely -- batting, pitching, and fielding -- in about 2-3 hours, with fairly good accuracy.

Yeah, there are tons of leagues to be done. But I suspect many of the concerns of the historically-oriented players would be assuaged simply by having as much AAA data as possible, as far back as possible. That's not a horribly tall order.

Of course, even then, there are still issues to be resolved -- identifying which season lines go with which players, as well as birthdates and origins. But it's a start.

I've been hoping to get around to posting my data (which so far also includes some 1981 seasons, as well as some of the open-class years of the PCL). It hasn't happened yet, but when it does, I'll of course post here. Hopefully, at least, it will help avoid duplication of effort.