#1
Minors (Single A)
Join Date: Jun 2003
Location: York, PA
Posts: 56
NCAA Statistical Database Community Project
I've noticed that every now and then there's a request for an NCAA feeder league, and the usual response is "too hard to get the data."
Well, let's fix that, or at least start fixing it. Poking around for data, I noticed that not only are box scores available for the 2008 season, but every game I looked at had play-by-play available, too (I hope they all do, but I'm not that optimistic). Still, THAT opens up a great possibility. It'll take some time, but with a group of dedicated people, we can convert those play-by-play files to Retrosheet-style event files.
I work for MLBAM, entering the data for Gameday, so I'm well versed in this system. I can tell you that while it's complex (go here to learn how to do them), when complete it gives access to TONS AND TONS of data that you won't get from a box score. Like... slugging percentage with two outs in the seventh and a runner on second... on Wednesdays. Seriously, if you want that, you can find it; you just need to know how to read the data.
This project would be a tremendous boon to future NCAA league makers, and it would also have tremendous value to the sabermetric community. I figure what I'll need is one person to work on each conference. You'll have to know how to write the codes, or at least think you can learn (I'm of the opinion that anyone can learn how to do this; most of the codes are very simple). After the database is complete, it'll be much simpler to keep it up to date next season, and going forward we'll have a tremendous resource to enhance the game!
Who's on board?
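To give a feel for the kind of split an event file makes possible, here is a minimal Python sketch that computes slugging percentage for one inning from Retrosheet-style `play` records (fields: inning, home/visitor flag, batter ID, count, pitches, event). The function names, the player IDs, and the sample plays are made up for illustration, and the event classifier is deliberately simplified: it looks only at the leading event code and the SF/SH modifiers. Filtering on outs or runners on base would also require tracking game state from play to play, which this sketch does not attempt.

```python
# Simplified reading of Retrosheet-style event codes (illustration only).
# Events with these leading codes do not count as an at-bat for the batter.
NON_AT_BAT_PREFIXES = (
    "NP", "HP", "IW", "WP", "W", "I", "SB", "CS",
    "BK", "PB", "PO", "OA", "DI", "FLE", "C",
)
# Total bases by the first letter of a hit code: S, D (incl. DGR), T, H/HR.
HIT_BASES = {"S": 1, "D": 2, "T": 3, "H": 4}


def classify(event):
    """Return (is_at_bat, total_bases) for one event string."""
    description = event.split(".")[0]       # drop runner advances
    parts = description.split("/")
    basic, modifiers = parts[0], parts[1:]
    if not basic or basic.startswith(NON_AT_BAT_PREFIXES):
        return (False, 0)
    # Sacrifice flies and sacrifice hits are not at-bats.
    if any(m.startswith(("SF", "SH")) for m in modifiers):
        return (False, 0)
    return (True, HIT_BASES.get(basic[0], 0))


def slugging_in_inning(lines, inning):
    """Total bases divided by at-bats over all play records in one inning."""
    at_bats = 0
    total_bases = 0
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) < 7 or fields[0] != "play":
            continue
        if int(fields[1]) != inning:
            continue
        is_at_bat, bases = classify(fields[6])
        if is_at_bat:
            at_bats += 1
            total_bases += bases
    return total_bases / at_bats if at_bats else 0.0


# Hypothetical sample plays; the player IDs are invented.
sample = [
    "play,6,1,green006,00,,T9",
    "play,7,0,smith001,00,,S8",
    "play,7,0,jones002,00,,HR/F7",
    "play,7,0,brown003,00,,W",
    "play,7,0,davis004,00,,K",
    "play,7,1,clark005,00,,8/SF.3-H",
]
print(slugging_in_inning(sample, 7))
```

In the sample, the seventh inning has three at-bats (single, home run, strikeout) and five total bases; the walk and the sacrifice fly are excluded.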
#4
Minors (Single A)
Join Date: Jun 2003
Location: York, PA
Posts: 56
A quick glance indicates that Boyd has the past few years of raw stats, but not this year's, and we could get a lot of value out of this project.
Here's how I plan to organize the work (everyone at each level will also be involved in all the levels below):
- I will coordinate all the data and assign people to work in conferences.
- Others will be in charge of individual conferences, assigning people to work on teams.
- Team coordinators will assign anyone sent to them to work on individual games.
- Non-coordinators (and of course all the coordinators) will code individual games.
With all the levels of coordination, nobody will be working on the same game as someone else, there will be multiple levels to check the work and make sure it's right, and people will be assigned to the places where they're needed most.
For each team we'll only do their home games (otherwise we'd do every game twice). After entering the data, we'll create a box score from it and check it against the official box score, seeing if there are any discrepancies (and if so, why), then send it up the chain (where the box score will be checked again).
If you're a conference coordinator, you will assign yourself to work on a team, and if someone is sent to you, you will assign them to work on *another* team -- so you don't cross paths. I am going to assume we won't get enough people to work on all the conferences, but that's how we'll proceed if we do. We'll focus on DI first, then go to DII and DIII afterwards.
Conference coordinators will also be responsible for downloading and saving all the play-by-play records and box scores *offline*, so that if the web links go down, they'll still be available.
There will be lots of work to do. I'll try to put together a spreadsheet that will make the work simpler, but it will still involve going through tons of games and entering lots of codes. I don't want to scare people off, but if you volunteer to help, make sure you're willing to commit to it.
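The home-games-only rule and the box score check described above could be sketched roughly like this in Python. Everything here is an assumption for illustration: the function names, the `(date, visitor, home)` schedule tuples, the team names, and the stat keys are all invented, and a real check would cover many more stats per player.

```python
from collections import defaultdict


def games_by_home_team(schedule):
    """Group (date, visitor, home) games under the home team,
    so that every game is coded exactly once."""
    assignments = defaultdict(list)
    for date, visitor, home in schedule:
        assignments[home].append((date, visitor))
    return dict(assignments)


def box_score_discrepancies(derived, official):
    """Return {stat: (derived_value, official_value)} for each stat
    where the coded game and the official box score disagree."""
    stats = sorted(set(derived) | set(official))
    return {
        stat: (derived.get(stat), official.get(stat))
        for stat in stats
        if derived.get(stat) != official.get(stat)
    }


# Hypothetical schedule: each game appears once, filed under its home team.
schedule = [
    ("2008-03-01", "Team B", "Team A"),
    ("2008-03-02", "Team A", "Team B"),
    ("2008-03-08", "Team C", "Team A"),
]
print(games_by_home_team(schedule))

# Hypothetical totals: one built from the coded events, one official.
derived = {"AB": 34, "R": 5, "H": 9, "RBI": 5}
official = {"AB": 34, "R": 5, "H": 10, "RBI": 5}
print(box_score_discrepancies(derived, official))
```

An empty result from `box_score_discrepancies` means the coded game matches the official box score on every stat compared; anything else is what gets looked into before the game goes up the chain.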
#5
Bat Boy
Join Date: Jun 2008
Location: Melville, NY
Posts: 4
Wow, the thought of doing something like this actually occurred to me in the shower last night. Funny that I wake up and find this thread on here before work.
I'm extremely interested in working on this project. I'm going to dive into that link you sent to learn more about this. E-mail: Velocity31@gmail.com