08-07-2007, 11:26 PM   #142
Comedian2004
I was writing a new program and solved a problem that has hindered some of my programs.

OOTP dumps tons of CSV files that give me lots of valuable information. However, working with CSV files comes with some problems.

The lines are not fixed length, so you can't jump straight to the record you want. You have to read in each line, one at a time, and examine it to see if it applies to what you need.

It is not a problem for small files, such as teams.csv or leagues.csv, as those files can be read almost instantly. However, bigger files, such as players.csv or the even larger stat files, take a long time to load.

An example is the program I am currently working on. It tracks players' home runs over their careers and displays them as an HTML file.

When the program reads in a CSV file line by line, it looks for matches for that particular player. On some files, it can stop as soon as it finds the match, but other files, like the stats files, may contain more than one match, so it usually has to read in the whole file. This slows the program down quite a bit in long routines, as it reads in much, much more data than it really needs.
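For anyone who wants to see the line-by-line scan spelled out, here is a rough sketch in Python (not the BASIC used in my utilities). It assumes, purely for illustration, that the player ID is the first column; the real OOTP column layout may differ. The `stop_at_first` flag covers the files with only one row per player.

```python
import csv
import io

def find_rows(lines, player_id, stop_at_first=False):
    """Scan CSV lines one at a time and collect the rows for one player.

    Assumes (hypothetically) that the player ID is the first column.
    """
    matches = []
    for row in csv.reader(lines):
        if not row or row[0] != player_id:
            continue  # blank line or some other player
        matches.append(row)
        if stop_at_first:
            break  # one row per player in this file, so stop reading
    return matches

# Made-up sample data standing in for a stats file
sample = io.StringIO("12,2005,31\n40,2005,8\n12,2006,27\n")
rows = find_rows(sample, "12")
```

Without `stop_at_first`, every call walks the whole file, which is exactly the cost described above.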

The plan this time around was to read in the file once, upon loading the league, keep it in memory in an array, and then access the array instead of loading the file each time it is needed. I already do this for the players' names and some other data, but that only loads part of the data: from players.csv, it just loads the players' names into one array, the players' positions into another array, etc. The problem is that there is no way to tell how many items are in the file without reading the file twice, which is not a solution.

An array is like this:

dim Players() as string

Now, I can redimension that to add players: I redim it after each read, sizing it to the player's ID. But I can't do that with an array like this:

dim Players(NumPlayers, StatFields)

because I have to know up front how many players there are and how many fields.

Anyhow, in this example, I load in the entire players_at_bat_batting_stats.csv file and then when I want to access it again, I just access the array.
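As a rough illustration of the load-once idea, here is a minimal sketch in Python rather than the BASIC above. It is not my actual routine; it assumes the player ID is the first column, and it leans on Python lists growing on their own, so the row count does not have to be known before reading.

```python
import csv
import io
from collections import defaultdict

def load_table(lines):
    """Read every row once and index the rows by player ID.

    Assumes (hypothetically) that the player ID is the first column.
    The list and dict grow as rows arrive, so no row count is needed.
    """
    table = []                     # every row, in file order
    by_player = defaultdict(list)  # player ID -> positions in table
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        by_player[row[0]].append(len(table))
        table.append(row)
    return table, by_player

# Made-up sample data standing in for the stats file
sample = io.StringIO("12,2005,31\n40,2005,8\n12,2006,27\n")
table, by_player = load_table(sample)

# Later lookups touch only memory, never the file
player_rows = [table[i] for i in by_player["12"]]
```

The file is read a single time; after that, finding one player's rows is a dictionary lookup instead of another pass over the file.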

This will speed up tons of my utils and open some doors that were closed (and had a barricade) before. We are talking about a routine that took 15 minutes before and will now take 15 seconds.

Just a quick note, AUHOMERUNTRACK will work this way:

You run it anytime before you proceed to the next season. It can be run during the season, as long as the day is complete (no unplayed games).

It will go through the above CSV file and read in all of the home runs. It will copy them over to a txt file for each player, logging all of his home runs. It will keep the pitcher, team, inning, outs, men on base, whether it was a PH, whether it was a close game, balls, strikes, etc.
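The logging step could look something like the sketch below, written in Python rather than BASIC. The column names (`player_id`, `pitcher_id`, `inning`, `outs`, `result`) and the `player_<id>.txt` file naming are invented for the example; they are not the real headers of the OOTP file or the names the utility will use.

```python
import csv
import io
import os
import tempfile

def log_home_runs(lines, out_dir):
    """Append each home-run row to a text file named after the player.

    Column names here are hypothetical placeholders.
    Returns a dict of player ID -> number of home runs logged.
    """
    written = {}
    for row in csv.DictReader(lines):
        if row["result"] != "HR":
            continue  # only home runs are logged
        path = os.path.join(out_dir, f"player_{row['player_id']}.txt")
        with open(path, "a", encoding="utf-8") as log:
            log.write(f"{row['pitcher_id']},{row['inning']},{row['outs']}\n")
        written[row["player_id"]] = written.get(row["player_id"], 0) + 1
    return written

# Made-up sample data with a header row
sample = io.StringIO(
    "player_id,pitcher_id,inning,outs,result\n"
    "12,77,3,1,HR\n"
    "40,77,3,2,K\n"
    "12,81,8,0,HR\n"
)
out_dir = tempfile.mkdtemp()
counts = log_home_runs(sample, out_dir)
```

Opening in append mode means a later run adds to the same player's log instead of overwriting it.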

Then, when you tell it to create the HTML, it will create an HTML page for each player. (You will be able to toggle whether retired players are included.) The HTML pages will be SION look-alikes, with the headers and stuff like all of your other HTML pages. It will also modify all of the players' HTML pages to show a link to this HTML page.
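To give a feel for the HTML step, here is a bare-bones sketch in Python (again, not the BASIC the utility is written in). It does not reproduce the SION look or the real headers; the function name and the `(date, pitcher, inning)` tuple shape are assumptions made up for the example.

```python
import html

def render_player_page(name, home_runs):
    """Build a plain HTML page listing one player's home runs.

    home_runs is a list of (date, pitcher, inning) tuples; that shape
    is an assumption for this sketch.
    """
    rows = "\n".join(
        "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            html.escape(str(date)),
            html.escape(str(pitcher)),
            html.escape(str(inning)),
        )
        for date, pitcher, inning in home_runs
    )
    return (
        "<html><head><title>{0} - Home Runs</title></head><body>\n"
        "<h1>{0}</h1>\n"
        "<table>\n<tr><th>Date</th><th>Pitcher</th><th>Inning</th></tr>\n"
        "{1}\n</table>\n</body></html>\n"
    ).format(html.escape(name), rows)

# Made-up player and home run
page = render_player_page("Joe Example", [("2007-04-03", "A. Pitcher", 3)])
```

Escaping the text with `html.escape` keeps names containing `&` or `<` from breaking the page.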

Each player's page will include stats, very much like the Baseball Reference pages do.

Sorry for the long post, but this is my way of logging what I am working on.

I was really excited to be working on the HOF program, but the interest went from very little to NONE. So, I put it back on the back-burner, since it is such a complex program.