It does not really matter, as I found out this week. Even with it loaded into memory, it takes a while to go through the array with that many records.
I have been working on a new way, but it has not agreed with me. I have done it in the past, but I think a bug in Orcas is causing a problem.
Here is the plan...
Take games.csv as an example. I open the file and 'Seek' to get my current position in the file. Since it is the 1st record, that will be position 1. I read in the string and then Seek again, which returns the location of the start of the 2nd string. I write these positions to an IDX file, which is my index file. Now I can just load the index file and reference it. If I need the 1500th record, I look up its position in the index, open the file, Seek to that position, and load the string at that line. Wham bam, real quick.
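A minimal sketch of the plan above, written in Python purely for illustration (it is not the code from this project). The function names `build_index` and `read_record` and the 8-byte little-endian index format are assumptions made for the example. Note that Python offsets start at 0, whereas the description above counts the first position as 1.

```python
import struct

OFFSET_SIZE = 8  # each index entry is one unsigned 64-bit offset (assumed format)

def build_index(csv_path, idx_path):
    """Write the byte offset of every line start to idx_path; return the line count."""
    count = 0
    with open(csv_path, "rb") as src, open(idx_path, "wb") as idx:
        while True:
            pos = src.tell()        # offset of the line about to be read
            line = src.readline()
            if not line:            # empty bytes means end of file
                break
            idx.write(struct.pack("<Q", pos))
            count += 1
    return count

def read_record(csv_path, idx_path, n):
    """Return line n (0-based) by seeking straight to its stored offset."""
    if n < 0:
        raise IndexError(n)
    with open(idx_path, "rb") as idx:
        idx.seek(n * OFFSET_SIZE)   # entry n sits at a fixed place in the index
        raw = idx.read(OFFSET_SIZE)
        if len(raw) < OFFSET_SIZE:
            raise IndexError(n)
        (pos,) = struct.unpack("<Q", raw)
    with open(csv_path, "rb") as src:
        src.seek(pos)
        return src.readline().rstrip(b"\r\n").decode("utf-8")
```

Both files are opened in binary mode so that the positions are plain byte counts and line endings are not translated, which keeps the stored offsets valid for the later seek.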
I did this process for another project a few years ago, but I can't get it to work correctly.
It has been driving me crazy all week.
I even tried to read in each line in a stream, using readline and rewriting the line after I pad it. For some reason, it keeps adding a quote to the start of the line, which throws off the whole index.
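For comparison, the padding idea can be sketched in Python as below. This is only an illustration under assumptions: the names `pad_file` and `read_padded` and the `width` parameter are made up for the example, and it does not diagnose where the stray quote comes from. It works on raw bytes, so the only thing added to each line is the space padding and a single newline.

```python
def pad_file(src_path, dst_path, width):
    """Rewrite each line padded with spaces to exactly `width` bytes plus a newline."""
    count = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for raw in src:
            data = raw.rstrip(b"\r\n")
            if len(data) > width:
                raise ValueError(f"line {count} is longer than {width} bytes")
            dst.write(data.ljust(width) + b"\n")  # ljust pads with spaces
            count += 1
    return count

def read_padded(path, n, width):
    """Return record n (0-based); every record occupies width + 1 bytes."""
    if n < 0:
        raise IndexError(n)
    with open(path, "rb") as f:
        f.seek(n * (width + 1))
        data = f.read(width)
        if len(data) < width:
            raise IndexError(n)
        # Stripping the padding also removes any genuine trailing spaces.
        return data.rstrip(b" ").decode("utf-8")
```

With fixed-length records the position of record n is simple arithmetic, so no separate index file is needed. The cost is a larger data file, since every line grows to the length of the longest one.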
If I can get this to work, then I could 'index' the larger files and be able to access the data quicker. The downside is that the index has to be rebuilt every time you do a new CSV dump, but indexing an 8 MB file takes almost no time, so it is not that bad.
So I am going to start a new program that will be an index-making program.