Fix for issue #47: if a stalled job restarts, the already exported data is overwritten #64
… CSV writer overwrites the file, resulting in an export that contains only the records from the last (re)start and no header row. Changing the mode from write to append means every record is appended on a new line. Because of that, the setNewLine value has to be reset; otherwise there is an empty line between each record in the file.
OK, thanks for this. Some of these changes make me a little worried, and we can see they are now causing the tests to fail, I think because of the line-ending changes. It is strange that append mode produces blank lines between values; that looks like a bug in the CSV writer, but it's hard to be sure. I definitely see that exports which take several rounds of output need a way to append to existing files, so this looks like an important fix. If we use append mode, could a new job append to an old job's file, or are the output names always unique?
I can only speak from my own experience, but across 30+ large exports that took multiple rounds to complete, the output names appear to be unique.
The file name is a hash based on a random token generated when the job is constructed, so it should be unique for each new job instance.
silverstripe-gridfieldqueuedexport/src/Jobs/GenerateCSVJob.php Lines 113 to 116 in 9812e9d
Though I thought (from memory, so don't take this as definitely how it works) a new …
I can see in my output that the same file in the same directory is used between rounds. My exports have 10,000+ records. Before this fix, the job stalled (and resumed) after processing 6,000 records, leaving a single CSV file in that folder containing only the final 4,000 records. After this fix, each export still gets a unique folder and a unique file, but the file now has the header row and all 10,000+ records. There is always exactly one CSV file in the new folder after the job finishes.
Issue #47
If an export job stalls and restarts, the CSV writer overwrites the file, resulting in an export that contains only the records from the last (re)start and no header row.
Changing the mode from write to append means every record (even after a restart of the job) is appended on a new line. Because of that, the setNewLine value has to be reset to empty; otherwise there is an extra empty line between each record in the file.
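To make the difference between the two file modes concrete, here is a minimal sketch in Python. It is not the module's PHP code: the names `export_round`, `simulate`, and `HEADER` are hypothetical, and the rule "write the header only when the file is new or empty" is an assumption about how a multi-round job behaves. It only illustrates why reopening the same file in write mode on each round loses earlier records and the header, while append mode keeps them.

```python
import csv
import os
import tempfile

HEADER = ["ID", "Title"]  # hypothetical column names


def export_round(path, rows, mode):
    """Write one round of rows to path using the given file mode ("w" or "a").

    Assumption: the header is only written when the file is new or empty,
    i.e. during the first round of a job.
    """
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # newline="" lets the csv module write its own line endings, so the
    # platform does not add a second one on top.
    with open(path, mode, newline="") as handle:
        writer = csv.writer(handle)
        if needs_header:
            writer.writerow(HEADER)
        writer.writerows(rows)


def read_rows(path):
    """Read the whole CSV file back as a list of rows."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle))


def simulate(mode):
    """Run two rounds against the same file, as a stalled and restarted job would."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "export.csv")
        export_round(path, [["1", "first"], ["2", "second"]], mode)  # first round
        export_round(path, [["3", "third"]], mode)  # round after the restart
        return read_rows(path)


if __name__ == "__main__":
    print("write mode: ", simulate("w"))
    print("append mode:", simulate("a"))
```

In write mode the second round truncates the file, so only the row from the last round survives and the header is gone. In append mode the file ends up with the header followed by all three rows, which matches the behaviour described for this fix.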