Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the Shredder ignore empty lines #1088

Merged
merged 2 commits into from
Jun 25, 2020
Merged

Make the Shredder ignore empty lines #1088

merged 2 commits into from
Jun 25, 2020

Conversation

kcgthb
Copy link
Contributor

@kcgthb kcgthb commented Oct 2, 2019

When the job accounting file contains empty lines, the shredder fails.

xdmod-slurm-helper fails when a job name contains a carriage-return character. The generated sacct output contains a blank line that makes xdmod-slurm-helper fail.

Here's an example:

$ sacct -j 51180322 -p -o jobid,jobname%32
JobID|JobName|
51180322|f20__1__e3__d55__t6__p1__rs0.5__rr165__c0.25
|
51180322.batch|batch|
51180322.extern|extern|

Note how the job name sends the last pipe delimiter to the next line. Because of this, the temporary file that sacct output is sent to contains a blank line. And xdmod-slurm-helper fails with the following error:

2019-10-02 11:58:14 [debug] Shredding line ''
2019-10-02 11:58:14 [info] Rolling back database transaction
2019-10-02 11:58:18 [critical] Failed to shred line 165993 of file
/tmp/sacct-log-sherlock-8syyVH "": Malformed Slurm sacct line: ''
(stacktrace: #0 /usr/bin/xdmod-slurm-helper(177):
OpenXdmod\Shredder->shredFile('/tmp/sacct-log-...')
#1 /usr/bin/xdmod-slurm-helper(21): main()
#2 {main})
2019-10-02 11:58:18 [critical] Malformed Slurm sacct line: ''
(stacktrace: #0 /usr/share/xdmod/classes/OpenXdmod/Shredder.php(405):
OpenXdmod\Shredder\Slurm->shredLine('')
#1 /usr/bin/xdmod-slurm-helper(177):
OpenXdmod\Shredder->shredFile('/tmp/sacct-log-...')
#2 /usr/bin/xdmod-slurm-helper(21): main()
#3 {main})

Description

Check that the line is not empty before processing it. Skip empty lines

Motivation and Context

https://help.xdmod.org/support/tickets/22127

Tests performed

Run xdmod-slurm-helper on job records with job names containing carriage returns. The job records are correctly processed.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project as found in the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@plessbd plessbd requested a review from jtpalmer October 2, 2019 23:17
@plessbd plessbd changed the base branch from xdmod8.5 to xdmod8.6 October 2, 2019 23:52
@plessbd
Copy link
Contributor

plessbd commented Oct 2, 2019

In the mean time you might be able to skip using xdmod-slurm-helper and use the documentation located at https://open.xdmod.org/8.1/resource-manager-slurm.html to export to a file and use xdmod-shredder.

@plessbd
Copy link
Contributor

plessbd commented Oct 3, 2019

I said might ;) I am not the main developer of the shredders, but I thought we had some code in there that would at least catch this, but I could be wrong...

I also might have been thinking of an incident we had where I had to fix old files that had this issue... https://github.com/ubccr/xdmod/blob/xdmod8.5/tools/dev/format-slurm/

} else {
throw $e;
// Skip empty lines
if ($line !== '') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the line is skipped then the record count should not be updated.

Comment on lines 404 to 424
try {
$this->shredLine($line);
} catch (PDOException $e) {

// Ignore duplicate key errors.
if ($e->getCode() == 23000) {
$msg = 'Skipping duplicate data: ' . $e->getMessage();
$this->logger->debug(array(
'message' => $msg,
'file' => $file,
'line_number' => $lineNumber,
'line' => $line,
));
$duplicateCount++;
continue;
} else {
throw $e;
// Skip empty lines
if ($line !== '') {

try {
$this->shredLine($line);
} catch (PDOException $e) {

// Ignore duplicate key errors.
if ($e->getCode() == 23000) {
$msg = 'Skipping duplicate data: ' . $e->getMessage();
$this->logger->debug(array(
'message' => $msg,
'file' => $file,
'line_number' => $lineNumber,
'line' => $line,
));
$duplicateCount++;
continue;
} else {
throw $e;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also like to see some debug logging in this change. You can also simplify the change and address @jpwhite4 's comment with something like this:

- $recordCount++;

+ if ($line === '') {
+     $this->logger->debug([
+         'message'     => 'Skipping blank line',
+         'file'        => $file,
+         'line_number' => $lineNumber
+     ));
+     continue;
+ }
+ 
+ $recordCount++;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, good suggestions! Let me try this.

@jtpalmer jtpalmer added bug Bugfixes Category:ETL Extract Transform Load labels Oct 3, 2019
@kcgthb
Copy link
Contributor Author

kcgthb commented Oct 3, 2019

Implemented @jtpalmer's and @jpwhite4's suggestions.

@plessbd
Copy link
Contributor

plessbd commented Oct 3, 2019

switched back to the 8.5 branch so tests can hopefully pass since we dont have the new branch completely upgraded

@kcgthb
Copy link
Contributor Author

kcgthb commented Oct 4, 2019

Well, thinking about it a little more, this will only solve the case where the job name contains \n but nothing after.
If a user submits a job like this:

$ srun -J "$(echo test$'\n'break)" sleep 1

the job name will be split over 2 lines in the sacct output, and the shredder will try to parse a line that contains just break and fail.

So I guess special characters should be escaped earlier, directly at the sacct level.

@jtpalmer jtpalmer changed the base branch from xdmod8.6 to xdmod9.0 October 24, 2019 12:54
@jpwhite4 jpwhite4 added this to the 9.0.0 milestone Jun 25, 2020
@jpwhite4 jpwhite4 merged commit eaa64e1 into ubccr:xdmod9.0 Jun 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bugfixes Category:ETL Extract Transform Load
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants