Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a maximum bytes limit on a row #88

Merged
merged 2 commits into from
Oct 10, 2018
Merged

Added a maximum bytes limit on a row #88

merged 2 commits into from
Oct 10, 2018

Conversation

SimonSimCity
Copy link
Contributor

Resolves #86

@SimonSimCity
Copy link
Contributor Author

SimonSimCity commented Aug 17, 2018

Any update on this? 😊

@shellscape
Copy link
Collaborator

@SimonSimCity howdy. I'm stepping in to help maintain this module. There's a similar PR in #82 that I have to look at as well. We'll get one of the two implementations in there.

@SimonSimCity
Copy link
Contributor Author

@shellscape any update? 😊

@shellscape
Copy link
Collaborator

@SimonSimCity looks like this PR and the other need a rebase. I can circle back to this PR when it's updated and rebased.

@SimonSimCity
Copy link
Contributor Author

Thanks for the feedback. I've rebased the PR. Gladly the code I changed was some part you mostly took from the old version, so it wasn't hard to adapt 😅

Copy link
Collaborator

@shellscape shellscape left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor change requests. I took a look at the other PR, and I'm good to go ahead with yours here.

README.md Outdated
@@ -217,6 +217,13 @@ Default: `0`
Specifies the number of lines at the beginning of a data file that the parser should
skip over, prior to parsing headers.

##### maxRowSize
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'size' could be ambiguous here. let's use maxRowBytes

README.md Outdated
##### maxRowSize

Type: `Number`<br>
Default: `0`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to go a different route here. (see below) Please update this to "Default: Number.MAX_SAFE_INTEGER"

index.js Outdated
@@ -16,6 +16,7 @@ const defaults = {
raw: false,
separator: ',',
skipLines: null,
maxRowSize: 0,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to go with Number.MAX_SAFE_INTEGER here instead of zero. I like to keep default options set to real-world values (doesn't always work out, but hey I can try). 0 doesn't reflect what the module does if this isn't set. Using the max value is more in line with that, and doesn't cost us anything.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which would fix the maximum bytes on 8 peta byte - which should be enough for the foreseeable future. I'd like to see the document having such a big row ...

index.js Outdated
@@ -214,6 +216,11 @@ class CsvParser extends Transform {
const chr = buf[i]
const nextChr = i + 1 < bufLen ? buf[i + 1] : null

this._currentRowSize++
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (this.maxRowBytes !== Number.MAX_SAFE_INTEGER) {
  this._currentRowSize++
  if (this._currentRowSize > this.maxRowSize) {
    return cb(new Error('Row exceeds the maximum size'))
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd then remove the outer condition here. Otherwise we're creating this magic number on which it suddenly would not work 😅

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow you. I'd prefer if no operation around this feature were performed if maxRowBytes is set to the default, do-nothing value.

Copy link
Contributor Author

@SimonSimCity SimonSimCity Oct 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want that, I'd take the default back to 0 to make this equal to infinity. This suggestion here introduces Number.MAX_SAFE_INTEGER as a magic number - something special happens if this number is set. It's then e.g. not possible to actually set this option to Number.MAX_SAFE_INTEGER anymore, but someone wanting it would have to set it to Number.MAX_SAFE_INTEGER-1 to get a comparable result. This you'd have to know, otherwise you could run into a problem.

Ok, 8 peta byte is quite unlikely to reach, but just the fact of introducing a magic number here makes me feel bad about it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different strokes for different folks. Typically you're beholden to the maintainers preferences on PRs 😉 I'll merge this and rectify it afterwards, so you can maintain your position on that in your commit, and the blame will show I modified it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😂 I know ... I just wanted to hear how you'd justify this choice you're taking there. Anyways - thanks for merging it in 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants