fread nrows=0L fixed to work like nrows=0 #4694

Merged on Apr 16, 2021 (5 commits)

ben-schwen
Member

@ben-schwen ben-schwen commented Sep 5, 2020

Closes #4686. fread now also accepts nrows=0L and treats it like nrows=0. Previously, nrows=0L was treated like nrows=Inf, so the whole file was read.

@MichaelChirico
Member

thanks @ben-schwen

quick GH tip, if you include Closes #x in your PR description, the issue will automatically close when the PR is merged. Edited to add this

src/freadR.c Outdated
@@ -126,7 +126,7 @@ SEXP freadR(
   if (isReal(nrowLimitArg)) {
     if (R_FINITE(REAL(nrowLimitArg)[0]) && REAL(nrowLimitArg)[0]>=0.0) args.nrowLimit = (int64_t)(REAL(nrowLimitArg)[0]);
   } else {
-    if (INTEGER(nrowLimitArg)[0]>=1) args.nrowLimit = (int64_t)INTEGER(nrowLimitArg)[0];
+    if (INTEGER(nrowLimitArg)[0]>=0) args.nrowLimit = (int64_t)INTEGER(nrowLimitArg)[0];
Member

I'm glad it works but I'm not sure why nrows=0 would work differently than nrows=0L...

Member Author

Actually, that's quite interesting. Apparently 0L is an integer, not a real (double), so isReal() returns false and the else branch is the one that runs.

See also

library(inline)
is_real <- cfunction(c("x" = "ANY"), "
  return ScalarLogical(isReal(x));
")

is_real(0)   # returns TRUE: 0 is a double
is_real(0L)  # returns FALSE: 0L is an integer
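
The effect of the two branches can also be modelled outside R. The following standalone C sketch is not data.table's code: `Arg`, `limit_old`, `limit_new`, and `NO_LIMIT` are hypothetical names, the struct merely stands in for R's type check (`isReal`), and using `INT64_MAX` as the "read everything" default is an assumption made for illustration.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an R scalar: either a double or an integer. */
typedef struct {
    bool is_real;   /* models isReal(x) */
    double real;    /* meaningful when is_real is true */
    int integer;    /* meaningful when is_real is false */
} Arg;

/* Assumed sentinel meaning "no row limit". */
#define NO_LIMIT INT64_MAX

/* Model of the check before the change: the integer branch required >= 1. */
int64_t limit_old(Arg a) {
    int64_t limit = NO_LIMIT;
    if (a.is_real) {
        if (isfinite(a.real) && a.real >= 0.0) limit = (int64_t)a.real;
    } else {
        if (a.integer >= 1) limit = (int64_t)a.integer;
    }
    return limit;
}

/* Model of the check after the change: the integer branch accepts 0 too. */
int64_t limit_new(Arg a) {
    int64_t limit = NO_LIMIT;
    if (a.is_real) {
        if (isfinite(a.real) && a.real >= 0.0) limit = (int64_t)a.real;
    } else {
        if (a.integer >= 0) limit = (int64_t)a.integer;
    }
    return limit;
}
```

In this model, `limit_old` leaves the limit at `NO_LIMIT` for an integer zero but returns 0 for a real zero, which matches the behaviour reported in #4686; `limit_new` returns 0 for both.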

Member

Hmm and we use real here because we allow nrows=Inf to mean "everything". What about just doing as.numeric(nrows) in the fread.R code?

Member Author

@ben-schwen ben-schwen Sep 5, 2020

That would also be a possibility. But then we could also remove the else branch (of if (isReal(nrowLimitArg))) in freadR.c, since it should not be reachable anymore?

Member

Try it out and see if we pass tests. There might be some other use case we're not thinking of

Member Author

Passes all tests.
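
The idea discussed in this thread, coercing nrows to double on the R side so that the C side needs only the real branch, can be sketched as a standalone C model. This is not data.table's code: `normalize_nrows`, `limit_from_double`, `limit_from_int`, and `NO_LIMIT` are hypothetical names, NaN stands in for R's NA, and `INT64_MAX` as the "read everything" value is an assumption.

```c
#include <math.h>
#include <stdint.h>

/* Assumed sentinel meaning "no row limit". */
#define NO_LIMIT INT64_MAX

/* Models the R-side handling: NA (NaN here) or a negative value means Inf. */
double normalize_nrows(double nrows) {
    if (isnan(nrows) || nrows < 0.0) return INFINITY;
    return nrows;
}

/* Models the C side once every caller passes a double: no integer branch. */
int64_t limit_from_double(double nrows) {
    if (isfinite(nrows) && nrows >= 0.0) return (int64_t)nrows;
    return NO_LIMIT;
}

/* An integer argument such as 0L is widened to double before the check. */
int64_t limit_from_int(int nrows) {
    return limit_from_double(normalize_nrows((double)nrows));
}
```

Under these assumptions `limit_from_int(0)` yields 0 and `limit_from_int(-1)` yields `NO_LIMIT`, so an integer zero and a real zero can no longer diverge because there is only one code path.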

@codecov

codecov bot commented Sep 5, 2020

Codecov Report

Merging #4694 (c87fe83) into master (7991630) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head c87fe83 differs from pull request most recent head 662bac4. Consider uploading reports for the commit 662bac4 to get more accurate results

@@           Coverage Diff           @@
##           master    #4694   +/-   ##
=======================================
  Coverage   99.42%   99.42%           
=======================================
  Files          73       73           
  Lines       14439    14440    +1     
=======================================
+ Hits        14356    14357    +1     
  Misses         83       83           
| Impacted Files | Coverage Δ |
|---|---|
| R/fread.R | 100.00% <100.00%> (ø) |
| src/freadR.c | 100.00% <100.00%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Member

@MichaelChirico MichaelChirico left a comment

LGTM, thanks for the PR

R/fread.R Outdated
@@ -26,6 +26,7 @@ yaml=FALSE, autostart=NA, tmpdir=tempdir(), tz="")
stopifnot( isTRUEorFALSE(stringsAsFactors) || (is.double(stringsAsFactors) && length(stringsAsFactors)==1L && 0.0<=stringsAsFactors && stringsAsFactors<=1.0))
stopifnot( is.numeric(nrows), length(nrows)==1L )
if (is.na(nrows) || nrows<0L) nrows=Inf # accept -1 to mean Inf, as read.table does
Member

not directly related to the bug but related to your PR, I guess nrows<0 is more appropriate here

@mattdowle mattdowle changed the title from "add fix for #4686" to "fread nrows=0L fixed to work like nrows=0" Apr 16, 2021
@mattdowle mattdowle added this to the 1.14.1 milestone Apr 16, 2021
@mattdowle mattdowle merged commit 374e208 into Rdatatable:master Apr 16, 2021
@ben-schwen ben-schwen deleted the fread_nrows=0L branch October 1, 2021 22:21
@jangorecki jangorecki modified the milestones: 1.14.9, 1.15.0 Oct 29, 2023
Successfully merging this pull request may close these issues.

'nrows = 0L' in fread reads the whole file
4 participants