Add errors on separate lines for coverage #4301

Closed
wants to merge 23 commits into from

MichaelChirico (Member) commented Mar 13, 2020

if (FALSE) error("failed");

A line like this is not caught by Codecov, because it doesn't build a proper AST for C code. So we have to put error branches on their own line for Codecov to detect them properly.

I've only done this for error so far. A few more patterns could be included:

  • warning
  • STOP in fread.c
  • Verbose output? Rprintf etc, not sure how important it is to cover these... I see about 100 more lines with DTPRINT and Rprintf
  • ...
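
To make the difference between the two layouts concrete, here is a minimal C sketch. `report_error`, `check_one_line` and `check_split` are hypothetical names invented for illustration; `report_error` stands in for R's `error()` and only records the message so the sketch can run outside R. The comments describe the line-based counting discussed above; the sketch itself does not measure coverage.

```c
#include <stddef.h>

/* Hypothetical stand-in for R's error(): it records the message instead of
   jumping out of the function, so the sketch can run outside R. */
static const char *last_error = NULL;
static void report_error(const char *msg) { last_error = msg; }

/* One-line form: with line-based counting, this line is marked as hit as
   soon as the condition is evaluated, whether or not report_error() runs. */
static int check_one_line(int n) {
  if (n < 0) report_error("n must be non-negative");
  return n < 0 ? -1 : 0;
}

/* Split form: the call sits on its own line, so that line stays uncovered
   until some test actually takes the error branch. */
static int check_split(int n) {
  if (n < 0)
    report_error("n must be non-negative");
  return n < 0 ? -1 : 0;
}
```

Both functions behave identically at run time; only the reported coverage differs when no test passes a negative `n`.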

codecov bot commented Mar 13, 2020

Codecov Report

Attention: Patch coverage is 99.74874% with 1 line in your changes missing coverage. Please review.

Project coverage is 99.61%. Comparing base (b1b1832) to head (4e69db6).
Report is 1113 commits behind head on master.

Files with missing lines | Patch % | Lines
src/fread.c | 97.14% | 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           master    #4301    +/-   ##
========================================
  Coverage   99.61%   99.61%            
========================================
  Files          72       72            
  Lines       13916    14088   +172     
========================================
+ Hits        13862    14034   +172     
  Misses         54       54            

@MichaelChirico
Member Author

Wow! 62% (176/285) of these branches were uncovered. Looks like it's already paying off in terms of understanding spurious coverage.

@@ -6312,15 +6312,13 @@ options(datatable.optimize = Inf)

# fread dec=',' e.g. France
test(1439, fread("A;B\n1;2,34\n", dec="12"), error="nchar(dec) == 1L is not TRUE")
test(1440, fread("A;B\n8;2,34\n", dec="1"), data.table(A=8L, B="2,34"))
Member Author

!dec %chin% c('.', ',') is implied to be invalid by L1179 of fread.c @ master:

dec='' not allowed. Should be '.' or ','

?fread is not as strict:

The decimal separator as in utils::read.csv. If not "." (default) then usually ",". See details.

I took a conservative approach for now and blocked !dec %chin% c(',', '.'), but that broke two tests (1440 and 1444.2 here).

dec='1' and dec='*' ran without errors before... happy to restore this so it works again, and change the error message found in fread.c instead.
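
For reference, the conservative rule described above amounts to something like the following. This is only a sketch written in C: `dec_is_allowed` is a hypothetical helper, not the actual check in fread.c or in the R wrapper, and it assumes the separator arrives as a C string.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: accept only a single-character separator that is
   '.' or ','. Anything else, including "" and "1", is rejected. */
static bool dec_is_allowed(const char *dec) {
  if (dec == NULL || strlen(dec) != 1)
    return false;
  return dec[0] == '.' || dec[0] == ',';
}
```

Under this rule dec='1' and dec='*' are rejected, which is the behavior change that broke tests 1440 and 1444.2.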

@@ -1328,7 +1348,8 @@ SEXP binary(SEXP x)
 {
   char buffer[69];
   int j;
-  if (!isReal(x)) error(_("x must be type 'double'"));
+  if (!isReal(x))
Member Author

This function binary() isn't used anywhere... is it just for debugging?

Member

Yes

@MichaelChirico
Member Author

I don't know how to cover this line, so I'm leaving this PR here:

      if (!firstJumpEnd)
        STOP(_("Single column input contains invalid quotes. Self healing only effective when ncol>1"));

@MichaelChirico
Member Author

Heads up that there will be a ton of conflicts between this one and #4306 (I count 20 at the moment). Let's merge this one first, since this one adds a bunch of internal errors.

@MichaelChirico
Member Author

@jangorecki any thoughts on the third point -- should we put verbose messaging on its own line as well?

@jangorecki
Member

@MichaelChirico I don't think it is so important. I would just keep that in mind when coding new stuff, but not necessarily touch all the existing code.

@mattdowle
Member

mattdowle commented Aug 21, 2021

What would it take to make C coverage cover branches within one line, i.e. like R coverage already does?
I don't know which tool that would be -- whether it's something inside covr or inside the gcc toolchain itself, @jimhester?
But that's the root cause. Wouldn't there be much more bang for the time in working on that? That way, coverage for all R packages using C code would be improved, not just data.table.
If there is something deeply insurmountable about doing that (e.g. a covr, gcc or clang developer stating that it's not just difficult but not possible in C), then I'd be much more comfortable splitting all our C lines, given a justification I can point to. But I hope it's actually possible to get coverage within same-line C branches implemented. That would be awesome. It could even lead to a paper and conference presentation.

@MichaelChirico
Member Author

Too many conflicts here. This is being handled incrementally for now. If needed, when the PR queue is emptier, we can start this from scratch & add a linter to prevent backsliding.
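
As a rough illustration of what such a linter might check, here is a deliberately naive C sketch. `same_line_error` is a hypothetical function, not an existing tool or part of data.table; it inspects one source line at a time and does not parse C, so comments, strings, or identifiers ending in `error(` can fool it.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical check: returns true when a source line begins (after
   indentation) with `if (` and also contains `error(` on that same line.
   Naive by design: it matches text only and does not parse C. */
static bool same_line_error(const char *line) {
  while (*line == ' ' || *line == '\t')
    line++;                      /* skip leading indentation */
  if (strncmp(line, "if", 2) != 0)
    return false;                /* line does not start with `if` */
  line += 2;
  while (*line == ' ')
    line++;                      /* allow `if(` as well as `if (` */
  if (*line != '(')
    return false;                /* e.g. an identifier such as `iffy` */
  return strstr(line, "error(") != NULL;
}
```

A real linter would need to handle multi-line conditions and the other patterns listed at the top of this PR (warning, STOP); this sketch only shows the basic idea of flagging the one-line form.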

@MichaelChirico MichaelChirico deleted the src-error-newline branch August 29, 2024 05:59