is_missing() working properly? #27
Comments
At least in Text::CSV_XS this is not a bug: when using getline_hr and column_names are set, none of the columns can be missing in a _hr call, so is_missing is forced to be false for every column.
I admit that the docs could be clearer about that. Suggestions welcome. |
@Tux, is that true when we set "keep_meta_info" to true as I did in https://rt.cpan.org/Public/Bug/Display.html?id=117495 ? I suspect the following line that sets 0x0010 (CSV_FLAGS_MIS) does not work correctly when
(https://github.com/Tux/Text-CSV_XS/blob/master/CSV_XS.pm#L852 ; Same can be said for https://github.com/makamaka/Text-CSV/blob/master/lib/Text/CSV_PP.pm#L814 ) |
I think you are right. Checking … Uh, no: if you have an empty line, there is still ONE field: an empty field. That field is not missing. So, if you set just one single column name, that will never be missing, unless you have a parse error on the first field. This however is most likely the most correct code:
|
@Tux, yeah, your patch in the above message seems most reasonable to me, especially because the example code in the pod has been saying you can skip an empty line if is_missing(0) returns true.
If we consider there is always at least one field, that |
Thanks for pointing that out. That example should be changed. I'll reconsider the docs and the current behavior. Maybe a completely empty line should set missing, even if it is legal. |
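To make the rule under discussion concrete, here is a minimal pure-Perl sketch of one way the "missing" flags could be derived for a parsed row. It is not the code from either module: `missing_flags` and `FLAG_MISSING` are hypothetical names, and only the bit value 0x0010 is taken from this thread. It assumes that columns beyond the parsed fields count as missing, and that a line yielding a single empty field counts as entirely missing.

```perl
use strict;
use warnings;

# Bit value quoted in this thread (0x0010, CSV_FLAGS_MIS); the constant name is made up.
use constant FLAG_MISSING => 0x0010;

# Hypothetical helper, NOT part of Text::CSV_XS or Text::CSV_PP.
# $fields: array ref of parsed fields, $ncols: number of column names set.
sub missing_flags {
    my ($fields, $ncols) = @_;
    my @flags = (0) x $ncols;

    # Columns that have a name but no parsed field are missing.
    $flags[$_] |= FLAG_MISSING for scalar(@$fields) .. $ncols - 1;

    # An empty line still parses as ONE empty field; treat it as missing too.
    if (@$fields == 1 && (!defined $fields->[0] || $fields->[0] eq "")) {
        $flags[0] |= FLAG_MISSING;
    }
    return \@flags;
}

# Empty line, partial row, complete row, each checked against two column names.
for my $row ([""], ["2"], ["2", "x"]) {
    my $flags = missing_flags($row, 2);
    print join(",", map { ($_ & FLAG_MISSING) ? 1 : 0 } @$flags), "\n";
}
```

Under these assumptions the empty line flags both columns, the partial row flags only the second, and the complete row flags none.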
Any thoughts on whether this would be a bug when Text::CSV_PP is in use? |
@rwhitworth your assumption was buggy in the first place (even though the docs are vague). I think that detecting an empty line using getline_hr would take some serious extra steps from a user point of view, something like
If you have just one single column, checking for definedness with blank_is_undef should be sufficient. |
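The single-column suggestion could look roughly like the sketch below. It assumes Text::CSV_XS is installed and that an empty line comes back as one unquoted empty field, which `blank_is_undef` then turns into `undef`. The sample data and the column name `code` are made up for illustration.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Illustrative single-column data with an empty line in the middle.
my $data = "1\n\n3\n";
open my $fh, "<", \$data or die "cannot open in-memory file: $!";

my $csv = Text::CSV_XS->new({ binary => 1, blank_is_undef => 1 });
$csv->column_names("code");

my @codes;
while (my $hr = $csv->getline_hr($fh)) {
    # With blank_is_undef, an unquoted empty field is returned as undef,
    # so a definedness check is enough to skip the empty line here.
    next unless defined $hr->{code};
    push @codes, $hr->{code};
}
close $fh;

print "codes: @codes\n";
```

This only works because there is exactly one column; with several columns, a row of all-empty fields and a genuinely empty line would need to be told apart some other way.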
@Tux, the above code is hard to write, and actually doesn't work (says "EMPTY!" for the first two lines) if the csv looks like the following, and
I agree an empty line is legal in general, but I'd prefer to keep the doc and change the behavior. @rwhitworth, your test code you've sent us as a PR has a few issues: 1) it doesn't set |
@rwhitworth, OK, thanks for your kindness. But please don't hesitate to say which way you would prefer to write your code when you parse your CSV. What you like or dislike is important input for both of us! :) |
I still have not made up my mind. So hard to choose between multiple versions of correct :) |
@rwhitworth I have now committed this: Tux/Text-CSV_XS@6b71f39
And added this to the test suite:
open $fh, ">", $tfn or die "$tfn: $!\n";
print $fh <<"EOC";
a,b

2
EOC
close $fh;
ok ($csv = Text::CSV_XS->new (), "new");
open $fh, "<", $tfn or die "$tfn: $!\n";
ok ($csv->column_names ("code", "foo"), "set column names");
ok ($hr = $csv->getline_hr ($fh), "get header line");
is ($csv->is_missing (0), undef, "not is_missing () - no meta");
is ($csv->is_missing (1), undef, "not is_missing () - no meta");
ok ($hr = $csv->getline_hr ($fh), "get empty line");
is ($csv->is_missing (0), undef, "not is_missing () - no meta");
is ($csv->is_missing (1), undef, "not is_missing () - no meta");
ok ($hr = $csv->getline_hr ($fh), "get partial data line");
is (int $hr->{code}, 2, "code == 2");
is ($csv->is_missing (0), undef, "not is_missing () - no meta");
is ($csv->is_missing (1), undef, "not is_missing () - no meta");
close $fh;
open $fh, "<", $tfn or die "$tfn: $!\n";
$csv->keep_meta_info (1);
ok ($csv->column_names ("code", "foo"), "set column names");
ok ($hr = $csv->getline_hr ($fh), "get header line");
is ($csv->is_missing (0), 0, "not is_missing () - with meta");
is ($csv->is_missing (1), 0, "not is_missing () - with meta");
ok ($hr = $csv->getline_hr ($fh), "get empty line");
is ($csv->is_missing (0), 1, "not is_missing () - with meta");
is ($csv->is_missing (1), 1, "not is_missing () - with meta");
ok ($hr = $csv->getline_hr ($fh), "get partial data line");
is (int $hr->{code}, 2, "code == 2");
is ($csv->is_missing (0), 0, "not is_missing () - with meta");
is ($csv->is_missing (1), 1, "not is_missing () - with meta");
close $fh; |
- and now is_missing(0) on empty line return 1 for keep_meta_info = true (#27)
And Text::CSV_PP passes the test above now, with the above commit. Thanks. |
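From the user's side, the resulting behavior could be used as sketched below to skip empty lines while reading with getline_hr. This assumes a Text::CSV_XS release that includes the commit mentioned above. The data mirrors the test file (header line, empty line, partial row), and an in-memory filehandle stands in for a real file.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Same shape as the test data above: "a,b", an empty line, then "2".
my $data = "a,b\n\n2\n";
open my $fh, "<", \$data or die "cannot open in-memory file: $!";

# is_missing () only reports anything useful when keep_meta_info is set.
my $csv = Text::CSV_XS->new({ binary => 1, keep_meta_info => 1 });
$csv->column_names("code", "foo");

my @kept;
while (my $hr = $csv->getline_hr($fh)) {
    # Assumption: per the tests above, the first column is flagged as
    # missing only when the whole line was empty.
    next if $csv->is_missing(0);
    push @kept, $hr;
}
close $fh;

printf "kept %d row(s)\n", scalar @kept;
```

Without `keep_meta_info`, the tests above show `is_missing` returning `undef` for every column, so this check would never skip anything.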
I notice there are no tests for the is_missing() function, so I created a few, but it looks like is_missing returns undef when it shouldn't. Am I misunderstanding the is_missing or getline_hr functions?
The test is also in my fork of this repo, in case Markdown eats the code.