Ignore last row if it is empty #388

Closed
MarkOrtmann opened this issue May 4, 2020 · 7 comments
Comments

@MarkOrtmann

If I parse the following file with line separator ; and delimiter ,
A,B,C;\n
D,E,F;\n
the output contains three rows with the last row being an empty String[] (note the \n).

I know this is a rather unusual CSV format, but would you consider supporting it, i.e., suppressing the last line if it's empty? The lib correctly trims the last row, which is \n. I know there is an option for this; however, I don't think it was introduced to resolve this issue, so it feels more like a workaround.

Thanks for your help, and I really want to thank you for this great library!
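To illustrate where the third row comes from, here is a minimal plain-Java sketch. It does not use the library and is not how the library is implemented; `SemicolonSplitDemo` and `splitRecords` are hypothetical names, and the splitting and trimming only approximate a parser configured with `;` as the line separator and whitespace trimming enabled.

```java
import java.util.ArrayList;
import java.util.List;

public class SemicolonSplitDemo {

    // Hypothetical helper: splits the raw text on ';' and trims each record.
    // The limit of -1 keeps trailing pieces instead of discarding them.
    static List<String> splitRecords(String input) {
        List<String> records = new ArrayList<>();
        for (String record : input.split(";", -1)) {
            records.add(record.trim());
        }
        return records;
    }

    public static void main(String[] args) {
        // Same content as the file in this issue: every ';' is followed by '\n'.
        List<String> records = splitRecords("A,B,C;\nD,E,F;\n");

        // The text after the final ';' is just "\n", which trims to an empty
        // record, so three records come back instead of two.
        System.out.println(records.size()); // prints 3
    }
}
```

The final `\n` sits after the last separator, so it forms a record of its own that becomes empty once trimmed.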
Mark

@jbax
Member

jbax commented May 4, 2020 via email

@jbax jbax closed this as completed May 4, 2020
@MarkOrtmann
Author

MarkOrtmann commented May 4, 2020

Just for clarification: are you saying this is a two-line CSV because the second line should not end with \n and the proper line separator is \n?

@jbax
Member

jbax commented May 4, 2020 via email

@MarkOrtmann
Author

MarkOrtmann commented May 5, 2020

I think I'm confusing you rather than the other way around and I really appreciate your quick response.

So this is my problem. I have the following file semicolon.zip and run the following code.

final CsvParserSettings settings = new CsvParserSettings();
settings.getFormat().setLineSeparator(";");
final CsvParser parser = new CsvParser(settings);
parser.beginParsing(new File("<path to semicolon.csv>"));
String[] row;
while ((row = parser.parseNext()) != null) { System.out.println(Arrays.toString(row)); }

Output:
[A]
[B]
[null]

The last entry is actually \n, but due to trimming it becomes null. What I'd prefer is that the output does not contain [null], and that's the option I'm asking for. Please note that this suppression would apply only to the very last row. If there were an empty line between A and B, it should clearly be returned, unless we use setSkipEmptyLines.

Thanks for your help once more
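Until such an option exists, the requested behaviour can be approximated by post-processing the parsed rows. The following is only a sketch under assumptions: `TrailingRowFilter`, `isEmptyRow` and `dropTrailingEmptyRow` are hypothetical names, not part of the library, and the input is assumed to be a `List<String[]>` of already parsed rows (for example, collected from the `parseNext()` loop above).

```java
import java.util.ArrayList;
import java.util.List;

public class TrailingRowFilter {

    // Hypothetical helper: a row counts as empty when every value is null or "".
    static boolean isEmptyRow(String[] row) {
        for (String value : row) {
            if (value != null && !value.isEmpty()) {
                return false;
            }
        }
        return true;
    }

    // Returns a copy of the rows without the very last row if that row is empty.
    // Empty rows anywhere else are left untouched.
    static List<String[]> dropTrailingEmptyRow(List<String[]> rows) {
        List<String[]> result = new ArrayList<>(rows);
        int last = result.size() - 1;
        if (last >= 0 && isEmptyRow(result.get(last))) {
            result.remove(last);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"A"});
        rows.add(new String[] {null}); // empty row between A and B: kept
        rows.add(new String[] {"B"});
        rows.add(new String[] {null}); // trailing empty row: dropped

        System.out.println(dropTrailingEmptyRow(rows).size()); // prints 3
    }
}
```

Because only the final element is inspected, an empty line between A and B would still be returned, matching the behaviour described above.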
Mark

@jbax
Member

jbax commented May 5, 2020 via email

@MarkOrtmann
Author

Seems like I need to familiarize myself with the normalized newline option.

By the way, would auto-detecting the line separator resolve the \r\n issue, as you proposed in the other issue, where I asked about OS-independent line breaks?

Thanks again for your quick reply!
Mark

@MarkOrtmann
Author

OK, now I understand a little more.

I added the following two lines before creating the parser:
settings.getFormat().setNormalizedNewline('\n');
settings.setSkipEmptyLines(false);

The problem now is that, with skipEmptyLines disabled, it returns one String[null] array for row A and two, rather than one, String[null] arrays for the last \n row.
