Fix end of line parsing on Windows #294
base: main
Odoc's build on Windows using this patch (and other fixes on Odoc itself) is passing: https://github.com/Julow/odoc/runs/1285922231?check_suite_focus=true
Looks good!
Could you also add a changelog entry about this please?
Did you try to add a small unit test for this? I think it would be a nice improvement to this PR, to make sure we don't break parsing of Markdown files on Windows in the future by mistake!
I think it would be better to test this in CI, where every text file would be converted to Windows line endings (Git does that). Otherwise, such a test would be easy to break (for example, Git's rules can be configured locally and there is no clear default) and incomplete (spaghetti code).
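For reference, the kind of small unit test discussed above could be sketched as follows. This is only an illustration under assumptions: `to_crlf` and `split_lines` are hypothetical helpers, and `split_lines` stands in for the real parser, which is not reproduced here. The idea is to convert a fixture to CR LF in memory, so the check does not depend on how Git is configured locally.

```ocaml
(* Hypothetical helper: rewrite every bare LF as CR LF, leaving existing
   CR LF pairs untouched. *)
let to_crlf (s : string) : string =
  let b = Buffer.create (String.length s + 16) in
  String.iteri
    (fun i c ->
      if c = '\n' && (i = 0 || s.[i - 1] <> '\r') then
        Buffer.add_string b "\r\n"
      else Buffer.add_char b c)
    s;
  Buffer.contents b

(* Hypothetical stand-in for the parser under test: split on LF and drop
   one trailing CR per line, so LF and CR LF inputs yield the same lines. *)
let split_lines (s : string) : string list =
  String.split_on_char '\n' s
  |> List.map (fun l ->
         let n = String.length l in
         if n > 0 && l.[n - 1] = '\r' then String.sub l 0 (n - 1) else l)

(* The same document must parse identically with either line ending. *)
let () =
  let doc = "# Title\n\nSome text\n  $ echo hi\n  hi\n" in
  assert (split_lines doc = split_lines (to_crlf doc))
```

Such a test only covers the code paths it exercises, so it would complement a CI job with converted files rather than replace it.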
It looks good to me. I guess we need to do the same for lexer_top, lexer_cram and mli_parser.
@emillon I seem to recall we discussed this and that the cause of this failure was unclear, as you couldn't reproduce the bug on Windows. Do we know more about this?
The general idea is that yes, we do have to handle newlines. Some Git installations are configured to translate newlines, some are not. If they are translated, the LFs in this repository will be converted to CR LF and the tests will fail. Conversely, this translation can also hide problems in the other direction by making CRs disappear.

Two distinct bits are necessary to properly support different kinds of newlines: The first one is about accepting various types of newlines - generally accepting But mdx also writes Several strategies are possible for that second part:
The translation layer that exists in OCaml (or really, in the libc) is a bit lacking for this, because it can only select between "binary" (1 above) and "text" (2 above), not the "always CR" or "always CR LF" that would be required by 3. So to implement the "preserve" strategy, one has to open the file in binary mode and do the translation manually. One final remark is that
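
As a rough illustration of the "preserve" strategy described above (open in binary mode, translate manually), here is a minimal OCaml sketch. It is not mdx's implementation: every function name is hypothetical, detection looks only at the first newline in the file, and mixed line endings are not handled.

```ocaml
type eol = LF | CRLF

(* Read the whole file in binary mode so no newline translation happens. *)
let read_binary (path : string) : string =
  let ic = open_in_bin path in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () -> really_input_string ic (in_channel_length ic))

(* Write in binary mode: the bytes land on disk exactly as given. *)
let write_binary (path : string) (s : string) : unit =
  let oc = open_out_bin path in
  Fun.protect
    ~finally:(fun () -> close_out_noerr oc)
    (fun () -> output_string oc s; flush oc)

(* Assumption: the first newline is representative of the whole file. *)
let detect_eol (s : string) : eol =
  match String.index_opt s '\n' with
  | Some i when i > 0 && s.[i - 1] = '\r' -> CRLF
  | _ -> LF

(* Drop each CR that immediately precedes an LF, giving LF-only text. *)
let normalize (s : string) : string =
  let n = String.length s in
  let b = Buffer.create n in
  String.iteri
    (fun i c ->
      if c = '\r' && i + 1 < n && s.[i + 1] = '\n' then ()
      else Buffer.add_char b c)
    s;
  Buffer.contents b

(* Re-apply the detected convention to LF-only text. *)
let denormalize (eol : eol) (s : string) : string =
  match eol with
  | LF -> s
  | CRLF -> String.concat "\r\n" (String.split_on_char '\n' s)

(* Run [f] on LF-only text, then write back with the original ending. *)
let rewrite_preserving_eol (path : string) (f : string -> string) : unit =
  let raw = read_binary path in
  let eol = detect_eol raw in
  write_binary path (denormalize eol (f (normalize raw)))
```

With this shape, the processing function `f` only ever sees LF, and the file on disk keeps whichever convention it had before.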
I was testing some MDX + Windows things, and found some files using CRLF would simply hang in MDX. It would be great to see full support for line-endings land in MDX -- this might get picked up as part of ocaml-multicore/eio#758 |
There is no easy way to check this; a lot of work is still needed to run the test suite on Windows. This build fails for a different reason, which is enough for me: https://github.com/Julow/odoc/runs/1281945575?check_suite_focus=true