Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid bgzf EOF check on modern MinGW releases. #1601

Merged
merged 1 commit into from
May 16, 2023

Conversation

jkbonfield
Copy link
Contributor

Unfortunately this test is proving unreliable now following a change to lseek so that it no longer returns -1 on pipes.

Fixes samtools/bcftools#1901

@jkbonfield jkbonfield force-pushed the bgzf_eof_mingw branch 2 times, most recently from 0952899 to ba89d9d Compare April 11, 2023 11:55
@daviesrob
Copy link
Member

I think this is treating the symptom rather than the cause. The MinGW-w64 lseek64() seems to be a simple wrapper around _lseeki64(). According to the Microsoft _lseeki64 documentation, "On devices incapable of seeking (such as terminals and printers), the return value is undefined." So it's not clear how this ever worked...

I think the full solution might be an fstat() on the file descriptor in hopen_fd() and hdopen() to check if it's a regular file. The result can be added to the bitfield in hFILE_fd. Then fd_seek() could simply check this on Windows, and refuse to seek on non-regular files. This would catch all attempts to seek, and not just the BGZF EOF check.

@jkbonfield
Copy link
Contributor Author

jkbonfield commented Apr 18, 2023

Undefined doesn't mean it never worked. It just means they're free to change their mind on implementation. The official posix definition of lseek though is clear, and MinGW has attempted to implement POSIX semantics even when the underlying microsoft interface doesn't. Obviously it can't catch all scenarios though.

My fix was the minimal and most simple solution to get bcftools CI working once more, so I think it's still worth doing for now.

A more invasive change to htslib will inevitably mean a longer development and testing time, especially given other commitments this week. Would you consider doing both?

Edit: also see samtools/bcftools#1901 (comment) last paragraph. Firstly, this worked with gcc 11 but not with 12. Upgrading gcc also upgrades all sorts of other associated stuff, so it's unclear if it's compiler or the libc which has the change, but regardless it demonstrates my comment about about mingw normally working around windows posix breakages, which is why we use it. Your alternative is also basically what I outlined too, although I hadn't thought of using fstat.

@jmarshall
Copy link
Member

My fix was the minimal and most simple solution to get bcftools CI working once more, so I think it's still worth doing for now.

It's not minimal though: having bgzf_check_EOF_common() always return 2 means that you have to hack test_bgzf.c too. (And it appears it changes functionality on Windows, suppressing the EOF check for ordinary files too.)

There's an example of the sort of _WIN32-specific code needed next door in fd_write(); wouldn't it suffice to add similar in fd_seek():

 static off_t fd_seek(hFILE *fpv, off_t offset, int whence)
 {
     hFILE_fd *fp = (hFILE_fd *) fpv;
+#ifdef _WIN32
+    if (GetFileType((HANDLE)_get_osfhandle(fp->fd)) == FILE_TYPE_PIPE) { errno = ESPIPE; return -1; }
+#endif
     return lseek(fp->fd, offset, whence);
 }

That's equivalent to Rob's suggestion, except on Windows it does the check on every seek instead of doing it just once on Windows when opening the file and recording the answer.

@daviesrob
Copy link
Member

Yes, that would be a simpler fix albeit less efficient as the check really only needs to be done once. I'd prefer to look for GetFileType((HANDLE)_get_osfhandle(fp->fd)) != FILE_TYPE_DISK as it would also catch character devices, which also can't seek.

I don't think MinGW has ever claimed to implement POSIX semantics. It's lseek() is just a wrapper around the Microsoft C runtime function, so what happens depends on what that does and not on anything in MinGW.

I expect cram_check_EOF() on a pipe would be affected as well as bgzf_check_EOF_common(). Fixing hseek() would sort that one too.

@jkbonfield
Copy link
Contributor Author

I meant minimal in code. Obviously it had the side effect of not checking, but that was the whole point - the check no longer works due to mingw changes.

I wasn't aware of how to do this using windows native API, but that code snippet shows the way. Thanks.

@jkbonfield
Copy link
Contributor Author

I wasn't aware of how to do this using windows native API, but that code snippet shows the way. Thanks.

The irony of this statement is, those lines of code were written by myself! Haha. Apparently I once did know how to do this, or managed somewhat better on search terms in google and/or stackoverflow to set the ball rolling. :)

Anyway I've confirmed it's fixing (again) my local mingw build of bcftools (glacially), so will work on getting this PR revised.

MinGW 12.x started returning non-zero values from lseek when the fd
is a pipe.  This is unhelpful and it breaks bgzf_check_EOF as seeking
to the end is actually seeking to the end of the pipe memory buffer,
causing invalid EOFs.  (This breaks bcftools CI tests.)

Fixes samtools/bcftools#1901

Co-authored-by: John Marshall <jmarshall@hey.com>
@daviesrob
Copy link
Member

I've managed to reproduce the bug on Windows server 2019 using MinGW-w64 in both MINGW64 and UCRT64 modes, and can confirm that the revised patch fixes both.

An easy way to reproduce the problem was this:

cat  test/ce#large_seq.tmp.bam | ./test/test_view.exe -p /dev/null -

test/ce#large_seq.tmp.bam is made by running make test and at 1.7M is plenty big enough to trigger the issue.

At some point we may want to look into making file access on Windows use the more native APIs. The POSIX-like ones seem to have some quirks, e.g. msvcrt _open() returns error "No such file or directory" when trying to open a named pipe. The MSYS2 wc command can do that, so it must be possible somehow...

Meanwhile I'll merge this so that bcftools' tests will become more reliable again.

@daviesrob daviesrob merged commit e13611a into samtools:develop May 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Random bcftools concat CI failures on Windows
3 participants