Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid really closing stdin/stdout in hclose()/hts_close()/et al #1665

Merged
merged 3 commits into from
Aug 31, 2023

Conversation

jmarshall
Copy link
Member

Revisiting the stdin/stdout/dup conversation from #1658:

Specifically the stdin/stdout code feels inherently wrong and I don't entirely understand why it's not just dealt with automatically as any other filename.

hopen_fd_stdinout() just creates an hFILE_fd with an fd of 0 or 1. Hence in = hopen("-", "r") … hclose(in) … in2 = hopen("-", "r") would fail because fd 0 is no longer open or no longer refers to the original stdin. […] really this has caused no significant problems in all the years since it was implemented so far — largely because reading two BAM files from stdin or writing two BAM files to stdout just isn't a useful operation.

Reconsidering this… In fact it has caused some problems over the years: htsfile.c needed some circumlocutions because it too wants to write several SAM/VCF (i.e. textual) htsFile streams to stdout, and pysam has had trouble due to stdout being closed by samtools and bcftools. Moreover, as stdin and stdout are already open and hence are not opened by hts_open/hopen, it's not really morally right for them to be closed by hts_close/hclose.

Hence this pull request, which adds a mode option letter to hdopen() to signal that the fd should not be closed by hclose(), and uses it in hopen_fd_stdinout() which underlies hopen("-", …).

This enables repeated hopen("-") / hclose() / hopen("-") where previously the underlying STDIN/OUT_FILENO would have been prematurely closed. This will mean that the linked bgzip.c pull request would not need to treat "-" specially.

This also means that stdout is never really closed and hclose()'s return value does not reflect closing the underlying fd. Hence particularly paranoid programs that have written significant data to stdout will want to close and check it themselves just before the end of main():

if (fclose(stdout) != 0 && errno != EBADF) perror("closing stdout")

(Ignoring EBADF as stdout may already have been closed and checked, e.g. if the program has been linked to an earlier HTSlib where hclose() still closes STDOUT_FILENO.)

The second commit simplifies htsfile.c's file opening and adds a final fclose(stdout) check accordingly.

Enables repeated hopen("-") / hclose() / hopen("-") where previously
the underlying STDIN/OUT_FILENO would have been prematurely closed.

This means that stdout is never really closed and hclose()'s return
value does not reflect closing the underlying fd. Hence particularly
paranoid programs that have written significant data to stdout will
want to close and check it themselves just before the end of main():

  if (fclose(stdout) != 0 && errno != EBADF) perror("closing stdout")

(Ignore EBADF as stdout may already have been closed and checked,
e.g. if the program has been linked to an earlier HTSlib where
hclose() still closes STDOUT_FILENO.)
We don't need to dup(STDOUT_FILENO) now that hclose()/hts_close()
no longer irretrievably close stdout. Instead fclose(stdout)
explicitly just before the end of main() as it is a last chance
to observe I/O errors.
jmarshall added a commit to jmarshall/samtools that referenced this pull request Aug 21, 2023
Since PR samtools/htslib#1665, hts_open("-", "w") / hts_close() no longer
actually closes stdout. Close it at the end of main() so there is an
opportunity to detect I/O errors in previously-uncommitted writes.

Ignore EBADF as other code may have already closed stdout, e.g., either
particular subcommands or when (dynamically) linked against an older
version of HTSlib.
@jmarshall
Copy link
Member Author

As noted on the previous htslib PR:

In particular, there are some I/O errors that are not reported by the kernel until you call close(2). So to fully check for errors in some-htslib-using-command args > output.bam you need to check the return value of hclose() and it needs to reflect the return value of close(2) even for stdout. Otherwise you risk undetected corruption in output.bam.

Alternatively, you need to explicitly close(STDOUT_FILENO) (or an equivalent) separately from hclose(), as described in the initial comment and demonstrated for htsfile.c. See also samtools/samtools#1909 which adds such a check for samtools. If this approach is accepted, I will prepare a similar PR for bcftools.

@jmarshall
Copy link
Member Author

Having hclose() et al not actually close stdout and concentrating really closing it in one place at the end of samtools' and bcftools' main() functions will make life easier for pysam too.

(However I have a plan to rewrite pysam's I/O redirection in an upcoming release so that it will be immune to whether samtools/bcftools closes these file descriptors or not anyway.)

@jkbonfield
Copy link
Contributor

I like this as a change. It does feel more natural to me, as as you state if the file wasn't opened by htslib then maybe its going too far for it to be doing the closing.

Also given we're paranoid and use fdatasync / fsync already, I expect even if people don't explicitly do a close-and-check themselves then almost all errors will have been spotted already (eg ENOSPC).

@jmarshall
Copy link
Member Author

Hmmm… interesting point about fsync. POSIX is extremely vague about the circumstances in which close() might return EIO: “An I/O error occurred while reading from or writing to the file system”. MacOS's BSD-heritage man page says “A previously-uncommitted write(2) encountered an input/output error” which is slightly more helpful.

You may also be amused to hear that POSIX and Linux have opposite answers to “what happens if during dup2(), the as-if-by-close closing of the previous dest fd fails due to an I/O error?”…

@jkbonfield
Copy link
Contributor

Thinking on things that can go wrong - devil's advocate if you wish. (I do infact like the idea in this PR)

  1. Potential for a broken tool changing behaviour, where we write non-HTS data to stdout after close, which was silently dropped on the floor in the past. Eg:
while ((r = sam_read(in, h, b)) >= 0) {
    sam_write(out, h, b);
    n++;
}
hts_close(out);
printf("Wrote %d records\n", n);

This would previously report the number of records written when given a file, or when given stdout it'd just lose the printf as stdout had already been closed. Now it wouldn't get closed and we'd have chit-chat appearing in the output file instead. This is quite similar to the occasional nohup bugs people hit due to mixing in stderr when using > file instead of -o file.

I'm not hugely concerned as I'd argue this sort of tooling is broken and also trivial to fix, but I wouldn't be surprised if this class of bug also exists in the wild too.

  1. Pipes don't get checked with fdatasync, so they may actually be a genuine source of data loss that we can no longer verify. Eg:
cat in.sam | strace -o tr ./test/test_view - | cat > /dev/null

(Note this also hints that we maybe should fix test_view too to have an explicit close.)
The strace output shows no close of stdin (harmless) or stdout (more serious). It ends with:

read(0, "", 4096)                       = 0
write(1, "H7JKIB>ACCBF,AF@?BED>C6C1\tX0:i:1"..., 3046) = 3046
fdatasync(1)                            = -1 EINVAL (Invalid argument)
exit_group(0)                           = ?

So we see the issue with fdatasync not working here. In this regard, perhaps close would spot errors. Then again it's possible they'd just got lost in exactly the same manner. We're checking our writes, but it's the internals of the kernel which matter here. So we've sent some data down a pipe or socket, which succeeded at that point - we put it in the "todo" buffer basically - but later is found to fail. It's not something that is trivial to test either.

Given how we prefer -o file over > file, really pipes should be the primary source of writing to stdout.

@jmarshall
Copy link
Member Author

  1. I also considered adding another option to override it, e.g., hts_open("-", "wO") (O for owned / one time) would make hts_close() really close stdout. Or leaving closing as the default, so that applications would have to write hts_open("-", "wS") to get the new behaviour.

    However it all seemed a bit complicated and unclean. I too would think that a program like you describe is broken.

  2. I always vaguely assumed the startup code would close stdin/out/err, but I see it just leaves It to exit_group to close them all. It wouldn't know what to do to report failures, so I guess there'd be no point in closing them explicitly.

    I've added an fclose to test_view.

    Re pipes, there's no delayed write to an underlying device going on, so I'd be surprised if close ever reported an error on a pipe. The failure mode would be more along the lines of writing to another process that abended prematurely — and you'd expect to see diagnostics (either from the program itself or whichever shell spawned it) about the other process's unfortunate end. (Or quite likely you'd still be writing to the pipe, so you'd get to emit a diagnostic about the EPIPE too.)

    In any case, by adding explicit closes at the end of main() in samtools and bcftools, things are equivalent to before this change — at least for those two programs.

@jkbonfield jkbonfield merged commit 946f291 into samtools:develop Aug 31, 2023
8 checks passed
@jkbonfield
Copy link
Contributor

We seem to be in agreement on the various points, including what is valid and what is broken code that we need not support. Thanks for clarification on the pipe case too.

Agreed also on trying to override this with extra hts_open letters is unnecessary complication. Besides if it turns out to be a something required then it can still be done later. I'll move on to reviewing the samtools PR. Thanks

jkbonfield pushed a commit to samtools/samtools that referenced this pull request Aug 31, 2023
Since PR samtools/htslib#1665, hts_open("-", "w") / hts_close() no longer
actually closes stdout. Close it at the end of main() so there is an
opportunity to detect I/O errors in previously-uncommitted writes.

Ignore EBADF as other code may have already closed stdout, e.g., either
particular subcommands or when (dynamically) linked against an older
version of HTSlib.
@jmarshall jmarshall deleted the shared_fd branch August 31, 2023 09:24
jmarshall added a commit to jmarshall/pysam that referenced this pull request Aug 31, 2023
Apply PR samtools/htslib#1665. At present, pysam would prefer that
stdin/stdout were never closed from under it.
jmarshall added a commit to jmarshall/bcftools that referenced this pull request Nov 27, 2023
Since PR samtools/htslib#1665, hts_open("-", "w") / hts_close() no longer
actually closes stdout. Close it at the end of main() so there is an
opportunity to detect I/O errors in previously-uncommitted writes.

Ignore EBADF as other code may have already closed stdout, e.g., either
particular subcommands or when (dynamically) linked against an older
version of HTSlib.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants