Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Streamline file reading to use io streams #498

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Ivorforce
Copy link
Contributor

@Ivorforce Ivorforce commented Jul 4, 2024

I've recently needed to read files from buffers rather than files (streamed from file selection pickers). I've opened up a Pull Request with a quick fix allowing a buffer override: #491

However, a simple buffer override is suboptimal. Ideally, there are as few diverging paths in the code as possible. Additionally, it makes sense to separate concerns of parsing and I/O, because being agnostic to the buffer source enables flexibility and decreases complexity on both sides. Hence, merging this PR is much preferred to #491.

This Pull Request implements my recommendation for a restructure of rdrecord, rdheader and rdann. rdann was easy to modify, but rdrecord and rdheader's ability to read multi-segment records does not easily translate to reads using streams. This results in code that isn't as nice as I was hoping - so someone needing to read data from buffers must therefore implement surrounding functionality, such as pre-reading the header or acting on multi-segment records, themselves. Still, streamlining the functions addressed in this PR already facilitate accomplishing this goal at all, without needing to re-write all of the parsing functionality of wfdb.

Before Merging

This pull request needs:

  • Unit Tests passing
  • Updated documentation

Concerns and observations

  • Previously, the header parser differentiated between remote and local headers. The new implementation makes it impossible to distinguish between the two. I would appreciate input on whether this differentiation was intentional. If so, we can allow an argument to _rdheader specifying the encoding.
    • Remote headers were decoded using iso-8859-1
    • Local headers were decoded using ascii
    • I'm also open to _rdheader accepting a pre-decoded string, since the file is plaintext. I advocate a binary input though to avoid inconsistent coding behavior.
  • rdheader previously ignored file read errors for local files. I would like input on whether that was intentional, because a file read error could result in corrupted file reads.
  • rdheader previously allowed passing a directory even if pn_dir was supplied. In this case, the file_name directory was cut from the record name and silently ignored. I don't think this is the right call, so I cut this functionality. Let me know if there's a reason to keep it.
  • Some buffering arguments are, as of now, removed. It should be easy to re-introduce them into the new enclosing functions, if needed.
  • This PR removes a lot of now unneeded private functions. While it's unlikely (and discouraged due to internality) they were used before, it could break dependents relying on them.
  • The no_file parameter could be deprecated, instead using a None test against sig_data. This removes a potential failure case because it's unlikely someone passing sig_data doesn't intend it to be used.

@Ivorforce Ivorforce force-pushed the read-stream-streamlined branch 2 times, most recently from e0f7da4 to 52054f2 Compare July 4, 2024 15:02
@Ivorforce Ivorforce force-pushed the read-stream-streamlined branch 3 times, most recently from 25c3704 to e6277a4 Compare July 4, 2024 15:17
@Ivorforce Ivorforce changed the title Streamline file reading to use buffers Streamline file reading to use io streams Jul 4, 2024
@Ivorforce Ivorforce force-pushed the read-stream-streamlined branch 7 times, most recently from d7fc7f0 to 9de2d92 Compare July 30, 2024 16:50
…d annotation files. These methods all now stack between the public functions, handling mainly i/o, and the parsing functions.
@Ivorforce Ivorforce force-pushed the read-stream-streamlined branch from 9de2d92 to 7a83307 Compare July 30, 2024 16:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant