Support writing parquet to stdout
#1687
Comments
As described in #937, it should be relatively straightforward to drop the seek requirement. A PR would be most welcome; otherwise I can try to take a stab at it when I have time.
Great. Sorry for missing the existing issue. I looked, but not thoroughly enough, it seems. I also found that it would be rather straightforward to drop it. This might become my first contribution, if I am not kept busy with issues on the downstream artefacts. My intention with this issue was exactly to verify that such a PR would indeed be welcome. Thanks for the quick response!
I'm currently working on this as part of fixing #1717.
@tustvold Great! This will unblock new features in the downstream crate.
Specifically, #1719 allows |
This is great. I'm not sure what the etiquette here is. Am I supposed to close this issue, or do the maintainers do so? For me, the change in signature is enough to verify that it solves my use case.
It will be closed automatically when #1719 is merged, which will hopefully be in the next few days. Going to leave it open for a bit to give other reviewers a chance to look over it.
Planned for release in 16.0.0 (ETA early next week).
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I would like to write parquet files to "true" streams, e.g. stdout. This is in the context of the downstream `odbc2parquet` tool, for which I would like to provide this option. This would allow my users to stream the parquet directly into a key value store or other sink just using pipes in their shell.
Describe the solution you'd like
I would like to see the `Seek` + `TryClone` requirement for initializing a `SerializedFileWriter` dropped. From what I've seen, at least the `Seek` requirement is used to determine the length of the metadata written into the stream, or to track the stream position in general. I feel `Seek` is too strong a requirement just to keep track of a position, or of the number of bytes written.
Describe alternatives you've considered
I have not considered any alternatives. Happy to hear about them, though.
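To illustrate the point about `Seek` in the proposed solution, here is a minimal sketch of tracking a stream position with nothing but `Write`. The names `CountingWriter` and `write_with_trailing_length` are hypothetical and made up for this example. They are not part of the parquet crate, and the byte layout is only loosely modelled on a footer followed by its length.

```rust
use std::io::{self, Write};

/// Hypothetical wrapper (not the parquet crate's API) that counts the
/// bytes passing through to the inner writer.
struct CountingWriter<W: Write> {
    inner: W,
    bytes_written: u64,
}

impl<W: Write> CountingWriter<W> {
    fn new(inner: W) -> Self {
        Self { inner, bytes_written: 0 }
    }

    /// Current stream position: the total number of bytes written so far.
    fn position(&self) -> u64 {
        self.bytes_written
    }
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Only count what the inner writer actually accepted.
        let n = self.inner.write(buf)?;
        self.bytes_written += n as u64;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Writes a magic marker, a metadata payload, and then the payload length.
/// The length comes from two recorded positions, so no seek is needed.
/// Returns the total number of bytes written.
fn write_with_trailing_length<W: Write>(sink: W, metadata: &[u8]) -> io::Result<u64> {
    let mut out = CountingWriter::new(sink);
    out.write_all(b"PAR1")?;
    let start = out.position();
    out.write_all(metadata)?;
    let len = out.position() - start;
    // Illustrative only: the length is stored as a little-endian u32.
    out.write_all(&(len as u32).to_le_bytes())?;
    out.flush()?;
    Ok(out.position())
}
```

Because the function only asks for `Write`, a sink such as `std::io::stdout().lock()` can be passed in directly, which is the piping use case described above.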