Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add File API proposal design document #3101
Add File API proposal design document #3101
Changes from 2 commits
0669d16
e2b8dda
15be818
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well .. the underlying implementation already loads it once and if it over 100kb it likely loads a couple of more times. This is blocked on removing afero.
But hopefully this will be a lot better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So removing
afero
is part of the implementation? Should we have a rude but working PoC with it? We should also link the issue here.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't argue removing
afero
is part of the implementation.afero adds +1 on the times we load it in memory and there is +1-2x if it is fairly big file.
This second memory though is temporarily and will be reused. You can see a more detailed explanation here
So we will have at least 2 copies of the file in memory, which is far from ideal, but definitely a lot better than having a copy per VU.
The whole proposal currently while having some values does not make it a lot more useful to open big files, as the moment you have to use them you still need to start copy. So dropping the 2 copies of this part is likely negligible to getting streams and not copying on the "other side" where this data will be used by something.
I am not against removing afero entirely and would like to prioritise it again, but I don't think this is our biggest problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also keep removing afero out of scope for this design.
Thanks for the heads-up on afero indeed, I didn't have the full context in mind regarding how it handles memory itself 🙇🏻 I'll try to rephrase to be more accurate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No
close
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whether we need one or not depends a lot on the underlying implementation, so I was on the verge. As a safety net, and for the API to fill intuitive indeed, it's probably better to have one indeed. Adding it 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think
File.close()
would be confusing with the current implementation, consideringopen()
returns a slice with the file data.So what would
close()
actually do? It certainly wouldn't close the file, as the comment here states. If we have the concept of aFileHandle
, thenclose()
could make it unusable from that point on? Not sure what the purpose of that would be, sinceFileHandle
s shouldn't have any expensive resources to free.I get that a counterpart to
open()
would be intuitive, but since we're not dealing with traditional file handles here, aclose()
doesn't make sense to me.Maybe instead of a top-level
open()
function, aFile
constructor would be more intuitive? E.g.:This would make it clear that
File
handles are cheap and disposable.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding
close
I had left it out of the proposal and POC implementation as they were not necessary with the way I approached solving the problem on the implementation side. But this discussion is a good opportunity to mention that, yet, I didn't want to assume what the precise underlying implementation would be. However, I had thought of one potential use of close: the file registry I used in the POC to keep files content in memory, and provide a pointer to the data to VUs, could keep a reference count for each file->vu number, and once a file is not referenced anymore (because all VUs closed it), it could explicitly drop it. But I'm not entirely convinced that's really useful in practice.Regarding using a constructor as opposed to
open
, I'm not opposed to it, but I would personally prefer to stick to an API that ressembles what you find in OSes and other languages. On another note that maybe proves interesting somehow: In my initial implementation of the POC, theFile
struct was actually calledFileView
to reflect that it's a view of the file content, cannot be modified, and is a "cheap and disposable" handle.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implementation is the key word here.
And this is an API proposal. If after making this better we need to break the API because now
close
will need to be called in order for it to work that will be bad.I would expect that any file you open - you will be able to close. Whether that has a significant impact is a good question.
If there was a different way to tell us that "I no longer need this file" I will be fine. But at this point this is not possible AFAIK. There are some resource management tc39 proposals that I am not going to discuss as they are just proposal and will require that we have a
close
like method to tell us that a file is no longer needed. So ... not really a solution.I am okay with this not having
close
for now - I do expect we will likely need to iterate on all the changes around the epic "support big files in k6" quite a lot. But wanted to lay my reasoning on wanting aclose
.While I am not against it in principal, I don't see why that will be better..
If anything the current API is closer to what is in deno and in other places and consequently might be reusable across platforms. Or more likely easier to support when someone wants to reuse a library. So ... 👎 on my side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is that
open()
andclose()
only make sense if the file itself is opened and closed. But since we'll only open the file and read all its data once, and each subsequent call toopen()
will return a view of the data, callingclose()
on this view would be confusing if people are expecting the file to be closed.This behavior will not change in a different implementation, since it's the core of what we want to accomplish. So the naming doesn't quite align with the behavior, and repurposing the name used in other frameworks to forcefully align it wouldn't make it more intuitive--quite the contrary. As a user, I'd rather have the API clearly reflect the behavior, than have to remap my existing assumptions of what
open()
andclose()
do.I suggested using a constructor instead of
open()
since it doesn't imply that aclose()
would be needed. But I'm not sold on it either, and we could consider alternatives likeFileView
.Anyway, this is not a blocker from my side. So if you strongly feel we should keep it as is, I'm fine with it.