added type guards for MemoryFileSystem and name property to MemoryFile class #1574
Do you have time to add a couple of simple tests?
I am not certain why you would use Path with fsspec, since it is really meant for local filesystem files, and universal_pathlib (upath) attempts to do this for fsspec. Still, I'd rather not get in users' way :)
I'm thinking we can add a test for retrieving a file using Path. Is there anything else I'm missing?
Hi @martindurant, I added a test for the Path changes. Please let me know if you want me to do anything else!
On requiring a |
Co-authored-by: Martin Durant <martindurant@users.noreply.github.com>
Good to know. I'm not using fsspec directly. I'm looking at the implementation of As for depending on Maybe we can roll back the |
OK, that sounds like a plan - let me know what the response is.
Will do. In the meantime, I rolled back the |
Co-authored-by: Martin Durant <martindurant@users.noreply.github.com>
I pulled your changes where we check if the path is an instance of Path. Would it be more reasonable to only accept PurePosixPaths? I assume that's the format memfs uses anyway. We wouldn't have to worry about the OS specific edge cases.
I don't suppose people use this, and you would be back in the same situation as before. But yes, memfs in some places (e.g., directory components) assumes the path-string is posix-like.
People are using the library in this way. I'm essentially trying to fix using this class with memfs. I don't understand what situation I'm back in, can you please explain?
I mean, people passing Path rather than what we expect. If you instead make a case handling PurePosixPath, that would still leave Path unhandled.
I think the issue is that some implementations in fsspec accept a Path object in As a result, the code I linked to will either work or not work depending on the selected implementation of AbstractFileSystem. And I agree, half-handling path may not be as good as I initially thought. The idea was to support a Path object that makes sense for memfs.
I think you should call stringify_path and don't worry about Windows Paths looking strange - if calling code doesn't want that, they can either make their own strings or use posix paths.
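To illustrate the suggestion above, here is a minimal sketch of what stringifying each pathlib flavour produces. It uses os.fspath from the standard library as a stand-in for fsspec's stringify_path (an assumption: for pathlib objects the two are expected to give the same string), and the example path is made up.

```python
import os
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical example path; os.fspath stands in for fsspec's stringify_path.
posix = os.fspath(PurePosixPath("/myfile/foo/bar"))
windows = os.fspath(PureWindowsPath("/myfile/foo/bar"))

print(posix)    # forward slashes
print(windows)  # backslashes: the "strange looking" Windows case

# Calling code that wants posix-style strings can convert before
# handing the path to the filesystem.
as_posix = PureWindowsPath("/myfile/foo/bar").as_posix()
print(as_posix)
```

The last line shows the "make their own strings" option: the caller decides the string form, so the filesystem does not have to guess.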
The calling code is using PurePosixPaths if the file system isn't LocalFS. I changed memfs in my latest commit to check if it's a pure posix path and stringify it if it is. Does this work for you?
Why not pathlib.PurePath, then? That would catch Path and the expected PurePosixPath.
Sure, I can make that change.
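The class hierarchy behind this suggestion can be checked with the standard library alone. The sketch below is only an illustration; is_pathlib_object is a hypothetical helper name, not something from fsspec.

```python
from pathlib import Path, PurePath, PurePosixPath, PureWindowsPath


def is_pathlib_object(path):
    """Hypothetical helper: True for any pathlib path, pure or concrete."""
    return isinstance(path, PurePath)


checks = {
    "PurePosixPath": is_pathlib_object(PurePosixPath("/a/b")),
    "PureWindowsPath": is_pathlib_object(PureWindowsPath("/a/b")),
    "Path": is_pathlib_object(Path("/a/b")),  # PosixPath or WindowsPath, per OS
    "str": is_pathlib_object("/a/b"),
}
print(checks)

# A narrower PurePosixPath check misses Windows-flavoured paths,
# which is what Path(...) produces when running on Windows.
narrow_check = isinstance(PureWindowsPath("/a/b"), PurePosixPath)
print(narrow_check)
```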
There's an issue handling PureWindowsPath. make_posix_path depends on the current OS, and not the type of the input. This means that if I call LocalFileSystem._strip_protocol with a PurePosixPath on Windows, it will convert the posix path to a Windows path and then try to convert it back to posix. I'm ending up with a value like "/c:/myfile/foo/bar" instead of "/myfile/foo/bar". I'd consider changing make_posix_path to check the type of the input, but I want to get your opinion.
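One way to picture how a value like "/c:/myfile/foo/bar" can arise is sketched below. This is not fsspec's make_posix_path; simulate_windows_round_trip and the "c:/" working drive are assumptions chosen only to imitate, with pure paths, a conversion to a Windows path and back.

```python
from pathlib import PurePosixPath, PureWindowsPath


def simulate_windows_round_trip(path, cwd_drive="c:/"):
    """Hypothetical illustration only, not fsspec's make_posix_path."""
    # Reinterpret the posix path as a Windows path. It has a root but no
    # drive, so joining it onto the assumed working drive keeps that drive.
    win = PureWindowsPath(cwd_drive) / PureWindowsPath(str(path))
    # Convert back to forward slashes and add the leading slash.
    return "/" + win.as_posix()


result = simulate_windows_round_trip(PurePosixPath("/myfile/foo/bar"))
print(result)
```

Under these assumptions the drive letter enters at the join step, which depends on the platform's notion of a current drive rather than on the type of the input.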
Is there really any problem with paths looking like "/c:/myfile/foo/bar"?
I think so. Intuitively, I'd expect PosixPath("/myfile/foo/bar") and WindowsPath("/myfile/foo/bar") to refer to the same file. I suppose we won't have a situation where someone's using both WindowsPath and PosixPath interchangeably. Maybe we refactor the test, and do one for Windows path and one for posix path?
... in the memoryFS? I'm not sure why we need to make such guarantees.
… in memoryfs paths
Ok, I allowed for "/c:/.." in the paths and made tests for both posix and Windows.
Before I disappear back into the woodwork, thank you for taking the time to review my changes and offering feedback. Also, thank you for creating and maintaining this library in general.
Hi everyone,
I've made small changes to the MemoryFileSystem class. The goal of this PR is to fix issues that prevent users from using this class with the SimpleDirectoryReader from llama_index.

The first issue is that SimpleDirectoryReader will create Path objects and pass them to fs.open(...), but the MemoryFileSystem class assumes path will be a string. I added a quick call to stringify_path in _strip_protocol to fix this issue.

The second issue is llamaindex assumes the file will have a name property, so I added this to the MemoryFile class.
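The two changes described above can be sketched with standard-library stand-ins. TinyMemoryFS and TinyMemoryFile are hypothetical classes written for this illustration, not fsspec's MemoryFileSystem or MemoryFile, and os.fspath is used in place of stringify_path.

```python
import io
import os
from pathlib import PurePosixPath


class TinyMemoryFile(io.BytesIO):
    """Hypothetical stand-in for MemoryFile that exposes a name property."""

    def __init__(self, path, data=b""):
        super().__init__(data)
        self.path = path

    @property
    def name(self):
        # Callers such as directory readers may expect file objects to have .name
        return self.path


class TinyMemoryFS:
    """Hypothetical stand-in for MemoryFileSystem, keyed by path strings."""

    def __init__(self):
        self.store = {}

    @staticmethod
    def _strip_protocol(path):
        # Accept str or pathlib objects; os.fspath stands in for stringify_path.
        path = os.fspath(path)
        if path.startswith("memory://"):
            path = path[len("memory://"):]
        return "/" + path.strip("/")

    def write(self, path, data):
        self.store[self._strip_protocol(path)] = data

    def open(self, path):
        key = self._strip_protocol(path)
        return TinyMemoryFile(key, self.store[key])


fs = TinyMemoryFS()
fs.write("memory://docs/a.txt", b"hello")
f = fs.open(PurePosixPath("/docs/a.txt"))  # a path object, not a str
print(f.name, f.read())
```

Normalising in one place (_strip_protocol) means every entry point that routes through it accepts both strings and path objects.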