Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Files should now be supplied as file(...) in the Client, and some fixes to gr.load() as well #7575

Merged
merged 89 commits into from
Mar 8, 2024

Conversation

abidlabs
Copy link
Member

@abidlabs abidlabs commented Feb 29, 2024

This PR makes the following changes:

  • Adds a file method to gradio_client that all files/urls should be wrapped in
  • Allows the current behavior (filepath provided as a string), but in this case a warning is raised
  • Correctly handles files when using gr.load()
  • Updates all of the example_payload() functions in the components and adds a test for these
  • Fixes the View API page to show the correct syntax
  • Updates the documentation to show the new syntax

Fixes: #7360

Note: this PR also makes some fixes to gr.load() so that Gradio apps with .then() events can be loaded correctly. Starting to fix #7362, but a complete fix will come later to not blow this PR up too much

@gradio-pr-bot
Copy link
Collaborator

gradio-pr-bot commented Feb 29, 2024

🪼 branch checks and previews

Name Status URL
Spaces ready! Spaces preview
Website ready! Website preview
Storybook failed! Details
🦄 Changes detected! Details

Install Gradio from this PR

pip install https://gradio-builds.s3.amazonaws.com/3cce5cbc90cc2d5e34216312e38e8561af2fe022/gradio-4.20.1-py3-none-any.whl

Install Gradio Python Client from this PR

pip install "gradio-client @ git+https://github.com/gradio-app/gradio@3cce5cbc90cc2d5e34216312e38e8561af2fe022#subdirectory=client/python"

@gradio-pr-bot
Copy link
Collaborator

gradio-pr-bot commented Feb 29, 2024

🦄 change detected

This Pull Request includes changes to the following packages.

Package Version
@gradio/app minor
@gradio/client minor
gradio minor
gradio_client minor
  • Maintainers can select this checkbox to manually select packages to update.

With the following changelog entry.

Files should now be supplied as file(...) in the Client, and some fixes to gr.load() as well

Maintainers or the PR author can modify the PR title to modify this entry.

Something isn't right?

  • Maintainers can change the version label to modify the version bump.
  • If the bot has failed to detect any changes, or if this pull request needs to update multiple packages to different versions or requires a more comprehensive changelog entry, maintainers can update the changelog file directly.

@abidlabs abidlabs changed the title Require files to be explicitly provided as File(...) in the Client Recommend files to be explicitly provided as File(...) in the Client Mar 2, 2024
@abidlabs abidlabs changed the title Recommend files to be explicitly provided as File(...) in the Client Recommends files to be explicitly provided as File(...) in the Client, and makes some fixes to gr.load() as well Mar 2, 2024
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept accidentally editing the EndpointV3Compatibility class in client.py instead of the Endpoint class so moved it out to its own separate file. Makes the core client.py a bit smaller and easier to grasp.

@@ -898,13 +914,18 @@ def get_type(schema: dict):
raise APIInfoParseError(f"Cannot parse type for {schema}")


OLD_FILE_DATA = "Dict(path: str, url: str | None, size: int | None, orig_name: str | None, mime_type: str | None)"
FILE_DATA = "Dict(path: str, url: str | None, size: int | None, orig_name: str | None, mime_type: str | None, is_stream: bool)"
FILE_DATA_FORMATS = [
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can mostly remove this in 4.0 when we don't need to use the old data formats to determine if a component has any filedata in its payload

@abidlabs
Copy link
Member Author

abidlabs commented Mar 7, 2024

Ok I did a lot of cleanup of the client.py file and now everything should be working! At this point, this PR only fixes: #7360, but it should make other issues related to gr.load() a lot easier to fix. Will tackle those in future PRs

I've also added the flaky-tests labels so that we can make sure that this PR doesn't break compatibility with older versions of Gradio.

@abidlabs abidlabs added the flaky-tests This label runs flaky tests (those that use the HF API) on a PR label Mar 7, 2024
@abidlabs abidlabs marked this pull request as ready for review March 7, 2024 23:42
Copy link
Collaborator

@freddyaboulton freddyaboulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @abidlabs ! Still having an issue with gr.load ( i know more fixes are coming) but thought i'd let you know cause I think it should be working given the changes in this PR.

@@ -164,6 +164,7 @@ class FileData(GradioModel):
orig_name: Optional[str] = None # original filename
mime_type: Optional[str] = None
is_stream: bool = False
meta: dict = {"_type": "gradio.FileData"}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it have to be a dict here? I think just a string value should suffice. Might make FILE_DATA_FORMATS a bit cleaner by not having meta: Dict()`.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok with it as-is just wondering if it can be simplified

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea originally I had it

  • meta: "gradio.FileData"

but felt that it didn't signal "internal use" clearly. I also considered

  • "_type": "gradio.FileData"

but pydantic ignores props that begin with "_" when dumping to a json.

So I figured this was the clearest and sufficiently "unique" that it would be unlikely to cause issues down the line

@@ -418,7 +418,7 @@ def from_spaces(


def from_spaces_blocks(space: str, hf_token: str | None) -> Blocks:
client = Client(space, hf_token=hf_token, download_files=False)
client = Client(space, hf_token=hf_token, upload_files=False, download_files=False)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still not working for me

image

Looking at the logs in the space, I think the file is not being uploaded

  File "/home/user/.local/lib/python3.10/site-packages/gradio_client/utils.py", line 982, in traverse
    return func(json_obj)
  File "/home/user/.local/lib/python3.10/site-packages/gradio/processing_utils.py", line 267, in _move_to_cache
    temp_file_path = move_resource_to_block_cache(payload.path, block)
  File "/home/user/.local/lib/python3.10/site-packages/gradio/processing_utils.py", line 236, in move_resource_to_block_cache
    return block.move_resource_to_block_cache(url_or_file_path)
  File "/home/user/.local/lib/python3.10/site-packages/gradio/blocks.py", line 257, in move_resource_to_block_cache
    temp_file_path = processing_utils.save_file_to_cache(
  File "/home/user/.local/lib/python3.10/site-packages/gradio/processing_utils.py", line 170, in save_file_to_cache
    temp_dir = hash_file(file_path)
  File "/home/user/.local/lib/python3.10/site-packages/gradio/processing_utils.py", line 102, in hash_file
    with open(file_path, "rb") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/private/var/folders/t1/j7cmtcgd0mx43jh9nj_r9mmw0000gn/T/gradio/c07fe25f20271b76caa3103105d3fb0158760d27/3262_2.png'

predictions = self.reduce_singleton_output(*predictions)
return predictions

def download_files(self, *data) -> tuple:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we need to introduce the new protocol version just for the download files step. I think we can achieve the same behavior without having to introduce a new protocol by combining is_file_obj_with_meta and is_file_obj.

Like

def is_file_obj_new(data):
    if is_file_object_with_meta(data):
        # for sure a file
    elif is_file_object(data): 
        # could be a file
    else:
         # for sure not a file

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm we could do this, but there's still a chance of false positives. Let's say that your function returns a json that is of this format: {"path": "/url/to/image"}, which is not that unlikely

This will fail because is_file_obj_new(data) will parse the json dict {"path": "/url/to/image"} as a file and try to download it.

client/python/gradio_client/client.py Show resolved Hide resolved
@abidlabs
Copy link
Member Author

abidlabs commented Mar 8, 2024

It's still not working for me
Looking at the logs in the space, I think the file is not being uploaded

Can you share the code that you are running locally?

Edit: Ah okay yeah this isn't working for me either:

import gradio as gr

gr.load("spaces/gradio/image_mod").launch()

Will take another look tomorrow!

@abidlabs
Copy link
Member Author

abidlabs commented Mar 8, 2024

Ok gr.load("spaces/gradio/image_mod") should be fixed now. I also fixed the issue with the example images not showing up. Let me test a few other Spaces and I'll ping you again @freddyaboulton if everything looks good

@abidlabs
Copy link
Member Author

abidlabs commented Mar 8, 2024

I tested a bunch of simple Spaces:

  • gr.load("spaces/gradio/image_mod").launch()
  • gr.load("spaces/gradio/video_identity").launch()
  • gr.load("spaces/gradio/fake_gan_2_main").launch()
  • gr.load("spaces/gradio/filter_records").launch()
  • gr.load("spaces/gradio/diff_texts_main").launch()

And these all work now. Some more complex demos are broken:

  • gr.load("spaces/gradio/kitchen_sink").launch()
  • gr.load("spaces/gradio/chatinterface_streaming_echo").launch()

I'll fix these as part of #7362 next.

Copy link
Collaborator

@freddyaboulton freddyaboulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes @abidlabs ! Left two comments but otherwise LGTM!

@@ -275,6 +277,8 @@ def _move_to_cache(d: dict):
f"File {path} is not in the upload folder and cannot be accessed."
)
temp_file_path = block.move_resource_to_block_cache(payload.path)
if keep_in_cache:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should only be adding files to this set, right now we add aanydata (pd.DataFrame, strings etc) if keep_in_cache is True

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm how do you figure that? _move_to_cache is only called if the value is a FileData dictionary, meaning that temp_file_path must be a string path.

Copy link
Member Author

@abidlabs abidlabs Mar 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it to the end of _move_to_cache to be more explicit, but unless I'm missing something, this logic should be valid

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea you're right I was mistaken! Should the predicate in traverse used here be file_obj_with_meta though?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think so. Let me just make sure that doesn't break older apps

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually unfortunately it does seem to break gr.load() with older Spaces that have examples because the format of file data in those examples doesn't have "meta". We may need to able support both. I'll see if there's a clean way to fix this as part of tackling #7362.

@@ -190,9 +190,6 @@ def __init__(
self.scale = scale
self.min_width = min_width
self.interactive = interactive
# Keep tracks of files that should not be deleted when the delete_cache parmaeter is set
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add keep_in_cache=True to move_files_to_cache below

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah thanks

@abidlabs
Copy link
Member Author

abidlabs commented Mar 8, 2024

Thanks @freddyaboulton everything should be addressed now! Will merge in after the CI runs

@abidlabs abidlabs merged commit d0688b3 into main Mar 8, 2024
6 of 7 checks passed
@abidlabs abidlabs deleted the explicit-files branch March 8, 2024 20:29
@pngwn pngwn mentioned this pull request Mar 8, 2024
This was referenced Jun 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flaky-tests This label runs flaky tests (those that use the HF API) on a PR v: minor A change that requires a minor release
Projects
None yet
3 participants