Downloading PDF files freeze the agent #1346

Closed
n-sviridenko opened this issue Dec 7, 2024 · 8 comments

@n-sviridenko

When the agent clicks a PDF link to download the file, Chromium's built-in PDF viewer opens the PDF in a new tab. The agent then loses the initial page (which still has other links to click) and never returns to it. It just hangs on that PDF page forever.

@n-sviridenko
Author

To reproduce it, create a page with many links to PDF files and task Skyvern with clicking all of them.
It will only visit the first PDF.
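
For anyone who wants a quick test page, here is a minimal sketch that generates one. The function name `build_pdf_links_page` and the example URLs are hypothetical and not part of Skyvern; any static page with several PDF links should behave the same way.

```python
from html import escape
from typing import Iterable


def build_pdf_links_page(pdf_urls: Iterable[str]) -> str:
    """Return a small HTML page with one plain link per PDF URL."""
    items = "\n".join(
        # Escape the URL so query strings with "&" stay valid HTML.
        f'      <li><a href="{escape(url, quote=True)}">{escape(url)}</a></li>'
        for url in pdf_urls
    )
    return (
        "<!doctype html>\n"
        "<html>\n"
        "  <body>\n"
        "    <ul>\n"
        f"{items}\n"
        "    </ul>\n"
        "  </body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Placeholder URLs; replace them with real PDF links before testing.
    print(build_pdf_links_page([
        "https://example.test/invoice-1.pdf",
        "https://example.test/invoice-2.pdf",
    ]))
```

Serve the output from any static file server and point the task at that page.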

@n-sviridenko
Author

n-sviridenko commented Dec 7, 2024

By the way, we also need to download the PDFs (and access them later, see #1347) instead of just viewing them.

@n-sviridenko
Author

I was also trying to see how downloading invoices is currently implemented: https://docs.skyvern.com/getting-started/skyvern-in-action#log-into-a-portal-and-download-invoices

@n-sviridenko
Author

I assume the implementation would be:

  1. Subscribe to newly opened pages (before await locator.click).
  2. Check whether a new page was just opened.
  3. If so, try fetching its URL (a HEAD request, but with the cookies of the current session, e.g. by calling fetch).
  4. If the Content-Type header says it is a PDF, go back to the previous page and download the file programmatically using the same URL and the cookies from the current session.
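
The four steps above can be sketched roughly as follows. This is only an illustration, assuming Playwright's async Python API (`context.on("page", ...)`, `context.request.head`/`get`, `page.bring_to_front`); it is not Skyvern's actual code. The names `click_and_save_pdf`, `is_pdf_content_type`, `target` and `wait_s` are made up for this example, and the polling wait is a simplification.

```python
import asyncio
from pathlib import Path
from typing import Any, List, Optional


def is_pdf_content_type(content_type: Optional[str]) -> bool:
    """Return True if a Content-Type header value denotes a PDF."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=binary" before comparing.
    return content_type.split(";", 1)[0].strip().lower() == "application/pdf"


async def click_and_save_pdf(page: Any, locator: Any, target: Path,
                             wait_s: float = 2.0) -> bool:
    """Click `locator`; if that opens a PDF in a new tab, save it to `target`.

    `page` and `locator` are assumed to be Playwright async Page and Locator
    objects. Returns True if a PDF was saved, False otherwise.
    """
    context = page.context
    opened: List[Any] = []

    def on_page(new_page: Any) -> None:
        opened.append(new_page)

    # Step 1: subscribe to newly opened pages before clicking.
    context.on("page", on_page)
    try:
        await locator.click()
        # Step 2: poll briefly to see whether the click opened a new tab.
        waited = 0.0
        while not opened and waited < wait_s:
            await asyncio.sleep(0.1)
            waited += 0.1
    finally:
        context.remove_listener("page", on_page)

    if not opened:
        return False

    new_page = opened[0]
    url = new_page.url
    # Step 3: HEAD request via the context's request API, which uses the
    # browser context's cookies.
    head = await context.request.head(url)
    if not is_pdf_content_type(head.headers.get("content-type")):
        return False

    # Step 4: close the viewer tab, return to the original page, and download
    # the same URL with the same session cookies.
    await new_page.close()
    await page.bring_to_front()
    response = await context.request.get(url)
    target.write_bytes(await response.body())
    return True
```

If the new tab is not a PDF, the sketch leaves it open and returns False, so the caller can decide what to do with it. Some servers do not answer HEAD requests properly, in which case a GET with a check on the response headers would be the fallback.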

Place to change:

async def handle_click_to_download_file_action(

This was referenced Dec 7, 2024
@suchintan
Contributor

We actually have this implemented in Skyvern cloud, similar to what you did here: https://github.com/Skyvern-AI/skyvern/pull/1349/files. I think we just forgot to bring it to the open source repo.

@LawyZheng can you please review https://github.com/Skyvern-AI/skyvern/pull/1349/files and add anything that is missing? We should open source our PDF handling logic.

Collaborator

I just left some comments a few minutes ago.

@suchintan
Contributor

suchintan commented Dec 10, 2024

@LawyZheng let's get our closed-source changes for PDFs ported over to the open source version.

@LawyZheng
Collaborator

Fixed in #1363
