make sure we do a web scrape before data extraction at the end of task to ensure the refreshness of the scraped data #1152

Merged
wintonzheng merged 1 commit into main from shu/scrape_before_data_extraction_ensure on Nov 7, 2024

Conversation

wintonzheng
Contributor

@wintonzheng wintonzheng commented Nov 7, 2024

Important

Ensure data scraping is refreshed with screenshots before data extraction in agent.py, handler.py, and scraper.py.

  • Behavior:
    • Re-scrape the page before data extraction at the end of a task in agent.py and handler.py, so extraction runs on fresh data.
    • Modify refresh() in scraper.py to always include screenshots.
  • Prompts:
    • Update check-user-goal.j2 to include screenshots in user goal verification.
  • Functions:
    • Remove with_screenshot parameter from refresh() and scrape_website() in scraper.py.

This description was created by Ellipsis for f0f6d49. It will automatically update as commits are pushed.
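
For readers skimming the change, here is a minimal sketch of the pattern the description outlines: re-scrape right before extraction, and have the refresh always capture screenshots instead of taking a `with_screenshot` flag. The `ScrapedPage` class, its fields, `extract_data`, and `run_example` below are hypothetical stand-ins written for illustration. They are not Skyvern's actual classes or signatures, and the real scraper drives a browser rather than fabricating strings.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field


@dataclass
class ScrapedPage:
    """Hypothetical snapshot of a page (illustration only, not Skyvern's class)."""

    html: str
    screenshots: list[bytes] = field(default_factory=list)
    version: int = 0

    async def refresh(self) -> ScrapedPage:
        # Assumption: refresh takes no with_screenshot flag and always
        # returns a new snapshot that includes screenshots.
        new_version = self.version + 1
        return ScrapedPage(
            html=f"<html>state {new_version}</html>",
            screenshots=[f"screenshot-{new_version}".encode()],
            version=new_version,
        )


async def extract_data(page: ScrapedPage) -> dict:
    """Hypothetical extraction step: re-scrape first, then read only the fresh snapshot."""
    refreshed = await page.refresh()
    # Read html and screenshots from `refreshed`, never from the stale `page`.
    return {
        "html": refreshed.html,
        "screenshot_count": len(refreshed.screenshots),
        "version": refreshed.version,
    }


def run_example() -> dict:
    # Start from a stale snapshot that has no screenshots at all.
    stale = ScrapedPage(html="<html>state 0</html>")
    return asyncio.run(extract_data(stale))
```

Calling `run_example()` starts from a snapshot without screenshots and returns data built from the refreshed one. The point of the sketch is only that extraction reads from the refreshed object, which is also what the unposted draft comment on handler.py was asking about.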

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

👍 Looks good to me! Reviewed everything up to 2861705 in 1 minute and 13 seconds

More details
  • Looked at 127 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 drafted comments based on config settings.
1. skyvern/webeye/scraper/scraper.py:269
  • Draft comment:
    The scrape_website function has a with_screenshot parameter that is no longer used. Consider removing it for clarity and consistency.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
2. skyvern/webeye/scraper/scraper.py:297
  • Draft comment:
    The scrape_website function has a with_screenshot parameter that is no longer used. Consider removing it for clarity and consistency.
  • Reason this comment was not posted:
    Marked as duplicate.
3. skyvern/webeye/scraper/scraper.py:319
  • Draft comment:
    The scrape_website function has a with_screenshot parameter that is no longer used. Consider removing it for clarity and consistency.
  • Reason this comment was not posted:
    Marked as duplicate.
4. skyvern/webeye/scraper/scraper.py:366
  • Draft comment:
    The scrape_website function has a with_screenshot parameter that is no longer used. Consider removing it for clarity and consistency.
  • Reason this comment was not posted:
    Marked as duplicate.

Workflow ID: wflow_eKVNdDaOXRFEpmqs


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@wintonzheng wintonzheng force-pushed the shu/scrape_before_data_extraction_ensure branch from 2861705 to f0f6d49 Compare November 7, 2024 06:34
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

👍 Looks good to me! Incremental review on f0f6d49 in 34 seconds

More details
  • Looked at 135 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comment based on config settings.
1. skyvern/webeye/actions/handler.py:2252
  • Draft comment:
    Ensure scraped_page_refreshed.screenshots is used instead of scraped_page.screenshots to maintain consistency with refreshed data.
  • Reason this comment was not posted:
    Comment did not seem useful.

Workflow ID: wflow_8xUILmY2vz6eUR3p


@wintonzheng wintonzheng changed the title from "make sure we do a data scrape before data extraction at the end of task to ensure the refreshness of the scraped data" to "make sure we do a web scrape before data extraction at the end of task to ensure the refreshness of the scraped data" on Nov 7, 2024
@wintonzheng wintonzheng merged commit c80597e into main Nov 7, 2024
2 checks passed
@wintonzheng wintonzheng deleted the shu/scrape_before_data_extraction_ensure branch November 7, 2024 06:55
2 participants