Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

observer completion improvement #1529

Merged
merged 1 commit into from
Jan 10, 2025

Conversation

wintonzheng
Copy link
Contributor

@wintonzheng wintonzheng commented Jan 10, 2025

Important

Enhances observer completion by adding current_information key and using screenshots for goal achievement checks in observer.j2, observer_check_completion.j2, and observer_service.py.

  • Behavior:
    • Added current_information key to JSON response in observer.j2 and observer_check_completion.j2 to list extracted information and identify missing data.
    • Updated logic in observer_check_completion.j2 to use task history and screenshots to determine if the user goal is achieved when information extraction is not required.
  • Services:
    • In observer_service.py, added logic to capture and use screenshots for observer completion checks.
    • Enhanced run_observer_cruise_helper() to handle website scraping and screenshot capturing for completion validation.

This description was created by Ellipsis for c1c6f42. It will automatically update as commits are pushed.

<!-- ELLIPSIS_HIDDEN -->

> [!IMPORTANT]
> Enhance observer completion by adding `current_information` key to JSON responses and improving goal achievement logic in prompts and `observer_service.py`.
>
>   - **Prompts**:
>     - Add `current_information` key to JSON response in `observer.j2` and `observer_check_completion.j2` to list extracted information and identify missing data.
>     - Update `user_goal_achieved` logic in `observer_check_completion.j2` to consider task history and screenshots when information extraction is not required.
>   - **Services**:
>     - In `observer_service.py`, add logic to capture screenshots for observer completion check when `block_result.success` is `True`.
>     - Log a warning if website scraping fails during observer completion check in `run_observer_cruise_helper()`.
>
> <sup>This description was created by </sup>[<img alt="Ellipsis" src="https://img.shields.io/badge/Ellipsis-blue?color=175173">](https://www.ellipsis.dev?ref=Skyvern-AI%2Fskyvern-cloud&utm_source=github&utm_medium=referral)<sup> for 106a40fb7c10167bc4faf4f7dc103fd4cb616fe6. It will automatically update as commits are pushed.</sup>

<!-- ELLIPSIS_HIDDEN -->
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on c1c6f42 in 27 seconds

More details
  • Looked at 69 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. skyvern/forge/prompts/skyvern/observer.j2:17
  • Draft comment:
    Typo in the comment: 'inforamtion' should be 'information'.
  "current_information": str, // List the information extracted so far. Think what information has been collected that's helpful and relevant to the user goal, and what information is missing if any.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR introduces a new key 'current_information' in the JSON response format. However, there is a typo in the word 'information' in the comment for this key.
2. skyvern/forge/prompts/skyvern/observer_check_completion.j2:8
  • Draft comment:
    Typo in the comment: 'inforamtion' should be 'information'.
  "current_information": str, // List the information extracted so far. Think what information has been collected that's helpful and relevant to the user goal, and what information is missing if any.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR introduces a new key 'current_information' in the JSON response format. However, there is a typo in the word 'information' in the comment for this key in both files.
3. skyvern/forge/sdk/services/observer_service.py:553
  • Draft comment:
    Consider recording the screenshots in the database for future reference or debugging purposes.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR adds a new feature to scrape the website and get screenshots for observer completion check. However, the screenshots are not being recorded in the database, which might be necessary for future reference or debugging.

Workflow ID: wflow_hszBTu0IizIIDwcW


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to c1c6f42 in 42 seconds

More details
  • Looked at 69 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. skyvern/forge/prompts/skyvern/observer.j2:17
  • Draft comment:
    Typo: 'inforamtion' should be 'information'. This typo is present in multiple files.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The word 'information' is misspelled in multiple places across the codebase. This should be corrected for clarity and professionalism.
2. skyvern/forge/sdk/services/observer_service.py:553
  • Draft comment:
    Consider calling _record_thought_screenshot here to capture the observer's thought process for debugging and auditing purposes.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The code was refactored to handle screenshots differently. Instead of using _record_thought_screenshot, it now gets screenshots through scrape_website and passes them directly to the LLM handler. This seems to be an intentional architectural change. The old way of recording screenshots was removed in favor of this new approach. The comment suggests going back to the old way, which would conflict with the new design.
    Maybe there's a reason we want both screenshot mechanisms - perhaps they serve different purposes? The old function might capture something that the new approach misses.
    Looking at _record_thought_screenshot's implementation, it does the same thing - takes a screenshot and saves it. The new approach is more integrated, getting screenshots as part of the scraping process. Having both would be redundant.
    The comment should be deleted. The code was intentionally refactored to handle screenshots differently, and suggesting to add back the old method would be counter to this architectural change.

Workflow ID: wflow_zGGcOf6hpRVWHQx9


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@wintonzheng wintonzheng merged commit 5b33f5e into main Jan 10, 2025
6 checks passed
@wintonzheng wintonzheng deleted the shu/observer_user_goal_check_improvement branch January 10, 2025 08:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant