observer completion improvement #1529

wintonzheng · 2025-01-10T08:13:41Z

Important

Enhances observer completion by adding current_information key and using screenshots for goal achievement checks in observer.j2, observer_check_completion.j2, and observer_service.py.

Behavior:
- Added current_information key to JSON response in observer.j2 and observer_check_completion.j2 to list extracted information and identify missing data.
- Updated logic in observer_check_completion.j2 to use task history and screenshots to determine if the user goal is achieved when information extraction is not required.
Services:
- In observer_service.py, added logic to capture and use screenshots for observer completion checks.
- Enhanced run_observer_cruise_helper() to handle website scraping and screenshot capturing for completion validation.

^{This description was created by}^{for c1c6f42. It will automatically update as commits are pushed.}

> [!IMPORTANT] > Enhance observer completion by adding `current_information` key to JSON responses and improving goal achievement logic in prompts and `observer_service.py`. > > - **Prompts**: > - Add `current_information` key to JSON response in `observer.j2` and `observer_check_completion.j2` to list extracted information and identify missing data. > - Update `user_goal_achieved` logic in `observer_check_completion.j2` to consider task history and screenshots when information extraction is not required. > - **Services**: > - In `observer_service.py`, add logic to capture screenshots for observer completion check when `block_result.success` is `True`. > - Log a warning if website scraping fails during observer completion check in `run_observer_cruise_helper()`. > > <sup>This description was created by </sup>[<img alt="Ellipsis" src="https://img.shields.io/badge/Ellipsis-blue?color=175173">](https://www.ellipsis.dev?ref=Skyvern-AI%2Fskyvern-cloud&utm_source=github&utm_medium=referral)<sup> for 106a40fb7c10167bc4faf4f7dc103fd4cb616fe6. It will automatically update as commits are pushed.</sup>

ellipsis-dev

👍 Looks good to me! Incremental review on c1c6f42 in 27 seconds

More details

Looked at 69 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. skyvern/forge/prompts/skyvern/observer.j2:17

Draft comment:
Typo in the comment: 'inforamtion' should be 'information'.

  "current_information": str, // List the information extracted so far. Think what information has been collected that's helpful and relevant to the user goal, and what information is missing if any.

Reason this comment was not posted:
Confidence changes required: 50%
The PR introduces a new key 'current_information' in the JSON response format. However, there is a typo in the word 'information' in the comment for this key.

2. skyvern/forge/prompts/skyvern/observer_check_completion.j2:8

Draft comment:
Typo in the comment: 'inforamtion' should be 'information'.

  "current_information": str, // List the information extracted so far. Think what information has been collected that's helpful and relevant to the user goal, and what information is missing if any.

Reason this comment was not posted:
Confidence changes required: 50%
The PR introduces a new key 'current_information' in the JSON response format. However, there is a typo in the word 'information' in the comment for this key in both files.

3. skyvern/forge/sdk/services/observer_service.py:553

Draft comment:
Consider recording the screenshots in the database for future reference or debugging purposes.
Reason this comment was not posted:
Confidence changes required: 50%
The PR adds a new feature to scrape the website and get screenshots for observer completion check. However, the screenshots are not being recorded in the database, which might be necessary for future reference or debugging.

Workflow ID: wflow_hszBTu0IizIIDwcW

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Reviewed everything up to c1c6f42 in 42 seconds

More details

Looked at 69 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. skyvern/forge/prompts/skyvern/observer.j2:17

Draft comment:
Typo: 'inforamtion' should be 'information'. This typo is present in multiple files.
Reason this comment was not posted:
Confidence changes required: 50%
The word 'information' is misspelled in multiple places across the codebase. This should be corrected for clarity and professionalism.

2. skyvern/forge/sdk/services/observer_service.py:553

Draft comment:
Consider calling _record_thought_screenshot here to capture the observer's thought process for debugging and auditing purposes.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The code was refactored to handle screenshots differently. Instead of using _record_thought_screenshot, it now gets screenshots through scrape_website and passes them directly to the LLM handler. This seems to be an intentional architectural change. The old way of recording screenshots was removed in favor of this new approach. The comment suggests going back to the old way, which would conflict with the new design.
Maybe there's a reason we want both screenshot mechanisms - perhaps they serve different purposes? The old function might capture something that the new approach misses.
Looking at _record_thought_screenshot's implementation, it does the same thing - takes a screenshot and saves it. The new approach is more integrated, getting screenshots as part of the scraping process. Having both would be redundant.
The comment should be deleted. The code was intentionally refactored to handle screenshots differently, and suggesting to add back the old method would be counter to this architectural change.

Workflow ID: wflow_zGGcOf6hpRVWHQx9

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

wintonzheng added the sync label Jan 10, 2025

ellipsis-dev bot reviewed Jan 10, 2025

View reviewed changes

wintonzheng merged commit 5b33f5e into main Jan 10, 2025
6 checks passed

wintonzheng deleted the shu/observer_user_goal_check_improvement branch January 10, 2025 08:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

observer completion improvement #1529

observer completion improvement #1529

wintonzheng commented Jan 10, 2025 •

edited by ellipsis-dev bot

Loading

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

observer completion improvement #1529

observer completion improvement #1529

Conversation

wintonzheng commented Jan 10, 2025 • edited by ellipsis-dev bot Loading

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

wintonzheng commented Jan 10, 2025 •

edited by ellipsis-dev bot

Loading