Commit
V0.3.2 Minor Model Improvements + DAG Fixes Due to Data Source Change (#326)

### Bug Fixes
- DAG: ask_astro_load_astro_cli_docs failure
```
df["content"] = df["content"].apply(enforce_max_token_len)
KeyError: 'content'
```
- DAG: ask_astro_load_stackoverflow failure
```
page above 25 requires access token or app key
```
- DAG: ask_astro_load_blogs failure
```
File "/usr/local/airflow/include/tasks/extract/blogs.py", line 56, in <lambda>
    lambda x: BeautifulSoup(x, "lxml").find(class_="post-card__meta").find(class_="title").get_text()
AttributeError: 'NoneType' object has no attribute 'find'
```
  The Astro Blogs page formatting has changed.
- Astro Docs ingest DAG
  The DAG was still using the outdated URL doc.astronomer.io, but Astronomer has moved to www.astronomer.io/docs.

### Minor Improvements
- Remove GitHub issues from the ingest sources
  - This has been adding nothing but noise. Most closed issues are bug reports that have already been fixed, and retrieving them causes the LLM to think the bug persists.
- GitHub Registry Docs Reformat
  What Ask Astro previously ingested from the registry gave the LLM no useful insight at all.
  - How does the LLM know how to use this anyway?
  - Add operator usage and param type details

  Example of what we had before:
```
# Registry
## Provider: astro-sdk-python
Version: 1.8.0
Module: dataframe
Module Description: This decorator will allow users to write python functions while treating SQL tables as dataframes.
```
- Upgrade from Cohere Rerank 2 to Rerank 3
  - Cohere emailed us asking if we could move to Rerank 3. It's cheaper, better, and faster.
- Upgrade from GPT-4 Turbo to GPT-4o
- System Prompt Changes
  - Better LLM filter as the last step to get rid of unhelpful documents
  - Ask the LLM not to include URLs that do not explicitly appear in the context
  - Ask the LLM to explicitly cite sources whenever possible. This overrides the LLM stuffing template and function in LangChain so that the DocLink and Document # are passed into the LLM.
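To illustrate the last system prompt change, here is a minimal sketch of what "stuffing" documents with a Document # and DocLink could look like. It is plain Python and does not use LangChain; `RetrievedDoc`, `stuff_documents`, the label layout, and the sample contents are all hypothetical and are not taken from the commit's actual code.

```python
from dataclasses import dataclass


@dataclass
class RetrievedDoc:
    """Hypothetical stand-in for a retrieved document (not the project's real type)."""
    content: str
    doc_link: str


def stuff_documents(docs: list[RetrievedDoc]) -> str:
    """Join documents into one context string, labelling each with a number and its link."""
    blocks = []
    for number, doc in enumerate(docs, start=1):
        # The "Document #" / "DocLink:" labels are an assumed layout for illustration.
        blocks.append(
            f"Document #{number}\nDocLink: {doc.doc_link}\n{doc.content.strip()}"
        )
    return "\n\n".join(blocks)


# Placeholder documents; the text and the second URL are made up for the example.
docs = [
    RetrievedDoc("First placeholder passage.", "https://www.astronomer.io/docs"),
    RetrievedDoc("Second placeholder passage.", "https://example.com/registry"),
]
context = stuff_documents(docs)
print(context)
```

With each passage carrying its own number and link, a prompt can ask the model to cite "Document #2" or to reuse only links that appear in the context, which is the behavior the system prompt changes describe.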
Showing 17 changed files with 178 additions and 217 deletions.