
number validation #277

Merged

Conversation

@Eyobyb (Collaborator) commented Jan 18, 2024

No description provided.

change prompt when number checker fail
@20001LastOrder self-requested a review January 24, 2024 05:14
@20001LastOrder (Collaborator) left a comment:

Thanks for the PR. Please see the comments below.

src/sherpa_ai/agents/qa_agent.py (resolved)
src/sherpa_ai/memory/belief.py (outdated; resolved)
src/sherpa_ai/memory/belief.py (outdated; resolved)
src/sherpa_ai/output_parsers/number_validation.py (outdated; resolved)
```python
@pytest.mark.parametrize(
    "objective , input_data, expected_numbers",
    [
        # (
```
@20001LastOrder (Collaborator) commented:

Are these commented inputs active? If not, they should be removed. I think they make the test case very hard to understand.

@Eyobyb (Collaborator, Author) replied:

They are active, but running them all at the same time causes them to fail, @20001LastOrder. So if you have a suggestion for a way to keep them, since they each cover a different aspect of the test, that would be nice.

@20001LastOrder replied:

Hmm, this is quite strange, because each test case is meant to be isolated. I'll try to allocate some time to see if I can figure out why this happens.
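
One way to keep all the cases while isolating the ones that only fail when run together is pytest's built-in `pytest.param` marks — a minimal sketch, using placeholder values rather than the PR's actual test data:

```python
import pytest

@pytest.mark.parametrize(
    "objective, input_data, expected_numbers",
    [
        # A case that passes reliably (placeholder values).
        ("count the items", "there are 3 items", ["3"]),
        # A case that fails only in the full suite: keep it in the file and
        # mark it xfail instead of commenting it out, so it still documents
        # the scenario without breaking the build.
        pytest.param(
            "sum the totals",
            "totals are 4 and 5",
            ["4", "5"],
            marks=pytest.mark.xfail(reason="fails when run with the full suite"),
        ),
    ],
)
def test_number_extraction(objective, input_data, expected_numbers):
    ...
```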

@20001LastOrder commented:

It seems the failed tests are related to #287, and the commits I pushed could probably fix this issue as well. However, there is still one test failing:

```python
            1,
            "on june how much cash does Sabio Delivers had?",
            (
                """Second Quarter 2023 Financial Highlights for Sabio Delivers
                Sabio delivered revenues of US$8.0M in Q2-2023, up 11% from US$7.2M in Q2-2022.
                CTV/OTT sales as a category increased by 57% to US$5.0 million, compared to US$3.2 million in the prior year's quarter. CTV/OTT sales accounted for 62% of the Company's sales mix, compared with 44% in the prior year's quarter.
                Mobile display revenues of US$2.9million in Q2-2023, down 24%, from US$3.9 million in Q2-2022, as our legacy mobile display campaigns continued to shift their spend with Sabio from mobile display to higher-margin mobile OTT streaming, which is recognized under the Company's CTV/OTT revenue category.
                Gross Profit of US$4.8 million in Q2-2023, up from US$4.3 million in Q2-2022. Gross Margin improved on a year-over-year basis, from 59% in Q2-2022 to 60% in the completed quarter. The increase is attributable to several efficiency and direct sales improvements within the CTV/OTT channel as well as our App Science business.
                Adjusted EBITDA1 loss of US$1.7 million in Q2-2023 compared to a loss of US$1.4 million in Q2-2022. The loss was primarily driven by overhead added during and subsequent to the second quarter of 2022, which included the continued expansion of our sales and marketing apparatus in the prior year and costs associated with transitioning our workforce back to the office. On a sequential basis, second quarter operating expenses, normalized for commissions, were flat in comparison to the first quarter of 2023 as cost efficiencies implemented by management offset incremental headcount additions to our salesforce to position ourselves for the 2024 U.S. elections.
                As of June 30, 2023, the Company had cash of US$1.7 million, as compared to US$2.4 million on June 30, 2022.`
                As of June 2023, the Company had US$6 million outstanding under its credit facility with Avidbank.""",
                [
                    {
                        "Document": "Sabio Delivers 11% Q2-2023 Revenue Growth, Led by 57% Increase in Connected TV/OTT Sales",
                        "Source": "https://www.sabioholding.com/press-releases/sabio-delivers-11-q2-2023-revenue-growth-led-by-57-increase-in-connected-tv-ott-sales",
                    }
                ],
            ),
            ["2.4", "1.4", "30", "2022", "2023", "1.7"],
        ),
```

It seems to be related to the way numbers are extracted. I got the following from the QA agent's response:

```
In June, Sabio Delivers had cash of US$1.7 million. This information is mentioned in the provided action-result history: "As of June 30, 2023, the Company had cash of US$1.7 million, as compared to US$2.4 million on June 30, 2022." [source](https://sabio-delivers.com/financial-highlights-q2-2023)
```

And these numbers were extracted:

```
['1.7', '30', '2023', '1.7', '2.4', '30', '2022', '2', '2023']
```

The error was that 2 was not in the resources. We should either find a way to fix this error or exclude this case from the tests if we are not handling it right now.
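
For illustration only — this is a plausible stand-in, not the extractor actually implemented in src/sherpa_ai/output_parsers/number_validation.py — a naive digit regex reproduces the stray 2 because it also matches inside the source URL's q2-2023:

```python
import re

# Plausible stand-in for the extraction step: match integers and decimals
# anywhere in the text, including inside URLs.
NUMBER_PATTERN = re.compile(r"\d+(?:\.\d+)?")

response = (
    "In June, Sabio Delivers had cash of US$1.7 million. This information "
    'is mentioned in the provided action-result history: "As of June 30, '
    "2023, the Company had cash of US$1.7 million, as compared to US$2.4 "
    'million on June 30, 2022." '
    "[source](https://sabio-delivers.com/financial-highlights-q2-2023)"
)

print(NUMBER_PATTERN.findall(response))
# ['1.7', '30', '2023', '1.7', '2.4', '30', '2022', '2', '2023']
# The '2' (and the final '2023') come from "q2-2023" in the URL, which is
# how a number can appear in the response without ever being in the resources.
```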

There were other errors I had to fix. Most of them arose because the number validation changed the flow of other tests. In general, for new features, we should parameterize whether the feature runs and default it to off, to avoid too much influence on other test cases. It is always good to run the whole suite with pytest to check whether a feature implementation broke other tests.
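
A minimal sketch of that gating pattern (the config object, flag name, and helper functions are hypothetical, not sherpa_ai's actual API):

```python
from dataclasses import dataclass

@dataclass
class AgentConfig:
    # New features default to off so they cannot silently change the flow
    # of existing tests; only tests covering the feature opt in.
    enable_number_validation: bool = False

def answer_question(question: str) -> str:
    # Placeholder for the existing QA flow.
    return f"stub answer to: {question}"

def validate_numbers(answer: str) -> str:
    # Placeholder for the new, opt-in validation step.
    return answer

def run_agent(question: str, config: AgentConfig) -> str:
    answer = answer_question(question)
    if config.enable_number_validation:
        answer = validate_numbers(answer)
    return answer

# Existing tests keep the default (validation off); the feature's own
# tests pass AgentConfig(enable_number_validation=True).
print(run_agent("how much cash?", AgentConfig()))
```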

@20001LastOrder linked an issue Feb 5, 2024 that may be closed by this pull request
@20001LastOrder self-requested a review February 7, 2024 04:00
@20001LastOrder (Collaborator) left a comment:

LGTM now. Let's get this merged as it blocks the hydra configuration and contains some other fixes.
