Patronus Evaluate API Integration #834
@varjoshi Thanks for your PR! It looks good. Please review the suggestions; I also noticed that some functions are not tested. Once you have made the requested changes, we are good to merge. Thanks!
Functions missing tests:
actions.py:
check_guardrail_pass
parse_patronus_lynx_response
Hi @Pouyanpi - thanks for your review. I've addressed your comments and added the tests you requested.
@varjoshi Thank you for adding the tests. It looks good 👍🏻
4dbeafe to d3814ad
Signed-off-by: Pouyan <13303554+Pouyanpi@users.noreply.github.com>
d3814ad to 36a23b6
* Add attention standard library with some basic tests. * Add additional tests and update attention library to latest version (from ACE). * Add section about supported LLMs * Fixes from @sklinglernv review * Fix documentation * Fix issue with undefined flow continuation not working with user intent generation * Fix issues with delayed restart of continuation flow * Add unit test * Update docs/colang_2/overview.rst Co-authored-by: Pouyan <13303554+Pouyanpi@users.noreply.github.com> Signed-off-by: Christian Schüller <160150754+schuellc-nvidia@users.noreply.github.com> * Fix a bug in a code block * Created missing changelogs group entries * Fix else if parsing problem * refactor: rename and update import paths in example bot * fix: resolve import path issue in config.py - Add guardrails_stdlib_path to colang_path_dirs - Improve error message for unresolved import paths * chore(deps): bump vllm in /nemoguardrails/library/patronusai Bumps [vllm](https://github.com/vllm-project/vllm) from 0.2.7 to 0.5.5. - [Release notes](https://github.com/vllm-project/vllm/releases) - [Commits](vllm-project/vllm@v0.2.7...v0.5.5) --- updated-dependencies: - dependency-name: vllm dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> * chore(deps): bump vllm in /nemoguardrails/library/llama_guard * remove redundant matching rules in escape_flow_name * Patronus Evaluate API Integration (NVIDIA#834) * Patronus Evaluate API Integration * Address comments - tests will be added separately * Add missing tests * Remove print statements --------- Signed-off-by: Pouyan <13303554+Pouyanpi@users.noreply.github.com> Co-authored-by: Pouyan <13303554+Pouyanpi@users.noreply.github.com> * Add new test ChatInterface to run CLI like tests for the CLS example tests. * Fix event synchronization. * CLS tests core.co * CSL core.co tests done. Added UtteranceUserActionStarted to CLI and ChatInterface. 
* Fix wrong flow parameters in `core.co` * Tests for development helper flows. * Add test and examples for `timing.co`. * Add test and examples for `llmco`. * Fix CSL tests. * Add colang 2 documentation test to pytest.ini * Remove duplicated test. * Make tests more robust: no model config & update semaphore in current loop context. * Small improvements to CSL tests. * Few minor fixes to `attention.co` * Revert making the tracking user attention flow active. * Add Private AI Integration (NVIDIA#815) * Update evaluate directory reference (NVIDIA#751) Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com> * doc: update similarity threshold to 0.75 and note (NVIDIA#770) * fix: download nltk's punkt_tab in align_score Dockerfile (NVIDIA#841) * feat(docs): enhance tracing configuration guide * refactor(config): remove OpenTelemetry from tracing config * docs(tracing): add Zipkin setup instructions * refactor(tracing): move log adapters initialization * feat(tracing): add global otel exporter registration * fix(dependencies): add pandas version constraint for eval Explicitly pinning the version of pandas to avoid pip resolution issues. This ensures compatibility with streamlit, which requires pandas>=1.4.0,<3. * docs(installation): add notice for dependency resolution fix style fix bold * Fix small issues in Colang 2 library examples * Merge in commit (23e27) from Colang doc repo and adjusted to relative Github paths * fix(tests): mock PromptSession to prevent console error The tests in `tests/test_cli.py` and `tests/test_cli_migration.py` were failing with `NoConsoleScreenBufferError` due to `prompt_toolkit` expecting a Windows console but finding `xterm-256color` instead. This issue occurs in Windows GitHub runners when the runner is `windows-latest`. To resolve this, `PromptSession` is mocked globally before any tests are collected, preventing the error. This fix ensures the tests run successfully in all environments. 
Changes: - Added `conftest.py` to mock `PromptSession` globally. * docs: update role from bot to assistant * docs(installation): update optional dependencies install * fix(docs): update pip install instructions note * fix: handle multiple output parsers in generation Updated the condition to check if `prompt_config.output_parser` is inthe list `["verbose_v1", "bot_message"]` * fix(docs): update CLI section headers from H4 to H3 * docs: update LLM support table to use Unicode symbols MyST and Sphinx cannot render the prev style * docs: update admonitions to use MyST syntax add docs: update syntax guide with admonitions adomination adomination adomination adomination * docs: remove duplicate GCP Text Moderation section * docs: specify shell syntax for CLI example * docs: update detailed logging example output * docs: update migration guide with new options * docs: update vulnerability scanning table to use unicode checkmarks * docs: update code blocks to use sh syntax highlighting * refactor(docs): change underscore to hyphens * refactor(docs): update references to new file names where we use hyphens instead of underscore fix fix fix * chore: update latest release version in README apply review fix * Fix release date in changelog * Update colang changelog * docs: add deprecation notice for Got It AI integration * docs: fix format for deprecation notice for Got It AI integration * docs: update deprecation notice format for Got It AI * wip: unused import * wip: use deepcopy to avoid repeated action side effect * wip: remove jailbreak from example output config * chore(changelog): update changelog for v0.11.0 release fix: update release date for version 0.10.0 * bump: update version to 0.11.0 * docs: update version to 0.11.0 * fix: move an entry to colang 2 changelog * fix: apply review changes * Add Colang patch fix note to changelog * fix(docs): update Garak GitHub links to NVIDIA repo * Restructure colang changelog adding contributor names * Minor changelog fix * Fix 
Colang name capitalization * Update entry * Replace all underscore with hyphen characters in folders and rst file names * Add migration cross reference * wip: switch to content moderation endpoint for factcheck * chore: correct date and PR number in changelog * Undo accidental changes * Fix asyncio loop issue in combination with enable_input event * feat: migrate to Poetry for dependency management This commit migrates the project from setuptools to Poetry for dependency management and packaging. The pyproject.toml file has been updated to reflect the new configuration, including dependencies, optional dependencies, and build system requirements. This change aims to simplify dependency management and improve the overall development workflow * chore: add tox configuration for multi-python testing * chore: add Makefile for common development tasks * chore: update .gitignore for better file management * chore: update Dockerfile to use Poetry for dependencies * chore: add Dockerfile for QA environment setup * ci: update GitLab CI for multi-python and Docker support * ci: remove redundant GitHub Actions workflows * ci: add reusable GitHub Actions workflow for tests * ci: add PR tests workflow for multi-python support * ci: add full-tests workflow for multi-OS and Python * ci: add build script for packaging with Poetry * ci: add GitHub Actions workflow for building and testing wheel chore(workflows): remove comments * ci: add workflow to test Docker image (not working) * refactor: rename test classes to supress pytest warning * fix: use temp directory for .railsignore in tests This commit updates the `test_railsignore.py` to use the system's temporary directory for the `.railsignore` file. 
This change addresses issues with tests on Windows OS by ensuring the `.railsignore` file is created in a writable location * chore: update issue templates with triage labels * chore: add documentation issue template * docs: update CONTRIBUTING.md for Poetry migration * feat(workflows): add lock closed threads workflow * feat(workflows): add test-published-dist workflow This workflow tests the published distribution of the package from PyPI daily. It sets up Python environments for versions 3.9, 3.10, and 3.11, installs the package, starts the server, and checks its status. This ensures the published package works as expected. * refactor: consolidate dependencies in pyproject.toml * feat(ci): update cron schedule to 11:00 PM UTC daily * chore(tox): add instructions for using pyenv with tox * fix(ci): remove image from registry if tests fail * wip: add factcheck doc * Fix typos in the example prompts to remove some of the IDE warnings * Fix GTP spelling * fix(ci): remove Ubuntu from full-tests * Fix attention test on Windows. 
* Update underscore folder names to new hyphen format * wip: add backward incompatibility warning to doc * chore: pin fastembed to 4.0.0 4.1.0 instroduces rust-pystemmer which does not have any license * chore: remove as it is not verified and approved by NVIDIA * fix(dependencies): change Python 3.9.7 exclusion format from supported versions * fix(dependencies): update tornado to 6.4.2 https://github.com/NVIDIA/NeMo-Guardrails/security/dependabot/66 * fix(dependencies): update aiohttp to version 3.11.9 https://github.com/NVIDIA/NeMo-Guardrails/security/dependabot/65 https://github.com/NVIDIA/NeMo-Guardrails/security/dependabot/64 * fix(dependencies): update black in dev deps https://github.com/NVIDIA/NeMo-Guardrails/security/dependabot/63 * url fix * fix checks * fix(ci): add missing event types for PR trigger fix * fix(ci): disable full tests on workflow changes * fix(dependencies): update lock file * fix activefence rail docs * Fix `test_repeating_timer` doc test. * Add return_value to FinishFlow internal event * Refactor return_value to context_update * Fix a bug * feat: add utility flow `wait until done` * test: add test for flow context update as part of with statements * Add Aegis 2.0 Guardrails connector, output parser, and documentation Signed-off-by: Prasoon Varshney <prasoonv@nvidia.com> * Improve core flow * Add documentation * Simplified documentation example * Monkey patch nim_openai==vllm_openai Signed-off-by: Prasoon Varshney <prasoonv@nvidia.com> * Updates based on MR discussion Signed-off-by: Prasoon Varshney <prasoonv@nvidia.com> * fix(ci): add POETRY_VERSION variable and update cache * Fix 'nim' engine usage and corresponding documentation Signed-off-by: Prasoon Varshney <prasoonv@nvidia.com> * code, example for topic guard connecter remove transformers dependency, make langchain use chat-openai for vllm add support for downloadable NIM * add documentation, add param chat_model for vllm/nim * remove max_tokens arg - not supported for NIMs * 
update references to nim_self_hosted * rebase with develop, refactor to use nim * add topic safety output restriction by default, add docs * chore: remove Ubuntu from full-tests workflow * chore: remove deprecated Got It AI integration (NVIDIA#927) * Updated NemoGuard TopicControl documentation * chore(deps): bump jinja2 from 3.1.4 to 3.1.5 (NVIDIA#916) * chore(deps): bump jinja2 from 3.1.4 to 3.1.5 Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.4 to 3.1.5. - [Release notes](https://github.com/pallets/jinja/releases) - [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst) - [Commits](pallets/jinja@3.1.4...3.1.5) --- updated-dependencies: - dependency-name: jinja2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> chore(desp): bump jinja2 from 3.1.4 to 3.1.5 * chore(deps): bump jinja2 from 3.1.4 to 3.1.5 in pyproject.toml --------- Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: prezakhani <13303554+Pouyanpi@users.noreply.github.com> * Add model based jailbreak detection. Update JailbreakDetectionConfig to support embedding model. * Update Dockerfile-GPU to include embedding jailbreak detections * Change default model to Snowflake. Add env variable for nv-embedqa-e5-v5 model to Dockerfiles * SPDX in files * Add SPXD to __init__.py; update flows. * Fix logging messages in actions.py; Update example config to include embedding parameter * Add jailbreak model tests * Correct test config path * Make error message more useful, return the same value structure as NIM * Add tests. Refactor model-based detections to support only NemoGuard JailbreakDetect with snowflake embeddings. 
* Update dockerfile to pull models from HF * Update jailbreak docs * Apply 1 suggestion(s) to 1 file(s) Co-authored-by: Pouyan Rezakhani <prezakhani@nvidia.com> * Apply 1 suggestion(s) to 1 file(s) Co-authored-by: Pouyan Rezakhani <prezakhani@nvidia.com> * Apply 1 suggestion(s) to 1 file(s) Co-authored-by: Pouyan Rezakhani <prezakhani@nvidia.com> * Add additional skip conditions for jailbreak model setup * Rename Aegis to NemoGuard ContentSafety connector Signed-off-by: Prasoon Varshney <prasoonv@nvidia.com> * Minor typo fix to make a link work Signed-off-by: Prasoon Varshney <prasoonv@nvidia.com> * Add einops to requirements.txt * fix: apply pre-commit hooks * docs: fix style fix doc titles * ci(workflows): update artifact handling in workflow * Add Nemoguard NIM blueprint (NVIDIA#932) * NemoGuard NIM integration to NIM Blueprint Signed-off-by: Aditi Bodhankar <abodhankar@nvidia.com> * NeMo Guardrails integration into NIM Blueprint Signed-off-by: Aditi Bodhankar <abodhankar@nvidia.com> --------- Signed-off-by: Aditi Bodhankar <abodhankar@nvidia.com> * fix(docs): fix abdomination format and shorten title * chore: bump version to v0.11.1 fix(docs): update github tag url for v0.11.1 feat(pyproject.toml): add URLs and update dependencies chore: update changelog Update colang changelog --------- Signed-off-by: Christian Schüller <160150754+schuellc-nvidia@users.noreply.github.com> Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Pouyan <13303554+Pouyanpi@users.noreply.github.com> Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com> Signed-off-by: Prasoon Varshney <prasoonv@nvidia.com> Signed-off-by: Aditi Bodhankar <abodhankar@nvidia.com> Co-authored-by: Severin Klingler <sklingler@nvidia.com> Co-authored-by: sklinglernv <148848069+sklinglernv@users.noreply.github.com> Co-authored-by: Christian Schüller <cschueller@nvidia.com> Co-authored-by: Christian Schüller <160150754+schuellc-nvidia@users.noreply.github.com> Co-authored-by: Pouyan 
<13303554+Pouyanpi@users.noreply.github.com> Co-authored-by: Radin Shayanfar <radin.shayanfar@gmail.com> Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Varun Joshi <varun@patronus.ai> Co-authored-by: Kimi Li <kimi@autoalign.ai> Co-authored-by: Girish Sharma <girishsharma001@gmail.com> Co-authored-by: Emmanuel Ferdman <emmanuelferdman@gmail.com> Co-authored-by: Romain Yon <yonromai@users.noreply.github.com> Co-authored-by: Nikhil Varghese <nikhil@bot-it.ai> Co-authored-by: Krishna Sreeraj <krishna.sreeraj@thoughtworks.com> Co-authored-by: Naman Jain <j.naman.618@gmail.com> Co-authored-by: Noam Levy <noamlevy81@gmail.com> Co-authored-by: Prasoon Varshney <prasoonv@nvidia.com> Co-authored-by: Pouyan Rezakhani <prezakhani@nvidia.com> Co-authored-by: Makesh Sreedhar <makeshn@nvidia.com> Co-authored-by: Traian Rebedea <trebedea@nvidia.com> Co-authored-by: Erick Galinkin <egalinkin@nvidia.com> Co-authored-by: Aditi Bodhankar <abodhankar@nvidia.com>
Description
Patronus AI maintains a powerful suite of fully managed, in-house evaluation models that can be easily accessed through the Patronus Evaluate API. This PR adds an integration with this API so that it can be used as an output rail.
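For readers unfamiliar with output rails, here is a minimal sketch of the kind of pass/fail decision such a rail might make on an evaluation response. Everything in it is an assumption for illustration: the function name `sketch_guardrail_pass`, the `strategy` values, and the response shape (`results` / `evaluation_result` / `pass`) are hypothetical and are not taken from this PR's code or from the Patronus API schema.

```python
from typing import Any, Mapping, Optional


def sketch_guardrail_pass(
    response: Optional[Mapping[str, Any]],
    strategy: str = "all_pass",
) -> bool:
    """Decide whether a bot message may pass the output rail.

    Hypothetical response shape (an assumption, not the real schema):
        {"results": [{"evaluation_result": {"pass": True}}, ...]}
    """
    # Fail closed: a missing or malformed response blocks the message.
    if not response or "results" not in response:
        return False

    verdicts = []
    for item in response["results"]:
        result = item.get("evaluation_result") or {}
        if "pass" not in result:
            return False  # an evaluator returned no verdict
        verdicts.append(bool(result["pass"]))

    if not verdicts:
        return False
    if strategy == "all_pass":
        return all(verdicts)
    if strategy == "any_pass":
        return any(verdicts)
    raise ValueError(f"unknown strategy: {strategy!r}")
```

The sketch fails closed on purpose: if the evaluation response is missing or incomplete, the message is blocked rather than let through.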
cc @drazvan
Related Issue(s)
N/A
Checklist