@yonromai yonromai commented Nov 5, 2024

Description

Fixes an issue where alignscore-server (defined in nemoguardrails/library/factchecking/align_score/Dockerfile) fails at runtime with LookupError(resource_not_found). The error appears to be caused by a breaking change in nltk: the tokenizer now looks for the punkt_tab resource, which is not found in the image. See the steps below to reproduce.

cc: @drazvan

Steps to reproduce the error

Build alignscore-server Docker Image:

# from the root of NeMo-Guardrails
cd nemoguardrails/library/factchecking/align_score
docker build -t alignscore-server .

Run alignscore-server:

docker run -p 5123:5000 alignscore-server
# ...
# INFO:     Uvicorn running on http://0.0.0.0:5000 (Press CTRL+C to quit)

Send a request to alignscore-server (executed from a local notebook):

from nemoguardrails.library.factchecking.align_score.request import alignscore_request

await alignscore_request(
	api_url="http://localhost:5123/alignscore_base",
	evidence="Hello, world!",
	response="Hello, world!",
)
# Output: "AlignScore API request failed with status 500"
  • Note: I get similar behavior when LLMRails invokes the alignscore_check_facts action, which is how I encountered this issue in the first place.
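
For reproducing the request without nemoguardrails installed, a minimal standard-library sketch follows. The helper names `build_alignscore_request` and `send` are hypothetical, and the JSON field names `evidence` and `claim` are an assumption about the server's request schema; check them against request.py before relying on this.

```python
import json
import urllib.error
import urllib.request
from typing import Tuple


def build_alignscore_request(api_url: str, evidence: str, claim: str) -> urllib.request.Request:
    """Build a JSON POST request. The field names are assumed, not confirmed."""
    body = json.dumps({"evidence": evidence, "claim": claim}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> Tuple[int, str]:
    """Send the request and return (status code, response body)."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Non-2xx responses (such as the 500 above) are raised as HTTPError;
        # return the status and body instead of propagating the exception.
        return err.code, err.read().decode("utf-8")
```

Calling `send(build_alignscore_request("http://localhost:5123/alignscore_base", "Hello, world!", "Hello, world!"))` against the running container returns the HTTP status alongside the raw body, which makes the 500 visible directly rather than through the client's log message.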

Logs from alignscore-server:

INFO:     172.17.0.1:60146 - "POST /alignscore_base HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 406, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  [...]
  File "/usr/local/lib/python3.10/site-packages/nltk/tokenize/punkt.py", line 1749, in load_lang
    lang_dir = find(f"tokenizers/punkt_tab/{lang}/")
  File "/usr/local/lib/python3.10/site-packages/nltk/data.py", line 579, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
  Resource punkt_tab not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt_tab')

  For more information see: https://www.nltk.org/data.html

  Attempted to load tokenizers/punkt_tab/english/

  Searched in:
    - '/root/nltk_data'
    - '/usr/local/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/local/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
**********************************************************************
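
The error message itself points at the remedy: make sure the punkt_tab resource is present before the server starts. The sketch below shows one way such a check could look in Python. It is not necessarily the exact change made in this PR; the helper names `ensure_resource` and `ensure_punkt_tab` are hypothetical, while `nltk.data.find` and `nltk.download` are the nltk calls that appear in the traceback and its hint.

```python
from typing import Callable


def ensure_resource(
    resource_path: str,
    package: str,
    find: Callable[[str], object],
    download: Callable[[str], bool],
) -> bool:
    """Return True if the resource is available, downloading it when missing."""
    try:
        find(resource_path)
        return True
    except LookupError:
        # nltk.data.find raises LookupError when the resource is absent,
        # as seen in the traceback above.
        return bool(download(package))


def ensure_punkt_tab() -> bool:
    """Fetch nltk's punkt_tab tokenizer data if it is not already installed."""
    import nltk  # imported lazily so ensure_resource stays usable without nltk

    return ensure_resource(
        "tokenizers/punkt_tab/english/",  # path taken from the error message
        "punkt_tab",
        nltk.data.find,
        nltk.download,
    )
```

Running `ensure_punkt_tab()` once at image build time (rather than on each request) keeps the download out of the request path and lets the build fail early if the resource cannot be fetched.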

After this commit

Re-build and re-run container:

docker build -t alignscore-server .
docker run -p 5123:5000 alignscore-server

Request:

await alignscore_request(
	api_url="http://localhost:5123/alignscore_base",
	evidence="Hello, world!",
	response="Hello, world!",
)
# Output: 0.9991656541824341

Logs from alignscore-server:

INFO:     172.17.0.1:61214 - "POST /alignscore_base HTTP/1.1" 200 OK

Related Issue(s)

nltk/nltk#3293

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.
  • @mentions of the person or team responsible for reviewing proposed changes.

@Pouyanpi Pouyanpi left a comment

Thank you @yonromai for catching this 👍🏻 LGTM!

Pouyanpi commented Nov 6, 2024

Reference to the issue: nltk/nltk#3293

Pouyanpi commented Nov 6, 2024

@yonromai would you please gpg sign your commits per CONTRIBUTING.md?

yonromai commented Nov 6, 2024

> @yonromai would you please gpg sign your commits per CONTRIBUTING.md?

Thank you for taking a look @Pouyanpi

I thought I had signed the previous commit; I wonder if something went wrong when rebasing/force-pushing. Anyway, I re-pushed and it should be good now.

Pouyanpi commented Nov 6, 2024

> > @yonromai would you please gpg sign your commits per CONTRIBUTING.md?
>
> Thank you for taking a look @Pouyanpi
>
> I thought I signed the previous commit; wondering if something went wrong when rebasing/force-pushing. Anyway, I re-pushed and it should be good now.

My bad, you are right, the auto update branch removed the signature! Thanks for signing it again.

@Pouyanpi Pouyanpi merged commit 4ce7daf into NVIDIA-NeMo:develop Nov 6, 2024
1 check passed
mdambski added a commit to datarobot-forks/NeMo-Guardrails that referenced this pull request Jan 31, 2025
* Add attention standard library with some basic tests.

* Add additional tests and update attention library to latest version (from ACE).

* Add section about supported LLMs

* Fixes from @sklinglernv review

* Fix documentation

* Fix issue with undefined flow continuation not working with user intent generation

* Fix issues with delayed restart of continuation flow

* Add unit test

* Update docs/colang_2/overview.rst

Co-authored-by: Pouyan <13303554+Pouyanpi@users.noreply.github.com>
Signed-off-by: Christian Schüller <160150754+schuellc-nvidia@users.noreply.github.com>

* Fix a bug in a code block

* Created missing changelogs group entries

* Fix else if parsing problem

* refactor: rename and update import paths in example bot

* fix: resolve import path issue in config.py

- Add guardrails_stdlib_path to colang_path_dirs
- Improve error message for unresolved import paths

* chore(deps): bump vllm in /nemoguardrails/library/patronusai

Bumps [vllm](https://github.com/vllm-project/vllm) from 0.2.7 to 0.5.5.
- [Release notes](https://github.com/vllm-project/vllm/releases)
- [Commits](vllm-project/vllm@v0.2.7...v0.5.5)

---
updated-dependencies:
- dependency-name: vllm
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore(deps): bump vllm in /nemoguardrails/library/llama_guard

* remove redundant matching rules in escape_flow_name

* Patronus Evaluate API Integration (NVIDIA-NeMo#834)

* Patronus Evaluate API Integration

* Address comments - tests will be added separately

* Add missing tests

* Remove print statements
---------

Signed-off-by: Pouyan <13303554+Pouyanpi@users.noreply.github.com>
Co-authored-by: Pouyan <13303554+Pouyanpi@users.noreply.github.com>

* Add new test ChatInterface to run CLI-like tests for the CSL example tests.

* Fix event synchronization.

* CSL tests core.co

* CSL core.co tests done. Added UtteranceUserActionStarted to CLI and ChatInterface.

* Fix wrong flow parameters in `core.co`

* Tests for development helper flows.

* Add test and examples for `timing.co`.

* Add test and examples for `llmco`.

* Fix CSL tests.

* Add colang 2 documentation test to pytest.ini

* Remove duplicated test.

* Make tests more robust: no model config & update semaphore in  current loop context.

* Small improvements to CSL tests.

* Few minor fixes to `attention.co`

* Revert making the tracking user attention flow active.

* Add Private AI Integration (NVIDIA-NeMo#815)

* Update evaluate directory reference (NVIDIA-NeMo#751)

Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>

* doc: update similarity threshold to 0.75 and note (NVIDIA-NeMo#770)

* fix: download nltk's punkt_tab in align_score Dockerfile (NVIDIA-NeMo#841)

* feat(docs): enhance tracing configuration guide

* refactor(config): remove OpenTelemetry from tracing config

* docs(tracing): add Zipkin setup instructions

* refactor(tracing): move log adapters initialization

* feat(tracing): add global otel exporter registration

* fix(dependencies): add pandas version constraint for eval

Explicitly pinning the version of pandas to avoid pip resolution issues.
This ensures compatibility with streamlit, which requires pandas>=1.4.0,<3.

* docs(installation): add notice for dependency resolution

fix style

fix

bold

* Fix small issues in Colang 2 library examples

* Merge in commit (23e27) from Colang doc repo and adjusted to relative Github paths

* fix(tests): mock PromptSession to prevent console error

The tests in `tests/test_cli.py` and `tests/test_cli_migration.py` were
failing with `NoConsoleScreenBufferError` due to `prompt_toolkit`
expecting a Windows console but finding `xterm-256color` instead. This
issue occurs in Windows GitHub runners when the runner is `windows-latest`.

To resolve this, `PromptSession` is mocked globally before any tests
are collected, preventing the error. This fix ensures the tests run
successfully in all environments.

Changes:
- Added `conftest.py` to mock `PromptSession` globally.

* docs: update role from bot to assistant

* docs(installation): update optional dependencies install

* fix(docs): update pip install instructions note

* fix: handle multiple output parsers in generation

Updated the condition to check if `prompt_config.output_parser` is in the list `["verbose_v1", "bot_message"]`

* fix(docs): update CLI section headers from H4 to H3

* docs: update LLM support table to use Unicode symbols

MyST and Sphinx cannot render the prev style

* docs: update admonitions to use MyST syntax

add

docs: update syntax guide with admonitions

admonition

* docs: remove duplicate GCP Text Moderation section

* docs: specify shell syntax for CLI example

* docs: update detailed logging example output

* docs: update migration guide with new options

* docs: update vulnerability scanning table to use unicode checkmarks

* docs: update code blocks to use sh syntax highlighting

* refactor(docs): change underscore to hyphens

* refactor(docs): update references to new file names where we use hyphens instead of underscore

fix

fix

fix

* chore: update latest release version in README

apply review

fix

* Fix release date in changelog

* Update colang changelog

* docs: add deprecation notice for Got It AI integration

* docs: fix format for deprecation notice for Got It AI integration

* docs: update deprecation notice format for Got It AI

* wip: unused import

* wip: use deepcopy to avoid repeated action side effect

* wip: remove jailbreak from example output config

* chore(changelog): update changelog for v0.11.0 release

fix: update release date for version 0.10.0

* bump: update version to 0.11.0

* docs: update version to 0.11.0

* fix: move an entry to colang 2 changelog

* fix: apply review changes

* Add Colang patch fix note to changelog

* fix(docs): update Garak GitHub links to NVIDIA repo

* Restructure colang changelog adding contributor names

* Minor changelog fix

* Fix Colang name capitalization

* Update entry

* Replace all underscore with hyphen characters in folders and rst file names

* Add migration cross reference

* wip: switch to content moderation endpoint for factcheck

* chore: correct date and PR number in changelog

* Undo accidental changes

* Fix asyncio loop issue in combination with enable_input event

* feat: migrate to Poetry for dependency management

This commit migrates the project from setuptools to Poetry for dependency
management and packaging. The pyproject.toml file has been updated to
reflect the new configuration, including dependencies, optional
dependencies, and build system requirements. This change aims to
simplify dependency management and improve the overall development workflow

* chore: add tox configuration for multi-python testing

* chore: add Makefile for common development tasks

* chore: update .gitignore for better file management

* chore: update Dockerfile to use Poetry for dependencies

* chore: add Dockerfile for QA environment setup

* ci: update GitLab CI for multi-python and Docker support

* ci: remove redundant GitHub Actions workflows

* ci: add reusable GitHub Actions workflow for tests

* ci: add PR tests workflow for multi-python support

* ci: add full-tests workflow for multi-OS and Python

* ci: add build script for packaging with Poetry

* ci: add GitHub Actions workflow for building and testing wheel

chore(workflows): remove comments

* ci: add workflow to test Docker image (not working)

* refactor: rename test classes to suppress pytest warning

* fix: use temp directory for .railsignore in tests

This commit updates the `test_railsignore.py` to use the system's
temporary directory for the `.railsignore` file. This change addresses
issues with tests on Windows OS by ensuring the `.railsignore` file is
created in a writable location

* chore: update issue templates with triage labels

* chore: add documentation issue template

* docs: update CONTRIBUTING.md for Poetry migration

* feat(workflows): add lock closed threads workflow

* feat(workflows): add test-published-dist workflow

This workflow tests the published distribution of the package from PyPI
daily. It sets up Python environments for versions 3.9, 3.10, and 3.11,
installs the package, starts the server, and checks its status. This
ensures the published package works as expected.

* refactor: consolidate dependencies in pyproject.toml

* feat(ci): update cron schedule to 11:00 PM UTC daily

* chore(tox): add instructions for using pyenv with tox

* fix(ci): remove image from registry if tests fail

* wip: add factcheck doc

* Fix typos in the example prompts to remove some of the IDE warnings

* Fix GTP spelling

* fix(ci): remove Ubuntu from full-tests

* Fix attention test on Windows.

* Update underscore folder names to new hyphen format

* wip: add backward incompatibility warning to doc

* chore: pin fastembed to 4.0.0

4.1.0 introduces rust-pystemmer, which does not have any license

* chore: remove as it is not verified and approved by NVIDIA

* fix(dependencies): change Python 3.9.7 exclusion format from supported versions

* fix(dependencies): update tornado to 6.4.2

https://github.com/NVIDIA/NeMo-Guardrails/security/dependabot/66

* fix(dependencies): update aiohttp to version 3.11.9

https://github.com/NVIDIA/NeMo-Guardrails/security/dependabot/65
https://github.com/NVIDIA/NeMo-Guardrails/security/dependabot/64

* fix(dependencies): update black in dev deps

https://github.com/NVIDIA/NeMo-Guardrails/security/dependabot/63

* url fix

* fix checks

* fix(ci): add missing event types for PR trigger

fix

* fix(ci): disable full tests on workflow changes

* fix(dependencies): update lock file

* fix activefence rail docs

* Fix `test_repeating_timer` doc test.

* Add return_value to FinishFlow internal event

* Refactor return_value to context_update

* Fix a bug

* feat: add utility flow `wait until done`

* test: add test for flow context update as part of with statements

* Add Aegis 2.0 Guardrails connector, output parser, and documentation

Signed-off-by: Prasoon Varshney <prasoonv@nvidia.com>

* Improve core flow

* Add documentation

* Simplified documentation example

* Monkey patch nim_openai==vllm_openai

Signed-off-by: Prasoon Varshney <prasoonv@nvidia.com>

* Updates based on MR discussion

Signed-off-by: Prasoon Varshney <prasoonv@nvidia.com>

* fix(ci): add POETRY_VERSION variable and update cache

* Fix 'nim' engine usage and corresponding documentation

Signed-off-by: Prasoon Varshney <prasoonv@nvidia.com>

* code, example for topic guard connector

remove transformers dependency, make langchain use chat-openai for vllm

add support for downloadable NIM

* add documentation, add param chat_model for vllm/nim

* remove max_tokens arg - not supported for NIMs

* update references to nim_self_hosted

* rebase with develop, refactor to use nim

* add topic safety output restriction by default, add docs

* chore: remove Ubuntu from full-tests workflow

* chore: remove deprecated Got It AI integration (NVIDIA-NeMo#927)

* Updated NemoGuard TopicControl documentation

* chore(deps): bump jinja2 from 3.1.4 to 3.1.5 (NVIDIA-NeMo#916)

* chore(deps): bump jinja2 from 3.1.4 to 3.1.5

Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.4 to 3.1.5.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](pallets/jinja@3.1.4...3.1.5)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

chore(desp): bump jinja2 from 3.1.4 to 3.1.5

* chore(deps): bump jinja2 from 3.1.4 to 3.1.5 in pyproject.toml

---------

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: prezakhani <13303554+Pouyanpi@users.noreply.github.com>

* Add model based jailbreak detection. Update JailbreakDetectionConfig to support embedding model.

* Update Dockerfile-GPU to include embedding jailbreak detections

* Change default model to Snowflake. Add env variable for nv-embedqa-e5-v5 model to Dockerfiles

* SPDX in files

* Add SPDX to __init__.py; update flows.

* Fix logging messages in actions.py; Update example config to include embedding parameter

* Add jailbreak model tests

* Correct test config path

* Make error message more useful, return the same value structure as NIM

* Add tests. Refactor model-based detections to support only NemoGuard JailbreakDetect with snowflake embeddings.

* Update dockerfile to pull models from HF

* Update jailbreak docs

* Apply 1 suggestion(s) to 1 file(s)

Co-authored-by: Pouyan Rezakhani <prezakhani@nvidia.com>

* Apply 1 suggestion(s) to 1 file(s)

Co-authored-by: Pouyan Rezakhani <prezakhani@nvidia.com>

* Apply 1 suggestion(s) to 1 file(s)

Co-authored-by: Pouyan Rezakhani <prezakhani@nvidia.com>

* Add additional skip conditions for jailbreak model setup

* Rename Aegis to NemoGuard ContentSafety connector

Signed-off-by: Prasoon Varshney <prasoonv@nvidia.com>

* Minor typo fix to make a link work

Signed-off-by: Prasoon Varshney <prasoonv@nvidia.com>

* Add einops to requirements.txt

* fix: apply pre-commit hooks

* docs: fix style

fix doc titles

* ci(workflows): update artifact handling in workflow

* Add Nemoguard NIM blueprint (NVIDIA-NeMo#932)

* NemoGuard NIM integration to NIM Blueprint

Signed-off-by: Aditi Bodhankar <abodhankar@nvidia.com>

* NeMo Guardrails integration into NIM Blueprint

Signed-off-by: Aditi Bodhankar <abodhankar@nvidia.com>

---------

Signed-off-by: Aditi Bodhankar <abodhankar@nvidia.com>

* fix(docs): fix admonition format and shorten title

* chore: bump version to v0.11.1

fix(docs): update github tag url for v0.11.1

feat(pyproject.toml): add URLs and update dependencies

chore: update changelog

Update colang changelog

---------

Signed-off-by: Christian Schüller <160150754+schuellc-nvidia@users.noreply.github.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Pouyan <13303554+Pouyanpi@users.noreply.github.com>
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Signed-off-by: Prasoon Varshney <prasoonv@nvidia.com>
Signed-off-by: Aditi Bodhankar <abodhankar@nvidia.com>
Co-authored-by: Severin Klingler <sklingler@nvidia.com>
Co-authored-by: sklinglernv <148848069+sklinglernv@users.noreply.github.com>
Co-authored-by: Christian Schüller <cschueller@nvidia.com>
Co-authored-by: Christian Schüller <160150754+schuellc-nvidia@users.noreply.github.com>
Co-authored-by: Pouyan <13303554+Pouyanpi@users.noreply.github.com>
Co-authored-by: Radin Shayanfar <radin.shayanfar@gmail.com>
Co-authored-by: Chris Parisien <64271260+cparisien@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Varun Joshi <varun@patronus.ai>
Co-authored-by: Kimi Li <kimi@autoalign.ai>
Co-authored-by: Girish Sharma <girishsharma001@gmail.com>
Co-authored-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: Romain Yon <yonromai@users.noreply.github.com>
Co-authored-by: Nikhil Varghese <nikhil@bot-it.ai>
Co-authored-by: Krishna Sreeraj <krishna.sreeraj@thoughtworks.com>
Co-authored-by: Naman Jain <j.naman.618@gmail.com>
Co-authored-by: Noam Levy <noamlevy81@gmail.com>
Co-authored-by: Prasoon Varshney <prasoonv@nvidia.com>
Co-authored-by: Pouyan Rezakhani <prezakhani@nvidia.com>
Co-authored-by: Makesh Sreedhar <makeshn@nvidia.com>
Co-authored-by: Traian Rebedea <trebedea@nvidia.com>
Co-authored-by: Erick Galinkin <egalinkin@nvidia.com>
Co-authored-by: Aditi Bodhankar <abodhankar@nvidia.com>