@Dishant1804
Contributor

Proposed change

Resolves #2342

Fixed the duplication errors that occurred while syncing contexts and chunks.

Checklist

  • I've read and followed the contributing guidelines.
  • I've run make check-test locally; all checks and tests passed.

@coderabbitai
Contributor

coderabbitai bot commented Oct 1, 2025

Summary by CodeRabbit

  • New Features
    • Support distinct contexts per source for the same entity, avoiding conflicts.
    • Improve duplicate detection by scoping chunk checks to the exact context.
  • Bug Fixes
    • Prevent unintended overwrites and collisions between contexts originating from different sources.
    • Avoid accidental chunk deduplication across unrelated contexts.
  • Tests
    • Updated tests to cover source-aware context behavior and context-scoped chunk lookups.
  • Chores
    • Added a database migration to enforce the new uniqueness constraint on contexts.

Walkthrough

Adds a migration to enforce Context uniqueness on (entity_type, entity_id, source). Updates Context.update_data to include source in lookups/creation. Adjusts Chunk duplicate check to use the context FK directly. Updates related tests to align with source-aware Context changes and the revised Chunk filter.
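
The behavior described in the walkthrough can be illustrated with a minimal, framework-free sketch. It uses the standard-library sqlite3 module in place of Django, and the table layout, the update_data helper, and the "github"/"slack" source values are assumptions made for illustration, not the project's actual code.

```python
import sqlite3

# In-memory database standing in for the real Context table (assumed layout).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE context (
        id INTEGER PRIMARY KEY,
        entity_type TEXT NOT NULL,
        entity_id INTEGER NOT NULL,
        source TEXT NOT NULL DEFAULT '',
        content TEXT NOT NULL,
        UNIQUE (entity_type, entity_id, source)
    )
    """
)


def update_data(content, entity_type, entity_id, source=""):
    """Hypothetical helper: look up by the full key, then update or create."""
    row = conn.execute(
        "SELECT id FROM context "
        "WHERE entity_type = ? AND entity_id = ? AND source = ?",
        (entity_type, entity_id, source),
    ).fetchone()
    if row:
        conn.execute("UPDATE context SET content = ? WHERE id = ?", (content, row[0]))
        return row[0]
    cursor = conn.execute(
        "INSERT INTO context (entity_type, entity_id, source, content) "
        "VALUES (?, ?, ?, ?)",
        (entity_type, entity_id, source, content),
    )
    return cursor.lastrowid


# Same entity, different sources: two separate rows.
first = update_data("repo summary", "project", 1, source="github")
second = update_data("chat summary", "project", 1, source="slack")
# Same entity and same source: the existing row is updated, not duplicated.
again = update_data("repo summary v2", "project", 1, source="github")
```

Because source is part of both the lookup and the unique key, a second source no longer overwrites or collides with the first one, while a repeated sync from the same source still resolves to the existing row.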

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Schema/Migration: backend/apps/ai/migrations/0010_alter_context_unique_together.py | Adds migration altering Context unique_together to (entity_type, entity_id, source); depends on ai.0009 and contenttypes.0002. |
| Model: Context: backend/apps/ai/models/context.py | Updates Meta unique_together to include source; update_data now queries/creates with source and removes separate source assignment. |
| Model: Chunk: backend/apps/ai/models/chunk.py | Changes duplicate check to filter by context=<context> plus text, replacing filters on context entity fields. |
| Tests: Context: backend/tests/apps/ai/models/context_test.py | Adjusts tests to pass/expect source in Context.update_data paths; removes explicit assertions on result.source. |
| Tests: Chunk: backend/tests/apps/ai/models/chunk_test.py | Updates filters to use context=mock_context instead of context__entity_type/context__entity_id. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

backend, backend-tests

Suggested reviewers

  • arkid15r

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title succinctly describes the core intent of the changes (fixing duplication errors for both chunks and context), which aligns with the PR's primary objective. |
| Linked Issues Check | ✅ Passed | The migration, model, and chunk updates directly address the elimination of duplicate context entries via a composite unique constraint and refine chunk duplication checks, fulfilling the requirements of issue #2342. |
| Out of Scope Changes Check | ✅ Passed | All altered files and test updates focus exclusively on preventing duplicate contexts and chunks, with no unrelated functionality modified. |
| Description Check | ✅ Passed | The description clearly references resolving duplication errors when syncing contexts and chunks and links to issue #2342, demonstrating relevance to the changeset. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.00%, which is sufficient. The required threshold is 80.00%. |

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e0403e4 and c62cc1a.

📒 Files selected for processing (5)
  • backend/apps/ai/migrations/0010_alter_context_unique_together.py (1 hunks)
  • backend/apps/ai/models/chunk.py (1 hunks)
  • backend/apps/ai/models/context.py (2 hunks)
  • backend/tests/apps/ai/models/chunk_test.py (3 hunks)
  • backend/tests/apps/ai/models/context_test.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
backend/tests/apps/ai/models/chunk_test.py (1)
backend/tests/apps/ai/common/base/chunk_command_test.py (1)
  • mock_context (51-60)
backend/tests/apps/ai/models/context_test.py (1)
backend/tests/apps/ai/common/base/chunk_command_test.py (1)
  • mock_content_type (64-68)
🔇 Additional comments (12)
backend/apps/ai/models/context.py (2)

27-27: LGTM! Unique constraint properly expanded.

The addition of source to the unique_together constraint correctly addresses the duplication issue by ensuring that contexts are unique per entity and source combination.


51-55: All Context.update_data calls include explicit source
All non-test callers (in context_command.py) and all tests pass a source argument; none rely on the default empty string.

backend/apps/ai/models/chunk.py (1)

64-64: LGTM! Duplicate check correctly scoped to Context instance.

The change from filtering on context__entity_type and context__entity_id to filtering on context=context correctly scopes the duplicate check to the specific Context instance. This aligns well with the updated Context model where different sources can create separate contexts for the same entity, and each context can maintain its own set of chunks.

This change prevents false-positive duplicate detection when the same text appears in chunks belonging to different contexts (e.g., different sources) for the same entity.
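
To make the false-positive case concrete, here is a small sketch in plain Python that compares the two ways of checking for an existing chunk. The dataclasses and the two helper functions are hypothetical stand-ins for the ORM filters, assuming two contexts that share an entity but differ in source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    id: int
    entity_type: str
    entity_id: int
    source: str


@dataclass(frozen=True)
class Chunk:
    context: Context
    text: str


def exists_by_entity(chunks, context, text):
    """Old-style check: matches any context that shares the same entity."""
    return any(
        chunk.text == text
        and chunk.context.entity_type == context.entity_type
        and chunk.context.entity_id == context.entity_id
        for chunk in chunks
    )


def exists_by_context(chunks, context, text):
    """New-style check: matches only chunks of this exact context."""
    return any(
        chunk.text == text and chunk.context.id == context.id for chunk in chunks
    )


# Two contexts for the same entity, coming from different (assumed) sources.
github_ctx = Context(id=1, entity_type="project", entity_id=7, source="github")
slack_ctx = Context(id=2, entity_type="project", entity_id=7, source="slack")
stored = [Chunk(context=github_ctx, text="OWASP Nest overview")]

# The entity-based check wrongly reports a duplicate for the slack context.
false_positive = exists_by_entity(stored, slack_ctx, "OWASP Nest overview")
# The context-based check only matches the context that owns the chunk.
scoped_miss = exists_by_context(stored, slack_ctx, "OWASP Nest overview")
scoped_hit = exists_by_context(stored, github_ctx, "OWASP Nest overview")
```

With the entity-based check, the slack context would skip a chunk it never stored; the context-based check lets each context keep its own copy.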

backend/tests/apps/ai/models/chunk_test.py (3)

101-103: LGTM! Test correctly updated for new filter logic.

The test assertion correctly reflects the model change where duplicate checks now use the context foreign key directly (context=mock_context) instead of filtering through context entity fields.


125-127: LGTM! Consistent test update.

Test correctly updated to match the new filtering behavior.


147-149: LGTM! Consistent test update.

Test correctly updated to match the new filtering behavior.

backend/tests/apps/ai/models/context_test.py (5)

163-163: LGTM! Test correctly verifies source parameter in lookup.

The test now properly asserts that source is included in the Context.objects.get() call when updating existing contexts.


329-329: LGTM! Test correctly verifies source in get call.

The assertion properly verifies that source is passed to the lookup when attempting to retrieve an existing context.


334-334: LGTM! Test correctly verifies source in initialization.

The assertion properly verifies that source is passed to the Context.__init__() when creating a new context instance.


370-370: LGTM! Consistent test update for get call.

Test correctly updated to verify source parameter in lookup.


375-375: LGTM! Consistent test update for initialization.

Test correctly updated to verify source parameter in context creation.
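
The kind of assertion these context_test.py comments describe can be sketched with unittest.mock. The update_data function below is a simplified, hypothetical stand-in that takes the manager as an argument; the real method's signature and body are not shown in this review.

```python
from unittest.mock import MagicMock


def update_data(manager, content, entity_type, entity_id, source=""):
    """Hypothetical stand-in: look up an existing context by the full key."""
    context = manager.get(entity_type=entity_type, entity_id=entity_id, source=source)
    context.content = content
    context.save()
    return context


# A mock manager plays the role of Context.objects in this sketch.
manager = MagicMock()
result = update_data(manager, "text", "project", 1, source="github")

# The lookup must include source alongside the entity fields.
manager.get.assert_called_once_with(entity_type="project", entity_id=1, source="github")
```

If source were dropped from the lookup, assert_called_once_with would raise an AssertionError, which is what makes this style of test sensitive to the change.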

backend/apps/ai/migrations/0010_alter_context_unique_together.py (1)

1-17: The duplicate‐check script couldn’t run here due to the missing configurations dependency. Please execute the provided script in your local (fully set up) environment and confirm whether any existing Context rows violate the new (entity_type, entity_id, source) constraint.
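
The script referred to above is not reproduced in this conversation. As a rough illustration only, the same kind of check can be sketched in plain Python: group rows by the new key and report any key that occurs more than once. The find_duplicate_keys helper and the sample rows are assumptions, not the script CodeRabbit provided.

```python
from collections import Counter


def find_duplicate_keys(rows):
    """Return {(entity_type, entity_id, source): count} for keys seen more than once."""
    counts = Counter(
        (row["entity_type"], row["entity_id"], row["source"]) for row in rows
    )
    return {key: count for key, count in counts.items() if count > 1}


# Sample rows standing in for existing Context records (made-up data).
rows = [
    {"entity_type": "project", "entity_id": 1, "source": ""},
    {"entity_type": "project", "entity_id": 1, "source": ""},
    {"entity_type": "project", "entity_id": 1, "source": "slack"},
    {"entity_type": "chapter", "entity_id": 2, "source": ""},
]

duplicates = find_duplicate_keys(rows)
```

Any key reported here would make the migration fail until the extra rows are merged or removed. Against a real database, a similar grouping could presumably be expressed with the Django ORM's values(), annotate(), and filter() calls, though that is not tested here.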



@sonarqubecloud

sonarqubecloud bot commented Oct 1, 2025

@arkid15r arkid15r enabled auto-merge (squash) October 2, 2025 02:37
@arkid15r arkid15r disabled auto-merge October 2, 2025 02:38
@arkid15r arkid15r merged commit 0f839b9 into OWASP:feature/nestbot-ai-assistant Oct 2, 2025
3 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Oct 2, 2025
2 tasks
mirnumaan pushed a commit to mirnumaan/Nest that referenced this pull request Nov 16, 2025
github-merge-queue bot pushed a commit that referenced this pull request Nov 19, 2025
* add the aria-label

* fixed the code style, fixed by the formatter

* Handled the undefined repositoryName in aria-label.

* fix the white spaces

* Add aria-labels to interactive elements for WCAG 2.1 compliance

* Sync www-repopsitories (#2164)

* spelling fixes and tests

* sonar and code rabbit suggestions implemented

* json chunking and suggestions implemented

* code rabbit and sonar qube suggestions

* code rabbit suggestions

* suggestions implemented

* github advance security addressed

* tests fixed

* fixed tests

* Clean up backend/test_commands.py

---------

Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com>
Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>

* Update docker-compose/local.yaml

* Nestbot MVP (#2113)

* Sync www-repopsitories (#2164)

* spelling fixes and tests

* sonar and code rabbit suggestions implemented

* json chunking and suggestions implemented

* code rabbit and sonar qube suggestions

* code rabbit suggestions

* suggestions implemented

* github advance security addressed

* tests fixed

* fixed tests

* Clean up backend/test_commands.py

---------

Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com>
Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>

* Consolidate code commits

* Update cspell/custom-dict.txt

* Update docker-compose/local.yaml

* local yaml worder volume fix

* instance check

* poetry file updated

---------

Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com>
Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>

* fixed the duplication error for chunks and context (#2343)

* Fix slack and duplication errors (#2352)

* fix slack and duplication errors

* code rabbit suggestions

* integrity error solved

* using set

* Update code

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>

* Response improvements and refactoring (#2407)

* improvements and refactoring

* added prompt checks and tests for it

* question detection refining (#2443)

* question detection refining

* sonar qube fixes

* fix tests

* Agentic rag (#2432)

* agentic rag

* spelling fixes

* code rabbit and sonar qube suggestions

* code rabbit suggestions

* refining

* fix test

* refining

* added question detectoor to nestbot mentions (#2473)

* Merge NestBot AI Assistant feature branch

* Update docker-compose/local.yaml

* Update backend/Makefile

* fix the white spaces

* Add aria-labels to interactive elements for WCAG 2.1 compliance

* Fix equality checks with floating point values (#2658)

* Edit in test cases for the new aira-label update

* More edits to make the test more unbreakable and few regex edits .

* Revert one change regards two seperate Contributors

* Regex check fix

* Global 10-second timeout and increased other mobile timeout to fix the  Mobile Safari flaky tests

---------

Co-authored-by: Dishant Miyani <dishantmiyani1804@gmail.com>
Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com>
Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
Co-authored-by: Kate Golovanova <kate@kgthreads.com>