
Conversation

Contributor

@DecodeX15 DecodeX15 commented Oct 7, 2025

Title:
fix(#569): remove clusters with zero face_count from persisted face data

Description:
This PR filters out clusters where face_count === 0 before updating the main array.
By excluding these empty clusters, face data no longer persists after the source folder is removed.

Uploading Recording 2025-10-07 095004.mp4…

Summary by CodeRabbit

  • New Features

    • Automatic removal of empty face clusters after folder add/sync/delete.
    • App clears cache on startup to ensure fresh data.
  • Refactor

    • Backend/sync service ports updated to 52123/52124; startup and service handling consolidated.
    • Removed in-app "Restart Server" control and automatic server stop on close.
  • Documentation

    • Setup and guides updated for Conda-based workflow, ports, and microservice steps.
  • Chores

    • Added LICENSE and COPYRIGHT files.


Contributor

coderabbitai bot commented Oct 7, 2025

📝 Walkthrough

Walkthrough

Adds a database and utility function to delete empty face clusters and invokes it after folder add/sync/delete flows; removes the old microservice management module and shifts service startup/ports across frontend, backend, and sync-microservice artifacts; updates docs, CI, and packaging artifacts accordingly.

Changes

  • Cluster cleanup (DB + util) — backend/app/database/face_clusters.py, backend/app/utils/face_clusters.py
    New DB function db_delete_empty_clusters() and wrapper cluster_util_delete_empty_clusters() that DELETE clusters with no faces, with commit/rollback handling and logging; returns the deleted count.
  • Folder routes integration — backend/app/routes/folders.py
    Calls cluster_util_delete_empty_clusters() after the post_folder_add_sequence, post_sync_folder_sequence, and delete_folders flows; imports adjusted accordingly.
  • Frontend service orchestration & config — frontend/src/utils/serverUtils.ts, frontend/src/main.tsx, frontend/src/config/Backend.ts, frontend/src/components/OnboardingSteps/ServerCheck.tsx, frontend/src/components/FaceCollections.tsx, frontend/src/hooks/useFolderOperations.tsx, frontend/src/pages/SettingsPage/components/ApplicationControlsCard.tsx
    Reworked startServer to start the backend and sync-microservice processes via Tauri spawn commands; removed stopServer and the restart UI/button; updated backend/sync URLs and ports (52123/52124); increased health-check retries; minor UI feedback change and a type-assertion tweak in FaceCollections.
  • Tauri/native service handlers & capabilities — frontend/src-tauri/src/services/mod.rs, frontend/src-tauri/src/main.rs, frontend/src-tauri/capabilities/migrated.json, frontend/src-tauri/tauri.conf.json
    Renamed the handler to get_resources_folder_path, updated capability entries (StartBackend*/StartSyncService*), and adjusted resource path resolution and CSP connect-src ports.
  • Sync microservice runtime & config — sync-microservice/*, sync-microservice/app/config/settings.py, sync-microservice/main.py, sync-microservice/app/core/lifespan.py
    Changed ports to 52124, introduced a DB connection retry loop on startup, adjusted router prefixes/tags, and updated the Uvicorn host binding to localhost.
  • Removed microservice manager — backend/app/utils/microservice.py
    Entire module removed (previously handled starting/managing the sync microservice, venv handling, process I/O streaming, and logging).
  • Backend startup & settings — backend/main.py, backend/app/config/settings.py, app.py
    Removed the microservice auto-start call, updated the backend dev host/port to localhost:52123, added an app startup hook to invalidate the cache, and adjusted cache refresh behavior.
  • Logging behavior changes — backend/app/logging/setup_logging.py, sync-microservice/app/logging/setup_logging.py
    InterceptHandler.emit reworked to mutate the LogRecord message and dispatch to root handlers directly (avoiding the get_logger path).
  • Docs, scripts & packaging — docs/*, scripts/*, LICENSE.md, COPYRIGHT.md, .github/workflows/build-and-release.yml, .pre-commit-config.yaml, scripts/setup.*, frontend/src-tauri/* capabilities
    Updated setup guides for the Conda-based workflow and new ports, adjusted scripts for sync-microservice setup, added LICENSE/COPYRIGHT, altered a CI step to copy utils into dist, updated CSS wrapping rules, and changed pre-commit ruff/black revs.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • rahulharpal1603

Poem

🐰
I hop through code with nimble paws,
I trim the nests and mind the laws.
Empty clusters off they go,
Services wake and ports now flow.
A tidy garden, soft and slow.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: docstring coverage is 73.68%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the title clearly and specifically summarizes the main change, fixing a face count inconsistency issue that occurs when folders are removed.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 20d16e5 and 24dba76.

📒 Files selected for processing (1)
  • frontend/src/components/FaceCollections.tsx (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
frontend/src/components/FaceCollections.tsx (1)
frontend/src/types/Media.ts (1)
  • Cluster (42-47)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Backend Tests
  • GitHub Check: Tauri Build Check (macos-latest, --target aarch64-apple-darwin)
  • GitHub Check: Tauri Build Check (windows-latest)
  • GitHub Check: Tauri Build Check (ubuntu-22.04)

Comment on lines 23 to 32
  useEffect(() => {
-   if (clustersSuccess && clustersData?.data?.clusters) {
-     const clusters = (clustersData.data.clusters || []) as Cluster[];
-     dispatch(setClusters(clusters));
-   }
- }, [clustersData, clustersSuccess, dispatch]);
+   if (clustersSuccess && clustersData?.data?.clusters) {
+     const clusters = (clustersData.data.clusters || []) as Cluster[];
+
+     const filteredClusters = clusters.filter(c => c.face_count > 0);
+
+     dispatch(setClusters(filteredClusters));
+     console.log(filteredClusters);
+   }
+ }, [clustersData, clustersSuccess, dispatch]);
Contributor


🛠️ Refactor suggestion | 🟠 Major

Filtering logic is correct; remove console.log and improve variable naming.

The core fix correctly filters out clusters with face_count === 0, which addresses the stated objective of preventing face data persistence after folder removal.

However:

  1. Remove the console.log on line 30 before merging to production. Logging filtered data is helpful during development but should not ship to users.

  2. The local variable clusters on line 25 shadows the Redux state variable from line 16, making the code harder to follow. Consider renaming the local variable to allClusters or fetchedClusters.

  3. The || [] fallback on line 25 is redundant since line 24 already ensures clustersData?.data?.clusters exists via optional chaining.

Apply this diff to address the issues:

  useEffect(() => {
   if (clustersSuccess && clustersData?.data?.clusters) {
-    const clusters = (clustersData.data.clusters || []) as Cluster[];
+    const fetchedClusters = clustersData.data.clusters as Cluster[];

-    const filteredClusters = clusters.filter(c => c.face_count > 0);
+    const filteredClusters = fetchedClusters.filter(c => c.face_count > 0);

     dispatch(setClusters(filteredClusters));
-    console.log(filteredClusters);
   }
 }, [clustersData, clustersSuccess, dispatch]);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
useEffect(() => {
  if (clustersSuccess && clustersData?.data?.clusters) {
    const fetchedClusters = clustersData.data.clusters as Cluster[];
    const filteredClusters = fetchedClusters.filter(c => c.face_count > 0);
    dispatch(setClusters(filteredClusters));
  }
}, [clustersData, clustersSuccess, dispatch]);
🤖 Prompt for AI Agents
In frontend/src/components/FaceCollections.tsx around lines 23 to 32, remove the
development console.log call, rename the local variable `clusters` to
`fetchedClusters` (or `allClusters`) to avoid shadowing the Redux state
variable, and eliminate the redundant `|| []` fallback since the optional
chaining already guarantees the existence check; then dispatch the filtered
results using the new variable name.

@DecodeX15
Contributor Author

@rahulharpal1603 sir please review this

@rahulharpal1603 rahulharpal1603 linked an issue Oct 28, 2025 that may be closed by this pull request
@github-actions github-actions bot added the backend and bug (Something isn't working) labels on Oct 31, 2025
Comment on lines 26 to 30

const filteredClusters = clusters.filter((c) => c.face_count > 0);

dispatch(setClusters(filteredClusters));
// console.log(filteredClusters);
Contributor


This is not the correct way. You need to ensure that empty face clusters are deleted at the database level.

Contributor Author


doing...

Contributor Author


> This is not the correct way. You need to ensure that empty face clusters are deleted at the database level.

sir can you please elaborate more ??

Contributor Author


ok sir i will write a route to delete all face id with 0 face count ??

Contributor Author


tell me sir ??

Contributor Author


sir pleaseeee review

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
frontend/src/components/FaceCollections.tsx (1)

25-30: Good progress on console.log; complete the refactor by addressing variable shadowing.

The filtering logic correctly solves the face count inconsistency issue. The console.log has been commented out (good improvement from the previous review), but two issues from the prior review remain:

  1. Line 25: The local variable clusters shadows the Redux state variable clusters from line 16, reducing code clarity.

  2. Line 25: The || [] fallback is redundant since line 24's optional chaining (clustersData?.data?.clusters) already guarantees the property exists.

Apply this diff to complete the cleanup:

  useEffect(() => {
    if (clustersSuccess && clustersData?.data?.clusters) {
-     const clusters = (clustersData.data.clusters || []) as Cluster[];
+     const fetchedClusters = clustersData.data.clusters as Cluster[];

-     const filteredClusters = clusters.filter((c) => c.face_count > 0);
+     const filteredClusters = fetchedClusters.filter((c) => c.face_count > 0);

      dispatch(setClusters(filteredClusters));
-     // console.log(filteredClusters);
    }
  }, [clustersData, clustersSuccess, dispatch]);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ae9de04 and 85cb38c.

📒 Files selected for processing (2)
  • docs/backend/backend_python/openapi.json (2 hunks)
  • frontend/src/components/FaceCollections.tsx (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Tauri Build Check (macos-latest, --target aarch64-apple-darwin)
  • GitHub Check: Tauri Build Check (windows-latest)
  • GitHub Check: Tauri Build Check (ubuntu-22.04)
  • GitHub Check: Backend Tests
🔇 Additional comments (1)
docs/backend/backend_python/openapi.json (1)

1081-1083: Verify whether these OpenAPI schema changes are intentional.

The changes to this file don't appear to be directly related to the PR objective of fixing face count inconsistency after folder removal:

  1. Line 1081-1083: The input_type parameter now uses a direct $ref with sibling properties (valid in OpenAPI 3.1), but this seems unrelated to the face clustering fix.

  2. Line 2158: Adding "additionalProperties": true to ImageInCluster.Metadata broadens the validation to accept any additional metadata properties.

These look like incidental schema regeneration changes. Please confirm whether these modifications are intentional and necessary for this PR, or if they should be reverted or moved to a separate PR focused on API schema updates.

Also applies to: 2158-2158

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (1)
frontend/src/components/FaceCollections.tsx (1)

25-25: Rename local variable to avoid shadowing Redux state.

The local variable clusters shadows the Redux state variable from line 16, making the code harder to follow and maintain. Consider renaming to fetchedClusters or allClusters.

Apply this diff:

-    const clusters = clustersData.data.clusters as Cluster[];
-    dispatch(setClusters(clusters));
+    const fetchedClusters = clustersData.data.clusters as Cluster[];
+    dispatch(setClusters(fetchedClusters));
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 85cb38c and 324e5ca.

📒 Files selected for processing (5)
  • backend/app/database/face_clusters.py (2 hunks)
  • backend/app/routes/folders.py (4 hunks)
  • backend/app/utils/face_clusters.py (1 hunks)
  • frontend/src/components/FaceCollections.tsx (1 hunks)
  • frontend/src/components/Media/MediaView.tsx (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • frontend/src/components/Media/MediaView.tsx
🧰 Additional context used
🧬 Code graph analysis (3)
frontend/src/components/FaceCollections.tsx (1)
frontend/src/types/Media.ts (1)
  • Cluster (51-56)
backend/app/database/face_clusters.py (1)
backend/app/logging/setup_logging.py (2)
  • setup_logging (134-196)
  • get_logger (199-209)
backend/app/routes/folders.py (3)
backend/app/database/folders.py (1)
  • db_delete_folders_batch (155-187)
backend/app/utils/API.py (1)
  • API_util_restart_sync_microservice_watcher (8-33)
backend/app/utils/face_clusters.py (1)
  • cluster_util_delete_empty_clusters (592-624)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Backend Tests
  • GitHub Check: Tauri Build Check (ubuntu-22.04)
  • GitHub Check: Tauri Build Check (macos-latest, --target aarch64-apple-darwin)
  • GitHub Check: Tauri Build Check (windows-latest)
🔇 Additional comments (4)
backend/app/database/face_clusters.py (1)

6-7: LGTM!

Logger initialization follows the correct pattern.

backend/app/routes/folders.py (2)

80-81: LGTM with a note on error handling.

The placement after API_util_restart_sync_microservice_watcher() is appropriate. Since this runs in a background executor, errors are logged but don't block the API response. However, note that the current implementation silently returns 0 on failure (see review comment on cluster_util_delete_empty_clusters()).


131-132: LGTM!

Consistent placement with the post-folder-add sequence. This executes asynchronously in the background executor.

frontend/src/components/FaceCollections.tsx (1)

23-28: Backend empty cluster deletion is correctly implemented.

Verification confirms the backend properly handles deletion of empty clusters at the database level. The cluster_util_delete_empty_clusters() function is called immediately after db_delete_folders_batch() in the delete-folders endpoint (backend/app/routes/folders.py:334), ensuring orphaned clusters are removed before the frontend fetches updated cluster data. The frontend changes correctly rely on this backend cleanup.

Comment on lines +330 to +352
def db_delete_empty_clusters() -> int:
    """
    Delete all clusters that have no faces associated with them.
    """
    conn = sqlite3.connect(DATABASE_PATH)
    cursor = conn.cursor()

    try:
        cursor.execute(
            """
            DELETE FROM face_clusters
            WHERE cluster_id NOT IN (
                SELECT DISTINCT cluster_id FROM faces WHERE cluster_id IS NOT NULL
            )
            """
        )

        deleted_count = cursor.rowcount
        conn.commit()
        logger.info(f"Deleted {deleted_count} empty clusters.")
        return deleted_count
    finally:
        conn.close()
Contributor


⚠️ Potential issue | 🟠 Major

Optimize the subquery and add error handling.

The current implementation has two issues:

  1. Performance concern: The NOT IN subquery can be slow with large datasets because it doesn't leverage indexes efficiently. With many clusters and faces, this could block the database.

  2. Missing error handling: If the query fails, no rollback is performed, potentially leaving the transaction in an inconsistent state.

Apply this diff to improve performance and add error handling:

 def db_delete_empty_clusters() -> int:
     """
     Delete all clusters that have no faces associated with them.
     """
     conn = sqlite3.connect(DATABASE_PATH)
     cursor = conn.cursor()
 
     try:
         cursor.execute(
             """
             DELETE FROM face_clusters
-            WHERE cluster_id NOT IN (
-                SELECT DISTINCT cluster_id FROM faces WHERE cluster_id IS NOT NULL
+            WHERE NOT EXISTS (
+                SELECT 1 FROM faces 
+                WHERE faces.cluster_id = face_clusters.cluster_id
             )
             """
         )
 
         deleted_count = cursor.rowcount
         conn.commit()
         logger.info(f"Deleted {deleted_count} empty clusters.")
         return deleted_count
+    except Exception as e:
+        logger.error(f"Error deleting empty clusters: {e}")
+        conn.rollback()
+        raise
     finally:
         conn.close()

The NOT EXISTS correlated subquery allows the database to use indexes efficiently and short-circuit as soon as a matching face is found.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def db_delete_empty_clusters() -> int:
    """
    Delete all clusters that have no faces associated with them.
    """
    conn = sqlite3.connect(DATABASE_PATH)
    cursor = conn.cursor()

    try:
        cursor.execute(
            """
            DELETE FROM face_clusters
            WHERE NOT EXISTS (
                SELECT 1 FROM faces
                WHERE faces.cluster_id = face_clusters.cluster_id
            )
            """
        )

        deleted_count = cursor.rowcount
        conn.commit()
        logger.info(f"Deleted {deleted_count} empty clusters.")
        return deleted_count
    except Exception as e:
        logger.error(f"Error deleting empty clusters: {e}")
        conn.rollback()
        raise
    finally:
        conn.close()
🤖 Prompt for AI Agents
In backend/app/database/face_clusters.py around lines 330 to 352, replace the
current DELETE ... WHERE cluster_id NOT IN (...) with a correlated NOT EXISTS
subquery to improve query planning and index usage (e.g., DELETE FROM
face_clusters fc WHERE NOT EXISTS (SELECT 1 FROM faces f WHERE f.cluster_id =
fc.cluster_id)); also wrap the execute/commit in a try/except block that calls
conn.rollback() on exception, logs the exception details, and re-raises the
error (ensure conn.close() remains in the finally block).


deleted_count = db_delete_folders_batch(request.folder_ids)

cluster_util_delete_empty_clusters()
Contributor


⚠️ Potential issue | 🟠 Major

Consider moving cluster cleanup to background processing.

Unlike the post_folder_add_sequence and post_sync_folder_sequence which run cleanup asynchronously in an executor, this call executes synchronously and blocks the API response. If there are many empty clusters to delete, this could cause:

  1. Delayed response to the client
  2. Potential timeout on slow databases
  3. Inconsistent user experience compared to other operations

Option 1 (Recommended): Move to background processing for consistency:

     deleted_count = db_delete_folders_batch(request.folder_ids)
-    cluster_util_delete_empty_clusters()
+    
+    # Clean up empty clusters in background
+    executor: ProcessPoolExecutor = app_state.executor
+    executor.submit(cluster_util_delete_empty_clusters)
+    
     return DeleteFoldersResponse(

Note: You'll need to add app_state=Depends(get_state) to the function signature.

Option 2: Keep synchronous if immediate consistency is required, but document the trade-off and consider adding a timeout.

Which approach is more appropriate depends on whether you need the clusters cleaned up before returning to the client.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In backend/app/routes/folders.py around line 334, the direct call to
cluster_util_delete_empty_clusters() runs synchronously and can block the API
response; move this cleanup to background processing for consistency with
post_folder_add_sequence/post_sync_folder_sequence by invoking
cluster_util_delete_empty_clusters via the app state executor (e.g.,
app_state.background or run_in_executor) so it runs asynchronously; update the
endpoint signature to accept app_state=Depends(get_state) and schedule the
cleanup task on that executor, keeping the endpoint response immediate.

Comment on lines +592 to +624
def cluster_util_delete_empty_clusters() -> int:
    """
    Delete all clusters that have no faces associated with them.

    Returns:
        int: Number of clusters deleted
    """
    import sqlite3
    from app.config.settings import DATABASE_PATH

    conn = sqlite3.connect(DATABASE_PATH)
    cursor = conn.cursor()

    try:
        cursor.execute(
            """
            DELETE FROM face_clusters
            WHERE cluster_id NOT IN (
                SELECT DISTINCT cluster_id FROM faces WHERE cluster_id IS NOT NULL
            )
            """
        )

        deleted_count = cursor.rowcount
        conn.commit()
        logger.info(f"Deleted {deleted_count} empty clusters.")
        return deleted_count
    except Exception as e:
        logger.error(f"Error deleting empty clusters: {e}")
        conn.rollback()
        return 0
    finally:
        conn.close()
Contributor


🛠️ Refactor suggestion | 🟠 Major

Refactor to use the database layer instead of duplicating SQL logic.

This function duplicates the exact same SQL logic from db_delete_empty_clusters() in backend/app/database/face_clusters.py. This creates several issues:

  1. Architecture violation: Utils layer should not contain raw SQL queries; database operations should be encapsulated in the database layer.
  2. Code duplication: Identical SQL in two places makes maintenance harder and increases risk of divergence.
  3. Inconsistent error handling: This version swallows errors and returns 0, while the database version propagates exceptions.

Apply this diff to call the database layer instead:

+from app.database.face_clusters import db_delete_empty_clusters
+
 def cluster_util_delete_empty_clusters() -> int:
     """
     Delete all clusters that have no faces associated with them.
 
     Returns:
         int: Number of clusters deleted
     """
-    import sqlite3
-    from app.config.settings import DATABASE_PATH
-
-    conn = sqlite3.connect(DATABASE_PATH)
-    cursor = conn.cursor()
-
     try:
-        cursor.execute(
-            """
-            DELETE FROM face_clusters
-            WHERE cluster_id NOT IN (
-                SELECT DISTINCT cluster_id FROM faces WHERE cluster_id IS NOT NULL
-            )
-            """
-        )
-
-        deleted_count = cursor.rowcount
-        conn.commit()
-        logger.info(f"Deleted {deleted_count} empty clusters.")
-        return deleted_count
+        return db_delete_empty_clusters()
     except Exception as e:
         logger.error(f"Error deleting empty clusters: {e}")
-        conn.rollback()
         return 0
-    finally:
-        conn.close()

This maintains the error-swallowing behavior at the utils layer if desired, while eliminating the duplication.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
from app.database.face_clusters import db_delete_empty_clusters


def cluster_util_delete_empty_clusters() -> int:
    """
    Delete all clusters that have no faces associated with them.

    Returns:
        int: Number of clusters deleted
    """
    try:
        return db_delete_empty_clusters()
    except Exception as e:
        logger.error(f"Error deleting empty clusters: {e}")
        return 0
🤖 Prompt for AI Agents
In backend/app/utils/face_clusters.py around lines 592 to 624, replace the
inline SQLite deletion SQL with a call into the database layer: import the
db_delete_empty_clusters function from backend.app.database.face_clusters, call
it from this util, log the returned deleted count and return it; keep the
current try/except/finally shape if you want to continue swallowing errors
(catch Exception, log error, return 0) but remove the sqlite3 connection,
cursor, raw SQL, commit/rollback and conn.close() so DB handling is delegated
entirely to db_delete_empty_clusters.

@DecodeX15
Contributor Author

@rahulharpal1603 sir can u please review it

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
frontend/src/main.tsx (1)

6-22: Restore shutdown on window close to avoid orphaned services.

The close handler no longer stops backend/sync processes, which can leave background services running and cause port conflicts on the next launch. Please reintroduce a stop mechanism (e.g., a stopServer that terminates both processes) and invoke it here.

💡 Suggested direction (requires implementing stopServer in serverUtils)
-import { startServer } from './utils/serverUtils';
+import { startServer, stopServer } from './utils/serverUtils';

@@
     const { getCurrentWindow } = await import('@tauri-apps/api/window');
     await getCurrentWindow().onCloseRequested(async () => {
-      // code to stop the server
+      await stopServer();
     });
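A minimal sketch of one possible stopServer, assuming startServer keeps the Child handles returned by @tauri-apps/plugin-shell's spawn; the handle variables and the StartBackend/StartSyncService command names below are assumptions based on the capability entries, not code from this PR:

// serverUtils.ts (sketch) — keep the spawned children so they can be killed on close
import { Command, type Child } from '@tauri-apps/plugin-shell';

let backendChild: Child | null = null;
let syncChild: Child | null = null;

export async function startServer(): Promise<void> {
  // The command names must match the shell-scope entries in the Tauri capabilities file.
  backendChild = await Command.create('StartBackend').spawn();
  syncChild = await Command.create('StartSyncService').spawn();
}

export async function stopServer(): Promise<void> {
  // Terminate both services; ignore errors if a process has already exited.
  await Promise.allSettled([backendChild?.kill(), syncChild?.kill()]);
  backendChild = null;
  syncChild = null;
}

Wiring this stopServer into the onCloseRequested handler, as in the diff above, would then terminate both services before the window closes and avoid orphaned processes on the next launch.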
🤖 Fix all issues with AI agents
In @.pre-commit-config.yaml:
- Around line 2-3: Update the pinned pre-commit tool versions: change the
ruff-pre-commit entry's rev from v0.4.10 to v0.14.14 and update the black
entry's rev to v26.1.0 unless the project must support Python 3.8 or 3.9, in
which case keep the existing black rev (24.4.2); ensure you update the rev
values for the two repo entries referenced
(https://github.com/astral-sh/ruff-pre-commit and the black pre-commit repo) so
the pre-commit config reflects the latest compatible releases.

In `@backend/app/logging/setup_logging.py`:
- Around line 246-255: When re-dispatching the modified LogRecord to other
handlers in setup_logging.py, respect each handler's level instead of calling
handler.handle(record) unconditionally; update the loop over
root_logger.handlers to skip self and only invoke handler.handle(record) when
record.levelno >= handler.level (or equivalently call handler.filter(record) to
apply filters), e.g., in the block iterating root_logger.handlers, add a guard
using handler.level and/or handler.filter before calling handler.handle(record)
so handlers don't receive records below their configured level.

In `@docs/backend/backend_rust/api.md`:
- Around line 17-19: The example has a variable name mismatch and outdated
comment: the code calls invoke("get_resources_folder_path") and assigns to
resourcesFolderPath but logs serverPath; replace the console.log reference to
use resourcesFolderPath (not serverPath) and update the preceding comment text
to accurate terminology (e.g., "Get resources folder path") so the example is
correct and consistent with the invoke call.

In `@scripts/setup.js`:
- Around line 19-44: Documentation and scripts disagree: setup.js (using
os.platform() checks and debianVersionPath/bashScript logic) and setup.sh
currently reject macOS (Darwin) but docs claim macOS is supported; either update
docs to remove macOS claims or add macOS support to the scripts. To fix, choose
one approach: (A) update docs/overview/features.md, docs/Manual_Setup_Guide.md,
and CONTRIBUTING.md to remove or qualify macOS support and clearly state
supported platforms (Windows and Debian-based Linux), or (B) implement macOS
handling in setup.js by adding an os.platform() === 'darwin' branch that detects
macOS, sets command/args appropriately (or invokes setup.sh with
macOS-compatible commands), ensures executable permissions similar to the
debianVersionPath branch, and update setup.sh with a macOS code path; update
logging messages that currently list supported OSes to include or exclude macOS
accordingly.

In `@scripts/setup.sh`:
- Around line 102-110: The script uses an unguarded "cd sync-microservice" so if
the directory change fails subsequent virtualenv and pip commands run in the
wrong place; update the block around the "cd sync-microservice" command to check
the exit status (or use a safe conditional like "cd sync-microservice || { echo
...; exit 1; }") and exit with a non-zero status on failure, ensuring you only
run "python -m venv .sync-env", "source .sync-env/bin/activate", "pip install -r
requirements.txt" and the following "deactivate" when the cd succeeded; apply
the same guard/behavior to the earlier backend setup block that also performs an
unguarded cd.

In `@sync-microservice/app/logging/setup_logging.py`:
- Around line 254-263: The current re-dispatch loop in setup_logging.py bypasses
handler-level filtering by calling handler.handle(record) directly; update the
loop that iterates root_logger.handlers to respect handler filters/levels either
by using root_logger.callHandlers(record) or by guarding each re-dispatch with
handler.level and filters (e.g., for each handler in root_logger.handlers (skip
self) check handler.level and handler.filter(record) / record.levelno >=
handler.level before calling handler.handle(record)), so only handlers that
should accept the record receive it.

In `@sync-microservice/main.py`:
- Around line 50-51: The server is bound to "localhost" which prevents
remote/container access; in main.py locate the server startup call that sets
host and port (the uvicorn.run or equivalent where host="localhost", port=52124)
and change host back to "0.0.0.0" to allow external and container network
access, or if the service must be local-only, add a clear comment/docstring next
to that host setting (and update any deployment docs) stating the intentional
localhost binding instead of changing behavior.
🧹 Nitpick comments (7)
app.py (1)

17-18: Batch the invalidations if the API accepts multiple keys.

This reduces redundant cache client round‑trips and makes the invalidation more atomic.

♻️ Suggested change
-    invalidate_cache("albums:get_all_albums")
-    invalidate_cache("folder_structure:get_folder_structure")
+    invalidate_cache("albums:get_all_albums", "folder_structure:get_folder_structure")
sync-microservice/app/core/lifespan.py (1)

3-57: Avoid blocking the event loop during startup retries.

time.sleep() in an async lifespan blocks the event loop; switch to await asyncio.sleep() and use time.monotonic() for reliable timeout tracking (also counts db check time).

♻️ Proposed fix (non-blocking sleep + monotonic timeout)
-import time
+import asyncio
+import time
@@
-        start_time = time.time()
+        start_time = time.monotonic()
@@
-            elapsed_time = time.time() - start_time
-
             if db_check_database_connection():
                 logger.info(f"Database connection successful on attempt {attempt}")
                 break
+            elapsed_time = time.monotonic() - start_time
@@
-            time.sleep(retry_interval)
+            await asyncio.sleep(retry_interval)
frontend/src-tauri/capabilities/migrated.json (1)

95-99: Verify spawn targets resolve in packaged builds (and PowerShell invocation).

These entries assume ./PictoPy_Server and ./PictoPy_Sync are in the process CWD and that PowerShell will execute them as-is. In packaged Tauri apps that often isn’t true, which can prevent the backend/sync services from starting. Please verify on packaged Windows/macOS/Linux builds; if these are bundled binaries, consider spawning them directly (or as sidecars) and using explicit .exe or -Command on Windows to avoid PowerShell argument parsing issues.

Also applies to: 106-110, 117-121, 128-131
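If the services end up bundled as Tauri sidecars instead, spawning them through the sidecar API sidesteps the CWD and PowerShell quoting concerns; a hedged sketch, where the binaries/PictoPy_Server and binaries/PictoPy_Sync names are illustrative rather than taken from this PR:

import { Command } from '@tauri-apps/plugin-shell';

// Assumes the executables are declared under bundle.externalBin in tauri.conf.json.
export async function startSidecarServices() {
  // Tauri resolves the platform-specific binary (e.g. the .exe suffix on Windows) from the bundle.
  const backend = await Command.sidecar('binaries/PictoPy_Server').spawn();
  const sync = await Command.sidecar('binaries/PictoPy_Sync').spawn();
  return { backend, sync };
}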

frontend/src/components/OnboardingSteps/ServerCheck.tsx (1)

23-36: Consider whether 60 retries aligns with onboarding UX expectations.

retry: 60 with a fixed retryDelay: 1000 means users wait ~60 seconds before the error path triggers during health checks. Confirm this timeout window is appropriate for initial server connectivity validation, or make the retry budget configurable per environment.
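One hedged way to make the budget configurable per environment, assuming the component uses TanStack Query (as the retry/retryDelay options suggest) and that the frontend is built with Vite; VITE_HEALTH_CHECK_RETRIES is an illustrative variable, not an existing setting:

import { useQuery } from '@tanstack/react-query';

// Hypothetical env-driven retry budget with the current value (60) as the fallback.
const HEALTH_CHECK_RETRIES = Number(import.meta.env.VITE_HEALTH_CHECK_RETRIES ?? 60);

export function useServerHealth(checkHealth: () => Promise<boolean>) {
  return useQuery({
    queryKey: ['server-health'],
    queryFn: checkHealth,
    retry: HEALTH_CHECK_RETRIES,
    retryDelay: 1000, // fixed 1 s between attempts, so ~1 minute worst case by default
  });
}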

backend/main.py (1)

73-74: Centralize host/port to avoid config drift.
Hard-coding host/port in multiple places risks mismatch and makes non-local deployments harder. Consider sourcing from env/settings and reusing for both OpenAPI + runtime config (and confirm localhost binding is intended outside dev).

💡 Proposed refactor
+BACKEND_HOST = os.getenv("BACKEND_HOST", "localhost")
+BACKEND_PORT = int(os.getenv("BACKEND_PORT", "52123"))
+
 app = FastAPI(
     lifespan=lifespan,
     title="PictoPy",
@@
     servers=[
-        {"url": "http://localhost:52123", "description": "Local Development server"}
+        {
+            "url": f"http://{BACKEND_HOST}:{BACKEND_PORT}",
+            "description": "Local Development server",
+        }
     ],
 )
@@
     config = Config(
         app=app,
-        host="localhost",
-        port=52123,
+        host=BACKEND_HOST,
+        port=BACKEND_PORT,
         log_level="info",
         log_config=None,  # This is crucial - disable Uvicorn's default logging config
     )

Also applies to: 142-145

frontend/src-tauri/src/services/mod.rs (1)

5-10: Add a defensive existence check for the resolved resources path.
If the bundled layout still uses resources/backend, this will silently return a non-existent path and fail later. A quick check provides a clearer error and avoids downstream surprises.

🧩 Proposed guard
 pub fn get_resources_folder_path(handle: tauri::AppHandle) -> Result<String, String> {
     let resource_path = handle
         .path()
         .resolve("resources", BaseDirectory::Resource)
         .map_err(|e| e.to_string())?;
+    if !resource_path.exists() {
+        return Err(format!(
+            "Resources folder not found: {}",
+            resource_path.to_string_lossy()
+        ));
+    }
     Ok(resource_path.to_string_lossy().to_string())
 }
docs/Script_Setup_Guide.md (1)

8-8: Minor style improvement for the note.

Consider rephrasing for better readability.

📝 Suggested improvement
-> Note that the Step No. 8 given below is not mentioned in the video, please follow that step also.
+> Note that Step 8 below is not covered in the video. Please follow that step as well.

Comment on lines +2 to +3
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.4.10
Contributor


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

latest release astral-sh ruff-pre-commit v0.4.10 current supported

💡 Result:

astral-sh/ruff-pre-commit v0.4.10 is not current.

  • The latest release of astral-sh/ruff-pre-commit is v0.14.14 (Jan 22, 2026). [1]
  • The repository README currently recommends using rev: v0.14.14 in .pre-commit-config.yaml. [2]

So v0.4.10 is outdated; if you want a currently supported/current version, update your rev to v0.14.14. [1][2]

Sources: [1] [2]


🌐 Web query:

latest release psf black 24.4.2 current supported

💡 Result:

  • Latest PSF Black release: 26.1.0 (released Jan 18, 2026). Supported runtime: Python ≥ 3.10 (classifiers include 3.10–3.14). [1][2]
  • Black 24.4.2: released Apr 26, 2024; supported runtime: Python ≥ 3.8 (classifiers 3.8–3.12). It is not the current release. [3]
  • Support changes since 24.4.2: Black dropped Python 3.8 in 24.10.0 and dropped Python 3.9 in 25.12.0 (which is why current Black now requires Python 3.10+). [4]

Sources: [1] [2] [3] [4]


Update pinned tool versions—both are significantly outdated.

  • ruff-pre-commit v0.4.10 is from 2024; the latest release is v0.14.14 (Jan 22, 2026). Update rev to v0.14.14.
  • black 24.4.2 is from April 2024; the latest release is 26.1.0 (Jan 18, 2026). However, note that Black 25.12.0+ requires Python ≥ 3.10 (it dropped Python 3.8 in v24.10.0 and Python 3.9 in v25.12.0). If your project must support Python 3.8 or 3.9, keep 24.4.2; otherwise, update to 26.1.0.

Also applies to: 8-9

🤖 Prompt for AI Agents
In @.pre-commit-config.yaml around lines 2 - 3, Update the pinned pre-commit
tool versions: change the ruff-pre-commit entry's rev from v0.4.10 to v0.14.14
and update the black entry's rev to v26.1.0 unless the project must support
Python 3.8 or 3.9, in which case keep the existing black rev (24.4.2); ensure
you update the rev values for the two repo entries referenced
(https://github.com/astral-sh/ruff-pre-commit and the black pre-commit repo) so
the pre-commit config reflects the latest compatible releases.

Comment on lines +246 to +255
        record.msg = f"[{module_name}] {msg}"
        record.args = ()
        # Clear exception / stack info to avoid duplicate traces
        record.exc_info = None
        record.stack_info = None

        root_logger = logging.getLogger()
        for handler in root_logger.handlers:
            if handler is not self:
                handler.handle(record)
Contributor


⚠️ Potential issue | 🟠 Major

Respect handler log levels when re-dispatching records.

Calling handler.handle(record) directly bypasses handler-level filtering, so DEBUG logs can leak into INFO+ handlers. Add a record.levelno >= handler.level guard (or delegate through Logger.callHandlers) to keep log levels consistent.

🛠️ Proposed fix
-        root_logger = logging.getLogger()
-        for handler in root_logger.handlers:
-            if handler is not self:
-                handler.handle(record)
+        root_logger = logging.getLogger()
+        for handler in root_logger.handlers:
+            if handler is self:
+                continue
+            if record.levelno >= handler.level:
+                handler.handle(record)
🤖 Prompt for AI Agents
In `@backend/app/logging/setup_logging.py` around lines 246 - 255, When
re-dispatching the modified LogRecord to other handlers in setup_logging.py,
respect each handler's level instead of calling handler.handle(record)
unconditionally; update the loop over root_logger.handlers to skip self and only
invoke handler.handle(record) when record.levelno >= handler.level (or
equivalently call handler.filter(record) to apply filters), e.g., in the block
iterating root_logger.handlers, add a guard using handler.level and/or
handler.filter before calling handler.handle(record) so handlers don't receive
records below their configured level.

Comment on lines 17 to +19
  // Example: Get server path
- const serverPath = await invoke("get_server_path");
- console.log("Server path:", serverPath);
+ const resourcesFolderPath = await invoke("get_resources_folder_path");
+ console.log("Resources folder path:", serverPath);
Contributor


⚠️ Potential issue | 🔴 Critical

Fix variable reference mismatch in example code.

Line 19 references serverPath, but the variable declared on line 18 is resourcesFolderPath. This will cause a ReferenceError if developers copy this example. Additionally, the comment on line 17 uses outdated terminology.

🐛 Proposed fix for variable mismatch and comment
-// Example: Get server path
-const resourcesFolderPath = await invoke("get_resources_folder_path");
-console.log("Resources folder path:", serverPath);
+// Example: Get resources folder path
+const resourcesFolderPath = await invoke("get_resources_folder_path");
+console.log("Resources folder path:", resourcesFolderPath);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Example: Get resources folder path
const resourcesFolderPath = await invoke("get_resources_folder_path");
console.log("Resources folder path:", resourcesFolderPath);
🤖 Prompt for AI Agents
In `@docs/backend/backend_rust/api.md` around lines 17 - 19, The example has a
variable name mismatch and outdated comment: the code calls
invoke("get_resources_folder_path") and assigns to resourcesFolderPath but logs
serverPath; replace the console.log reference to use resourcesFolderPath (not
serverPath) and update the preceding comment text to accurate terminology (e.g.,
"Get resources folder path") so the example is correct and consistent with the
invoke call.

Comment on lines +19 to 44
} else if (os.platform() === 'linux') {
  // Check if it's Debian-based Linux
  const debianVersionPath = '/etc/debian_version';
  if (fs.existsSync(debianVersionPath)) {
    // On Debian-based Linux, use the bash script
    command = bashScript;
    args = [];

    // Ensure the bash script is executable; if not, set the execute permission.
    try {
      fs.accessSync(bashScript, fs.constants.X_OK);
    } catch (err) {
      console.log(`File ${bashScript} is not executable. Setting execute permission...`);
      fs.chmodSync(bashScript, 0o755);
    }
  } else {
    console.error('Unsupported Linux distribution. This setup script only supports Debian-based Linux distributions.');
    console.error('Please install system dependencies manually and run the individual setup commands.');
    process.exit(1);
  }
} else {
  console.error(`Unsupported operating system: ${os.platform()}`);
  console.error('This setup script only supports Windows and Debian-based Linux distributions.');
  console.error('Please install system dependencies manually and run the individual setup commands.');
  process.exit(1);
}
Contributor


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's see the full context of setup.js to understand the change
cat -n scripts/setup.js

Repository: AOSSIE-Org/PictoPy

Length of output: 2311


🏁 Script executed:

# Check git history to see if macOS support was previously present
git log --oneline -p scripts/setup.js | head -200

Repository: AOSSIE-Org/PictoPy

Length of output: 2156


🏁 Script executed:

# Search for macOS/Darwin references in documentation
rg -i 'macos|darwin|osx' --type=md

Repository: AOSSIE-Org/PictoPy

Length of output: 482


🏁 Script executed:

# Check if there's a setup.sh and what platforms it supports
cat scripts/setup.sh 2>/dev/null | head -50

Repository: AOSSIE-Org/PictoPy

Length of output: 1512


🏁 Script executed:

# Check for any README or docs that mention platform support
find . -name 'README*' -o -name 'INSTALL*' | head -5

Repository: AOSSIE-Org/PictoPy

Length of output: 174


Documentation claims macOS support that the setup scripts don't provide.

The setup.js file is newly created (not modified from a previous version) and supports only Windows and Debian-based Linux. However, the project documentation claims cross-platform support for macOS:

  • docs/overview/features.md: "Available on major operating systems (Windows, macOS, Linux)"
  • docs/Manual_Setup_Guide.md: Mentions "Terminal (Linux/MacOS)"
  • CONTRIBUTING.md: References "On macOS/Linux"

Both setup.js and setup.sh reject macOS (Darwin) users. Update documentation to accurately reflect that macOS is not supported, or add macOS support to the setup scripts.

🤖 Prompt for AI Agents
In `@scripts/setup.js` around lines 19 - 44, Documentation and scripts disagree:
setup.js (using os.platform() checks and debianVersionPath/bashScript logic) and
setup.sh currently reject macOS (Darwin) but docs claim macOS is supported;
either update docs to remove macOS claims or add macOS support to the scripts.
To fix, choose one approach: (A) update docs/overview/features.md,
docs/Manual_Setup_Guide.md, and CONTRIBUTING.md to remove or qualify macOS
support and clearly state supported platforms (Windows and Debian-based Linux),
or (B) implement macOS handling in setup.js by adding an os.platform() ===
'darwin' branch that detects macOS, sets command/args appropriately (or invokes
setup.sh with macOS-compatible commands), ensures executable permissions similar
to the debianVersionPath branch, and update setup.sh with a macOS code path;
update logging messages that currently list supported OSes to include or exclude
macOS accordingly.

Comment on lines +102 to +110
# ---- Set up the sync-microservice ----
echo -e "${YELLOW}Setting up sync-microservice...${NC}"
cd sync-microservice
python -m venv .sync-env
source .sync-env/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
deactivate
cd ..
Contributor


⚠️ Potential issue | 🟡 Minor

Add error handling for directory change.

If the cd sync-microservice command fails (e.g., directory doesn't exist), subsequent commands will execute in the wrong directory, potentially corrupting the parent environment.

Proposed fix with error handling
 # ---- Set up the sync-microservice ----
 echo -e "${YELLOW}Setting up sync-microservice...${NC}"
-cd sync-microservice
+cd sync-microservice || { echo -e "${RED}sync-microservice directory not found${NC}"; exit 1; }
 python -m venv .sync-env
 source .sync-env/bin/activate
 pip install --upgrade pip
 pip install -r requirements.txt
 deactivate
-cd ..
+cd .. || exit 1

Note: The existing backend setup block (lines 94-100) has the same issue—consider applying similar error handling there for consistency.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# ---- Set up the sync-microservice ----
echo -e "${YELLOW}Setting up sync-microservice...${NC}"
cd sync-microservice || { echo -e "${RED}sync-microservice directory not found${NC}"; exit 1; }
python -m venv .sync-env
source .sync-env/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
deactivate
cd .. || exit 1
🧰 Tools
🪛 Shellcheck (0.11.0)

[warning] 104-104: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)

🤖 Prompt for AI Agents
In `@scripts/setup.sh` around lines 102 - 110, The script uses an unguarded "cd
sync-microservice" so if the directory change fails subsequent virtualenv and
pip commands run in the wrong place; update the block around the "cd
sync-microservice" command to check the exit status (or use a safe conditional
like "cd sync-microservice || { echo ...; exit 1; }") and exit with a non-zero
status on failure, ensuring you only run "python -m venv .sync-env", "source
.sync-env/bin/activate", "pip install -r requirements.txt" and the following
"deactivate" when the cd succeeded; apply the same guard/behavior to the earlier
backend setup block that also performs an unguarded cd.

Comment on lines +254 to +263
        record.msg = f"[{module_name}] {msg}"
        record.args = ()
        # Clear exception / stack info to avoid duplicate traces
        record.exc_info = None
        record.stack_info = None

        root_logger = logging.getLogger()
        for handler in root_logger.handlers:
            if handler is not self:
                handler.handle(record)
Contributor


⚠️ Potential issue | 🟠 Major

Respect handler log levels when re-dispatching records.

Same issue as backend: calling handler.handle(record) directly skips handler-level filtering, so lower-level logs can leak into higher-level handlers. Add a level guard (or use Logger.callHandlers) to preserve expected log behavior.

🛠️ Proposed fix
-        root_logger = logging.getLogger()
-        for handler in root_logger.handlers:
-            if handler is not self:
-                handler.handle(record)
+        root_logger = logging.getLogger()
+        for handler in root_logger.handlers:
+            if handler is self:
+                continue
+            if record.levelno >= handler.level:
+                handler.handle(record)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
        record.msg = f"[{module_name}] {msg}"
        record.args = ()
        # Clear exception / stack info to avoid duplicate traces
        record.exc_info = None
        record.stack_info = None

        root_logger = logging.getLogger()
        for handler in root_logger.handlers:
            if handler is self:
                continue
            if record.levelno >= handler.level:
                handler.handle(record)
🤖 Prompt for AI Agents
In `@sync-microservice/app/logging/setup_logging.py` around lines 254 - 263, The
current re-dispatch loop in setup_logging.py bypasses handler-level filtering by
calling handler.handle(record) directly; update the loop that iterates
root_logger.handlers to respect handler filters/levels either by using
root_logger.callHandlers(record) or by guarding each re-dispatch with
handler.level and filters (e.g., for each handler in root_logger.handlers (skip
self) check handler.level and handler.filter(record) / record.levelno >=
handler.level before calling handler.handle(record)), so only handlers that
should accept the record receive it.

Comment on lines +50 to +51
host="localhost",
port=52124,
Contributor


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Uvicorn host localhost vs 0.0.0.0 Docker networking difference

💡 Result:

In Uvicorn, --host controls which network interface the server binds to inside the container:

  • --host localhost (usually 127.0.0.1) binds to the container’s loopback interface, so the app is reachable only from inside that same container. Even if you run Docker with -p 8000:8000, the host can’t reach it because nothing is listening on the container’s external interface. [1][2]
  • --host 0.0.0.0 binds to all interfaces in the container, including the one Docker connects to the bridge network, so port publishing (-p) can forward traffic from the host to the container successfully. [1][2][3]

How Docker port publishing relates:

  • docker run -p HOSTPORT:CONTAINERPORT ... publishes the container port on the host, and by default Docker binds that published port on all host interfaces (0.0.0.0 / ::). [4]
  • You can restrict exposure to only the host machine by binding on the host side: -p 127.0.0.1:8000:8000. This is independent of Uvicorn, but Uvicorn still must bind 0.0.0.0 inside the container to accept forwarded traffic. [4]

Typical Docker command:

uvicorn main:app --host 0.0.0.0 --port 8000
docker run -p 8000:8000 your-image

Sources: [1] StackOverflow answer on FastAPI/Uvicorn in Docker, [2] explanation of binding to 0.0.0.0 for container reachability, [3] tiangolo image docs on HOST=127.0.0.1 only being available inside container, [4] Docker Docs on port publishing/default bind address.


Binding to localhost will prevent remote and container access.

The change from "0.0.0.0" to "localhost" binds the server to the loopback interface only. This breaks:

  • Docker port publishing (even with -p 8000:8000, the forwarded traffic cannot reach the service)
  • Access from other containers in the same network
  • Any remote or VMs accessing the service

If remote or containerized access is needed, revert to "0.0.0.0". If this service is genuinely local-only, document this design decision and ensure no remote access is required.

🤖 Prompt for AI Agents
In `@sync-microservice/main.py` around lines 50 - 51, The server is bound to
"localhost" which prevents remote/container access; in main.py locate the server
startup call that sets host and port (the uvicorn.run or equivalent where
host="localhost", port=52124) and change host back to "0.0.0.0" to allow
external and container network access, or if the service must be local-only, add
a clear comment/docstring next to that host setting (and update any deployment
docs) stating the intentional localhost binding instead of changing behavior.


Labels

backend, bug (Something isn't working)


Development

Successfully merging this pull request may close these issues.

BUG: face data persists after source folder is removed

2 participants