Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not wrap filepath in Path to fix indexing markdown files on Windows #993

Merged
merged 2 commits into from
Dec 2, 2024

Conversation

debanjum
Copy link
Member

@debanjum debanjum commented Dec 2, 2024

Issue

  • Path with / are converted to \ on Windows using the Path operator.
  • The markdown_to_entries module was trying to normalize file paths withPath for some reason.
    This would store the file paths in DB Entry differently than the file to entries map if Khoj ran on Windows.
    That'd result in a KeyError when trying to look up the entry file path from file_to_text_map in the text_to_entries:update_embeddings() function.

Fix

  • Removing the unnecessary OS dependent Path normalization in markdown_to_entries should keep the file path storage consistent across file_to_text_map var, FileObjectAdaptor, Entry DB tables on Windows for Markdown files as well.

This issue will affect users hosting Khoj server on Windows and attempting to index markdown files.

Resolves #984

Issue
- Path with / are converted to \\ on Windows using the Path operator.
- The markdown to entries method for some reason was doing this.
  This would store the file paths in DB entry differently than the file
  to entries map. Resulting in a KeyError when trying to look up the
  entry file path from file_to_text_map in the
  text_to_entries:update_embeddings() function.

Fix
- Removing the unnecessary OS dependendent Path normalization in
  markdown_to_entries should keep the file path storage consistent
  across file_to_text_map var, FileObjectAdaptor, Entry DB tables on
  Windows for Markdown files as well

This issue would only affect users hosting Khoj server on Windows and
attempting to index markdown files.

Resolves #984
- Add type hints to improve maintainability of stabilzed indexing code
- It shouldn't be necessary to wrap string variables in an f-string

This change aims to improve code quality. It should not affect
functionality.
@debanjum debanjum added the fix Fix something that isn't working as expected label Dec 2, 2024
@debanjum debanjum merged commit db29894 into master Dec 2, 2024
13 checks passed
@debanjum debanjum deleted the fix-indexing-markdown-files-in-windows branch December 2, 2024 10:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix Fix something that isn't working as expected
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FIX] Issue indexing from Obsidian on Windows
1 participant