Do not wrap filepath in Path to fix indexing markdown files on Windows
Issue
- On Windows, wrapping a path in Path converts its / separators to \.
- The markdown to entries method was doing this for no clear reason, so
  file paths were stored in the DB entry differently than in the file
  to entries map. This resulted in a KeyError when looking up the
  entry file path in file_to_text_map from the
  text_to_entries:update_embeddings() function.
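
The mismatch can be reproduced on any OS with a minimal sketch like the one below. It assumes PureWindowsPath as a stand-in for what Path does on Windows. The file name, the map contents, and the lookup helper are hypothetical and are not taken from the Khoj code.

```python
from pathlib import PureWindowsPath

# Hypothetical stand-in for file_to_text_map, keyed by the raw path string.
file_to_text_map = {"notes/todo.md": "# Todo\n- write tests"}

raw_filename = "notes/todo.md"

# Approximates str(Path(raw_filename)) on Windows:
# forward slashes become backslashes.
normalized = str(PureWindowsPath(raw_filename))


def lookup(mapping, key):
    """Return the mapped text, or None where a plain mapping[key] would raise KeyError."""
    try:
        return mapping[key]
    except KeyError:
        return None


old_result = lookup(file_to_text_map, normalized)    # key no longer matches
new_result = lookup(file_to_text_map, raw_filename)  # key matches as stored
```

Using the raw string on both sides keeps the key identical wherever it is stored or looked up.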

Fix
- Removing the unnecessary OS-dependent Path normalization in
  markdown_to_entries should keep file path storage consistent
  across the file_to_text_map var, FileObjectAdaptor, and Entry DB
  tables on Windows for Markdown files as well.

This issue would only affect users hosting Khoj server on Windows and
attempting to index markdown files.

Resolves #984
debanjum committed Dec 2, 2024
1 parent 9e0a2c7 commit dffdd81
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/khoj/processor/content/markdown/markdown_to_entries.py
@@ -139,7 +139,7 @@ def convert_markdown_entries_to_maps(parsed_entries: List[str], entry_to_file_ma
# Escape the URL to avoid issues with special characters
entry_filename = urllib3.util.parse_url(raw_filename).url
else:
- entry_filename = str(Path(raw_filename))
+ entry_filename = raw_filename

heading = parsed_entry.splitlines()[0] if re.search(r"^#+\s", parsed_entry) else ""
# Append base filename to compiled entry for context to model