Chat template: PDF citation viewer #5843
Merged
Allows .pdf file citations to open in a PDF viewer.
It highlights the text based on exact string match, since this uses the #search=... URL parameter supported by pdfjs, which in turn is equivalent to the user opening the "Find" feature and typing in the citation quote. As such, it's not 100% guaranteed to highlight the citation, since in some cases the LLM returns very slight variations on the text instead of quoting it character-for-character.
If this wasn't good enough, we could:
However, this would be very complex and possibly still error-prone.
Given that in most cases we will still open the correct PDF page even if we can't highlight the citation, this is a reasonable tradeoff. App developers with more stringent requirements can implement the large amount of additional code needed to pick out citations more precisely.
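As a rough illustration of the approach described above, here is a minimal sketch of how a citation link to the pdfjs viewer might be assembled. The helper name, the viewer path, and the example file are all hypothetical and not taken from the template. It assumes the viewer accepts the document through a file query parameter and reads page and search from the URL fragment.

```typescript
// Hypothetical helper, for illustration only: the name, the viewer path and
// the parameter layout are assumptions, not the template's actual code.
function buildCitationViewerUrl(
  viewerPath: string, // assumed location of the pdfjs viewer page
  pdfUrl: string,     // URL of the cited PDF as served by the app
  page: number,       // 1-based page number of the citation
  quote: string       // citation text as returned by the LLM
): string {
  // The PDF to display goes in the query string.
  const query = `file=${encodeURIComponent(pdfUrl)}`;
  // The page and the text to find go in the fragment. The search term behaves
  // like typing the quote into the viewer's "Find" box, so a quote that does
  // not match the document text is simply not highlighted.
  const fragment = `page=${page}&search=${encodeURIComponent(quote)}`;
  return `${viewerPath}?${query}#${fragment}`;
}

const exampleUrl = buildCitationViewerUrl(
  "/pdfjs/web/viewer.html", // assumed path
  "/Data/Example Guide.pdf", // assumed file
  3,
  "battery life"
);
console.log(exampleUrl);
```

Because the page number travels separately from the search term, the viewer can still land on the right page when the quote itself is not found.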
Use of PDFJS
Given future plans to avoid the NPM dependency, I've included the files as actual files in wwwroot. There are unfortunately quite a lot of them (e.g., many toolbar icons) but it's all hidden away in a pdfjs directory so likely won't cause any problems or confusion.
Serving the cited PDFs
I've added use of UseStaticFiles to serve everything from the Data directory. Arguably developers may wish to limit what files are served (e.g., just to .pdf files, or just to files that have been ingested) but that would substantially complicate the template logic. The Data directory is intended for a quick getting-started process, not to scale up to all use cases (e.g., you wouldn't put the entire contents of a CMS in there), so I think it's reasonable to simplify by treating it as a publicly-servable directory. Obviously this needs to be documented when we talk about adding files to that directory.
Bug workaround
In order to serve the pdfjs viewer.html file, I had to work around dotnet/aspnetcore#58940 by changing MapStaticAssets to UseStaticFiles. Hopefully we can change this back if that gets fixed in a patch.
Note that the other workaround of ReloadStaticAssetsAtRuntime: false isn't suitable, since it would break the ability to edit any static file content (you'd have to restart the server after every file change).