[Feature request] Allow Interaction with PDF Files #242
Comments
Thanks for filing. I'd like to add features like this in the future. Recently, I've been focusing on some cleanups/rearchs so I can make it easier to enable things like this (and perhaps use different models other than OpenAI's).
Yup. The upload function would be handy. We can hopefully reuse it in the shells too. I was thinking of scanning for markdown links to local files and automatically uploading and saving the file IDs to buffer-local vars (for future resolutions). Are you keen to contribute an implementation?
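The link-scanning idea above can be sketched roughly as follows. This is only an illustration in Python, not chatgpt-shell code: `local_file_links`, `resolve_file_ids`, the `cache` dict (standing in for a buffer-local variable), and the `upload` callback are all hypothetical names, and any link target without a `://` scheme is assumed to be a local path.

```python
import re
from typing import Callable, Dict, List

# Matches [label](target); the target is captured in group 1.
MARKDOWN_LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def local_file_links(text: str) -> List[str]:
    """Return link targets without a URL scheme, in order, without duplicates."""
    targets: Dict[str, None] = {}
    for target in MARKDOWN_LINK.findall(text):
        # Assumption: anything lacking "://" is a local path.
        if "://" not in target:
            targets.setdefault(target, None)
    return list(targets)


def resolve_file_ids(
    text: str,
    cache: Dict[str, str],
    upload: Callable[[str], str],
) -> Dict[str, str]:
    """Map each local link in TEXT to a file ID, uploading only on cache misses."""
    resolved: Dict[str, str] = {}
    for path in local_file_links(text):
        if path not in cache:
            # `upload` is a hypothetical callback that returns a file ID.
            cache[path] = upload(path)
        resolved[path] = cache[path]
    return resolved
```

Passing the cache in explicitly keeps the sketch testable: a second call with the same text reuses the stored IDs instead of uploading again.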
Yes! Since I personally really want this feature, I am willing to contribute to it. However, I am not yet familiar with the source code of your project. Could you please give me some rough guidance on what I can do and where I should start?
Take a look at If you haven't used edebug before, it's great to step through code to figure out how things work. Lemme know if you have more questions.
Okay I've set up all the necessary HTTP requests. You can check out a demo by setting the Right now, everything runs synchronously; I will change that later. Also, messages are grouped by threads, so one can view all historical messages via a
Thanks for this! Gimme a little time to play with it and get a feel for the OpenAI API. If I understand correctly, your main use case is org babel? This may be simpler than shell integration, so we could start with that and defer the shell for a little while?
I just want a functional GPT interface in Emacs that can interact with PDFs, whether it's through the shell or org-babel. I'm choosing So yes, we can start with
Hey, the demo changes are great! 7f61f49 adds a babel experiment to the ob-assistant-file-query branch. Super rough and needs more work, but it's a start if you wanna give it a try (needs at least shell-maker v0.63.1).
It already looks really nice, great job! Do you have any ideas about saving
(add-to-list
 'gpt-session
 '("Mastering-Emacs" . (
   :description "This is a session about book Mastering Emacs"
   :file-path "/path/to/book/Mastering-Emacs.pdf"
   :file-id "xx"
   :assistant-id "yy"
   :thread-id "zz"
   )))
Additionally, we could provide a function prompting users to choose a session via its name,
Or maybe we could even migrate the above thing into your
Do you know if we can rely on the OpenAI API and query it instead? What's the lifetime of each of these things? Wondering, as keeping a local copy will also require managing staleness.
According to the documentation, files are never deleted, while assistants and threads are removed after 30 days. So, I think at least we should provide a way to store relations between uploaded files and their corresponding file IDs, as uploading a PDF file is both time-consuming and costly. In addition, ideally, it would be better if users could choose an assistant declaratively. In other words, users should just need to set its name and prompt and
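One way to store the file-to-ID relation while handling the staleness concern is to key each entry on the file's content hash, so an edited PDF is uploaded again. A minimal Python sketch follows. All names here (`remember_file_id`, `lookup_file_id`, `dump_store`, `load_store`) are hypothetical, and it only detects local changes. It does not check whether the remote file, assistant, or thread still exists.

```python
import hashlib
import json
from typing import Dict, Optional

# Hypothetical store layout: {path: {"sha256": ..., "file_id": ...}}
Store = Dict[str, Dict[str, str]]


def content_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the file contents."""
    return hashlib.sha256(data).hexdigest()


def remember_file_id(store: Store, path: str, data: bytes, file_id: str) -> None:
    """Record which file ID corresponds to PATH with contents DATA."""
    store[path] = {"sha256": content_digest(data), "file_id": file_id}


def lookup_file_id(store: Store, path: str, data: bytes) -> Optional[str]:
    """Return the stored file ID, or None if PATH is unknown or its contents changed."""
    entry = store.get(path)
    if entry is not None and entry["sha256"] == content_digest(data):
        return entry["file_id"]
    return None


def dump_store(store: Store) -> str:
    """Serialize the store to JSON text so it can be written to disk."""
    return json.dumps(store, indent=2, sort_keys=True)


def load_store(text: str) -> Store:
    """Rebuild the store from JSON text produced by dump_store."""
    return json.loads(text)
```

A caller would try `lookup_file_id` first and only upload (then call `remember_file_id`) when it returns `None`.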
Hi there. Is this maybe meant for an issue filed on gptel project?
We should prolly keep this feature request open until we merge the https://github.com/xenodium/chatgpt-shell/tree/ob-assistant-file-query branch.
I'm so sorry. I intended to answer questions in another place. Please just ignore what I posted.
No worries! 👍
Merged to main (may as well). While it doesn't yet have the At a bare minimum, Will have to do for the time being. I have to switch gears and dedicate available effort to enable non-OpenAI models to work in chatgpt-shell. That's a biggie.
With GPT-4o, one can upload files such as PDFs via the Assistant object and get the ID of the file. Then, one can create a thread using that ID and ask questions regarding it. Here is an example that demonstrates what I mentioned in Python.
Can we do similar things via chatgpt-shell? My idea is to provide some helper functions for uploading files and returning the file ID. Then, one can attach :file-id as a parameter in org-babel. Do you have any thoughts about that?
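For reference, an upload helper that returns the file ID could look roughly like the sketch below. It is a Python illustration using only the standard library, not the example from the original post and not chatgpt-shell code. It assumes the OpenAI Files endpoint (`POST https://api.openai.com/v1/files`, multipart form with `purpose` and `file` fields, JSON response carrying an `id`). `build_upload_request` and `upload_file` are hypothetical names.

```python
import json
import os
import urllib.request
import uuid

FILES_ENDPOINT = "https://api.openai.com/v1/files"


def build_upload_request(
    filename: str, data: bytes, api_key: str, purpose: str = "assistants"
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data upload request."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="purpose"\r\n\r\n'
        f"{purpose}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        FILES_ENDPOINT,
        data=head + data + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def upload_file(path: str, api_key: str) -> str:
    """Upload the file at PATH and return the file ID from the JSON response."""
    with open(path, "rb") as handle:
        data = handle.read()
    request = build_upload_request(os.path.basename(path), data, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["id"]
```

The returned ID is the value one would then pass along, for instance as the proposed `:file-id` parameter.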