Some questions #1
Hi, I just read about DOC and have some questions. I have been developing in Python for a few years but have never really worked with AI. This project fascinates me because it may be the help/assistance I have been looking for.

Why am I so interested in this?

I love reading sci-fi books (I have read every big sci-fi book series there is, plus a lot of lesser-known novels). In the past few months I came up with a great idea for a sci-fi novel and have already created an outline for it. My biggest problem is that I am not a writer myself, so I cannot bring my idea to life or even create a more detailed outline of it. Since I am a developer, I figured there must be something that can help me get around my shortcomings, and after reading your paper just now, this sounds very promising.

My question now is: can DOC be used to generate stories longer than 3.5k-4k words? I would love to use DOC to develop a much longer rough story together with it, something like 30,000 to 40,000 words. I know it would probably not be perfect at all, but it would be a great starting point for bringing my story idea roughly to life; I could then fine-tune and fix storyline issues myself. Then I could have a book based on my own idea. That would be awesome.

If writing such long stories is not possible, could DOC help me create a very detailed outline a few thousand words long?

If the answer is yes, I will definitely test it myself, but before investing all the time to understand every detail of DOC, I thought it would make the most sense to ask here first.

Best,
Gelsas
Comments
Hi Gelsas,

Thanks for reaching out! In principle, this repo is able to generate much longer stories such as the length you're suggesting, but the quality might be worse and you might lose a bit of coherence over time; I haven't tested stories of such length myself. It would also take a very long time to generate a story of that length through the public OPT API (3.5k-4k tokens might already take a day or more, since each individual API call can take a few seconds), so if you don't have the compute to set up your own OPT server, you might want to just use GPT3, although the quality will be a bit lower. If I had to guess, the total cost of generating an entire 40k-token story using our system through the GPT3 API (using the davinci model, and given an existing outline) would be on the order of $20.

Generating just the outline should also be possible (and in fact is recommended if you're trying to generate a much longer story). I haven't tested generating much more detailed outlines, but you can follow the instructions in the Plan + Outline Generation section of the README to generate the outline, adjusting the settings accordingly. Since you already have a story idea in mind, you could also try to work with the outlining system "interactively," although admittedly the procedure for doing so is currently a bit convoluted.

Hope that helps! To be totally upfront, we're not anywhere near the level of a strong human author yet; you can get a sense for average story quality by looking at the example outlines and stories at the end of the appendix in our paper. But if that level of writing looks acceptable to you for initial outlining/drafting purposes, then go for it, and of course working with the system interactively will make the results a lot better. Please let me know if you have any other questions or difficulties.

Thanks,
Kevin
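For a rough sense of where an estimate like that comes from, here is a back-of-envelope calculation; the prompt size per call, passage length per step, and number of candidate continuations below are illustrative assumptions rather than DOC's actual defaults, and the $0.02/1k-token davinci price is the published rate at the time of this thread.

```python
# Back-of-envelope cost estimate for a 40k-token story via the GPT3 davinci API.
# All per-call numbers below are illustrative assumptions, not DOC's defaults.

PRICE_PER_1K_TOKENS = 0.02   # davinci pricing at the time of this thread (USD)
STORY_TOKENS = 40_000        # target story length
TOKENS_PER_PASSAGE = 64      # assumed tokens generated per API call
PROMPT_TOKENS = 1_000        # assumed prompt (outline + recent story) per call
CANDIDATES_PER_STEP = 2      # assumed continuations sampled per step for reranking

calls = STORY_TOKENS / TOKENS_PER_PASSAGE                # ~625 generation steps
tokens_per_call = PROMPT_TOKENS + TOKENS_PER_PASSAGE     # prompt and completion are both billed
total_tokens = calls * tokens_per_call * CANDIDATES_PER_STEP
cost = total_tokens / 1000 * PRICE_PER_1K_TOKENS
print(f"~{total_tokens:,.0f} tokens billed, ~${cost:.2f}")  # ~1.33M tokens, ~$26.60
```

Under these assumptions the estimate lands in the same ballpark as the $20 figure above; the dominant cost is re-sending the long prompt on every call, not the story tokens themselves.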
Hi Kevin,

First of all, I want to thank you for this detailed response. It already gives me a better understanding of, and guidance on, what I need to do. I am currently waiting to hear back from the Alpa guys on Slack, but I might not be eligible to use the API, since I am not working for any institute but doing this research as a hobby. Do you know of any other ways to use OPT-175B via an API? I don't think I would be eligible from Facebook's/Meta's point of view, since I am not researching at an institute. Also, I am not sure how much compute I would need; do you have some more info regarding that?

So currently it looks like I will need to go with Option 3: GPT3-175B (easy to get started; worse quality). My questions here are the following: Why is it not possible to use the detailed controller when using GPT3-175B? And since OpenAI just released their ChatGPT API, gpt-3.5-turbo (https://openai.com/blog/introducing-chatgpt-and-whisper-apis), could that be used here?

That would be all my questions for now, since I couldn't start my tests/experiments yet, but once I start testing I might have some more specific questions. Just wanted to say thank you again for taking the time to respond.

Best,
Gelsas
Hi Gelsas,

I don't know of other OPT-175B APIs, unfortunately. You would also need a lot of compute to serve OPT-175B. Recently they also released the LLaMA models (https://ai.facebook.com/blog/large-language-model-llama-meta-ai/), which might be better and wouldn't require quite as much compute, but I'd guess you'd probably still need one or multiple high-end GPUs (40-80GB) to serve the larger models yourself, depending on which one exactly.

The reason we can't use the detailed controller with GPT3-175B is that the detailed controller relies on manipulating the logprobs of tokens at each step of generation. Not only does the GPT3 API severely limit how many logprobs you can get back, but it's also unbelievably inefficient to use in this pipeline because it doesn't let you cache computation: you'd have to re-query with an entirely new prompt for every token you generate. The ChatGPT API will have the same issues. The Alpa API doesn't have these limitations (because I added the required functionality myself via a pull request a few months ago, lol). But it's not necessarily a dealbreaker if you can't use the detailed controller due to using GPT3 instead; just expect the model to wander off track away from your outline a bit more frequently. This can probably be remedied to a decent extent by manual intervention if you're willing to spend the effort to work with the system interactively, paragraph by paragraph.

As for using ChatGPT or a text-davinci-00X model: during outline generation, due to generating the outline in breadth-first expansion order, you want the model to handle suffix context in addition to just a prefix (i.e., the prompt). Last I checked, text-davinci-003 and ChatGPT still didn't support this; we used text-davinci-002 in the paper. But if they do support suffix context, you can swap them in.

For the main story generation, where we use the base non-instruction-tuned davinci model, you also need to be a bit careful. My initial experiments with using text-davinci-002 instead were a complete disaster (it's mentioned in a footnote in our earlier Re3 paper, https://arxiv.org/abs/2210.06774). I think the prompt we use, which contains hundreds of tokens of context, is totally out of distribution for these instruction-tuned models; in addition, the instruction-tuned models are usually finetuned to produce shorter outputs rather than full-length stories, whereas the base davinci's pretraining set actually contains a lot of novels, which is what we want. We observed that text-davinci-002 usually generated maybe one paragraph of somewhat reasonable text before reliably starting to repeat itself or degenerating into gibberish. I think these problems are alleviated to some degree with text-davinci-003 and ChatGPT, but not completely gone: if you try to use these models, I think you'll usually observe that they write in sort of a "higher-level" style compared to what you want in a story, i.e., they like to "tell" instead of "show." It's definitely possible that you could get around these limitations with better prompt engineering, though, and it's worth exploring further. There are of course potential benefits. For example, text-davinci-003 and ChatGPT are definitely better at staying on-topic and not contradicting the context in the prompt compared to the base davinci model. So if you could resolve the style issues and use them as the inner model that DOC calls during story generation, it'd likely improve some of our metrics.
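To make the logprob-manipulation point concrete, here is a rough sketch of what one step of controlled decoding looks like if forced through the GPT3 completions API; `controller_score` is a hypothetical stand-in for the detail controller, not code from this repo. The point is that every generated token costs a full re-query over the entire prompt, because nothing is cached server-side.

```python
import openai

def controller_score(text: str) -> float:
    """Hypothetical stand-in for the detail controller's score."""
    return 0.0  # a real controller would score how well `text` follows the outline item

def generate_one_token(prompt: str) -> str:
    # One decoding step: ask the API for the logprobs of the next token only.
    # The API caps `logprobs` at 5 candidates, far short of full-vocabulary control.
    resp = openai.Completion.create(
        model="davinci",
        prompt=prompt,
        max_tokens=1,
        logprobs=5,
    )
    top = resp["choices"][0]["logprobs"]["top_logprobs"][0]  # {token: logprob}
    # Reweight each candidate token by the controller and pick the best.
    return max(top, key=lambda tok: top[tok] + controller_score(prompt + tok))

prompt = "Chapter 1.\n"
for _ in range(100):  # 100 tokens means 100 full re-queries over the whole prompt
    prompt += generate_one_token(prompt)
```

With a server you control (as with the Alpa OPT endpoint), the prompt's cached computation can be reused across steps, which is exactly the functionality the GPT3 and ChatGPT APIs don't expose.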
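On the suffix-context point: the legacy completions API does expose a `suffix` parameter for insertion-style generation, so checking whether a given model accepts it is straightforward; whether a particular model handles it well enough for breadth-first outline expansion is the part you'd have to verify empirically. A minimal sketch:

```python
import openai

# Insertion-style call: the model fills in text between prompt and suffix.
# Whether a given model accepts `suffix` (and actually uses it well) is
# what you'd want to verify before swapping it into outline generation.
resp = openai.Completion.create(
    model="text-davinci-002",
    prompt="Outline:\n1. The heroine discovers the signal.\n2.",
    suffix="\n3. She confronts the fleet commander.",
    max_tokens=48,
    temperature=0.7,
)
print(resp["choices"][0]["text"])
```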
Hope that gives you some perspective on the possible choices.

Thanks,
Kevin
Hi Kevin,

Thank you for the very detailed response. I just got approved by Meta for the LLaMA models, and I am thinking of using vast.ai to rent the GPUs I'll probably need for this. Using LLaMA is probably much better than using GPT3, correct? Would it be possible to get some more insight into how I can use the LLaMA models with DOC? Can the detailed controller be used with the LLaMA models? I hope you don't mind all these questions, even if they are maybe "stupid" questions.

Best,
Gelsas
Hi Gelsas,

I haven't actually run the LLaMA models myself, but I would guess they're better than GPT3 if you use the detail controller. There's no problem with using the detail controller in principle, but there could be some practical issues with tokenization: if the LLaMA models use different tokenization than OPT (I'm not sure if they do; you'd have to check), then you'd have to retrain the detail controller using a model that uses LLaMA tokenization. In this case you'd follow the instructions near the bottom of the README to retrain it, and you'd probably also have to modify https://github.com/yangkevin2/doc-story-generation/blob/main/story_generation/common/controller/models/fudge_controller.py (see especially the bottom of the file, where it has versions for GPT- and OPT-based models). Retraining might also be complicated, since the smallest LLaMA model is 7B, which is a bit large for this purpose; the detail controller would run more slowly at inference time, since it's not very performance-optimized currently. On the other hand, if it turns out that LLaMA uses the same tokenizer as OPT, then you can avoid all this trouble and directly use the existing detail controller I provide in the download. Finally, you'd have to modify the code that interfaces with the 65B LLaMA model you're calling, since right now it assumes the Alpa interface for an OPT model.

TBH, I would consider starting with GPT3 anyway (you wouldn't need LLaMA for the planning stage, only for the main drafting procedure after the planning is done), since it's so much easier to get started and try out interactively, to get a rough sense of what the system can do. Then you could switch to LLaMA if you think it's worth the effort to improve the results further for your purposes.
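Checking whether two checkpoints share a tokenizer is quick with Hugging Face transformers; a minimal sketch below, where the LLaMA path is a placeholder for wherever your approved, converted weights live (LLaMA weights are gated, so there is no canonical public hub ID to assume).

```python
from transformers import AutoTokenizer

# OPT ships its tokenizer publicly; the LLaMA path below is a placeholder
# for your local copy of the converted weights.
opt_tok = AutoTokenizer.from_pretrained("facebook/opt-125m")
llama_tok = AutoTokenizer.from_pretrained("/path/to/llama-7b-hf")

sample = "The detail controller reweights token logprobs at each step."
print(opt_tok.tokenize(sample))
print(llama_tok.tokenize(sample))

# Same vocab size and identical token ids on a few test strings is a good
# sign the existing detail controller can be reused; any mismatch means
# retraining it with the new tokenization.
print(opt_tok.vocab_size, llama_tok.vocab_size)
print(opt_tok(sample)["input_ids"] == llama_tok(sample)["input_ids"])
```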
Hi Kevin,

As per your suggestion, I am starting with GPT3. I have followed the instructions in the README, but I am not sure why it shows this error:

UserWarning: Failed to initialize NumPy: module compiled against API version 0xf but this version of numpy is 0xe (Triggered internally at ..\torch\csrc\utils\tensor_numpy.cpp:68.)

As per requirements.txt, the version that got installed is numpy==1.21.6. Any idea how this could be fixed?
I was able to resolve this as well, by switching from Python 3.10 to 3.8.15.

How long should the process approximately take to finish the outline? It was running for a bit, and now plan.log shows:

context length 1025 exceeded artificial context length limit 1024

Not sure how to fix that; can you advise what is causing the context length to be exactly one token too long?
Oh, now that I think about it, we never ran into this during our own tests, since we limited the depth (number of outline levels) to 3 for our experiments. The problem is that we limit the GPT3/OPT context length for all methods to 1024 for fair comparison with baselines in our paper experiments, and you can end up exceeding that limit if you increase the depth further. You should be able to remove this artificial limitation by changing the --max-context-length argument.
I have added --max-context-length 2048 to my command, but now I am getting this error:

Token indices sequence length is longer than the specified maximum sequence length for this model (1027 > 1024). Running this sequence through the model will result in indexing errors

And now, a few hours later, I hit another error, and a little bit after that, yet another one. It has been running for 5-6 hours now. How long should it take?
The initial warning can be ignored; it's complaining because we use the GPT2 tokenizer. GPT2 has a max context length of 1024, but GPT3 shares the same tokenizer while using a longer context length. I see in your initial command from earlier that you set the outline depth higher than what we used in our experiments.

Your most recent crash looks like it's due to the context length getting too long for the outline order reranker. The context length was fine for the lengths we used in our paper experiments, but the outline is getting too long for it with your current settings. The problem might just go away if you decrease the outline depth.

Sorry for all the trouble; we haven't carefully tested larger values of these settings.
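For what it's worth, you can check that the warning is purely about the tokenizer's recorded limit rather than anything in DOC itself; a quick sketch with the GPT2 tokenizer shows the encoded ids come back intact past 1024 tokens, and GPT3 accepts them up to its own longer context window.

```python
from transformers import GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
print(tok.model_max_length)  # 1024, the limit the warning is about

# Encoding more than 1024 tokens triggers the same warning seen above, but
# the ids themselves are fine; only a local GPT2 model would hit indexing
# errors, whereas GPT3 has a longer context window.
ids = tok.encode("word " * 1100)
print(len(ids))  # > 1024, still a valid token sequence to send to GPT3
```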