Replies: 4 comments
-
This project might help: https://www.crafters.ai/aitools/research-agents-3-0
-
@sonichi Things I've learned since posting:
Posit: I'm also finding the specificity of some of the examples very significant. The text in many of the prompts feels like it is genuinely thought, by the human authors, to be "telling" the model what to do. But the purpose of a prompt is to seed the token stream. If the LM was trained on documents containing the approximate pattern "You are a helpful AI. The AI provides helpful answers. The AI thinks outside the box", then you are going to strongly over-emphasize a set of (presumably) speculative discussions about how an AI might behave; and if that training data came from Reddit or Stack Overflow, it will probably weight up patterns discussing The 3 Laws or morality and such, which produces banter (lots of civility, thank-yous, "I'm pleased to help", etc.).

The upshot is that it's really hard to get some of the notebooks to do something very different from the specific scenario in the example. The arxiv web-analysis example works less efficiently for other sites or for other forms of processing of similar documents, and its success rate deteriorates rapidly as you move to more unrelated topics. Not unexpected, obviously, but it can drastically complicate learning the toolset. Cf. asking it to search a short list (2-4) of RSS or article feeds for news about SSDs, GPUs, or 3D printers in the last year, identify significant developments that have come to market, and use those to pick out factors/features worth looking for in devices released in the last 6 months. I would expect it to struggle, perhaps, with extracting the data or with identifying "developments", and depending on the LM used this does begin to surface, but mostly what happens (even with GPT-4 16k) is that it just starts to go a bit loopy or starts doing other things entirely. A too-short prompt today regarding new TPUs resulted in it wanting to provide me with some surely deep insights into airline tickets.

If you eliminate the web-search component, then results are obviously a function of model/settings/quality versus how well the key phrases or word patterns in the prompt align with attention. I don't think most users realize that the difference between prompt variants is sometimes just down to word positioning. "Do not create stub functions. Do not create empty methods." may result in attention on "do not create stub functions" and "create empty methods", whereas "Do not please create stub functions. Chicken* Do not create empty methods" may lead it to correctly attend to "[Do] not create empty methods" (the inserted "please" and "Chicken*" being two arbitrary tokens). But because users instead change the prompt to, say, "Do not create stub functions and please do not create empty methods", it reinforces their perception that they are explaining or instructing the model rather than curating attention/patterns. This is an incredibly easy mental trap to fall into while also trying to understand the effect of tweaking code.

Suggestion:
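One concrete habit that helps: treat each phrasing as an A/B variant and measure across repeated samples rather than eyeballing single runs. A minimal sketch, assuming a local OpenAI-compatible server; the base_url, the model name, and the looks_stubbed heuristic are all placeholder assumptions, not anything from the notebooks:

```python
from openai import OpenAI

# Placeholder endpoint/model for a local OpenAI-compatible server
# (LM Studio, text-generation-webui, etc.) -- swap in whatever you run.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

VARIANTS = [
    "Do not create stub functions. Do not create empty methods.",
    "Every method you write must have a complete, working body.",
]
TASK = "Write a Python class with methods to reverse and upper-case a string."

def looks_stubbed(code: str) -> bool:
    # Crude proxy for "model ignored the constraint": stub bodies
    # tend to contain a bare `pass` or `...`.
    return "pass" in code or "..." in code

for system_prompt in VARIANTS:
    failures = 0
    for _ in range(10):  # single samples tell you almost nothing
        reply = client.chat.completions.create(
            model="local-model",  # placeholder name
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": TASK},
            ],
            temperature=0.7,
        ).choices[0].message.content
        failures += looks_stubbed(reply or "")
    print(f"{failures}/10 stubbed <- {system_prompt!r}")
```

Even a crude counter like this makes it visible when a rewording that "reads better" to a human actually made the model's behaviour worse.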
-
#852 has been opened to address this.
-
I'll resolve this for now. Please reopen to continue the discussion.
-
Your dad calls: mom's old sweater-maker-98 broke down and he's thinking about spending $23,000,000 on a second-hand "print shirt mk 3" so she can still make you a sweater this year.
I'm trying to put together a research agent group that can build a customized recommendation list, along with some purchase guidance, running against local LMs. I've tried a bunch of simple and complex approaches, but none of them works. The moment I stray from the 'arxiv research' task in the original group-web-research example, nothing works.
I either end up with it dumping a list of recommendations on the first pass (out of date, with no sign of web access), or with it getting stuck on: 'GroupChat select_speaker failed to resolve the next speaker's name. This is because the speaker selection OAI call returned:'
https://gist.github.com/kfsone/4db4156fc77eb7a9ce6c09ae59278764
This one has gotten a little fancy ... I tried making it less likely that I'd make typos when telling one agent to talk to another...
Any advice/obvious gotchas? Any recommendations for reasonable models that might help? I've tried dolphin 2.2.1, Mistral 7b Instruct 0.1 k6m, sciphi-self-rag-mistral-32k ... I have the token window set to 16k to try to ensure context windows aren't the issue...
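For reference, a minimal sketch of the shape I'm going for (the model name and base_url are placeholders for the local server; it assumes a pyautogen version where GroupChat accepts speaker_selection_method, which avoids the LLM-driven select_speaker call entirely):

```python
import autogen

# Placeholder config for a local OpenAI-compatible endpoint.
config_list = [{
    "model": "local-model",                  # placeholder name
    "base_url": "http://localhost:1234/v1",  # e.g. LM Studio / llama.cpp server
    "api_key": "not-needed",
}]
llm_config = {"config_list": config_list, "temperature": 0.2}

researcher = autogen.AssistantAgent(
    name="researcher",
    llm_config=llm_config,
    system_message="Find and summarize recent reviews of candidate devices.",
)
critic = autogen.AssistantAgent(
    name="critic",
    llm_config=llm_config,
    system_message="Flag recommendations that are stale or lack evidence.",
)
user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config=False,
)

groupchat = autogen.GroupChat(
    agents=[user_proxy, researcher, critic],
    messages=[],
    max_round=8,
    # round_robin hands turns over deterministically, so a small local
    # model never has to answer the "who speaks next?" selection prompt.
    speaker_selection_method="round_robin",
)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)

user_proxy.initiate_chat(manager, message="Recommend three current budget 3D printers.")
```

Deterministic turn-taking gives up some flexibility, but it at least isolates whether the failures come from speaker selection or from the agents' actual task prompts.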