Structure way defining of corpus data #469
Comments
Currently, ChatterBot's corpus format is essentially just a list of dialog sets. For example:
{
    "conversations": [
        [
            "...",
            "...",
            "...",
            "..."
        ],
        [
            "...",
            "...",
            "...",
            "..."
        ]
    ]
}
It would be a good idea to modify the format so that it can store more information as you suggested. |
Question: as suggested, the corpus should be in JSON format. How do you train from the corpus in the main application? Say I exported the training corpus data: how do I retrain another bot on it? There aren't any examples of that. I basically wrote my own "adapter" that reads the corpus in JSON format and then loads it as JSON pairs for training. |
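A minimal sketch of what such an adapter could look like, assuming the corpus file uses the `conversations` layout described in this thread and that each statement is answered by the one that follows it in the same list. The names `load_conversations` and `to_training_pairs` are hypothetical and not part of ChatterBot's API.

```python
import json


def load_conversations(path):
    """Read a corpus file and return its list of conversations."""
    with open(path, encoding="utf-8") as corpus_file:
        data = json.load(corpus_file)
    return data["conversations"]


def to_training_pairs(conversations):
    """Yield (statement, response) pairs from consecutive statements."""
    for conversation in conversations:
        # Pair every statement with the one that directly follows it.
        for statement, response in zip(conversation, conversation[1:]):
            yield statement, response
```

Each pair could then be handed to whatever training routine the target bot uses.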
@kennetham Right now you can export your chat bot's knowledge as a JSON file: http://chatterbot.readthedocs.io/en/latest/corpus.html?highlight=export#exporting-your-chat-bot-s-database-as-a-training-corpus The ability to specify a file path for a training corpus will be added in #467 |
@gunthercox I am planning to write a PR for the above two enhancements. Do you have an ETA for issues #469 and #467? |
@vkosuri I haven't started working on anything to allow custom paths for corpus data (#467) yet, so feel free to start if you are interested in working on it. For this ticket (#469) I'm in the process of researching the formats of other existing data corpora. I'm interested to see if there are any design patterns that might be beneficial to follow. |
I just wanted to post a link for later reference. This is for the current work-in-progress concept for the future version of ChatterBot's dialog corpus files. I'm still considering other ideas so this document will be updated in the future. https://github.com/gunthercox/ChatterBot/wiki/ChatterBot-Corpus-Specification |
This looks good to me. Question
|
In this model, responses are indicated by consecutive statements in each list. |
Apologies for making this conversation longer 🔢. From the above statement, can I assume that if a question has multiple answers, I need two lists for the same question? How do I make programmable responses? Is there a way, when ChatterBot does not find the answer in the corpus, to tell ChatterBot to look for a programmable response? |
No problem, any questions you have about it are helpful because they let me consider things that I might not have thought about. If you have any other questions, please ask them. I want to get as much feedback on the design as possible before committing to it. Also, you are correct. For representing multiple responses to the same input, the input will have to be listed multiple times. I designed it this way to avoid deeply nested lists of responses, which might be difficult for developers to read and more intensive for programs to traverse. For programmable responses, I usually recommend some form of a customized logic adapter. However, I have seen valid cases where there are, for example, wildcards in statements. So if a statement is something like "My favorite color is {color}", then color can be any valid color. Wildcards like these are well supported by AIML, but they are currently not well supported by ChatterBot. I will definitely look into the possibility of supporting AIML in the new corpus format, or something similar. |
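To illustrate the two points above, here is a rough sketch. `group_responses` collects every response listed for an input that appears in several conversations, and `match_wildcard` turns a statement such as "My favorite color is {color}" into a regular expression. Both helpers are hypothetical illustrations and not ChatterBot features.

```python
import re
from collections import defaultdict


def group_responses(conversations):
    """Map each input to every response that follows it in any conversation."""
    responses = defaultdict(list)
    for conversation in conversations:
        for statement, response in zip(conversation, conversation[1:]):
            responses[statement].append(response)
    return dict(responses)


def match_wildcard(template, text):
    """Return the captured wildcard values, or None if the text does not match."""
    # Splitting on {name} placeholders puts the wildcard names at odd indexes.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2:
            pattern += "(?P<{}>.+)".format(part)  # named group per wildcard
        else:
            pattern += re.escape(part)  # literal text must match exactly
    match = re.fullmatch(pattern, text)
    return match.groupdict() if match else None
```

Calling `match_wildcard("My favorite color is {color}", "My favorite color is blue")` captures `blue` under the key `color`. Keeping the corpus flat and grouping at load time is one way to keep the file readable while still letting a program look up all responses for an input.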
Looking at http://www.alicebot.org/aaa.html, it is amazing, like ChatterBot. If I want to make a bot like
Were there any features that you saw in Alice bot that ChatterBot doesn't have? |
Here are some that I have found; please correct me if they are already there.
Bot Properties
I think it is a good idea to have a similar kind of feature.
Preprocessing statements
This is my first choice for implementation. It's an awesome feature.
Template mechanism
I am assuming these statements have a template. If that is correct, could you please share your views on this? <template>As a <bot name="age"/> year old <bot name="gender"/> I am not really interested in that discussion.</template>
Reusing of corpus data
I am also looking into whether part, all, or a few statements of a corpus can be reused in another corpus.
Corpus search order
Are we following any order when searching the corpus database? |
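As a rough illustration of the template mechanism asked about above, AIML-style `<bot name="..."/>` tags could be filled in from a dictionary of bot properties. The `render_template` helper and any property values passed to it are made up for this sketch; ChatterBot does not provide this.

```python
import re

# Matches self-closing tags such as <bot name="age"/>.
BOT_TAG = re.compile(r'<bot name="(\w+)"\s*/>')


def render_template(template, properties):
    """Replace each <bot name="key"/> tag with the matching bot property."""
    def substitute(match):
        key = match.group(1)
        # Leave the tag untouched when the property is unknown.
        return str(properties.get(key, match.group(0)))
    return BOT_TAG.sub(substitute, template)
```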
vkosuri did you implement AIML or add it to ChatterBot? |
We have added some of the AIML corpus to chatterbot-corpus. |
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
@gunthercox Is there any specific way to define corpus data for ChatterBot?
for example
If so, how will ChatterBot process this text?