Inconsistencies in intent classification #778
Comments
Hi @satnam2012, we are working on improving the reproducibility of training through the use of a random seed. However, an issue in the scikit-learn library was causing non-deterministic behaviour (see scikit-learn/scikit-learn#13422), which has prevented us from shipping this feature.
@satnam2012,
import io
import json
from snips_nlu import SnipsNLUEngine

# Load the training dataset
with io.open("path/to/dataset.json", encoding="utf8") as f:
    dataset = json.load(f)

# Train two engines with the same seed
engine_1 = SnipsNLUEngine(random_state=42).fit(dataset)
engine_2 = SnipsNLUEngine(random_state=42).fit(dataset)

# Both engines should return the same parsing result
res_1 = engine_1.parse("turn lights in basement")
res_2 = engine_2.parse("turn lights in basement")
assert res_1 == res_2
@adrienball Here's a minimum working example:
After running the above code twice and persisting the engine to a different directory each time, I compared the contents of I have Any idea how to solve this?
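The comparison of two persisted engine directories described above can be sketched with the standard library alone. This is only an illustration: `dir_digest` and `differing_files` are hypothetical helper names, not part of snips-nlu, and the sketch assumes the persisted engine is an ordinary directory tree of files.

```python
import hashlib
from pathlib import Path


def dir_digest(root):
    """Return a SHA-256 hex digest over the relative paths and bytes of all files under root."""
    root = Path(root)
    digest = hashlib.sha256()
    # Sort the files so the digest does not depend on filesystem traversal order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def differing_files(dir_a, dir_b):
    """List relative paths whose bytes differ or that exist in only one of the two directories."""
    dir_a, dir_b = Path(dir_a), Path(dir_b)
    files_a = {p.relative_to(dir_a).as_posix(): p for p in dir_a.rglob("*") if p.is_file()}
    files_b = {p.relative_to(dir_b).as_posix(): p for p in dir_b.rglob("*") if p.is_file()}
    diffs = []
    for name in sorted(set(files_a) | set(files_b)):
        if name not in files_a or name not in files_b:
            diffs.append(name)  # present on one side only
        elif files_a[name].read_bytes() != files_b[name].read_bytes():
            diffs.append(name)  # present on both sides, different content
    return diffs
```

If two engines trained with the same seed are persisted to two directories, equal digests would suggest identical output, and `differing_files` narrows down which files diverge when the digests differ.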
@drorvinkler I cannot reproduce using your example.
Cheers
@adrienball
I managed to reproduce the issue in Python 3.5. It seems to be specific to this version, as I can't reproduce it with Python 3.6 or Python 2.7.
Thanks
* Fix non-deterministic behavior fixes #778
* Update changelog
* Fix issue with dirhash
* Fix issues
* Fix issue with Python<3.4
Hi,
I have been working with the sample dataset and sample code from the quickstart at https://snips-nlu.readthedocs.io/en/latest/quickstart.html.
I have also added a new intent, "sampleTurnOffLight", to the same sample_dataset.json; the modified file is attached below.
sample_dataset.json.zip
For the text "turn lights in basement", I get a different classification on every run.
Note: I retrain (fit) the engine every time before I call parse.
I expect the result to be consistent across retrains.
Could you please confirm whether this behavior is expected?
Run 1:
{
  "input": "turn lights in basement",
  "slots": [],
  "intent": {
    "intentName": "sampleTurnOffLight",
    "probability": 0.6660875805168223
  }
}
Run 2:
{
  "input": "turn lights in basement",
  "slots": [],
  "intent": {
    "intentName": "sampleTurnOnLight",
    "probability": 0.6430405901353275
  }
}
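The retrain-then-parse check described in this report can be sketched as a small harness. The names `intents_across_retrains` and `is_consistent` are hypothetical, and `train_and_parse` is assumed to be a caller-supplied function that fits a fresh engine and returns a parsing result shaped like the JSON shown above.

```python
from collections import Counter


def intents_across_retrains(train_and_parse, text, n_runs=5):
    """Call train_and_parse(text) n_runs times and count the predicted intent names."""
    counts = Counter()
    for _ in range(n_runs):
        # Each call is expected to retrain from scratch before parsing.
        result = train_and_parse(text)
        counts[result["intent"]["intentName"]] += 1
    return counts


def is_consistent(counts):
    """True when every run predicted the same intent (or no run was made)."""
    return len(counts) <= 1
```

With snips-nlu, `train_and_parse` could be something like `lambda text: SnipsNLUEngine().fit(dataset).parse(text)`, with `dataset` loaded from sample_dataset.json. A counter holding more than one intent name would correspond to the inconsistency reported here.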