Starting point code for MDR implementation #1
Assumption(s): Labels are only binary
Implemented:
fit(self, features, classes): simply build a dictionary that maps each instance of the feature vector to a tuple. The tuple keeps count of how many times a particular label value appears with that instance of feature vector. Key: tuple of feature values - Value: tuple of label frequency/label counts
transform(self, features): After the dictionary is completed, combine each instance of feature vector above into one corresponding label that has the frequency ratio greater than its standard default ratio.
score(self, features, classes): Compare the new combined feature vector with its corresponding class labels, and count the times the two match. Output the average accuracy by averaging the match count over the length of the new feature vector / classes vector.
Implementation is tested in main() by training MDR on the training set and getting accuracy_score on the test set.
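For readers following the description, here is a minimal sketch of the fit/transform/score flow it outlines. It is not the code in this PR: the class name `MDRSketch`, the exact meaning given to `tie_break` and `default_label`, and the reading of "standard default ratio" as the overall ratio of 1-labels to 0-labels in the training data are all assumptions made for illustration. Labels are assumed to be the integers 0 and 1.

```python
from collections import defaultdict


class MDRSketch:
    """Illustrative binary-label MDR (hypothetical, not the PR's implementation)."""

    def __init__(self, tie_break=0, default_label=0):
        self.tie_break = tie_break          # assumed: label used when a cell ratio equals the overall ratio
        self.default_label = default_label  # assumed: label used for feature combinations not seen in fit
        self.feature_map = {}               # key: tuple of feature values, value: (count of 0, count of 1)
        self.class_totals = (0, 0)

    def fit(self, features, classes):
        counts = defaultdict(lambda: [0, 0])
        for row, label in zip(features, classes):
            counts[tuple(row)][label] += 1
        self.feature_map = {key: tuple(pair) for key, pair in counts.items()}
        self.class_totals = (
            sum(pair[0] for pair in counts.values()),
            sum(pair[1] for pair in counts.values()),
        )
        return self

    def _label_for(self, key):
        if key not in self.feature_map:
            return self.default_label
        n0, n1 = self.feature_map[key]
        total0, total1 = self.class_totals
        # Compare n1/n0 with total1/total0 by cross-multiplying,
        # which avoids dividing by a zero count.
        lhs, rhs = n1 * total0, total1 * n0
        if lhs > rhs:
            return 1
        if lhs < rhs:
            return 0
        return self.tie_break

    def transform(self, features):
        # Collapse each feature vector into a single constructed label.
        return [self._label_for(tuple(row)) for row in features]

    def score(self, features, classes):
        predictions = self.transform(features)
        matches = sum(int(p == c) for p, c in zip(predictions, classes))
        return matches / float(len(classes))


features = [[0, 0], [0, 0], [0, 1], [1, 0], [1, 1], [1, 1], [2, 0], [2, 0]]
classes = [0, 0, 1, 0, 1, 1, 0, 1]
mdr = MDRSketch().fit(features, classes)
print(mdr.transform(features))
print(mdr.score(features, classes))
```

In this toy data the combination (2, 0) appears once with each label while the overall classes are balanced, so it is the only cell that falls back to `tie_break`.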
description
tie_break: type int (default: 0)
description: specify the default label in case there's a tie in a given set of feature values
default_label: type int (default: 0)
Remove the word "type".
oops got cha!
Changed fdict to feature_map
Removed ‘type’ in line 34 & 36
Fixed all bugs according to Randal Olson’s comments.
@@ -18,29 +18,33 @@
"""

import pandas as pd

import numpy as np
from collections import defaultdict
from __future__ import print_function
from __future__ import print_function must be the first import in the file; only the module docstring, comments, and blank lines may come before it.
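A quick way to see why the ordering matters: Python rejects a misplaced `from __future__` import at compile time. The sketch below compiles two small module sources held in strings; the names `MISPLACED`, `CORRECT`, and `compiles` are made up for this example, and the import order in `MISPLACED` only mirrors the diff above.

```python
# Two tiny module sources kept as strings, so nothing is imported or executed.
MISPLACED = (
    "import numpy as np\n"
    "from collections import defaultdict\n"
    "from __future__ import print_function\n"
)
CORRECT = (
    '"""Module docstring."""\n'
    "from __future__ import print_function\n"
    "import numpy as np\n"
    "from collections import defaultdict\n"
)


def compiles(source):
    """Return True if the source compiles, False if it raises SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles(MISPLACED))  # False: the __future__ import follows other imports
print(compiles(CORRECT))    # True: only the docstring precedes it
```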