Variable types #10 #12
base: main
j-grover
commented
Oct 18, 2019
- Adding variable types as parameters to auto_entityset, make_entityset
- Test in tests/test_normalize
- Updated README
- Resolves #10 (Variable types not preserved after call to normalize_entity())
e3aff39 to 05ab955
Generally I think this PR looks pretty good, but I have a question about how we want this to behave for the ID columns. If you look at the original entityset ( @rwedge Is this behavior acceptable? I know we will have some differences in variable types where a column in the original non-normalized entityset gets set as an index in a normalized entity, but wasn't sure how we wanted non-index columns to be treated throughout. We probably should also update the test to test all of the columns we expect to have the same variable types throughout instead of just the single
@thehomebrewnerd changing the variable we normalized on to I agree the test should test all of the columns, and I think we should test that
@j-grover sorry for the delay on the review, are you interested in updating the tests?
361a774 to 8cc177c
8cc177c to db8257a
@rwedge Updated test to check all columns
@j-grover Thanks for the quick response and updates to the test. Since we have now modified the parameters for the Is that something you would be able to do as well?
What sort of things are we looking to test for these two methods? For example, one case with default values and one with custom args?
Yes, that is what I was thinking...tests very similar to the test you added for
7060663 to 46c2a19
@thehomebrewnerd
@j-grover Thanks for creating these tests. If we expect this behavior - where the name is not deterministic - we could set a variable for the name that is actually returned and then use that variable in the tests. One way that comes to mind would be to check if
I think you would also need to rename one of the dataframe columns for the I don't know enough about the details of this code to know if we expect to get the same name back every time, but I can look into that a bit more in the meantime to make sure this isn't highlighting some other issue.
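A rough sketch of the "set a variable for the name that is actually returned" idea, for illustration only. The helper `resolve_returned_name` and the entity names below are hypothetical and are not taken from this PR or from the library; the sketch only assumes that the returned names can be checked with `in`.

```python
def resolve_returned_name(available_names, candidate_names):
    """Return the first candidate that is actually present in available_names."""
    for name in candidate_names:
        if name in available_names:
            return name
    raise KeyError(
        f"none of {list(candidate_names)} found in {list(available_names)}"
    )


# Hypothetical entity names standing in for whatever the normalized
# entityset actually returns; the values are placeholders.
entities = {"team_city": object(), "scores": object()}

# Either ordering of the joined name is accepted, and the test then uses
# whichever one was really produced.
index_name = resolve_returned_name(entities, ["city_team", "team_city"])
```

A test written this way keeps passing regardless of which ordering is produced, at the cost of not pinning down the name itself.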
def test_auto_entityset_custom_args():
    dic = {'team': ['Red', 'Red', 'Red', 'Orange', 'Orange', 'Yellow',
These dictionaries get re-used in several tests, creating many duplicated lines of code. Could you turn these dictionaries into pytest fixtures so they're only defined once?
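As a minimal sketch of that suggestion: the data below is made up for illustration and is not the dictionary from this PR's tests, and the names `build_team_records` and `team_records` are hypothetical.

```python
import pytest


def build_team_records():
    # Hypothetical sample data; the PR's real dictionaries are not reproduced here.
    return {
        "team": ["Red", "Red", "Orange", "Yellow"],
        "city": ["Boston", "Boston", "Austin", "Denver"],
        "score": [3, 5, 2, 4],
    }


@pytest.fixture
def team_records():
    # pytest runs this for each test that names `team_records` as an argument,
    # so every test receives its own fresh copy of the dictionary.
    return build_team_records()


def test_columns_have_equal_length(team_records):
    lengths = {len(values) for values in team_records.values()}
    assert len(lengths) == 1


def test_expected_columns_present(team_records):
    assert set(team_records) == {"team", "city", "score"}
```

Because the fixture is function-scoped by default, a test that mutates the dictionary does not affect the others.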
@j-grover After reviewing the code with @rwedge , the non-deterministic nature of column names is to be expected as the new names are created by joining an unsorted list of column names. Issue #24 was created to fix this problem, so for this PR I suggest we go ahead and implement the tests as we have described above, and then we can update later after issue #24 is closed.
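To illustrate why joining an unsorted collection gives an order-dependent name, and how sorting first removes that dependence, here is a small sketch. The helper names and the `"_"` separator are assumptions for the example; the library's actual naming code is not shown in this thread.

```python
def join_unsorted(columns, sep="_"):
    # Hypothetical: the result depends on whatever order `columns` arrives in.
    return sep.join(columns)


def join_sorted(columns, sep="_"):
    # Sorting first makes the name independent of the input order.
    return sep.join(sorted(columns))


first_order = ["team", "city"]
second_order = ["city", "team"]

# The same two columns in a different order produce different unsorted names...
unsorted_names = {join_unsorted(first_order), join_unsorted(second_order)}

# ...but a single name once the columns are sorted before joining.
sorted_names = {join_sorted(first_order), join_sorted(second_order)}
```

With a deterministic name, a test can assert on the exact string instead of accepting several candidates.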
e0e8de0 to 72c82fd
Sorting the list sounds good. I have changed the index names accordingly to