-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Interaction matrix of user #10
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good.
I think you might want to think about writing the code that "glues" this all together. Instead of just writing the library functions, what does it looks like to call these functions and produce the final interaction matrix?
Hey @siddz415 , just wanted to check in if you're able to respond to the comments in this PR? If you're too busy with other stuff that's totally cool. We were thinking we would re-assign the PR if there's not any motion before the meeting on Nov 7. Let us know if you're still interested in working on it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using latin-1 encoding in load_movie_titles() seems odd, but perhaps we're loading an ancient file which is not in utf8. It's all commented code in any case, so we should probably just delete those lines before merging to main
.
The split() on ,
comma suggests that we might prefer to import csv
.
In lightfm_recommendation.py and main.py we have a lot of code up at module level. Recommend burying it within def main():
or whatever. That way it will be safe for some future unit test to import
it without side effects. The current code, which lacks a __main__
guard, is innocuous enough. But I'm concerned it will encourage folks to add stuff to this code base which is hard for a "second caller" (such as a unit test) to invoke. Such issues could be addressed in the current PR prior to a merge, or in a subsequent PR.
The file is from 2007, so it's completely possible that it's not UTF-8. It's also possible that it doesn't have any non-ASCII characters anyway, so it might not matter.
These text files are not CSV, they are only CSV-like. Individual fields can have commas in them. Specifically
I agree. Better to put things in a main function now and figure out how to call it later. We can always rename |
…o interaction-matrix-of-user
…bridge/MediaBridge into interaction-matrix-of-user
This PR is drifting toward being moribund. It is still mergeable at this point, though it proposes adding more than 4 million lines of CSV data, which for It might be helpful to split Data from Code, merge the code, and move on to other issues and feature requests. |
Thanks for updating the branch @jhanley634. We discussed early on in our in-person meetings that our strategy should not be to commit the Netflix prize data. That's the reason we're exploring the other PR that automatically downloads it. This has already been addressed in this comment: #10 (comment) |
…o interaction-matrix-of-user
…bridge/MediaBridge into interaction-matrix-of-user
Ok so I updated it to use a sparse matrix and removed the extra variable (right now its getting the files by calling But not pushing the changes yet cause want to make sure I'm not breaking anything (it throws an error, but also doesn't seem to work for different reasons without my changes). So just to make sure, But then when going through the file, this line It looks like currently, these are the files being used in that function? |
I guess my greater question is, what are in these |
The mv_*.txt files are in the form: MOVIE_ID: There is a file for each movie id, and the id in the file name corresponds to the id of the movie that is referenced. |
Ok thats good to know, TBH I will push my changes then for now cause it should theoretically work if thats true... |
…e reasonable location
Okay we're clearly working on this at the exact same moment, which probably isn't a great idea. Please pull. I consolidated all of the files and put them in a sane place. |
Will do! Hopefully didnt create some kind of merge conflict, but yeah will make sure to avoid any future interactions |
No description provided.