WSDM - KKBox's Music Recommendation Challenge
Primary Performance Verification Using Kaggle's Data
Using Kaggle's data, we complete a first implementation of the recommender system. The next goal is to use XGBoost to recommend music that fits each user's situation.
- Windows 10 Home
- CPU : Intel i5-7200U
- RAM : 8GB
- GPU : None
- ANACONDA : 5.7.4
- songs.csv
- The songs. Note that the data is in Unicode.
- song_id
- song_length: in ms
- genre_ids: genre category. Some songs have multiple genres and they are separated by |
- artist_name
- composer
- lyricist
- language
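The two songs.csv columns that need conversion before modelling are `song_length` (milliseconds) and `genre_ids` (several ids joined by `|`). A minimal sketch of that parsing, using only the standard library; the sample rows and the `parse_song` helper are hypothetical and are not taken from the project code:

```python
import csv
import io

# Hypothetical sample rows in the songs.csv column layout described above.
SAMPLE = (
    "song_id,song_length,genre_ids,artist_name,composer,lyricist,language\n"
    "s1,247640,465|1259,Artist A,Composer A,Lyricist A,3.0\n"
    "s2,197328,,Artist B,,,31.0\n"
)


def parse_song(row):
    """Convert one raw songs.csv row (a dict of strings) into typed values."""
    # An empty genre_ids field yields an empty list rather than [""].
    genres = [g for g in row["genre_ids"].split("|") if g]
    return {
        "song_id": row["song_id"],
        "length_sec": int(row["song_length"]) / 1000.0,  # ms -> seconds
        "genre_ids": genres,
        "artist_name": row["artist_name"],
    }


songs = [parse_song(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
```

Keeping `genre_ids` as a list makes it straightforward to derive features later, such as the number of genres per song.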
- members.csv
- user information.
- city
- bd: age. Note: this column has outlier values, please use your judgement.
- gender
- registered_via: registration method
- registration_init_time: format %Y%m%d
- expiration_date: format %Y%m%d
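The member fields above need two clean-up steps: parsing the `%Y%m%d` dates and dealing with outliers in `bd`. The sketch below shows one possible approach. The age bounds (1 to 100), the derived `membership_days` field, and the `clean_member` helper are assumptions for illustration, not the project's actual rules:

```python
from datetime import datetime

DATE_FMT = "%Y%m%d"  # format stated for both date columns


def clean_member(row, min_age=1, max_age=100):
    """Parse dates and mask implausible ages (the bounds are an assumption)."""
    age = int(row["bd"])
    init = datetime.strptime(row["registration_init_time"], DATE_FMT).date()
    expiry = datetime.strptime(row["expiration_date"], DATE_FMT).date()
    return {
        # Ages outside the assumed range are treated as missing.
        "age": age if min_age <= age <= max_age else None,
        "registration_init_time": init,
        "expiration_date": expiry,
        # Hypothetical derived feature: length of the membership in days.
        "membership_days": (expiry - init).days,
    }


# Hypothetical sample row with an outlier age of 0.
member = clean_member(
    {"bd": "0", "registration_init_time": "20110820", "expiration_date": "20170920"}
)
```

Masking outliers as missing, rather than dropping the rows, keeps every user available for the join with the listening events.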
- train.csv
- msno: user id
- song_id: song id
- source_system_tab: the name of the tab where the event was triggered. System tabs are used to categorize KKBOX mobile apps functions. For example, tab my library contains functions to manipulate the local storage, and tab search contains functions relating to search.
- source_screen_name: name of the layout a user sees.
- source_type: the entry point from which a user first plays music in the mobile apps. An entry point could be album, online-playlist, song, etc.
- target: this is the target variable. target=1 means there are recurring listening event(s) triggered within a month after the user’s very first observable listening event; target=0 otherwise.
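Tree-based models such as XGBoost expect numeric input, so the string columns above (`source_system_tab`, `source_screen_name`, `source_type`) are usually mapped to integer codes before training. The following is a minimal sketch of that step under stated assumptions: the sample rows and the helpers `fit_encoders` and `encode` are hypothetical, and the project may encode its features differently:

```python
# Hypothetical sample events in the column layout described above.
rows = [
    {"msno": "u1", "song_id": "s1", "source_system_tab": "my library",
     "source_screen_name": "Local playlist more", "source_type": "local-playlist",
     "target": "1"},
    {"msno": "u2", "song_id": "s2", "source_system_tab": "search",
     "source_screen_name": "Search", "source_type": "song", "target": "0"},
    {"msno": "u1", "song_id": "s2", "source_system_tab": "my library",
     "source_screen_name": "Search", "source_type": "album", "target": "1"},
]

CATEGORICAL = ["source_system_tab", "source_screen_name", "source_type"]


def fit_encoders(rows, columns):
    """Give each distinct string in each column an integer code (first-seen order)."""
    encoders = {c: {} for c in columns}
    for row in rows:
        for c in columns:
            encoders[c].setdefault(row[c], len(encoders[c]))
    return encoders


def encode(rows, encoders, unknown=-1):
    """Build a numeric feature matrix X and a label vector y."""
    cols = list(encoders)
    X = [[encoders[c].get(r[c], unknown) for c in cols] for r in rows]
    y = [int(r["target"]) for r in rows]
    return X, y


encoders = fit_encoders(rows, CATEGORICAL)
X, y = encode(rows, encoders)
```

`X` and `y` are plain lists that could then be converted to an array and passed to a classifier such as `xgboost.XGBClassifier`. Values not seen while fitting the encoders fall back to the `unknown` code.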
- Flow Chart
- Updated Flow Chart (2019-07-03)
- User situation information (User_Situation)
USERID, DAY, EVENT, LOCATION, WEATHER, TIME (MUSIC)
- Music information DB (Melon_Top100_DB)
FILENAME, ALBUM, ARTIST, GENRE, TITLE, YEAR, TAG
- User information (User)
USERID, AGE, SEX, SINGERS, GENRES, TAGS
- List of music listened to in each situation (User_Situation_Music)
USERID, DAY, TIME, MUSIC
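To illustrate how the situation tables could be queried together, here is a minimal sketch using an in-memory SQLite database. The source does not specify column types, value formats, or join keys, so everything beyond the table and column names is an assumption: all columns are stored as TEXT, the two tables are joined on USERID, DAY and TIME, and the sample rows and the `music_for_weather` helper are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column names follow the tables listed above; TEXT types are an assumption.
conn.executescript("""
CREATE TABLE User_Situation (
    USERID TEXT, DAY TEXT, EVENT TEXT, LOCATION TEXT, WEATHER TEXT, TIME TEXT
);
CREATE TABLE User_Situation_Music (
    USERID TEXT, DAY TEXT, TIME TEXT, MUSIC TEXT
);
""")

# Hypothetical sample rows.
conn.execute(
    "INSERT INTO User_Situation VALUES (?, ?, ?, ?, ?, ?)",
    ("u1", "2019-07-03", "commute", "subway", "rain", "08:00"),
)
conn.executemany(
    "INSERT INTO User_Situation_Music VALUES (?, ?, ?, ?)",
    [
        ("u1", "2019-07-03", "08:00", "song_a.mp3"),
        ("u1", "2019-07-03", "20:00", "song_b.mp3"),
    ],
)


def music_for_weather(conn, weather):
    """Return the music played in situations with the given weather."""
    rows = conn.execute(
        """
        SELECT m.MUSIC
        FROM User_Situation AS s
        JOIN User_Situation_Music AS m
          ON s.USERID = m.USERID AND s.DAY = m.DAY AND s.TIME = m.TIME
        WHERE s.WEATHER = ?
        ORDER BY m.MUSIC
        """,
        (weather,),
    ).fetchall()
    return [r[0] for r in rows]


rainy_day_music = music_for_weather(conn, "rain")
```

A query like this would produce situation-labelled listening records that could serve as training rows for the situation-aware recommender.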