Refactor in dataloader #559

Merged
merged 14 commits into from
Dec 8, 2020

chenyushuo
Collaborator

  1. Remove pre-neg-sampling from Dataloader.
  2. Remove uid2index from Dataset.
  3. Significantly speed up sampler.py and fix a bug in the sampler.
  4. Rename feat_list to feat_name_list in Dataset.
  5. Remove config['fill_nan'].
  6. Fix the output of Trainer._generate_train_loss_output().
  7. Change pandas.DataFrame to Interaction when data is processed in Dataset.
  8. Fix the mismatched neg-item features in the context neg-sample dataloader.
  9. Fix the time_field type error in the sequential dataloader.
  10. Fix empty kg_feat in the kg dataloader.
  11. Speed up Dataset._filter_by_inter_num().
  12. Speed up sequential_dataloader and fix a runtime error in data_preprocess.
  13. Remove drop_filter_field and drop_preload_weight.
  14. Add unused_col to drop columns that are used only in data preparation and not in the model.
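
To illustrate the idea behind item 14, here is a minimal sketch in plain Python, not the code from this PR. It assumes that `unused_col` maps a feature source name (here `inter`) to a list of column names, and the helper `drop_unused_columns` and the dict-of-lists table layout are hypothetical names chosen for the example. A column such as `timestamp` stays available during data preparation (for ordering) and is dropped only before the features reach the model.

```python
from typing import Dict, List, Optional

# A table is modelled as: column name -> list of column values (assumption).
Table = Dict[str, List]


def drop_unused_columns(
    feats: Dict[str, Table],
    unused_col: Optional[Dict[str, List[str]]],
) -> Dict[str, Table]:
    """Return copies of the tables without the columns named in unused_col.

    Hypothetical helper: the input tables are left unmodified.
    """
    if not unused_col:
        return {name: dict(table) for name, table in feats.items()}
    result = {}
    for name, table in feats.items():
        to_drop = set(unused_col.get(name, []))
        result[name] = {
            col: vals for col, vals in table.items() if col not in to_drop
        }
    return result


inter = {
    "user_id": [1, 1, 2],
    "item_id": [10, 11, 10],
    "timestamp": [300, 100, 200],
}

# Data preparation can still use timestamp: sort interactions by time.
order = sorted(range(len(inter["timestamp"])), key=lambda i: inter["timestamp"][i])
inter = {col: [vals[i] for i in order] for col, vals in inter.items()}

# Only afterwards is the preparation-only column removed for the model.
model_feats = drop_unused_columns({"inter": inter}, {"inter": ["timestamp"]})
```

Returning copies rather than deleting columns in place keeps the prepared data intact, which may be convenient if the same tables are reused to build several dataloaders.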

@chenyushuo chenyushuo requested a review from 2017pxy December 8, 2020 08:22
@2017pxy 2017pxy merged commit d47c224 into RUCAIBox:0.2.x Dec 8, 2020