This repository has been archived by the owner on Nov 1, 2024. It is now read-only.

Supports More Operations for Recommendation Systems #494

Open
Ash-Zheng opened this issue Sep 21, 2022 · 1 comment

Comments

@Ash-Zheng

Hi,

I noticed that some data preprocessing operations used in recommendation systems, such as bucketize, sigridHash, and firstX, are implemented in torcharrow/tree/main/csrc/velox/functions/rec

I would like to ask whether other preprocessing operations for recommendation systems will be supported in the future.
For example, a recent paper from Meta [1] lists 16 common preprocessing operations in Table 11, including bucketize, sigridHash, firstX, Cartesian, IdListTransform, BoxCox, MapId, and NGram.
Most of them are not supported yet. Are there plans to add these operations to torcharrow?
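
To make the request more concrete, here is a rough plain-Python sketch of the semantics I have in mind for a few of these operations. This is not the torcharrow API: the function names (`first_x`, `bucketize`, `n_gram`, `box_cox`) are hypothetical, and the border handling in `bucketize` (a value equal to a border goes to the lower bucket) is my assumption rather than something taken from the existing implementation.

```python
import bisect
import math
from typing import List, Sequence, Tuple


def first_x(ids: Sequence[int], x: int) -> List[int]:
    """Keep at most the first x elements of an id list."""
    return list(ids[:max(x, 0)])


def bucketize(value: float, borders: Sequence[float]) -> int:
    """Return the index of the bucket that value falls into.

    Assumes borders is sorted ascending. A value equal to a border is
    placed in the lower bucket (an assumption for this sketch).
    """
    return bisect.bisect_left(borders, value)


def n_gram(ids: Sequence[int], n: int) -> List[Tuple[int, ...]]:
    """Return all contiguous windows of length n over an id list."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [tuple(ids[i:i + n]) for i in range(len(ids) - n + 1)]


def box_cox(value: float, lmbda: float) -> float:
    """Standard one-parameter Box-Cox transform for a positive value."""
    if value <= 0:
        raise ValueError("Box-Cox requires a positive input")
    if lmbda == 0:
        return math.log(value)
    return (value ** lmbda - 1.0) / lmbda
```

In a dataframe library these would presumably be applied column-wise instead of per value; the scalar versions are only meant to pin down the expected behavior.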

[1] Zhao, Mark, et al. "Understanding data storage and ingestion for large-scale deep recommendation model training: industrial product." Proceedings of the 49th Annual International Symposium on Computer Architecture. 2022.

@wenleix
Contributor

wenleix commented Oct 4, 2022

cc @YLGH
