[ML-298][Core] Optimize LabelPoints single-threaded to multi-threaded and the rename coalesce functions #302

Merged
merged 3 commits on Jul 3, 2023

Conversation

minmingzhu
Collaborator

@minmingzhu minmingzhu commented Jun 1, 2023

What changes were proposed in this pull request?

Closes #298.

  1. Optimize data conversion by replacing the single-threaded copy with a multi-threaded copy.
  2. Coalesce Vectors and LabeledPoints for GPU:

coalesceVectorsToHomogenTables
coalesceLabelPointsToHomogenTables

  3. Coalesce Vectors, dense LabeledPoints, and sparse LabeledPoints for CPU:

coalesceVectorsToNumericTables
coalesceLabelPointsToNumericTables
coalesceSparseLabelPointsToSparseNumericTables
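
For readers unfamiliar with the idea behind the first item, here is a minimal sketch of copying rows into one contiguous array with several threads. It is not the code from this PR: `DensePoint`, `ParallelCopySketch`, and `copyToContiguous` are hypothetical names, and `DensePoint` is only a stand-in for a Spark LabeledPoint with dense features so that the example runs without Spark or oneDAL. It assumes every row has the same number of features and that `numRows * numCols` fits in an `Int`.

```scala
import java.util.concurrent.{Callable, Executors}

// Hypothetical stand-in for a Spark LabeledPoint with dense features.
final case class DensePoint(label: Double, features: Array[Double])

object ParallelCopySketch {
  /** Copies all rows into a row-major contiguous feature array plus a label array.
    * Each worker owns a disjoint range of rows, so the writes need no locking. */
  def copyToContiguous(
      points: Array[DensePoint],
      numCols: Int,
      numThreads: Int): (Array[Double], Array[Double]) = {
    require(numThreads > 0, "numThreads must be positive")
    val numRows = points.length
    val features = new Array[Double](numRows * numCols)
    val labels = new Array[Double](numRows)
    val pool = Executors.newFixedThreadPool(numThreads)
    try {
      // Ceiling division so that every row belongs to exactly one chunk.
      val chunk = math.max(1, (numRows + numThreads - 1) / numThreads)
      val tasks = new java.util.ArrayList[Callable[Unit]]()
      var start = 0
      while (start < numRows) {
        val from = start
        val until = math.min(start + chunk, numRows)
        tasks.add(new Callable[Unit] {
          override def call(): Unit = {
            var i = from
            while (i < until) {
              val p = points(i)
              require(p.features.length == numCols, s"row $i has wrong width")
              // Row i starts at offset i * numCols in the flat array.
              System.arraycopy(p.features, 0, features, i * numCols, numCols)
              labels(i) = p.label
              i += 1
            }
          }
        })
        start = until
      }
      // invokeAll blocks until every task has finished; get() rethrows any failure.
      val futures = pool.invokeAll(tasks)
      val it = futures.iterator()
      while (it.hasNext) it.next().get()
    } finally {
      pool.shutdown()
    }
    (features, labels)
  }
}
```

Calling `ParallelCopySketch.copyToContiguous(points, numCols, 4)` returns the flat feature array and the label array. Partitioning by row range keeps each thread writing to its own region of the output, which is why a plain array is enough here.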

@minmingzhu minmingzhu changed the title [ML-298] Optimize merge labeled point [ML-298] Optimize merge labeledpoint Jun 1, 2023
@github-actions

github-actions bot commented Jun 1, 2023

#298

@minmingzhu minmingzhu force-pushed the optimize_merge_labeledPoint branch 2 times, most recently from 7757a67 to 072b491 Compare June 5, 2023 08:44
@minmingzhu minmingzhu force-pushed the optimize_merge_labeledPoint branch from d8153de to 2cb7ba0 Compare June 14, 2023 05:39
@minmingzhu minmingzhu force-pushed the optimize_merge_labeledPoint branch 2 times, most recently from 07dc5a9 to ce644d4 Compare June 29, 2023 08:17
…oint data conversion

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update OneDAL.scala

Signed-off-by: minmingzhu <minming.zhu@intel.com>

optimize labeledpoint data conversion

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update logs

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update

Signed-off-by: minmingzhu <minming.zhu@intel.com>

1. remove debug log
2. format code style

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update OneDAL.scala

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update OneDAL.scala

Signed-off-by: minmingzhu <minming.zhu@intel.com>

[ML-282] Refactor CPU & GPU examples (oap-project#306)

* First move

* Move device discover for scala

* Delete old gpu discover

* Add run-all-gpu

* Add clean up

* Add tmp utils file

* Add exe

* Rename run script

* Scala gpu donw

* Scala cpu done

* For ci

* pyspark ci

* Rename scala

* Rename scala file in scripts

* Pyspark unit done

* Update pyspark utils

* Update ci

* Remove tmp utils

* Reaname utils

* Change absolute path, rm als gpu.sh

* Scala absolute path

* Change sanity check

* Rename ci

* Split random_forest

* Fix name change in ci

* Fix path typo

* Fix typo

update OneDAL.scala

Signed-off-by: minmingzhu <minming.zhu@intel.com>

Update run-gpu.sh

update

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update OneDAL.scala

Signed-off-by: minmingzhu <minming.zhu@intel.com>

Update OneDAL.scala
@minmingzhu minmingzhu force-pushed the optimize_merge_labeledPoint branch from cc5b7f2 to 6f55b85 Compare June 29, 2023 08:38
2. remove unused functions

Signed-off-by: minmingzhu <minming.zhu@intel.com>
Signed-off-by: minmingzhu <minming.zhu@intel.com>
@xwu99
Collaborator

xwu99 commented Jul 3, 2023

The title and description are not related to what you have changed.

@minmingzhu minmingzhu changed the title [ML-298] Optimize merge labeledpoint [ML-298] Optimize merge labeledpoint and the coalesce of various data structures is named uniformly Jul 3, 2023
@minmingzhu minmingzhu changed the title [ML-298] Optimize merge labeledpoint and the coalesce of various data structures is named uniformly [ML-298] Optimize data conversion that make single-threaded copy becomes multi-threaded copy and the coalesce of various data structures is named uniformly Jul 3, 2023
@xwu99 xwu99 changed the title [ML-298] Optimize data conversion that make single-threaded copy becomes multi-threaded copy and the coalesce of various data structures is named uniformly [ML-298][Core] Optimize LabelPoints single-threaded to multi-threaded and the rename coalesce functions Jul 3, 2023
@xwu99 xwu99 merged commit 4fca9ec into oap-project:master Jul 3, 2023
minmingzhu added a commit to minmingzhu/oap-mllib that referenced this pull request Jul 4, 2023
… and the rename coalesce functions (oap-project#302)

* Used multi threads copy data to continuous array to optimize labeledpoint data conversion

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update OneDAL.scala

Signed-off-by: minmingzhu <minming.zhu@intel.com>

optimize labeledpoint data conversion

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update logs

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update

Signed-off-by: minmingzhu <minming.zhu@intel.com>

1. remove debug log
2. format code style

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update OneDAL.scala

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update OneDAL.scala

Signed-off-by: minmingzhu <minming.zhu@intel.com>

[ML-282] Refactor CPU & GPU examples (oap-project#306)

* First move

* Move device discover for scala

* Delete old gpu discover

* Add run-all-gpu

* Add clean up

* Add tmp utils file

* Add exe

* Rename run script

* Scala gpu donw

* Scala cpu done

* For ci

* pyspark ci

* Rename scala

* Rename scala file in scripts

* Pyspark unit done

* Update pyspark utils

* Update ci

* Remove tmp utils

* Reaname utils

* Change absolute path, rm als gpu.sh

* Scala absolute path

* Change sanity check

* Rename ci

* Split random_forest

* Fix name change in ci

* Fix path typo

* Fix typo

update OneDAL.scala

Signed-off-by: minmingzhu <minming.zhu@intel.com>

Update run-gpu.sh

update

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update

Signed-off-by: minmingzhu <minming.zhu@intel.com>

update OneDAL.scala

Signed-off-by: minmingzhu <minming.zhu@intel.com>

Update OneDAL.scala

* 1. rename function name
2. remove unused functions

Signed-off-by: minmingzhu <minming.zhu@intel.com>

* update OneDAL.scala

Signed-off-by: minmingzhu <minming.zhu@intel.com>

---------

Signed-off-by: minmingzhu <minming.zhu@intel.com>