feat: add chunk size parameter for copying java data to native #1041

Merged May 6, 2021 (1 commit)

imatiach-msft
Contributor

  • add chunk size parameter for copying java data to native
  • fix minor memory leak
  • increase the default chunk size from 1000 to 10000. For most real scenarios this is probably still smaller than it should be: the ideal value is the number of rows in the dataset, but we don't want to run a count beforehand because it is expensive
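
To illustrate why the chunk size matters, here is a minimal Python sketch of chunked buffering for a stream of rows whose length is not known in advance. It is not the code from this PR: `ChunkedBuffer`, `copy_rows`, and their parameters are hypothetical names, and a plain Python `array` stands in for a preallocated native buffer.

```python
from array import array
from typing import Iterable, List


class ChunkedBuffer:
    """Append-only buffer that grows one fixed-size chunk at a time.

    Hypothetical helper for illustration only; not part of any library.
    """

    def __init__(self, chunk_size: int) -> None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self.chunk_size = chunk_size
        self.chunks: List[array] = []
        self.count = 0

    def add(self, value: float) -> None:
        # Start a new chunk only when there is none yet or the last one is full.
        if not self.chunks or len(self.chunks[-1]) == self.chunk_size:
            self.chunks.append(array("d"))
        self.chunks[-1].append(value)
        self.count += 1

    def coalesce(self) -> array:
        # Copy every chunk, in order, into one contiguous array.
        out = array("d")
        for chunk in self.chunks:
            out.extend(chunk)
        return out


def copy_rows(rows: Iterable[float], chunk_size: int) -> ChunkedBuffer:
    """Consume the rows in a single pass, without counting them first."""
    buf = ChunkedBuffer(chunk_size)
    for value in rows:
        buf.add(value)
    return buf
```

In this sketch, 25,000 values need 25 chunks at a chunk size of 1000 but only 3 at 10000. Reducing that per-chunk overhead is the motivation for the larger default, and a chunk size equal to the row count would need a single chunk.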

@imatiach-msft imatiach-msft force-pushed the ilmat/chunking-param branch from 2dde42d to ad10270 Compare May 3, 2021 05:42
@imatiach-msft imatiach-msft changed the title add chunk size parameter for copying java data to native feat: add chunk size parameter for copying java data to native May 3, 2021
@imatiach-msft
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@codecov

codecov bot commented May 3, 2021

Codecov Report

Merging #1041 (0f87a2c) into master (aad223e) will decrease coverage by 0.03%.
The diff coverage is 94.73%.

@@            Coverage Diff             @@
##           master    #1041      +/-   ##
==========================================
- Coverage   84.86%   84.82%   -0.04%     
==========================================
  Files         203      203              
  Lines        9640     9648       +8     
  Branches      559      548      -11     
==========================================
+ Hits         8181     8184       +3     
- Misses       1459     1464       +5     
| Impacted Files | Coverage Δ | |
|---|---|---|
| .../com/microsoft/ml/spark/lightgbm/TrainParams.scala | 100.00% <ø> (ø) | |
| ...m/microsoft/ml/spark/lightgbm/LightGBMParams.scala | 89.35% <80.00%> (-0.23%) | ⬇️ |
| ...crosoft/ml/spark/lightgbm/LightGBMClassifier.scala | 91.01% <100.00%> (+0.10%) | ⬆️ |
| ...m/microsoft/ml/spark/lightgbm/LightGBMRanker.scala | 63.07% <100.00%> (ø) | |
| ...icrosoft/ml/spark/lightgbm/LightGBMRegressor.scala | 72.22% <100.00%> (ø) | |
| ...om/microsoft/ml/spark/lightgbm/LightGBMUtils.scala | 89.58% <100.00%> (+0.22%) | ⬆️ |
| ...a/com/microsoft/ml/spark/lightgbm/TrainUtils.scala | 86.40% <100.00%> (ø) | |
| ...a/com/microsoft/ml/spark/io/http/HTTPClients.scala | 73.33% <0.00%> (-10.00%) | ⬇️ |
| ...microsoft/ml/spark/cognitive/SpeechToTextSDK.scala | 90.62% <0.00%> (+0.78%) | ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1c4691f...0f87a2c.

@imatiach-msft imatiach-msft force-pushed the ilmat/chunking-param branch from ad10270 to 0f87a2c Compare May 6, 2021 05:19
@imatiach-msft
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@imatiach-msft imatiach-msft merged commit b7f29e8 into microsoft:master May 6, 2021