Commit a0345cb
[SPARK-21680][ML][MLLIB] optimize Vector compress
## What changes were proposed in this pull request?
Using Vector.compressed to convert a Vector to a SparseVector is much slower than calling Vector.toSparse directly.
This is because Vector.compressed scans the values three times, whereas Vector.toSparse needs only two scans.
When the vector is long, the performance difference between the two methods is significant.
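
The scan counts described above can be illustrated with a minimal, Spark-free sketch. Everything in it is an assumption made for illustration: `CompressSketch`, `Sparse`, and `toSparseWithCount` are hypothetical names rather than the patch's actual code, and the size threshold in `compressed` is only a plausible heuristic. The idea it shows is that a non-zero count computed once can be handed to the conversion, so the choice between dense and sparse does not trigger a second counting pass.

```scala
object CompressSketch {
  // Hypothetical stand-in for a sparse vector; not Spark's SparseVector.
  final case class Sparse(size: Int, indices: Array[Int], values: Array[Double])

  // One scan: count the non-zero entries.
  def numNonzeros(values: Array[Double]): Int = {
    var nnz = 0
    var i = 0
    while (i < values.length) {
      if (values(i) != 0.0) nnz += 1
      i += 1
    }
    nnz
  }

  // One scan: copy non-zeros into arrays pre-sized from a known count.
  def toSparseWithCount(values: Array[Double], nnz: Int): Sparse = {
    val idx = new Array[Int](nnz)
    val vs = new Array[Double](nnz)
    var k = 0
    var i = 0
    while (i < values.length) {
      if (values(i) != 0.0) {
        idx(k) = i
        vs(k) = values(i)
        k += 1
      }
      i += 1
    }
    Sparse(values.length, idx, vs)
  }

  // Two scans in total: count, then fill.
  def toSparse(values: Array[Double]): Sparse =
    toSparseWithCount(values, numNonzeros(values))

  // Still two scans in total: the count used for the decision is reused
  // for the conversion. Calling toSparse(values) here instead would count
  // a second time, giving three scans.
  // The threshold below is an assumed heuristic, not taken from the source.
  def compressed(values: Array[Double]): Either[Array[Double], Sparse] = {
    val nnz = numNonzeros(values)
    if (1.5 * (nnz + 1.0) < values.length) Right(toSparseWithCount(values, nnz))
    else Left(values)
  }
}
```

With this shape, `CompressSketch.compressed(Array(0.0, 3.0, 0.0, 0.0, 0.0, 5.0, 0.0, 0.0))` takes the sparse branch after one counting scan and one filling scan.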
## How was this patch tested?
The existing unit tests.
Author: Peng Meng <peng.meng@intel.com>
Closes #18899 from mpjlu/optVectorCompress.
1 parent 7add4e9 commit a0345cb
File tree
5 files changed, +60 -14 lines changed
- mllib-local/src
  - main/scala/org/apache/spark/ml/linalg
  - test/scala/org/apache/spark/ml/linalg
- mllib/src
  - main/scala/org/apache/spark/mllib/linalg
  - test/scala/org/apache/spark/mllib/linalg
- project
Lines changed: 18 additions & 6 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 139 | 139-153 | 1 line removed, 15 added |
| 155 | 169 | 1 line removed, 1 added |
| 498-499 | 512 | 2 lines removed, 1 added |
| 638-639 | 651 | 2 lines removed, 1 added |
Lines changed: 10 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| after 320 | 321-325 | 5 lines added |
| after 325 | 331-335 | 5 lines added |
Lines changed: 18 additions & 8 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 152 | 152-166 | 1 line removed, 15 added |
| 168 | 182 | 1 line removed, 1 added |
| 672-674 | 686 | 3 lines removed, 1 added |
| 825-827 | 837 | 3 lines removed, 1 added |
Lines changed: 10 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| after 353 | 354-358 | 5 lines added |
| after 358 | 364-368 | 5 lines added |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| after 1017 | 1018-1021 | 4 lines added |