Snap 2358 Sorted Column Batches on partitioning keys #1054

Open
wants to merge 392 commits into base: master

392 commits
f6fd03d
Merge branch 'master' into li-perfomance-test
Mar 7, 2018
1543f6f
Merge branch 'li-perfomance-test' into vivek-try1
Mar 7, 2018
aad6fe1
Reverted benchmark. Now use QueryBenchmark only for data generation p…
Mar 7, 2018
30d4947
Updated test with new values
Mar 7, 2018
b6b5849
Merge branch 'li-perfomance-test' into vivek-try1
Mar 7, 2018
40ed17c
Updated test
Mar 7, 2018
016a60a
test code refactoring
Mar 7, 2018
9466d7c
Merge branch 'li-perfomance-test' into vivek-try1
Mar 7, 2018
233f134
Updated test
Mar 7, 2018
4bed89d
Updated test
Mar 7, 2018
1198a93
Merge branch 'li-perfomance-test' into vivek-try1
Mar 7, 2018
0111bc1
Updated test
Mar 7, 2018
7a4cba3
Updated test to handle multiple inserts
Mar 8, 2018
21cd499
Merge branch 'li-perfomance-test' into vivek-try1
Mar 8, 2018
0b5c110
Compilation issue
Mar 8, 2018
3d7eded
Added more duplicity in data
Mar 8, 2018
88b1462
Merge branch 'li-perfomance-test' into vivek-try1
Mar 8, 2018
5fbbd44
Updated params for test
Mar 8, 2018
7266fd4
Merge branch 'li-perfomance-test' into vivek-try1
Mar 8, 2018
5ad492c
Updated expectedcount
Mar 8, 2018
b4d8ffe
Merge branch 'li-perfomance-test' into vivek-try1
Mar 8, 2018
b7a351a
Updated expected count in test
Mar 8, 2018
ce2f4a1
Updated estimated count
Mar 8, 2018
a9c987a
Updated tests
Mar 8, 2018
3f871f3
[SNAP-2243][SNAP-2188] procedure for smart connector iteration and fixes
Mar 8, 2018
017b5ac
fixed test failures
Mar 9, 2018
c461959
changed Filters to Expressions in Relation APIs
Mar 9, 2018
0a9107e
Merge remote-tracking branch 'origin/master' into SNAP-2243
Mar 9, 2018
e3b4d41
fixed issues in the new StartsWithForStats expression code
Mar 9, 2018
a22b104
removed a debug assertion
Mar 9, 2018
a722424
correct one issue
Mar 10, 2018
022b3b7
update store link
Mar 10, 2018
3f3fa6f
added release for last batch in RemoteEntriesIterator
Mar 10, 2018
7d6211a
minor comment changes
Mar 10, 2018
664760c
minor change to retrieve ClusteredColumnIterator only once
Mar 11, 2018
63625b2
[SNAP-2244] stats for delta column batches
Mar 11, 2018
6dfb597
fixed failures and few cleanups
Mar 11, 2018
ad8c71e
added some code comments
Mar 11, 2018
e9a9f8e
fixing some failures
Mar 12, 2018
caf53c4
link store
Mar 12, 2018
9c0c3a9
fix a failure in SnappyRowStoreModeDUnit
Mar 12, 2018
a0b0927
Merge branch 'master' into li-perfomance-test
Mar 12, 2018
986ca27
Sync spark
Mar 12, 2018
368c152
Merge branch 'li-perfomance-test' into vivek-try1
Mar 12, 2018
ebcdca3
added tests for delta stats check; incorporate review comment
Mar 12, 2018
2d296ab
Changes to handle latest master merge
Mar 12, 2018
47d8e6b
Changes to get rid of any customization to spark
Mar 12, 2018
b029a27
Doing away with any customization in join attributes or forcing SMJ
Mar 12, 2018
3ad0ab2
Reintroducing customization to spark for testing purpose. Will be rev…
Mar 13, 2018
b84afc4
Customization to spark for testing purpose. Will be reverted back.
Mar 13, 2018
e53992d
Disabling insert part of Put-Into
Mar 13, 2018
4192c46
Changed put-into join condition to full outer.
Mar 13, 2018
e735f58
Further done away with Insert part of PutInto in excution part.
Mar 13, 2018
e00602f
Code refactoring
Mar 13, 2018
25ebcdf
Sync spark
Mar 13, 2018
2dab014
Sync spark
Mar 13, 2018
226fa2f
Switch to QueryBenchmark
Mar 14, 2018
b702bea
Merge branch 'li-perfomance-test' into li-sorted-insert
Mar 14, 2018
cf17647
Benchmark: For rate pass avg than best
Mar 14, 2018
44fc5ff
Added test for measuring multithreaded performance.
Mar 15, 2018
3838aaa
Merge branch 'li-perfomance-test' into li-sorted-insert
Mar 15, 2018
4e0b6d9
Allow even single thread to run in multi-threaded mode test.
Mar 15, 2018
b535c0d
Merge branch 'li-perfomance-test' into li-sorted-insert
Mar 15, 2018
bcbf257
Small change
Mar 15, 2018
574c1e0
Merge branch 'li-perfomance-test' into li-sorted-insert
Mar 15, 2018
7a020e6
Fix for NPE in CachedDataFrame
Mar 15, 2018
19a0643
Merge branch 'li-perfomance-test' into li-sorted-insert
Mar 15, 2018
d99332e
Added Dunit based test and performance test
Mar 16, 2018
cdf0827
Merge branch 'li-perfomance-test' into li-sorted-insert
Mar 16, 2018
26e6682
Minor update
Mar 16, 2018
9ba4768
Merge branch 'master' into li-perfomance-test
Mar 19, 2018
750ed7b
Merge branch 'SNAP-2244' into li-perfomance-test
Mar 19, 2018
133e893
Merge branch 'master' into li-perfomance-test-SNAP-2243
Mar 19, 2018
db54338
Merge branch 'li-perfomance-test' into li-perfomance-test-SNAP-2243
Mar 19, 2018
e5fb3ba
Updated properties being set
Mar 19, 2018
3516fae
Merge branch 'li-perfomance-test-SNAP-2243' into li-sorted-insert-SNA…
Mar 19, 2018
76b9b1a
Changes to complete merge of SNAP-2243/SNAP-2244
Mar 19, 2018
68e75da
Importand changes for successful merge of SNAP-2244
Mar 19, 2018
0e5d72d
Update multithreaded performance test
Mar 20, 2018
b6ca4c2
Changes done for Multithreaded performance test
Mar 20, 2018
d99dbbc
Changes for latency tests
Mar 20, 2018
edfc067
Updated number of threads
Mar 20, 2018
26bc445
Refactored Multi threaded tests
Mar 20, 2018
a8a028b
Merge branch 'li-perfomance-test-SNAP-2243' into li-sorted-insert-SNA…
Mar 20, 2018
e9028b7
Merge branch 'master' into li-perfomance-test-SNAP-2243
Mar 20, 2018
9379272
Merge branch 'li-perfomance-test-SNAP-2243' into li-sorted-insert-SNA…
Mar 20, 2018
ebdd84a
Refactoring of test code
Mar 20, 2018
03f767b
Merge branch 'li-perfomance-test-SNAP-2243' into li-sorted-insert-SNA…
Mar 21, 2018
8428f31
Added test for join performance
Mar 21, 2018
62b87fc
Fix for build failure
Mar 21, 2018
62f3b8b
Changes to generate correct data
Mar 21, 2018
cf33f82
Merge branch 'master' into li-perfomance-test-SNAP-2243
Mar 21, 2018
48ccec2
Merge branch 'li-perfomance-test-SNAP-2243' into li-sorted-insert-SNA…
Mar 21, 2018
e9a0d4d
Fix a build issue
Mar 21, 2018
e8e7cd3
Refactored test so latency tests use Spark's Benchmark
Mar 22, 2018
128491b
Small change
Mar 22, 2018
634beb2
Merge branch 'li-perfomance-test-SNAP-2243' into li-sorted-insert-SNA…
Mar 22, 2018
59d35fc
Code refactoring and also marked as colocated table
Mar 22, 2018
7a3b0e1
Increased heap memory for test
Mar 22, 2018
05aeb5e
Slight change
Mar 22, 2018
5395fdf
Merge branch 'li-perfomance-test-SNAP-2243' into li-sorted-insert-SNA…
Mar 22, 2018
04dbd70
Added an issue
Mar 22, 2018
950442f
Merge branch 'li-perfomance-test-SNAP-2243' into li-sorted-insert-SNA…
Mar 22, 2018
322bb8b
Merge branch 'master' into li-sorted
Mar 22, 2018
a3a2977
Merge branch 'li-master' into li-sorted
Mar 22, 2018
b46f103
Merge branch 'master' into li-master
Mar 23, 2018
2bdc66e
Small change
Mar 23, 2018
4584e2e
Merge branch 'li-master' into li-sorted
Mar 23, 2018
7f243eb
Code refactoring. Moved some code from li-sorted to li-master
Mar 23, 2018
09b9b9a
Restored DUnit test
Mar 23, 2018
76afd4c
Restore DUnit test
Mar 23, 2018
8a48073
Revert this. Only temporarily switch ON logs
Mar 23, 2018
b2e9deb
Merge branch 'li-master' into li-sorted
Mar 23, 2018
dbfe0fe
Disabled insert into row-buffer.
Mar 26, 2018
f93668e
Minor update
Mar 26, 2018
a593eaa
Force SMJ while performing join
Mar 26, 2018
3810294
Updated test
Mar 26, 2018
69f2915
Code refactoring
Mar 26, 2018
5f4f14c
Temporary fix for getting column batches in sorted order
Mar 28, 2018
f724b23
Temporary fix for insertingrows in sorted cahed batches which have no…
Mar 28, 2018
2ef6c47
Merge branch 'master' into li-master
Mar 29, 2018
ec6b5cb
Slight improvement over 'Temporary fix for getting column batches in
Mar 29, 2018
efe45c9
Updated join test
Apr 2, 2018
489de62
Merge branch 'li-master' into li-sorted
Apr 2, 2018
0d56597
Updated insert performance test
Apr 2, 2018
9fa8d73
Merge branch 'li-master' into li-sorted
Apr 2, 2018
7d57eb6
Minor change
Apr 2, 2018
0025772
Merge branch 'master' into li-master
Apr 4, 2018
258913c
Merge branch 'li-master' into li-sorted
Apr 4, 2018
6cb19d5
Merge branch 'master' into li-master
Apr 5, 2018
80773c6
Merge branch 'li-master' into li-sorted
Apr 5, 2018
45f9932
Removed redundant code
Apr 6, 2018
e5feb63
Follow up to: Temporary fix for insertingrows in sorted cahed batches…
Apr 6, 2018
3e0d2b6
code refactoring
Apr 6, 2018
7aa7db5
Even if Minimum or Maximum value has not changed, but delta would sti…
Apr 11, 2018
18294ef
Fix for handling multiple incremental inserts
Apr 11, 2018
37be741
Added test for multiple inserts
Apr 11, 2018
c826c00
Disable overflow of column batches while scanning
Apr 11, 2018
84d675f
Added test for duplicates in incremenatl inserts
Apr 12, 2018
7cd4ac0
Corrected a test
Apr 12, 2018
aa17784
Updated range query function
Apr 12, 2018
8a010bc
Updated range performance test
Apr 12, 2018
a74629a
Merge branch 'li-master' into li-sorted
Apr 12, 2018
e7edaaf
Correctly reflecting count of incremental insert in delta i.e. update…
Apr 13, 2018
617d6ac
Added test to do both update and incremental insert
Apr 13, 2018
c0eff80
Updated merge primitive to handle both update and insert
Apr 13, 2018
4899de2
Updated test for update and insert
Apr 13, 2018
a46bc4b
Fix issue with handling incremental insert and update on same ordinal.
Apr 17, 2018
bf7883b
Small code refactoring
Apr 17, 2018
a1e7b60
Small code refactoring
Apr 17, 2018
fe83a08
Handling duplicate values incse of both insert and update
Apr 17, 2018
d65f700
Updated test for update and insert
Apr 17, 2018
3e79612
Updated test
Apr 17, 2018
1632f4e
Updated test
Apr 18, 2018
382ae1a
Always do update of delta even when there is no changes in count
Apr 18, 2018
7203d96
Revert "Always do update of delta even when there is no changes in co…
Apr 18, 2018
253bb43
Merge branch 'master' into li-master
Apr 24, 2018
1fd04f8
Merge branch 'li-master' into li-sorted
Apr 24, 2018
49f400f
Handling both update and insert on same dataset even with multiple bu…
Apr 24, 2018
9b36374
Updated test for insert and update
Apr 24, 2018
3b3b9d1
Added new tests
Apr 27, 2018
bdb0db3
Adding rudimentary ColumnSortedInsertExec Node
Apr 27, 2018
9734d10
Merge branch 'master' into li-master
May 2, 2018
cddd170
Merge branch 'li-master' into li-sorted
May 2, 2018
c0bf507
Doing away with idea of adding an extra node over SMJ for insert for now
May 2, 2018
aa9ce03
Merge branch 'master' into li-master
May 8, 2018
0790206
Merge branch 'li-master' into li-sorted
May 8, 2018
7983bb4
Adding basic delete functionality
May 10, 2018
cebee52
Sync store
May 10, 2018
0fbb166
Merge branch 'master' into li-master
May 10, 2018
7bab71e
Merge branch 'li-master' into li-sorted
May 10, 2018
c5c878d
Merge branch 'master' into li-master
May 11, 2018
7e46161
Merge branch 'li-master' into li-sorted
May 11, 2018
a88ab2a
Changing column table scan so to handle deletes in both dictionary an…
May 15, 2018
46eaf0e
Improving over commit 7983bb43b07f36be2499995e3584e6b0a90049fe i.e. A…
May 15, 2018
e9c40d8
Improving delete tests
May 15, 2018
5784d72
Merge branch 'master' into li-master
May 15, 2018
c1a5c51
Merge branch 'li-master' into li-sorted
May 15, 2018
143bf46
Updated test
May 15, 2018
94e7f24
Disable join in insert route of Put Into
May 17, 2018
8a4f9f5
Updated test for delete
May 17, 2018
9212f46
Change related for debug print
May 17, 2018
a6a5ff6
Corrected delete handling i.e. Changing column table scan so to handl…
May 17, 2018
c373b4f
Do not release delta buffer if case of delta insert
May 17, 2018
320701c
Merge branch 'master' into li-master
May 17, 2018
0f5acac
Merge branch 'li-master' into li-sorted
May 17, 2018
9aa7549
Merge branch 'master' into li-master
May 23, 2018
a6c14ad
Merge branch 'li-master' into li-sorted
May 23, 2018
1ea09c1
Removing redundant changes
May 24, 2018
caf3ff6
Reverted all changes related to Put Into so far
May 29, 2018
89af47d
DML changes for insert that will follow a new path now
May 30, 2018
6c5bf44
Merge branch 'master' into SNAP-2358
May 30, 2018
b1926af
Reverting some redundant changes
May 30, 2018
3213e78
Changes from Sumedh to sort column batches on minimum value
May 31, 2018
5616e86
Temporary changes to make sorting on column batches working
Jun 4, 2018
33ecd9e
Merge branch 'master' into SNAP-2358
Jun 4, 2018
093c70d
Removing local flags
Jun 5, 2018
26e91a1
Revert "Doing away with idea of adding an extra node over SMJ for ins…
Jun 6, 2018
4648fde
Removed a hardcoded flag and also removing any dependency on changes …
Jun 6, 2018
cb6a2f5
Merge branch 'master' into SNAP-2358
Jun 6, 2018
0c94cad
Merge branch 'SNAP-2358' into SNAP-2358-v1
Jun 6, 2018
48eb8de
Remove a redundancy
Jun 7, 2018
0f3ee26
Merge branch 'master' into SNAP-2358
Jun 7, 2018
80d57ee
Merge branch 'SNAP-2358' into SNAP-2358-v1
Jun 7, 2018
226c2d9
Removing hardcoded elimination of exceptions.
Jun 7, 2018
baa31f4
Added a TODO
Jun 7, 2018
a66b0e4
Merge branch 'master' into SNAP-2358
Jun 12, 2018
14b42b2
Merge branch 'SNAP-2358' into SNAP-2358-v1
Jun 12, 2018
40ec6e2
Incorporating required DDL changes so column batches would only be so…
Jun 14, 2018
bec484e
Merge branch 'master' into SNAP-2358
Jun 14, 2018
c53cc46
Merge branch 'SNAP-2358' into SNAP-2358-v1
Jun 14, 2018
fb8995b
Updated DDL changes
Jun 14, 2018
75c3d1d
Handle case where table is not sorted
Jun 14, 2018
862b99e
small correction
Jun 14, 2018
412fd2a
First phase of changes to handle both first insert and delta insert w…
Jun 16, 2018
84f1dc6
Second phase of changes to handle both first insert and delta insert …
Jun 17, 2018
6299385
Code refactoring over last commit
Jun 17, 2018
6a1746a
Handling direct insert scenario
Jun 17, 2018
92aa5ad
Added change for handling delta insert rows that falls in existing range
Jun 17, 2018
8853518
Basic working version of delta insert for values that fall in range
Jun 18, 2018
8e1ee97
Removed one redundancy
Jun 18, 2018
19d8a44
Doing away with passing information to ColumnTableScan that whether i…
Jun 18, 2018
574e6c2
Merge branch 'master' into SNAP-2358
Jun 18, 2018
ea53b94
Merge branch 'SNAP-2358' into SNAP-2358-v1
Jun 18, 2018
e7e99a7
Merge branch 'SNAP-2358-v1' into SNAP-2358-v2
Jun 18, 2018
cb2c051
Code refactoring to remove some redundant code.
Jun 18, 2018
f619e24
Code refactoring for insert performance
Jun 18, 2018
d99ebd8
Added ways to avoid scans that are not needed while insert
Jun 19, 2018
24853e2
Merge branch 'master' into SNAP-2358
Jun 20, 2018
701225d
While creating ColumnFormatIterator i.e. scanner for column store, pa…
Jun 20, 2018
eb21a5d
Pass flag to ColumnTableScan that if table is sorted or not
Jun 20, 2018
73c9019
Enable sorted order insert only when table is sorted
Jun 20, 2018
728d12d
Test refactoring. Commented out a failing test
Jun 21, 2018
87f94f8
Removed debug changes for update related to delta insert. may be need…
Jun 21, 2018
bf3fcc4
Code refatoring to guard delta inserted change sunder a flag
Jun 21, 2018
6e88555
Further code refactoring to make pass precheckin
Jun 21, 2018
279b4d8
Putting a small fix for eliminating extra sort node
Jun 21, 2018
eefac55
Marked partitioning information in DeltaInsertExec
Jun 21, 2018
0c2a6ff
Removed extra debug flag
Jun 21, 2018
ea735fc
Disable dunit and performance tests by default
Jun 21, 2018
f6678f3
Code rectoring to bring code generated changes under a flag
Jun 21, 2018
c2222e4
Further refactoring for generated code and update
Jun 21, 2018
750e13e
Further refactor column delta related code for taking under a flag
Jun 21, 2018
fefdb54
For some scenario like test "update delete on column table" of Prepar…
Jun 22, 2018
3f88a75
Correctly taking care of sorting order provided by user
Jun 22, 2018
c2433a5
Merge branch 'master' into SNAP-2358
Jun 25, 2018
ad7ea90
Removing an optimization that used iterator only for cases of join an…
Jun 25, 2018
7c894fb
Fix for an issue reklated to sorting
Jun 27, 2018
61598ea
resolved merge conflict with master in SnappySessionState
hemanthmeka Aug 2, 2018
52b8ccf
merging from master
hemanthmeka Aug 8, 2018
@@ -0,0 +1,81 @@
/*
 * Copyright (c) 2017 SnappyData, Inc. All rights reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License"); you
 * may not use this file except in compliance with the License. You
 * may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
 * implied. See the License for the specific language governing
 * permissions and limitations under the License. See accompanying
 * LICENSE file.
 */
package org.apache.spark.sql.store

import scala.concurrent.duration.{FiniteDuration, MINUTES}

import io.snappydata.cluster.ClusterManagerTestBase

import org.apache.spark.sql.SnappyContext

/**
 * Runs SortedColumnTests and SortedColumnPerformanceTests as DUnit tests.
 */
class SortedColumnDUnitTest(s: String) extends ClusterManagerTestBase(s) {

  def testDummy(): Unit = {}

  def disabled_testBasicInsert(): Unit = {
    val snc = SnappyContext(sc).snappySession
    val colTableName = "colDeltaTable"
    val numElements = 551
    val numBuckets = 2

    SortedColumnTests.verfiyInsertDataExists(snc, numElements)
    SortedColumnTests.verfiyUpdateDataExists(snc, numElements)
    SortedColumnTests.testBasicInsert(snc, colTableName, numBuckets, numElements)
  }

  def disabled_testPointQueryPerformance(): Unit = {
    val snc = SnappyContext(sc).snappySession
    val colTableName = "colDeltaTable"
    val numElements = 999551
    val numBuckets = SortedColumnPerformanceTests.cores
    val numIters = 100
    SortedColumnPerformanceTests.benchmarkMultiThreaded(snc, colTableName, numBuckets, numElements,
      numIters, "PointQuery", numTimesInsert = 10,
      doVerifyFullSize = true)(SortedColumnPerformanceTests.executeQuery_PointQuery_mt)
    // while (true) {}
  }

  def disabled_testPointQueryPerformanceMultithreaded(): Unit = {
    val snc = SnappyContext(sc).snappySession
    val colTableName = "colDeltaTable"
    val numElements = 999551
    val numBuckets = SortedColumnPerformanceTests.cores
    val numIters = 100
    val totalNumThreads = SortedColumnPerformanceTests.cores
    val totalTime: FiniteDuration = new FiniteDuration(5, MINUTES)
    SortedColumnPerformanceTests.benchmarkMultiThreaded(snc, colTableName, numBuckets, numElements,
      numIters, "PointQuery multithreaded", numTimesInsert = 10, isMultithreaded = true,
      doVerifyFullSize = false, totalThreads = totalNumThreads,
      runTime = totalTime)(SortedColumnPerformanceTests.executeQuery_PointQuery_mt)
    // while (true) {}
  }

  def disabled_testRangeQueryPerformance(): Unit = {
    val snc = SnappyContext(sc).snappySession
    val colTableName = "colDeltaTable"
    val numElements = 999551
    val numBuckets = SortedColumnPerformanceTests.cores
    val numIters = 21
    SortedColumnPerformanceTests.benchmarkMultiThreaded(snc, colTableName, numBuckets, numElements,
      numIters, "RangeQuery", numTimesInsert = 10,
      doVerifyFullSize = true)(SortedColumnPerformanceTests.executeQuery_RangeQuery_mt)
    // while (true) {}
  }
}