forked from prestodb/presto
Example Pull Request: using 9897 #1
Open: munderseth wants to merge 142 commits into TryTestspace:master from unknown repository.
Conversation
Previously this method was just logging an error.
pageBufferClientCallbackExecutor is a fixed-sized thread pool, so wrapping it with a BoundedExecutor is unnecessary.
Since memory allocation in `MemoryPool.reserve()` is instantaneous, we should free the memory in the previous pool without delay.
This is "more correct", as pools are stateful.
Separate handling of joins and semi-joins.
This allows connectors extending `presto-base-jdbc` to support data types other than those supported by `BaseJdbcClient` and `JdbcRecordCursor` out of the box.
Previously the exception would say `No page sink provider for connector 'catalog-name'`.
`toFailure(s)` is recursive, so it can create cycles through suppressed exceptions or causes. Use a hash set to detect already-seen failures.
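The cycle-detection idea can be sketched as follows. This is a hypothetical simplified converter, not the actual Presto `Failures` code: `FailureConverter` and its string output are illustrative assumptions; only the identity-set recursion guard is the point.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical sketch: a recursive failure converter that carries an
// identity-based "seen" set so a cycle through getCause()/getSuppressed()
// chains terminates instead of recursing forever.
public final class FailureConverter
{
    private FailureConverter() {}

    public static String toFailure(Throwable t)
    {
        return toFailure(t, Collections.newSetFromMap(new IdentityHashMap<>()));
    }

    private static String toFailure(Throwable t, Set<Throwable> seen)
    {
        if (t == null || !seen.add(t)) {
            // null or already visited: stop the recursion here
            return "<cycle>";
        }
        StringBuilder result = new StringBuilder(t.getClass().getSimpleName());
        if (t.getCause() != null) {
            result.append(" caused by ").append(toFailure(t.getCause(), seen));
        }
        return result.toString();
    }
}
```

An `IdentityHashMap`-backed set is used rather than a plain `HashSet` so that two distinct exceptions that happen to compare equal are still visited individually.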
It causes compilation failure.
Previously, RENAME TABLE in presto-memory always renamed the table into the same schema, even if a target schema was specified. This commit changes the behavior to use the specified schema. If the schema does not exist, a SchemaNotFoundException is thrown.
The fact that the `object` type matches the type's Java type is asserted earlier.
"Proper literal" means just a literal, or a literal wrapped in a CAST. Previously, `toExpression(..., CHAR(x))` would return a "magic literal function call", which is hard to analyze.
This is only a refactoring, moving `decrementReferenceCounts` from _after_ the call to `deleteGroup` into that method. Since there is exactly one call site, this doesn't change current behavior. Motivation: this makes `deleteGroup` maintain its invariants.
We now run a query that takes longer, giving the SqlQueryManager thread more time to kill it. We also explicitly wait for the final query state after running the query, instead of relying on the SqlQueryManager thread implicitly.
Add Hive storage format and respect table format session properties. These two properties are experimental and should be reverted in the future.
Previously, when setMemoryPool was called while an operator was allocating memory, the two threads could deadlock because they acquired the monitors in different orders. The thread calling setMemoryPool acquired the monitor of the QueryContext instance (this) first, and then the monitor of the user/revocable memory contexts in the queryMemoryContext. However, the driver threads reserving memory acquired the monitors of the user/revocable memory contexts first, and then the monitor of the QueryContext instance. This change solves the issue by not locking the user/revocable memory contexts in the queryMemoryContext, and instead getting the same information from the MemoryPool.
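The hazard above is the classic lock-ordering deadlock. A minimal sketch (class and method names here are hypothetical, not the actual QueryContext code) of the safe pattern, where every path acquires both monitors in one global order; the commit itself went further and avoided taking the second monitor at all:

```java
// Hypothetical sketch of the lock-ordering problem described above.
// Deadlock-prone version:
//   setPool():  queryContextLock -> memoryContextLock
//   reserve():  memoryContextLock -> queryContextLock
// If the calls interleave, each thread can hold one monitor while waiting
// for the other. Below, both paths take the monitors in the same order.
public final class LockOrdering
{
    private final Object queryContextLock = new Object();
    private final Object memoryContextLock = new Object();
    private long reservedBytes;

    public void setPool()
    {
        synchronized (queryContextLock) {
            synchronized (memoryContextLock) {
                // swap the pool reference and migrate reservations
            }
        }
    }

    public void reserve(long bytes)
    {
        // same order as setPool(): queryContextLock first
        synchronized (queryContextLock) {
            synchronized (memoryContextLock) {
                reservedBytes += bytes;
            }
        }
    }

    public long getReservedBytes()
    {
        synchronized (queryContextLock) {
            synchronized (memoryContextLock) {
                return reservedBytes;
            }
        }
    }
}
```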
The generated code made the base state for aggregations a SingleState type, so combine for the aggregation had to handle merging that empty single state into the first group. This makes it a group state, for consistency and to simplify group states so they don't have to handle single state (they shouldn't know anything about it).
Currently, an array of histogram objects is kept per group-by id. This is inefficient for two major reasons:
1. A large number of Java objects is created, each incurring overhead. Inside the histogram implementation, collections like LongBigArray or IntBigArray are used, which have minimum sizes in the 100-200+ byte range. Histograms with a single item, or just a few, still pay this penalty.
2. Across group-by ids, the same value may be stored multiple times.
* Anecdotal analysis showed we do have group-by histograms with few values.
** After implementation, we found with YourKit that use cases that were holding 1 GB of RAM dropped to 45 MB.
*** Profiling with JMH showed that the new version was equal to or better than the old version.
What: the solution here is to build a single object whose internal hash table includes the group-by id in the key. There are many details to support the appearance of a single histogram per group-by table, both for transfer when doing intermediate combining and for final output; see GroupedTypedHistogram for the details. In short, this reduces the number of Java objects to a constant. Values are also stored once, in another custom hash implementation. We also found in JMH testing that changing the probe function from linear to "sum of squares" (sort of super-quadratic) performed better. This also introduces a new config variable, or rather uses one from a previous commit, that selects between the legacy (existing) behavior and the new one. We default to the new behavior, but allow rollback to the old behavior in case of some unexpected failure. Future work: for readability, private inner classes were introduced. These don't appear to have affected performance (per JMH), but JFR confirms they aren't elided by the JVM, so this is an opportunity to improve performance (they cause 4.5% of GC pressure in some testing).
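The core space-saving idea, a single structure keyed by (group id, value) instead of one histogram object per group, can be sketched as follows. This simplified `GroupedHistogram` is illustrative only, not the actual GroupedTypedHistogram or its open-addressing hash tables:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one map whose key packs the group-by id together
// with the value, so the number of Java objects stays constant no matter
// how many groups exist, instead of one histogram object per group.
public final class GroupedHistogram
{
    // key: (groupId in high 32 bits, value in low 32 bits); value: count
    private final Map<Long, Long> counts = new HashMap<>();

    private static long key(int groupId, int value)
    {
        return (((long) groupId) << 32) | (value & 0xFFFFFFFFL);
    }

    public void add(int groupId, int value)
    {
        counts.merge(key(groupId, value), 1L, Long::sum);
    }

    public long count(int groupId, int value)
    {
        return counts.getOrDefault(key(groupId, value), 0L);
    }
}
```

A group's histogram can still be presented on its own for combining or output by iterating the entries whose high bits match that group id.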
It's now a proper method in TupleDomain/Domain that can be used in other places.
If the tuple domain contains too many terms, the resulting expression can be very expensive to interpret in the call to shouldPrune. This reduces the number of discrete elements in the domain to produce smaller expression trees.
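The reduction can be sketched like this (a hypothetical simplification, not Presto's actual `Domain` API): when the number of discrete values exceeds a threshold, collapse them into their enclosing range. The result is a superset of the original domain, so it remains safe for pruning, since it can only keep extra partitions, never drop matching ones.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of domain simplification: many discrete values are
// replaced by the pair of bounds of their enclosing range, shrinking the
// expression tree that later has to be interpreted.
public final class DomainSimplifier
{
    private DomainSimplifier() {}

    public static List<Integer> simplify(List<Integer> discreteValues, int threshold)
    {
        if (discreteValues.size() <= threshold) {
            // small enough: keep the exact discrete set
            return discreteValues;
        }
        int min = Collections.min(discreteValues);
        int max = Collections.max(discreteValues);
        // represent the enclosing range [min, max] by its two bounds
        return List.of(min, max);
    }
}
```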
This avoids re-analyzing types and other overhead that was occurring on each invocation to evaluate the constraint passed to getTableLayouts()
Take the query `SELECT 1 FROM orders WHERE orderstatus = 'O'`. In that case the FilterNode will be optimized into the `currentConstraint` of the `TableScanNode`, and the `TableScanNode` does not output any columns. To support that in TPCH, we need to filter out all rows that do not match `currentConstraint`. This requires the `TpchTable` to be accessible in `TpchRecordSet`, so we can check that the columns used in `currentConstraint` have values contained in its domain.
There is no practical reason to prefer predicate pushdown disabled. As for testing with predicate pushdown enabled versus disabled, one can select columns that do or do not support predicates, according to test needs.
`BenchmarkPlanner` runs both iterative optimizer enabled and disabled. Without iterative optimizer, and with `forceSingleNode=true`, the created plan is invalid and eventually fails. This disables `forceSingleNode` so that we always produce a plan without failure.
Example only, to demonstrate how Testspace handles forked Pull Requests. Uses presto#9897, which has test failures. Go to fork-PR-1 and select Failures.