[Improvement]: Build a rule of relationship between table and optimizer/resource group #1865
Comments
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'
I don't think rules on catalog properties are necessary if we can use the optimizer group.
Will it be possible for multiple types of optimizers to exist in one OptimizerGroup in the future? For example, the same OptimizerGroup may contain both Flink and Spark optimizers. The OptimizerGroup is similar to a logical resource pool, and different types of optimizers will occupy some resources.
What scenarios would this hybrid resource model be helpful for?
@majin1102 Thanks for the reply. I'm asking because of the following scenario: when a Flink optimizer is used for merging, the optimizer may stop, may need to catch up on data, or may face a sudden need for merging. However, the Flink optimizer is not particularly good at automatic scaling (at least on Yarn). In addition, from a resource point of view, if an OptimizerGroup is just a resource pool and the optimizer is an application running in the OptimizerGroup (much as an OptimizerGroup would be a Yarn queue and the optimizer an application in it), this should not add too much complexity. Or is there something I missed here? Thanks.
What would you like to be improved?
Right now, there are two ways to declare which optimizer group optimizes a table:
Set a default optimizer group on the catalog and do not declare an optimizer group in the table properties.
Declare the 'self-optimizing.group' property in the table properties (in a create table or alter table statement).
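The two options above can be sketched as a simple lookup. This is only an illustration in Python, not the AMS implementation: the function name `current_group` and its parameters are hypothetical, and it assumes that a 'self-optimizing.group' table property, when present, takes precedence over the catalog default.

```python
from typing import Mapping

# Property name taken from the issue text.
GROUP_PROPERTY = "self-optimizing.group"


def current_group(table_properties: Mapping[str, str],
                  catalog_default_group: str) -> str:
    """Pick the optimizer group for a table (hypothetical helper).

    Assumption: a group declared in the table properties overrides
    the default optimizer group of the catalog.
    """
    return table_properties.get(GROUP_PROPERTY, catalog_default_group)
```

With this lookup, any table owner who can edit table properties can move the table to another group, which is the flexibility and also the risk discussed in this issue.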
In practice, using the default optimizer group gives a better experience but is not flexible when multiple groups are needed in one catalog. Using a table property provides more flexibility but sacrifices user experience and security. Imagine that every table (user) needs to know the resources behind AMS and has the authority to allocate them: this could be a disaster.
This doesn't seem like a big deal, because in many cases there is only one external/default optimizer group and security and isolation are not a concern. But it is never too late to find a better way to provide user experience, isolation and security for self-optimizing.
How should we improve?
Better user experience
Users should declare relationships in one place only and use them everywhere. Defining a property on the table is a bad idea, because it means the table owner must know the optimizer group concepts and instances.
A better approach is to declare properties on the optimizer group and use an extensible rule such as a regex.
Better security
The relationship between a table and its resources should be fixed and must not be modifiable without the permission of the resource owner. Declaring properties on the optimizer group clearly fulfills this criterion.
Better isolation
When we declare or modify the relationships between tables and resources, the rules must be mutually exclusive.
In conclusion, I propose declaring regex rules on the optimizer group to define the relationship between tables and resources. Such a rule leads to a clear definition: the optimizer group can be used by the matching tables, and only by them.
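A minimal sketch of how such regex rules might be resolved and checked, written in Python for brevity. Everything named here is hypothetical: the group names, the patterns, the "catalog.database.table" identifier format and the helper functions are assumptions for illustration, not an existing AMS API. Mutual exclusivity is checked against a known list of tables, since deciding whether two arbitrary regexes overlap is much harder.

```python
import re
from typing import Dict, Iterable, List, Optional

# Hypothetical rules: optimizer group name -> regex over "catalog.database.table".
RULES: Dict[str, str] = {
    "stream_group": r"catalog_a\.db_stream\..*",
    "batch_group": r"catalog_a\.db_batch\..*",
}


def matching_groups(table: str, rules: Dict[str, str]) -> List[str]:
    """Return every group whose regex matches the whole table identifier."""
    return [group for group, pattern in rules.items()
            if re.fullmatch(pattern, table)]


def resolve_group(table: str, rules: Dict[str, str]) -> Optional[str]:
    """Return the single matching group, or None if no rule matches."""
    groups = matching_groups(table, rules)
    if len(groups) > 1:
        # Rules must be mutually exclusive, so an ambiguous match is an error.
        raise ValueError(f"{table} matches more than one group: {groups}")
    return groups[0] if groups else None


def find_conflicts(tables: Iterable[str],
                   rules: Dict[str, str]) -> Dict[str, List[str]]:
    """List tables claimed by more than one group, e.g. before saving a rule."""
    conflicts: Dict[str, List[str]] = {}
    for table in tables:
        groups = matching_groups(table, rules)
        if len(groups) > 1:
            conflicts[table] = groups
    return conflicts
```

Running `find_conflicts` over the existing tables when a rule is added or changed would let the owner of the resources reject a rule that overlaps another group, so the table owner never has to set anything on the table itself.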