-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[DynamicPartition] Optimize the rule of creating dynamic partition #3679
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DynamicPartition] Optimize the rule of creating dynamic partition #3679
Conversation
| * dynamic\_partition\_enable | ||
| 是否开启 Doris 的动态分区功能。默认为 false,即关闭。该参数只影响动态分区表的分区操作,不影响普通表。 | ||
| 是否开启 Doris 的动态分区功能。默认为 false,即关闭。该参数只影响动态分区表的分区操作,不影响普通表。可以通过修改 fe.conf 中的参数并重启 FE 生效。也可以在运行时执行以下命令生效: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里我们是否可以放开默认为true?目前已经存在了比较长的时间了,有些用户可能不仔细看文档,创建了分区表但是没有修改配置,造成使用上的一些额外成本?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我打算等0.12版本稳定了再打开这个限制,目前来看,还是有各种各样的小问题。。
| p20200521: ["2020-05-21", "2020-05-22") | ||
| ``` | ||
| 如果此时将分区粒度改为 MONTH,则系统会尝试创建范围为 `["2020-05-01", "2020-06-01")` 的分区,而该分区的分区范围和已有分区冲突,所以无法创建。而范围为 `["2020-06-01", "2020-07-01")` 的分区可以正常创建。因此,2020-05-22 到 2020-05-30 时间段的分区,需要自行填补。 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
如果命令冲突时,该命令返回成功还是失败?如果返回成功,那些需要补齐的分区能否自动帮助用户补齐?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
目前是直接成功,不会报错,show dynamic partition 可以看到错误信息。
这个我建议由用户自行处理。系统来处理这个问题,太复杂了,容易出错,并且隐含的操作太多。
| 返回:200 | ||
| ``` | ||
| `curl --location-trusted -u username:password -XGET http://fe_host:fe_http_port/api/_set_config?dynamic_partition_check_interval_seconds=432000` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
设置http协议为啥用GET方法而不是SET方法?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个就是沿用之前的框架,没有做更多的修改
| throw new DdlException("Invalid properties: " + DynamicPartitionProperty.START_DAY_OF_WEEK); | ||
| } | ||
| try { | ||
| int dayOfMonth = Integer.parseInt(val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dayOfWeek
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
| // 2. get the date of `startOf` week | ||
| int day = current.get(Calendar.DAY_OF_WEEK); | ||
| // SUNDAY will return 1, we will set it to 7, and make MONDAY to 1, and so on | ||
| day = day == 1 ? 7 : day - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you‘d better add Parentheses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
| * Today is 2020-05-24, offset = 1, startOf.month = 3 | ||
| * It will return 2020-06-03 | ||
| */ | ||
| private static String getPartitionRangeOfMonth(Calendar current, int offset, StartOfDate startOf, TimeZone tz, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why tz is no use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about use LocalDateTime/ZonedDateTime to replace all Calendar? They are thread-safe, immutable and support timezone
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe in next PR? This PR is already too big
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, we can change it in next PR~
| current.add(Calendar.WEEK_OF_YEAR, offset); | ||
| // 2. get the date of `startOf` week | ||
| int day = current.get(Calendar.DAY_OF_WEEK); | ||
| // SUNDAY will return 1, we will set it to 7, and make MONDAY to 1, and so on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use LocalDateTime can avoid this trick
| } | ||
| } | ||
|
|
||
| /* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
java doc comment is start with /** xx */ , maybe we can unify it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will change my IDE's setting. But there are lots of comment write like this. I can just make sure that newly added code can obey to rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good.
| // keys.size is always equal to column.size, cannot reach this exception | ||
| LOG.warn("Keys size is not equal to column size. Error={}", e.getMessage()); | ||
| LOG.warn("Keys size is not equal to column size. Error={}, db: {}, table: {}", e.getMessage(), | ||
| db.getFullName(), olapTable.getName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should remove the warning, because dynamic_partition.start is default to Integer.MIN_VALUE, so we may got a AnalysisException like this Error=errCode = 2, detailMessage = date literal [5877471-07-26 00:00:00] is invalid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I will check this offset to make it reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just make a check that if start offset == Integer.MIN_VALUE, it will not drop any partition.
And for adding partition, I just leave it to user to make sure the value is reasonable.
| // keys.size is always equal to column.size, cannot reach this exception | ||
| LOG.warn("Keys size is not equal to column size. Error={}", e.getMessage()); | ||
| LOG.warn("Keys size is not equal to column size. Error={}, db: {}, table: {}", e.getMessage(), | ||
| db.getFullName(), olapTable.getName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as following comment
| iterator.remove(); | ||
| continue; | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The single column check work is done when create table, no need to check here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is just a self-defence~
| } | ||
| try { | ||
| int dayOfMonth = Integer.parseInt(val); | ||
| if (dayOfMonth < 1 || dayOfMonth > 28) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a comment why less than 28, the reason in doc is for users?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have already stated the reason in the document, I will add some comments here.
| // return the number of days of give year.month | ||
| public static int getNumberDaysOfMonth(int year, int month) { | ||
| Calendar calendar = Calendar.getInstance(); | ||
| calendar.set(year, month + 1, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why plus one? getNumberDaysOfMonth(2020, 1) will return 29, and the function name is confused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is useless, i will remove it
|
LGTM |
…pache#3679) Problem is described in ISSUE apache#3678 This CL mainly changed to rule of creating dynamic partition. 1. If time unit is DAY, the logic remains unchanged. 2. If time unit is WEEK, the logical changes are as follows: 1. Allow to set the start day of every week, the default is Monday. Optional Monday to Sunday 2. Assuming that the starting day is a Tuesday, the range of the partition is Tuesday of the week to Monday of the next week. 3. If time unit is MONTH, the logical changes are as follows: 1. Allow to set the start date of each month. The default is 1st, and can be selected from 1st to 28th. 2. Assuming that the starting date is the 2nd, the range of the partition is from the 2nd of this month to the 1st of the next month. 4. The `SHOW DYNAMIC PARTITION TABLES` statement adds a `StartOf` column to show the start day of week or month. It is recommended to refer to the example in `dynamic-partition.md` to understand. TODO: Better to support HOUR and YEAR time unit. Maybe in next PR. FIX: apache#3678
Problem is described in ISSUE #3678
This CL mainly changed to rule of creating dynamic partition.
If time unit is DAY, the logic remains unchanged.
If time unit is WEEK, the logical changes are as follows:
If time unit is MONTH, the logical changes are as follows:
The
SHOW DYNAMIC PARTITION TABLESstatement adds aStartOfcolumn to show the start day of week or month.It is recommended to refer to the example in
dynamic-partition.mdto understand.TODO:
Better to support HOUR and YEAR time unit. Maybe in next PR.
FIX: #3678