Skip to content

Conversation

@morningman
Copy link
Contributor

@morningman morningman commented May 25, 2020

Problem is described in ISSUE #3678
This CL mainly changed to rule of creating dynamic partition.

  1. If time unit is DAY, the logic remains unchanged.

  2. If time unit is WEEK, the logical changes are as follows:

    1. Allow to set the start day of every week, the default is Monday. Optional Monday to Sunday
    2. Assuming that the starting day is a Tuesday, the range of the partition is Tuesday of the week to Monday of the next week.
  3. If time unit is MONTH, the logical changes are as follows:

    1. Allow to set the start date of each month. The default is 1st, and can be selected from 1st to 28th.
    2. Assuming that the starting date is the 2nd, the range of the partition is from the 2nd of this month to the 1st of the next month.
  4. The SHOW DYNAMIC PARTITION TABLES statement adds a StartOf column to show the start day of week or month.

It is recommended to refer to the example in dynamic-partition.md to understand.

TODO:
Better to support HOUR and YEAR time unit. Maybe in next PR.

FIX: #3678

@morningman morningman added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API area/catalog Issues or PRs related to catalog management labels May 25, 2020
@morningman morningman self-assigned this May 25, 2020
* dynamic\_partition\_enable
是否开启 Doris 的动态分区功能。默认为 false,即关闭。该参数只影响动态分区表的分区操作,不影响普通表。
是否开启 Doris 的动态分区功能。默认为 false,即关闭。该参数只影响动态分区表的分区操作,不影响普通表。可以通过修改 fe.conf 中的参数并重启 FE 生效。也可以在运行时执行以下命令生效:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里我们是否可以放开默认为true?目前已经存在了比较长的时间了,有些用户可能不仔细看文档,创建了分区表但是没有修改配置,造成使用上的一些额外成本?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我打算等0.12版本稳定了再打开这个限制,目前来看,还是有各种各样的小问题。。

p20200521: ["2020-05-21", "2020-05-22")
```
如果此时将分区粒度改为 MONTH,则系统会尝试创建范围为 `["2020-05-01", "2020-06-01")` 的分区,而该分区的分区范围和已有分区冲突,所以无法创建。而范围为 `["2020-06-01", "2020-07-01")` 的分区可以正常创建。因此,2020-05-22 到 2020-05-30 时间段的分区,需要自行填补。
Copy link
Contributor

@wutiangan wutiangan May 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

如果命令冲突时,该命令返回成功还是失败?如果返回成功,那些需要补齐的分区能否自动帮助用户补齐?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

目前是直接成功,不会报错,show dynamic partition 可以看到错误信息。
这个我建议由用户自行处理。系统来处理这个问题,太复杂了,容易出错,并且隐含的操作太多。

返回:200
```
`curl --location-trusted -u username:password -XGET http://fe_host:fe_http_port/api/_set_config?dynamic_partition_check_interval_seconds=432000`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

设置http协议为啥用GET方法而不是SET方法?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个就是沿用之前的框架,没有做更多的修改

throw new DdlException("Invalid properties: " + DynamicPartitionProperty.START_DAY_OF_WEEK);
}
try {
int dayOfMonth = Integer.parseInt(val);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dayOfWeek

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

// 2. get the date of `startOf` week
int day = current.get(Calendar.DAY_OF_WEEK);
// SUNDAY will return 1, we will set it to 7, and make MONDAY to 1, and so on
day = day == 1 ? 7 : day - 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you‘d better add Parentheses

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

* Today is 2020-05-24, offset = 1, startOf.month = 3
* It will return 2020-06-03
*/
private static String getPartitionRangeOfMonth(Calendar current, int offset, StartOfDate startOf, TimeZone tz,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why tz is no use?

Copy link
Contributor

@WingsGo WingsGo May 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about use LocalDateTime/ZonedDateTime to replace all Calendar? They are thread-safe, immutable and support timezone

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe in next PR? This PR is already too big

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, we can change it in next PR~

current.add(Calendar.WEEK_OF_YEAR, offset);
// 2. get the date of `startOf` week
int day = current.get(Calendar.DAY_OF_WEEK);
// SUNDAY will return 1, we will set it to 7, and make MONDAY to 1, and so on
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use LocalDateTime can avoid this trick

}
}

/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

java doc comment is start with /** xx */ , maybe we can unify it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I will change my IDE's setting. But there are lots of comment write like this. I can just make sure that newly added code can obey to rule.

Copy link
Contributor

@WingsGo WingsGo May 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good.

// keys.size is always equal to column.size, cannot reach this exception
LOG.warn("Keys size is not equal to column size. Error={}", e.getMessage());
LOG.warn("Keys size is not equal to column size. Error={}, db: {}, table: {}", e.getMessage(),
db.getFullName(), olapTable.getName());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should remove the warning, because dynamic_partition.start is default to Integer.MIN_VALUE, so we may got a AnalysisException like this Error=errCode = 2, detailMessage = date literal [5877471-07-26 00:00:00] is invalid

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will check this offset to make it reasonable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just make a check that if start offset == Integer.MIN_VALUE, it will not drop any partition.
And for adding partition, I just leave it to user to make sure the value is reasonable.

// keys.size is always equal to column.size, cannot reach this exception
LOG.warn("Keys size is not equal to column size. Error={}", e.getMessage());
LOG.warn("Keys size is not equal to column size. Error={}, db: {}, table: {}", e.getMessage(),
db.getFullName(), olapTable.getName());
Copy link
Contributor

@WingsGo WingsGo May 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as following comment

iterator.remove();
continue;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The single column check work is done when create table, no need to check here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is just a self-defence~

}
try {
int dayOfMonth = Integer.parseInt(val);
if (dayOfMonth < 1 || dayOfMonth > 28) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a comment why less than 28, the reason in doc is for users?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have already stated the reason in the document, I will add some comments here.

// return the number of days of give year.month
public static int getNumberDaysOfMonth(int year, int month) {
Calendar calendar = Calendar.getInstance();
calendar.set(year, month + 1, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why plus one? getNumberDaysOfMonth(2020, 1) will return 29, and the function name is confused

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is useless, i will remove it

@WingsGo
Copy link
Contributor

WingsGo commented May 26, 2020

LGTM

acelyc111 pushed a commit to acelyc111/incubator-doris that referenced this pull request Jan 20, 2021
…pache#3679)

Problem is described in ISSUE apache#3678 
This CL mainly changed to rule of creating dynamic partition.

1. If time unit is DAY, the logic remains unchanged.
2. If time unit is WEEK, the logical changes are as follows:

	1. Allow to set the start day of every week, the default is Monday. Optional Monday to Sunday
	2. Assuming that the starting day is a Tuesday, the range of the partition is Tuesday of the week to Monday of the next week.

3. If time unit is MONTH, the logical changes are as follows:

	1. Allow to set the start date of each month. The default is 1st, and can be selected from 1st to 28th.
	2. Assuming that the starting date is the 2nd, the range of the partition is from the 2nd of this month to the 1st of the next month.

4. The `SHOW DYNAMIC PARTITION TABLES` statement adds a `StartOf` column to show the start day of week or month.

It is recommended to refer to the example in `dynamic-partition.md` to understand.

TODO:
Better to support HOUR and YEAR time unit. Maybe in next PR.

FIX: apache#3678
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/catalog Issues or PRs related to catalog management kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[DynamicPartition] Dynamic partition range always intersect with existing partition's.

5 participants