[IOTDB-1026] Support wildcard ** in Path And Replace PrefixPath usage with PathPattern in IOTDB-SQL #3918

Merged on Sep 26, 2021 (64 commits)

Conversation

MarcosZyk
Contributor

@MarcosZyk MarcosZyk commented Sep 7, 2021

Description

1. Motivation

  1. A wildcard * at the tail of a path cannot restrict the match to nodes on the last level only.
  2. Some IoTDB-SQL statements, especially DDL, are based on prefix paths, and a prefix path cannot identify target nodes or timeseries precisely.

For example, suppose we have the timeseries root.sg.d.s1, root.sg.d.s2, and root.sg.d.t.s1, and consider the following SQL statements:

  1. show timeseries root.sg.d
  2. show timeseries root.sg.d.*
  3. select * from root.sg.d

If the user wants to process all three timeseries, any of the statements above will work.
However, if the user only wants to process root.sg.d.s1 and root.sg.d.s2, none of them will work, and it is hard to come up with any alternative other than enumerating the target timeseries.

2. Feature

2.1. Define path pattern and Introduce wildcard **

** matches one or more levels in a path, while * matches exactly one level.
By combining ** and *, users can construct a path pattern that selects the target paths or nodes in the MTree.
For example, suppose we have the timeseries root.sg.d.s1, root.sg.d.s2, and root.sg.d.t.s1, and consider the following patterns:

  1. root.sg.d.* matches root.sg.d.s1 and root.sg.d.s2
  2. root.sg.d.** matches root.sg.d.s1, root.sg.d.s2 and root.sg.d.t.s1
  3. root.sg.**.s1 matches root.sg.d.s1 and root.sg.d.t.s1
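
The matching rules in the three examples above can be sketched as a small recursive matcher. This is only an illustration of the stated semantics (* is exactly one level, ** is one or more levels), not the code in this PR. The class name PathPatternSketch and its methods are hypothetical, and the sketch assumes a wildcard always occupies a whole node name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical illustration of the * and ** semantics; not IoTDB's implementation. */
public class PathPatternSketch {

    /** Returns true if the dot-separated path matches the dot-separated pattern. */
    static boolean matches(String pattern, String path) {
        return match(pattern.split("\\."), 0, path.split("\\."), 0);
    }

    private static boolean match(String[] pat, int i, String[] nodes, int j) {
        if (i == pat.length) {
            // Pattern exhausted: the path must be exhausted too.
            return j == nodes.length;
        }
        if (pat[i].equals("**")) {
            // ** consumes one or more levels, so try every possible count.
            for (int k = j + 1; k <= nodes.length; k++) {
                if (match(pat, i + 1, nodes, k)) {
                    return true;
                }
            }
            return false;
        }
        if (j == nodes.length) {
            return false;
        }
        // * consumes exactly one level; a plain name must be equal.
        if (pat[i].equals("*") || pat[i].equals(nodes[j])) {
            return match(pat, i + 1, nodes, j + 1);
        }
        return false;
    }

    /** Keeps only the paths that match the pattern, preserving input order. */
    static List<String> filter(String pattern, List<String> paths) {
        return paths.stream().filter(p -> matches(pattern, p)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("root.sg.d.s1", "root.sg.d.s2", "root.sg.d.t.s1");
        System.out.println(filter("root.sg.d.*", all));
        System.out.println(filter("root.sg.d.**", all));
        System.out.println(filter("root.sg.**.s1", all));
    }
}
```

Running main prints the matches for each of the three patterns listed above, which makes it easy to compare the sketch against the intended behavior.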

2.2. Replace all prefix path usage in SQL with path pattern

Most prefix paths can be replaced by a path pattern ending with **, and path patterns give users a more precise way to select timeseries.

3. Implementation

The related implementation in MTree.java is refactored, and an MTree traversal framework for path patterns is established. Methods that produce different kinds of result sets share the same traversal code and obtain their target result set by overriding the result-processing method.
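
The idea of one shared traversal with an overridable result hook can be sketched as a template-method design. The following is a minimal sketch under stated assumptions, not the actual code in MTree.java: the names TraversalSketch, Node, Traverser, processMatchedNode, LeafCollector and LeafCounter are all hypothetical, and leaf nodes are assumed to stand for timeseries.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a shared pattern traversal; not IoTDB's MTree code. */
public class TraversalSketch {

    /** Simplified tree node; leaves are assumed to represent timeseries. */
    static class Node {
        final String name;
        final Map<String, Node> children = new LinkedHashMap<>();
        Node(String name) { this.name = name; }
        Node child(String childName) { return children.computeIfAbsent(childName, Node::new); }
    }

    /** Builds a tree from full paths that all start with "root". */
    static Node buildTree(String... paths) {
        Node root = new Node("root");
        for (String p : paths) {
            String[] parts = p.split("\\.");
            Node cur = root;
            for (int i = 1; i < parts.length; i++) {
                cur = cur.child(parts[i]);
            }
        }
        return root;
    }

    /** Shared traversal: subclasses only decide what to do with a matched node. */
    abstract static class Traverser {
        private final String[] pattern;
        Traverser(String pattern) { this.pattern = pattern.split("\\."); }

        void traverse(Node root) {
            if (pattern[0].equals(root.name)) {
                walk(root, 1, root.name);
            }
        }

        // Invariant: node has matched pattern[0 .. idx-1].
        // Note: patterns with several ** may visit a node more than once in this sketch.
        private void walk(Node node, int idx, String path) {
            if (idx == pattern.length) {
                processMatchedNode(node, path);
                return;
            }
            for (Node c : node.children.values()) {
                String childPath = path + "." + c.name;
                if (pattern[idx].equals("**")) {
                    walk(c, idx + 1, childPath); // ** ends at this level
                    walk(c, idx, childPath);     // ** spans one more level
                } else if (pattern[idx].equals("*") || pattern[idx].equals(c.name)) {
                    walk(c, idx + 1, childPath);
                }
            }
        }

        /** Result hook overridden per kind of result set. */
        protected abstract void processMatchedNode(Node node, String path);
    }

    /** Collects the full paths of matched leaves. */
    static class LeafCollector extends Traverser {
        final List<String> result = new ArrayList<>();
        LeafCollector(String pattern) { super(pattern); }
        @Override protected void processMatchedNode(Node node, String path) {
            if (node.children.isEmpty()) result.add(path);
        }
    }

    /** Counts matched leaves without materializing their paths. */
    static class LeafCounter extends Traverser {
        int count = 0;
        LeafCounter(String pattern) { super(pattern); }
        @Override protected void processMatchedNode(Node node, String path) {
            if (node.children.isEmpty()) count++;
        }
    }
}
```

In this sketch, a "show timeseries"-style query and a "count timeseries"-style query would differ only in the subclass they instantiate, while the wildcard handling lives in one place.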

4. Docs and Discussion

docs

  1. feature definition doc
  2. design doc

discussion
#3990

@coveralls

coveralls commented Sep 7, 2021

Coverage Status

Coverage increased (+0.004%) to 67.545% when pulling f0e4297 on zyk990424:wildcard_extension into a9f582e on apache:master.

@qiaojialin
Member

Hi, it seems that this change breaks the backward compatibility?

Let's say we have 3 time series: root.sg.d.s1, root.sg.d.s2, and root.sg.d.t.s1.

Before this change, the following query produces a result set that contains all of the time series.

select * from root.sg.d

After this change, it produces only root.sg.d.s1 and root.sg.d.s2, right?

Yes. After this change, the result that select * from root.sg.d used to produce is obtained with select ** from root.sg.d.

# Conflicts:
#	session/src/test/java/org/apache/iotdb/session/IoTDBSessionVectorIT.java
@qiaojialin qiaojialin merged commit bc3b736 into apache:master Sep 26, 2021
@MarcosZyk MarcosZyk deleted the wildcard_extension branch April 22, 2022 07:14