bugfix: use PropertiesWriter to escape index_map keys properly #12018
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #12018 +/- ##
============================================
+ Coverage 61.75% 63.97% +2.22%
- Complexity 207 1572 +1365
============================================
Files 2436 2687 +251
Lines 133233 147672 +14439
Branches 20636 22631 +1995
============================================
+ Hits 82274 94479 +12205
- Misses 44911 46261 +1350
- Partials 6048 6932 +884
I guess this is not the only place where it can fail. E.g. segment metadata is also a properties configuration.
We would really like the ability to use these characters in column names, since other DBs like ClickHouse allow them and supporting them makes programmatic column creation simple. This lets us avoid the complexity of maintaining a mapping to substitute out reserved characters. Re: segment metadata, this already correctly handles these characters. e.g. from
As far as I could tell, the issue is only for index_map, which used some custom logic to write the file instead of relying on PropertiesConfiguration functionality. We've been using this patch for a couple of weeks and haven't run into any other query, operational, or ingestion bugs due to the special column name.
Interesting, it seems the problem is that we use `PropertiesConfiguration` to load the file, but do not use it to write the file. Can you change it to directly use `PropertiesConfiguration` to write the file (i.e. remove `PrintWriter`)? Similar to how `SegmentColumnarIndexCreator.writeMetadata()` handles the metadata write.
@@ -431,24 +434,26 @@ private static String getKey(String column, String indexName, boolean isStartOff
   }

   @VisibleForTesting
-  static void persistIndexMaps(List<IndexEntry> entries, PrintWriter writer) {
+  static void persistIndexMaps(List<IndexEntry> entries, PrintWriter writer) throws IOException {
(format) Please apply Pinot Style
0aaa120 to 0aeb163
Previously segment build failed if a column name contained `:` or `=`, since these characters have special meaning in `Properties` files. For example, an exception produced for a column like `headers.:auth` is:

This changes the writing to be done via PropertiesWriter instead of building the string explicitly.
Testing: unit tests + deployed in a cluster and verified segments were sealed properly
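
To illustrate the failure mode described above, here is a minimal sketch. It uses the JDK's `java.util.Properties` as a stand-in for the Commons Configuration classes that Pinot actually uses, so it is not the code from this PR. The class name, helper names, and the `headers.:auth.forward_index.startOffset` key are hypothetical and chosen only for illustration. It compares a hand-built `key = value` line with one written by an escaping writer, then reads both back.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical illustration; java.util.Properties stands in for
// Commons Configuration's PropertiesConfiguration / PropertiesWriter.
class IndexMapEscapingSketch {

  // Builds "key = value" lines by hand, with no escaping of the key.
  static String writeNaively(Map<String, String> entries) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : entries.entrySet()) {
      sb.append(e.getKey()).append(" = ").append(e.getValue()).append('\n');
    }
    return sb.toString();
  }

  // Lets the properties writer escape ':' and '=' inside keys.
  static String writeEscaped(Map<String, String> entries) throws IOException {
    Properties props = new Properties();
    props.putAll(entries);
    StringWriter out = new StringWriter();
    props.store(out, null);
    return out.toString();
  }

  // Parses properties text back into key/value pairs.
  static Properties read(String text) throws IOException {
    Properties props = new Properties();
    props.load(new StringReader(text));
    return props;
  }

  public static void main(String[] args) throws IOException {
    // Assumed key shape: <column>.<indexName>.startOffset
    String key = "headers.:auth.forward_index.startOffset";
    Map<String, String> entries = new LinkedHashMap<>();
    entries.put(key, "8");

    // The unescaped ':' is treated as the key/value separator on load,
    // so the original key is not found.
    System.out.println("naive lookup:   " + read(writeNaively(entries)).getProperty(key));

    // With escaping, the key survives the round trip.
    System.out.println("escaped lookup: " + read(writeEscaped(entries)).getProperty(key));
  }
}
```

The same asymmetry applies whenever a file is loaded by a properties parser but written by string concatenation: the reader honors escape rules that the writer never applied.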