[Performance] Improve performance for date field parsing #8361
Comments
I'm working on running the same with nyc_taxis to see how these microbenchmark results translate to actual workloads.
Working on creating a new valid datetime format for the OpenSearch date field mapping, which will internally use the fast datetime parser implementation.
@CaptainDredge Could you share the performance measurements on this?
Documenting possible approaches for integrating the new fast parser implementation with the existing code abstractions.
Overview:
Problem:
Possible design approaches:
cc: @mgodwan
Thanks @CaptainDredge for the performance numbers, they look promising. I see
On the approaches, I like option 3, as it provides an easy fallback without many changes.
Microbenchmark comparison with other implementations mentioned in #11177 for
Legend:
Overall, a small overhead in the #11465 implementation is due to usage of
cc: @mgodwan, @backslasht
Hi, are we on track for this to be released in 2.12?
Is your feature request related to a problem? Please describe.
OpenSearch relies on `java.time.format.DateTimeFormatter` for parsing various date-time formats. This provides the flexibility to support a wide range of use cases with multiple date-time formats. In most use cases, however, users rely on a small set of common date-time formats, for which the JDK formatters can be slow. We can use knowledge of the underlying format to provide better latency and throughput for document and query parsing, with code customized for that format. Many logging libraries (e.g. log4j) provide common date-time formats, which we can look to support first.
Describe the solution you'd like
Faster parsing alternatives for known formats, which can yield better times for overall indexing.
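To make the idea concrete, here is a minimal sketch of a format-specific parser for `yyyy-MM-dd HH:mm:ss` that falls back to the JDK formatter whenever the input does not match the fixed layout. The class and method names (`FixedFormatDateParser`, `parseFast`, `parse`) are hypothetical and chosen for illustration only; this is not the linked POC or the OpenSearch implementation, just one way such a fast path could look.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration: a hand-written parser for one known format,
// with the general-purpose JDK formatter kept as a fallback.
final class FixedFormatDateParser {
    private static final DateTimeFormatter JDK_FORMATTER =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    private FixedFormatDateParser() {}

    // Fast path: returns null when the text does not match the fixed layout
    // or describes an invalid date, so the caller can fall back.
    static LocalDateTime parseFast(String text) {
        if (text == null || text.length() != 19
            || text.charAt(4) != '-' || text.charAt(7) != '-'
            || text.charAt(10) != ' '
            || text.charAt(13) != ':' || text.charAt(16) != ':') {
            return null;
        }
        int year = digits(text, 0, 4);
        int month = digits(text, 5, 7);
        int day = digits(text, 8, 10);
        int hour = digits(text, 11, 13);
        int minute = digits(text, 14, 16);
        int second = digits(text, 17, 19);
        if (year < 0 || month < 0 || day < 0 || hour < 0 || minute < 0 || second < 0) {
            return null;
        }
        try {
            return LocalDateTime.of(year, month, day, hour, minute, second);
        } catch (DateTimeException e) {
            return null; // out-of-range field, e.g. month 13
        }
    }

    // Tries the fast path first and defers to the JDK formatter otherwise,
    // so error reporting for malformed input stays with the JDK.
    static LocalDateTime parse(String text) {
        LocalDateTime fast = parseFast(text);
        return fast != null ? fast : LocalDateTime.parse(text, JDK_FORMATTER);
    }

    // Reads the decimal digits in [from, to); returns -1 on any non-digit.
    private static int digits(String s, int from, int to) {
        int value = 0;
        for (int i = from; i < to; i++) {
            int d = s.charAt(i) - '0';
            if (d < 0 || d > 9) {
                return -1;
            }
            value = value * 10 + d;
        }
        return value;
    }
}
```

The fast path avoids the generic pattern machinery by reading fixed character offsets, while the fallback means any input it rejects is still handled by the JDK formatter. Whether this is actually faster for a given workload would need to be confirmed with benchmarks.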
Describe alternatives you've considered
A barebones POC for the format `yyyy-MM-dd HH:mm:ss`: mgodwan@4345a75
The JMH micro-benchmarks run with this format, comparing the JDK pattern formatter with the custom formatter, show the following results: