Skip to content

Conversation

@SolidWallOfCode
Copy link
Member

@SolidWallOfCode SolidWallOfCode commented Mar 2, 2023

This brings the unit parser logic into ATS. It provides generic support for parsing strings that have embedded units. Updating the set of units requires only changing the initialization data, making it very easy to add or remove additional units. This commit adds a concrete instantiation that supports time string parsing, so strings such as "1 day 12 hours 30 minutes" is correctly parsed. A primary goal is to stop using default / implied units for time values in configuration, but instead always require units while providing an easy way to specify convenient units.

See this fle for examples of use.

Related to #9477

Usage would be something like

auto && [ value, errata ] = ts::time_parser(value_string);
if (! errata.is_ok()) {
  std::string text;
  Warning(bwprint(text, "Failed to parse duration - {}", errata)).c_str());
  return TS_ERROR;
}
timeout = std::chrono::duration_cast<std::chrono::milliseconds>(value);

@maskit
Copy link
Member

maskit commented Mar 2, 2023

Having this parser is fine. It'd make sense for file/memory size, but I'm skeptical to use it for amount of time such as timeout.

abc.max_buf_size: 10M -- Makes sense, it might hit internal upper limit but I don't think there'd be confusion.
xyz.timeout: 100ms -- Even if there's no upper nor lower limit, It's not obvious whether milliseconds makes sense.

We currently use second for settings that receive amount of time in most of cases. And if a setting expects millisecond, the setting name has _ms suffix. It's obvious that the percision is millisecond (if not, the name should not have _ms and just use second like others).

It's not only about percision. The suffix roughly suggests reasonable amount of time. Setting 100000 to a setting that has _ms suffix looks odd.

Of course we could describe things on the documentation and say RTFM, but I don't think providing super detailed 1000 pages of documentation is user friendly.

{nanoseconds{hours{24}}.count(), {"d", "day", "days"}},
{nanoseconds{hours{168}}.count(), {"w", "week", "weeks"}}}}};

} // namespace ts
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, is all of this constexpr resolved to data at compile time or is there static initializer setup here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, there's still computation that occurs to initialize the parser data structure. I'm not sure what you mean by "static initializer" other than the parser being a static data structure which is constructed at process start time. That is,

UnitParser const time_parser{ ... };

which declares a static instance of UnitParser.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I mean the static setup at process start time. It's not called "static initializing"?

I'd rather see time_parser be a function and just return a UnitParser and if a program wants one call. Then the programmer can decide its lifetime.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? It seems more performant to construct a single instance at process start and use that everywhere. Then there is no runtime construction / destruction cost. What's the benefit of multiple instances?

@brbzull0
Copy link
Contributor

brbzull0 commented Mar 3, 2023

Not having units specified could end up in being a real mess, I like this.

@SolidWallOfCode
Copy link
Member Author

It's certainly caused many problems in the past. One of my goals for the YAML conversion was to require explicit units for all time periods to avoid any possible confusion, and "random" numbers like "86400" instead of "1 day".

@maskit
Copy link
Member

maskit commented Mar 3, 2023

What were the problems?

@SolidWallOfCode
Copy link
Member Author

People used wildly inappropriate values because they didn't realize the implied units. It's a bit less bad because one place this came up was where the default unit was minutes, not seconds. There was another case where the user assumed ms but it was really seconds.

@maskit
Copy link
Member

maskit commented Mar 3, 2023

Sounds like this fixes a problem, but those problems support my concern. People don't read the documentation. And, more characters, more typos.
I can imagine conversations like:

A: Hey, I set xyz to 12h but it doesn't work!
B: Oh, it's rounded to day. You could set it to 24h but less than 24h wouldn't work as you expect.

A: Hey, I set the timeout to 100ms but it doesn't work!
B: Can you share your records.yml?
B: There's a typo. You set it to 100 nanoseconds

A: Hey, I set another timeout to 800ms, and it doesn't work. I'm sure I set it to 800 millisecond this time.
B: Oh, unlike the other timeout, this one is a bit special and it's not that precise. I'd use second.

Nothing is going to be perfect. I wonder which is better.

@SolidWallOfCode SolidWallOfCode marked this pull request as ready for review March 7, 2023 00:20
Copy link
Contributor

@bneradt bneradt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Should be straightforward to add some unit tests?

@SolidWallOfCode
Copy link
Member Author

Based on comments, I moved the actual unit parser instance to a nested namespace and wrapped it with a function to do the conversion to std::chrono units so that doesn't have to be repeated with the risk of error. Comments were added to the function to make use clearer.

Copy link
Contributor

@brbzull0 brbzull0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good Alan! Thanks.

@SolidWallOfCode SolidWallOfCode merged commit d0a7569 into apache:master Mar 13, 2023
cmcfarlen pushed a commit to cmcfarlen/trafficserver that referenced this pull request Jun 3, 2024
cmcfarlen pushed a commit to cmcfarlen/trafficserver that referenced this pull request Jun 3, 2024
* commit 'c54a2e2b77151869ff014fbdc4c82cec0afcbb8c': (37 commits)
  Slight performance improvements before calling APIHooks::clear (apache#9480)
  libswoc: Update to 1.4.5 (apache#9522)
  CryptoContext: Clean up to avoid compiler problem. (apache#9521)
  Add TLSCertSwitchSupport (apache#9322)
  Add clang-format-tests to clang-format target (apache#9456)
  Adds the AR env variable to config.nice (apache#9515)
  Fix .asf.yaml (apache#9519)
  Hugepage config cleanup (apache#9479)
  Separate io_uring into a separate library. AIO in io_uring mode uses new io_uring lib. (apache#9462)
  Avoid memory allocation in CryptoHash (apache#9474)
  UnitParser: add unit parser support. (apache#9485)
  autest - Minor fix on the verifier_client test ext to allow setting only the http3 ports. (apache#9517)
  Remove support for port event polling (apache#9476)
  QUIC: Add support to configure UDP max payload limit. (apache#9486)
  Reduce the size of the APIHooks, eliminating enum gap (apache#9509)
  Add support for CMCD-Request header nor field to prefetch plugin (apache#9232)
  Eliminates padding from some common structs (apache#9481)
  Enable external file loading for sni.yaml. (apache#9501)
  Remove inactive include of IpMapConf.h (apache#9512)
  Cleanup: Remove RecModeT from the code. (apache#9487)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants