-
Notifications
You must be signed in to change notification settings - Fork 849
Lexicon: A enumeration <-> string converter. #4222
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
57360fb to
48281f6
Compare
|
Current issue is a bug in clang interacting badly with the GCC headers, making |
| Description | ||
| =========== | ||
|
|
||
| A :class:`Lexicon` is templated by the enumeration class. It contains a set of **definitions**, each |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's more clear to say "a class template Lexicon takes an enumeration type as its only parameter".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it around to specify the type should be numeric, rather than an enumeration specifically.
| name, the primary name or any secondary name will yield the same enumeration. | ||
|
|
||
| Defaults can be set so that any name or enumeration that does not match a definition yields the | ||
| default value. The default can be explicit or it can be a handler function. The handler functions as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as -> acts as
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
| default value. The default can be explicit or it can be a handler function. The handler functions as | ||
| an internal catch for undefined conversions and is generally used to log the failure while returning | ||
| a default. It could be used to compute a default but for names, this is problematic due to memory | ||
| ownership and thread safety issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Highly non-obvious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reworded a bit, and a foot note added.
| Assume the enumeration is | ||
|
|
||
| .. literalinclude:: ../../../lib/ts/unit-tests/test_Lexicon.cc | ||
| :lines: 30 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're going to do this, I think you need a "WARNING: Do not even BREATH on this file!" comment in test_Lexicon.cc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code this depends on is marked off as being part of the documentation. I really want to be able to know the example code compiles and runs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is for maintainers of test_Lexicon.cc to know that they need to change the documentation as well if they add or remove lines.
Maybe you should put code like this in test_Lexicon.cc: https://godbolt.org/z/b1H8tA
| Insert the element :arg:`v` into the container. | ||
| Insert the element :arg:`v` into the container. If there are already elements with the same | ||
| key, :arg:`v` is inserted after these elements. | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we call it IntrusiveHashMultimap for consistency with STL naming?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Different PR, but no. Functionality is important, naming less so.
| template <typename Transform> | ||
| void | ||
| ATSHash64FNV1a::update(const void *data, size_t len, Transform xfrm) | ||
| ATSHash64FNV1a::update(const void *data, size_t len, Transform xf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why in this case are we using 'Transform xf' instead of 'const Transform &xf'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oversight, I think the const& is a better choice in case Transform has internal state.
| */ | ||
| template <typename E> class Lexicon | ||
| { | ||
| using self_type = Lexicon; ///< Self reference type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use alias that's longer than what it is aliasing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consistency, clarity. The point of using self_type is to be clear in declaration that a method is returning an instance of itself.
| /// Storage for names. | ||
| MemArena _arena{1024}; | ||
| /// Access by name. | ||
| IntrusiveHashMap<typename Item::NameLinkage> _by_name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's been shown that, for small lookup tables, an array/vector with linear search is fastest. In any case, it's simpler and any difference in speed will be small for a small lookup table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I don't agree that in this case it would be simpler, in terms of the implementation, in Lexicon. The mapping also gives us nice locality of equivalent keys.
| std::string_view localize(std::string_view name); | ||
|
|
||
| /// Storage for names. | ||
| MemArena _arena{1024}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't the string names probably be string literals? If so, it seems a shame to copy them rather than just use the passed string_view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually, but that can't be guaranteed. Because it's a MemArena the amortized cost of the copy is small and I think that's worth paying to avoid any (likely complex) rules about input string lifetimes, or restricting Lexicon to only handling constant strings.
| for (decltype(len) j = 0; j < len; ++j) { | ||
| s[j] = char_gen(randu); | ||
| } | ||
| s[len] = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the nul termination?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, I suspect it's a left over from a prior iteration when it used a C-string style interface instead of string_view.
|
Just out of curiosity, is there any mechanism that ensure all enum items are in the lexicon? Let's say |
|
Ah, I found |
|
But I'm still curious how we can avoid the mismatch. If it was a trafficserver/proxy/http2/Http2DebugNames.cc Lines 29 to 47 in 4405411
|
|
There isn't any way, at compile time, to verify all of the enumeration values are in the I could look at creating a constructor like and then the initialization could be In this case, if there are not exactly |
|
A possible alternative is a macro-based approach: https://wkaras.webs.com/itemlist/itemlist.html |
|
The mismatch would not be a big issue. I just wanted to know if it's possible. |
maskit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although I don't understand the implementation details but I'm ok with this. Only concern I have is maintainability. The code is much longer and more complicated than the switch-case example I posted above. I understand this is more general and more powerful, but I would switch back to switch-case way if I got into a trouble with this instead of fixing it.
|
@maskit Yes, but one of the key things this does the |
ea7ff14 to
0eea3f4
Compare
|
I added and documented a mechanism to do the size verification at compile time. Not quite as clunky as directly using a |
0eea3f4 to
8f254e6
Compare
| or more contributor license agreements. See the NOTICE file | ||
| distributed with this work for additional information | ||
| regarding copyright ownership. The ASF licenses this file | ||
| regarding copyright ownership. The ASF licenses this fileN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is fileN really right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's actually in a different PR this depends on. #4223.
8f254e6 to
13f9185
Compare
|
Leaving this until more of the infrastructure is in place. |
This is a class I have intended on build for years, it is based on a class of the same name and similar function I wrote for Network Geographics. It was very handy but it required using Boost.MultiIndex which clearly wasn't going to work in the TS codebase. But now I have the infrastructure pieces needed to make the implementation simple, along with some nice C++ eleventy features. I ended up writing this now because it will be very useful in the YAML conversion of WCCP.
Dependent on #4223.
Waiting on other infrastructure PRs.