Closed
1,239 changes: 74 additions & 1,165 deletions CMakeLists.txt

Large diffs are not rendered by default.

160 changes: 160 additions & 0 deletions doc/developer-guide/internal-libraries/Lexicon.en.rst
@@ -0,0 +1,160 @@
.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file distributed with this work for
additional information regarding copyright ownership. The ASF licenses this file to you under the
Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License
is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
or implied. See the License for the specific language governing permissions and limitations under
the License.

.. include:: ../../common.defs

.. default-domain:: cpp

Lexicon
*******

Synopsis
========

:code:`#include <ts/Lexicon.h>`

:class:`Lexicon` is a template class designed to facilitate conversions between enumerations and
strings. Each enumeration can have a **primary** name and an arbitrary number of **secondary**
names. Enumerations convert to the primary name, and all the names convert to the enumeration.
Defaults can also be set such that a conversion for a name or enumeration that isn't defined yields
the default. All comparisons are case insensitive.

Description
===========

A :class:`Lexicon` is a template class with a single type parameter, which should be a numeric type, usually
an enumeration. An instance contains a set of **definitions**, each of which is an association
between a value, a primary name, and secondary names, the last of which is optional. All names and
values must be unique across the :class:`Lexicon`. The array operator is used to do the conversions.
When indexed by a value, the primary name for that value is returned. When indexed by a name, the
primary name or any secondary name will yield the same value.
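
The indexing behavior described above can be illustrated with a small stand-in. This is a minimal sketch, not the ``ts::Lexicon`` implementation: the names ``MiniLexicon``, ``Color`` and ``mini_lexicon_demo`` are hypothetical, and the choice of two ``std::map`` members with lower-cased keys is an assumption made only to show primary and secondary names, uniqueness, and case-insensitive lookup.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <initializer_list>
#include <map>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for illustration only; this is NOT ts::Lexicon.
enum class Color { RED, GREEN, BLUE };

class MiniLexicon
{
public:
  // A value plus its names; the first name in the list is the primary name.
  struct Definition {
    Color value;
    std::vector<std::string> names;
  };

  MiniLexicon(std::initializer_list<Definition> defs)
  {
    for (auto const &def : defs) {
      if (def.names.empty() || !_primary.emplace(def.value, def.names.front()).second) {
        throw std::invalid_argument("a value needs a primary name and must be unique");
      }
      for (auto const &name : def.names) {
        if (!_values.emplace(lower(name), def.value).second) {
          throw std::invalid_argument("names must be unique");
        }
      }
    }
  }

  // Indexed by a name: primary and secondary names yield the same value.
  Color operator[](std::string_view name) const
  {
    auto spot = _values.find(lower(name));
    if (spot == _values.end()) {
      throw std::domain_error("undefined name");
    }
    return spot->second;
  }

  // Indexed by a value: only the primary name is returned.
  std::string_view operator[](Color value) const
  {
    auto spot = _primary.find(value);
    if (spot == _primary.end()) {
      throw std::domain_error("undefined value");
    }
    return spot->second;
  }

private:
  // Names are stored lower-cased so that lookups ignore case.
  static std::string lower(std::string_view text)
  {
    std::string out(text);
    std::transform(out.begin(), out.end(), out.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return out;
  }

  std::map<Color, std::string> _primary;
  std::map<std::string, Color> _values;
};

inline void mini_lexicon_demo()
{
  MiniLexicon lex{{Color::RED, {"red", "crimson"}}, {Color::GREEN, {"green"}}, {Color::BLUE, {"blue", "azure"}}};
  assert(lex["Crimson"] == Color::RED); // secondary name, different case
  assert(lex[Color::RED] == "red");     // value converts to the primary name
}
```

The two ``operator[]`` overloads do not conflict because a string literal cannot convert to a scoped enumeration, so the argument type alone selects the direction of the conversion.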

Defaults can be set so that any name or value that does not match a definition yields the default. A
default can be set for names or values independently. A default can be explicit or it can be a function
which is called when the :class:`Lexicon` is indexed by a non-matching index. The handler function
must return a default of the appropriate type. It acts as an internal catch for undefined
conversions and is generally used to log the failure while returning a default. It could be used to
compute a default but in the case of names, this is problematic due to memory ownership and thread
safety issues. Because the return type of a name is :code:`std::string_view` there is no signal to
cleanup any allocated memory, and storing the name in the :class:`Lexicon` instance makes it
non-thread safe on read access [#]_.
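
The difference between an explicit default and a handler function might look like the following. This is a sketch under stated assumptions: ``LevelNames``, ``set_default_name`` and ``set_default_handler`` are hypothetical names and not the ``ts::Lexicon`` API. The handler returns a view of a string literal, which has static storage, and so avoids the ownership problem described above.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical names for illustration only; this is NOT the ts::Lexicon API.
enum class Level { INVALID, LOW, HIGH };

class LevelNames
{
public:
  using Handler = std::function<std::string_view(Level)>;

  void define(Level value, std::string name) { _names[value] = std::move(name); }

  // Explicit default: the view must refer to storage that outlives the table,
  // such as a string literal.
  void set_default_name(std::string_view name)
  {
    _handler = [name](Level) { return name; };
  }

  // Handler default: called for every value that has no definition.
  void set_default_handler(Handler handler) { _handler = std::move(handler); }

  std::string_view operator[](Level value) const
  {
    auto spot = _names.find(value);
    if (spot != _names.end()) {
      return spot->second;
    }
    if (_handler) {
      return _handler(value);
    }
    throw std::domain_error("undefined value and no default set");
  }

private:
  std::map<Level, std::string> _names;
  Handler _handler;
};

// Returns how many lookups fell through to the handler.
inline int level_names_demo()
{
  int misses = 0;
  LevelNames names;
  names.define(Level::LOW, "low");
  // The handler records the failure and returns a view of static storage.
  names.set_default_handler([&misses](Level) -> std::string_view {
    ++misses;
    return "invalid";
  });
  std::string_view low  = names[Level::LOW];
  std::string_view high = names[Level::HIGH];
  assert(low == "low");
  assert(high == "invalid");
  return misses;
}
```

A handler that built a new string for each miss would have to keep it alive somewhere, which is the memory ownership and thread safety concern the paragraph above raises.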

Definitions can be added by the :func:`Lexicon::define` method. Usually a :class:`Lexicon` will be
constructed with the definitions. Two types of such construction are supported, both of them taking
an initializer list of definitions. The definitions may be a pair of enumeration and primary name,
or the definitions may be an enumeration and an initializer list of names, the first of which is the
primary name. Because initializer lists must be homogeneous, all definitions must be of the same
type.

Once initialized and with defaults set, a :class:`Lexicon` makes it very easy to convert between
enumeration values and strings for debugging and configuration handling, particularly if the
enumeration has a value for invalid. Checking input strings is then simply indexing the
:class:`Lexicon` with the string - if it's valid the appropriate enumeration value is returned,
otherwise the invalid value is returned as the default.

Construction is normally done by initializer lists, as these are easier to work with. There are
two forms, either pairs of value, name or pairs of value, list-of-names. The former is simpler
and sufficient if only primary names are to be defined, but the latter is required if secondary
names are present. In addition similar construction using :code:`std::array` is provided. This
is a bit clunkier, but does enable compile time verification that all values are defined.

Examples
========

Assume the enumeration is

.. literalinclude:: ../../../lib/ts/unit-tests/test_Lexicon.cc
:lines: 30
Contributor

If you're going to do this, I think you need a "WARNING: Do not even BREATH on this file!" comment in test_Lexicon.cc.

Member Author

The code this depends on is marked off as being part of the documentation. I really want to be able to know the example code compiles and runs.

Contributor

@ywkaras, Sep 12, 2018

The problem is for maintainers of test_Lexicon.cc to know that they need to change the documentation as well if they add or remove lines.

Maybe you should put code like this in test_Lexicon.cc: https://godbolt.org/z/b1H8tA


A :class:`Lexicon` for this could be defined as

.. literalinclude:: ../../../lib/ts/unit-tests/test_Lexicon.cc
:lines: 32

An instance could be constructed, with primary and secondary names, as

.. literalinclude:: ../../../lib/ts/unit-tests/test_Lexicon.cc
:lines: 36-40

Note there are no secondary names for ``INVALID`` but the list form must be used. If no secondary
names are needed, it could be done this way

.. literalinclude:: ../../../lib/ts/unit-tests/test_Lexicon.cc
:lines: 42-46

Assuming the first case with secondary names is used, it would be helpful to set defaults so that
undefined names or values map to the invalid case.

.. literalinclude:: ../../../lib/ts/unit-tests/test_Lexicon.cc
:lines: 48

With this initialization, these checks all succeed

.. literalinclude:: ../../../lib/ts/unit-tests/test_Lexicon.cc
:lines: 50-59

A slightly more complex initialization can be used when it is useful to verify that all of the values
in an enumeration are covered by the :class:`Lexicon`. There is a special constructor for this that
takes an extra argument, :code:`Require<>`. The API is designed with the presumption that there is a
"LAST_VALUE" in the enumeration which can be used for the size. For example, something like

.. literalinclude:: ../../../lib/ts/unit-tests/test_Lexicon.cc
:lines: 62

with the :class:`Lexicon` specialization as

.. literalinclude:: ../../../lib/ts/unit-tests/test_Lexicon.cc
:lines: 63

To cover everything (except the last value, which would normally be handled by the default), the
initialization would be

.. literalinclude:: ../../../lib/ts/unit-tests/test_Lexicon.cc
:lines: 64-68

If there is a missing value, this fails to compile with the error "'ts::Lexicon<Radio>::Definition::value'
is uninitialized reference".
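
The general idea of turning a missing definition into a compile error can be shown without the library. The sketch below uses a different mechanism from the one suggested by the error message above: an element type with no default constructor instead of a reference member. ``Band``, ``BandDef``, ``BAND_DEFS`` and ``band_name`` are hypothetical names, and the sketch assumes C++17.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical enumeration; LAST_VALUE is a trailing marker used only for the count.
enum class Band { OFF, AM, FM, LAST_VALUE };

// No default constructor, so every array element must be written out explicitly.
struct BandDef {
  constexpr BandDef(Band v, std::string_view n) : value(v), name(n) {}
  Band value;
  std::string_view name;
};

constexpr std::size_t BAND_COUNT = static_cast<std::size_t>(Band::LAST_VALUE);

constexpr std::array<BandDef, BAND_COUNT> BAND_DEFS = {{
  {Band::OFF, "off"},
  {Band::AM, "am"},
  {Band::FM, "fm"}, // removing this line leaves an element that cannot be value-initialized
}};

// Linear search is enough for a handful of definitions.
constexpr std::string_view band_name(Band value)
{
  for (auto const &def : BAND_DEFS) {
    if (def.value == value) {
      return def.name;
    }
  }
  return "invalid"; // plays the role of the default
}

static_assert(band_name(Band::AM) == "am", "lookup is usable at compile time");

inline void band_demo()
{
  assert(band_name(Band::FM) == "fm");
  assert(band_name(Band::LAST_VALUE) == "invalid");
}
```

This checks only that the number of definitions matches the size taken from the enumeration. It does not detect the same value being defined twice.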

Reference
=========

.. class:: template < typename T > Lexicon

A bidirectional converter between enumerations of :code:`T` and strings.

.. function:: T operator [] (std::string_view name)

Return the enumeration associated with :arg:`name`. If :arg:`name` is not in a definition and
no default has been set, :code:`std::domain_error` is thrown.

.. function:: std::string_view operator [] (T value)

Return the primary name associated with :arg:`value`. If :arg:`value` is not in a definition
and no default has been set, :code:`std::domain_error` is thrown.

.. function:: Lexicon & define(T value, std::string_view primary, ... )

Add a definition. The :arg:`value` is associated with the :arg:`primary` name. An arbitrary
number of additional secondary names may be provided.

Appendix
========

.. rubric:: Footnotes

.. [#]

The original implementation predated :code:`std::string_view` and so returned :code:`std::string`. While this
made generation of names for unmatched values easy, it did have a performance cost. In this version I
chose to use :code:`std::string_view` which makes the default handler functions not as useful,
but still, in my view, sufficiently useful (for logging at least) to be worth supporting.
1 change: 1 addition & 0 deletions doc/developer-guide/internal-libraries/index.en.rst
@@ -34,6 +34,7 @@ development team.
buffer-writer.en
intrusive-list.en
intrusive-hash-map.en
Lexicon.en
MemArena.en
AcidPtr.en
Extendible.en
21 changes: 21 additions & 0 deletions include/tscore/Hash.h
@@ -57,9 +57,30 @@ struct ATSHash : ATSHashBase {
struct ATSHash32 : ATSHashBase {
virtual uint32_t get(void) const = 0;
virtual bool operator==(const ATSHash32 &) const;
uint32_t hash_immediate(void *data, size_t len);
};

struct ATSHash64 : ATSHashBase {
virtual uint64_t get(void) const = 0;
virtual bool operator==(const ATSHash64 &) const;
uint64_t hash_immediate(void *data, size_t len);
};

// ----
// Implementation

inline uint32_t
ATSHash32::hash_immediate(void *data, size_t len)
{
this->update(data, len);
this->final();
return this->get();
}

inline uint64_t
ATSHash64::hash_immediate(void *data, size_t len)
{
this->update(data, len);
this->final();
return this->get();
}
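
For readers of this hunk, the ``hash_immediate`` convenience is the usual update, finalize, read sequence collapsed into one call. The following self-contained sketch shows that pattern outside of ATS. ``Hash32Sketch`` and ``ByteSum32`` are hypothetical, the byte sum is a toy stand-in rather than a real hash, and the ``const void *`` parameter is an assumption that differs from the ``void *`` in the diff above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical base class mirroring the update / final / get shape in the diff.
struct Hash32Sketch {
  virtual ~Hash32Sketch() = default;
  virtual void update(const void *data, std::size_t len) = 0;
  virtual void final()                                   = 0;
  virtual std::uint32_t get() const                      = 0;

  // One-shot helper: feed the data, finalize, and return the result.
  std::uint32_t
  hash_immediate(const void *data, std::size_t len)
  {
    this->update(data, len);
    this->final();
    return this->get();
  }
};

// Toy "hash" that sums the bytes, used only to make the sketch runnable.
struct ByteSum32 : Hash32Sketch {
  void
  update(const void *data, std::size_t len) override
  {
    auto const *bp = static_cast<const std::uint8_t *>(data);
    for (std::size_t i = 0; i < len; ++i) {
      _sum += bp[i];
    }
  }
  void final() override {}
  std::uint32_t get() const override { return _sum; }

private:
  std::uint32_t _sum = 0;
};

inline std::uint32_t byte_sum_demo()
{
  ByteSum32 hasher;
  std::uint32_t result = hasher.hash_immediate("abc", 3);
  assert(result == 97u + 98u + 99u); // 'a' + 'b' + 'c'
  return result;
}
```

In this sketch a second ``hash_immediate`` call on the same object continues from the previous state, so a caller would need to reset the hasher before reusing it.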
83 changes: 54 additions & 29 deletions include/tscore/HashFNV.h
@@ -5,7 +5,7 @@
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
regarding copyright ownership. The ASF licenses this fileN
Contributor

Is fileN really right?

Member Author

That's actually in a different PR this depends on. #4223.

to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
@@ -31,63 +31,88 @@
#include <cstdint>

struct ATSHash32FNV1a : ATSHash32 {
protected:
using super_type = ATSHash32;
using nullxfrm = ATSHash::nullxfrm;

public:
ATSHash32FNV1a(void);

template <typename Transform> void update(const void *data, size_t len, Transform xfrm);
void
update(const void *data, size_t len) override
{
update(data, len, ATSHash::nullxfrm());
}
template <typename Transform> void update(const void *data, size_t len, const Transform &xf);

void update(const void *data, size_t len) override;

void final(void) override;
uint32_t get(void) const override;
void clear(void) override;

template <typename Transform> uint32_t hash_immediate(const void *data, size_t len, const Transform &xf);

private:
uint32_t hval;
};

template <typename Transform>
void
ATSHash32FNV1a::update(const void *data, size_t len, Transform xfrm)
{
uint8_t *bp = (uint8_t *)data;
uint8_t *be = bp + len;

for (; bp < be; ++bp) {
hval ^= (uint32_t)xfrm(*bp);
hval += (hval << 1) + (hval << 4) + (hval << 7) + (hval << 8) + (hval << 24);
}
}

struct ATSHash64FNV1a : ATSHash64 {
ATSHash64FNV1a(void);

template <typename Transform> void update(const void *data, size_t len, Transform xfrm);
void
update(const void *data, size_t len) override
{
update(data, len, ATSHash::nullxfrm());
}
void update(const void *data, size_t len) override;

void final(void) override;
uint64_t get(void) const override;
void clear(void) override;

template <typename Transform> uint64_t hash_immediate(const void *data, size_t len, const Transform &xf);

private:
uint64_t hval;
};

// ----------
// Implementation

inline void
ATSHash32FNV1a::update(const void *data, size_t len)
{
return this->update(data, len, ATSHash::nullxfrm());
}
inline void
ATSHash64FNV1a::update(const void *data, size_t len)
{
return this->update(data, len, ATSHash::nullxfrm());
}

template <typename Transform>
uint32_t
ATSHash32FNV1a::hash_immediate(const void *data, size_t len, const Transform &xf)
{
this->update(data, len, xf);
this->final();
return this->get();
}

template <typename Transform>
void
ATSHash32FNV1a::update(const void *data, size_t len, const Transform &xf)
{
const uint8_t *bp = static_cast<const uint8_t *>(data);
const uint8_t *be = bp + len;

for (; bp < be; ++bp) {
hval ^= static_cast<uint32_t>(xf(*bp));
hval += (hval << 1) + (hval << 4) + (hval << 7) + (hval << 8) + (hval << 24);
}
}

template <typename Transform>
void
ATSHash64FNV1a::update(const void *data, size_t len, Transform xfrm)
ATSHash64FNV1a::update(const void *data, size_t len, Transform xf)
{
uint8_t *bp = (uint8_t *)data;
uint8_t *be = bp + len;
const uint8_t *bp = static_cast<const uint8_t *>(data);
const uint8_t *be = bp + len;

for (; bp < be; ++bp) {
hval ^= (uint64_t)xfrm(*bp);
hval ^= static_cast<uint64_t>(xf(*bp));
hval += (hval << 1) + (hval << 4) + (hval << 5) + (hval << 7) + (hval << 8) + (hval << 40);
}
}
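
As background for the loops above: the shift-and-add line in the 32-bit version is a multiplication by the FNV prime 16777619 (0x01000193), since 1 + 2 + 16 + 128 + 256 + 2^24 = 16777619. The transform argument lets a caller normalize each byte before it is mixed in. The following is a standalone sketch, not the ATS classes: ``fnv1a32``, ``fnv1a32_mul``, ``Identity`` and ``ToLower`` are hypothetical names, and the starting value is the standard FNV-1a 32-bit offset basis, since the initialization code is not part of this diff.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string_view>

// Hypothetical transforms; each maps one input byte to the byte that is hashed.
struct Identity {
  std::uint8_t operator()(std::uint8_t c) const { return c; }
};
struct ToLower {
  std::uint8_t operator()(std::uint8_t c) const { return static_cast<std::uint8_t>(std::tolower(c)); }
};

// FNV-1a, 32 bit, using the same shift-and-add form as the diff.
template <typename Transform>
std::uint32_t
fnv1a32(std::string_view text, const Transform &xf)
{
  std::uint32_t hval = 0x811c9dc5u; // standard FNV-1a 32-bit offset basis
  for (unsigned char c : text) {
    hval ^= static_cast<std::uint32_t>(xf(c));
    hval += (hval << 1) + (hval << 4) + (hval << 7) + (hval << 8) + (hval << 24);
  }
  return hval;
}

// The same hash written with an explicit multiply by the FNV prime.
inline std::uint32_t
fnv1a32_mul(std::string_view text)
{
  std::uint32_t hval = 0x811c9dc5u;
  for (unsigned char c : text) {
    hval = (hval ^ c) * 0x01000193u;
  }
  return hval;
}

inline void fnv_demo()
{
  // Shift-and-add and multiply forms agree.
  assert(fnv1a32("Lexicon", Identity{}) == fnv1a32_mul("Lexicon"));
  // A lower-casing transform makes the hash case insensitive.
  assert(fnv1a32("Content-Length", ToLower{}) == fnv1a32("content-length", Identity{}));
}
```

A case-insensitive container, such as the one :class:`Lexicon` needs for name lookup, could use a transform of this kind so that names differing only in case hash to the same bucket.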