@maskit
Member

@maskit maskit commented Mar 1, 2023

TLSSomethingSupport, which enables using SSLNetVC and QUICNetVC in the same way, requires dynamic_cast in many places. I don't know how much it affects performance, but this is one way to remove the dynamic_cast.

It still relies on RTTI, so it may not be as fast as you want, but I guess it's faster than looking up the vtable.

We probably want to avoid std::unordered_map here, and I'm going to change it later if we go this way. Current code is just to show the idea.

To remove the dynamic_casts, I added an array to store pre-static_cast-ed pointers to NetVConnection.
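
The array of pre-cast pointers described above could look roughly like the sketch below. This is an illustration only, not the code in this PR: `Conn`, `Service`, `AlpnSupport`, `SniSupport`, and `TlsConn` are hypothetical names standing in for NetVConnection, its service enum, two mix-ins, and a derived NetVC.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the mix-in classes.
struct AlpnSupport { int alpn_id = 1; };
struct SniSupport  { int sni_id = 2; };

// Hypothetical stand-in for NetVConnection.
class Conn
{
public:
  enum class Service : std::size_t { ALPN, SNI, N_SERVICES };

  virtual ~Conn() = default;

  // Returns the pointer registered for `s`, or nullptr if the service is absent.
  // The caller must name the same type that was registered for `s`.
  template <typename T>
  T *
  get_service(Service s) const
  {
    return static_cast<T *>(_services[static_cast<std::size_t>(s)]);
  }

protected:
  // Derived classes register each mix-in once, typically in their constructor.
  void
  set_service(Service s, void *instance)
  {
    _services[static_cast<std::size_t>(s)] = instance;
  }

private:
  // Value-initialized, so every slot starts out as nullptr.
  std::array<void *, static_cast<std::size_t>(Service::N_SERVICES)> _services{};
};

// A derived connection that has ALPN support but no SNI support.
class TlsConn : public Conn, public AlpnSupport
{
public:
  TlsConn()
  {
    // static_cast adjusts `this` to the AlpnSupport subobject before it is stored as void *.
    set_service(Service::ALPN, static_cast<AlpnSupport *>(this));
  }
};
```

A lookup is then one array index plus a `static_cast` from `void *`, with no RTTI involved. The cost is one pointer per service in every connection object.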

@maskit maskit self-assigned this Mar 1, 2023
@ywkaras
Contributor

ywkaras commented Mar 6, 2023

The virtual member function I added for the tunnel metrics may show a pattern that could be used for this: https://github.com/apache/trafficserver/pull/9403/files#diff-28fd9480778c3af26dbe33816ff4f4d35a616444bb9d19adaecd6735e29ca546R365

@maskit
Member Author

maskit commented Mar 7, 2023

virtual void do_something() is one option. But that way, NetVConnection would accumulate a lot of functions of that kind, and I think it would eventually become a mess. I can imagine somebody proposing do_something_detail_for_xyz just for one of the NetVCs, without any abstraction. Also, some of those functions would probably need parameters. I guess it'd be difficult to abstract everything.

virtual bool is_feature_x_supported() for each mix-in may be better in that sense. But NetVC would then know the name of x, and we would have to modify the NetVC base class whenever we add a new mix-in.

virtual bool has_support_for(enum_or_string feature_name) is clean. It would not bring any detail to NetVC itself. If get_service() isn't acceptable, I can live with this one.
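
A minimal sketch of the third option, assuming an enum is used as the feature name. `Feature`, `Conn`, and `TlsConn` are hypothetical names chosen for illustration; with an enum the base header still lists the feature identifiers, which a string key would avoid.

```cpp
#include <cassert>

// Hypothetical feature identifiers; the real names are not given in this thread.
enum class Feature { ALPN, SNI, EARLY_DATA };

// Hypothetical stand-in for NetVConnection. It exposes only a generic query
// and does not refer to any mix-in type.
class Conn
{
public:
  virtual ~Conn() = default;

  virtual bool
  has_support_for(Feature) const
  {
    return false; // a plain connection supports none of the optional features
  }
};

// A derived connection that reports the features it implements.
class TlsConn : public Conn
{
public:
  bool
  has_support_for(Feature f) const override
  {
    return f == Feature::ALPN || f == Feature::SNI;
  }
};
```

Note that this only answers whether a feature exists. The caller still needs some cast to reach the mix-in's interface afterwards.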

@ywkaras
Contributor

ywkaras commented Mar 7, 2023

To keep changes minimal, I suggest this pattern: https://godbolt.org/z/E86E4zxGs .

@maskit
Member Author

maskit commented Mar 7, 2023

> To keep changes minimal, I suggest this pattern: https://godbolt.org/z/E86E4zxGs .

I'd say it's a worse version of is_feature_x_supported. It has the same cons, plus it also requires forward declarations.
I'm happy to make big changes if we can solve this cleanly.

} else {
return nullptr;
}
}
Contributor

This will not work properly, for reasons illustrated by this example code: https://godbolt.org/z/bGofcYhse . Note that f() and g() do not have the same assembly code. With multiple inheritance, there is no known implementation (and it is probably impossible) where all the base classes have the same (start) address as the derived class. Even with no multiple inheritance, I think it's undefined behavior to assume that (direct and indirect) base classes have the same address as the derived class, although the de facto portability of this assumption is very high in that case.
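
The address issue can be reproduced with a small stand-alone sketch. The types `A`, `B`, and `D` below are hypothetical and are not taken from the linked Godbolt example.

```cpp
#include <cassert>

struct A { int a = 0; };
struct B { int b = 0; };
struct D : A, B { int d = 0; };

// Addresses of the complete object and of each base subobject, as untyped pointers.
// static_cast applies whatever offset the compiler chose for the subobject.
inline const void *addr_of_d(const D &obj) { return &obj; }
inline const void *addr_of_a(const D &obj) { return static_cast<const A *>(&obj); }
inline const void *addr_of_b(const D &obj) { return static_cast<const B *>(&obj); }

// True when both bases start at the address of the derived object. For two
// non-empty bases this cannot hold, because their storage does not overlap.
inline bool
all_bases_share_derived_address(const D &obj)
{
  return addr_of_a(obj) == addr_of_d(obj) && addr_of_b(obj) == addr_of_d(obj);
}
```

Because at least one base sits at a non-zero offset, storing a `D *` as `void *` and later reading it back as a `B *` skips the adjustment that `static_cast<B *>` would have applied.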

Member Author

OK, it looks like this approach, at least, doesn't work. What about the other approaches? Do you see any technical issues?

@ywkaras
Contributor

ywkaras commented Mar 8, 2023

There is some chance that this would be faster than dynamic_cast: https://godbolt.org/z/fvzWsxj5T .

@maskit
Member Author

maskit commented Mar 8, 2023

> There is some chance that this would be faster than dynamic_cast: https://godbolt.org/z/fvzWsxj5T .

I don't think we need to cast MixinA to MixinB. In that sense this should be enough, though it's quite naive: https://godbolt.org/z/fjnMxfj85

But your code suggests that I could store pre-casted pointers into a map instead of just having supported types in a set.
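
Storing pre-cast pointers in a map, rather than only recording the supported types in a set, might look like the sketch below. It assumes `std::type_index` as the key; `Base`, `Derived`, `get_mixin`, and `add_mixin` are hypothetical names, and the linked snippets may differ.

```cpp
#include <cassert>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

struct MixinA { int a = 1; };
struct MixinB { int b = 2; };

class Base
{
public:
  virtual ~Base() = default;

  // Looks up the pointer stored for T; returns nullptr when T was never registered.
  template <typename T>
  T *
  get_mixin() const
  {
    auto it = _mixins.find(std::type_index(typeid(T)));
    return it == _mixins.end() ? nullptr : static_cast<T *>(it->second);
  }

protected:
  // The argument is converted to T * (with any base-subobject adjustment)
  // before it is stored as void *.
  template <typename T>
  void
  add_mixin(T *p)
  {
    _mixins[std::type_index(typeid(T))] = p;
  }

private:
  std::unordered_map<std::type_index, void *> _mixins;
};

class Derived : public Base, public MixinA
{
public:
  Derived() { add_mixin<MixinA>(this); }
};
```

Each lookup still hashes a `type_index`, so whether this beats `dynamic_cast` by a useful margin would need to be measured.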

I wonder whether this kind of approach can be ten times faster than dynamic_cast. Being only 5% faster is not a win here.

@ywkaras
Contributor

ywkaras commented Mar 8, 2023

This would be faster: https://godbolt.org/z/66YofcchT

@maskit
Member Author

maskit commented Mar 8, 2023

Sure, but again, if we do that the base class has to be modified every time we add new mixins, and the base class has to know all mixin names.
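
The objection can be illustrated with a sketch that assumes the faster variant uses one virtual accessor per mix-in, as the name get_mixin_a() used later in this thread suggests. `Base`, `Derived`, `MixinA`, and `MixinB` are hypothetical names.

```cpp
#include <cassert>

struct MixinA { int a = 1; };
struct MixinB { int b = 2; };

class Base
{
public:
  virtual ~Base() = default;

  // One accessor per mix-in. Adding a MixinC means editing this class,
  // and the base has to name every mix-in type.
  virtual MixinA *get_mixin_a() { return nullptr; }
  virtual MixinB *get_mixin_b() { return nullptr; }
};

class Derived : public Base, public MixinB
{
public:
  // Implicit derived-to-base conversion yields the MixinB subobject.
  MixinB *get_mixin_b() override { return this; }
};
```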

@maskit
Member Author

maskit commented Mar 14, 2023

This is not as clean as get_mixin(type_info x), and not as fast as get_mixin_a(), but could it be a compromise? It should be faster than an if-else chain or a hash table.
https://godbolt.org/z/T68zTPG15

@maskit maskit mentioned this pull request Mar 14, 2023
@maskit maskit force-pushed the reduce_dynamic_cast branch from 6522558 to 5d97947 Compare May 23, 2023 00:05
@maskit maskit marked this pull request as ready for review May 23, 2023 01:17
@maskit maskit added this to the 10.0.0 milestone May 23, 2023
@maskit
Member Author

maskit commented May 25, 2023

[approve ci autest]

@maskit
Member Author

maskit commented May 26, 2023

[approve ci autest]

polite_hook_wait failed.

@maskit
Member Author

maskit commented May 26, 2023

[approve ci autest]

polite_hook_wait failed again.

@maskit
Member Author

maskit commented May 26, 2023

[approve ci autest]

@bryancall bryancall requested a review from ywkaras June 5, 2023 22:34
@jpeach
Contributor

jpeach commented Jun 7, 2023

> I don't know how much it affects performance, but this is one way to remove the dynamic_cast.

Were you able to do any benchmarks to demonstrate the performance effect?

@jpeach
Contributor

jpeach commented Jun 7, 2023

My 2c on the API here is that there's too great a reliance on always having to match the enum and type pointer. You need to do it in every place that registers a helper, and every place that consumes one, which is error-prone. It would be great if the mapping was in exactly one place.

For example, you could keep the mapping in a template specialization:

template <typename T> T *NetConnectionService(const NetVConnection *vc);

template <> ALPNSupport *NetConnectionService<ALPNSupport>(const NetVConnection *vc)
{
  return static_cast<ALPNSupport *>(vc->get_thing(TLS_ALPN));
}

/// etc ..

This approach could also let you get rid of the nil checks:

template <typename T, typename F>
void WithNetConnectionService(const NetVConnection *vc, F &&handler)
{
  T *svc = NetConnectionService<T>(vc);
  if (svc) {
    handler(svc);
  }
}
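
For reference, a compilable version of this idea might look like the sketch below. It is an illustration under assumptions: `Conn`, `set_thing`, `ConnService`, and `WithConnService` are hypothetical names, `get_thing` is the placeholder from the comment above, and the empty `ALPNSupport` and `SNISupport` structs only stand in for the real classes.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real support classes.
struct ALPNSupport { int calls = 0; };
struct SNISupport  { int calls = 0; };

// Hypothetical minimal connection: untyped pointers indexed by an enum.
class Conn
{
public:
  enum Service { TLS_ALPN, TLS_SNI, N_SERVICES };

  void *get_thing(Service s) const { return _things[s]; }
  void set_thing(Service s, void *p) { _things[s] = p; }

private:
  void *_things[N_SERVICES] = {};
};

// The primary template is declared but never defined, so asking for a type
// that has no mapping fails at link time.
template <typename T> T *ConnService(const Conn *vc);

// Each specialization is the single place where an enum value meets its type.
template <>
inline ALPNSupport *
ConnService<ALPNSupport>(const Conn *vc)
{
  return static_cast<ALPNSupport *>(vc->get_thing(Conn::TLS_ALPN));
}

template <>
inline SNISupport *
ConnService<SNISupport>(const Conn *vc)
{
  return static_cast<SNISupport *>(vc->get_thing(Conn::TLS_SNI));
}

// Runs `handler` only when the service is present; returns whether it ran.
template <typename T, typename F>
bool
WithConnService(const Conn *vc, F &&handler)
{
  if (T *svc = ConnService<T>(vc)) {
    handler(svc);
    return true;
  }
  return false;
}
```

Returning `bool` from the helper is an addition for this sketch, so that callers can still tell whether the handler ran.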

@ywkaras
Contributor

ywkaras commented Jun 22, 2023

I'm going to abstain on this one. The clean way to get rid of dynamic_cast is to move the distinct behavior for derived classes into the derived class. But it's very difficult to do that in this case. I feel like this is adding complexity, for only a speculative benefit of significantly improved performance. Maybe find another reviewer, or start a thread on the dev mailing list.

@maskit maskit force-pushed the reduce_dynamic_cast branch from de78c49 to 805ae67 Compare June 23, 2023 04:56
@maskit
Member Author

maskit commented Jun 23, 2023

Before (w/o this change)

$ h2load -n 100000 -m 10 -c 10 -t 5 https://localhost:8443/8k                                                                                                                                                                                                                                                                 
starting benchmark...                                                                                                                   
spawning thread #0: 2 total client(s). 20000 total requests                                                                             
spawning thread #1: 2 total client(s). 20000 total requests                                                                             
spawning thread #2: 2 total client(s). 20000 total requests                                                                             
spawning thread #3: 2 total client(s). 20000 total requests                                                                             
spawning thread #4: 2 total client(s). 20000 total requests                                                                             
TLS Protocol: TLSv1.3                                                                                                                   
Cipher: TLS_AES_128_GCM_SHA256                                                                                                          
Server Temp Key: X25519 253 bits                                                                                                        
Application protocol: h2                                                                                                                
progress: 10% done                                                                                                                      
progress: 20% done                                                                                                                      
progress: 30% done                                                                                                                      
progress: 40% done                                                                                                                      
progress: 50% done                                                                                                                      
progress: 60% done                                                                                                                      
progress: 70% done                                                                                                                      
progress: 80% done                                                                                                                      
progress: 90% done                                                                                                                      
progress: 100% done                                                                                                                     
                                                                                                                                        
finished in 3.44s, 29042.21 req/s, 227.72MB/s                                                                                           
requests: 100000 total, 100000 started, 100000 done, 100000 succeeded, 0 failed, 0 errored, 0 timeout                                   
status codes: 100000 2xx, 0 3xx, 0 4xx, 0 5xx                                                                                           
traffic: 784.11MB (822202500) total, 1.15MB (1202140) headers (space savings 96.37%), 781.25MB (819200000) data                         
                     min         max         mean         sd        +/- sd                                                              
time for request:      309us     63.06ms      3.37ms      1.43ms    84.61%                                                              
time for connect:    11.93ms     71.32ms     43.61ms     19.76ms    60.00%                                                              
time to 1st byte:    74.94ms     76.78ms     75.38ms       517us    90.00%                                                              
req/s           :    2904.63     2932.24     2911.70        9.42    80.00%

[image attachment]

After (w/ this change)

$ h2load -n 100000 -m 10 -c 10 -t 5 https://localhost:8443/8k                                                                        
starting benchmark...                                                                                                                
spawning thread #0: 2 total client(s). 20000 total requests                                                                          
spawning thread #1: 2 total client(s). 20000 total requests                                                                          
spawning thread #2: 2 total client(s). 20000 total requests                                                                          
spawning thread #3: 2 total client(s). 20000 total requests                                                                          
spawning thread #4: 2 total client(s). 20000 total requests                                                                          
TLS Protocol: TLSv1.3                                                                                                                
Cipher: TLS_AES_128_GCM_SHA256                                                                                                       
Server Temp Key: X25519 253 bits                                                                                                     
Application protocol: h2                                                                                                             
progress: 10% done                                                                                                                   
progress: 20% done                                                                                                                   
progress: 30% done                                                                                                                   
progress: 40% done                                                                                                                   
progress: 50% done                                                                                                                   
progress: 60% done                                                                                                                   
progress: 70% done                                                                                                                   
progress: 80% done                                                                                                                   
progress: 90% done                                                                                                                   
progress: 100% done                                                                                                                  
                                                                                                                                     
finished in 3.31s, 30198.57 req/s, 236.79MB/s                                                                                        
requests: 100000 total, 100000 started, 100000 done, 100000 succeeded, 0 failed, 0 errored, 0 timeout                                
status codes: 100000 2xx, 0 3xx, 0 4xx, 0 5xx                                                                                        
traffic: 784.11MB (822202500) total, 1.15MB (1202140) headers (space savings 96.37%), 781.25MB (819200000) data                      
                     min         max         mean         sd        +/- sd                                                           
time for request:      328us     65.54ms      3.23ms      1.43ms    84.80%                                                           
time for connect:    12.12ms     74.18ms     46.18ms     20.57ms    60.00%                                                           
time to 1st byte:    77.60ms     78.46ms     77.90ms       235us    70.00%                                                           
req/s           :    3020.03     3040.60     3027.50        8.17    80.00%

[image attachment]

HttpSM::attach_client_session is gone.
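
For a quick read of the two runs above, the relative difference in throughput can be computed from the reported req/s figures: (30198.57 - 29042.21) / 29042.21 is roughly 3.98%. The helper below is a hypothetical convenience, not part of h2load. Each figure comes from a single run, so run-to-run variance is not captured.

```cpp
#include <cassert>
#include <cmath>

// Relative change from `before` to `after`, expressed as a percentage.
inline double
percent_change(double before, double after)
{
  return (after - before) / before * 100.0;
}
```

Applied to the mean request time (3.37ms before, 3.23ms after), the same formula gives a reduction of roughly 4.2%.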

@ywkaras
Contributor

ywkaras commented Jun 26, 2023

How about this approach?

wkaras ~/STUFF/GET_SERVICE
$ cat x.cc
#include <cstddef> // std::ptrdiff_t, std::size_t
#include <cstdint> // PTRDIFF_MIN

namespace ts
{

// Returns offset from Base1 to Base2 in any instance of Derived.
//
template <class Derived, class Base1, class Base2>
std::ptrdiff_t base_to_base_offset()
{
  // Don't use 0 for dummy address of derived, since compiler may treat null address specially.
  //
  auto addr = reinterpret_cast<Derived *>(0x100);

  return reinterpret_cast<char *>(static_cast<Base2 *>(addr)) - reinterpret_cast<char *>(static_cast<Base1 *>(addr));
}

} // end namespace ts

enum class X_SERVICE {
  Y1, Y2, Y3, NUM
};

template <class>
struct X_Service;

static std::ptrdiff_t const X_SERVICE_NONE{PTRDIFF_MIN};

class X
{
public:
  template <class C>
  C const * get_service() const
  {
    auto ofs = _service_ofs()[std::size_t(X_Service<C>::value)];
    if (X_SERVICE_NONE == ofs) {
      return nullptr;
    } else {
      return reinterpret_cast<C const *>(reinterpret_cast<char const *>(this) + ofs);
    }
  }

  template <class C>
  C * get_service()
  {
    std::ptrdiff_t ofs = _service_ofs()[std::size_t(X_Service<C>::value)];
    if (X_SERVICE_NONE == ofs) {
      return nullptr;
    } else {
      return reinterpret_cast<C *>(reinterpret_cast<char *>(this) + ofs);
    }
  }

private:
  static std::ptrdiff_t const * const _service_ofs_table;

  // Returns array (A), indexed by service (S), where A[S] is the offset from base class X to the base class for service
  // S in the same object, or X_SERVICE_NONE if there is no base class in the object for service S.
  //
  virtual std::ptrdiff_t const * _service_ofs() const { return _service_ofs_table; }

  int dummy_data;
};

std::ptrdiff_t const * const X::_service_ofs_table{[]() -> std::ptrdiff_t *
{
  static std::ptrdiff_t tbl[std::size_t(X_SERVICE::NUM)];

  tbl[std::size_t(X_SERVICE::Y1)] = X_SERVICE_NONE;
  tbl[std::size_t(X_SERVICE::Y2)] = X_SERVICE_NONE;
  tbl[std::size_t(X_SERVICE::Y3)] = X_SERVICE_NONE;

  return tbl;
}()};

struct Y1
{
  int dummy_data;
};

template <>
struct X_Service<Y1>
{
  static X_SERVICE const value{X_SERVICE::Y1};
};

struct Y2
{
  int dummy_data;
};

template <>
struct X_Service<Y2>
{
  static X_SERVICE const value{X_SERVICE::Y2};
};

struct Y3
{
  int dummy_data;
};

template <>
struct X_Service<Y3>
{
  static X_SERVICE const value{X_SERVICE::Y3};
};

class D1 : public X, public Y1, public Y3
{
public:
  int dummy_data;

private:
  static std::ptrdiff_t const * const _service_ofs_table;

  // Returns array (A), indexed by service (S), where A[S] is the offset from base class X to the base class for service
  // S in the same object, or X_SERVICE_NONE if there is no base class in the object for service S.
  //
  std::ptrdiff_t const * _service_ofs() const override { return _service_ofs_table; }
};

std::ptrdiff_t const * const D1::_service_ofs_table{[]() -> std::ptrdiff_t *
{
  static std::ptrdiff_t tbl[std::size_t(X_SERVICE::NUM)];

  tbl[std::size_t(X_SERVICE::Y1)] = ts::base_to_base_offset<D1, X, Y1>();
  tbl[std::size_t(X_SERVICE::Y2)] = X_SERVICE_NONE;
  tbl[std::size_t(X_SERVICE::Y3)] = ts::base_to_base_offset<D1, X, Y3>();

  return tbl;
}()};

#include <cassert>

int main()
{
  D1 d1;
  X *xp = &d1;

  assert(xp->get_service<Y1>() == static_cast<Y1 *>(&d1));
  assert(xp->get_service<Y2>() == nullptr);
  assert(xp->get_service<Y3>() == static_cast<Y3 *>(&d1));

  return 0;
}
wkaras ~/STUFF/GET_SERVICE
$ gcc -Wall -Wextra -pedantic -std=c++17 x.cc -lstdc++
wkaras ~/STUFF/GET_SERVICE
$ ./a.out
wkaras ~/STUFF/GET_SERVICE
$

@maskit
Member Author

maskit commented Jun 26, 2023

None of the dynamic_casts need casting from a base class to another base class.

@ywkaras
Contributor

ywkaras commented Jun 26, 2023

> None of the dynamic_casts need casting from a base class to another base class.

X corresponds to NetVConnection, and Y1, Y2, ... correspond to TLSTunnelSupport, TLSSNISupport, etc. D1 would correspond to some class like SSLNetVConnection.

@maskit
Member Author

maskit commented Jun 26, 2023

Oh you mean the mixins.

What's the advantage of your code? It looks like doing a static_cast in a complicated way.

@ywkaras
Contributor

ywkaras commented Jun 28, 2023

I suggest using this alternate approach. It only requires each NetVConnection object to have an additional pointer, rather than an array of pointers:

#include <cstddef> // std::ptrdiff_t, std::size_t
#include <cstdint> // PTRDIFF_MIN

namespace ts
{

// Returns offset from Base1 to Base2 in any instance of Derived.
//
template <class Derived, class Base1, class Base2>
std::ptrdiff_t base_to_base_offset()
{
  // Don't use 0 for dummy address of derived, since compiler may treat null address specially.
  //
  auto addr = reinterpret_cast<Derived *>(0x100);

  return reinterpret_cast<char *>(static_cast<Base2 *>(addr)) - reinterpret_cast<char *>(static_cast<Base1 *>(addr));
}

} // end namespace ts

enum class X_SERVICE {
  Y1, Y2, Y3, NUM
};

template <class>
struct X_Service;

static std::ptrdiff_t const X_SERVICE_NONE{PTRDIFF_MIN};

class X
{
public:
  X(std::ptrdiff_t const *service_ofs) : _service_ofs(service_ofs) {}

  template <class C>
  C const * get_service() const
  {
    auto ofs = _service_ofs[std::size_t(X_Service<C>::value)];
    if (X_SERVICE_NONE == ofs) {
      return nullptr;
    } else {
      return reinterpret_cast<C const *>(reinterpret_cast<char const *>(this) + ofs);
    }
  }

  template <class C>
  C * get_service()
  {
    std::ptrdiff_t ofs = _service_ofs[std::size_t(X_Service<C>::value)];
    if (X_SERVICE_NONE == ofs) {
      return nullptr;
    } else {
      return reinterpret_cast<C *>(reinterpret_cast<char *>(this) + ofs);
    }
  }

private:
  // Indexed by service (S), where _service_ofs[S] is the offset from base class X to the base class for service
  // S in the same object, or X_SERVICE_NONE if there is no base class in the object for service S.
  //
  std::ptrdiff_t const * const _service_ofs;

  int dummy_data;
};

struct Y1
{
  int dummy_data;
};

template <>
struct X_Service<Y1>
{
  static X_SERVICE const value{X_SERVICE::Y1};
};

struct Y2
{
  int dummy_data;
};

template <>
struct X_Service<Y2>
{
  static X_SERVICE const value{X_SERVICE::Y2};
};

struct Y3
{
  int dummy_data;
};

template <>
struct X_Service<Y3>
{
  static X_SERVICE const value{X_SERVICE::Y3};
};

class D1 : public X, public Y1, public Y3
{
public:
  D1() : X(_service_ofs_table) {}

private:
  static std::ptrdiff_t const * const _service_ofs_table;

  int dummy_data;
};

std::ptrdiff_t const * const D1::_service_ofs_table{[]() -> std::ptrdiff_t *
{
  static std::ptrdiff_t tbl[std::size_t(X_SERVICE::NUM)];

  tbl[std::size_t(X_SERVICE::Y1)] = ts::base_to_base_offset<D1, X, Y1>();
  tbl[std::size_t(X_SERVICE::Y2)] = X_SERVICE_NONE;
  tbl[std::size_t(X_SERVICE::Y3)] = ts::base_to_base_offset<D1, X, Y3>();

  return tbl;
}()};

#include <cassert>

int main()
{
  D1 d1;
  X *xp = &d1;

  assert(xp->get_service<Y1>() == static_cast<Y1 *>(&d1));
  assert(xp->get_service<Y2>() == nullptr);
  assert(xp->get_service<Y3>() == static_cast<Y3 *>(&d1));

  return 0;
}

This performance test indicates the performance would be the same or slightly better:

wkaras ~/STUFF/IF_VS_ARR
$ cat scr2
cat x2.cc
echo ==============
gcc -O3 -Wall -Wextra -pedantic -std=c++17 x2.cc -lstdc++
time ./a.out
echo ==============
gcc -DWALT -O3 -Wall -Wextra -pedantic -std=c++17 x2.cc -lstdc++
time ./a.out
wkaras ~/STUFF/IF_VS_ARR
$ . scr2
int const Dim = 7;
int const Idx = 4;
int const Num_s = 1 << 14;
int const Iters = 1 << 17;

using Ofs_t = long long;

char dummy;

Ofs_t the_ofs_arr[Dim];

struct S
{
  char * volatile arr[Dim];

  Ofs_t volatile * volatile ofs_arr{the_ofs_arr};
};

S s[Num_s];

S * volatile sp[Num_s];

char * volatile cp;

int main()
{
  for (int i{0}; i < Num_s; ++i) {
    s[i].arr[Idx] = &dummy + i;
    sp[i] = s + i;
  }

  the_ofs_arr[Idx] = 42;

  #ifndef WALT

  for (int i{Iters}; i; --i) {
    for (int j{0}; j < Num_s; ++j) {
      char *p = sp[j]->arr[Idx];
      if (p) {
        cp = p;
      }
    }
  }

  #else

  char *dp = &dummy;
  for (int i{Iters}; i; --i) {
    for (int j{0}; j < Num_s; ++j) {
      Ofs_t ofs = sp[j]->ofs_arr[Idx];
      if (666 != ofs) {
        cp = dp + ofs;
      }
    }
  }

  #endif

  return 0;
}
==============

real	0m2.657s
user	0m2.653s
sys	0m0.004s
==============

real	0m2.541s
user	0m2.541s
sys	0m0.000s
wkaras ~/STUFF/IF_VS_ARR
$ 

@jpeach
Contributor

jpeach commented Jun 28, 2023

> I suggest using this alternate approach. It only requires each NetVConnection object to have an additional pointer, rather than an array of pointers:

IIUC this requires the use of inheritance, where the existing approach allows the option of composition. I agree that it would be nice to store fewer pointers though.

@maskit
Member Author

maskit commented Jun 29, 2023

> > I suggest using this alternate approach. It only requires each NetVConnection object to have an additional pointer, rather than an array of pointers:
>
> IIUC this requires the use of inheritance, where the existing approach allows the option of composition. I agree that it would be nice to store fewer pointers though.

Composition is an interesting idea. It can reduce the size of Unix/SSL/QUICNetVConnection. For example, if TLS early data is disabled in the configuration, we don't need to have TLSEarlyData and its related variables in the first place.

@ywkaras
Contributor

ywkaras commented Jul 3, 2023

OK, here's composition, and the option of multiple services with the same type. For a total of three shrubberies.

#include <cstddef> // std::ptrdiff_t, std::size_t
#include <cstdint> // PTRDIFF_MIN

namespace ts
{

// Returns offset from Base1 to Base2 in any instance of Derived.
//
template <class Derived, class Base1, class Base2>
std::ptrdiff_t base_to_base_offset()
{
  // Don't use 0 for dummy address of derived, since compiler may treat null address specially.
  //
  auto addr = reinterpret_cast<Derived *>(0x100);

  return reinterpret_cast<char *>(static_cast<Base2 *>(addr)) - reinterpret_cast<char *>(static_cast<Base1 *>(addr));
}

// Returns offset from Base1 to data member of Derived.
//
template <class Derived, class Base1, typename MbrT, MbrT Derived::*Mbr>
std::ptrdiff_t base_to_mbr_offset()
{
  // Don't use 0 for dummy address of derived, since compiler may treat null address specially.
  //
  auto addr = reinterpret_cast<Derived *>(0x100);

  return reinterpret_cast<char *>(&(addr->*Mbr)) - reinterpret_cast<char *>(static_cast<Base1 *>(addr));
}

} // end namespace ts

enum class X_SERVICE {
  Y1, Y2, Y3_0, Y3_1, Y4, NUM
};

template <class, int>
struct X_Service;

static std::ptrdiff_t const X_SERVICE_NONE{PTRDIFF_MIN};

class X
{
public:
  X(std::ptrdiff_t const *service_ofs) : _service_ofs(service_ofs) {}

  template <class C, int Selector = 0>
  C const * get_service() const
  {
    auto ofs = _service_ofs[std::size_t(X_Service<C, Selector>::value)];
    if (X_SERVICE_NONE == ofs) {
      return nullptr;
    } else {
      return reinterpret_cast<C const *>(reinterpret_cast<char const *>(this) + ofs);
    }
  }

  template <class C, int Selector = 0>
  C * get_service()
  {
    std::ptrdiff_t ofs = _service_ofs[std::size_t(X_Service<C, Selector>::value)];
    if (X_SERVICE_NONE == ofs) {
      return nullptr;
    } else {
      return reinterpret_cast<C *>(reinterpret_cast<char *>(this) + ofs);
    }
  }

private:
  // Indexed by service (S), where _service_ofs[S] is the offset from base class X to the base class or data member for
  // service S in the same object, or X_SERVICE_NONE if there is no such base/member in the object for service S.
  //
  std::ptrdiff_t const * const _service_ofs;

  int dummy_data;
};

struct Y1
{
  int dummy_data;
};

template <>
struct X_Service<Y1, 0>
{
  static X_SERVICE const value{X_SERVICE::Y1};
};

struct Y2
{
  int dummy_data;
};

template <>
struct X_Service<Y2, 0>
{
  static X_SERVICE const value{X_SERVICE::Y2};
};

struct Y3
{
  int dummy_data;
};

template <>
struct X_Service<Y3, 0>
{
  static X_SERVICE const value{X_SERVICE::Y3_0};
};

template <>
struct X_Service<Y3, 1>
{
  static X_SERVICE const value{X_SERVICE::Y3_1};
};

struct Y4
{
  int dummy_data;
};

template <>
struct X_Service<Y4, 0>
{
  static X_SERVICE const value{X_SERVICE::Y4};
};

class D1 : public X, public Y1, public Y3
{
public:
  D1() : X(_service_ofs_table) {}

  Y3 my3;
  Y4 my4;

private:
  static std::ptrdiff_t const * const _service_ofs_table;

  int dummy_data;
};

std::ptrdiff_t const * const D1::_service_ofs_table{[]() -> std::ptrdiff_t *
{
  static std::ptrdiff_t tbl[std::size_t(X_SERVICE::NUM)];

  tbl[std::size_t(X_SERVICE::Y1)] = ts::base_to_base_offset<D1, X, Y1>();
  tbl[std::size_t(X_SERVICE::Y2)] = X_SERVICE_NONE;
  tbl[std::size_t(X_SERVICE::Y3_0)] = ts::base_to_base_offset<D1, X, Y3>();
  tbl[std::size_t(X_SERVICE::Y3_1)] = ts::base_to_mbr_offset<D1, X, Y3, &D1::my3>();
  tbl[std::size_t(X_SERVICE::Y4)] = ts::base_to_mbr_offset<D1, X, Y4, &D1::my4>();

  return tbl;
}()};

#include <cassert>

int main()
{
  D1 d1;
  X *xp = &d1;

  assert(xp->get_service<Y1>() == static_cast<Y1 *>(&d1));
  assert(xp->get_service<Y2>() == nullptr);
  assert((xp->get_service<Y3, 0>()) == static_cast<Y3 *>(&d1));
  assert((xp->get_service<Y3, 1>()) == &d1.my3);
  assert(xp->get_service<Y4>() == &d1.my4);

  return 0;
}

@maskit
Member Author

maskit commented Jul 4, 2023

It looks like my3 and my4 always have to be created. Can we have Y3 *my3 instead? Otherwise we cannot make the base class small. Although composition is an interesting idea, I'm not going to make a change for it now. In particular, dynamic instantiation could make this slow again.

This PR touches many places and I don't want to keep it open for a long time to find the best way (there's already a conflict). It looks like we have all agreed on the interface, at least. I'm going to make the changes James suggested, so we can start using the new interface.

@maskit maskit force-pushed the reduce_dynamic_cast branch from 45561f2 to fa45f3b Compare July 4, 2023 01:04
@ywkaras
Contributor

ywkaras commented Jul 5, 2023

> Can we have Y3 *my3 instead?

No. If the service object is not within the derived object's memory footprint, there would not be a fixed offset to it. You could perhaps use std::optional.
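
A small sketch of why std::optional fits the offset scheme: it never allocates, so the contained object lives inside the optional's own storage and therefore inside the enclosing object. `Holder` and `service_is_inside` are hypothetical names, and `Y3` here is a stand-in for the struct in the examples above.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

struct Y3 { int data = 0; };

// Hypothetical owner of an optional service.
struct Holder {
  int header = 0;
  std::optional<Y3> my3; // storage is reserved inline even while the optional is empty
};

// True when the engaged Y3 lies entirely within the bytes of `h`.
// Precondition: h.my3 holds a value.
inline bool
service_is_inside(Holder &h)
{
  auto begin = reinterpret_cast<std::uintptr_t>(&h);
  auto end   = begin + sizeof(Holder);
  auto p     = reinterpret_cast<std::uintptr_t>(&*h.my3);
  return begin <= p && p + sizeof(Y3) <= end;
}
```

This keeps construction of the service optional, but `sizeof(Holder)` still includes room for a `Y3`, so it does not make the object smaller. A lookup would also need to check `has_value()` before handing out the pointer.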

@maskit maskit requested a review from jpeach July 5, 2023 15:47
@maskit
Member Author

maskit commented Jul 5, 2023

I think it's in good shape now, and I'd like to merge this if there's no blocker. Let me know if there are comments/issues that are not resolved yet.

jpeach
jpeach previously approved these changes Jul 9, 2023
Contributor

@jpeach jpeach left a comment

When you squash the commit, please add a summary of the performance benefit from the discussion here.

}

inline void
NetVConnection::_set_service(enum NetVConnection::Service service, void *instance)
Contributor

Note that this isn't type-safe. I expect it would be possible to make it typesafe if you had a template that mapped the enum to the type directly.
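
One way such a mapping could be written is sketched below. This is an assumption-laden illustration rather than the merged code: `Service`, `ServiceType`, `Conn`, and the two support structs are hypothetical stand-ins.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the real support classes.
struct ALPNSupport { int id = 1; };
struct SNISupport  { int id = 2; };

enum class Service : std::size_t { TLS_ALPN, TLS_SNI, N_SERVICES };

// Maps each enumerator to exactly one type. There is no primary definition,
// so using an unmapped enumerator is a compile error.
template <Service S> struct ServiceType;
template <> struct ServiceType<Service::TLS_ALPN> { using type = ALPNSupport; };
template <> struct ServiceType<Service::TLS_SNI>  { using type = SNISupport; };

class Conn
{
public:
  // Accepts only a pointer of the type mapped to S.
  template <Service S>
  void
  set_service(typename ServiceType<S>::type *instance)
  {
    _services[static_cast<std::size_t>(S)] = instance;
  }

  // Returns the stored pointer with its mapped type, or nullptr if unset.
  template <Service S>
  typename ServiceType<S>::type *
  get_service() const
  {
    return static_cast<typename ServiceType<S>::type *>(_services[static_cast<std::size_t>(S)]);
  }

private:
  std::array<void *, static_cast<std::size_t>(Service::N_SERVICES)> _services{};
};
```

With this shape, a call such as `set_service<Service::TLS_ALPN>(&sni)` with an `SNISupport` object does not compile, because `SNISupport *` does not convert to `ALPNSupport *`.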

Co-authored-by: James Peach <jpeach@apache.org>
@maskit maskit requested a review from jpeach July 10, 2023 16:54
@maskit maskit merged commit f6bf8bd into apache:master Jul 11, 2023
@maskit maskit mentioned this pull request Jul 11, 2023
maskit added a commit to maskit/trafficserver that referenced this pull request Jul 12, 2023
This is a follow-up on apache#9482. I missed one place on the previous PR.
@maskit maskit mentioned this pull request Jul 12, 2023
maskit added a commit that referenced this pull request Jul 12, 2023
This is a follow-up on #9482. I missed one place on the previous PR.