Skip to content

[SYCL] Support connection with multiple plugins #1490

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 23, 2020

Conversation

againull
Copy link
Contributor

@againull againull commented Apr 7, 2020

This commit enables including multiple devices of the same device_type
and changed the logic of device selection to just prefer a SYCL_BE
device if present. If someone uses SYCL_BE but appropriate device is
not present, we will simply use any other device.

Signed-off-by: Artur Gainullin artur.gainullin@intel.com

Copy link
Contributor

@smaslov-intel smaslov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just few minor comments

// are processed only once, like reading a string from environment
// and converting it into a typed object.
//
template <typename T, const char *E> class Config {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest reusing SYCLConfig from source/detail/config.hpp instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using SYCLConfig in this case is the same as just using getenv. Introduced Config class is a singleton and was created to read environment variables only once (i.e. for efficiency). But getPreferredBE is called only once in device_selector and tracing is debugging feature, so this kind of over-complication doesn't make sense. I removed singleton, so that we don't need any static variables and global variables. I think it is ok just to read environment variables in this case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using SYCLConfig in this case is the same as just using getenv. Introduced Config class is a singleton and was created to read environment variables only once (i.e. for efficiency).

SYCLConfig does the same - it reads only once.

But getPreferredBE is called only once in device_selector and tracing is debugging feature, so this kind of over-complication doesn't make sense.
I believe for tracing it is called each time we call PI API. So, reading env var each time can be costly.

Copy link
Contributor Author

@againull againull Apr 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@romanovvlad Sorry, but looking to the code I don't understand why. When we call get() we always call getRawValue() which calls getenv().

template <ConfigID Config> class SYCLConfig {
  using BaseT = SYCLConfigBase<Config>;

public:
  static const char *get() {
    const char *ValStr = getRawValue();
    return ValStr;
  }

private:
  static const char *getRawValue() {
    if (ConfigFromEnvEnabled)
      if (const char *ValStr = getenv(BaseT::MConfigName))
        return ValStr;

m_Data = (Env ? std::atoi(Env) : 0);
}

static const std::map<std::string, Backend> SyclBeMap{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we convert it into a pod datatype to avoid having complex global objects?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

if (PluginNames.empty() && EnableTrace)
std::cerr << "No Plugins Found." << std::endl;
if (PluginNames.empty() && trace())
std::cerr << "SYCL_PI_TRACE[-1]: No Plugins Found." << std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What [-1] stands for?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This says what trace-level triggered this trace line. It is useful to grep the full trace for only specific levels.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear to me why do you need to grep for specific trace levels if you can just enable only particular one instead

Moreover, -1 stands for ALL, so, it doesn't really provide information about particular trace-level

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexeySachkov Do you think we should not print labels like "SYCL_PI_TRACE[-1]:" at all?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@againull, I think so. From my point of view, it is better to put each message into particular category to be able enable/disable it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "SYCL_PI_TRACE" (without the mask) prefix is useful to separate out program output from the SYCL RT traces. The mask is useful to find a concise group of messages if provided a large full trace from outside. The mask also tells what mask should be used if you are just interested in this particular group of messages.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we print PI_TRACE_ALL instead of hard coding "-1"? Or event have a helper which checks if some level of printing is enabled and adds it to the beginning of the printed message. Asking because with current implementation this implicit alignment between if statement and content of print can easily go out of sync.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we can improve this further (probably in subsequent PRs), but printing -1 is probably better that PI_TARCE_ALL, because -1 is what users specify on command line.

PluginNames.push_back(std::make_pair<std::string, Backend>(
OPENCL_PLUGIN_NAME, SYCL_BE_PI_OPENCL));
PluginNames.push_back(
std::make_pair<std::string, Backend>(CUDA_PLUGIN_NAME, SYCL_BE_PI_CUDA));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this will force the OpenCL NVIDIA platform to come before the CUDA backend...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct, by default it would be OpenCL. The explicit SYCL_BE=PI_CUDA would change that to CUDA.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's please move this preference inside the device_selector::select_device(). It is too subtle to have it like here. In select_device() you'd just choose the device of the preferred BE if multiple BE get the same score.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return true;
}
} else if (PreferredBE == RT::Backend::SYCL_BE_PI_CUDA) {
if (PlatformVersion.find("CUDA") != std::string::npos) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PlatformVersion value for NVIDIA OpenCL also contains the string CUDA, so this will pick up OpenCL NVIDIA first. See output of NVIDIA OpenCL platform from clinfo:

Patform Vendor                                 NVIDIA Corporation
Platform Version                                OpenCL 1.2 CUDA 10.1.236     

The solution is to change the string of the PI CUDA backend to something more specific that we can search or alternatively find CUDA string but ensure the OpenCL one is not there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually the useBackend or now preferredBackend isn't used anywhere else than the device_selector with this change. Do we really need that runtime code in pi.hpp, or will it be simpler to evaluate the ENV in device selector? especially as preferredBackend will always return OpenCl when no env is set :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now think we should just get the BE from the device's plugin, instead of querying their names.

auto BE = detail::getSyclObjImpl(dev)->getPlugin().getBackend();

Then the isDeviceOfPreferredSyclBe can be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

@smaslov-intel smaslov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix a few pointed points

@againull againull force-pushed the multiple_plugins branch 2 times, most recently from 69922a9 to bee65ad Compare April 13, 2020 08:04
static constexpr char SYCL_PI_TRACE[] = "SYCL_PI_TRACE";

static Backend getBE(const char *EnvVar) {
const char *BE = std::getenv(SYCL_BE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this supposed to be something like std::getenv(*EnvVar)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my mistake, fixed.

@againull againull force-pushed the multiple_plugins branch 2 times, most recently from 981cb93 to 3296134 Compare April 13, 2020 18:13
// Lists valid environment variables.
static constexpr char SYCL_BE[] = "SYCL_BE";
static constexpr char SYCL_INTEROP_BE[] = "SYCL_INTEROP_BE";
static constexpr char SYCL_PI_TRACE[] = "SYCL_PI_TRACE";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static constexpr char SYCL_PI_TRACE[] = "SYCL_PI_TRACE";
static constexpr auto SYCL_PI_TRACE = "SYCL_PI_TRACE";

seems simpler

}
Plugins.push_back(plugin(PluginInformation));
Plugins.push_back(plugin(PluginInformation, PluginNames[I].second));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps an .emplace_back could work here?

for (const auto &dev : devices)
if (score < operator()(dev)) {
for (const auto &dev : devices) {
int dev_score = operator()(dev);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int dev_score = operator()(dev);
int dev_score = (*this)(dev);

looks clearer to me

res = &dev;
score = operator()(dev);
score = dev_score;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reminds me that we have never said anything about the fact that the device selector should always return the same value for the same device... ;-) At least calling it once and caching the result like here is safer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, at some point that was intentional :-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is legal probably and might still be intentional. For example imagine you want to implement the random_selector. :-)

At least we want the selector to be called only once per device by the SYCL runtime inside 1 SYCL consumption or some havoc can happen...

// are processed only once, like reading a string from environment
// and converting it into a typed object.
//
template <typename T, const char *E> class Config {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using SYCLConfig in this case is the same as just using getenv. Introduced Config class is a singleton and was created to read environment variables only once (i.e. for efficiency).

SYCLConfig does the same - it reads only once.

But getPreferredBE is called only once in device_selector and tracing is debugging feature, so this kind of over-complication doesn't make sense.
I believe for tracing it is called each time we call PI API. So, reading env var each time can be costly.

if (PluginNames.empty() && EnableTrace)
std::cerr << "No Plugins Found." << std::endl;
if (PluginNames.empty() && trace())
std::cerr << "SYCL_PI_TRACE[-1]: No Plugins Found." << std::endl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we print PI_TRACE_ALL instead of hard coding "-1"? Or event have a helper which checks if some level of printing is enabled and adds it to the beginning of the printed message. Asking because with current implementation this implicit alignment between if statement and content of print can easily go out of sync.

@againull againull requested review from kbobrovs, pvchupin and a team as code owners April 22, 2020 00:30
smaslov-intel
smaslov-intel previously approved these changes Apr 22, 2020
Copy link
Contributor

@smaslov-intel smaslov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@bader
Copy link
Contributor

bader commented Apr 22, 2020

@againull, please, fix tests on Linux and Windows.

This commit enables including multiple devices of the same device_type
and changed the logic of device selection to just prefer a SYCL_BE
device if present. If someone uses SYCL_BE but appropriate device is
not present, we will simply use any other device.

Signed-off-by: Artur Gainullin <artur.gainullin@intel.com>
* Specialize SYCLConfig template class for SYCL_BE and SYCL_PI_TRACE.
* Reuse SYCLConfig instead of introducing new Config class in PI.
* Use recently introduced backend enum instead of pi::Backend enum.
* Print label instead of number during PI_TRACE.
* Introduce helper that returns label depending on level of tracing
* Force SYCL RT to use specified backend when SYCL_BE is set.
If SYCL_BE is not specified then SYCL RT is not forced to use specific
backend. But make opencl backend preferred.
* Update docs with info about SYCL_BE and SYCL_PI_TRACE

Signed-off-by: Artur Gainullin <artur.gainullin@intel.com>
Signed-off-by: Artur Gainullin <artur.gainullin@intel.com>
Signed-off-by: Artur Gainullin <artur.gainullin@intel.com>
smaslov-intel
smaslov-intel previously approved these changes Apr 22, 2020
static backend *get() {
static bool Initialized = false;
static bool IsSet = false;
static backend Backend = backend::opencl;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, consider having backend* initialized to nullptr here.

return IsSet ? &Backend : nullptr;

const char *ValStr = BaseT::getRawValue();
const std::map<std::string, backend> SyclBeMap{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, consider using std::array.

Signed-off-by: Artur Gainullin <artur.gainullin@intel.com>
Signed-off-by: Artur Gainullin <artur.gainullin@intel.com>
@againull againull requested a review from romanovvlad April 22, 2020 23:44
@bader bader requested a review from smaslov-intel April 23, 2020 09:41
Copy link
Contributor

@romanovvlad romanovvlad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, do not force push as it kills possibility to track changes.

Copy link
Contributor

@kbobrovs kbobrovs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sycl/source/detail/program_manager/program_manager.cpp
sycl/doc/EnvironmentVariables.md
LGTM

@bader bader merged commit 937fec1 into intel:sycl Apr 23, 2020
alexbatashev pushed a commit to alexbatashev/llvm that referenced this pull request Apr 26, 2020
…_docs

* origin/sycl:
  [XPTI][Framework] Reference implementation of the Xpti framework to be used with instrumentation in SYCL (intel#1557)
  [SYCL] Initial ABI checks implementation (intel#1528)
  [SYCL] Support connection with multiple plugins (intel#1490)
  [SYCL] Add a new header file with the reduction class definition (intel#1558)
  [SYCL] Add test for SYCL kernels with accessor and spec constant (intel#1536)
alexbatashev pushed a commit to alexbatashev/llvm that referenced this pull request Apr 29, 2020
…versioning

* origin/sycl:
  [XPTI][Framework] Reference implementation of the Xpti framework to be used with instrumentation in SYCL (intel#1557)
  [SYCL] Initial ABI checks implementation (intel#1528)
  [SYCL] Support connection with multiple plugins (intel#1490)
  [SYCL] Add a new header file with the reduction class definition (intel#1558)
  [SYCL] Add test for SYCL kernels with accessor and spec constant (intel#1536)
  [SYCL][CUDA] Move interop tests (intel#1570)
  [Driver][SYCL] Remove COFF object format designator for Windows device compiles (intel#1574)
  [SYCL] Fix conflicting visibility attributes (intel#1571)
  [SYCL][DOC] Update the SYCL Runtime Interface document with design details (intel#680)
  [SYCL] Improve image accessors support on a host device (intel#1502)
  [SYCL] Make queue's non-USM event ownership temporary (intel#1561)
  [SYCL] Added support of rounding modes for non-host devices (intel#1463)
  [SYCL] SemaSYCL significant refactoring (intel#1517)
  [SYCL] Support 0-dim accessor in handler::copy(accessor, accessor) (intel#1551)
@againull againull deleted the multiple_plugins branch December 3, 2022 00:03
aelovikov-intel pushed a commit to aelovikov-intel/llvm that referenced this pull request Feb 23, 2023
`cl_*` types are defined by SYCL spec as interoperability interface with
OpenCL. There is no need to use them in most of SYCL tests.

This change is mostly motivated by the fact that those `cl_*` type aliases
were moved from `sycl` into `sycl::opencl` namespace: we simply want to
avoid tests breakage when we align the implementation with the spec.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants