-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL] Support connection with multiple plugins #1490
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just few minor comments
sycl/source/detail/pi.cpp
Outdated
// are processed only once, like reading a string from environment | ||
// and converting it into a typed object. | ||
// | ||
template <typename T, const char *E> class Config { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest reusing SYCLConfig from source/detail/config.hpp instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using SYCLConfig in this case is the same as just using getenv. Introduced Config class is a singleton and was created to read environment variables only once (i.e. for efficiency). But getPreferredBE is called only once in device_selector and tracing is debugging feature, so this kind of over-complication doesn't make sense. I removed singleton, so that we don't need any static variables and global variables. I think it is ok just to read environment variables in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using SYCLConfig in this case is the same as just using getenv. Introduced Config class is a singleton and was created to read environment variables only once (i.e. for efficiency).
SYCLConfig does the same - it reads only once.
But getPreferredBE is called only once in device_selector and tracing is debugging feature, so this kind of over-complication doesn't make sense.
I believe for tracing it is called each time we call PI API. So, reading env var each time can be costly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@romanovvlad Sorry, but looking to the code I don't understand why. When we call get() we always call getRawValue() which calls getenv().
template <ConfigID Config> class SYCLConfig {
using BaseT = SYCLConfigBase<Config>;
public:
static const char *get() {
const char *ValStr = getRawValue();
return ValStr;
}
private:
static const char *getRawValue() {
if (ConfigFromEnvEnabled)
if (const char *ValStr = getenv(BaseT::MConfigName))
return ValStr;
sycl/source/detail/pi.cpp
Outdated
m_Data = (Env ? std::atoi(Env) : 0); | ||
} | ||
|
||
static const std::map<std::string, Backend> SyclBeMap{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we convert it into a pod datatype to avoid having complex global objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
sycl/source/detail/pi.cpp
Outdated
if (PluginNames.empty() && EnableTrace) | ||
std::cerr << "No Plugins Found." << std::endl; | ||
if (PluginNames.empty() && trace()) | ||
std::cerr << "SYCL_PI_TRACE[-1]: No Plugins Found." << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What [-1] stands for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This says what trace-level triggered this trace line. It is useful to grep the full trace for only specific levels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not clear to me why do you need to grep for specific trace levels if you can just enable only particular one instead
Moreover, -1
stands for ALL
, so, it doesn't really provide information about particular trace-level
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlexeySachkov Do you think we should not print labels like "SYCL_PI_TRACE[-1]:" at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@againull, I think so. From my point of view, it is better to put each message into particular category to be able enable/disable it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "SYCL_PI_TRACE" (without the mask) prefix is useful to separate out program output from the SYCL RT traces. The mask is useful to find a concise group of messages if provided a large full trace from outside. The mask also tells what mask should be used if you are just interested in this particular group of messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we print PI_TRACE_ALL instead of hard coding "-1"? Or event have a helper which checks if some level of printing is enabled and adds it to the beginning of the printed message. Asking because with current implementation this implicit alignment between if statement and content of print can easily go out of sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, we can improve this further (probably in subsequent PRs), but printing -1 is probably better that PI_TARCE_ALL, because -1 is what users specify on command line.
sycl/source/detail/pi.cpp
Outdated
PluginNames.push_back(std::make_pair<std::string, Backend>( | ||
OPENCL_PLUGIN_NAME, SYCL_BE_PI_OPENCL)); | ||
PluginNames.push_back( | ||
std::make_pair<std::string, Backend>(CUDA_PLUGIN_NAME, SYCL_BE_PI_CUDA)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this will force the OpenCL NVIDIA platform to come before the CUDA backend...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct, by default it would be OpenCL. The explicit SYCL_BE=PI_CUDA would change that to CUDA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's please move this preference inside the device_selector::select_device(). It is too subtle to have it like here. In select_device() you'd just choose the device of the preferred BE if multiple BE get the same score.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
sycl/source/device_selector.cpp
Outdated
return true; | ||
} | ||
} else if (PreferredBE == RT::Backend::SYCL_BE_PI_CUDA) { | ||
if (PlatformVersion.find("CUDA") != std::string::npos) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PlatformVersion value for NVIDIA OpenCL also contains the string CUDA, so this will pick up OpenCL NVIDIA first. See output of NVIDIA OpenCL platform from clinfo:
Patform Vendor NVIDIA Corporation
Platform Version OpenCL 1.2 CUDA 10.1.236
The solution is to change the string of the PI CUDA backend to something more specific that we can search or alternatively find CUDA string but ensure the OpenCL one is not there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually the useBackend
or now preferredBackend
isn't used anywhere else than the device_selector with this change. Do we really need that runtime code in pi.hpp
, or will it be simpler to evaluate the ENV in device selector? especially as preferredBackend
will always return OpenCl
when no env is set :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I now think we should just get the BE from the device's plugin, instead of querying their names.
auto BE = detail::getSyclObjImpl(dev)->getPlugin().getBackend();
Then the isDeviceOfPreferredSyclBe
can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix a few pointed points
69922a9
to
bee65ad
Compare
sycl/source/detail/pi.cpp
Outdated
static constexpr char SYCL_PI_TRACE[] = "SYCL_PI_TRACE"; | ||
|
||
static Backend getBE(const char *EnvVar) { | ||
const char *BE = std::getenv(SYCL_BE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this supposed to be something like std::getenv(*EnvVar)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, my mistake, fixed.
981cb93
to
3296134
Compare
sycl/source/detail/pi.cpp
Outdated
// Lists valid environment variables. | ||
static constexpr char SYCL_BE[] = "SYCL_BE"; | ||
static constexpr char SYCL_INTEROP_BE[] = "SYCL_INTEROP_BE"; | ||
static constexpr char SYCL_PI_TRACE[] = "SYCL_PI_TRACE"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static constexpr char SYCL_PI_TRACE[] = "SYCL_PI_TRACE"; | |
static constexpr auto SYCL_PI_TRACE = "SYCL_PI_TRACE"; |
seems simpler
sycl/source/detail/pi.cpp
Outdated
} | ||
Plugins.push_back(plugin(PluginInformation)); | ||
Plugins.push_back(plugin(PluginInformation, PluginNames[I].second)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps an .emplace_back
could work here?
sycl/source/device_selector.cpp
Outdated
for (const auto &dev : devices) | ||
if (score < operator()(dev)) { | ||
for (const auto &dev : devices) { | ||
int dev_score = operator()(dev); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int dev_score = operator()(dev); | |
int dev_score = (*this)(dev); |
looks clearer to me
res = &dev; | ||
score = operator()(dev); | ||
score = dev_score; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reminds me that we have never said anything about the fact that the device selector should always return the same value for the same device... ;-) At least calling it once and caching the result like here is safer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, at some point that was intentional :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is legal probably and might still be intentional. For example imagine you want to implement the random_selector
. :-)
At least we want the selector to be called only once per device by the SYCL runtime inside 1 SYCL consumption or some havoc can happen...
4aab980
to
2a94663
Compare
sycl/source/detail/pi.cpp
Outdated
// are processed only once, like reading a string from environment | ||
// and converting it into a typed object. | ||
// | ||
template <typename T, const char *E> class Config { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using SYCLConfig in this case is the same as just using getenv. Introduced Config class is a singleton and was created to read environment variables only once (i.e. for efficiency).
SYCLConfig does the same - it reads only once.
But getPreferredBE is called only once in device_selector and tracing is debugging feature, so this kind of over-complication doesn't make sense.
I believe for tracing it is called each time we call PI API. So, reading env var each time can be costly.
sycl/source/detail/pi.cpp
Outdated
if (PluginNames.empty() && EnableTrace) | ||
std::cerr << "No Plugins Found." << std::endl; | ||
if (PluginNames.empty() && trace()) | ||
std::cerr << "SYCL_PI_TRACE[-1]: No Plugins Found." << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we print PI_TRACE_ALL instead of hard coding "-1"? Or event have a helper which checks if some level of printing is enabled and adds it to the beginning of the printed message. Asking because with current implementation this implicit alignment between if statement and content of print can easily go out of sync.
859305d
to
2d89167
Compare
7109cd8
to
c8e1512
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
@againull, please, fix tests on Linux and Windows. |
c8e1512
to
682a0a0
Compare
This commit enables including multiple devices of the same device_type and changed the logic of device selection to just prefer a SYCL_BE device if present. If someone uses SYCL_BE but appropriate device is not present, we will simply use any other device. Signed-off-by: Artur Gainullin <artur.gainullin@intel.com>
* Specialize SYCLConfig template class for SYCL_BE and SYCL_PI_TRACE. * Reuse SYCLConfig instead of introducing new Config class in PI. * Use recently introduced backend enum instead of pi::Backend enum. * Print label instead of number during PI_TRACE. * Introduce helper that returns label depending on level of tracing * Force SYCL RT to use specified backend when SYCL_BE is set. If SYCL_BE is not specified then SYCL RT is not forced to use specific backend. But make opencl backend preferred. * Update docs with info about SYCL_BE and SYCL_PI_TRACE Signed-off-by: Artur Gainullin <artur.gainullin@intel.com>
Signed-off-by: Artur Gainullin <artur.gainullin@intel.com>
Signed-off-by: Artur Gainullin <artur.gainullin@intel.com>
sycl/source/detail/config.hpp
Outdated
static backend *get() { | ||
static bool Initialized = false; | ||
static bool IsSet = false; | ||
static backend Backend = backend::opencl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, consider having backend* initialized to nullptr here.
sycl/source/detail/config.hpp
Outdated
return IsSet ? &Backend : nullptr; | ||
|
||
const char *ValStr = BaseT::getRawValue(); | ||
const std::map<std::string, backend> SyclBeMap{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, consider using std::array.
Signed-off-by: Artur Gainullin <artur.gainullin@intel.com>
682a0a0
to
cb7b4eb
Compare
Signed-off-by: Artur Gainullin <artur.gainullin@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, do not force push as it kills possibility to track changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sycl/source/detail/program_manager/program_manager.cpp
sycl/doc/EnvironmentVariables.md
LGTM
…_docs * origin/sycl: [XPTI][Framework] Reference implementation of the Xpti framework to be used with instrumentation in SYCL (intel#1557) [SYCL] Initial ABI checks implementation (intel#1528) [SYCL] Support connection with multiple plugins (intel#1490) [SYCL] Add a new header file with the reduction class definition (intel#1558) [SYCL] Add test for SYCL kernels with accessor and spec constant (intel#1536)
…versioning * origin/sycl: [XPTI][Framework] Reference implementation of the Xpti framework to be used with instrumentation in SYCL (intel#1557) [SYCL] Initial ABI checks implementation (intel#1528) [SYCL] Support connection with multiple plugins (intel#1490) [SYCL] Add a new header file with the reduction class definition (intel#1558) [SYCL] Add test for SYCL kernels with accessor and spec constant (intel#1536) [SYCL][CUDA] Move interop tests (intel#1570) [Driver][SYCL] Remove COFF object format designator for Windows device compiles (intel#1574) [SYCL] Fix conflicting visibility attributes (intel#1571) [SYCL][DOC] Update the SYCL Runtime Interface document with design details (intel#680) [SYCL] Improve image accessors support on a host device (intel#1502) [SYCL] Make queue's non-USM event ownership temporary (intel#1561) [SYCL] Added support of rounding modes for non-host devices (intel#1463) [SYCL] SemaSYCL significant refactoring (intel#1517) [SYCL] Support 0-dim accessor in handler::copy(accessor, accessor) (intel#1551)
`cl_*` types are defined by SYCL spec as interoperability interface with OpenCL. There is no need to use them in most of SYCL tests. This change is mostly motivated by the fact that those `cl_*` type aliases were moved from `sycl` into `sycl::opencl` namespace: we simply want to avoid tests breakage when we align the implementation with the spec.
This commit enables including multiple devices of the same device_type
and changed the logic of device selection to just prefer a SYCL_BE
device if present. If someone uses SYCL_BE but appropriate device is
not present, we will simply use any other device.
Signed-off-by: Artur Gainullin artur.gainullin@intel.com