Skip to content

Conversation

@asudarsa
Copy link
Contributor

Signed-off-by: Arvind Sudarsanam arvind.sudarsanam@intel.com

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@asudarsa asudarsa requested a review from a team as a code owner June 23, 2022 15:48
@asudarsa
Copy link
Contributor Author

I found that the number of queues for a sub sub device was not being set via the ZeProperties structure. So, we have not been creating a L0 queue for sub sub devices. This change resolves that.

if (SubSubDeviceOrdinal >= 0) {
QueueGroup[queue_group_info_t::Compute].ZeOrdinal = SubSubDeviceOrdinal;
QueueGroup[queue_group_info_t::Compute].ZeIndex = SubSubDeviceIndex;
QueueGroup[queue_group_info_t::Compute].ZeProperties.numQueues = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we initialize all fields of ZeProperties (there are more of them, see https://spec.oneapi.io/level-zero/latest/core/api.html?highlight=numqueues#ze-command-queue-group-properties-t)?

Can we reuse zeDeviceGetCommandQueueGroupProperties result from here https://github.com/intel/llvm/pull/6349/files#diff-15dd1eb076d2164bd9e87d9057f05f652a716498e8cdf5975e564c65309a0985R684-R685? If I am not mistaken ZeProperties is supposed to be initialized using zeDeviceGetCommandQueueGroupProperties and not manually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

L0 backend does not have any subsub devices. In the current implementation, we piggy back on the sub-device to act as the ZeDevice for subsubdevice. So, the zeDeviceGetCommandQueueGroupProperties returns the properties for sub-device and not the subsub device. So, that's why I am not using it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, about initializing all other fields. I think we can try to initialize minimal number of fields. Currently, we need only the number of queues.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, but I have troubles understanding why it is ok to initialize only numQueues.

For example, at this part of code in enqueueMemFillHelper we use maxMemoryFillPatternSize:
https://github.com/intel/llvm/blob/sycl/sycl/plugins/level_zero/pi_level_zero.cpp#L6399-L6419

If Device at this part of code is a sub-sub device then we use uninitialized field maxMemoryFillPatternSize there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @asudarsa mentions, L0 doesn't have sub-sub-devices, to the properties you can get are the ones from the sub-device. I think that we have two options:

  • whenever querying for props of a sub-sub-devices, query the ones from the parent sub-devices, either sub-device->props or a pointer to it.
  • copy them at initialization.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree for sub-sub-device we need to initialize the entire ZeProperties (of type ze_command_queue_group_properties_t), populate it with what it's parent sub-device has in zeDeviceGetCommandQueueGroupProperties and then update as makes sense, e.g. set numQueues = 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @smaslov-intel and @jandres742

Thanks so much for the feedback. As you mentioned, I think it is a good idea to initialize the entire ZeProperties structure using data gathered for sub-device. However, zeDeviceGetCommandQueueGroupProperties returns multiple ZeProperties, one for each group. I think I will try to use the one for 'compute' group. Please let me know if there are questions.

Thanks again

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a new commit addressing changes. Thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but please add a comment why "numQueues" is updated for sub-sub-device

@asudarsa asudarsa requested a review from rbegam June 24, 2022 20:17
Copy link
Contributor

@smaslov-intel smaslov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rework as suggested in comments

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@asudarsa
Copy link
Contributor Author

Added E2E test here.
intel/llvm-test-suite#1069

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@againull againull merged commit 4b1b01b into intel:sycl Jun 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants