Skip to content

Conversation

@dongkyunahn-intel
Copy link
Contributor

36 out of 40 LSC tests are passing

@dongkyunahn-intel dongkyunahn-intel requested a review from a team as a code owner May 4, 2022 21:15
static_assert(ImmOffset == 0);
static_assert(DS != __ESIMD_ENS::lsc_data_size::u16u32h);

constexpr uint MASK = loadstoreAlignMask<Ty, VS, DS, N>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this down to the place of use. Same is true for Output, ElemCount

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

constexpr uint MASK = loadstoreAlignMask<Ty, VS, DS, N>();
__ESIMD_DNS::vector_type_t<Ty, N * __ESIMD_EDNS::to_int<VS>()> Output = 0;

constexpr int ElemCount = __ESIMD_EDNS::to_int<VS>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think N's definition /// @tparam N is the number of channels (platform dependent). is wrong here and in many other places in this file. This is actually the simd size - the number of chunks of VS*sizeof(T) size. If we take rgba load/strore as reference, then VS would be the number of channels, not N. So ElemCount is actually the number of channels. It would be great if you could change N's definition to something like "the SIMD size of operation (the number of addresses to access)".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment updated.

ByteDistance += rawAddressIncrement<Ty, DS>(),
VecIdx += vectorIndexIncrement<N, _Transposed>()) {

Output[VecIdx] = *((Ty *)(addrs[AddrIdx] + ByteDistance));
Copy link
Contributor

@kbobrovs kbobrovs May 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please hoist addrs[AddrIdx] out of the loop

BaseAddr = addrs[AddrIdx];
assert((BaseAddr  & MASK) == 0 && "Address Alignment Error!!");

    for (...) {
      Output[VecIdx] = *((Ty *)(BaseAddr + ByteDistance));
...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

- Updating comments for definition of 'N' : SIMD operationsize
- addrs[] to BaseAddr
- Relocating variables close to the place of its use
- Elem* to Chanl*
/// @tparam Transposed indicates if the data is transposed during the transfer.
/// @tparam N is the number of channels (platform dependent).
/// @tparam N is the SIMD size of operation (the number of addresses to access,
/// platform dependent).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand what "platform-dependent" means for N, as this is what user passes in the code. Restrictions maybe platform-dependent, but that's normal situation for many other ESIMD API parameters, so I suggest to remove it here and on all other similar places

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed 'platform dependent' comment.

@dongkyunahn-intel
Copy link
Contributor Author

Unrelated failures from HIP AMDGPU LLVM Test Suite

Failed Tests (6):
  SYCL :: GroupAlgorithm/SYCL2020/exclusive_scan.cpp
  SYCL :: GroupAlgorithm/SYCL2020/inclusive_scan.cpp
  SYCL :: GroupAlgorithm/SYCL2020/reduce.cpp
  SYCL :: GroupAlgorithm/exclusive_scan.cpp
  SYCL :: GroupAlgorithm/inclusive_scan.cpp
  SYCL :: GroupAlgorithm/reduce.cpp

@pvchupin
Copy link
Contributor

pvchupin commented Jun 1, 2022

@dongkyunahn-intel, @kbobrovs, please check post-commit fails on Windows: https://github.com/intel/llvm/actions/runs/2416580740

@dongkyunahn-intel
Copy link
Contributor Author

Posted PR for fix - #6223

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants