
Add Tensor constructor that just requires a size instead of a full data array #14

Open
axsaucedo opened this issue Aug 29, 2020 · 5 comments


@axsaucedo
Member

Currently, Tensors have a constructor that only allows them to be created from a full data array. This is sub-optimal when staging or output tensors are created, given that the data would be copied only to infer the size of the buffer and then overwritten (unless the staging tensor is being used to copy from host to device). This issue covers adding a constructor, or some other means, to create tensors without a data array.
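
For illustration, a rough sketch of how a size-only constructor could be used compared to the current API (the kp::Tensor signature and the TensorTypes::eDevice value are assumptions based on the API at the time, so treat this purely as a sketch):

// Hypothetical: construct an output tensor from a size only, no host data needed.
auto tensorOut = std::make_shared<kp::Tensor>(1024, kp::Tensor::TensorTypes::eDevice);

// Current approach: a throwaway host vector is allocated and copied just to
// communicate the buffer size.
auto tensorOut2 = std::make_shared<kp::Tensor>(std::vector<float>(1024),
                                               kp::Tensor::TensorTypes::eDevice);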

@unexploredtest
Contributor

unexploredtest commented Jan 23, 2021

So, I tried making this possible, and this is what I have done so far.
Here is the constructor that I created:

Tensor::Tensor(int tensorSize, TensorTypes tensorType)
{
#if DEBUG
    SPDLOG_DEBUG("Kompute Tensor constructor data length: {}, and type: {}",
                 tensorSize,
                 tensorType);
#endif

    std::vector<float> data;
    data.reserve(tensorSize); // allocates capacity only; data.size() stays 0

    this->mData = data;
    this->mShape = { static_cast<uint32_t>(tensorSize) };
    this->mTensorType = tensorType;
}

I tackled the problem by reserving the data, so that it won't take up memory, which is what this issue intends to achieve. Also, as there are no floats inside the Tensor yet, this->mShape has to be given { static_cast<uint32_t>(tensorSize) } directly, because data.size() will yield 0. However, when I tested it, it resulted in a segmentation fault. I traced it, and it happens in:

    commandBuffer->copyBuffer(
      *copyFromTensor->mBuffer, *this->mBuffer, copyRegion);

(Tensor.cpp)
I couldn't figure out why it results in a segmentation fault. Any ideas? Is reserve() not a viable solution?
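
For reference, a minimal stand-alone snippet showing the reserve()/resize() distinction that seems relevant here; this is plain standard C++ behaviour, not Kompute code:

#include <cassert>
#include <vector>

int main()
{
    std::vector<float> data;

    data.reserve(1024);              // allocates capacity only
    assert(data.size() == 0);        // ...so the vector is still empty
    assert(data.capacity() >= 1024);

    data.resize(1024);               // actually creates 1024 zero-initialised floats
    assert(data.size() == 1024);
    return 0;
}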

@alexander-g
Contributor

I don't know how to solve this, but here are some hints:

  • That commandBuffer->copyBuffer line is in the function Tensor::recordCopyFrom().
  • This function was most likely called from OpTensorCreate::record(), trying to copy from a staging Tensor (host) to your Tensor (device).
  • The staging Tensor was created in OpTensorCreate::init() (line 47) with the parameter tensor->data(), which refers to your Tensor.
  • tensor->data() returns your zero-sized std::vector, which results in a zero-sized staging Tensor.
  • The segfault is then probably because copyBuffer tries to copy buffers of different sizes.

Having said that, I believe this issue is actually about avoiding this copyBuffer command completely, because host->device data transfer is slow.
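
Following those hints, one possible workaround (only a sketch, and it still allocates host memory, so it does not achieve what this issue is ultimately after) would be to size the host vector so that tensor->data() and the resulting staging Tensor match the device buffer:

// Sketch only: makes mData non-empty so the staging Tensor created from
// tensor->data() has the same size as the device buffer being copied into.
Tensor::Tensor(int tensorSize, TensorTypes tensorType)
{
    this->mData = std::vector<float>(static_cast<size_t>(tensorSize)); // size > 0, zero-filled
    this->mShape = { static_cast<uint32_t>(tensorSize) };
    this->mTensorType = tensorType;
}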

@unexploredtest
Contributor

unexploredtest commented Jan 23, 2021

Yeah, I see, thanks. I dug deeper into tensor->data(). Even though tensor->data() has zero size, the method Tensor::memorySize() returns the desired value, because we have explicitly set mShape in the constructor, so it works in most cases (and Tensor::memorySize() is what is used when creating buffers). There is one place that might have caused the problem: Tensor::mapDataIntoHostMemory() (Tensor.cpp), in memcpy(mapped, this->mData.data(), bufferSize); maybe this is the cause.
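
To make that memcpy concern concrete, here is a minimal stand-alone illustration of the size mismatch (variable names mirror the discussion above, but this is not the actual Kompute code):

#include <cstring>
#include <vector>

int main()
{
    std::vector<float> mData;                        // empty: reserve() added no elements
    std::vector<float> mapped(1024);                 // stands in for the mapped staging memory
    std::size_t bufferSize = 1024 * sizeof(float);   // computed from the shape, not from mData

    // Mirrors memcpy(mapped, this->mData.data(), bufferSize): reading bufferSize
    // bytes from a zero-length source is undefined behaviour, hence a likely crash.
    // std::memcpy(mapped.data(), mData.data(), bufferSize);  // intentionally commented out
    return 0;
}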

> Having said that, I believe this issue is actually about avoiding this copyBuffer command completely, because host->device data transfer is slow.

👍

@alexander-g
Contributor

@axsaucedo I've taken a short look at this and have some questions:

  • What's the point of staging tensors? Especially, why give the user the option to create one manually? What would be the use case?
  • How do you use eStorage tensors? They seem to be currently unusable. I cannot use OpTensorCreate with them because this op tries to call mapDataIntoHostMemory() on them, which is only valid for staging tensors. I also cannot manually call init() because I cannot access the Manager's vk::PhysicalDevice and vk::Device, which are private.

So my suggestion would be to have only one type of tensor, which automatically creates a buffer on the device. If the user wants to transfer data to/from the device, a host-side Vulkan buffer could be created on the fly and deleted after the transfer; otherwise it acts as device-only memory.
There is no need to keep a duplicate of the data on the host; let the user manage host-side memory themselves. This would also remove the need for #21
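
A rough sketch of that suggested flow, purely for illustration; every type and helper here (DeviceBuffer, StagingBuffer, createStagingBuffer, recordAndSubmitCopy, destroyStagingBuffer) is a hypothetical placeholder and does not exist in Kompute or Vulkan:

#include <cstddef>
#include <cstring>

// Hypothetical placeholders standing in for Vulkan buffer/memory handles.
struct DeviceBuffer {};
struct StagingBuffer { void* mapped = nullptr; };

StagingBuffer createStagingBuffer(std::size_t bytes);              // e.g. create + map a host-visible buffer
void recordAndSubmitCopy(StagingBuffer& src, DeviceBuffer& dst);   // e.g. record a buffer copy and submit it
void destroyStagingBuffer(StagingBuffer& staging);                 // unmap and destroy the temporary buffer

// Device-only tensor with an on-the-fly staging buffer: the host never keeps a
// persistent duplicate of the data once the transfer has completed.
void uploadToDevice(DeviceBuffer& deviceBuffer, const float* hostData, std::size_t numElements)
{
    StagingBuffer staging = createStagingBuffer(numElements * sizeof(float));
    std::memcpy(staging.mapped, hostData, numElements * sizeof(float));
    recordAndSubmitCopy(staging, deviceBuffer);
    destroyStagingBuffer(staging);   // the temporary host-visible buffer is released right away
}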

@axsaucedo
Member Author

Hmm, you raise a good point. You are right, I think the eStorage tensors are currently unusable as the op tries to call mapDataIntoHostMemory. Let me have a look at how this can be addressed. Storage tensors may still be useful if users want to make memory available within the GPU that is only used to store data while processing is being computed, but is not expected to be sent back to the host. I have opened #132 so we can explore further.

Regarding your second point, it does still make sense for some situations where people may want to use a hostVisible tensor, which would allow memory to be accessible from the CPU without the need for an operation that copies it into GPU-only visible memory. I saw your PR that addresses the initial issue of the data copy, so I will provide further thoughts there, and we can continue the discussion around the current Tensor functionality in issue #132.
