
Add Tensor constructor that just requires a size instead of a full data array #14

Open
axsaucedo opened this issue Aug 29, 2020 · 5 comments


@axsaucedo
Member

Currently, Tensors have a constructor that only allows them to be created from a full data array. This is sub-optimal when staging or output tensors are created, given that the data would be copied only to infer the size of the buffer and then overwritten (unless the staging tensor is being used to copy from host to device). This issue covers adding a constructor, or some other means, to create tensors without a data array.
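
For illustration, a rough sketch of how a size-only constructor could be used compared to the current API (the kp::Tensor signature and the TensorTypes::eDevice value are assumptions based on the API at the time, so treat this purely as a sketch):

// Hypothetical: construct an output tensor from a size only, no host data needed.
auto tensorOut = std::make_shared<kp::Tensor>(1024, kp::Tensor::TensorTypes::eDevice);

// Current approach: a throwaway host vector is allocated and copied just to
// communicate the buffer size.
auto tensorOut2 = std::make_shared<kp::Tensor>(std::vector<float>(1024),
                                               kp::Tensor::TensorTypes::eDevice);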

@unexploredtest
Contributor

unexploredtest commented Jan 23, 2021

So, I tried making this possible, and this is what I have done so far.
Here is the constructor that I created:

Tensor::Tensor(int tensorSize, TensorTypes tensorType)
{
#if DEBUG
    SPDLOG_DEBUG("Kompute Tensor constructor data length: {}, and type: {}",
                 tensorSize,
                 tensorType);
#endif

    std::vector<float> data;
    data.reserve(tensorSize); // allocates capacity only; data.size() stays 0

    this->mData = data;
    this->mShape = { static_cast<uint32_t>(tensorSize) };
    this->mTensorType = tensorType;
}

I tackled the problem by reserving the data, so that it won't take up memory, which is what this issue intends to achieve. Also, as there are no floats inside the Tensor yet, this->mShape has to be given { static_cast<uint32_t>(tensorSize) } directly, because data.size() will yield 0. However, when I tested it, it resulted in a segmentation fault. I traced it, and it happens in:

    commandBuffer->copyBuffer(
      *copyFromTensor->mBuffer, *this->mBuffer, copyRegion);

(Tensor.cpp)
I couldn't figure out why it results in a segmentation fault. Any ideas? Is reserve() not a viable solution?
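
For reference, a minimal stand-alone snippet showing the reserve()/resize() distinction that seems relevant here; this is plain standard C++ behaviour, not Kompute code:

#include <cassert>
#include <vector>

int main()
{
    std::vector<float> data;

    data.reserve(1024);              // allocates capacity only
    assert(data.size() == 0);        // ...so the vector is still empty
    assert(data.capacity() >= 1024);

    data.resize(1024);               // actually creates 1024 zero-initialised floats
    assert(data.size() == 1024);
    return 0;
}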

@alexander-g
Contributor

I don't know how to solve this, but here are some hints:

  • That commandBuffer->copyBuffer line is in the function Tensor::recordCopyFrom().
  • This function was most likely called from OpTensorCreate::record(), trying to copy from a staging Tensor (host) to your Tensor (device).
  • The staging Tensor was created in OpTensorCreate::init() (line 47) with the parameter tensor->data(), which refers to your Tensor.
  • tensor->data() returns your zero-sized std::vector, which results in a zero-sized staging Tensor.
  • The segfault is then probably because copyBuffer tries to copy buffers of different sizes.

Having said that, I believe this issue is actually about avoiding this copyBuffer command completely, because host->device data transfer is slow.
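
Following those hints, one possible workaround (only a sketch, and it still allocates host memory, so it does not achieve what this issue is ultimately after) would be to size the host vector so that tensor->data() and the resulting staging Tensor match the device buffer:

// Sketch only: makes mData non-empty so the staging Tensor created from
// tensor->data() has the same size as the device buffer being copied into.
Tensor::Tensor(int tensorSize, TensorTypes tensorType)
{
    this->mData = std::vector<float>(static_cast<size_t>(tensorSize)); // size > 0, zero-filled
    this->mShape = { static_cast<uint32_t>(tensorSize) };
    this->mTensorType = tensorType;
}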

@unexploredtest
Contributor

unexploredtest commented Jan 23, 2021

Yeah, I see, thanks. I dug deeper into tensor->data(). Even though tensor->data() has zero size, the method Tensor::memorySize() returns the desired value, because we have explicitly set mShape in the constructor, so it works in most cases (and Tensor::memorySize() is what is used when creating buffers). There is one place that might have caused the problem: Tensor::mapDataIntoHostMemory() (Tensor.cpp), in memcpy(mapped, this->mData.data(), bufferSize); maybe this is the cause.
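
To make that memcpy concern concrete, here is a minimal stand-alone illustration of the size mismatch (variable names mirror the discussion above, but this is not the actual Kompute code):

#include <cstring>
#include <vector>

int main()
{
    std::vector<float> mData;                        // empty: reserve() added no elements
    std::vector<float> mapped(1024);                 // stands in for the mapped staging memory
    std::size_t bufferSize = 1024 * sizeof(float);   // computed from the shape, not from mData

    // Mirrors memcpy(mapped, this->mData.data(), bufferSize): reading bufferSize
    // bytes from a zero-length source is undefined behaviour, hence a likely crash.
    // std::memcpy(mapped.data(), mData.data(), bufferSize);  // intentionally commented out
    return 0;
}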

> Having said that, I believe this issue is actually about avoiding this copyBuffer command completely, because host->device data transfer is slow.

👍

@alexander-g
Contributor

@axsaucedo I've taken a short look at this and have some questions:

  • What's the point of staging tensors? Especially, why give the user the option to create one manually? What would be the use case?
  • How do you use eStorage tensors? They seem to be currently unusable. I cannot use OpTensorCreate with them because this op tries to call mapDataIntoHostMemory() on them, which is only valid for staging tensors. I also cannot manually call init() because I cannot access the Manager's vk::PhysicalDevice and vk::Device, which are private.

So my suggestion would be to have only one type of tensor, which automatically creates a buffer on the device. If the user wants to transfer data to/from the device, a host-side Vulkan buffer could be created on the fly and deleted after the transfer; otherwise it acts as device-only memory.
There is no need to keep a duplicate of the data on the host; let the user manage host-side memory themselves. This would also remove the need for #21
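
A rough sketch of that suggested flow, purely for illustration; every type and helper here (DeviceBuffer, StagingBuffer, createStagingBuffer, recordAndSubmitCopy, destroyStagingBuffer) is a hypothetical placeholder and does not exist in Kompute or Vulkan:

#include <cstddef>
#include <cstring>

// Hypothetical placeholders standing in for Vulkan buffer/memory handles.
struct DeviceBuffer {};
struct StagingBuffer { void* mapped = nullptr; };

StagingBuffer createStagingBuffer(std::size_t bytes);              // e.g. create + map a host-visible buffer
void recordAndSubmitCopy(StagingBuffer& src, DeviceBuffer& dst);   // e.g. record a buffer copy and submit it
void destroyStagingBuffer(StagingBuffer& staging);                 // unmap and destroy the temporary buffer

// Device-only tensor with an on-the-fly staging buffer: the host never keeps a
// persistent duplicate of the data once the transfer has completed.
void uploadToDevice(DeviceBuffer& deviceBuffer, const float* hostData, std::size_t numElements)
{
    StagingBuffer staging = createStagingBuffer(numElements * sizeof(float));
    std::memcpy(staging.mapped, hostData, numElements * sizeof(float));
    recordAndSubmitCopy(staging, deviceBuffer);
    destroyStagingBuffer(staging);   // the temporary host-visible buffer is released right away
}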

@axsaucedo
Member Author

Hmm, you raise a good point. You are right, I think the eStorage tensors are currently unusable as the op tries to call mapDataIntoHostMemory. Let me have a look at how this can be addressed. Storage tensors may still be useful if users want to make memory available within the GPU that is only used to store data while processing is being computed, but is not expected to be sent back to the host. I have opened #132 so we can explore further.

Regarding your second point, it does still make sense for some situations where people may want to use a hostVisible tensor, which would allow memory to be accessible from the CPU without the need for an operation that copies it into GPU-only visible memory. I saw your PR that addresses the initial issue of the data copy, so I will provide further thoughts there, and we can continue the discussion around the current Tensor functionality in issue #132.
