
Amend memory ownership hierarchy to have Tensor owned by Manager instead of OpCreateTensor / OpBase #138

Merged · 27 commits · Feb 10, 2021
Commits (27) — showing changes from all commits
fc3d23d
Removed OpCreateTensor in favour of manager memory ownership
axsaucedo Feb 8, 2021
4dedfad
Updated tests to reflect manager tensor memory ownership
axsaucedo Feb 8, 2021
f356e64
Updated aggregate headers
axsaucedo Feb 8, 2021
984709a
Removd opcreatetensor from docs
axsaucedo Feb 8, 2021
f62e353
Removed persistent anonymous sequences
axsaucedo Feb 8, 2021
aa75fda
format
axsaucedo Feb 8, 2021
65cb1b7
Removed destroy tensor function to avoid error logs in test
axsaucedo Feb 8, 2021
aa25f98
Added OpTensorSyncDevice by default on manager buildtensor functions …
axsaucedo Feb 8, 2021
d7fe53e
Updated tests to align with manager parameters update
axsaucedo Feb 8, 2021
71cde2d
Updated single include header
axsaucedo Feb 8, 2021
3547810
reformat
axsaucedo Feb 8, 2021
667841d
Updated ccls to include python
axsaucedo Feb 9, 2021
dead40c
Added python target
axsaucedo Feb 9, 2021
6509758
Updated python to align with new structure
axsaucedo Feb 9, 2021
b34984b
Updating sequence to have isInit until init run
axsaucedo Feb 9, 2021
39d02dd
Added test that verifies memory violation sequence
axsaucedo Feb 9, 2021
9125220
updating single include
axsaucedo Feb 9, 2021
0d9a975
Renamed tensor and rebuild functions
axsaucedo Feb 9, 2021
4baba33
Updated tests to match new functions and added test to ensure seuqenc…
axsaucedo Feb 9, 2021
3e91a77
Updated docs to match functions
axsaucedo Feb 9, 2021
b243d43
Updated readme
axsaucedo Feb 9, 2021
4e9888e
Updated examples
axsaucedo Feb 9, 2021
1edcb42
Single include
axsaucedo Feb 9, 2021
d8041d6
Added python updated functions
axsaucedo Feb 9, 2021
a828bb9
Updated python tests
axsaucedo Feb 9, 2021
3c486eb
Updated test to cover sequences
axsaucedo Feb 9, 2021
48f041d
Updated the examples
axsaucedo Feb 9, 2021
1 change: 1 addition & 0 deletions .ccls
@@ -13,6 +13,7 @@
-DDEBUG=1
-DKOMPUTE_INCLUDE_FOR_SYNTAX

+-I/usr/include/python3.6/
-I./python/pybind11/include/
-I./external/Vulkan-Headers/include/
-I./external/googletest/googletest/include/
5 changes: 5 additions & 0 deletions Makefile
@@ -156,6 +156,11 @@ vs_run_tests: vs_build_tests
./build/test/$(VS_BUILD_TYPE)/test_kompute.exe --gtest_filter=$(FILTER_TESTS)


+#### PYTHONG ####
+
+test_python:
+	python -m pytest -s --log-cli-level=DEBUG -v python/test/
+
####### Run CI Commands #######

# This command uses act to replicate github action
6 changes: 3 additions & 3 deletions README.md
@@ -54,9 +54,9 @@ int main() {
kp::Manager mgr;

// 2. Create and initialise Kompute Tensors through manager
-auto tensorInA = mgr.buildTensor({ 2., 2., 2. });
-auto tensorInB = mgr.buildTensor({ 1., 2., 3. });
-auto tensorOut = mgr.buildTensor({ 0., 0., 0. });
+auto tensorInA = mgr.tensor({ 2., 2., 2. });
+auto tensorInB = mgr.tensor({ 1., 2., 3. });
+auto tensorOut = mgr.tensor({ 0., 0., 0. });

// 3. Specify "multiply shader" code (can also be raw string, spir-v bytes or file path)
std::string shaderString = (R"(
8 changes: 4 additions & 4 deletions docs/overview/advanced-examples.rst
@@ -97,7 +97,7 @@ Record commands in a single submit by using a Sequence to send in batch to GPU.
mgr.evalOpDefault<kp::OpCreateTensor>({tensorLHS, tensorRHS, tensorOutput});

// Create a new sequence
-std::weak_ptr<kp::Sequence> sqWeakPtr = mgr.getOrCreateManagedSequence();
+std::weak_ptr<kp::Sequence> sqWeakPtr = mgr.sequence();

if (std::shared_ptr<kp::Sequence> sq = sqWeakPtr.lock())
{
@@ -226,8 +226,8 @@ Back to `examples list <#simple-examples>`_.
// We need to create explicit sequences with their respective queues
// The second parameter is the index in the familyIndex array which is relative
// to the vector we created the manager with.
-mgr.createManagedSequence("queueOne", 0);
-mgr.createManagedSequence("queueTwo", 1);
+mgr.sequence("queueOne", 0);
+mgr.sequence("queueTwo", 1);

// Creates tensor an initializes GPU memory (below we show more granularity)
auto tensorA = std::make_shared<kp::Tensor>(kp::Tensor(std::vector<float>(10, 0.0)));
@@ -422,7 +422,7 @@ Now that we have the inputs and outputs we will be able to use them in the proce
kp::Manager mgr;

if (std::shared_ptr<kp::Sequence> sq =
-mgr.getOrCreateManagedSequence("createTensors").lock())
+mgr.sequence("createTensors").lock())
{
// ...

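The docs above keep returning a non-owning handle to a manager-owned sequence (a `std::weak_ptr` that is upgraded with `lock()` before use). The same pattern can be sketched in plain Python with the standard `weakref` module — the classes below are hypothetical stand-ins, not the Kompute bindings:

```python
import weakref


class Sequence:
    def __init__(self, name):
        self.name = name
        self.recorded = []


class Manager:
    """Owns its sequences; callers receive weak (non-owning) handles."""

    def __init__(self):
        self._sequences = {}

    def sequence(self, name="anonymous"):
        # Get-or-create semantics, mirroring mgr.sequence() in the diff above.
        if name not in self._sequences:
            self._sequences[name] = Sequence(name)
        return weakref.ref(self._sequences[name])  # ~ std::weak_ptr


mgr = Manager()
sq_weak = mgr.sequence("createTensors")
sq = sq_weak()  # ~ weak_ptr::lock(); returns None once the manager drops it
assert sq is not None
sq.recorded.append("op")
```

The design point is the same as in the C++ API: the manager controls sequence lifetime, so a caller's handle can never keep a destroyed sequence alive by accident.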
4 changes: 2 additions & 2 deletions docs/overview/async-parallel.rst
@@ -208,8 +208,8 @@ It's worth mentioning you can have multiple sequences referencing the same queue
// We need to create explicit sequences with their respective queues
// The second parameter is the index in the familyIndex array which is relative
// to the vector we created the manager with.
-mgr.createManagedSequence("queueOne", 0);
-mgr.createManagedSequence("queueTwo", 1);
+mgr.sequence("queueOne", 0);
+mgr.sequence("queueTwo", 1);

We create the tensors without modifications.

10 changes: 0 additions & 10 deletions docs/overview/reference.rst
@@ -86,16 +86,6 @@ The kp::OpMult operation is a sample implementation of the kp::OpAlgoBase class.
.. doxygenclass:: kp::OpMult
:members:

-OpTensorCreate
--------
-
-The kp::OpTensorCreate is a tensor only operations which initialises a kp::Tensor by creating the respective vk::Buffer and vk::Memory, as well as transferring the local data into the GPU.
-
-.. image:: ../images/kompute-vulkan-architecture-opcreatetensor.jpg
-   :width: 100%
-
-.. doxygenclass:: kp::OpTensorCreate
-   :members:

OpTensorCopy
-------
Original file line number Diff line number Diff line change
@@ -42,16 +42,9 @@ void KomputeModelML::train(std::vector<float> yData, std::vector<float> xIData,
kp::Manager mgr;

{
+mgr.rebuild(params);

-std::shared_ptr<kp::Sequence> sqTensor =
-mgr.createManagedSequence();
-
-sqTensor->begin();
-sqTensor->record<kp::OpTensorCreate>(params);
-sqTensor->end();
-sqTensor->eval();
-
-std::shared_ptr<kp::Sequence> sq = mgr.createManagedSequence();
+std::shared_ptr<kp::Sequence> sq = mgr.sequence();

// Record op algo base
sq->begin();
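The hunk above collapses the old boilerplate — create a sequence, `begin()`, record `OpTensorCreate`, `end()`, `eval()` — into a single `mgr.rebuild(params)` call. A plain-Python sketch of that simplification (hypothetical names, not the Kompute API):

```python
# Before this PR: five explicit steps on a dedicated sequence just to
# initialise tensors. After: one rebuild() call on the manager.

class Manager:
    def __init__(self):
        self._owned = []

    def rebuild(self, tensors, sync_to_device=True):
        # One call replaces: sequence(), begin(), record(OpTensorCreate),
        # end(), eval() — and the manager takes ownership as it goes.
        for t in tensors:
            t["initialized"] = sync_to_device
            self._owned.append(t)


params = [{"data": [0.0]}, {"data": [1.0, 2.0]}]
mgr = Manager()
mgr.rebuild(params)
assert all(t["initialized"] for t in params)
```

Besides being shorter, this removes a whole class of misuse: callers can no longer forget the sync/eval step between creating tensors and recording operations against them.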
6 changes: 3 additions & 3 deletions examples/array_multiplication/src/Main.cpp
@@ -14,9 +14,9 @@ int main()

kp::Manager mgr;

-auto tensorInA = mgr.buildTensor({ 2.0, 4.0, 6.0 });
-auto tensorInB = mgr.buildTensor({ 0.0, 1.0, 2.0 });
-auto tensorOut = mgr.buildTensor({ 0.0, 0.0, 0.0 });
+auto tensorInA = mgr.tensor({ 2.0, 4.0, 6.0 });
+auto tensorInB = mgr.tensor({ 0.0, 1.0, 2.0 });
+auto tensorOut = mgr.tensor({ 0.0, 0.0, 0.0 });

#ifdef KOMPUTE_ANDROID_SHADER_FROM_STRING
std::string shader(R"(
Original file line number Diff line number Diff line change
@@ -12,7 +12,7 @@ void KomputeSummatorNode::add(float value) {
// Set the new data in the local device
this->mSecondaryTensor->setData({value});
// Execute recorded sequence
-if (std::shared_ptr<kp::Sequence> sq = this->mSequence.lock()) {
+if (std::shared_ptr<kp::Sequence> sq = this->mSequence) {
sq->eval();
}
else {
@@ -29,12 +29,12 @@ float KomputeSummatorNode::get_total() const {

void KomputeSummatorNode::_init() {
std::cout << "CALLING INIT" << std::endl;
-this->mPrimaryTensor = this->mManager.buildTensor({ 0.0 });
-this->mSecondaryTensor = this->mManager.buildTensor({ 0.0 });
-this->mSequence = this->mManager.getOrCreateManagedSequence("AdditionSeq");
+this->mPrimaryTensor = this->mManager.tensor({ 0.0 });
+this->mSecondaryTensor = this->mManager.tensor({ 0.0 });
+this->mSequence = this->mManager.sequence("AdditionSeq");

// We now record the steps in the sequence
-if (std::shared_ptr<kp::Sequence> sq = this->mSequence.lock())
+if (std::shared_ptr<kp::Sequence> sq = this->mSequence)
{

std::string shader(R"(
@@ -59,7 +59,7 @@ void KomputeSummatorNode::_init() {
{ this->mSecondaryTensor });

// Then we run the operation with both tensors
-sq->record<kp::OpAlgoBase<>>(
+sq->record<kp::OpAlgoBase>(
{ this->mPrimaryTensor, this->mSecondaryTensor },
std::vector<char>(shader.begin(), shader.end()));

Original file line number Diff line number Diff line change
@@ -28,9 +28,9 @@ float KomputeSummator::get_total() const {

void KomputeSummator::_init() {
std::cout << "CALLING INIT" << std::endl;
-this->mPrimaryTensor = this->mManager.buildTensor({ 0.0 });
-this->mSecondaryTensor = this->mManager.buildTensor({ 0.0 });
-this->mSequence = this->mManager.getOrCreateManagedSequence("AdditionSeq");
+this->mPrimaryTensor = this->mManager.tensor({ 0.0 });
+this->mSecondaryTensor = this->mManager.tensor({ 0.0 });
+this->mSequence = this->mManager.sequence("AdditionSeq");

// We now record the steps in the sequence
{
Original file line number Diff line number Diff line change
@@ -50,15 +50,10 @@ void KomputeModelMLNode::train(Array yArr, Array xIArr, Array xJArr) {
{
kp::Manager mgr;

-std::shared_ptr<kp::Sequence> sqTensor =
-mgr.createManagedSequence();
+mgr.rebuild(params);

-sqTensor->begin();
-sqTensor->record<kp::OpTensorCreate>(params);
-sqTensor->end();
-sqTensor->eval();
-
-std::shared_ptr<kp::Sequence> sq = mgr.createManagedSequence();
+{
+std::shared_ptr<kp::Sequence> sq = mgr.sequence();

// Record op algo base
sq->begin();
Original file line number Diff line number Diff line change
@@ -55,15 +55,9 @@ void KomputeModelML::train(Array yArr, Array xIArr, Array xJArr) {
kp::Manager mgr;

{
-std::shared_ptr<kp::Sequence> sqTensor =
-mgr.createManagedSequence();
+mgr.rebuild(params);

-sqTensor->begin();
-sqTensor->record<kp::OpTensorCreate>(params);
-sqTensor->end();
-sqTensor->eval();
-
-std::shared_ptr<kp::Sequence> sq = mgr.createManagedSequence();
+std::shared_ptr<kp::Sequence> sq = mgr.sequence();

// Record op algo base
sq->begin();
10 changes: 2 additions & 8 deletions examples/logistic_regression/src/Main.cpp
@@ -35,15 +35,9 @@ int main()

kp::Manager mgr;

-std::shared_ptr<kp::Sequence> sqTensor =
-mgr.createManagedSequence();
+mgr.rebuild(params);

-sqTensor->begin();
-sqTensor->record<kp::OpTensorCreate>(params);
-sqTensor->end();
-sqTensor->eval();
-
-std::shared_ptr<kp::Sequence> sq = mgr.createManagedSequence();
+std::shared_ptr<kp::Sequence> sq = mgr.sequence();

// Record op algo base
sq->begin();
13 changes: 2 additions & 11 deletions python/src/docstrings.hpp
@@ -119,7 +119,7 @@ integrate with the vulkan kompute use.
@param device Vulkan logical device to use for all base resources
@param physicalDeviceIndex Index for vulkan physical device used)doc";

-static const char *__doc_kp_Manager_buildTensor =
+static const char *__doc_kp_Manager_tensor =
R"doc(Function that simplifies the common workflow of tensor creation and
initialization. It will take the constructor parameters for a Tensor
and will will us it to create a new Tensor and then create it using
@@ -133,15 +133,6 @@ static const char *__doc_kp_Manager_createDevice = R"doc()doc";

static const char *__doc_kp_Manager_createInstance = R"doc()doc";

-static const char *__doc_kp_Manager_createManagedSequence =
-R"doc(Create a new managed Kompute sequence so it's available within the
-manager.
-
-@param sequenceName The name for the named sequence to be created, if
-empty then default indexed value is used @param queueIndex The queue
-to use from the available queues @return Weak pointer to the manager
-owned sequence resource)doc";
-
static const char *__doc_kp_Manager_evalOp =
R"doc(Function that evaluates operation against named sequence.

@@ -187,7 +178,7 @@ R"doc(Function that evaluates operation against a newly created sequence.
TArgs Template parameters that will be used to initialise Operation to
allow for extensible configurations on initialisation)doc";

-static const char *__doc_kp_Manager_getOrCreateManagedSequence =
+static const char *__doc_kp_Manager_sequence =
R"doc(Get or create a managed Sequence that will be contained by this
manager. If the named sequence does not currently exist, it would be
created and initialised.
25 changes: 10 additions & 15 deletions python/src/main.cpp
@@ -105,8 +105,6 @@ PYBIND11_MODULE(kp, m) {
.def("is_init", &kp::Sequence::isInit, "Checks if the Sequence has been initialized")

// record
-.def("record_tensor_create", &kp::Sequence::record<kp::OpTensorCreate>,
-"Records operation to create and initialise tensor GPU memory and buffer")
.def("record_tensor_copy", &kp::Sequence::record<kp::OpTensorCopy>,
"Records operation to copy one tensor to one or many tensors")
.def("record_tensor_sync_device", &kp::Sequence::record<kp::OpTensorSyncDevice>,
@@ -157,11 +155,16 @@ PYBIND11_MODULE(kp, m) {
[](uint32_t physicalDeviceIndex, const std::vector<uint32_t>& familyQueueIndices) {
return std::unique_ptr<kp::Manager>(new kp::Manager(physicalDeviceIndex, familyQueueIndices));
}), "Manager initialiser can provide specified device and array of GPU queueFamilies to load.")
-.def("get_create_sequence", &kp::Manager::getOrCreateManagedSequence, "Get a Sequence or create a new one with given name")
-.def("create_sequence", &kp::Manager::createManagedSequence,
-py::arg("name") = "", py::arg("queueIndex") = 0, "Create a sequence with specific name and specified index of available queues")
-.def("build_tensor", &kp::Manager::buildTensor,
-py::arg("data"), py::arg("tensorType") = kp::Tensor::TensorTypes::eDevice,
+.def("sequence", &kp::Manager::sequence,
+py::arg("name") = "", py::arg("queueIndex") = 0, "Get or create a sequence with specific name and specified index of available queues")
+.def("tensor", &kp::Manager::tensor,
+py::arg("data"), py::arg("tensorType") = kp::Tensor::TensorTypes::eDevice, py::arg("syncDataToGPU") = true,
"Build and initialise tensor")
+.def("rebuild", py::overload_cast<std::vector<std::shared_ptr<kp::Tensor>>, bool>(&kp::Manager::rebuild),
+py::arg("tensors"), py::arg("syncDataToGPU") = true,
+"Build and initialise list of tensors")
+.def("rebuild", py::overload_cast<std::shared_ptr<kp::Tensor>, bool>(&kp::Manager::rebuild),
+py::arg("tensor"), py::arg("syncDataToGPU") = true,
+"Build and initialise tensor")

// Await functions
py::arg("waitFor") = UINT64_MAX, "Awaits for asynchronous operation on the last anonymous Sequence created")

// eval default
-.def("eval_tensor_create_def", &kp::Manager::evalOpDefault<kp::OpTensorCreate>,
-"Evaluates operation to create and initialise tensor GPU memory and buffer with new anonymous Sequence")
.def("eval_tensor_copy_def", &kp::Manager::evalOpDefault<kp::OpTensorCopy>,
"Evaluates operation to copy one tensor to one or many tensors with new anonymous Sequence")
.def("eval_tensor_sync_device_def", &kp::Manager::evalOpDefault<kp::OpTensorSyncDevice>,
@@ -209,8 +210,6 @@
"Evaluates operation to run left right out operation with custom shader with new anonymous Sequence")

// eval
-.def("eval_tensor_create", &kp::Manager::evalOp<kp::OpTensorCreate>,
-"Evaluates operation to create and initialise tensor GPU memory and buffer with explicitly named Sequence")
.def("eval_tensor_copy", &kp::Manager::evalOp<kp::OpTensorCopy>,
"Evaluates operation to copy one tensor to one or many tensors with explicitly named Sequence")
.def("eval_tensor_sync_device", &kp::Manager::evalOp<kp::OpTensorSyncDevice>,
@@ -249,8 +248,6 @@
"Evaluates operation to run left right out operation with custom shader with explicitly named Sequence")

// eval async default
-.def("eval_async_tensor_create_def", &kp::Manager::evalOpAsyncDefault<kp::OpTensorCreate>,
-"Evaluates asynchronously operation to create and initialise tensor GPU memory and buffer with anonymous Sequence")
.def("eval_async_tensor_copy_def", &kp::Manager::evalOpAsyncDefault<kp::OpTensorCopy>,
"Evaluates asynchronously operation to copy one tensor to one or many tensors with anonymous Sequence")
.def("eval_async_tensor_sync_device_def", &kp::Manager::evalOpAsyncDefault<kp::OpTensorSyncDevice>,
@@ -286,8 +283,6 @@
"Evaluates asynchronously operation to run left right out operation with custom shader with anonymous Sequence")

// eval async
-.def("eval_async_tensor_create", &kp::Manager::evalOpAsync<kp::OpTensorCreate>,
-"Evaluates asynchronously operation to create and initialise tensor GPU memory and buffer with explicitly named Sequence")
.def("eval_async_tensor_copy", &kp::Manager::evalOpAsync<kp::OpTensorCopy>,
"Evaluates asynchronously operation to copy one tensor to one or many tensors with explicitly named Sequence")
.def("eval_async_tensor_sync_device", &kp::Manager::evalOpAsync<kp::OpTensorSyncDevice>,
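Seen as a whole, the binding changes above are renames plus the removal of the tensor-create entry points. A small summary, reconstructed from this diff (descriptive only; treat as possibly incomplete):

```python
# Old Python binding name -> new name, as read from the diff above.
renames = {
    "get_create_sequence": "sequence",
    "create_sequence": "sequence",
    "build_tensor": "tensor",
}

# Entry points removed outright: tensor creation now happens through
# Manager.tensor() / Manager.rebuild() rather than an OpTensorCreate op.
removed = [
    "record_tensor_create",
    "eval_tensor_create_def",
    "eval_tensor_create",
    "eval_async_tensor_create_def",
    "eval_async_tensor_create",
]

assert renames["build_tensor"] == "tensor"
assert "eval_tensor_create" in removed
```

Note that both `get_create_sequence` and `create_sequence` collapse into the single get-or-create `sequence` entry point, matching the C++ side of the PR.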
2 changes: 1 addition & 1 deletion python/test/test_array_multiplication.py
@@ -14,7 +14,7 @@ def test_array_multiplication():
tensor_out = kp.Tensor([0, 0, 0])

# 3. Initialise the Kompute Tensors in the GPU
-mgr.eval_tensor_create_def([tensor_in_a, tensor_in_b, tensor_out])
+mgr.rebuild([tensor_in_a, tensor_in_b, tensor_out])

# 4. Define the multiplication shader code to run on the GPU
@ps.python2shader