-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Arrow - Velox conversion support #4450
Closed
sanjibansg
wants to merge
16
commits into
facebookincubator:main
from
sanjibansg:arrow-velox-conversion
Closed
Changes from all commits
Commits
Show all changes
16 commits
Select commit
Hold shift + click to select a range
b66d17f
feat: Support for Arrow-Velox conversion
sanjibansg d52ecfa
fix: passing default memory pool while converting
sanjibansg 6c4b5ad
format
assignUser d93409d
use extras_require for test deps
assignUser 2aa27b0
feat: separate module for the conversion functions
sanjibansg 44271a2
feat: import_from_arrow now can work only with the pyarrow.Array object
sanjibansg 3ff09cd
feat: add roundtrip testing for conversion functions
sanjibansg 1931791
feat: using smart pointers for conversion memory movement
sanjibansg 5f42b1e
feat: pv.export_to_arrow() should return a pyarrow.array object and n…
sanjibansg 703d846
fix: moved MemoryPool instance to util header, updated tests
sanjibansg ea6c049
fix: format-check
sanjibansg 633fd03
fix: formatting on CMakeLists.txt
sanjibansg f3108aa
fix: util header not required as instance struct was moved to context…
sanjibansg 87e5f2b
fix: using static method getSingletonInstance()
sanjibansg 072244e
fix: format-check
sanjibansg 579010b
fix: python build should have verbose output for passing CI checks
sanjibansg File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
/* | ||
* Copyright (c) Facebook, Inc. and its affiliates. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
#include "conversion.h" | ||
#include <velox/vector/arrow/Abi.h> | ||
#include <velox/vector/arrow/Bridge.h> | ||
#include "context.h" | ||
|
||
namespace facebook::velox::py { | ||
|
||
namespace py = pybind11; | ||
|
||
void addConversionBindings(py::module& m, bool asModuleLocalDefinitions) { | ||
m.def("export_to_arrow", [](VectorPtr& inputVector) { | ||
auto arrowArray = std::make_unique<ArrowArray>(); | ||
auto pool_ = PyVeloxContext::getSingletonInstance().pool(); | ||
facebook::velox::exportToArrow(inputVector, *arrowArray, pool_); | ||
|
||
auto arrowSchema = std::make_unique<ArrowSchema>(); | ||
facebook::velox::exportToArrow(inputVector, *arrowSchema); | ||
|
||
py::module arrow_module = py::module::import("pyarrow"); | ||
py::object array_class = arrow_module.attr("Array"); | ||
return array_class.attr("_import_from_c")( | ||
reinterpret_cast<uintptr_t>(arrowArray.get()), | ||
reinterpret_cast<uintptr_t>(arrowSchema.get())); | ||
}); | ||
|
||
m.def("import_from_arrow", [](py::object inputArrowArray) { | ||
auto arrowArray = std::make_unique<ArrowArray>(); | ||
auto arrowSchema = std::make_unique<ArrowSchema>(); | ||
inputArrowArray.attr("_export_to_c")( | ||
reinterpret_cast<uintptr_t>(arrowArray.get()), | ||
reinterpret_cast<uintptr_t>(arrowSchema.get())); | ||
auto pool_ = PyVeloxContext::getSingletonInstance().pool(); | ||
return importFromArrowAsOwner(*arrowSchema, *arrowArray, pool_); | ||
}); | ||
} | ||
} // namespace facebook::velox::py |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
/* | ||
* Copyright (c) Facebook, Inc. and its affiliates. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
#pragma once | ||
|
||
#include <pybind11/pybind11.h> | ||
|
||
namespace facebook::velox::py { | ||
|
||
namespace py = pybind11; | ||
|
||
/// Adds bindings for arrow-velox conversion functions to module m. | ||
/// | ||
/// @param m Module to add bindings to. | ||
/// @param asModuleLocalDefinitions If true then these bindings are only | ||
/// visible inside the module. Refer to | ||
/// https://pybind11.readthedocs.io/en/stable/advanced/classes.html#module-local-class-bindings | ||
/// for further details. | ||
void addConversionBindings(py::module& m, bool asModuleLocalDefinitions = true); | ||
|
||
} // namespace facebook::velox::py |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks this is great, can we add export / import tests for all the types we support currently though ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's certainly good to add some more tests, but just to mention that the actual conversion itself (the C++ code) is also already tested at https://github.com/facebookincubator/velox/blob/main/velox/vector/arrow/tests/ArrowBridgeArrayTest.cpp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jorisvandenbossche , nevertheless would be good to add tests for atleast the primitive types to make sure there are no inadvertent casts across types , since its going via Cpp to Python etc. Also PyVelox doesnt support all the types thats Velox supports yet :) .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the test cases for integers, floats, and strings. Should I add some other test cases as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, while testing multiple test cases, I noticed, that the memory pool instance was not handled correctly earlier. So, I had to move the Instance struct to a separate header file, does this approach look good, or can there be a better way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think the latest vectorsaver changes also moved it to a different file. Please merge from latest.