[Pytorch] New version of the presets #1360
Conversation
Could you undo the unnecessary indentation changes?
Add missing ATen/ops/from_blob.h
Add missing ATen/ops/tensor.h
I reindented using rules that are, I think, closer to the ones you use. Also added 2 missing includes.
I mean, could you please not change the lines that you haven't touched? It makes it harder to review.
You won't be able to review by comparing line by line with the previous version. Too much has changed. But please hold on, another commit is coming. I realized some includes are still missing, and I need to commit the gen directory so that the CI checks pass. Edit: in fact, for a thorough review, it might be easier to check the difference in the generated files, after filtering out non-significant changes.
Added parsing of files included from additional includes (Descriptors.h, Types.h...)
Remove _object
Remove _compute_enum_name
Remove _CopyBytesFunctionRegisterer
I removed the includes that depend on CUDA includes.
The build is still failing on Windows. Please fix this!
Pytorch tries a dlopen on |
I had already added that library. The question should be, why did you remove it? Please don't remove stuff for no reason.
The answer is that I didn't remove anything. I had started this before your update for Pytorch 2.0 and I didn't merge your changes. Well, I did merge 2.0.0, but not 2.0.1.
I just built this locally to get a better estimate of how much I'll have to change when upgrading. One thing I noticed is that `sqrt` for `Tensor` is no longer generated:

```java
// aten::sqrt(Tensor self) -> Tensor
@Namespace("at") public static native @ByVal Tensor sqrt(@Const @ByRef Tensor self);
```

Instead we have the complex variants:

```java
@Namespace("c10_complex_math") public static native @ByVal @Name("sqrt<float>") FloatComplex sqrt(@Const @ByRef FloatComplex x);
@Namespace("c10_complex_math") public static native @ByVal @Name("sqrt<double>") DoubleComplex sqrt(@Const @ByRef DoubleComplex x);
```

Probably due to this mapping:

```java
infoMap.put(new Info("c10_complex_math::sqrt<float>").javaNames("sqrt"))
       .put(new Info("c10_complex_math::sqrt<double>").javaNames("sqrt"))
```

Is it possible to keep the original |
I don't think it's due to those, but if it works by adding one more for `at::sqrt()` as well, let's just do that?
I added an explicit info for |
I'm seeing a bunch of crashes now when running our tests, especially in `IndexingSlicingJoiningOpsSuite`. I think they are starting to become a nice regression test suite for the native bindings. :) I've tried to isolate the crashing ops. Most seem to take lists of tensors in some form, like:

```java
import org.bytedeco.pytorch.*;

global.torch.stack(
    new TensorArrayRef(
        new TensorVector(
            AbstractTensor.create(0),
            AbstractTensor.create(1)
        )
    ),
    0
).print();
```
The same as above,
Calling |
Should we wait until that is fixed before merging this pull request?
From my side it's fine to merge now and then fix this in a separate PR. I'd actually like to try with the CI snapshots to rule out that the issue is caused by my building locally.
I can reproduce your crash, so it's not related to your local build.
The ArrayRef constructor taking a Vector isn't mapped any more. So it's the "pointer-cast" constructor that is called instead in your case, causing the crash.
So you mean something like this, relying on the fact that vectors store their elements contiguously?

```java
var vector = new TensorVector(AbstractTensor.create(0), AbstractTensor.create(1));
global.torch.stack(
    new TensorArrayRef(vector.front(), vector.size()),
    0
).print();
```

Seems to be working, I got all tests passing again. :)
Or something like:
|
This is great! Thanks for these improvements @HGuillemet!
Reorganization of the Pytorch presets.

Main improvements are:

- `torch::nn::Module` and its subclasses: avoiding segmentation faults when a function accepting a `torch::nn::Module`, or a `Module` instance method, is called from a subclass instance.
- `shared_ptr`: this concerns `Module` and its subclasses, `Tensor` and a few less important classes. A consequence is that modules are now deallocated normally (they were never deallocated in previous versions of the presets). `shared_ptr` are not mapped any more.
- `Module` is now virtualized, meaning that if you override in Java its virtual member functions (`train`, `is_training`, `to`, `zero_grad`, `save`, `load`, `pretty_print`, `is_serializable`), your Java code will get called by libtorch.
- `Sequential`, `AnyModule`, `AnyValue`.
- … should be mapped, … barring a lot of exceptions and a lot of missing template instantiations.
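As a plain-Java analogy of what "virtualized" means here (the classes below are hypothetical stand-ins, not JavaCPP types): once `Module` is virtualized, native code holding a base-class reference dispatches to the Java override, just as a base-typed Java call reaches a subclass method.

```java
// Hypothetical stand-in for the native-backed base class
// (the real one is org.bytedeco.pytorch.Module); analogy only.
class BaseModule {
    public void pretty_print() { System.out.println("Module()"); }
}

// A user subclass overriding a virtual member function.
class MyModule extends BaseModule {
    @Override public void pretty_print() { System.out.println("MyModule(custom)"); }
}

public class VirtualDispatchDemo {
    // Simulates libtorch invoking the function through a base-class reference.
    static void nativeSideCall(BaseModule m) { m.pretty_print(); }

    public static void main(String[] args) {
        nativeSideCall(new MyModule()); // the Java override is reached
    }
}
```

In the previous presets, the native side would only see the base-class implementation; with virtualization, the override listed above (`pretty_print`, `train`, etc.) is what libtorch actually calls.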
The CUDA versions of the library now include PTX code for compute capability 7.0 and binary code for 5.0 and 6.0. This means you need a Maxwell card at minimum, and that at first launch, if you have a Volta or more recent card, the library will compile the PTX code of all libtorch kernels into binaries for your card. This can take 10 to 20 minutes. The binaries are stored in a cache for future launches. If your application still lags at startup after the first launch, your cache size is too low. Try increasing it by setting the environment variable CUDA_CACHE_MAXSIZE to 2000000.
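For reference, raising the cache limit has to happen in the environment before the JVM starts. A minimal shell sketch (the jar name is hypothetical; the variable and value come from the paragraph above):

```shell
# Allow a larger compiled-kernel cache so the JIT output of the first
# launch is retained across runs.
export CUDA_CACHE_MAXSIZE=2000000
java -jar myapp.jar   # hypothetical application launcher
```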
If this is not acceptable for your needs and you cannot afford the delay at first launch, or if you want to benefit from newer compute capabilities, you can install libtorch from pytorch.org and set the `org.bytedeco.javacpp.pathsFirst` system property to `true`.

Please give it a test and report here any problem, comment, missing API or other RFE.
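A minimal sketch of setting that property programmatically (the property name comes from the text above; note it must be set before any JavaCPP-backed class is loaded, otherwise the bundled libraries are already resolved):

```java
public class UseSystemLibtorch {
    public static void main(String[] args) {
        // Ask JavaCPP to search system library paths (e.g. an official
        // libtorch install from pytorch.org) before its bundled natives.
        // Must run before the first use of any JavaCPP-backed class.
        System.setProperty("org.bytedeco.javacpp.pathsFirst", "true");

        // From here on, load the pytorch classes as usual.
        System.out.println(System.getProperty("org.bytedeco.javacpp.pathsFirst"));
    }
}
```

The same effect can be had without touching code by passing `-Dorg.bytedeco.javacpp.pathsFirst=true` on the `java` command line.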