Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get rid of specialization #210

Closed
konstin opened this issue Aug 25, 2018 · 42 comments · Fixed by #969
Closed

Get rid of specialization #210

konstin opened this issue Aug 25, 2018 · 42 comments · Fixed by #969
Assignees

Comments

@konstin
Copy link
Member

konstin commented Aug 25, 2018

The last step for #5 is getting rid of specialization. I've remove most uses in 7c0379b and subsequent commits, except for the following 17 10 9 7 4: rust-lang/rust#64564 changed the rules so there are again many default fn.

src/class/descr.rs
76:    default fn tp_descr_get() -> Option<ffi::descrgetfunc> {
97:    default fn tp_descr_set() -> Option<ffi::descrsetfunc> {
130:    default fn tp_as_descr(_type_object: &mut ffi::PyTypeObject) {}

src/class/buffer.rs
46:    default fn tp_as_buffer() -> Option<ffi::PyBufferProcs> {
74:    default fn cb_bf_getbuffer() -> Option<ffi::getbufferproc> {
110:    default fn cb_bf_releasebuffer() -> Option<ffi::releasebufferproc> {

src/class/gc.rs
27:    default fn update_type_object(_type_object: &mut ffi::PyTypeObject) {}
72:    default fn tp_traverse() -> Option<ffi::traverseproc> {
124:    default fn tp_clear() -> Option<ffi::inquiry> {

src/class/iter.rs
49:    default fn tp_as_iter(_typeob: &mut ffi::PyTypeObject) {}
71:    default fn tp_iter() -> Option<ffi::getiterfunc> {
94:    default fn tp_iternext() -> Option<ffi::iternextfunc> {

src/class/basic.rs
152:    default fn tp_as_object(_type_object: &mut ffi::PyTypeObject) {}
153:    default fn nb_bool_fn() -> Option<ffi::inquiry> {
183:    default fn tp_getattro() -> Option<ffi::binaryfunc> {
252:        default fn set_attr() -> Option<ffi::setattrofunc> {
274:        default fn del_attr() -> Option<ffi::setattrofunc> {
296:        default fn set_del_attr() -> Option<ffi::setattrofunc> {
324:    default fn tp_str() -> Option<ffi::unaryfunc> {
344:    default fn tp_repr() -> Option<ffi::unaryfunc> {
364:    default fn tp_hash() -> Option<ffi::hashfunc> {
389:    default fn nb_bool() -> Option<ffi::inquiry> {
409:    default fn tp_richcompare() -> Option<ffi::richcmpfunc> {

src/class/mapping.rs
83:    default fn tp_as_mapping() -> Option<ffi::PyMappingMethods> {
116:    default fn mp_length() -> Option<ffi::lenfunc> {
139:    default fn mp_subscript() -> Option<ffi::binaryfunc> {
162:    default fn mp_ass_subscript() -> Option<ffi::objobjargproc> {
187:    default fn mp_del_subscript() -> Option<ffi::objobjargproc> {
201:    default fn det_set_dispatch() -> Option<ffi::objobjargproc> {

src/class/pyasync.rs
94:    default fn tp_as_async() -> Option<ffi::PyAsyncMethods> {
121:    default fn am_await() -> Option<ffi::unaryfunc> {
144:    default fn am_aiter() -> Option<ffi::unaryfunc> {
167:    default fn am_anext() -> Option<ffi::unaryfunc> {

src/conversion.rs
107:    default fn with_borrowed_ptr<F, R>(&self, py: Python, f: F) -> R

src/class/sequence.rs
135:    default fn tp_as_sequence() -> Option<ffi::PySequenceMethods> {
168:    default fn sq_length() -> Option<ffi::lenfunc> {
190:    default fn sq_item() -> Option<ffi::ssizeargfunc> {
239:        default fn set_item() -> Option<ffi::ssizeobjargproc> {
285:        default fn del_item() -> Option<ffi::ssizeobjargproc> {
328:        default fn del_set_item() -> Option<ffi::ssizeobjargproc> {
372:    default fn sq_contains() -> Option<ffi::objobjproc> {
394:    default fn sq_concat() -> Option<ffi::binaryfunc> {
416:    default fn sq_repeat() -> Option<ffi::ssizeargfunc> {
438:    default fn sq_inplace_concat() -> Option<ffi::binaryfunc> {
465:    default fn sq_inplace_repeat() -> Option<ffi::ssizeargfunc> {

src/class/number.rs
625:    default fn tp_as_number() -> Option<ffi::PyNumberMethods> {
692:    default fn nb_add() -> Option<ffi::binaryfunc> {
714:    default fn nb_subtract() -> Option<ffi::binaryfunc> {
736:    default fn nb_multiply() -> Option<ffi::binaryfunc> {
758:    default fn nb_matrix_multiply() -> Option<ffi::binaryfunc> {
780:    default fn nb_true_divide() -> Option<ffi::binaryfunc> {
802:    default fn nb_floor_divide() -> Option<ffi::binaryfunc> {
824:    default fn nb_remainder() -> Option<ffi::binaryfunc> {
846:    default fn nb_divmod() -> Option<ffi::binaryfunc> {
868:    default fn nb_power() -> Option<ffi::ternaryfunc> {
890:    default fn nb_lshift() -> Option<ffi::binaryfunc> {
912:    default fn nb_rshift() -> Option<ffi::binaryfunc> {
934:    default fn nb_and() -> Option<ffi::binaryfunc> {
956:    default fn nb_xor() -> Option<ffi::binaryfunc> {
978:    default fn nb_or() -> Option<ffi::binaryfunc> {
1000:    default fn nb_inplace_add() -> Option<ffi::binaryfunc> {
1022:    default fn nb_inplace_subtract() -> Option<ffi::binaryfunc> {
1044:    default fn nb_inplace_multiply() -> Option<ffi::binaryfunc> {
1066:    default fn nb_inplace_matrix_multiply() -> Option<ffi::binaryfunc> {
1088:    default fn nb_inplace_true_divide() -> Option<ffi::binaryfunc> {
1110:    default fn nb_inplace_floor_divide() -> Option<ffi::binaryfunc> {
1132:    default fn nb_inplace_remainder() -> Option<ffi::binaryfunc> {
1154:    default fn nb_inplace_power() -> Option<ffi::ternaryfunc> {
1176:    default fn nb_inplace_lshift() -> Option<ffi::binaryfunc> {
1198:    default fn nb_inplace_rshift() -> Option<ffi::binaryfunc> {
1220:    default fn nb_inplace_and() -> Option<ffi::binaryfunc> {
1242:    default fn nb_inplace_xor() -> Option<ffi::binaryfunc> {
1264:    default fn nb_inplace_or() -> Option<ffi::binaryfunc> {
1287:    default fn nb_add_fallback() -> Option<ffi::binaryfunc> {
1309:    default fn nb_sub_fallback() -> Option<ffi::binaryfunc> {
1331:    default fn nb_mul_fallback() -> Option<ffi::binaryfunc> {
1353:    default fn nb_matmul_fallback() -> Option<ffi::binaryfunc> {
1375:    default fn nb_truediv_fallback() -> Option<ffi::binaryfunc> {
1397:    default fn nb_floordiv_fallback() -> Option<ffi::binaryfunc> {
1419:    default fn nb_mod_fallback() -> Option<ffi::binaryfunc> {
1441:    default fn nb_divmod_fallback() -> Option<ffi::binaryfunc> {
1463:    default fn nb_pow_fallback() -> Option<ffi::ternaryfunc> {
1485:    default fn nb_lshift_fallback() -> Option<ffi::binaryfunc> {
1507:    default fn nb_rshift_fallback() -> Option<ffi::binaryfunc> {
1529:    default fn nb_and_fallback() -> Option<ffi::binaryfunc> {
1551:    default fn nb_xor_fallback() -> Option<ffi::binaryfunc> {
1573:    default fn nb_or_fallback() -> Option<ffi::binaryfunc> {
1595:    default fn nb_negative() -> Option<ffi::unaryfunc> {
1618:    default fn nb_positive() -> Option<ffi::unaryfunc> {
1640:    default fn nb_absolute() -> Option<ffi::unaryfunc> {
1662:    default fn nb_invert() -> Option<ffi::unaryfunc> {
1684:    default fn nb_int() -> Option<ffi::unaryfunc> {
1706:    default fn nb_float() -> Option<ffi::unaryfunc> {
1728:    default fn nb_index() -> Option<ffi::unaryfunc> {

src/instance.rs
387:    default fn to_managed_py_ref<'p>(&self, py: Python<'p>) -> ManagedPyRef<'p, Self> {
396:    default fn drop_impl(borrowed: &mut ManagedPyRef<Self>) {

src/types/sequence.rs
268:                default fn extract(obj: &'a PyAny) -> PyResult<Self> {
307:    default fn extract(obj: &'a PyAny) -> PyResult<Self> {

Once those are removed, we pyo3 can be stable on rust 1.30! I'm happy to take pull requests tackling one or more of those default fn uses

@ijl
Copy link
Contributor

ijl commented Nov 11, 2018

The codegen in pyo3-derive-backend is a problem once the above functions are removed. For example:

impl ::pyo3::class::methods::PyPropMethodsProtocolImpl for #cls {
fn py_methods() -> &'static [::pyo3::class::PyMethodDefType] {
static METHODS: &'static [::pyo3::class::PyMethodDefType] = &[
#(#py_methods),*
];
METHODS
}
}

impl ::pyo3::class::methods::PyMethodsProtocolImpl for #ty {
fn py_methods() -> &'static [::pyo3::class::PyMethodDefType] {
static METHODS: &'static [::pyo3::class::PyMethodDefType] = &[
#(#methods),*
];
METHODS
}
}

I am not familiar enough with it to have a solution.

@konstin
Copy link
Member Author

konstin commented Nov 11, 2018

I've also already ran against that and haven't found a good solution either :/

I think the most viable way forward would be not to have a default implementation of PyMethodsProtocolImpl, but instead have #[pymethods] implement it. For cases where there is no pymethods block, one could use #[pyclass(methods = false)] which then implements py_methods().

@konstin
Copy link
Member Author

konstin commented Jan 30, 2019

#332 fixes the PyMethodsProtocolImpl problem

@konstin konstin removed this from the 0.6.0 milestone Mar 17, 2019
@bspeice
Copy link

bspeice commented Apr 15, 2019

Is this still an issue given #332 being merged? I'm interested in helping push this to stable, and would appreciate advice for where to start.

EDIT: It looks like the first two instances in conversion are trivially removable, and I'm not sure what the implications are for removing the attempted buffer.

@kngwyu
Copy link
Member

kngwyu commented Apr 16, 2019

@bspeice
I think now the problem is around #426.
I'm looking for a better design but have not come up with it.

@konstin
Copy link
Member Author

konstin commented Apr 17, 2019

@bspeice #210 (comment) is fixed by #332, while some specialization are still there.

The main question is how we can rewrite the following so we don't have a conflicting implementation without specialization anymore:

pub trait FooImpl {
    fn foo() -> Option<ffi::Foo> {
        None
    }
}

impl<T> FooImpl for T {}

impl<'p, T> FooImpl for T
where
    T: Bar<'p>,
{
    fn foo() -> Option<ffi::PySequenceMethods> {
        return bar();
    }
}

(You can see a lot of those in the error message by removing #![feature(specialization)] from lib.rs)

@Alphare
Copy link

Alphare commented Jun 4, 2019

As a contributor to Mercurial where we started using Rust a few months ago, I can say that we very much look forward to using PyO3, and the progress towards a build on stable is really encouraging.
However I wonder if that last issue is fixable at all, considering that it is exactly the use case for specialization.

@konstin
Copy link
Member Author

konstin commented Jun 5, 2019

It is technically fixable. We mainly have two options:

Inventory

Instead of implementing traits, the protocol implementations are submit to an inventory and added collected at class initialization time. The disadvantage here is that this essentially a well established linker hack, which can cause some weird interactions with the c api (e.g. #341).

Inventory is already used for pymethods blocks (so that we can have 0 to n pymethods blocks).

Manual annotations

You could force explicit annotations on pyclass blocks about which protocols will be implemented. The pyclass block would add default implementations for all protocol you didn't specify. If we wanted to completely get rid of inventory, we'd also need to forbid multiple pymethod block (or add an awkward syntax for numbering them) and annotate whether a struct has one or zero pymethod blocks.

@birkenfeld
Copy link
Member

I feel like #2 is not a particularly hard burden, especially with the outlook that it will get removed in some future.

For me a few more manual steps are certainly worth it for going to stable.

@Alphare
Copy link

Alphare commented Sep 4, 2019

Is there anything I can do to help? I'm not sure how much time I can set aside, but compiling on stable would be a huge deal to the Mercurial project.

@rth
Copy link

rth commented Sep 4, 2019

@Alphare not an expert on this, but as far as I understood #552 is a WIP PR to address part of this that could use some help.

@konstin
Copy link
Member Author

konstin commented Sep 5, 2019

@Alphare There are mainly two categories of traits that still use specialization: The first are already protocol implementation traits, which are used to implement the dunder methods. If I haven't made a grave mistake, extending #552 to all Py*Protocol traits should give us protocols without specialization.

The other category are the conversion traits, specifically FromPyObject. This requires a lot of trait refactoring. I think the way to tackle this is trying to move everything over to PyTryInto and IntoPy. E.g. in #583 I've replace IntoPyObject with IntoPy.

A way to get a (very verbose) todo list is to remove the #![feature(specialization)] from src/lib.rs and run cargo check.

@birkenfeld
Copy link
Member

@dtolnay to the rescue? https://www.reddit.com/r/rust/comments/dm371q/towards_overflower_10/f4wjajf

@gilescope
Copy link
Contributor

I'm having a play with that technique to see if I can get it working for an Add method as an initial example. If someone else beats me to it I would not complain at all! Once Pyo3 is stable I expect serious uptick in usage.

@gilescope
Copy link
Contributor

gilescope commented Nov 15, 2019

rg "default .*fn" seems to have ~60 hits now not 18, we're further away then when the issue was raised :-( .

#552 doesn't compile - did it ever or was that the work in progress bit?

Most of the specialised functions are static taking no args - all of @dtolnay 's excellent specialisation examples dependend upon selecting the specialisation based upon the type of the argument.

So that's the bad news out of the way. Anyone got any good news / suggestions?
(Also is there a good chat room discord/irc/zullip to talk pyo3?)

@kngwyu
Copy link
Member

kngwyu commented Nov 15, 2019

rg "default .*fn" seems to have ~60 hits now not 18, we're further away then when the issue was raised :-( .

It's because of #614

#552 doesn't compile - did it ever or was that the work in progress bit?

It's also because of #614

Most of the specialised functions are static taking no args - all of @dtolnay 's excellent specialisation examples dependend upon selecting the specialisation based upon the type of the argument.

Yeah, we cannot use it.

So that's the bad news out of the way. Anyone got any good news / suggestions?
(Also is there a good chat room discord/irc/zullip to talk pyo3?)

We have a gitter channel.

And what I can say is...
We have 2 kinds of specialization usages.

  1. For ObjectProtocol
  2. For distinguishing Python native object and pyclass

I think the later one should be resolved by changing the object layout of pyclass.
And, for the former one, I still have not decided what kind of approach we should use.
@konstin's 'unified slots' approach is good but still needs default.
I'm thinking about specialized struct approaches but... now it's too difficult to explain shortly here.

@konstin
Copy link
Member Author

konstin commented Nov 15, 2019

And, for the former one, I still have not decided what kind of approach we should use.
@konstin's 'unified slots' approach is good but still needs default.

I'm pretty sure that the approach using inventory still works, but so far no one volunteered to work in that.

@gilescope
Copy link
Contributor

gilescope commented Nov 17, 2019 via email

@wickerwaka
Copy link

I have some alternative solutions that I've been hacking away at here: https://github.com/wickerwaka/pyo3/tree/removing-specialization

I still have a about 30% of the tests stubbed out, but it is building and running with stable rust. The solution I settled on is having the proc_macros provide default implementations for the various protocol traits that the user has not implemented. For any missing implementations in #[pyproto] blocks that was relatively simple, but there wasn't an easy solution for entirely missing protocols because there was no point at which the proc_macros could know the user was finished implementing protocols. So the solution I settle on there is requiring a list of protocols in the #[pyclass] proc_macro like so:
#[pyclass(protocols=[PyNumberProtocol, PyObjectProtocol])]
Which instructs the pyclass proc_macro to provide default implementations for everything except PyNumberProtocol and PyObjectProtocol.

Generally it has been working pretty well. I am currently trying to work around issues with Sel, Del and SetDel implementations for the sequence and mapping protocols, since the current implementations rely on specialization to make a compile time decision over which methods to use.

Again, this is still very hacky and is just a proof of concept and learning experience for me, it is not in a mergable state.

@gilescope
Copy link
Contributor

Nice hacking! One thing I really like about your approach: being more dependent on proc macros (compile time) rather than being more dependent on inventory (runtime) seems like a simpler solution.

@kngwyu
Copy link
Member

kngwyu commented Dec 1, 2019

@davidhewitt
Thank you.
I'll cc you when I file an issue or PR for related topics.

@rbjorklin
Copy link

I'm not familiar with the problem so just passing information along, would this be of any use? https://github.com/dtolnay/case-studies/blob/master/autoref-specialization/README.md

Found it here: rust-lang/rust#31844 (comment)

@seako
Copy link

seako commented Dec 5, 2019

I'm not familiar with the problem so just passing information along, would this be of any use? https://github.com/dtolnay/case-studies/blob/master/autoref-specialization/README.md

Found it here: rust-lang/rust#31844 (comment)

alas, no
#210 (comment)

Most of the specialised functions are static taking no args - all of @dtolnay 's excellent specialisation examples dependend upon selecting the specialisation based upon the type of the argument.

@dtolnay
Copy link

dtolnay commented Dec 5, 2019

Most of the specialised functions are static taking no args - all of @dtolnay 's excellent specialisation examples dependend upon selecting the specialisation based upon the type of the argument.

FWIW I don't see the presence or absence of arguments as being relevant to the technique. You can always turn a no-arg function into a method by dispatching it on PhantomData::<T>.

For example instead of:

trait Trait {
    fn f();
}

impl<T: Display> Trait for T {
    default fn f() {...}
}

impl Trait for String {
    fn f() {...}
}

you would dispatch it as:

impl<T: Display> DisplayKind for &PhantomData<T> {}
impl StringKind for PhantomData<String> {}

@kngwyu kngwyu pinned this issue Dec 5, 2019
@kngwyu
Copy link
Member

kngwyu commented Dec 9, 2019

Just an experimental note:
I'm considering the use of 'provider pattern' where we switch implementations by generics, instead of trait specialization: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8131e41b3c816bb317391c252f831ab0
In this pattern, we prepare 2 kinds of structs, which corresponds to default implementation and real protocol implementation.

trait LenProvider: Sized {
    fn len() -> Option<Box<dyn Fn(PyObject) -> usize>> {
        None
    }
}

struct DummyProvider;
impl LenProvider for DummyProvider {}

struct LenProviderImpl<T: LenImpl>(PhantomData<T>);
impl<T: 'static + LenImpl> LenProvider for LenProviderImpl<T> {
    fn len() -> Option<Box<dyn Fn(PyObject) -> usize>> {
        fn wrap<T: LenImpl>(obj: PyObject) -> usize {
            let obj = T::from_obj(obj);
            obj.__len__().into().unwrap()
        }
        Some(Box::new(wrap::<T>))
    }
}

And then we can select which implementation to use in proc-macro code.
If a user implement __len__, we pass LenProviderImpl<T> to the initialization function.
If he doesn't, we use DummyProvider.

@gilescope
Copy link
Contributor

gilescope commented Dec 9, 2019 via email

@kngwyu
Copy link
Member

kngwyu commented Dec 9, 2019

@gilescope
I'm not sure what you are talking about, but if you are suspicious about specific protocol traits(e.g., LenImpl in my example or PyMappingLenProtocol), they exist so that users do not need to give associated type definitions for methods they don't need.
We can skip implementaiton of traits by unimplemented!, but can't do so for associated types.

@konstin
Copy link
Member Author

konstin commented Dec 10, 2019

I’m probably naive here, but why can’t we have a trait with all the dunder methods on it with default implementations of unimplemented! and the we just override methods we want to impl? Is it because you might be able to write blanket impls more easily? Or is it that all the impl would have to be in one place? I’m just trying to understand why the easy route is ruled out.

Those protocols aren't just normal methods, but use different "tp slots" (s. https://docs.python.org/3/c-api/typeobj.html), and each protocol trait is for one of those slots. The python interpreter also will behave differently for methods that aren't there than for those that are implemented as error (and especially unimplemented!() will crash the whole application instead of raising an exception).

@programmerjake
Copy link
Contributor

One thing to consider is moving to min_specialization instead of specialization: See rust-lang/rust#31844 (comment)

@davidhewitt
Copy link
Member

+1 on considering min_specialization. I have taken a brief look at it and I think it'll still require reworking a fair amount of existing uses of specialization, but at least then we'll be on a sound subset of specialization which might be stabilized sooner rather than later.

@nacaclanga nacaclanga mentioned this issue Apr 29, 2020
6 tasks
@kngwyu
Copy link
Member

kngwyu commented May 1, 2020

For now, I think we can remove specialization by using inventory for what is called PyMethod in our proc-macro codes. I'll create a patch for this if I can work on it this weekend.

Anyway, I think we should be more optimistic about this. As @konstin showed, there's certainly a way to remove specialization.

@davidhewitt
Copy link
Member

davidhewitt commented May 1, 2020

I would love it if you have the chance to make that PR! That's easily the biggest use of specialization and then it should be relatively easy for us to get to stable 👍

Don't worry about my WIP branches breaking with changes you make - I will rebase them after 😂

@kngwyu
Copy link
Member

kngwyu commented May 18, 2020

STATUS: Now I'm experimenting with a new idea to re-implement object protocols without specialization.
At last, my 'provider pattern' failed since it's really difficult to pass marker types in proc-macro code when a user doesn't specify #[pyproto].
So now I'm implementing an inventory-like, but task-specific and more efficient storage to store FFI functions.
Though recently I have lots of other tasks and cannot complete the implementation immediately, I'll open a PR till June 5th or so.

@davidhewitt
Copy link
Member

@kngwyu have you got the provider pattern branch anywhere? I'd be curious to read the code later and see if I could offer any ideas

@kngwyu
Copy link
Member

kngwyu commented May 18, 2020

@davidhewitt

have you got the provider pattern branch anywhere?

No.

@kngwyu kngwyu self-assigned this May 18, 2020
@davidhewitt
Copy link
Member

Alas, no worries. Many thanks for your continued work on this! (And of course take your time - we have waited a long time already, can wait a few more months 😄 )

@kngwyu
Copy link
Member

kngwyu commented Jun 21, 2020

🎉

@kngwyu kngwyu unpinned this issue Jun 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.