
Implementing pre-allocation context creation #116

Closed
wants to merge 2 commits

Conversation

@elichai (Member) commented May 28, 2019

Hi,
I want to implement bitcoin-core/secp256k1#566.

This PR builds on top of #115.

I see 2 ways to do it:

  1. Expose get_preallocated_size() and new_preallocated_context() to the user, and store a reference to the user's buffer in the Context struct so that Rust won't allow them to write into it anymore.

  2. Just assume that if the user can allocate data on the fly, then they will implement https://doc.rust-lang.org/beta/std/alloc/trait.GlobalAlloc.html, and use liballoc instead (allocating ourselves with Vec).

This is a first try at what option 1 could look like (I haven't written function docs yet, and I hope I can find better function names, haha).
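A minimal sketch of what the option-1 API could look like; the type layout follows the description above, but the signatures, the RawContext stub, and the function bodies are assumptions, not the actual PR code:

```rust
use core::marker::PhantomData;

enum RawContext {} // stand-in for the opaque ffi::Context

pub struct Context<'buf, C> {
    ctx: *mut RawContext,
    // Holding the mutable borrow is what stops the caller from
    // touching the buffer while the context is alive.
    buf: &'buf mut [u8],
    phantom: PhantomData<C>,
}

// Hypothetical: how many bytes the context needs for the given flags.
pub fn get_preallocated_size(flags: u32) -> usize {
    // would call ffi::secp256k1_context_preallocated_size(flags)
    unimplemented!()
}

// Hypothetical: place a context inside a caller-provided buffer.
pub fn new_preallocated_context<C>(buf: &mut [u8], flags: u32) -> Context<'_, C> {
    assert!(buf.len() >= get_preallocated_size(flags));
    // would call ffi::secp256k1_context_preallocated_create(...)
    unimplemented!()
}
```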

@elichai (Member, Author) commented May 28, 2019

I hoped I could leverage Verification and Signing, add a "parent" trait called Context that carries the flags implied by whichever traits are implemented, and then just use Rust generics to identify whether it's all/verification/signing and act accordingly.
But this might be impossible in Rust's current state.

(I tried combining traits: https://play.rust-lang.org/?gist=1c23c763ea2438eb620f51fbfb985515, and I also tried specialization (https://github.com/rust-lang/rfcs/blob/master/text/1210-impl-specialization.md), which didn't really work either, and that's not yet stable anyway.)
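For reference, a minimal reconstruction of the kind of attempt that fails: the two blanket impls below overlap for any type implementing both marker traits, so stable Rust rejects the second one with E0119 (the flag values are hypothetical):

```rust
pub trait Signing {}
pub trait Verification {}

pub trait Context {
    const FLAGS: u32;
}

// Each capability trait tries to supply the flags via a blanket impl.
impl<T: Signing> Context for T {
    const FLAGS: u32 = 1 << 0; // hypothetical signing flag
}

// error[E0119]: conflicting implementations of trait `Context`.
// The compiler cannot rule out a type implementing both Signing and
// Verification, so the blanket impls overlap.
impl<T: Verification> Context for T {
    const FLAGS: u32 = 1 << 1; // hypothetical verification flag
}
```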

@elichai (Member, Author) commented Jun 11, 2019

Ok, I thought about it, and I think the best way to do this is to measure how much memory each flag requires, allocate that much on the stack, and panic if the result of new_preallocated_context is bigger than that.
Then every time we update the underlying libsecp, the tests will fail if the context has grown, and we can adjust the stack allocation accordingly.

(That way we can preserve the API.)
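A minimal sketch of the idea, assuming a hypothetical measured upper bound and a stand-in for the FFI size query:

```rust
// Hypothetical upper bound, adjusted whenever the vendored
// libsecp256k1 grows; the assert below is what the tests would trip.
const MAX_CONTEXT_SIZE: usize = 1024;

fn preallocated_size(flags: u32) -> usize {
    // stand-in for ffi::secp256k1_context_preallocated_size(flags)
    let _ = flags;
    512
}

fn new_stack_buffer(flags: u32) -> [u8; MAX_CONTEXT_SIZE] {
    let needed = preallocated_size(flags);
    // Panic (failing the test suite) if the real context outgrew
    // the reserved stack space.
    assert!(needed <= MAX_CONTEXT_SIZE, "context outgrew the stack buffer");
    [0u8; MAX_CONTEXT_SIZE]
}
```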

What do you guys think? @apoelstra

@real-or-random (Collaborator)

Hm, the problem with that approach is that the size of the underlying context depends on the (C) platform for which secp256k1 has been compiled and the configuration settings (endo, etc.). Given what's currently in the context, we could reasonably guess an upper bound depending on configuration settings, etc. But that seems overly complicated.

What's the downside of adding get_preallocated_size()?

Is the underlying problem that you want to avoid the heap for no_std?

@elichai (Member, Author) commented Jun 12, 2019

Yes. The underlying problem is avoiding heap allocation and giving full no-std support.

In the current implementation I wrote, there's a get_preallocated_size, and the user of the lib then gives a mutable reference to a big-enough buffer, which is stored in the context struct (this prevents the user from mutating it while it's part of the context).

I just thought it would be nicer to preserve the API if we could put an upper bound on the size and allocate it on the stack, but if it's platform-specific then it'll be overly complicated, as you said.

@real-or-random (Collaborator)

Hm, I see. Offering get_preallocated_size simply pushes the problem to the user, who then can't use the heap either. The user may have a better understanding of the target platform, though. But I think this really depends on your setting.

Maybe we can convince upstream to make secp256k1_context_preallocated_size a macro that expands to a compile-time constant.

@elichai (Member, Author) commented Jun 12, 2019

Yeah, so the user might have a big scratch space available (either heap or stack) which they could use, or they might be able to do some unusual allocations themselves (see #119 for example), but it would be easier if we could solve it on the stack for the user.

If the whole size thing could be a constant, that would be nice, but I think there are some implementation difficulties, and they're hoping to just make the context itself constant and then get rid of it completely (replacing it with compilation flags for the size of the context).

@apoelstra (Member)

@elichai (Member, Author) commented Jun 12, 2019

@apoelstra So you're suggesting we implement the trait directly on the structs/enums and not through the already existing traits. That's too bad, but I guess it's all we can do.

Maybe I'll make another PR that replaces the structs with enums and adds this trait; I think it should be in a separate PR.

@apoelstra (Member)

Yeah, it kinda sucks, but I think it's actually the correct thing to do: you can imagine some future context type that has different flags but still implements the ordinary Signing and Verification traits.
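A minimal sketch of that layout, with illustrative names and flag values: the capability traits stay independent, and each concrete (empty-enum) context type picks its own flags:

```rust
pub trait Signing {}
pub trait Verification {}

pub trait Context {
    const FLAGS: u32;
}

pub enum SignOnly {}
pub enum VerifyOnly {}
pub enum All {}

impl Signing for SignOnly {}
impl Context for SignOnly { const FLAGS: u32 = 1 << 0; } // flag values are illustrative

impl Verification for VerifyOnly {}
impl Context for VerifyOnly { const FLAGS: u32 = 1 << 1; }

impl Signing for All {}
impl Verification for All {}
impl Context for All { const FLAGS: u32 = (1 << 0) | (1 << 1); }
```

A future context type with different flags can then still implement plain Signing and Verification without touching the existing impls.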

@real-or-random (Collaborator)

> Yeah, so the user might have a big scratch space available (either heap or stack) which they could use, or they might be able to do some unusual allocations themselves (see #119 for example), but it would be easier if we could solve it on the stack for the user.

OTOH this assumes that we have enough stack space and that the user wants to do this on the stack. One typical use of no_std is embedded devices, where the user may indeed want to decide in which memory the context is created. So I guess exporting the size function is okay for now.

@vhnatyk commented Jun 18, 2019

Hi @real-or-random! Given the latest changes and my yesterday's "research" 😄, I had the impression that all I need is to define a lower ECMULT_WINDOW_SIZE for the static precomputed tables. So in build.rs:58 I set .define("ECMULT_WINDOW_SIZE", Some("4")), but it seems I'm still getting precomputed tables of 1.35 MiB.

Then I even refreshed the secp256k1 dependency sources with the latest, and also tried to merge #337 (variable-sized precomputed table for signing, where you are also a reviewer) on top.

That's after both #115 and #116. And everywhere the behavior is virtually the same: a SEGFAULT at manual_alloc, which, as I see in the debugger, happens because it tries to allocate too much memory. Probably I'm missing something obvious, but I'm still unable to launch it on the MCU 😢

Here is the diff between this PR and my experimental branch.

@vhnatyk commented Jun 18, 2019

@real-or-random Oh, btw, in build.rs there is the line .debug(true) in base_config. Is that ok?

@real-or-random (Collaborator)

@vhnatyk Hm, weird. Can you open a separate issue with more background?

  • How much RAM do you have?
  • Do you need verification, signing, or both?
  • How do you know that manual_alloc is still really allocating 1.35 MB? Have you checked the output of the size function? (Maybe the reason for the segfault is entirely different.)

The debug(true) is a deliberate decision.

@vhnatyk commented Jun 18, 2019

  • I have 256 KB of RAM, and when I hardcode smaller values being passed to checked_malloc in the current master branch (though it fails a few tests, of course), I'm able to get past checked_malloc (but it fails at later steps anyway, as expected)

  • Neither verification nor signing; just point and scalar math

  • With hardcoded values, when I run the test in my app and analyze memory consumption with valgrind massif (Linux) and MTuner (Windows), it shows something like 2 MB of RAM on master and ~500 KB with hardcoded dummy array sizes (like secp256k1_ge prec[64] instead of secp256k1_ge prec[1024])

I admit that it's a bit of black magic, but my current understanding of the subject doesn't let me generate any more useful info, unfortunately 😊

build.rs (outdated)

```diff
-        .define("ENABLE_MODULE_ECDH", Some("1"));
+        .define("ENABLE_MODULE_ECDH", Some("1"))
+        .define("USE_EXTERNAL_DEFAULT_CALLBACKS", Some("1"))
+        .define("ECMULT_WINDOW_SIZE", Some("15")); // This is the default in the configure file (`auto`)
```
(Contributor)

The ECMULT window size needs to be configurable; 15 is definitely too large for smaller platforms.
(Though I don't know if Rust has any way to pass values instead of feature bits?)

@elichai (Member, Author)

I don't think there is a way in Rust to pass variables as compile-time arguments, but I'll look into it; maybe there's some hack we can use (maybe an environment variable?)
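A sketch of the environment-variable hack being floated; the variable name SECP256K1_ECMULT_WINDOW_SIZE is hypothetical, and the rest of the cc configuration is elided:

```rust
// build.rs (sketch): forward a user-set environment variable to the
// C build as a define.
fn main() {
    let window_size = std::env::var("SECP256K1_ECMULT_WINDOW_SIZE")
        .unwrap_or_else(|_| "15".to_string()); // default mirrors upstream `auto`

    let mut base_config = cc::Build::new();
    base_config.define("ECMULT_WINDOW_SIZE", Some(&window_size[..]));
    // ...remaining cc configuration and .compile(...) elided.

    // Re-run the build script when the variable changes.
    println!("cargo:rerun-if-env-changed=SECP256K1_ECMULT_WINDOW_SIZE");
}
```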

@elichai (Member, Author) commented Jul 3, 2019

So after talking with apoelstra, it seems like the right way would be to split this PR in two:

  1. Remove all free/malloc and manage the memory ourselves.
  2. Enable a way for no-std users to supply their own memory buffer.

I'll implement that after #115 gets merged.
(Maybe even another PR before that, to redo the trait flags stuff.)

```diff
     ctx: *mut ffi::Context,
-    phantom: PhantomData<C>
+    phantom: PhantomData<C>,
+    _buf: Option<&'buf [u8]>,
```
(Member)

I think unconditionally adding 'buf to the public API will make things too unergonomic for users who aren't providing their own buffer.

I think it'd be better to add a new set of context types, SignOnlyBuffer<'buf>, VerifyOnlyBuffer<'buf>, and AllBuffer<'buf>, which tie the lifetime of the passed-in buffer to the context. Then let phantom deal with the borrowck and change the _buf field into buf: Option<*mut u8>.
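A rough sketch of that shape; the type names come from the comment above, while the RawContext stub and the exact field layout are assumptions:

```rust
use core::marker::PhantomData;

enum RawContext {} // stand-in for the opaque ffi::Context

// Only the preallocated variants carry the buffer lifetime. The
// phantom borrow keeps the buffer locked for borrowck, while the raw
// pointer is what actually crosses the FFI boundary.
pub struct SignOnlyBuffer<'buf> {
    ctx: *mut RawContext,
    buf: Option<*mut u8>,
    phantom: PhantomData<&'buf mut [u8]>,
}

// VerifyOnlyBuffer<'buf> and AllBuffer<'buf> would look the same;
// the plain Secp256k1<C> types stay lifetime-free.
```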

@elichai (Member, Author)

Hmm, you prefer that no-std users get a different Context struct? (If you provide no buffer you shouldn't feel the new lifetime at all as a user; for the library maintainers, you're right that it's unergonomic.)

@elichai (Member, Author)

If we do go in that direction, we'll need to abstract out the Context so that the sign/verify functions won't be implemented on the struct itself but generically over a trait (adding something like get_ptr() to the traits).
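Something like the following sketch, where the trait name and get_ptr are the names floated here, not an existing API:

```rust
enum RawContext {} // stand-in for the opaque ffi::Context

pub trait ContextHolder {
    fn get_ptr(&self) -> *const RawContext;
}

// One generic function serves both the owned and the preallocated
// context types instead of being implemented on each struct.
pub fn sign<C: ContextHolder>(secp: &C, _msg: &[u8; 32]) {
    let _ctx = secp.get_ptr();
    // ...would call ffi::secp256k1_ecdsa_sign(ctx, ...)
}
```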

@elichai (Member, Author)

The nice thing about having &[u8] is that it's a fat pointer, which makes it easier to clone (no need to re-check the size).

(Member)

At the very least having 'buf in the API makes the documentation confusing, since users will see a lifetime but have no idea what it's for or how to use it.

I'm pretty sure sign/verify are already abstracted over the context, since e.g. both SignOnly and All have signing functionality.

```diff
         Secp256k1 {
             ctx: unsafe { ffi::secp256k1_context_clone(self.ctx) },
-            phantom: self.phantom
+            phantom: self.phantom,
+            _buf: None,
         }
     }
 }
```
(Member)

...then I'd implement Clone only for the non-Buffer versions of Secp256k1.

```diff
-    pub fn new() -> Secp256k1<All> {
-        Secp256k1 { ctx: unsafe { ffi::secp256k1_context_create(ffi::SECP256K1_START_SIGN | ffi::SECP256K1_START_VERIFY) }, phantom: PhantomData }
+    pub fn new() -> Secp256k1<'buf, All> {
+        Secp256k1 { ctx: unsafe { ffi::secp256k1_context_create(Self::FLAGS) }, phantom: PhantomData, _buf: None }
```
(Member)

As discussed on IRC, I'd like to allocate our own memory in Rust and then call the prealloc functions.
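A sketch of that direction, assuming ffi bindings that mirror upstream's secp256k1_context_preallocated_size/_create (declared inline here for illustration; Drop/destroy handling elided):

```rust
use std::os::raw::{c_uint, c_void};

#[repr(C)]
pub struct Context { _private: [u8; 0] } // opaque FFI type

extern "C" {
    fn secp256k1_context_preallocated_size(flags: c_uint) -> usize;
    fn secp256k1_context_preallocated_create(prealloc: *mut c_void, flags: c_uint) -> *mut Context;
}

pub struct Secp256k1 {
    ctx: *mut Context,
    _buf: Box<[u8]>, // Rust owns the context memory; no C-side malloc
}

impl Secp256k1 {
    pub fn new(flags: u32) -> Secp256k1 {
        unsafe {
            // Ask C how much memory the context needs, allocate it
            // from Rust, then build the context inside that buffer.
            let size = secp256k1_context_preallocated_size(flags);
            let mut buf = vec![0u8; size].into_boxed_slice();
            let ctx = secp256k1_context_preallocated_create(buf.as_mut_ptr() as *mut c_void, flags);
            Secp256k1 { ctx, _buf: buf }
        }
    }
}
```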

@elichai (Member, Author) commented Jul 4, 2019

So, a couple of design decisions we need to make:

  1. How do we differentiate our allocations from user allocations? Different structs, or some lifetime inclusion (via PhantomData or a slice) in the regular Context struct?
    If we do separate the structs, how do we abstract out the sign/verify functions? Have a trait that returns an ffi::Context pointer?

  2. If we need to clone ourselves, how do we measure the size we need to allocate? I think this can be solved with something like #116 (comment).

If we move the flags into a trait and create a different struct for no-std, that's 2 new traits and a struct; are we ok with this? @apoelstra, what are your thoughts?

@apoelstra (Member)

  1. I'd like to use different structs; it makes things more robust against future changes.
  2. Yeah, we can use Self::get_preallocated_size() (so we'd need to rename that function to have the same name for all types).

I don't understand what you mean by "move the flags into a trait". Can you be more explicit?

Also, while you're messing with the structs, can you make them all empty enums? (#117)

@elichai (Member, Author) commented Jul 4, 2019

Closed in favor of #125.

@elichai closed this Jul 5, 2019