Add opt_size feature for embedded environments. #2

Merged on Mar 23, 2021 (1 commit)

cr1901
Contributor

cr1901 commented Mar 23, 2021

Based on discussions in a separate repo, I thought it would be fun to add size optimizations to this crate.
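One common way to make this kind of optimization optional is to gate the inline hints on a Cargo feature. The sketch below assumes a feature named `opt_size` (the name in this PR's title); the `rotr` and `big_sigma0` helpers are illustrative only and are not taken from the crate's source.

```rust
// Sketch only: `opt_size` is the feature name from this PR; the functions
// below are illustrative, not the crate's actual code.

// Without `opt_size`, force inlining for speed. With it, ask the compiler
// not to inline, so every call site shares one copy of the body.
#[cfg_attr(not(feature = "opt_size"), inline(always))]
#[cfg_attr(feature = "opt_size", inline(never))]
fn rotr(x: u32, n: u32) -> u32 {
    x.rotate_right(n)
}

#[cfg_attr(not(feature = "opt_size"), inline(always))]
#[cfg_attr(feature = "opt_size", inline(never))]
fn big_sigma0(x: u32) -> u32 {
    // SHA-256 "big sigma 0": ROTR(2) ^ ROTR(13) ^ ROTR(22)
    rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)
}

fn main() {
    println!("{:#010x}", big_sigma0(1));
}
```

The result is the same either way; only the code the compiler emits differs, which is where the size/speed trade-off comes from.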

Benchmarks

Test code:

#![no_main]
#![no_std]
#![feature(abi_msp430_interrupt)]

extern crate panic_msp430;

use msp430::asm;
use msp430_rt::entry;
use msp430f5529::interrupt;
use hmac_sha256::HMAC;

// P0 = red LED
// P6 = green LED
#[entry]
fn main() -> ! {
    let p = msp430f5529::Peripherals::take().unwrap();

    // Disable watchdog
    let wd = p.WATCHDOG_TIMER;
    wd.wdtctl.write(|w| {
        unsafe { w.bits(0x5A00) } // password
        .wdthold().set_bit()
    });

    let p12 = p.PORT_1_2;
    let p34 = p.PORT_3_4;

    // Configure P1.0 as an output, then pulse it high and low.
    p12.p1dir
       .modify(|_, w| w.p1dir0().set_bit());
    p12.p1out
       .modify(|_, w| w.p1out0().set_bit());
    p12.p1out
       .modify(|_, w| w.p1out0().clear_bit());

    let h = HMAC::mac(&[], &[0u8; 32]);
    assert_eq!(
        &h[..],
        &[
            182, 19, 103, 154, 8, 20, 217, 236, 119, 47, 149, 215, 120, 195, 95, 197, 255, 22, 151,
            196, 147, 113, 86, 83, 198, 199, 18, 20, 66, 146, 197, 173
        ]
    );

    // Raise P1.0 again once the HMAC check has passed.
    p12.p1out
       .modify(|_, w| w.p1out0().set_bit());

    loop {
        asm::nop();
    }
}


#[interrupt]
fn TIMER0_A0() {

}

With opt_size disabled (msp430), it takes ~155 ms to get to the infinite loop:
[image: logic analyzer trace]

Crate size is reasonable:

$ msp430-elf-size target/msp430-none-elf/release/examples/blinky
   text    data     bss     dec     hex filename
  17970       0       2   17972    4634 target/msp430-none-elf/release/examples/blinky

With opt_size enabled (msp430), it takes ~180 ms to get to the infinite loop:
[image: logic analyzer trace]

The crate size drops drastically, from 17970 to 4418 bytes of text:

$ msp430-elf-size target/msp430-none-elf/release/examples/blinky
   text    data     bss     dec     hex filename
   4418       0       2    4420    1144 target/msp430-none-elf/release/examples/blinky
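To put the two measurements side by side, the relative changes can be worked out from the numbers above (17970 vs 4418 bytes of text, ~155 ms vs ~180 ms). This is a small host-side sketch; `percent_change` is a made-up helper name, and the timings are approximate readings, so treat the run-time figure as a rough estimate.

```rust
/// Percentage change from `before` to `after` (negative means a reduction).
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    let size = percent_change(17970.0, 4418.0);
    let time = percent_change(155.0, 180.0);
    println!("text size: {:+.1}%", size); // text size: -75.4%
    println!("run time:  {:+.1}%", time); // run time:  +16.1%
}
```

So on this test the text segment shrinks by roughly three quarters for about a sixth more run time.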

@jedisct1
Owner

This is a good idea, thank you!

For consistency, could you possibly submit the same change to the hmac-sha512 version?

Thank you!

jedisct1 merged commit d2b5778 into jedisct1:master Mar 23, 2021
@jedisct1
Owner

New version published to crates.io.

@jedisct1
Owner

Do you think ed25519-compact could also benefit from de-inlining? Mind running a benchmark on the same platform?

@cr1901
Contributor Author

cr1901 commented Mar 23, 2021

@jedisct1 I used Logic Analyzer traces to do benchmarks because it was, simply put, easier than setting up a microcontroller timer lmao. I put the LA away for now, but I'll do more benchmarks in a few hours.

@cr1901
Contributor Author

cr1901 commented Mar 24, 2021

@jedisct1 I got a bit carried away and made a benchmarker for MSP430 crates (something I've been meaning to do for a while, in all honesty):

[image: benchmarker output]

hmac_sha512 also works fine. When both hmac_sha256 and hmac_sha512 are present w/ opt_size enabled:

$ msp430-elf-size target/msp430-none-elf/release/examples/bench
   text    data     bss     dec     hex filename
  11248       0       2   11250    2bf2 target/msp430-none-elf/release/examples/bench

> Do you think ed25519-compact could also benefit from de-inlining? Mind running a benchmark on the same platform?

It probably will benefit, but I can't compile getrandom for MSP430 at the moment (not supported). Additionally, I get atomic-related errors, and MSP430 doesn't meaningfully support atomics [1].

[1] bis and bic are atomic, but can only implement atomic bools.
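For readers unfamiliar with the footnote: `bis` (bit set) and `bic` (bit clear) each perform a read-modify-write on memory in a single instruction, which is enough to set or clear a flag but not, for example, to fetch-and-add a counter. The host-side sketch below models just those two operations on a plain byte. `Flag`, `set`, `clear`, and `get` are hypothetical names, and the bit operations stand in for the MSP430 instructions rather than emitting them.

```rust
// Host-side model only: on MSP430 `set` and `clear` would each correspond
// to a single `bis` / `bic` instruction; here they are ordinary bit ops.
struct Flag {
    bits: u8,
}

impl Flag {
    const MASK: u8 = 0x01;

    fn new() -> Self {
        Flag { bits: 0 }
    }

    // Models `bis`: OR the mask into the destination.
    fn set(&mut self) {
        self.bits |= Self::MASK;
    }

    // Models `bic`: AND the destination with the inverted mask.
    fn clear(&mut self) {
        self.bits &= !Self::MASK;
    }

    fn get(&self) -> bool {
        self.bits & Self::MASK != 0
    }
}

fn main() {
    let mut f = Flag::new();
    f.set();
    assert!(f.get());
    f.clear();
    assert!(!f.get());
}
```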

@cr1901
Contributor Author

cr1901 commented Feb 19, 2022

@jedisct1 Would you still be interested in me doing benchmarking for ed25519-compact? getrandom nowadays has a hook for installing a custom RNG. I would be providing a fake RNG (just an incrementing counter); would a fake RNG for the sake of testing affect the size/speed/quality of results? If so, there may be an alternative...
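A fake RNG of the kind described here can be very small. The sketch below is a host-side illustration with hypothetical names (`CounterRng`, `fill`); it does not show how the generator would be registered with the RNG crate's custom hook, and it is only suitable for testing and benchmarking.

```rust
/// Deterministic "RNG" for benchmarking only: yields 0, 1, 2, ...
/// and wraps around after 255.
struct CounterRng {
    next: u8,
}

impl CounterRng {
    fn new() -> Self {
        CounterRng { next: 0 }
    }

    /// Fill `buf` with consecutive counter values.
    fn fill(&mut self, buf: &mut [u8]) {
        for b in buf.iter_mut() {
            *b = self.next;
            self.next = self.next.wrapping_add(1);
        }
    }
}

fn main() {
    let mut rng = CounterRng::new();
    let mut seed = [0u8; 4];
    rng.fill(&mut seed);
    println!("{:?}", seed); // [0, 1, 2, 3]
    rng.fill(&mut seed);
    println!("{:?}", seed); // [4, 5, 6, 7]
}
```

Because the output is fully predictable, every run takes the same code path, which keeps timing measurements repeatable.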

@jedisct1
Owner

Sure, that can be useful!

@cr1901
Contributor Author

cr1901 commented Feb 19, 2022

> Sure, that can be useful!

Okay, that'll be one of this weekend's projects. If you're interested in the code, I've uploaded it as well here. It does not work right now; making it work is another of this weekend's projects :)!
