Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sys/fmt: optimize scn_u32_hex() #19894

Closed
wants to merge 1 commit into from

Conversation

benpicco
Copy link
Contributor

Contribution description

We can save a branch by always converting to lowercase first.

Testing procedure

tests-fmt in tests/unittests covers this.

Issues/PRs references

@benpicco benpicco requested a review from kfessel August 22, 2023 09:39
@github-actions github-actions bot added the Area: sys Area: System label Aug 22, 2023
@benpicco benpicco added Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation Impact: minor The PR is small in size and might only require a quick look of a knowledgeable reviewer CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Aug 22, 2023
@riot-ci
Copy link

riot-ci commented Aug 22, 2023

Murdock results

✔️ PASSED

53f6b9a sys/fmt: optimize scn_u32_hex()

Success Failures Total Runtime
10009 0 10009 06m:41s

Artifacts

Copy link
Contributor

@kfessel kfessel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR has an extra call to strnlen, does not check for out of hex range chars like c > 'f' or control char, fixing that will likely make this less efficient then our current impl.

check godbolt

@kfessel
Copy link
Contributor

kfessel commented Aug 22, 2023

godlblot cortex m0+ including a version 3

version 3 might be an improved version of this but I am not sure it covers all issues of this PR

scn_u32_hex("9:3k",4) == 9
scn_u32_hex2("9:3k",4) == 39476
scn_u32_hex3("9:3k",4) == 2467

No it does not

cortex m0+
this pr pushes 4 regs to stack ( not counting strnlen)
versions 3 pushes 3
current RIOT pushes 2

cortex m4
this pr 2regs to stack ( not counting strnlen)
version 3 none
current none

godbolt excutable

@kfessel
Copy link
Contributor

kfessel commented Aug 22, 2023

one might want to look at #19027

@Teufelchen1
Copy link
Contributor

As I stated in PR #19027 (comment), the unittest for the fmt module are lacking. Your changes break the API promised in fmt.h. Sadly, our unittest don't catch that.

@kfessel
Copy link
Contributor

kfessel commented Oct 16, 2023

@Teufelchen1 : You probably already got something in mind would you want to make that a PR?
In case, please link this and the other (my) PR for reference

@benpicco benpicco added the State: invalid State: The issue / PR is not valid label Jan 17, 2024
@Teufelchen1
Copy link
Contributor

Hey hey, the new test just got merged. It will fail with the current implementation in this PR.
Do you want to adapt or close this?

@benpicco benpicco force-pushed the scn_u32_hex-opt branch 2 times, most recently from 0d4160e to b0b8167 Compare March 12, 2024 12:56
@benpicco benpicco removed the State: invalid State: The issue / PR is not valid label Mar 12, 2024
Teufelchen1
Teufelchen1 previously approved these changes Mar 12, 2024
@Teufelchen1 Teufelchen1 added this pull request to the merge queue Mar 12, 2024
Copy link
Contributor

@kfessel kfessel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uses more stack

@kfessel kfessel removed this pull request from the merge queue due to a manual request Mar 12, 2024
@benpicco
Copy link
Contributor Author

uses more stack

How?

@kfessel
Copy link
Contributor

kfessel commented Mar 12, 2024

uses more stack

How?

seem like it needs an intermediate for the case-conversion

@kfessel
Copy link
Contributor

kfessel commented Mar 12, 2024

https://godbolt.org/z/dz3c6d1oh

for your experiments

it has the old experiments (frome the previos comment) on top

@kfessel
Copy link
Contributor

kfessel commented Mar 12, 2024

ah and the test ist still incomplete

@kfessel
Copy link
Contributor

kfessel commented Mar 12, 2024

https://godbolt.org/z/KTYWf4v66

execute on x86 godlbolt reveals this PR will return 2467 if asked for scn_u32_hex2("9:3k",5)

i am not sure how it does that

now i know : to ? are right behind 0-9 so they are alternative symbols for counting to f and the| 32 doesn't take a hit on them

0123456789:;<=>?

@benpicco
Copy link
Contributor Author

Yea I guess it's mostly obfuscation at this point…

@kfessel
Copy link
Contributor

kfessel commented Mar 13, 2024

It's now even worse (more stack and more assembly, and the code is barely readable)

https://godbolt.org/z/zMWGKWefT

Do you really want to try to optimize this function? (I might be biased because i wrote the current implementation #19027 the pr also has a link to godbolt where you can find versions of this functions that are smaller on stack and in code but are also less safe ( one of them is probably also what this pr once was) )

If you do please provide some kind of proof for: "this is better"

I like godbolt gcc-arm a current version -O(2,z or 3) -mcpu=cortex-m0plus -mfloat-abi=soft or -O(2,z,3) -mcpu=cortex-m4 for reading
with a main function and an x86-64 you can also have it run something print or compare (know good to new) and print test cases in arrays iterate across ranges ... it is realy nice

@kfessel kfessel dismissed Teufelchen1’s stale review March 13, 2024 11:31

lets not acciedentily merge this

@Teufelchen1
Copy link
Contributor

I am with @kfessel here. Closing this PR seems to be the best way to handle this? Even if we find an implementation that produces perfect out put for a given compiler: There are so many variables. So far we only looked at arm but RIOT supports more ISAs. How do their compilations compare? What is about run-time performance? etc. etc.

@Teufelchen1
Copy link
Contributor

In consent with benpicco: Closed, not worth the extra time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: sys Area: System CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Impact: minor The PR is small in size and might only require a quick look of a knowledgeable reviewer Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants