Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HID-Parser perfomance improvements #4894

Merged
merged 19 commits into from
Mar 22, 2023

Conversation

JoergAtGithub
Copy link
Member

@JoergAtGithub JoergAtGithub commented Aug 14, 2022

This is based on PR #4718

The function HIDController.getOutputField contained some calculation intensive nested for loops. Since HIDController.getOutputField is an often executed function, this had a severe performance impact, especially for HID controllers with long reports.
This PR replaces this nested loops by a lookup map.
In practice the execution time of some CO callbacks for my Traktor Z2 dropped from ~6ms to 1.2ms.
Additionl loop optimization brings it further down to 755ms

@JoergAtGithub
Copy link
Member Author

This is Ready For Review, but depends on #4718. Therefore it's in state Draft.

@JoergAtGithub JoergAtGithub force-pushed the hid_parser_perfomance_improvement branch from 389ce1a to 22cf892 Compare October 11, 2022 18:29
@JoergAtGithub JoergAtGithub marked this pull request as ready for review November 20, 2022 08:42
@JoergAtGithub JoergAtGithub changed the base branch from main to 2.4 January 12, 2023 07:07
…efault undefined is handled in the code already. In existing mappings this argument is seldom used)
@Swiftb0y
Copy link
Member

Qt uses JavaScriptCore as its interpreter and JIT. These seem to be some resources on learning JSC internals:
https://trac.webkit.org/wiki/JavaScriptCore
https://zon8.re/posts/jsc-internals-part1-tracing-js-source-to-bytecode/ (5 part series)

That said, IMO discussing javascript performance characteristics of semantically equivalent language constructs is a lost cause. No matter the language, the first thing to look at when optimizing code are the algorithms, something this code is clearly not very good at (see many crude nested searches present). Replacing the worst case offender with a hashmap was definitely a good step. Anymore probably requires a redesign anyways.

@JoergAtGithub
Copy link
Member Author

I benchmarked different loop variants using the most calculation demanding CO of my Z2 mapping (changing the values of the 7segment display which means 21 7bit fields changed at each call of this CO).

I used my two Windows laptops, and build this PR locally on each of them:

  1. Old Dual-Core Laptop with Windows 7 an VS2019
  2. Fast Octal-Core Laptop with Windows11 and VS2022

I used three versions of common-hid-packet-parser.js containing all optimizations of this PR, but differ in the loop implementation:

  1. Loop implementation of this PR (while loop decrementing from length of array of object keys)
  2. Original loop implementation (for in loop over objects)
  3. Loop implementation adapted from HID-Parser perfomance improvements #4894 (comment) (for of loop over array of object keys)

The results are very good repeatable, but the std. deviation was much higher on the old dual-core laptop. Most propably, because other threads needed CPU time.

                                     Windows 11 (fast Octal-Core)     Windows 7 (old Dual-Core)

1. Loop implementation of this PR    530us                            659us
2. Original loop implementation      660us                            920us
3. for/of Loop implementation        645us                            1052us

@JoergAtGithub JoergAtGithub requested a review from Swiftb0y March 19, 2023 18:17
Copy link
Member

@daschuer daschuer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good. However I am not able to test it.
In this case I think we can rely on @JoergAtGithub test results with the Tractor Z2.

I have just left request to document the performance implications in the code. @Swiftb0y is it than read for merge?

@Swiftb0y
Copy link
Member

Swiftb0y commented Mar 20, 2023

Have you measured the performance when you abstract the loop into a function? Something like this:

function FastForIn(object, body) {
    const objKeyArray = Object.keys(object);
    let objKeyArrayIdx = objKeyArray.length;
    while (objKeyArrayIdx--) {
        body(objKeyArray[objKeyArrayIdx]);
    }
}

@JoergAtGithub
Copy link
Member Author

I tried your "abstract the loop into a function" approach an the performance seems to be the same as my version with the while loop.

But I couldn't apply it everywhere, because the following loop contains continue and return statements. continue becomes return for the loop function, but implementing a replacement for return would require much boilerplate.

const InputPacketsKeyArr = Object.keys(this.InputPackets);
let InputPacketsIdx = InputPacketsKeyArr.length;
while (InputPacketsIdx--) {
/** @type {HIDPacket} */
let packet = this.InputPackets[InputPacketsKeyArr[InputPacketsIdx]];
// When the device uses ReportIDs to enumerate the reports, hidapi
// prepends the report ID to the data sent to Mixxx. If the device
// only has a single report type, the HIDPacket constructor sets the
// reportId as 0. In this case, hidapi only sends the data of the
// report to Mixxx without a report ID.
if (packet.reportId !== 0 && packet.reportId !== data[0]) {
continue;
}
if (packet.header !== undefined) {
for (let header_byte = 0; header_byte < packet.header.length; header_byte++) {
if (packet.header[header_byte] !== data[header_byte]) {
packet = undefined;
break;
}
}
if (packet === undefined) {
continue;
}
}
const changed_data = packet.parse(data);
if (packet.callback !== undefined) {
packet.callback(packet, changed_data);
return;
}

@Swiftb0y
Copy link
Member

Thank you for investigating. In that case, I'd prefer if the abstracted version was used where possible and if not, the "inlined" version is fine. wdyt?

@JoergAtGithub
Copy link
Member Author

I prepared a version "abstract the loop into a function" and tested it before pushing it to this PR JoergAtGithub@bded951 . On the Windows11 Octal-Core I got the same execution time as before, but on the Windows7 Dual-Core is was significiantly slower. I repeated the measurements several times:

                                       Windows 11 (fast Octal-Core)     Windows 7 (old Dual-Core)

4. Abstract the loop into a function   530us                            755us

@Swiftb0y
Copy link
Member

IMO thats still good enough. In the end, it doesn't make sense to micro-optimize loops in JS. We should instead optimize for readability and maintainability. IMO the loop in the function is the best of both worlds. Do you agree?

@JoergAtGithub
Copy link
Member Author

Done!

Copy link
Member

@Swiftb0y Swiftb0y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. LGTM

@Swiftb0y
Copy link
Member

There are still unresolved comments by @daschuer

Copy link
Member

@daschuer daschuer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thank you very much.

@daschuer daschuer merged commit a9d5787 into mixxxdj:2.4 Mar 22, 2023
@JoergAtGithub JoergAtGithub deleted the hid_parser_perfomance_improvement branch March 22, 2023 23:39
@JoergAtGithub JoergAtGithub added the changelog This PR should be included in the changelog label Mar 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog This PR should be included in the changelog code quality controller mappings performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants