boblight: reduce cpu time spent on memcopy and parsing rgb values #1016

Merged: 6 commits, Oct 18, 2020

Conversation

The-Master777
Contributor

The number of memory allocations in BoblightClientConnection is drastically reduced.

BoblightClientConnection used to perform a huge number of memory-copy operations, mostly involving read-only string data, which led to high CPU usage. QStringRef is now used to store tokenized Boblight messages instead of creating read-only copies of those strings.
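The idea of tokenizing without copying can be sketched in plain C++17. This is not the PR's code: `std::string_view` stands in for QStringRef, `splitViews` and `usageExample` are hypothetical names, and the sample message is only an assumed Boblight-style line.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Split `line` on `sep` without copying any characters. Each token is a
// view into the caller's buffer, so that buffer must outlive the result.
// Empty tokens (from repeated separators) are skipped.
std::vector<std::string_view> splitViews(std::string_view line, char sep = ' ')
{
    std::vector<std::string_view> tokens;
    std::size_t start = 0;
    while (start < line.size()) {
        const std::size_t end = line.find(sep, start);
        const std::size_t stop = (end == std::string_view::npos) ? line.size() : end;
        if (stop > start) {
            tokens.push_back(line.substr(start, stop - start));
        }
        start = stop + 1;
    }
    return tokens;
}

// Hypothetical usage: the tokens alias the original message.
void usageExample()
{
    const std::string_view msg = "set light 12 rgb 1.0 0.5 0";
    const auto tokens = splitViews(msg);
    assert(tokens.size() == 7);
    assert(tokens[0] == "set" && tokens[3] == "rgb");
    // The fifth token points into msg itself; no characters were copied.
    assert(tokens[4].data() == msg.data() + 17);
}
```

The vector holding the views still allocates, so a sketch like this reduces per-token string copies rather than removing every allocation.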

The parsing of RGB values (given as float strings) is optimized to be allocation-free. This is achieved using a native (and thus performant) implementation based on QChar character values and mostly integer arithmetic.
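A rough sketch of what allocation-free parsing with integer arithmetic can look like follows. It is an illustration only, not the PR's parseFloat: it works on `std::string_view` instead of QChar data, `parseUnitFloat` and `parseExample` are hypothetical names, and it assumes non-negative input without sign, exponent, or locale handling.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Parse a non-negative decimal such as "0.75" or "1" from `s`.
// Digits are accumulated in an integer; a single float division
// happens at the end. Returns false on malformed input.
bool parseUnitFloat(std::string_view s, float& out)
{
    std::uint32_t digits = 0;  // all accepted digits, read as one integer
    std::uint32_t scale = 1;   // 10^(number of accepted fractional digits)
    bool seenDot = false;
    bool seenDigit = false;

    for (const char c : s) {
        if (c == '.') {
            if (seenDot) return false;       // a second dot is malformed
            seenDot = true;
        } else if (c >= '0' && c <= '9') {
            seenDigit = true;
            // Stop accumulating before either counter could overflow.
            const bool full = digits > 99999999u || scale > 100000000u;
            if (full) {
                if (!seenDot) return false;  // integer part too large
                continue;                    // drop excess fractional digits
            }
            digits = digits * 10u + static_cast<std::uint32_t>(c - '0');
            if (seenDot) scale *= 10u;
        } else {
            return false;                    // unexpected character
        }
    }
    if (!seenDigit) return false;
    out = static_cast<float>(digits) / static_cast<float>(scale);
    return true;
}

// Hypothetical usage with values that are exactly representable.
void parseExample()
{
    float v = 0.0f;
    assert(parseUnitFloat("0.5", v) && v == 0.5f);
    assert(parseUnitFloat("1.000000", v) && v == 1.0f);
    assert(!parseUnitFloat("1.2.3", v));
}
```

Keeping the per-character work in integers means no temporary string or buffer is needed; the only floating-point operation is the final division.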

What kind of change does this PR introduce? (check at least one)

  • Bugfix
  • Feature
  • Code style update
  • Refactor
  • Docs
  • Build-related changes
  • Other, please describe:

If changing the UI of web configuration, please provide the before/after screenshot:

Does this PR introduce a breaking change? (check one)

  • Yes
  • No

If yes, please describe the impact and migration path for existing setups:

The PR fulfills these requirements:

  • When resolving a specific issue, it's referenced in the PR's body (e.g. Fixes: #xxx[,#xxx], where "xxx" is the issue number)

If adding a new feature, the PR's description includes:

  • A convincing reason for adding this feature
  • Related documents have been updated (docs/docs/en)
  • Related tests have been updated

PLEASE DON'T FORGET TO ADD YOUR CHANGES TO CHANGELOG.MD

  • Yes, CHANGELOG.md is also updated

To avoid wasting your time, it's best to open a feature request issue first and wait for approval before working on it.

Other information:

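Benchmarks show that the share of CPU cycles consumed by memory allocation and parsing drops significantly, freeing valuable system resources: BoblightClientConnection::readData (including children) falls from 50.92% to 29.78%, and BoblightClientConnection::handleMessage from 36.55% to 16.29%. Measured on a Raspberry Pi 4 with perf on Debian AArch64, using 10 seconds of real-life Boblight data for 600 LEDs.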

Performance with refactoring:

  Children      Self  Command         Shared Ob  Symbol
+   29.78%     0.50%  HyperionThread  hyperiond  [.] BoblightClientConnection::readData      
-   16.29%     1.62%  HyperionThread  hyperiond  [.] BoblightClientConnection::handleMessage 
   - 14.67% BoblightClientConnection::handleMessage                                          
      + 6.92% QString::splitRef                                                              
      + 3.08% Hyperion::setInput                                                             
      + 1.27% BoblightClientConnection::parseByte                                            
        0.98% QString::compare_helper                                                        
        0.58% QStringRef::operator==                                                         
   + 1.62% thread_start                                                                      
+    1.32%     0.14%  HyperionThread  hyperiond  [.] BoblightClientConnection::parseByte     
+    1.20%     1.18%  HyperionThread  hyperiond  [.] BoblightClientConnection::parseFloat    
+    0.73%     0.11%  HyperionThread  hyperiond  [.] BoblightClientConnection::readMessage   
     0.25%     0.25%  HyperionThread  hyperiond  [.] BoblightClientConnection::parseUInt     

Performance baseline:

  Children      Self  Command         Shared Ob  Symbol
-   50.92%     0.95%  HyperionThread  hyperiond  [.] BoblightClientConnection::readData      
   - 49.97% BoblightClientConnection::readData                                               
      + 36.55% BoblightClientConnection::handleMessage                                       
      + 9.93% QByteArray::mid                                                                
        0.76% _int_free                                                                      
        0.76% QString::fromLatin1_helper                                                     
   + 0.95% thread_start                                                                      
-   36.55%     0.98%  HyperionThread  hyperiond  [.] BoblightClientConnection::handleMessage 
   - 35.57% BoblightClientConnection::handleMessage                                          
      + 11.61% QString::split                                                                
      + 10.78% QString::toFloat                                                              
      + 2.89% QList<QString>::~QList                                                         
      + 2.23% Hyperion::setInput                                                             
      + 1.60% QString::fromAscii_helper                                                      
      + 1.57% QString::toUInt                                                                
        0.78% QString::compare_helper                                                        
        0.74% _int_free

@hyperion-project

Hello @The-Master777 👋

I'm your friendly neighborhood bot and would like to say thank you for
submitting a pull request to Hyperion!

So that you and other users can test your changes more quickly,
you can find your workflow artifacts here.

If you make changes to your PR, I will create a new link to your workflow artifacts.

Best regards,
Hyperion-Project

@hyperion-project

Here is your new link to your workflow artifacts.

@Lord-Grey Lord-Grey self-requested a review September 24, 2020 05:47
#endif

// Clamp to byte range 0 to 255
return static_cast<uint8_t>(std::max(LO, std::min(HI, int(HI * d))));
Member

qMax and qMin would be the Qt alternatives?

Contributor Author

@The-Master777 The-Master777 Oct 5, 2020

According to the Qt documentation, qMax/qMin should be applicable as well.
When using Qt functions, qBound is probably the out-of-the-box solution for clamping the values to the 0..255 range.

Edit: Addressed this in e60e7f3
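For comparison, here is a minimal sketch of the clamp discussed above, written with the C++17 standard library so it stands alone. `toByte` is a hypothetical name and this is not the code from the referenced commit. Note the argument order: `std::clamp(value, lo, hi)`, whereas Qt's `qBound(min, value, max)` puts the value in the middle.

```cpp
#include <algorithm>
#include <cstdint>

// Map a channel value (nominally 0.0 to 1.0) to a byte, clamping
// out-of-range results to 0..255.
// Assumption: `d` is finite and 255 * d fits in an int; converting NaN
// or a huge value to int would be undefined behavior.
std::uint8_t toByte(float d)
{
    constexpr int LO = 0;
    constexpr int HI = 255;
    // Scale first, truncate toward zero, then clamp the integer result.
    const int scaled = static_cast<int>(HI * d);
    return static_cast<std::uint8_t>(std::clamp(scaled, LO, HI));
}
```

With this sketch, an input of 0.5 maps to 127 (the conversion truncates 127.5), while inputs above 1.0 or below 0.0 saturate at 255 and 0.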

@Paulchen-Panther
Member

Thank you for your contribution to Hyperion.

@Paulchen-Panther Paulchen-Panther merged commit c09061e into hyperion-project:master Oct 18, 2020