Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RecursiveOpenStructs are heavy & slow #251

Closed
cben opened this issue Jun 7, 2017 · 9 comments
Closed

RecursiveOpenStructs are heavy & slow #251

cben opened this issue Jun 7, 2017 · 9 comments

Comments

@cben
Copy link
Collaborator

cben commented Jun 7, 2017

[don't have time to measure soon but wanted to braindump... cc @ilackarms @simon3z]

Kubeclient wraps all parsed JSONs with RecursiveOpenStructs. ROS does several expensive things:
https://github.com/aetherknight/recursive-open-struct/blob/v1.0.4/lib/recursive_open_struct.rb#L69-L111
https://github.com/ruby/ruby/blob/v2_4_1/lib/ostruct.rb

  • on construction: deep dup the hash — can be mitigated by setting :mutate_input_hash, safe our case.

    • it also keeps a DeepDup instance around, one per ROS object afaict.
  • When you access fields as methods (result.foo.bar): method_missing call, creating a singleton class for each object, and defining 3 methods foo, foo=, foo_as_a_hash (per key)! The methods also keep a closure to reference vars from new_ostruct_member. Then it proceeds to [:foo] access.

    • I've seen with a sampling profiler in manageiq code that walks & accesses most of the data parsed from kubernetes, that 30%~80% CPU gets spent in new_ostruct_member! (we don't currently care about the CPU but other kubeclient users might)
    • I suspect allocating singleton classes and 3 methods + closure per field wastes tons of RAM.
    • Creating the methods is premature optimization to save CPU but is wasteful for once-or-twice usage patterns.
  • When you access as hash (result[:foo][:bar]): recursively contstructs ROS for the children. It's on-demand and memoized but as you walk the data you'll get a tree of ROSs parallel to the original hashes...

  • on .to_h to just get the underlying hash: deep dups it — can maybe be cheaper with :preserve_original_keys?
    Anyway, result.to_h looks like cheapest access option!

I have several ideas to try making upstream ROS & OS faster.

@moolitayer
Copy link
Collaborator

Maybe we need an option that tells kubeclient to return JSON instead of ROS?

@moolitayer
Copy link
Collaborator

BTW since that would require a large code change in our usage code (even if it's mostly cosmetic) we need some benchmarks

@ilackarms
Copy link

can we test the actual size of the things we're returning from kubeclient with ObjectSpace.memsize_of ?

@cben
Copy link
Collaborator Author

cben commented Jun 14, 2017

@moolitayer
Copy link
Collaborator

Sorry for repeating... I've never understood why we need ROS.
pod.name instead of pod["name"]?? while lots of complicated stuff is happening behind the scenes to enable it

@simon3z
Copy link
Collaborator

simon3z commented Jun 15, 2017

Sorry for repeating... I've never understood why we need ROS.

@moolitayer me neither. It's also problematic for those keys that have special symbols (e.g. -).
For me it's a 👍 to drop ROS, although we should carefully pick and balance what we're working on to improve performance, I have the feeling that the big part is still on the MIQ db side in our case.

@grosser
Copy link
Contributor

grosser commented Jun 15, 2017

FYI related issue #197

@moolitayer
Copy link
Collaborator

See #262, add option for raw and avoids ROS (and JSON parsing altogether)

@cben
Copy link
Collaborator Author

cben commented Mar 4, 2018

Covered by #262 as: :raw, #306 as: :parsed & as: :parsed_symbolized, and #299 allowing to set as: for whole client.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants