Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increasing performance: Call for comments #8

Closed
mikke89 opened this issue May 4, 2019 · 15 comments
Closed

Increasing performance: Call for comments #8

mikke89 opened this issue May 4, 2019 · 15 comments
Labels
performance Performance suggestions or specific issues
Milestone

Comments

@mikke89
Copy link
Owner

mikke89 commented May 4, 2019

I've given quite some effort lately into improving the performance of the library, and especially for creation and destruction of elements which can sometimes be slow.

There are some breaking changes, so I'm taking some extra time before merging into master:

  • Mainly, Rocket::Core::String has been replaced by std::string. This gave a small performance boost, but perhaps more importantly, I think it eases the interaction between most users and the library.
  • The update loop has been quite substantially re-engineered. A lot of duplicate work is avoided, instead things update in a more strictly ordered manner. This may cause an issue for some user code such as code that relies on querying elements for position and size, straight after element construction or other documents changes. Some properties are not up-to-date until Context::Update is called. However, there is a workaround as described in the documentation.

To give some numbers, I'm getting a 3.5x (!) speedup during heavy construction / destruction of elements, compared to before I started the performance branch. From 18.5 to 65.0 FPS for more than 500 elements per update loop. These numbers do not even include earlier performance improvements. And I'm not done yet ;)

I think I'm mostly reaching out to discuss the performance of librocket in general, as well as what you think about these breaking changes. Any thoughts?

@viciious
Copy link
Contributor

viciious commented May 4, 2019

I wonder what the original intention behind the Rocket::Core::String class was.. It's certainly no better than std::string in terms of the interface. Unicode support perhaps?

As far as libRocket performance goes, I've spent a considerable amount of time profiling it and even submitted some patches to address some of the things I've uncovered. Off the top of my head, the datagrid thing was the worst, absolutely abysmal. Adding new rows was a total killer and as I was digging deeper and deeper into libRocket guts I started to lose any hope of improvement, as I realized that at least some of its most crucial parts ran at O(N^2) or maybe O(N^3) time. We ended up using a lot of hacks to keep things running smoothly, such as splitting the document into sub-documents to keep the total number of DOM elements relatively low.

This may cause an issue for some user code such as code that relies on querying elements for position and size, straight after element construction or other documents changes.

Is it possible to use the lazy update approach for such cases: run the blocking update function in case user queries for some stuff while the document is in dirty stuff?

@viciious
Copy link
Contributor

viciious commented May 4, 2019

Another realization that I had while profiling libRocket was that the re-layout should not propagate beyond block elements that have horizontal and vertical oveflow enabled, sorta like a web page iframe. I never figured out how to implement this particular optimization though.

@mikke89
Copy link
Owner Author

mikke89 commented May 4, 2019

I count that as a blessing for going with std::string then ;) I'm not sure of their motivation, unicode shouldn't be affected by this change as that is done separately. The strings are actually converted to WString before being rendered.

Yeah, I had problems with the datagrid as well. I found for myself it's much better just to write out the rml or otherwise construct the elements manually. One thing I found is that it was sending out "resize" events each time the size of an element is set (which can happen multiple times during layout), which again triggered a resize of the data cells ... and so on. I don't exactly remember the details, but it was bad. Events are also really slow due to the propagation all the way up and down the tree with all the pointer chasing.

Another realization that I had while profiling libRocket was that the re-layout should not propagate beyond block elements that have horizontal and vertical oveflow enabled, sorta like a web page iframe. I never figured out how to implement this particular optimization though.

Interesting. I haven't actually touched the layouting during these performance increases, but it's certainly one of the bigger targets now. I'll keep this in mind.

@viciious
Copy link
Contributor

viciious commented May 7, 2019

Have a look at the following optimization of ElementDatagridRow:

Qfusion/libRocket@c0d15d8#diff-91ffa9c291cbaef5b722e84779e9ea31

@mikke89 mikke89 added the performance Performance suggestions or specific issues label May 11, 2019
@mikke89
Copy link
Owner Author

mikke89 commented May 15, 2019

Hi all,
I've made quite some changes in the performance branch and also merged master onto it so it's up-to-date with the cpp11-things. Except, I had to make it C++14 now because of an external hashmap library. I will experiment more with different containers so this may not be final.

If you'd like to help me test it and go bug hunting that would be great! Also, any feedback on how the performance is affected would be very valuable. The performance branch seems relatively stable now from the testing I've done, but with so many changes there will be bugs so consider this beta quality.

There is still some cleanup to do, and I will be making more performance improvements. But we may want to merge to master before I ever consider it "done" :P

@barotto
Copy link
Contributor

barotto commented May 16, 2019

Now that String is std::string, Lua and samples are broken.

Include/Rocket/Core/Lua/LuaType.inl: In static member function ‘static int Rocket::Core::Lua::LuaType<T>::index(lua_State*)’:
Include/Rocket/Core/Lua/LuaType.inl:245:58: error: ‘Rocket::Core::String {aka class std::__cxx11::basic_string<char>}’ has no member named ‘Append’
                     Report(L, String(GetTClassName<T>()).Append(".__index for ").Append(lua_tostring(L,2)).Append(": "));
                                                          ^
...
Samples/basic/treeview/src/FileSystem.cpp: In member function ‘void FileSystemNode::BuildTree(const String&)’:
Samples/basic/treeview/src/FileSystem.cpp:92:39: error: ‘class std::__cxx11::basic_string<char>’ has no member named ‘CString’
    file_count = scandir((root + name).CString(), &file_list, 0, alphasort);
                                       ^
...

@viciious
Copy link
Contributor

It's going to take a lot of effort for me to migrate to this libRocket fork (just look at this massive codebase: https://github.com/Qfusion/qfusion/tree/master/source/ui ) so I can't really help with performance testing.. At least not for a while.

@viciious
Copy link
Contributor

On a related note, the coverity scanner tool spewed the following warning at me:

CID 199124 (#1 of 1): Big parameter passed by value (PASS_BY_VALUE)pass_by_value: Passing parameter key of type Rocket::Core::AnimationKey (size 160 bytes) by value

199124 Big parameter passed by value
Copying large values is inefficient, consider passing by reference; size thresholds for detection can be adjusted.
In Rocket::​Core::​ElementAnimation::​InternalAddKey(Rocket::​Core::​AnimationKey): A large function call parameter or exception catch statement is passed by value

@mikke89
Copy link
Owner Author

mikke89 commented Jun 18, 2019

Thanks!

@mikke89 mikke89 added this to the 2.0 milestone Jun 18, 2019
@viciious
Copy link
Contributor

viciious commented Jun 24, 2019

Note that for this reason, when querying the Rocket API for properties such as size or position, this information may not be up-to-date with changes since the last Context::Update, such as newly added elements or classes. If this information is needed immediately, a call to ElementDocument::UpdateDocument can be made before such queries at a performance penalty.

Would it be possible for the library to perform the call in lazy manner automatically? By doing in it Context::Update you're only delaying the inevitable, and I would rather prefer to be able to shoot myself in the foot in less suboptimal but consistent manner than having to deal with obscure bugs caused by the library returning random data.

@mikke89
Copy link
Owner Author

mikke89 commented Jun 24, 2019

It is certainly possible, but there are some challenges:

  • We will have to separate the public API and the internal API for every function that may return something that is out-of-date. This would add quite some code and complexity in itself, and inevitable bugs.
  • It is a challenge in its own to know exactly which functions and properties that can dirty itself and other elements. There will probably be bugs, especially as the libary grows.
  • We don't return random data, just data that would be active on the previous frame, or default-initalized data. It is very visible if you use this data.

We could do the absolute lazy thing and call UpdateDocument every time the user calls a public getter, this will be much less bug prone, but of course will incur a performance penalty. This update already checks to see if anything is dirty, but essentially does so by iterating through every element.

There is also the question of whether we want to hide performance penalties like this for the user.

You can see I am a bit reluctant with this change, there is inevitably a tradeoff between user-friendliness, and performance+complexity. We will shoot ourselves in the foot either way, but, we are using C++ here, so... :P

Do you still think this is the way to go? If this didn't convince you otherwise, then I will seriously consider implementing this. But at least I want to let you know of the challenges.

@viciious
Copy link
Contributor

As a compromise, a function could be added instead to check if the element is dirty and its properties' values should not be trusted..

@mikke89
Copy link
Owner Author

mikke89 commented Jun 24, 2019

While that sounds easy, it's really not. Because any change to an element can affect its children, siblings, ancestors, really the whole document. If we only return the dirty status of the current element, this would only be half-true as its values could change anyway during update. This is actually true for how libRocket works as well, it wouldn't return correctly if there was a change to siblings etc. that would affect itself, and you had to call update manually sometimes there to.

I think the best we can do really is to mark the functions that may be outdated after a change with a warning, and then tell how to work around it.

Note that, the GetProperty, attributes and similar will always be correct. It's only the calculated values so to say, specifically those that need layouting and computed values that are dirty.

@mikke89
Copy link
Owner Author

mikke89 commented Jun 24, 2019

Although, we could do a conservative estimate and check if the doument requires re-layout or that any property has been dirtyed on the element. I think that should be a sufficient check (but not always necessary).

Edit: Thinking about it some more, this would not be sufficient actually, as it also depends on dirtied properties (at least) on other elements in the document.

@mikke89
Copy link
Owner Author

mikke89 commented Oct 13, 2019

With the large improvements in performance in 3.0, I'm closing this. New issues with regards to performance are welcome, such as suggested performance opportunities or specific issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance suggestions or specific issues
Projects
None yet
Development

No branches or pull requests

3 participants