As Hadoop is often used to speed up tasks, it should be second nature to determine whether it is actually doing a good job. Since the pipeline spans several languages (Cython, C, Python, Java, and various client code), standard profiling techniques don't work well. A solution is an internal profiling aggregator that can be enabled to produce semantically relevant timing information.
Relevant information:
- TypedBytes serialization overhead
- PyInstaller overhead
- Client Map/Combine/Reduce functions
It is important that this doesn't clutter the codebase and that it has minimal performance impact.
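A minimal sketch of what such an aggregator could look like, kept out of the hot path unless explicitly enabled. The module name, the `HADOOPY_PROFILE` environment variable, the section labels, and the report format are assumptions for illustration, not existing API:

```python
# profile.py -- opt-in timing aggregator (names are illustrative, not existing API).
import atexit
import collections
import os
import sys
import time
from contextlib import contextmanager

# Profiling is off unless explicitly enabled, so the common case stays cheap.
_ENABLED = bool(os.environ.get('HADOOPY_PROFILE'))
_TIMINGS = collections.defaultdict(float)
_COUNTS = collections.defaultdict(int)


@contextmanager
def timed(section):
    """Accumulate wall-clock time under a semantic label (e.g. 'typedbytes.read')."""
    if not _ENABLED:
        yield
        return
    start = time.time()
    try:
        yield
    finally:
        _TIMINGS[section] += time.time() - start
        _COUNTS[section] += 1


def _report():
    # Write to stderr so the report lands in the Hadoop task logs
    # without interfering with the streaming protocol on stdout.
    for section in sorted(_TIMINGS):
        sys.stderr.write('PROFILE %s: %.3fs over %d calls\n'
                         % (section, _TIMINGS[section], _COUNTS[section]))


if _ENABLED:
    atexit.register(_report)
```

Call sites would then wrap the semantically interesting spans, e.g. `with timed('map.user_func'): out = mapper(key, value)`, which keeps the instrumentation to a single line per section and costs only a disabled-flag check when profiling is off.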