As Hadoop is often used to speed up tasks, it should be second nature to determine whether it is actually doing a good job. Since the pipeline spans several languages (Cython, C, Python, Java, and various client code), standard profiling techniques don't work well. A solution is an internal profiling aggregator that can be enabled to produce semantically relevant timing information.
Relevant information:
- TypedBytes serialization overhead
- PyInstaller overhead
- Client Map/Combine/Reduce functions
It is important that this doesn't clutter the codebase and that it has minimal performance impact.
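A minimal sketch of what such an aggregator could look like, kept out of the hot path unless explicitly enabled. The module name, the `HADOOPY_PROFILE` environment variable, the section labels, and the report format are assumptions for illustration, not existing API:

```python
# profile.py -- opt-in timing aggregator (names are illustrative, not existing API).
import atexit
import collections
import os
import sys
import time
from contextlib import contextmanager

# Profiling is off unless explicitly enabled, so the common case stays cheap.
_ENABLED = bool(os.environ.get('HADOOPY_PROFILE'))
_TIMINGS = collections.defaultdict(float)
_COUNTS = collections.defaultdict(int)


@contextmanager
def timed(section):
    """Accumulate wall-clock time under a semantic label (e.g. 'typedbytes.read')."""
    if not _ENABLED:
        yield
        return
    start = time.time()
    try:
        yield
    finally:
        _TIMINGS[section] += time.time() - start
        _COUNTS[section] += 1


def _report():
    # Write to stderr so the report lands in the Hadoop task logs
    # without interfering with the streaming protocol on stdout.
    for section in sorted(_TIMINGS):
        sys.stderr.write('PROFILE %s: %.3fs over %d calls\n'
                         % (section, _TIMINGS[section], _COUNTS[section]))


if _ENABLED:
    atexit.register(_report)
```

Call sites would then wrap the semantically interesting spans, e.g. `with timed('map.user_func'): out = mapper(key, value)`, which keeps the instrumentation to a single line per section and costs only a disabled-flag check when profiling is off.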