Provide the option to remove unnecessary XML data #240
The EXTRA_DATA and EXTRA_ELEMENT data returned by Ganglia is unused in many code paths, so stripping it before parsing cuts XML parsing time by about 50% for large data sets.
In production, gmetad was returning 26MB of XML data for the everything path. With this subset of data filtered out, average parsing time dropped from 2.1s to 1.1s, a reduction of nearly 50%.
Inspecting the start_* functions shows whether each one uses the EXTRA_DATA information at all. Since most of them don't, we can safely strip it out for those code paths.
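
To illustrate the idea, here is a minimal Python sketch of stripping EXTRA_DATA blocks from the raw XML before handing it to a parser. It is not the code in this pull request: the function name `strip_extra_data`, the regex, and the sample document are all assumptions made for illustration, and the sample only approximates the shape of gmetad output.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: matches one <EXTRA_DATA>...</EXTRA_DATA> block,
# including the EXTRA_ELEMENT tags inside it and any trailing whitespace.
_EXTRA_DATA_RE = re.compile(r"<EXTRA_DATA>.*?</EXTRA_DATA>\s*", re.DOTALL)


def strip_extra_data(xml_text: str) -> str:
    """Return xml_text with every EXTRA_DATA block removed."""
    return _EXTRA_DATA_RE.sub("", xml_text)


# Illustrative sample, assumed to resemble gmetad output.
SAMPLE = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmetad">
<HOST NAME="web1">
<METRIC NAME="cpu_user" VAL="1.2" TYPE="float">
<EXTRA_DATA>
<EXTRA_ELEMENT NAME="GROUP" VAL="cpu"/>
<EXTRA_ELEMENT NAME="TITLE" VAL="CPU User"/>
</EXTRA_DATA>
</METRIC>
</HOST>
</GANGLIA_XML>"""

# Strip first, then parse the smaller document.
stripped = strip_extra_data(SAMPLE)
root = ET.fromstring(stripped)
metric = root.find("./HOST/METRIC")
```

After stripping, the METRIC element keeps its attributes but has no child elements, so the parser has fewer nodes to visit. A code path that does read EXTRA_DATA would skip the `strip_extra_data` call and parse the original text.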