Reduce the number of allocations to reduce garbage collection #348

Open
yulvil opened this issue Nov 13, 2017 · 7 comments

Comments

@yulvil
Contributor

yulvil commented Nov 13, 2017

I ran GODEBUG=gctrace=1 c2go transpile sqlite3.c and noticed that the garbage collector is taking 12% of the total run time.

Running without the garbage collector showed better performance: GOGC=off c2go transpile sqlite3.c

See if we can add GOGC=off to the integration tests to make them run faster.

@elliotchance
Owner

Interesting. My guess (which could very well be wrong) is that it's the enormous number of AST nodes it needs to create when parsing the clang output.

I'm in two minds about disabling this on Travis. While it may make the build faster, it also drifts away from the normal commands that would be run. Also, if for some reason it allocated a huge amount of memory, it would break not only the current build but also potentially all the previous builds (with this change) if we wanted to rerun them.

I'd love to know exactly why Go is spending so much time cleaning up in the first place - maybe then we can find genuine performance gains there...

I know that @Konstantin8105 has done some awesome work with finding which parts contribute significantly to the CPU.

@yulvil
Contributor Author

yulvil commented Nov 13, 2017

I agree that we should fix the problem at the root, i.e. reduce allocations.

I am guessing that we have a lot of duplicated strings (identifiers, types, ...). Maybe doing some string interning could be beneficial? golang/go#5160

Will need to gather more metrics.
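
To make the string interning idea concrete, here is a minimal sketch. The `Interner` type and its methods are hypothetical and not part of c2go; it assumes Go 1.18+ for `strings.Clone`, which copies the string so that a short identifier does not keep a much longer source line alive.

```go
package main

import (
	"fmt"
	"strings"
)

// Interner is a hypothetical helper (not part of c2go) that keeps one
// canonical copy of each distinct string.
type Interner struct {
	seen map[string]string
}

// NewInterner returns an empty Interner.
func NewInterner() *Interner {
	return &Interner{seen: make(map[string]string)}
}

// Intern returns the canonical copy of s. On first sight it stores a
// fresh copy so the caller's (possibly larger) backing buffer can be freed.
func (in *Interner) Intern(s string) string {
	if c, ok := in.seen[s]; ok {
		return c
	}
	c := strings.Clone(s)
	in.seen[c] = c
	return c
}

// Len reports how many distinct strings are stored.
func (in *Interner) Len() int { return len(in.seen) }

func main() {
	in := NewInterner()
	// Simulated type names as they might repeat across AST lines.
	for _, t := range []string{"int", "const char *", "int", "int", "const char *"} {
		_ = in.Intern(t)
	}
	fmt.Println(in.Len()) // prints 2
}
```

Whether this helps in practice depends on how many duplicates the AST really contains, which is why the metrics need to come first.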

@yulvil yulvil changed the title Add GOGC=off wherever possible Reduce the number of allocations to reduce garbage collection Nov 13, 2017
@Konstantin8105
Contributor

Some time ago, I tested the speed of Go on a simple math operation, matrix multiplication (https://github.com/Konstantin8105/Matrix-Multiply-Part1).
As a result, without changing anything in the GC, we can multiply matrices at roughly the speed of C code.

Ignore my point if it is off-topic for this thread.

@Konstantin8105
Contributor

I will try to think about how to minimize allocations in main.convertLinesToNodesParallel.

@Konstantin8105
Contributor

Maybe, for more precision, we should create a benchmark test. Feel free to implement it.
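
As a rough sketch of what such a benchmark could look like: the function `splitFields` below is a hypothetical stand-in for per-line parsing work and is not c2go code. It uses `testing.Benchmark` with `ReportAllocs` so that allocations per operation are reported next to timing.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// splitFields is a hypothetical stand-in for the per-line work done while
// building the tree; replace it with the real function under test.
func splitFields(line string) []string {
	return strings.Fields(line)
}

// benchSplit measures time and allocations for splitFields on one sample line.
func benchSplit(b *testing.B) {
	line := "FunctionDecl 0x1 <a.c:1:1> col:5 main 'int ()'"
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_ = splitFields(line)
	}
}

func main() {
	// testing.Benchmark runs the function outside of "go test".
	res := testing.Benchmark(benchSplit)
	fmt.Printf("%d ns/op, %d allocs/op, %d B/op\n",
		res.NsPerOp(), res.AllocsPerOp(), res.AllocedBytesPerOp())
}
```

In a real test file, the same body would live in a `BenchmarkXxx(b *testing.B)` function and be run with `go test -bench . -benchmem`, which makes before/after comparisons of allocation counts easier than reading gctrace output.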

@Konstantin8105
Contributor

Before:

○ → GODEBUG=gctrace=1 c2go transpile -V  sqlite3.c
......
Building tree...
gc 23 @20.634s 4%: 0.046+452+0.26 ms clock, 0.18+170/452/7.6+1.0 ms cpu, 188->191->103 MB, 201 MB goal, 4 P
gc 24 @21.329s 4%: 0.044+275+0.098 ms clock, 0.17+39/271/515+0.39 ms cpu, 196->201->117 MB, 207 MB goal, 4 P
gc 25 @22.382s 4%: 0.037+245+0.18 ms clock, 0.15+87/245/476+0.72 ms cpu, 225->228->110 MB, 234 MB goal, 4 P
gc 26 @23.243s 5%: 0.042+491+0.19 ms clock, 0.16+127/462/10+0.77 ms cpu, 214->216->110 MB, 220 MB goal, 4 P
gc 27 @24.412s 5%: 0.050+263+0.18 ms clock, 0.20+151/258/502+0.72 ms cpu, 215->217->112 MB, 220 MB goal, 4 P
Transpiling tree...
// Function: sqlite3_compileoption_used(const char *)

After trying to modify main.convertLinesToNodesParallel to minimize allocations:

○ → GODEBUG=gctrace=1 c2go transpile -V  sqlite3.c
......
Building tree...
gc 23 @20.599s 4%: 0.041+742+0.18 ms clock, 0.16+164/304/447+0.74 ms cpu, 192->196->104 MB, 202 MB goal, 4 P
gc 24 @21.695s 4%: 0.051+350+0.19 ms clock, 0.20+164/348/15+0.76 ms cpu, 197->202->110 MB, 209 MB goal, 4 P
gc 25 @22.741s 4%: 0.061+308+8.4 ms clock, 0.24+117/300/337+33 ms cpu, 210->215->113 MB, 221 MB goal, 4 P
gc 26 @23.750s 5%: 0.064+365+0.15 ms clock, 0.25+244/364/12+0.62 ms cpu, 217->221->112 MB, 226 MB goal, 4 P
gc 27 @24.755s 5%: 0.040+401+0.11 ms clock, 0.16+280/395/16+0.47 ms cpu, 217->220->114 MB, 224 MB goal, 4 P
Transpiling tree...

The result is a failure, so I am rejecting the PR.

@Konstantin8105
Contributor

Just for information,

gc 32 @24.755s 5%: .........

I found only 4-5% of the time is spent in GC.
From my point of view, that is fine :).
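
The same kind of figure can also be read from inside a program, without parsing gctrace output. The sketch below is only an illustration: the `node` type and `buildNodes` are hypothetical stand-ins for AST construction, and `GCCPUFraction` from `runtime.MemStats` is only roughly comparable to the percentage that gctrace prints.

```go
package main

import (
	"fmt"
	"runtime"
)

// node is a hypothetical stand-in for an AST node.
type node struct {
	name     string
	children []*node
}

// buildNodes allocates n small nodes to generate garbage-collector work.
func buildNodes(n int) []*node {
	out := make([]*node, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, &node{name: fmt.Sprintf("n%d", i)})
	}
	return out
}

func main() {
	nodes := buildNodes(1000000)

	// ReadMemStats fills in cumulative GC statistics since program start.
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("nodes=%d gc cycles=%d total pause=%dns gc cpu fraction=%.2f%%\n",
		len(nodes), m.NumGC, m.PauseTotalNs, m.GCCPUFraction*100)
}
```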
