Excessive compilation time (regression) #45395

Closed
jmichel7 opened this issue May 20, 2022 · 4 comments
Labels: compiler:latency (compiler latency), regression (regression in behavior compared to a previous version)

Comments

@jmichel7

I have a problem with excessive compilation time. The code is automatically translated from another system (gap3), so I cannot easily change it. As an example, the function below takes 2 minutes to compile on Julia 1.7.2 (and 5 minutes on the 1.9 nightly!); as you can see, the code is long but very simple. After loading the code, the first call to @time f(4) takes 128 seconds; subsequent calls take a few microseconds. The following should be noted:

  • If I move the definition of d outside the function, compilation takes a fraction of a second. I can move d out in this example, but not in other examples where the data defines polynomials that the function evaluates at its arguments (imagine that a list [1,2,3,4] in the example is replaced by something like [1+x+2y^2+4x^3], where x, y, z are arguments to the function).

  • The time grows much faster than linearly with the length of the data: doubling the length of the Dict definition multiplies the time by 5 to 10.

  • Giordano has tested the example on Julia 1.0.5, where it compiles in 1 or 2 seconds, so this is a regression.

f=function(x)
  d=Dict{Int,Vector{Vector{Int}}}(
  4=>[[1,0],[1,4],[1,8],[1,5,1,7],[1,3,1,5],[1,1,1,3],[1,2,1,4,1,6]],
  5=>[[1,0],[1,4],[1,8],[1,4],[1,8],[1,12],[1,8],[1,12],[1,16],[1,9,1,15],
      [1,7,1,13],[1,5,1,11],[1,7,1,13],[1,5,1,11],[1,3,1,9],[1,5,1,11],
      [1,3,1,9],[1,1,1,7],[1,4,2,10],[1,2,1,8,1,14],[2,6,1,12]],
  6=>[[1,0],[1,4],[1,8],[1,6],[1,10],[1,14],[1,5,1,13],[1,3,1,11],[1,3,1,7],
      [1,7,1,11],[1,1,1,9],[1,5,1,9],[1,2,1,6,1,10],[1,4,1,8,1,12]],
  7=>[[1,0],[1,4],[1,8],[1,4],[1,8],[1,12],[1,8],[1,12],[1,16],[1,6],[1,10],
      [1,14],[1,10],[1,14],[1,18],[1,14],[1,18],[1,22],[1,9,1,21],[1,7,1,19],
      [2,11],[1,7,1,19],[2,11],[2,9],[2,11],[2,9],[2,7],[2,15],[2,13],
      [1,5,1,17],[2,13],[1,5,1,17],[1,3,1,15],[1,5,1,17],[1,3,1,15],[1,1,1,13],
      [3,10],[1,4,2,16],[1,2,2,14],[2,8,1,20],[2,6,1,18],[3,12]],
  8=>[[1,0],[1,6],[1,12],[1,18],[1,1,1,5],[1,4,1,8],[1,7,1,11],[1,7,1,11],
      [1,10,1,14],[1,13,1,17],[1,8,1,12,1,16],[1,6,1,10,1,14],[1,4,1,8,1,12],
      [1,2,1,6,1,10],[1,5,2,9,1,13],[1,3,1,7,1,11,1,15]],
  9=>[[1,0],[1,6],[1,12],[1,18],[1,12],[1,18],[1,24],[1,30],[1,5,1,13],
      [1,4,1,20],[1,7,1,23],[1,7,1,23],[1,10,1,26],[1,13,1,29],[1,1,1,17],
      [1,14,1,22],[1,17,1,25],[1,11,1,19],[1,11,1,19],[1,8,1,16],
      [1,8,1,16,1,24],[1,6,1,14,1,22],[1,4,1,12,1,20],[1,2,1,10,1,18],
      [1,12,1,20,1,28],[1,10,1,18,1,26],[1,8,1,16,1,24],[1,6,1,14,1,22],
      [2,9,1,17,1,25],[1,7,2,15,1,23],[1,3,1,11,1,19,1,27],[1,5,1,13,2,21]],
 10=>[[1,0],[1,6],[1,12],[1,18],[1,8],[1,14],[1,20],[1,26],[1,16],[1,22],
      [1,28],[1,34],[1,9,1,21],[1,12,1,24],[1,15,1,27],[1,15,1,27],
      [1,18,1,30],[1,21,1,33],[1,5,1,17],[1,8,1,20],[1,11,1,23],[1,11,1,23],
      [1,14,1,26],[1,17,1,29],[1,1,1,13],[1,4,1,16],[1,7,1,19],[1,7,1,19],
      [1,10,1,22],[1,13,1,25],[1,8,1,20,1,32],[2,14,1,26],[1,8,2,20],
      [1,2,1,14,1,26],[2,16,1,28],[1,10,2,22],[1,4,1,16,1,28],[2,10,1,22],
      [1,12,2,24],[1,6,1,18,1,30],[2,12,1,24],[1,6,2,18],[2,9,2,21],
      [2,11,2,23],[1,7,2,19,1,31],[1,3,2,15,1,27],[1,5,2,17,1,29],[2,13,2,25]],
 11=>[[1,0],[1,6],[1,12],[1,18],[1,8],[1,14],[1,20],[1,26],[1,16],[1,22],
      [1,28],[1,34],[1,12],[1,18],[1,24],[1,30],[1,20],[1,26],[1,32],[1,38],
      [1,28],[1,34],[1,40],[1,46],[1,9,1,33],[1,12,1,36],[2,27],[2,27],
      [1,18,1,42],[2,33],[1,5,1,29],[2,20],[1,11,1,35],[1,11,1,35],[1,14,1,38],
      [2,29],[1,1,1,25],[1,4,1,28],[1,7,1,31],[1,7,1,31],[2,22],[2,25],[2,21],
      [2,24],[1,15,1,39],[1,15,1,39],[2,30],[1,21,1,45],[2,17],[1,8,1,32],
      [2,23],[2,23],[2,26],[1,17,1,41],[2,13],[2,16],[2,19],[2,19],
      [1,10,1,34],[1,13,1,37],[1,8,2,32],[2,14,1,38],[3,20],[1,2,2,26],
      [2,20,1,44],[3,26],[1,8,2,32],[2,14,1,38],[2,16,1,40],[3,22],[1,4,2,28],
      [2,10,1,34],[3,28],[1,10,2,34],[2,16,1,40],[3,22],[3,24],[1,6,2,30],
      [2,12,1,36],[3,18],[1,12,2,36],[2,18,1,42],[3,24],[1,6,2,30],[4,21],
      [4,23],[3,19,1,43],[1,3,3,27],[1,5,3,29],[4,25],[2,9,2,33],[2,11,2,35],
      [1,7,3,31],[3,15,1,39],[3,17,1,41],[2,13,2,37]],
 12=>[[1,0],[1,12],[1,1,1,11],[1,4,1,8],[1,5,1,7],[1,2,1,4,1,6],[1,6,1,8,1,10],
      [1,3,1,5,1,7,1,9]],
 13=>[[1,0],[1,6],[1,12],[1,18],[1,7,1,11],[1,4,1,8],[1,1,1,17],[1,5,1,13],
      [1,10,1,14],[1,7,1,11],[1,4,1,8,1,12],[1,2,1,6,1,10],[1,8,1,12,1,16],
      [1,6,1,10,1,14],[1,3,1,7,1,11,1,15],[1,5,2,9,1,13]],
 14=>[[1,0],[1,8],[1,16],[1,12],[1,20],[1,28],[1,15,1,21],[1,12,1,24],
      [1,9,1,27],[1,11,1,17],[1,8,1,20],[1,5,1,23],[1,7,1,13],[1,4,1,16],
      [1,1,1,19],[1,2,1,14,1,20],[1,8,1,14,1,26],[1,4,1,10,1,22],
      [1,10,1,16,1,22],[1,6,1,12,1,18],[1,6,1,18,1,24],[1,3,1,9,1,15,1,21],
      [1,5,1,11,1,17,1,23],[1,7,1,13,1,19,1,25]],
 15=>[[1,0],[1,6],[1,8],[1,14],[1,16],[1,22],[1,12],[1,18],[1,20],[1,26],
      [1,28],[1,34],[1,9,1,33],[1,12,1,24],[1,15,1,27],[1,15,1,27],[1,18,1,30],
      [2,21],[1,5,1,29],[1,8,1,20],[1,11,1,23],[1,11,1,23],[1,14,1,26],[2,17],
      [1,1,1,25],[1,4,1,16],[1,7,1,19],[1,7,1,19],[1,10,1,22],[2,13],[1,8,2,20],
      [1,2,1,14,1,26],[1,8,1,20,1,32],[2,14,1,26],[1,4,1,16,1,28],[2,10,1,22],
      [2,16,1,28],[1,10,2,22],[2,12,1,24],[1,6,2,18],[1,12,2,24],
      [1,6,1,18,1,30],[2,9,2,21],[2,11,2,23],[1,7,2,19,1,31],[1,3,2,15,1,27],
      [1,5,2,17,1,29],[2,13,2,25]],
 16=>[[1,0],[1,12],[1,24],[1,36],[1,48],[1,1,1,11],[1,7,1,17],[1,13,1,23],
      [1,19,1,29],[1,13,1,23],[1,19,1,29],[1,25,1,35],[1,25,1,35],[1,31,1,41],
      [1,37,1,47],[1,2,1,12,1,22],[1,6,1,16,1,26],[1,10,1,20,1,30],
      [1,10,1,20,1,30],[1,14,1,24,1,34],[1,18,1,28,1,38],[1,14,1,24,1,34],
      [1,18,1,28,1,38],[1,22,1,32,1,42],[1,26,1,36,1,46],[1,15,1,25,1,35,1,45],
      [1,17,2,27,1,37],[1,9,1,19,1,29,1,39],[1,11,2,21,1,31],
      [1,3,1,13,1,23,1,33],[1,20,2,30,1,40],[1,12,1,22,1,32,1,42],
      [1,14,2,24,1,34],[1,6,1,16,1,26,1,36],[1,8,2,18,1,28],[1,12,2,22,2,32],
      [1,4,1,14,1,24,1,34,1,44],[2,16,2,26,1,36],[1,8,1,18,2,28,1,38],
      [1,10,2,20,1,30,1,40],[1,5,2,15,2,25,1,35],[1,7,2,17,2,27,1,37],
      [1,9,2,19,2,29,1,39],[1,11,2,21,2,31,1,41],[1,13,2,23,2,33,1,43]],
 17=>[[1,0],[1,12],[1,24],[1,36],[1,48],[1,30],[1,42],[1,54],[1,66],[1,78],
      [1,11,1,31],[1,17,1,37],[1,13,1,53],[1,19,1,59],[1,13,1,53],[1,19,1,59],
      [1,25,1,65],[1,25,1,65],[1,31,1,71],[1,37,1,77],[1,1,1,41],[1,7,1,47],
      [1,35,1,55],[1,35,1,55],[1,41,1,61],[1,47,1,67],[1,23,1,43],[1,29,1,49],
      [1,23,1,43],[1,29,1,49],[1,2,1,22,1,42],[1,6,1,26,1,46],[1,10,1,30,1,50],
      [1,10,1,30,1,50],[1,14,1,34,1,54],[1,18,1,38,1,58],[1,14,1,34,1,54],
      [1,18,1,38,1,58],[1,22,1,42,1,62],[1,26,1,46,1,66],[1,12,1,32,1,52],
      [1,16,1,36,1,56],[1,20,1,40,1,60],[1,20,1,40,1,60],[1,24,1,44,1,64],
      [1,28,1,48,1,68],[1,24,1,44,1,64],[1,28,1,48,1,68],[1,32,1,52,1,72],
      [1,36,1,56,1,76],[1,15,1,35,1,55,1,75],[2,27,1,47,1,67],
      [1,19,2,39,1,59],[1,11,1,31,2,51],[1,3,1,23,1,43,1,63],
      [2,30,1,50,1,70],[1,22,2,42,1,62],[1,14,1,34,2,54],
      [1,6,1,26,1,46,1,66],[2,18,1,38,1,58],[1,25,2,45,1,65],[1,17,1,37,2,57],
      [1,9,1,29,1,49,1,69],[2,21,1,41,1,61],[1,13,2,33,1,53],[1,20,1,40,2,60],
      [1,12,1,32,1,52,1,72],[2,24,1,44,1,64],[1,16,2,36,1,56],[1,8,1,28,2,48],
      [1,12,2,32,2,52],[2,22,1,42,2,62],[1,4,1,24,2,44,1,64],
      [1,14,2,34,1,54,1,74],[2,16,1,36,2,56],[2,26,2,46,1,66],
      [1,8,2,28,1,48,1,68],[1,18,2,38,2,58],[2,20,2,40,1,60],
      [1,10,1,30,2,50,1,70],[2,15,2,35,2,55],[2,17,2,37,2,57],[2,19,2,39,2,59],
      [1,11,2,31,2,51,1,71],[1,13,2,33,2,53,1,73],[1,5,2,25,2,45,1,65],
      [1,7,2,27,2,47,1,67],[1,9,2,29,2,49,1,69],[2,21,2,41,2,61],
      [2,23,2,43,2,63]],
 18=>[[1,0],[1,12],[1,24],[1,36],[1,48],[1,20],[1,32],[1,44],[1,56],[1,68],
      [1,40],[1,52],[1,64],[1,76],[1,88],[1,21,1,51],[1,27,1,57],[1,33,1,63],
      [1,39,1,69],[1,33,1,63],[1,39,1,69],[1,45,1,75],[1,45,1,75],[1,51,1,81],
      [1,57,1,87],[1,11,1,41],[1,17,1,47],[1,23,1,53],[1,29,1,59],[1,23,1,53],
      [1,29,1,59],[1,35,1,65],[1,35,1,65],[1,41,1,71],[1,47,1,77],[1,1,1,31],
      [1,7,1,37],[1,13,1,43],[1,19,1,49],[1,13,1,43],[1,19,1,49],[1,25,1,55],
      [1,25,1,55],[1,31,1,61],[1,37,1,67],[1,2,1,32,1,62],[2,26,1,56],
      [1,20,2,50],[1,20,2,50],[1,14,1,44,1,74],[2,38,1,68],[1,14,1,44,1,74],
      [2,38,1,68],[1,32,2,62],[1,26,1,56,1,86],[2,22,1,52],[1,16,2,46],
      [1,10,1,40,1,70],[1,10,1,40,1,70],[2,34,1,64],[1,28,2,58],[2,34,1,64],
      [1,28,2,58],[1,22,1,52,1,82],[2,46,1,76],[1,12,2,42],[1,6,1,36,1,66],
      [2,30,1,60],[2,30,1,60],[1,24,2,54],[1,18,1,48,1,78],[1,24,2,54],
      [1,18,1,48,1,78],[2,42,1,72],[1,36,2,66],[1,15,2,45,1,75],[2,27,2,57],
      [1,9,2,39,1,69],[2,21,2,51],[1,3,2,33,1,63],[2,35,2,65],[1,17,2,47,1,77],
      [2,29,2,59],[1,11,2,41,1,71],[2,23,2,53],[1,25,2,55,1,85],[2,37,2,67],
      [1,19,2,49,1,79],[2,31,2,61],[1,13,2,43,1,73],[2,30,2,60],
      [1,12,2,42,1,72],[2,24,2,54],[1,6,2,36,1,66],[2,18,2,48],[1,20,2,50,1,80],
      [2,32,2,62],[1,14,2,44,1,74],[2,26,2,56],[1,8,2,38,1,68],[2,40,2,70],
      [1,22,2,52,1,82],[2,34,2,64],[1,16,2,46,1,76],[2,28,2,58],
      [1,12,2,42,2,72],[3,32,2,62],[2,22,3,52],[2,24,2,54,1,84],
      [1,14,3,44,1,74],[1,4,2,34,2,64],[3,36,2,66],[2,26,3,56],
      [2,16,2,46,1,76],[1,18,3,48,1,78],[1,8,2,38,2,68],[3,28,2,58],
      [2,30,3,60],[2,20,2,50,1,80],[1,10,3,40,1,70],[3,25,3,55],
      [1,7,3,37,2,67],[2,19,3,49,1,79],[3,31,3,61],[1,13,3,43,2,73],
      [1,5,3,35,2,65],[2,17,3,47,1,77],[3,29,3,59],[1,11,3,41,2,71],
      [2,23,3,53,1,83],[2,15,3,45,1,75],[3,27,3,57],[1,9,3,39,2,69],
      [2,21,3,51,1,81],[3,33,3,63]],
 19=>[[1,0],[1,12],[1,24],[1,36],[1,48],[1,20],[1,32],[1,44],[1,56],[1,68],
      [1,40],[1,52],[1,64],[1,76],[1,88],[1,30],[1,42],[1,54],[1,66],[1,78],
      [1,50],[1,62],[1,74],[1,86],[1,98],[1,70],[1,82],[1,94],[1,106],[1,118],
      [2,51],[2,57],[1,33,1,93],[1,39,1,99],[1,33,1,93],[1,39,1,99],
      [1,45,1,105],[1,45,1,105],[1,51,1,111],[1,57,1,117],[2,41],[2,47],[2,53],
      [2,59],[2,53],[2,59],[1,35,1,95],[1,35,1,95],[1,41,1,101],[1,47,1,107],
      [2,31],[2,37],[2,43],[2,49],[2,43],[2,49],[2,55],[2,55],[1,31,1,91],
      [1,37,1,97],[1,21,1,81],[1,27,1,87],[2,63],[2,69],[2,63],[2,69],[2,75],
      [2,75],[2,81],[2,87],[1,11,1,71],[1,17,1,77],[1,23,1,83],[1,29,1,89],
      [1,23,1,83],[1,29,1,89],[2,65],[2,65],[2,71],[2,77],[1,1,1,61],[1,7,1,67],
      [1,13,1,73],[1,19,1,79],[1,13,1,73],[1,19,1,79],[1,25,1,85],[1,25,1,85],
      [2,61],[2,67],[1,2,2,62],[2,26,1,86],[3,50],[3,50],[1,14,2,74],
      [2,38,1,98],[1,14,2,74],[2,38,1,98],[3,62],[1,26,2,86],[2,32,1,92],[3,56],   
      [1,20,2,80],[1,20,2,80],[2,44,1,104],[3,68],[2,44,1,104],[3,68],
      [1,32,2,92],[2,56,1,116],[2,22,1,82],[3,46],[1,10,2,70],[1,10,2,70],
      [2,34,1,94],[3,58],[2,34,1,94],[3,58],[1,22,2,82],[2,46,1,106],[3,52],
      [1,16,2,76],[2,40,1,100],[2,40,1,100],[3,64],[1,28,2,88],[3,64],
      [1,28,2,88],[2,52,1,112],[3,76],[3,42],[1,6,2,66],[2,30,1,90],[2,30,1,90],
      [3,54],[1,18,2,78],[3,54],[1,18,2,78],[2,42,1,102],[3,66],[1,12,2,72],
      [2,36,1,96],[3,60],[3,60],[1,24,2,84],[2,48,1,108],[1,24,2,84],
      [2,48,1,108],[3,72],[1,36,2,96],[1,15,3,75],[2,27,2,87],[3,39,1,99],
      [4,51],[1,3,3,63],[2,35,2,95],[3,47,1,107],[4,59],[1,11,3,71],[2,23,2,83],
      [3,55,1,115],[4,67],[1,19,3,79],[2,31,2,91],[3,43,1,103],[2,30,2,90],
      [3,42,1,102],[4,54],[1,6,3,66],[2,18,2,78],[3,50,1,110],[4,62],
      [1,14,3,74],[2,26,2,86],[3,38,1,98],[4,70],[1,22,3,82],[2,34,2,94],
      [3,46,1,106],[4,58],[3,45,1,105],[4,57],[1,9,3,69],[2,21,2,81],
      [3,33,1,93],[4,65],[1,17,3,77],[2,29,2,89],[3,41,1,101],[4,53],
      [1,25,3,85],[2,37,2,97],[3,49,1,109],[4,61],[1,13,3,73],[4,60],
      [1,12,3,72],[2,24,2,84],[3,36,1,96],[4,48],[1,20,3,80],[2,32,2,92],
      [3,44,1,104],[4,56],[1,8,3,68],[2,40,2,100],[3,52,1,112],[4,64],
      [1,16,3,76],[2,28,2,88],[1,12,4,72],[3,32,2,92],[5,52],[3,42,2,102],
      [5,62],[2,22,3,82],[2,24,3,84],[4,44,1,104],[1,4,4,64],[4,54,1,114],
      [1,14,4,74],[3,34,2,94],[3,36,2,96],[5,56],[2,16,3,76],[5,66],[2,26,3,86],
      [4,46,1,106],[4,48,1,108],[1,8,4,68],[3,28,2,88],[1,18,4,78],[3,38,2,98],
      [5,58],[5,60],[2,20,3,80],[4,40,1,100],[2,30,3,90],[4,50,1,110],
      [1,10,4,70],[3,25,3,85],[1,7,5,67],[5,49,1,109],[6,61],[4,43,2,103],
      [4,35,2,95],[2,17,4,77],[6,59],[1,11,5,71],[5,53,1,113],[5,45,1,105],
      [3,27,3,87],[1,9,5,69],[2,21,4,81],[6,63],[6,55],[4,37,2,97],[2,19,4,79],
      [3,31,3,91],[1,13,5,73],[1,5,5,65],[5,47,1,107],[3,29,3,89],[4,41,2,101],
      [2,23,4,83],[2,15,4,75],[6,57],[4,39,2,99],[5,51,1,111],[3,33,3,93]],
 20=>[[1,0],[1,20],[1,40],[1,21,1,39],[1,27,1,33],[1,11,1,29],[1,17,1,23],
      [1,1,1,19],[1,7,1,13],[1,2,1,20,1,38],[1,14,1,20,1,26],[1,10,1,22,1,28],
      [1,10,1,16,1,34],[1,12,1,18,1,30],[1,6,1,24,1,30],[1,3,1,9,1,21,1,27],
      [1,11,1,17,1,23,1,29],[1,13,1,19,1,31,1,37],[1,6,1,12,1,18,1,24],
      [1,8,1,14,1,26,1,32],[1,16,1,22,1,28,1,34],[1,12,1,18,1,24,1,30,1,36],
      [1,8,1,14,1,20,1,26,1,32],[1,4,1,10,1,16,1,22,1,28],
      [1,7,1,13,1,19,2,25,1,31],[1,9,2,15,1,21,1,27,1,33],
      [1,5,1,11,1,17,1,23,1,29,1,35]],
 21=>[[1,0],[1,20],[1,40],[1,30],[1,50],[1,70],[1,39,1,51],[1,33,1,57],
      [1,21,1,69],[1,27,1,63],[1,29,1,41],[1,23,1,47],[1,17,1,53],[1,11,1,59],
      [1,19,1,31],[1,13,1,37],[1,7,1,43],[1,1,1,49],[1,2,1,38,1,50],
      [1,14,1,26,1,50],[1,20,1,32,1,68],[1,20,1,44,1,56],[1,10,1,22,1,58],
      [1,10,1,34,1,46],[1,28,1,40,1,52],[1,16,1,40,1,64],[1,18,1,30,1,42],
      [1,6,1,30,1,54],[1,12,1,48,1,60],[1,24,1,36,1,60],[1,3,1,27,1,39,1,51],
      [1,11,1,23,1,47,1,59],[1,19,1,31,1,43,1,67],[1,6,1,18,1,42,1,54],
      [1,14,1,26,1,38,1,62],[1,22,1,34,1,46,1,58],[1,9,1,21,1,33,1,57],
      [1,17,1,29,1,41,1,53],[1,13,1,37,1,49,1,61],[1,12,1,24,1,36,1,48],
      [1,8,1,32,1,44,1,56],[1,16,1,28,1,52,1,64],[1,12,1,24,1,36,1,48,1,60],
      [1,8,1,20,1,32,1,44,1,56],[1,4,1,16,1,28,1,40,1,52],
      [1,18,1,30,1,42,1,54,1,66],[1,14,1,26,1,38,1,50,1,62],
      [1,10,1,22,1,34,1,46,1,58],[1,13,2,25,1,37,1,49,1,61],
      [1,7,1,19,1,31,1,43,2,55],[1,11,1,23,2,35,1,47,1,59],
      [1,5,1,17,1,29,1,41,1,53,1,65],[1,9,1,21,1,33,2,45,1,57],
      [2,15,1,27,1,39,1,51,1,63]],
 22=>[[1,0],[1,30],[1,11,1,19],[1,13,1,17],[1,1,1,29],[1,7,1,23],
      [1,2,1,10,1,18],[1,6,1,10,1,14],[1,12,1,20,1,28],[1,16,1,20,1,24],
      [1,3,1,11,1,19,1,27],[1,6,1,14,1,18,1,22],[1,9,1,13,1,17,1,21],
      [1,8,1,12,1,16,1,24],[1,4,1,8,1,12,1,16,1,20],[1,10,1,14,1,18,1,22,1,26],
      [1,7,1,11,2,15,1,19,1,23],[1,5,1,9,1,13,1,17,1,21,1,25]],
 34=>[[1,0],[1,4],[1,8],[1,5,1,7],[1,3,1,5],[1,1,1,3],[1,2,1,4,1,6]],
 35=>[[1,0],[1,4],[1,8],[1,4],[1,8],[1,12],[1,8],[1,12],[1,16],[1,9,1,15],
      [1,7,1,13],[1,5,1,11],[1,7,1,13],[1,5,1,11],[1,3,1,9],[1,5,1,11],
      [1,3,1,9],[1,1,1,7],[1,4,2,10],[1,2,1,8,1,14],[2,6,1,12]],
 36=>[[1,0],[1,4],[1,8],[1,6],[1,10],[1,14],[1,5,1,13],[1,3,1,11],[1,3,1,7],
      [1,7,1,11],[1,1,1,9],[1,5,1,9],[1,2,1,6,1,10],[1,4,1,8,1,12]],
 37=>[[1,0],[1,4],[1,8],[1,4],[1,8],[1,12],[1,8],[1,12],[1,16],[1,6],[1,10],
      [1,14],[1,10],[1,14],[1,18],[1,14],[1,18],[1,22],[1,9,1,21],[1,7,1,19],
      [2,11],[1,7,1,19],[2,11],[2,9],[2,11],[2,9],[2,7],[2,15],[2,13],
      [1,5,1,17],[2,13],[1,5,1,17],[1,3,1,15],[1,5,1,17],[1,3,1,15],[1,1,1,13],
      [3,10],[1,4,2,16],[1,2,2,14],[2,8,1,20],[2,6,1,18],[3,12]],
 38=>[[1,0],[1,6],[1,12],[1,18],[1,1,1,5],[1,4,1,8],[1,7,1,11],[1,7,1,11],
      [1,10,1,14],[1,13,1,17],[1,8,1,12,1,16],[1,6,1,10,1,14],[1,4,1,8,1,12],
      [1,2,1,6,1,10],[1,5,2,9,1,13],[1,3,1,7,1,11,1,15]],
 39=>[[1,0],[1,6],[1,12],[1,18],[1,12],[1,18],[1,24],[1,30],[1,5,1,13],
      [1,4,1,20],[1,7,1,23],[1,7,1,23],[1,10,1,26],[1,13,1,29],[1,1,1,17],
      [1,14,1,22],[1,17,1,25],[1,11,1,19],[1,11,1,19],[1,8,1,16],
      [1,8,1,16,1,24],[1,6,1,14,1,22],[1,4,1,12,1,20],[1,2,1,10,1,18],
      [1,12,1,20,1,28],[1,10,1,18,1,26],[1,8,1,16,1,24],[1,6,1,14,1,22],
      [2,9,1,17,1,25],[1,7,2,15,1,23],[1,3,1,11,1,19,1,27],[1,5,1,13,2,21]],
310=>[[1,0],[1,6],[1,12],[1,18],[1,8],[1,14],[1,20],[1,26],[1,16],[1,22],[1,28],
      [1,34],[1,9,1,21],[1,12,1,24],[1,15,1,27],[1,15,1,27],[1,18,1,30],
      [1,21,1,33],[1,5,1,17],[1,8,1,20],[1,11,1,23],[1,11,1,23],[1,14,1,26],
      [1,17,1,29],[1,1,1,13],[1,4,1,16],[1,7,1,19],[1,7,1,19],[1,10,1,22],
      [1,13,1,25],[1,8,1,20,1,32],[2,14,1,26],[1,8,2,20],[1,2,1,14,1,26],
      [2,16,1,28],[1,10,2,22],[1,4,1,16,1,28],[2,10,1,22],[1,12,2,24],
      [1,6,1,18,1,30],[2,12,1,24],[1,6,2,18],[2,9,2,21],[2,11,2,23],
      [1,7,2,19,1,31],[1,3,2,15,1,27],[1,5,2,17,1,29],[2,13,2,25]],
311=>[[1,0],[1,6],[1,12],[1,18],[1,8],[1,14],[1,20],[1,26],[1,16],[1,22],[1,28],
      [1,34],[1,12],[1,18],[1,24],[1,30],[1,20],[1,26],[1,32],[1,38],[1,28],
      [1,34],[1,40],[1,46],[1,9,1,33],[1,12,1,36],[2,27],[2,27],[1,18,1,42],
      [2,33],[1,5,1,29],[2,20],[1,11,1,35],[1,11,1,35],[1,14,1,38],[2,29],
      [1,1,1,25],[1,4,1,28],[1,7,1,31],[1,7,1,31],[2,22],[2,25],[2,21],[2,24],
      [1,15,1,39],[1,15,1,39],[2,30],[1,21,1,45],[2,17],[1,8,1,32],[2,23],
      [2,23],[2,26],[1,17,1,41],[2,13],[2,16],[2,19],[2,19],[1,10,1,34],
      [1,13,1,37],[1,8,2,32],[2,14,1,38],[3,20],[1,2,2,26],[2,20,1,44],[3,26],
      [1,8,2,32],[2,14,1,38],[2,16,1,40],[3,22],[1,4,2,28],[2,10,1,34],[3,28],
      [1,10,2,34],[2,16,1,40],[3,22],[3,24],[1,6,2,30],[2,12,1,36],[3,18],
      [1,12,2,36],[2,18,1,42],[3,24],[1,6,2,30],[4,21],[4,23],[3,19,1,43],
      [1,3,3,27],[1,5,3,29],[4,25],[2,9,2,33],[2,11,2,35],[1,7,3,31],
      [3,15,1,39],[3,17,1,41],[2,13,2,37]],
312=>[[1,0],[1,12],[1,1,1,11],[1,4,1,8],[1,5,1,7],[1,2,1,4,1,6],[1,6,1,8,1,10],
      [1,3,1,5,1,7,1,9]],
313=>[[1,0],[1,6],[1,12],[1,18],[1,7,1,11],[1,4,1,8],[1,1,1,17],[1,5,1,13],
      [1,10,1,14],[1,7,1,11],[1,4,1,8,1,12],[1,2,1,6,1,10],[1,8,1,12,1,16],
      [1,6,1,10,1,14],[1,3,1,7,1,11,1,15],[1,5,2,9,1,13]],
314=>[[1,0],[1,8],[1,16],[1,12],[1,20],[1,28],[1,15,1,21],[1,12,1,24],
      [1,9,1,27],[1,11,1,17],[1,8,1,20],[1,5,1,23],[1,7,1,13],[1,4,1,16],
      [1,1,1,19],[1,2,1,14,1,20],[1,8,1,14,1,26],[1,4,1,10,1,22],
      [1,10,1,16,1,22],[1,6,1,12,1,18],[1,6,1,18,1,24],[1,3,1,9,1,15,1,21],
      [1,5,1,11,1,17,1,23],[1,7,1,13,1,19,1,25]],
315=>[[1,0],[1,6],[1,8],[1,14],[1,16],[1,22],[1,12],[1,18],[1,20],[1,26],[1,28],
      [1,34],[1,9,1,33],[1,12,1,24],[1,15,1,27],[1,15,1,27],[1,18,1,30],
      [2,21],[1,5,1,29],[1,8,1,20],[1,11,1,23],[1,11,1,23],[1,14,1,26],[2,17],
      [1,1,1,25],[1,4,1,16],[1,7,1,19],[1,7,1,19],[1,10,1,22],[2,13],
      [1,8,2,20],[1,2,1,14,1,26],[1,8,1,20,1,32],[2,14,1,26],[1,4,1,16,1,28],
      [2,10,1,22],[2,16,1,28],[1,10,2,22],[2,12,1,24],[1,6,2,18],[1,12,2,24],
      [1,6,1,18,1,30],[2,9,2,21],[2,11,2,23],[1,7,2,19,1,31],[1,3,2,15,1,27],
      [1,5,2,17,1,29],[2,13,2,25]],
316=>[[1,0],[1,12],[1,24],[1,36],[1,48],[1,1,1,11],[1,7,1,17],[1,13,1,23],
      [1,19,1,29],[1,13,1,23],[1,19,1,29],[1,25,1,35],[1,25,1,35],[1,31,1,41],
      [1,37,1,47],[1,2,1,12,1,22],[1,6,1,16,1,26],[1,10,1,20,1,30],
      [1,10,1,20,1,30],[1,14,1,24,1,34],[1,18,1,28,1,38],[1,14,1,24,1,34],
      [1,18,1,28,1,38],[1,22,1,32,1,42],[1,26,1,36,1,46],[1,15,1,25,1,35,1,45],
      [1,17,2,27,1,37],[1,9,1,19,1,29,1,39],[1,11,2,21,1,31],
      [1,3,1,13,1,23,1,33],[1,20,2,30,1,40],[1,12,1,22,1,32,1,42],
      [1,14,2,24,1,34],[1,6,1,16,1,26,1,36],[1,8,2,18,1,28],[1,12,2,22,2,32],
      [1,4,1,14,1,24,1,34,1,44],[2,16,2,26,1,36],[1,8,1,18,2,28,1,38],
      [1,10,2,20,1,30,1,40],[1,5,2,15,2,25,1,35],[1,7,2,17,2,27,1,37],
      [1,9,2,19,2,29,1,39],[1,11,2,21,2,31,1,41],[1,13,2,23,2,33,1,43]],
317=>[[1,0],[1,12],[1,24],[1,36],[1,48],[1,30],[1,42],[1,54],[1,66],[1,78],
      [1,11,1,31],[1,17,1,37],[1,13,1,53],[1,19,1,59],[1,13,1,53],[1,19,1,59],
      [1,25,1,65],[1,25,1,65],[1,31,1,71],[1,37,1,77],[1,1,1,41],[1,7,1,47],
      [1,35,1,55],[1,35,1,55],[1,41,1,61],[1,47,1,67],[1,23,1,43],[1,29,1,49],
      [1,23,1,43],[1,29,1,49],[1,2,1,22,1,42],[1,6,1,26,1,46],[1,10,1,30,1,50],
      [1,10,1,30,1,50],[1,14,1,34,1,54],[1,18,1,38,1,58],[1,14,1,34,1,54],
      [1,18,1,38,1,58],[1,22,1,42,1,62],[1,26,1,46,1,66],[1,12,1,32,1,52],
      [1,16,1,36,1,56],[1,20,1,40,1,60],[1,20,1,40,1,60],[1,24,1,44,1,64],
      [1,28,1,48,1,68],[1,24,1,44,1,64],[1,28,1,48,1,68],[1,32,1,52,1,72],
      [1,36,1,56,1,76],[1,15,1,35,1,55,1,75],[2,27,1,47,1,67],
      [1,19,2,39,1,59],[1,11,1,31,2,51],[1,3,1,23,1,43,1,63],[2,30,1,50,1,70],
      [1,22,2,42,1,62],[1,14,1,34,2,54],[1,6,1,26,1,46,1,66],[2,18,1,38,1,58],
      [1,25,2,45,1,65],[1,17,1,37,2,57],[1,9,1,29,1,49,1,69],[2,21,1,41,1,61],
      [1,13,2,33,1,53],[1,20,1,40,2,60],[1,12,1,32,1,52,1,72],[2,24,1,44,1,64],
      [1,16,2,36,1,56],[1,8,1,28,2,48],[1,12,2,32,2,52],[2,22,1,42,2,62],
      [1,4,1,24,2,44,1,64],[1,14,2,34,1,54,1,74],[2,16,1,36,2,56],
      [2,26,2,46,1,66],[1,8,2,28,1,48,1,68],[1,18,2,38,2,58],[2,20,2,40,1,60],
      [1,10,1,30,2,50,1,70],[2,15,2,35,2,55],[2,17,2,37,2,57],[2,19,2,39,2,59],
      [1,11,2,31,2,51,1,71],[1,13,2,33,2,53,1,73],[1,5,2,25,2,45,1,65],
      [1,7,2,27,2,47,1,67],[1,9,2,29,2,49,1,69],[2,21,2,41,2,61],
      [2,23,2,43,2,63]],
318=>[[1,0],[1,12],[1,24],[1,36],[1,48],[1,20],[1,32],[1,44],[1,56],[1,68],
      [1,40],[1,52],[1,64],[1,76],[1,88],[1,21,1,51],[1,27,1,57],[1,33,1,63],
      [1,39,1,69],[1,33,1,63],[1,39,1,69],[1,45,1,75],[1,45,1,75],[1,51,1,81],
      [1,57,1,87],[1,11,1,41],[1,17,1,47],[1,23,1,53],[1,29,1,59],[1,23,1,53],
      [1,29,1,59],[1,35,1,65],[1,35,1,65],[1,41,1,71],[1,47,1,77],[1,1,1,31],
      [1,7,1,37],[1,13,1,43],[1,19,1,49],[1,13,1,43],[1,19,1,49],[1,25,1,55],
      [1,25,1,55],[1,31,1,61],[1,37,1,67],[1,2,1,32,1,62],[2,26,1,56],
      [1,20,2,50],[1,20,2,50],[1,14,1,44,1,74],[2,38,1,68],[1,14,1,44,1,74],
      [2,38,1,68],[1,32,2,62],[1,26,1,56,1,86],[2,22,1,52],[1,16,2,46],
      [1,10,1,40,1,70],[1,10,1,40,1,70],[2,34,1,64],[1,28,2,58],[2,34,1,64],
      [1,28,2,58],[1,22,1,52,1,82],[2,46,1,76],[1,12,2,42],[1,6,1,36,1,66],
      [2,30,1,60],[2,30,1,60],[1,24,2,54],[1,18,1,48,1,78],[1,24,2,54],
      [1,18,1,48,1,78],[2,42,1,72],[1,36,2,66],[1,15,2,45,1,75],[2,27,2,57],
      [1,9,2,39,1,69],[2,21,2,51],[1,3,2,33,1,63],[2,35,2,65],[1,17,2,47,1,77],
      [2,29,2,59],[1,11,2,41,1,71],[2,23,2,53],[1,25,2,55,1,85],[2,37,2,67],
      [1,19,2,49,1,79],[2,31,2,61],[1,13,2,43,1,73],[2,30,2,60],
      [1,12,2,42,1,72],[2,24,2,54],[1,6,2,36,1,66],[2,18,2,48],[1,20,2,50,1,80],
      [2,32,2,62],[1,14,2,44,1,74],[2,26,2,56],[1,8,2,38,1,68],[2,40,2,70],
      [1,22,2,52,1,82],[2,34,2,64],[1,16,2,46,1,76],[2,28,2,58],
      [1,12,2,42,2,72],[3,32,2,62],[2,22,3,52],[2,24,2,54,1,84],
      [1,14,3,44,1,74],[1,4,2,34,2,64],[3,36,2,66],[2,26,3,56],[2,16,2,46,1,76],
      [1,18,3,48,1,78],[1,8,2,38,2,68],[3,28,2,58],[2,30,3,60],[2,20,2,50,1,80],
      [1,10,3,40,1,70],[3,25,3,55],[1,7,3,37,2,67],[2,19,3,49,1,79],[3,31,3,61],
      [1,13,3,43,2,73],[1,5,3,35,2,65],[2,17,3,47,1,77],[3,29,3,59],
      [1,11,3,41,2,71],[2,23,3,53,1,83],[2,15,3,45,1,75],[3,27,3,57],
      [1,9,3,39,2,69],[2,21,3,51,1,81],[3,33,3,63]],
319=>[[1,0],[1,12],[1,24],[1,36],[1,48],[1,20],[1,32],[1,44],[1,56],[1,68],
      [1,40],[1,52],[1,64],[1,76],[1,88],[1,30],[1,42],[1,54],[1,66],[1,78],
      [1,50],[1,62],[1,74],[1,86],[1,98],[1,70],[1,82],[1,94],[1,106],[1,118],
      [2,51],[2,57],[1,33,1,93],[1,39,1,99],[1,33,1,93],[1,39,1,99],
      [1,45,1,105],[1,45,1,105],[1,51,1,111],[1,57,1,117],[2,41],[2,47],[2,53],
      [2,59],[2,53],[2,59],[1,35,1,95],[1,35,1,95],[1,41,1,101],[1,47,1,107],
      [2,31],[2,37],[2,43],[2,49],[2,43],[2,49],[2,55],[2,55],[1,31,1,91],
      [1,37,1,97],[1,21,1,81],[1,27,1,87],[2,63],[2,69],[2,63],[2,69],[2,75],
      [2,75],[2,81],[2,87],[1,11,1,71],[1,17,1,77],[1,23,1,83],[1,29,1,89],
      [1,23,1,83],[1,29,1,89],[2,65],[2,65],[2,71],[2,77],[1,1,1,61],[1,7,1,67],
      [1,13,1,73],[1,19,1,79],[1,13,1,73],[1,19,1,79],[1,25,1,85],[1,25,1,85],
      [2,61],[2,67],[1,2,2,62],[2,26,1,86],[3,50],[3,50],[1,14,2,74],
      [2,38,1,98],[1,14,2,74],[2,38,1,98],[3,62],[1,26,2,86],[2,32,1,92],[3,56],
      [1,20,2,80],[1,20,2,80],[2,44,1,104],[3,68],[2,44,1,104],[3,68],
      [1,32,2,92],[2,56,1,116],[2,22,1,82],[3,46],[1,10,2,70],[1,10,2,70],
      [2,34,1,94],[3,58],[2,34,1,94],[3,58],[1,22,2,82],[2,46,1,106],[3,52],
      [1,16,2,76],[2,40,1,100],[2,40,1,100],[3,64],[1,28,2,88],[3,64],
      [1,28,2,88],[2,52,1,112],[3,76],[3,42],[1,6,2,66],[2,30,1,90],[2,30,1,90],
      [3,54],[1,18,2,78],[3,54],[1,18,2,78],[2,42,1,102],[3,66],[1,12,2,72],
      [2,36,1,96],[3,60],[3,60],[1,24,2,84],[2,48,1,108],[1,24,2,84],
      [2,48,1,108],[3,72],[1,36,2,96],[1,15,3,75],[2,27,2,87],[3,39,1,99],
      [4,51],[1,3,3,63],[2,35,2,95],[3,47,1,107],[4,59],[1,11,3,71],[2,23,2,83],
      [3,55,1,115],[4,67],[1,19,3,79],[2,31,2,91],[3,43,1,103],[2,30,2,90],
      [3,42,1,102],[4,54],[1,6,3,66],[2,18,2,78],[3,50,1,110],[4,62],
      [1,14,3,74],[2,26,2,86],[3,38,1,98],[4,70],[1,22,3,82],[2,34,2,94],
      [3,46,1,106],[4,58],[3,45,1,105],[4,57],[1,9,3,69],[2,21,2,81],
      [3,33,1,93],[4,65],[1,17,3,77],[2,29,2,89],[3,41,1,101],[4,53],
      [1,25,3,85],[2,37,2,97],[3,49,1,109],[4,61],[1,13,3,73],[4,60],
      [1,12,3,72],[2,24,2,84],[3,36,1,96],[4,48],[1,20,3,80],[2,32,2,92],
      [3,44,1,104],[4,56],[1,8,3,68],[2,40,2,100],[3,52,1,112],[4,64],
      [1,16,3,76],[2,28,2,88],[1,12,4,72],[3,32,2,92],[5,52],[3,42,2,102],
      [5,62],[2,22,3,82],[2,24,3,84],[4,44,1,104],[1,4,4,64],[4,54,1,114],
      [1,14,4,74],[3,34,2,94],[3,36,2,96],[5,56],[2,16,3,76],[5,66],[2,26,3,86],
      [4,46,1,106],[4,48,1,108],[1,8,4,68],[3,28,2,88],[1,18,4,78],[3,38,2,98],
      [5,58],[5,60],[2,20,3,80],[4,40,1,100],[2,30,3,90],[4,50,1,110],
      [1,10,4,70],[3,25,3,85],[1,7,5,67],[5,49,1,109],[6,61],[4,43,2,103],
      [4,35,2,95],[2,17,4,77],[6,59],[1,11,5,71],[5,53,1,113],[5,45,1,105],
      [3,27,3,87],[1,9,5,69],[2,21,4,81],[6,63],[6,55],[4,37,2,97],[2,19,4,79],
      [3,31,3,91],[1,13,5,73],[1,5,5,65],[5,47,1,107],[3,29,3,89],[4,41,2,101],
      [2,23,4,83],[2,15,4,75],[6,57],[4,39,2,99],[5,51,1,111],[3,33,3,93]],
320=>[[1,0],[1,20],[1,40],[1,21,1,39],[1,27,1,33],[1,11,1,29],[1,17,1,23],
      [1,1,1,19],[1,7,1,13],[1,2,1,20,1,38],[1,14,1,20,1,26],[1,10,1,22,1,28],
      [1,10,1,16,1,34],[1,12,1,18,1,30],[1,6,1,24,1,30],[1,3,1,9,1,21,1,27],
      [1,11,1,17,1,23,1,29],[1,13,1,19,1,31,1,37],[1,6,1,12,1,18,1,24],
      [1,8,1,14,1,26,1,32],[1,16,1,22,1,28,1,34],[1,12,1,18,1,24,1,30,1,36],
      [1,8,1,14,1,20,1,26,1,32],[1,4,1,10,1,16,1,22,1,28],
      [1,7,1,13,1,19,2,25,1,31],[1,9,2,15,1,21,1,27,1,33],
      [1,5,1,11,1,17,1,23,1,29,1,35]],
321=>[[1,0],[1,20],[1,40],[1,30],[1,50],[1,70],[1,39,1,51],[1,33,1,57],
      [1,21,1,69],[1,27,1,63],[1,29,1,41],[1,23,1,47],[1,17,1,53],[1,11,1,59],
      [1,19,1,31],[1,13,1,37],[1,7,1,43],[1,1,1,49],[1,2,1,38,1,50],
      [1,14,1,26,1,50],[1,20,1,32,1,68],[1,20,1,44,1,56],[1,10,1,22,1,58],
      [1,10,1,34,1,46],[1,28,1,40,1,52],[1,16,1,40,1,64],[1,18,1,30,1,42],
      [1,6,1,30,1,54],[1,12,1,48,1,60],[1,24,1,36,1,60],[1,3,1,27,1,39,1,51],
      [1,11,1,23,1,47,1,59],[1,19,1,31,1,43,1,67],[1,6,1,18,1,42,1,54],
      [1,14,1,26,1,38,1,62],[1,22,1,34,1,46,1,58],[1,9,1,21,1,33,1,57],
      [1,17,1,29,1,41,1,53],[1,13,1,37,1,49,1,61],[1,12,1,24,1,36,1,48],
      [1,8,1,32,1,44,1,56],[1,16,1,28,1,52,1,64],[1,12,1,24,1,36,1,48,1,60],
      [1,8,1,20,1,32,1,44,1,56],[1,4,1,16,1,28,1,40,1,52],
      [1,18,1,30,1,42,1,54,1,66],[1,14,1,26,1,38,1,50,1,62],
      [1,10,1,22,1,34,1,46,1,58],[1,13,2,25,1,37,1,49,1,61],
      [1,7,1,19,1,31,1,43,2,55],[1,11,1,23,2,35,1,47,1,59],
      [1,5,1,17,1,29,1,41,1,53,1,65],[1,9,1,21,1,33,2,45,1,57],
      [2,15,1,27,1,39,1,51,1,63]],
322=>[[1,0],[1,30],[1,11,1,19],[1,13,1,17],[1,1,1,29],[1,7,1,23],
      [1,2,1,10,1,18],[1,6,1,10,1,14],[1,12,1,20,1,28],[1,16,1,20,1,24],
      [1,3,1,11,1,19,1,27],[1,6,1,14,1,18,1,22],[1,9,1,13,1,17,1,21],
      [1,8,1,12,1,16,1,24],[1,4,1,8,1,12,1,16,1,20],
      [1,10,1,14,1,18,1,22,1,26],[1,7,1,11,2,15,1,19,1,23],
      [1,5,1,9,1,13,1,17,1,21,1,25]])
  d[x]
end
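
The first bullet's observation (moving d out of the function makes compilation fast) can be sketched as follows. This is a minimal illustration, not a fix for the regression: the names `TABLE` and `g` are hypothetical, the table is cut down to two short entries copied from the listing above, and the approach does not cover the case where the entries depend on the function's arguments.

```julia
# Hypothetical sketch: the data lives in a `const` global, so it is built once
# when the file is loaded and the function body contains only a lookup.
const TABLE = Dict{Int,Vector{Vector{Int}}}(
    4  => [[1,0],[1,4],[1,8],[1,5,1,7],[1,3,1,5],[1,1,1,3],[1,2,1,4,1,6]],
    12 => [[1,0],[1,12],[1,1,1,11],[1,4,1,8],[1,5,1,7],[1,2,1,4,1,6],
           [1,6,1,8,1,10],[1,3,1,5,1,7,1,9]],
)

# `g` is a hypothetical stand-in for `f`; it only indexes into the table.
g(x) = TABLE[x]

# The first call includes compilation of `g`, which is now a tiny function.
@time g(4)
```

Because `TABLE` is declared `const`, the lookup inside `g` stays type-stable, which a non-constant global would not guarantee.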
@giordano
Contributor

giordano added the regression and compiler:latency labels on May 20, 2022
@KristofferC
Member

All of the time here is spent in LLVM optimization and the passes that take the most time are:

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  37.0673 ( 33.1%)   0.0074 (  1.5%)  37.0748 ( 33.0%)  37.0760 ( 33.1%)  Unroll loops
  23.0346 ( 20.6%)   0.0000 (  0.0%)  23.0346 ( 20.5%)  23.0350 ( 20.6%)  LICM for julia specific intrinsics. #2
  22.9855 ( 20.5%)   0.0226 (  4.4%)  23.0081 ( 20.5%)  23.0086 ( 20.6%)  LICM for julia specific intrinsics.
  17.1350 ( 15.3%)   0.0032 (  0.6%)  17.1382 ( 15.3%)  17.1351 ( 15.3%)  Global Value Numbering
   4.4282 (  4.0%)   0.0043 (  0.8%)   4.4325 (  3.9%)   4.4324 (  4.0%)  Induction Variable Simplification
   3.5448 (  3.2%)   0.0088 (  1.7%)   3.5536 (  3.2%)   3.5419 (  3.2%)  Combine redundant instructions
   1.7191 (  1.5%)   0.0083 (  1.6%)   1.7274 (  1.5%)   1.7212 (  1.5%)  Late Lower GCFrame Pass
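
To rank passes from a report like this without reading it by eye, the rows can be parsed with a few lines of Julia. This is only a sketch: `parse_pass_row` is a hypothetical helper, and it assumes the usual LLVM `-time-passes` column order (user, system, user+system, wall, pass name).

```julia
# Hypothetical helper: parse one row of an LLVM `-time-passes` report.
# Assumed column order: user, system, user+system, wall, then the pass name.
function parse_pass_row(line::AbstractString)
    # Each time value is a decimal number followed by whitespace and "(";
    # the percentages are followed by "%" and therefore do not match.
    times = [parse(Float64, m.captures[1]) for m in eachmatch(r"(\d+\.\d+)\s+\(", line)]
    length(times) == 4 || return nothing
    # The pass name is whatever follows the last "%)".
    name = strip(last(split(line, "%)")))
    return (name = String(name), user = times[1], system = times[2],
            total = times[3], wall = times[4])
end

# Two rows copied from the report above.
rows = [
    "  37.0673 ( 33.1%)   0.0074 (  1.5%)  37.0748 ( 33.0%)  37.0760 ( 33.1%)  Unroll loops",
    "  17.1350 ( 15.3%)   0.0032 (  0.6%)  17.1382 ( 15.3%)  17.1351 ( 15.3%)  Global Value Numbering",
]

parsed = filter(!isnothing, map(parse_pass_row, rows))

# Print the passes sorted by wall time, slowest first.
for p in sort(parsed; by = p -> p.wall, rev = true)
    println(rpad(p.name, 30), p.wall, " s")
end
```

Lines that do not contain exactly four time columns (headers, separators) make the helper return `nothing`, so a whole report can be fed through it line by line.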

@haampie haampie mentioned this issue Jun 10, 2022
1 task
haampie added a commit to haampie/julia that referenced this issue Jun 10, 2022
Adds a convenient way to enable PGO+LTO on Julia and LLVM together:

1. `export PATH=/some/recent/clang/bin:$PATH`
2. `make -C contrib/pgo-lto -j$(nproc) stage1`
3. `rm -rf profiles`
4. `./julia -O3 -e 'using Pkg; Pkg.add("Unitful"); Pkg.test("Unitful")'`
5. `make -C contrib/pgo-lto -j$(nproc) stage2`

This results quite often in spectacular speedups for time to first X as
it reduces the time spent in LLVM optimization passes by 25 or even 30%.

Example 1:

```julia
using LoopVectorization
function f!(a, b)
    @turbo for i in eachindex(a)
        a[i] *= b[i]
    end
    return a
end
f!(rand(1), rand(1))
```

```console
$ time ./julia -O3 lv.jl
```

Without PGO+LTO: 14.801s
With PGO+LTO: 11.978s (-19%)

Example 2:

```console
$ time ./julia -e 'using Pkg; Pkg.test("Unitful");'
```

Without PGO+LTO: 1m47.688s
With PGO+LTO: 1m35.704s (-11%)

Example 3 (taken from issue JuliaLang#45395, which is almost only LLVM):

```console
$ JULIA_LLVM_ARGS=-time-passes ./julia script-45395.jl
```

Without PGO+LTO:

```
===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 101.0130 seconds (98.6253 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  53.6961 ( 54.7%)   0.1050 (  3.8%)  53.8012 ( 53.3%)  53.8045 ( 54.6%)  Unroll loops
  25.5423 ( 26.0%)   0.0072 (  0.3%)  25.5495 ( 25.3%)  25.5444 ( 25.9%)  Global Value Numbering
   7.1995 (  7.3%)   0.0526 (  1.9%)   7.2521 (  7.2%)   7.2517 (  7.4%)  Induction Variable Simplification
   5.0541 (  5.1%)   0.0098 (  0.3%)   5.0639 (  5.0%)   5.0561 (  5.1%)  Combine redundant instructions #2
```

With PGO+LTO:

```
===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 72.6507 seconds (70.1337 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  36.0894 ( 51.7%)   0.0825 (  2.9%)  36.1719 ( 49.8%)  36.1738 ( 51.6%)  Unroll loops
  16.5713 ( 23.7%)   0.0129 (  0.5%)  16.5843 ( 22.8%)  16.5794 ( 23.6%)  Global Value Numbering
   5.9047 (  8.5%)   0.0395 (  1.4%)   5.9442 (  8.2%)   5.9438 (  8.5%)  Induction Variable Simplification
   4.7566 (  6.8%)   0.0078 (  0.3%)   4.7645 (  6.6%)   4.7575 (  6.8%)  Combine redundant instructions #2
```

Or -28% time spent in LLVM.
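
As a quick sanity check, the percentages quoted in the three examples can be recomputed from the reported timings. This is only an illustrative sketch: the helper name `relchange` is made up here and is not part of the commit.

```julia
# Hypothetical helper (not part of the commit): relative change in percent.
relchange(before, after) = 100 * (after - before) / before

# Timings copied from the examples above, converted to seconds.
lv      = relchange(14.801, 11.978)            # Example 1: LoopVectorization script
unitful = relchange(60 + 47.688, 60 + 35.704)  # Example 2: Unitful test suite
llvm    = relchange(101.0130, 72.6507)         # Example 3: total LLVM pass time

# Print the three relative changes, rounded to one decimal place.
println(round.((lv, unitful, llvm); digits = 1))
```

Rounded to whole percents, the three values agree with the quoted -19%, -11% and -28%.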

---

Finally there's a significant reduction in binary sizes. For libLLVM.so:

```
79M	usr/lib/libLLVM-13jl.so (before)
67M	usr/lib/libLLVM-13jl.so (after)
```

And it can be reduced by another 2MB with `--icf=safe` when using LLD as
a linker anyway.
haampie added a commit to haampie/julia that referenced this issue Jun 27, 2022
haampie added a commit to haampie/julia that referenced this issue Jun 28, 2022
Adds a convenient way to enable PGO+LTO on Julia and LLVM together:

1. `cd contrib/pgo-lto`
2. `make -j$(nproc) stage1`
3. `make clean-profiles`
4. `./stage1.build/julia -O3 -e 'using Pkg; Pkg.add("LoopVectorization"); Pkg.test("LoopVectorization")'`
5. `make -j$(nproc) stage2`

This results quite often in spectacular speedups for time to first X as
it reduces the time spent in LLVM optimization passes by 25 or even 30%.

Example 1:

```julia
using LoopVectorization
function f!(a, b)
    @turbo for i in eachindex(a)
        a[i] *= b[i]
    end
    return a
end
f!(rand(1), rand(1))
```

```console
$ time ./julia -O3 lv.jl
```

Without PGO+LTO: 14.801s
With PGO+LTO: 11.978s (-19%)

Example 2:

```console
$ time ./julia -e 'using Pkg; Pkg.test("Unitful");'
```

Without PGO+LTO: 1m47.688s
With PGO+LTO: 1m35.704s (-11%)

Example 3 (taken from issue JuliaLang#45395, which is almost only LLVM):

```console
$ JULIA_LLVM_ARGS=-time-passes ./julia script-45395.jl
```

Without PGO+LTO:

```
===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 101.0130 seconds (98.6253 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  53.6961 ( 54.7%)   0.1050 (  3.8%)  53.8012 ( 53.3%)  53.8045 ( 54.6%)  Unroll loops
  25.5423 ( 26.0%)   0.0072 (  0.3%)  25.5495 ( 25.3%)  25.5444 ( 25.9%)  Global Value Numbering
   7.1995 (  7.3%)   0.0526 (  1.9%)   7.2521 (  7.2%)   7.2517 (  7.4%)  Induction Variable Simplification
   5.0541 (  5.1%)   0.0098 (  0.3%)   5.0639 (  5.0%)   5.0561 (  5.1%)  Combine redundant instructions #2
```

With PGO+LTO:

```
===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 72.6507 seconds (70.1337 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  36.0894 ( 51.7%)   0.0825 (  2.9%)  36.1719 ( 49.8%)  36.1738 ( 51.6%)  Unroll loops
  16.5713 ( 23.7%)   0.0129 (  0.5%)  16.5843 ( 22.8%)  16.5794 ( 23.6%)  Global Value Numbering
   5.9047 (  8.5%)   0.0395 (  1.4%)   5.9442 (  8.2%)   5.9438 (  8.5%)  Induction Variable Simplification
   4.7566 (  6.8%)   0.0078 (  0.3%)   4.7645 (  6.6%)   4.7575 (  6.8%)  Combine redundant instructions #2
```

Or -28% time spent in LLVM.

---

Finally there's a significant reduction in binary sizes. For libLLVM.so:

```
79M	usr/lib/libLLVM-13jl.so (before)
67M	usr/lib/libLLVM-13jl.so (after)
```

And it can be reduced by another 2MB with `--icf=safe` when using LLD as
a linker anyway.

Turn into makefile

Newline

Use two out of source builds

Ignore profiles + build dirs

Add --icf=safe

stage0 setup prebuilt clang with [cd]tors->init/fini patch
@vtjnash
Member

vtjnash commented Feb 6, 2024

I don't think there is much to say here: compilation time is expected to be super-linear in code size, so do not embed data into programs if you want to avoid that. Otherwise, go ahead and do it.
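
A minimal sketch of that advice, assuming the table does not depend on the function's arguments: the data is built once as a `const` global and the function only looks it up. The names `D` and `g`, the shortened table, and the lookup body are illustrative assumptions, not the code from the report.

```julia
# Illustrative only: a tiny stand-in for the large generated table in the report.
# It is built once at load time instead of inside the function body.
const D = Dict{Int,Vector{Vector{Int}}}(
    4 => [[1, 0], [1, 4], [1, 8], [1, 5, 1, 7]],
    5 => [[1, 0], [1, 4], [1, 8], [1, 4]],
)

# Hypothetical function: its body stays small, so there is little code to compile.
g(x) = D[x]

@time g(4)  # first call includes compilation
@time g(4)  # later calls reuse the compiled code
```

As the original report points out, this does not carry over directly when the entries are expressions in the function's arguments, since the table then has to be built inside the function.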

@vtjnash vtjnash closed this as not planned Feb 6, 2024
@vtjnash
Member

vtjnash commented Feb 6, 2024

FWIW, the timing on a v1.11 nightly build with assertions enabled is currently:

julia> @time f(4)
 98.236890 seconds (6.53 M allocations: 289.084 MiB, 0.13% gc time, 100.00% compilation time)

oscardssmith added a commit that referenced this issue Feb 9, 2024
Adds a convenient way to enable PGO+LTO on Julia and LLVM together:

1. `cd contrib/pgo-lto`
2. `make -j$(nproc) stage1`
3. `make clean-profiles`
4. `./stage1.build/julia -O3 -e 'using Pkg; Pkg.add("LoopVectorization"); Pkg.test("LoopVectorization")'`
5. `make -j$(nproc) stage2`

<details>
<summary>Output looks roughly as follows</summary>

```console
$ make -C contrib/pgo-lto top 
make: Entering directory '/dev/shm/julia/contrib/pgo-lto'
llvm-profdata show --topn=50 /dev/shm/julia/contrib/pgo-lto/profiles/merged.prof | c++filt
Instrumentation level: IR  entry_first = 0
Total functions: 85943
Maximum function count: 7867557260
Maximum internal block count: 3468437590
Top 50 functions with the largest internal block counts: 
  llvm::BitVector::operator|=(llvm::BitVector const&), max count = 7867557260
  LateLowerGCFrame::ComputeLiveness(State&), max count = 3468437590
  llvm::hashing::detail::hash_combine_recursive_helper::hash_combine_recursive_helper(), max count = 1742259834
  llvm::SUnit::addPred(llvm::SDep const&, bool), max count = 511396575
  llvm::LiveRange::overlaps(llvm::LiveRange const&, llvm::CoalescerPair const&, llvm::SlotIndexes const&) const, max count = 508061762
  llvm::StringMapImpl::LookupBucketFor(llvm::StringRef), max count = 505682177
  std::map<llvm::BasicBlock*, BBState, std::less<llvm::BasicBlock*>, std::allocator<std::pair<llvm::BasicBlock* const, BBState> > >::operator[](llvm::BasicBlock* const&), max count = 395628888
  llvm::LiveRange::advanceTo(llvm::LiveRange::Segment const*, llvm::SlotIndex) const, max count = 384642728
  llvm::LiveRange::isLiveAtIndexes(llvm::ArrayRef<llvm::SlotIndex>) const, max count = 380291040
  llvm::PassRegistry::enumerateWith(llvm::PassRegistrationListener*), max count = 352313953
  ijl_method_instance_add_backedge, max count = 349608221
  llvm::SUnit::ComputeHeight(), max count = 336604330
  llvm::LiveRange::advanceTo(llvm::LiveRange::Segment*, llvm::SlotIndex), max count = 331030109
  llvm::SmallPtrSetImplBase::insert_imp(void const*), max count = 272966545
  llvm::LiveIntervals::checkRegMaskInterference(llvm::LiveInterval&, llvm::BitVector&), max count = 257449540
  LateLowerGCFrame::ComputeLiveSets(State&), max count = 252096274
  /dev/shm/julia/src/jltypes.c:has_free_typevars, max count = 230879464
  ijl_get_pgcstack, max count = 216953592
  LateLowerGCFrame::RefineLiveSet(llvm::BitVector&, State&, std::vector<int, std::allocator<int> > const&), max count = 188013152
  /dev/shm/julia/src/flisp/flisp.c:apply_cl, max count = 174863813
  /dev/shm/julia/src/flisp/builtins.c:fl_memq, max count = 168621603
```
</details>


This results quite often in spectacular speedups for time to first X as
it reduces the time spent in LLVM optimization passes by 25 or even 30%.

Example 1:

```julia
using LoopVectorization
function f!(a, b)
    @turbo for i in eachindex(a)
        a[i] *= b[i]
    end
    return a
end
f!(rand(1), rand(1))
```

```console
$ time ./julia -O3 lv.jl
```

Without PGO+LTO: 14.801s
With PGO+LTO: 11.978s (-19%)

Example 2:

```console
$ time ./julia -e 'using Pkg; Pkg.test("Unitful");'
```

Without PGO+LTO: 1m47.688s
With PGO+LTO: 1m35.704s (-11%)

Example 3 (taken from issue #45395, which is almost only LLVM):

```console
$ JULIA_LLVM_ARGS=-time-passes ./julia script-45395.jl
```

Without PGO+LTO:

```
===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 101.0130 seconds (98.6253 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  53.6961 ( 54.7%)   0.1050 (  3.8%)  53.8012 ( 53.3%)  53.8045 ( 54.6%)  Unroll loops
  25.5423 ( 26.0%)   0.0072 (  0.3%)  25.5495 ( 25.3%)  25.5444 ( 25.9%)  Global Value Numbering
   7.1995 (  7.3%)   0.0526 (  1.9%)   7.2521 (  7.2%)   7.2517 (  7.4%)  Induction Variable Simplification
   5.0541 (  5.1%)   0.0098 (  0.3%)   5.0639 (  5.0%)   5.0561 (  5.1%)  Combine redundant instructions #2
```

With PGO+LTO:

```
===-------------------------------------------------------------------------===
                      ... Pass execution timing report ...
===-------------------------------------------------------------------------===
  Total Execution Time: 72.6507 seconds (70.1337 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  36.0894 ( 51.7%)   0.0825 (  2.9%)  36.1719 ( 49.8%)  36.1738 ( 51.6%)  Unroll loops
  16.5713 ( 23.7%)   0.0129 (  0.5%)  16.5843 ( 22.8%)  16.5794 ( 23.6%)  Global Value Numbering
   5.9047 (  8.5%)   0.0395 (  1.4%)   5.9442 (  8.2%)   5.9438 (  8.5%)  Induction Variable Simplification
   4.7566 (  6.8%)   0.0078 (  0.3%)   4.7645 (  6.6%)   4.7575 (  6.8%)  Combine redundant instructions #2
```

Or -28% time spent in LLVM.

`perf` reports show this is mostly due to fewer instructions and a
reduction in icache misses.

---

Finally there's a significant reduction in binary sizes. For libLLVM.so:

```
79M	usr/lib/libLLVM-13jl.so (before)
67M	usr/lib/libLLVM-13jl.so (after)
```

And it can be reduced by another 2MB with `--icf=safe` when using LLD as
a linker anyways.

- [x] Two out-of-source builds would be better than a single in-source
build, so that it's easier to find good profile data

---------

Co-authored-by: Oscar Smith <oscardssmith@gmail.com>
Co-authored-by: Lilith Orion Hafner <lilithhafner@gmail.com>
@Zentrik Zentrik mentioned this issue Apr 16, 2024
7 tasks
giordano pushed a commit that referenced this issue Jul 26, 2024
This uses LLVM's BOLT to optimize libLLVM, libjulia-internal and
libjulia-codegen.

This improves the allinference benchmarks by about 10% largely due to
the optimization of libjulia-internal.
The example in issue #45395
which stresses LLVM significantly more also sees a ~10% improvement.
We see a 20% improvement on 
```julia
@time for i in 1:100000000
    string(i)
end
```

When building corecompiler.ji:

- BOLT gives about a 16% improvement
- PGO+LTO gives about a 21% improvement
- PGO+LTO+BOLT gives about a 23% improvement

This requires only a single build of LLVM, compared to the two needed
for PGO, and theoretically none if we change the binary builder script
(i.e. we build with relocations and `-fno-reorder-blocks-and-partition`,
then use BOLT to get binaries with no relocations and reordered blocks,
and then ship both binaries?). Also, this can theoretically improve the
performance of a PGO+LTO build by a couple of percent.

The only reproducible test problem I see is that the BOLT, PGO+LTO and
PGO+LTO+BOLT builds all cause `readelf` to emit warnings as part of the
`osutils` tests.

```
readelf: Warning: Unrecognised form: 0x22
readelf: Warning: DIE has locviews without loclist
readelf: Warning: Unrecognised form: 0x23
readelf: Warning: DIE at offset 0x227399 refers to abbreviation number 14754 which does not exist
readelf: Warning: Bogus end-of-siblings marker detected at offset 212aa9 in .debug_info section
readelf: Warning: Bogus end-of-siblings marker detected at offset 212ab0 in .debug_info section
readelf: Warning: Further warnings about bogus end-of-sibling markers suppressed
```

The unrecognised form warnings seem to be a bug in binutils,
https://sourceware.org/bugzilla/show_bug.cgi?id=28981.
I believe the `DIE at offset` warning was fixed in binutils 2.36,
https://sourceware.org/bugzilla/show_bug.cgi?id=26808, but `ld -v` says
I have 2.38.
I assume these are all benign. I also don't see them on CI here
https://buildkite.com/julialang/julia-buildkite/builds/1507#018f00e7-0737-4a42-bcd9-d4061dc8c93e
so it could just be a local issue.
lazarusA pushed a commit to lazarusA/julia that referenced this issue Aug 17, 2024