Commit
Reimplement SabreSwap heuristic scoring in Rust (#7977)
* Reimplement SabreSwap heuristic scoring in multithreaded Rust

This commit re-implements the core heuristic scoring of swap candidates in the SabreSwap pass as a multithreaded Rust routine. The heuristic scoring in sabre previously looped over all potential swap candidates serially in Python and computed a heuristic score to decide which candidate to pick. This can easily be done in parallel as there is no data dependency between scoring the different candidates. By performing this in Rust, not only is the scoring operation done more quickly for each candidate, but we can also leverage multithreading to do this efficiently in parallel.

* Make sabre_swap a separate Rust module

This commit moves the sabre-specific code into a separate Rust module. We were already using a separate Python module for the sabre code; this just mirrors that in the Rust code for better organization.

* Fix lint

* Remove unnecessary parallel iteration

This commit removes an unnecessary parallel iterator over the swap scores to find the minimum and just does it serially. The threading overhead of the parallel iterator is not worth paying, as finding the minimum is fairly quick.

* Revert change to DECAY_RESET_INTERVAL behavior

* Avoid Bit._index

* Add __str__ definition for DEBUG logs

* Cleanup greedy swap path

* Preserve insertion order in SwapScores

The use of an inner hashmap meant the swap candidates were being evaluated in a different order based on the hash seeding instead of the order generated from the Python side. This commit fixes this by switching the internal type to an IndexMap, which for a little overhead preserves the insertion order on iteration.

* Work with virtual indices in obtain swap

* Simplify decay reset() method

* Fix lint

* Fix typo

* Rename nlayout methods

* Update docstrings for SwapScores type

* Use correct swap method for _undo_operations()

* Fix rebase error

* Revert test change

* Reverse if condition in lookahead cost

* Fix missing len division on lookahead cost

* Remove unused EXTENDED_SET_WEIGHT python global

* Switch to serial iterator for heuristic scoring

While the heuristic scoring can be done in parallel, as there is no data dependency between computing the scores for candidates, the overhead of dealing with multithreading eliminates any benefits from parallel execution. This is because the relative computation is fairly quick and the number of candidates is never very large (since coupling maps are typically sparsely connected). This commit switches to a serial iterator, which will speed up execution in practice over running the iteration in parallel.

* Return a 2d numpy array for best swaps and avoid conversion cost

* Migrate obtain_swaps to rust

This commit simplifies the Rust loop by avoiding the need to have mutable swap scores shared between Rust and Python. Instead, the obtain swaps function, which gets the swap candidates for each layer, is migrated to Rust using a new neighbors table that is computed once per sabre class. This moves the iteration from obtain swaps to Rust and eliminates it as a bottleneck.

* Remove unused SwapScores class

* Fix module metadata path

* Add release note

* Add rust docstrings

* Pre-allocate candidate_swaps

* Double swap instead of clone

* Remove unnecessary list comprehensions

* Move random choice into rust

After rewriting the heuristic scoring in Rust, the biggest bottleneck in the function (outside of computing the extended set and applying gates to the dag) was performing the random choice between the best candidates via numpy. This wasn't necessary since we can just do the random choice in Rust and have it return the best candidate.
This commit adds a new class to represent a shared rng that is reused on each scoring call and changes sabre_score_heuristic to return the best swap. The tradeoff with this PR is that it changes the seeding, so when compared to previous versions of SabreSwap, different results will be returned with the same seed value.

* Use int32 for max default rng seed for windows compat

* Fix bounds check on custom sequence type's __getitem__

Co-authored-by: Kevin Hartman <kevin@hart.mn>

* Only run parallel sort if not in a parallel context

This commit updates the sort step in the sabre algorithm to only run a parallel sort if we're not already in a parallel context. This is to prevent a potential over-dispatch of work if we're trying to use multiple threads from multiple processes. At the same time, the sort algorithm used is switched to the unstable variant because a stable sort isn't necessary for this application and an unstable sort has less overhead.

Co-authored-by: Kevin Hartman <kevin@hart.mn>
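
To make the serial scoring loop described above concrete, here is a minimal std-only sketch. It is not the code from this commit: the names `total_distance`, `lookahead_cost`, and `best_swaps`, the value of `EXTENDED_SET_WEIGHT`, and the plain `Vec`-based layout and distance matrix are all assumptions chosen for illustration, and the decay factor is left out. It shows the division by set length in the lookahead cost, the "double swap instead of clone" trick on a trial layout, and the unstable sort of the tied best candidates.

```rust
/// Assumed weight of the extended-set term (illustrative value only).
const EXTENDED_SET_WEIGHT: f64 = 0.5;

/// Sum of distances between the physical qubits each two-qubit gate acts on.
/// `gates` holds virtual qubit pairs, `layout[v]` is the physical qubit of `v`.
fn total_distance(gates: &[[usize; 2]], layout: &[usize], dist: &[Vec<f64>]) -> f64 {
    gates.iter().map(|g| dist[layout[g[0]]][layout[g[1]]]).sum()
}

/// Lookahead cost: mean front-layer distance plus a weighted mean
/// extended-set distance. Assumes `front` is non-empty.
fn lookahead_cost(
    front: &[[usize; 2]],
    extended: &[[usize; 2]],
    layout: &[usize],
    dist: &[Vec<f64>],
) -> f64 {
    let mut cost = total_distance(front, layout, dist) / front.len() as f64;
    if !extended.is_empty() {
        cost += EXTENDED_SET_WEIGHT * total_distance(extended, layout, dist)
            / extended.len() as f64;
    }
    cost
}

/// Score every candidate swap serially and return all swaps tied for the
/// minimum score, sorted so the result does not depend on candidate order.
fn best_swaps(
    candidates: &[[usize; 2]],
    front: &[[usize; 2]],
    extended: &[[usize; 2]],
    layout: &[usize],
    dist: &[Vec<f64>],
) -> Vec<[usize; 2]> {
    let mut min_score = f64::INFINITY;
    let mut best: Vec<[usize; 2]> = Vec::new();
    // One working copy of the layout, reused for every candidate.
    let mut trial = layout.to_vec();
    for &swap in candidates {
        // Apply the swap (in virtual indices), score, then swap back
        // instead of cloning the layout per candidate.
        trial.swap(swap[0], swap[1]);
        let score = lookahead_cost(front, extended, &trial, dist);
        trial.swap(swap[0], swap[1]);
        if score < min_score {
            min_score = score;
            best.clear();
            best.push(swap);
        } else if score == min_score {
            best.push(swap);
        }
    }
    // A stable sort is not needed here, so use the cheaper unstable one.
    best.sort_unstable();
    best
}
```

On a four-qubit line with an identity layout and a single front-layer gate on virtual qubits 0 and 3, the candidates `[0, 1]` and `[2, 3]` tie when the extended set is empty; a random choice between the returned swaps would then be made by the caller's seeded rng.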
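
The neighbors table and the migrated obtain swaps step can be sketched in the same spirit. Again, this is a hedged illustration rather than the commit's implementation: `build_neighbors`, `obtain_swaps`, and the two layout slices `virt_to_phys` and `phys_to_virt` are hypothetical names, and the coupling map is assumed to be given as an undirected edge list. The table is built once, and each call then only walks the neighbors of the qubits in the front layer, returning candidates in virtual indices.

```rust
/// Build a neighbors table from an undirected edge list: entry `p` lists
/// the physical qubits adjacent to physical qubit `p`, in ascending order.
fn build_neighbors(num_qubits: usize, edges: &[[usize; 2]]) -> Vec<Vec<usize>> {
    let mut neighbors = vec![Vec::new(); num_qubits];
    for &[a, b] in edges {
        if !neighbors[a].contains(&b) {
            neighbors[a].push(b);
        }
        if !neighbors[b].contains(&a) {
            neighbors[b].push(a);
        }
    }
    for row in neighbors.iter_mut() {
        row.sort_unstable();
    }
    neighbors
}

/// Collect candidate swaps for the front layer, expressed in virtual indices.
/// Each candidate pairs a front-layer virtual qubit with the virtual qubit
/// sitting on a physically adjacent qubit; duplicates are skipped.
fn obtain_swaps(
    front: &[[usize; 2]],
    neighbors: &[Vec<usize>],
    virt_to_phys: &[usize],
    phys_to_virt: &[usize],
) -> Vec<[usize; 2]> {
    // Rough initial capacity so small front layers rarely reallocate.
    let mut swaps: Vec<[usize; 2]> = Vec::with_capacity(2 * front.len());
    for gate in front {
        for &v in gate {
            let p = virt_to_phys[v];
            for &n in &neighbors[p] {
                let u = phys_to_virt[n];
                // Normalize the pair so (v, u) and (u, v) count as one swap.
                let swap = if v < u { [v, u] } else { [u, v] };
                if !swaps.contains(&swap) {
                    swaps.push(swap);
                }
            }
        }
    }
    swaps
}
```

The linear `contains` check keeps the candidates in insertion order without a hash map, which is reasonable under the commit's observation that the number of candidates stays small on sparsely connected coupling maps.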