Summary:
Non-differentiability of MVaR came up in facebook/Ax#2077, so I'm reviving this diff to avoid similar errors in the future.
This upstreams a few improvements to MVaR, such as `m > 2` support for `get_mvar_set_via_counting` (formerly `get_mvar_cpu`) and approximate differentiability.

The choice between the two implementations is exposed as a kwarg, with docstrings explaining when each option is preferred. These were the findings from some old benchmarking I did (N1799210):

- `get_mvar_set_vectorized` was used for acquisition functions (since `n_w = 32` was small), and `get_mvar_set_via_counting` was used on CPU for computing the MVaR for reporting with `n_w = 512`.
- `m = 2`: Counting-based code is slower, except when the MVaR set is very large and running on CPU. GPU is the faster device except for small `n_w`.
- `m = 3`: There's a bit of inconsistency here, but overall: counting-based code is slow for small `n_w`, particularly when running on GPU. GPU is slower than CPU for small MVaR sets. For larger MVaR sets, the vectorized implementation is faster on GPU until it OOMs. On CPU, counting-based code is faster for large MVaR sets.
- `m = 4`: Vectorized code is faster for small MVaR sets. Counting-based code is the best bet for larger sets.

Differential Revision: D35683069
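For readers unfamiliar with the counting idea, here is a minimal pure-Python sketch of what a counting-based MVaR set computation can look like. It is not the code in this diff: the function name `mvar_set_by_counting` is hypothetical, it assumes maximization, and it brute-forces every combination of observed coordinate values, so it is only meant for tiny inputs.

```python
from itertools import product


def mvar_set_by_counting(samples, alpha):
    """Brute-force MVaR set of `samples` (a list of m-tuples) at level `alpha`.

    Assumption: maximization. A point z is feasible if at least a fraction
    `alpha` of the samples are >= z in every coordinate.
    """
    n = len(samples)
    m = len(samples[0])
    # Candidate points: every combination of observed per-coordinate values.
    axes = [sorted({s[j] for s in samples}) for j in range(m)]
    feasible = []
    for z in product(*axes):
        # Count the samples that weakly dominate the candidate z.
        count = sum(all(s[j] >= z[j] for j in range(m)) for s in samples)
        if count / n >= alpha:
            feasible.append(z)

    def dominated(z):
        # True if another feasible point is at least as large in every coordinate.
        return any(
            other != z and all(o >= v for o, v in zip(other, z))
            for other in feasible
        )

    # Keep only the Pareto-maximal feasible points.
    return sorted(z for z in feasible if not dominated(z))


if __name__ == "__main__":
    samples = [(1, 4), (2, 3), (3, 2), (4, 1)]
    print(mvar_set_by_counting(samples, alpha=0.5))
```

With the four samples above and `alpha=0.5`, this returns `[(1, 3), (2, 2), (3, 1)]`. The candidate grid has up to `n_w ** m` points, so the sketch shows the counting step itself and says nothing about the performance of either implementation discussed above.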