
Upstream merge 0803 #1887

Merged
merged 2,619 commits into from
Aug 4, 2022
Conversation

jjsjann123 (Collaborator) commented:

merging upstream/master into csarofeen/devel

Upstream master commit: 9647bec
Corresponding PR to bump our master branch: #1886

Chillee and others added 30 commits July 27, 2022 00:31
…ted data types (pytorch#82183)

This continues the fixes for TestConsistency for the MPS backend.

* Add error messages for unsupported matmul ops

* Add error handling for int inputs for linear op


Pull Request resolved: pytorch#82183
Approved by: https://github.com/razarmehr
The Docker docs say, "For other items (files, directories) that do not require ADD’s tar auto-extraction capability, you should always use COPY": https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#add-or-copy

I've found this by running https://github.com/hadolint/hadolint

This is a follow-up after pytorch#81944
Pull Request resolved: pytorch#82151
Approved by: https://github.com/huydhn, https://github.com/jeffdaily, https://github.com/ZainRizvi
### Description
We need to make sure the int overload of expand gets redispatched to the same device; otherwise at::native::expand just calls a bunch of lower-level ops. (A minimal illustration follows.)
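A minimal, hypothetical illustration of the expected behavior (not code from this PR); meta tensors make the dispatch device easy to observe:

```
import torch

# Sketch only: expand with int sizes should stay on the calling
# tensor's backend rather than decomposing into lower-level ops.
t = torch.empty(2, 1, 3, device="meta")
out = t.expand(2, 4, 3)
print(out.shape, out.device)  # torch.Size([2, 4, 3]) meta
```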


Pull Request resolved: pytorch#82264
Approved by: https://github.com/bdhirsh

Pull Request resolved: pytorch#82269
Approved by: https://github.com/kit1980
Per title; unfortunately, testing invalid reads with the caching allocator is hard.
Pull Request resolved: pytorch#82272
Approved by: https://github.com/cpuhrsch
Implements linspace with arange, and logspace with linspace.
- Implements a more precise path in linspace's ref when the dtype is integral, to avoid off-by-one issues when the output of the computation is cast to int (a short sketch follows this list). The trade-off is an increased chance of overflow.
- Files several issues (pytorch#82242, pytorch#82230, pytorch#81996) on preexisting problems with linspace and logspace. These mainly concern integral dtypes; the affected tests are xfailed in this PR.
- Updates the check that the reference implementation is closer to the precise implementation than the torch implementation, so that the dtype kwarg is also set to the precise dtype.
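A minimal sketch of the integer-dtype hazard (illustrative, not from this PR):

```
import torch

# Values are computed in floating point and then cast to the integral
# dtype; truncation after an inexact step (10/3 here) is where
# off-by-one results can creep in.
print(torch.linspace(0, 10, steps=4, dtype=torch.int64))
# tensor([ 0,  3,  6, 10])
```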

TODO:
- ~support negative bases~ (not in this PR)
- ~support complex. Since arange does not support complex, but linspace does, one solution is to just call linspace separately on the real and imag components and sum the results in the end~ (not in this PR)
- ~default dtypes need to be explicitly handled since computation is done in a different dtype than result~ (done)
Pull Request resolved: pytorch#81826
Approved by: https://github.com/ngimel
### Description
Adds a compiler function to dump the forward, backward, and joint graphs. The partitioner is the default partition.
The input metadata for each dumped graph is also dumped as a pickle file.

Example usage:

```
    save_fx_func = graph_dumper_aot(current_name, folder_name, dump_example_input = False)
    optimize_ctx = torchdynamo.optimize(
        save_fx_func
    )
    with torch.enable_grad():
        with optimize_ctx:
            result = forward_and_backward_pass(model, example_inputs)
```

Pull Request resolved: pytorch#82184
Approved by: https://github.com/Chillee
…1522)

Moves the aten.native_batch_norm_backward decomposition from https://github.com/pytorch/functorch/blob/main/functorch/_src/decompositions.py#L148.

Changed it to not recompute mean and invstd, and added a type cast.

In functorch, changed `@register_decomposition_for(aten.native_batch_norm_backward)` to `@register_decomposition_for_jvp(aten.native_batch_norm_backward)`

Passing `pytest test/test_decomp.py -k norm`

Note that when the output mask is False for grad_weight and grad_bias, we should return None to be consistent with the non-decomposed operator's behavior. But None doesn't work with vjp, so the functorch version of the decomposition used zeros. See https://github.com/pytorch/pytorch/blob/b33c1f7dd4a4d30ebc912f555e56d105ae66aa84/functorch/functorch/_src/decompositions.py#L210.
Pull Request resolved: pytorch#81522
Approved by: https://github.com/Chillee
As the migration from Jenkins to GHA is complete.
Pull Request resolved: pytorch#82280
Approved by: https://github.com/huydhn
Benchmarking of the regular conv2d op.

Differential Revision: [D38118137](https://our.internmc.facebook.com/intern/diff/D38118137/)
Pull Request resolved: pytorch#82125
Approved by: https://github.com/SS-JIA
Based on pytorch#80511 with extra changes:
- Update pybind to the latest release, as it contains some needed fixes
- Extend the compat header to reduce changes in code
Pull Request resolved: pytorch#81242
Approved by: https://github.com/malfet, https://github.com/mattip
…ion of new parameter (pytorch#82273)

### Description
PR pytorch#80336 introduced a new parameter to the Sparse Adam optimizer. The new parameter is accessed inside the `step` method of the optimizer. If we deserialize and run a version of the optimizer serialized before this change was introduced, it fails at the step that tries to access the missing parameter.

I have added a workaround to set a default value in case the parameter is unavailable in the optimizer (a sketch of the pattern follows).
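A minimal sketch of the workaround pattern; the param-group key name `maximize` is an assumption for illustration, not necessarily the actual parameter:

```
import torch

def ensure_defaults(optimizer: torch.optim.Optimizer) -> None:
    # Older serialized optimizers may predate a newly added
    # hyperparameter, so set a default before `step` reads it.
    # The key name "maximize" is assumed for illustration.
    for group in optimizer.param_groups:
        group.setdefault("maximize", False)
```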


### Testing
* Testing on PyTorch CI
* Manual validation against existing serialized models to make sure they continue to work
Pull Request resolved: pytorch#82273
Approved by: https://github.com/mehtanirav, https://github.com/albanD
…elper (pytorch#81828)

Introduces the _DistWrapper class, which wraps a process group and provides functional variants of collectives. It works without c10d enabled and is exception robust.

Introduces tensor_narrow_n, which handles narrowing over multiple dimensions. (A hedged sketch of the wrapper idea follows.)
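A hedged sketch of the wrapper idea described above; the class name and behavior here are assumptions for illustration, not the actual implementation:

```
import torch.distributed as dist

class DistWrapperSketch:
    # Sketch: wraps an optional process group and exposes functional
    # collectives that also work when c10d is not initialized.
    def __init__(self, group=None):
        self.group = group
        self.use_dist = dist.is_available() and dist.is_initialized()

    def all_gather_object(self, obj):
        # Functional variant: returns the gathered list rather than
        # filling a caller-provided output list.
        if not self.use_dist:
            return [obj]  # single-process fallback
        out = [None] * dist.get_world_size(self.group)
        dist.all_gather_object(out, obj, group=self.group)
        return out
```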


Pull Request resolved: pytorch#81828
Approved by: https://github.com/wanchaol
It looks like the DEBUG macro is never actually set anywhere; see
pytorch#82276

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: pytorch#82277
Approved by: https://github.com/malfet
)

### Description
Improve the incremental build process on ROCm by eliminating unnecessary file changes.

### Issue
N/A

### Testing
1. Run `python tools/amd_build/build_amd.py --out-of-place-only` multiple times, and ensure the file `third_party/gloo/cmake/Modules/Findrccl.cmake` does not contain patterns like `RCCL_LIBRARY_PATH_PATH`
2. Run `python tools/amd_build/build_amd.py; USE_ROCM=1 python3 setup.py develop` twice, and confirm the second run does not trigger recompilation of thousands of files.

Pull Request resolved: pytorch#82190
Approved by: https://github.com/jithunnair-amd, https://github.com/ezyang
The next PR up in the stack requires this for lintrunner to be happy.
There are no logical changes; the file was autoformatted via the
following:
```
mv functorch/codegen/gen_vmap_plumbing.py torchgen/gen_vmap_plumbing.py
lintrunner torchgen/gen_vmap_plumbing.py -a
mv torchgen/gen_vmap_plumbing.py functorch/codegen/gen_vmap_plumbing.py
```

Test Plan:
- build functorch

Differential Revision: [D38171956](https://our.internmc.facebook.com/intern/diff/D38171956)
Pull Request resolved: pytorch#82246
Approved by: https://github.com/kit1980
zengk95 and others added 22 commits August 2, 2022 17:44
### Description
We forgot that `<!--` starts a comment in markdown. Also added a link to the wiki in the "start land checks" message so users can see why their PR is taking extra time to land.

### Issue
n/a

### Testing
n/a
Pull Request resolved: pytorch#82649
Approved by: https://github.com/janeyx99, https://github.com/ZainRizvi
fixes pytorch#81457
fixes pytorch#81216
fixes pytorch#81212
fixes pytorch#81207
fixes pytorch#81206
fixes pytorch#81218
fixes pytorch#81203
fixes pytorch#81202
fixes pytorch#81214
fixes pytorch#81220
fixes pytorch#81205
fixes pytorch#81200
fixes pytorch#81204
fixes pytorch#81221
fixes pytorch#81209
fixes pytorch#81210
fixes pytorch#81215
fixes pytorch#81217
fixes pytorch#81222
fixes pytorch#81211
fixes pytorch#81201
fixes pytorch#81208

As part of this PR I'm also re-enabling all of the functionalization tests that got marked as flaky in CI (they're not actually flaky - I think they got marked because a PR that should have changed their expect-test output made it to master without the changes. I'll let CI run on this PR to confirm though).

reland of pytorch#80897
Pull Request resolved: pytorch#82407
Approved by: https://github.com/ezyang
Adds the dispatch boilerplate for the MPS backend.
Pull Request resolved: pytorch#82612
Approved by: https://github.com/malfet
…e case (pytorch#82441)

- Refactor SchemaInfo to be able to handle cases where other variables besides running_mean and running_var mutate due to training = true
- Add special case rrelu_with_noise to fix pytorch#82434
- Tested by running SchemaInfo tests
Pull Request resolved: pytorch#82441
Approved by: https://github.com/davidberard98
This reverts commit 714669e.

Reverted pytorch#82626 on behalf of https://github.com/zengk95 due to This looks like it's breaking trunk
…uts (pytorch#82176)"

This reverts commit 1dfcad8.

Reverted pytorch#82176 on behalf of https://github.com/zengk95 due to This looks like it's breaking functorch tests on master
Update production ops (7/28). This is only for calculating mobile op test coverage.

Meta employees can update it using
```
python test/mobile/model_test/update_production_ops.py ~/fbsource/xplat/pytorch_models/build/all_mobile_model_configs.yaml
```

Pull Request resolved: pytorch#82444
Approved by: https://github.com/kit1980
Re-lands pytorch#81558, which got reverted due to failing tests.

The failure happened because of a test that I poorly designed. [The loop here](https://github.com/pytorch/pytorch/pull/81558/files#diff-893b1eea27352f336f4cd832919e48d721e4e90186e63400b8596db6b82e7450R3837) runs `cache_enabled=False` and then `cache_enabled=True`. With this loop, the graph from the previous iteration (case `False`) conflicts with the next one (case `True`). I redesigned the test so that it does not loop; the new test makes separate function calls with different argument values (see the sketch below).
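A minimal sketch of the redesigned test shape, using CPU autocast for runnability (illustrative, not the actual test):

```
import torch

# Separate calls with different argument values instead of a loop, so
# state captured under one cache_enabled setting cannot leak into the
# other.
def run(cache_enabled):
    with torch.autocast("cpu", dtype=torch.bfloat16, cache_enabled=cache_enabled):
        a = torch.randn(8, 8)
        b = torch.randn(8, 8)
        return a @ b

out_no_cache = run(cache_enabled=False)
out_cache = run(cache_enabled=True)
```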
Pull Request resolved: pytorch#81896
Approved by: https://github.com/ngimel
This moves first-class dimensions, as prototyped in https://github.com/facebookresearch/torchdim,
into the functorch build. This makes them more easily available for use in PrimTorch.
Pull Request resolved: pytorch#82454
Approved by: https://github.com/ezyang, https://github.com/zou3519
Differential Revision: D38368525

Pull Request resolved: pytorch#82676
Approved by: https://github.com/ngimel
Currently, if we run softmax_backward/logsoftmax_backward along a dim that is not the last, the calculation falls back to a [scalar version](https://github.com/pytorch/pytorch/blob/32593ef2dd26e32ed44d3c03d3f5de4a42eb149a/aten/src/ATen/native/SoftMax.cpp#L220-L287). We found that we actually have the chance to vectorize the calculation along the inner_size dim.

Changes we made:

Use the vectorized softmax_backward_kernel/log_softmax_backward_kernel instead of host_softmax_backward when not operating along the last dim (a small example exercising this path follows).
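A small example that exercises the non-last-dim path (assumption: any dim other than the last hits the newly vectorized kernel):

```
import torch

# dim=1 on a 3-D input: softmax backward runs along a non-last dim
# with inner_size > 1, the case this change vectorizes.
x = torch.randn(32, 128, 64, requires_grad=True)
y = torch.nn.functional.log_softmax(x, dim=1)
y.sum().backward()
```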

We collected benchmark data for softmax_backward and logsoftmax_backward for the BFloat16 and Float32 data types using PyTorch's operator_benchmark tool on an Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz.
Number of cores: 24 cores (1 socket)
[softmax_benchmark_32593ef.log](https://github.com/pytorch/pytorch/files/8962956/softmax_benchmark_32593ef.log)
[softmax_benchmark_the_pr.log](https://github.com/pytorch/pytorch/files/8962958/softmax_benchmark_the_pr.log)

Pull Request resolved: pytorch#80114
Approved by: https://github.com/frank-wei
Summary: No functional changes; just testing to make sure this is working.

Test Plan: python test/test_ao_sparsity.py TestFxComposability

Pull Request resolved: pytorch#82204
Approved by: https://github.com/supriyar
…h#81802)

Summary: Needed to refactor this PR to add tests for some new layers without copy-pasting the entirety of the code. It's basically just a helper that does exactly what the other tests did, since they were essentially copies of one another. It's possible to do something similar with the quantized kernels test, but it's different enough that it seemed more effort than it was worth. Also a bugfix: I believe line 150 was originally wrong, since model.weight was never used, though the only effect was that the specific weight wasn't used.

Test Plan: python test/test_ao_sparsity.py TestQuantizedSparseLayers

Pull Request resolved: pytorch#81802
Approved by: https://github.com/supriyar
`torch.cuda.is_bf16_supported()` returns False on ROCm, which is not correct, since BF16 is supported on all AMD GPU archs: gfx906, gfx908, and gfx90a. (A quick check appears below.)
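A quick illustrative check:

```
import torch

# After this change, ROCm builds on BF16-capable AMD GPUs should
# report True here instead of False.
if torch.cuda.is_available():
    print(torch.cuda.is_bf16_supported())
```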

cc @jithunnair-amd
Pull Request resolved: pytorch#80410
Approved by: https://github.com/jeffdaily, https://github.com/malfet
…2688)

Need to use `ASSERT_FLOAT_EQ` for floats.

Right now the test often fails internally like this:

```
xplat/caffe2/aten/src/ATen/native/quantized/cpu/qnnpack/test/fully-connected-operator-tester.h:362
Expected equality of these values:
  output_dynamic[i * outputChannels() + c]
    Which is: -601.09
  ((float)accumulators[i * outputChannels() + c] * requantization_scales[c]) + float(bias[c])
    Which is: -601.09
at 0, 18: reference = -601.0899658203125, optimized = -601.09002685546875
```

```
xplat/caffe2/aten/src/ATen/native/quantized/cpu/qnnpack/test/fully-connected-operator-tester.h:362
Expected equality of these values:
  output_dynamic[i * outputChannels() + c]
    Which is: -65.6251
  ((float)accumulators[i * outputChannels() + c] * requantization_scales[c]) + float(bias[c])
    Which is: -65.6251
at 0, 7: reference = -65.625106811523438, optimized = -65.625099182128906
```
Pull Request resolved: pytorch#82688
Approved by: https://github.com/mehtanirav
@jjsjann123 jjsjann123 marked this pull request as ready for review August 3, 2022 22:30
@jjsjann123 jjsjann123 changed the base branch from master to devel August 3, 2022 22:30
@csarofeen csarofeen (Owner) left a comment:

LGTM

@csarofeen csarofeen merged commit 7cfb779 into devel Aug 4, 2022
jjsjann123 added a commit that referenced this pull request Aug 29, 2022
Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Code changes include:

- codegen improvements:
  1. removes unnecessary syncs using redundant thread compute analysis
  2. symmetric API for BestEffortReplay
  3. support merge on trivial reductions
  4. Ampere async copy improvements
- bug fixes:
  1. vectorization bug fixes
  2. type inference patch: fixes upstream pytorch#81725
  3. segmenter bug fix with deterministic iteration ordering
- parser update
  1. added leaky_relu
- scheduler
  1. normalization scheduler cleanup
  2. simplifies matmul scheduling with new transform propagator
  3. merge all dimensions in PW scheduler
  4. various gemm related improvements
- debuggability
  1. nsight compute support
  2. debug dump for InlinePropagator
  3. Add `UnaryOpType::Print`

Squashed the commits to work around the GitHub API.
Commits actually in this PR from the devel branch:

```
dfe02f3 Merge remote-tracking branch 'csarofeen/devel' into HEAD
1617373 Add `TensorViewBuilder::shape(std::vector<Val*> shape)` (#1884)
7cfb779 Merge pull request #1887 from csarofeen/upstream_merge_0803
3399f6d Merge remote-tracking branch 'origin/viable/strict' into HEAD
01208f5 Add `UnaryOpType::Print` which can be helpful for debugging (#1878)
0646522 Remove redundant TORCH_INTERNAL_ASSERT in lower_magic_zero.cpp (#1881)
7bc76aa Fix most inlined propagator for mismatched dims (#1875)
501f4aa Nonaffine swizzle formulation ep.2: Loop swizzle variant. (#1826)
d863d69 Ampere async copy ep.2: circular buffering extension to support pipelined matmul operand load (#1827)
e0ae11a Larger sized mma instructions to support full vectorization (#1824)
9bb4cf7 fragment iteration to support fully unrolled mma ops (#1823)
a48270a Merge all dims in pointwise scheduler (#1872)
172fb36 Make MostInlined and BestEffort inline propagation no longer assert replayed (#1868)
a64462a Allow trivial reduction to be merged (#1871)
440102b Symmetric API for BestEffortReplay (#1870)
d1caf33 Some misc cleanups/refactor split out from #1854 (#1867)
1013eda Remove some welford specific logic. (#1864)
51589d3 Some cleanups on tests and heuristics params (#1866)
a6b3e70 Segmenter bug fix, and deterministic iteration ordering.  (#1865)
1b665b9 Add nullptr checks to IrBuilder (#1861)
1cd9451 Simplify matmul scheduling with the new transform propagator.  (#1817)
bbc1fb9 Add leaky_relu operation (#1852)
e842a9b Minor cleanup in pointwise scheduler (#1858)
9ee850c Fix stringstream usage (#1857)
20a36c1 Improve nsight compute support (#1855)
4059103 Remove debugging `true ||` from getPointwiseHeuristics (#1822)
01117bf Misc cleanup (#1853)
5cc6494 Apply the magic-zero protection to each indexed domain individually for predicate indexing (#1846)
92e6f02 Cleanup normalization scheduler (#1845)
db89c65 Type inference patch (#1848)
102fe93 Add debug dump for InlinePropagator (#1847)
b7a4d93 Redundant thread compute analysis to avoid un-necessary sync insertion (#1687)
942be5b Upstream ci build fixes (#1842)
0b83645 Fix vectorization bug introduced in #1831 (#1840)
63630f1 Move MaxProducerPosUpdater into InlinePropagator::tearDown (#1825)
9135a96 Fix transpose benchmark dtype (#1839)
2c9a6c0 Add extra configurability to `parallelizeAllLike` (#1831)
```

RUN_TORCHBENCH: nvfuser

Differential Revision: [D38543000](https://our.internmc.facebook.com/intern/diff/D38543000)
Pull Request resolved: pytorch#83067
Approved by: https://github.com/davidberard98