
Develop upstream sync 231030 #2287

Merged
merged 247 commits into develop-upstream from develop-upstream-sync-231030 on Nov 7, 2023

Conversation

rahulbatra85

No description provided.

bartchr808 and others added 30 commits October 23, 2023 04:44
In the compact printing mode, any custom attributes set by a user of StableHLO do not appear. This change makes sure they are printed when present.

PiperOrigin-RevId: 575778539
Imported from GitHub PR openxla/xla#6475

`bitcast_dtypes_expander_test` failed when executed in Docker container
```
//xla/service:bitcast_dtypes_expander_test                               FAILED in 3 out of 3 in 0.7s
```

Reason: swapped variable names in the `S64toS32` test
```
%shift-right-logical.11 -> %shift-right-logical.12
%constant.12 -> %constant.11
```
To resolve the issue, we can use pattern-based variable names instead.
```
%[[VAL_10:.*]]
%[[VAL_11:.*]]
```
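
The fix can be illustrated outside of FileCheck with a small Python sketch (the HLO snippets below are hypothetical, shortened stand-ins for the real test output): a check that pins an exact SSA suffix breaks when the numbering swaps between runs, while a pattern that captures whatever suffix appears matches both.

```python
import re

# Hypothetical, shortened stand-ins for two runs of the same HLO dump,
# where the numeric suffixes of the SSA names swapped between runs.
run_a = "%shift-right-logical.11 = s32[] ...\n%constant.12 = s32[] ..."
run_b = "%shift-right-logical.12 = s32[] ...\n%constant.11 = s32[] ..."

# Pinning the exact suffix (like the old CHECK lines) matches only one run.
exact = re.compile(r"%shift-right-logical\.11 ")
assert exact.search(run_a) is not None
assert exact.search(run_b) is None

# A pattern-based name, analogous to FileCheck's %[[VAL_10:.*]], captures
# whatever suffix appears and therefore matches both runs.
flexible = re.compile(r"%shift-right-logical\.(\d+) ")
assert flexible.search(run_a) is not None
assert flexible.search(run_b) is not None
```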

Copybara import of the project:

--
b99ed813b39c6b53f1aa0981dbaae060d8cb5aa8 by Alexander Pivovarov <pivovaa@amazon.com>:

Fix flaky test bitcast_dtypes_expander_test

Merging this change closes tensorflow#6475

PiperOrigin-RevId: 575793343
Updates LLVM usage to match
[e558be51bab0](llvm/llvm-project@e558be51bab0)

PiperOrigin-RevId: 575794929
… cuda 12.3 dependencies.

Otherwise, when pulled, the 12.3 packages would change the /usr/local/cuda alias to point to 12.3, causing failures.

PiperOrigin-RevId: 575799576
Log the op attributes on `--vmodule=xla_call_module_op=3`.

PiperOrigin-RevId: 575812286
PiperOrigin-RevId: 575813013
Fixed IrEmitterTriton "swallowing" NaNs in minimum and maximum.

Also fixed ElementalIrEmitter "swallowing" NaNs in the case of minimum(x, NaN) on GPU. That was likely caused by an LLVM error.
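
The "swallowing" can be reproduced in a few lines of plain Python (a minimal sketch of the semantics, not XLA's actual lowering): a compare-and-select minimum silently drops a NaN depending on operand order, while a NaN-propagating minimum returns NaN whenever either operand is NaN.

```python
import math

def select_min(x, y):
    # Naive compare-and-select lowering of minimum. Any comparison with
    # NaN is false, so the result depends on operand order and a NaN can
    # be silently dropped.
    return x if x < y else y

def nan_propagating_min(x, y):
    # NaN-propagating minimum: NaN wins regardless of operand order.
    if math.isnan(x) or math.isnan(y):
        return math.nan
    return min(x, y)

nan = math.nan
assert math.isnan(select_min(1.0, nan))     # NaN happens to survive here...
assert select_min(nan, 1.0) == 1.0          # ...but is swallowed here.
assert math.isnan(nan_propagating_min(nan, 1.0))
assert math.isnan(nan_propagating_min(1.0, nan))
```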

PiperOrigin-RevId: 575817645
PiperOrigin-RevId: 575850088
…mand buffers

Before this CL, the set of libraries supported by GPU graphs was determined by an integer level. This CL gives the user fine-grained control over whether a library is enabled in graphs.

Example usage: XLA_FLAGS=--xla_gpu_command_buffer_command_types=FUSION,CUBLAS
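
Parsing such a flag might look like the following sketch (the `CommandType` enum, its members, and the parser are illustrative, not XLA's actual definitions):

```python
from enum import Enum, auto

class CommandType(Enum):
    # Hypothetical subset of command types selectable by the flag.
    FUSION = auto()
    CUBLAS = auto()
    CUDNN = auto()

def parse_command_types(flag_value):
    """Parse a comma-separated flag value like "FUSION,CUBLAS" into a set."""
    return {CommandType[name.strip()]
            for name in flag_value.split(",") if name.strip()}

types = parse_command_types("FUSION,CUBLAS")
assert types == {CommandType.FUSION, CommandType.CUBLAS}
```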

PiperOrigin-RevId: 575856977
Consolidate implementations of requantize and use the same one as TF quantizer.

PiperOrigin-RevId: 575864178
PiperOrigin-RevId: 575866461
…dependencies, as part of a thunk clean up, and updated the necessary directories pointing to this thunk. tensorflow#5758

PiperOrigin-RevId: 575867727
PiperOrigin-RevId: 575881215
…en instruction, and reports the total # of such nodes.

PiperOrigin-RevId: 575887322
This guarantees that the iterators created inside the parallel interleave iterator are destroyed before the parallel interleave iterator itself.
Otherwise, `CancelThreads(true)` might finish before `element`'s destructor is called, because of `outstanding_threads_finished_cond_var_.wait(l);`

PiperOrigin-RevId: 575888988
…lag.

The flag could have a different value in different processes, which would invalidate the inferred shape. The dependency also inflates the size of the op library, by adding a dependency on a kernel library (and transitively on MLIR), and pulls in unrelated ops which complicates wrapper generation.

PiperOrigin-RevId: 575900618
PiperOrigin-RevId: 575904949
In preparation to create `xla_protos_all` to simplify further

PiperOrigin-RevId: 575911998
…ow/core/tpu/ops:sparse_core_ops.

These dependencies are unnecessary, and they can introduce duplicate op wrappers when we generate Python bindings.

PiperOrigin-RevId: 575921758
Trying to get these in sync with the current jobs, which are failing,
probably because of the wrong .so being used.

PiperOrigin-RevId: 575978433
tdanyluk and others added 14 commits October 30, 2023 11:43
…ach fusion

This is needed for xla_gpu_deterministic_ops to function reliably - as tested by determinism_test.cc.

PiperOrigin-RevId: 577906200
PiperOrigin-RevId: 577906242
… SetPjRtCBufferToTensor).

PiperOrigin-RevId: 577915709
PiperOrigin-RevId: 577915784
CreateTensorProto() that allows data to be loaded from
non-std::vector sources, which is often more efficient.

std::vector<bool> makes the implementation of internal::CreateTensorProto
a bit messier than it would otherwise be, because it is not
representable as an absl::Span.

PiperOrigin-RevId: 577926897
In StreamExecutor we rely on a thread-local hack to pass the workspace to cuBLAS; long term, we need to make the workspace explicit in all BLAS APIs.

Without user-managed workspace cuBLAS adds memory allocation nodes when captured into CUDA graphs, and that triggers graph update bugs in CUDA.

PiperOrigin-RevId: 577928686
… the dot

1. Reduce block sizes if they are bigger than NextPowerOf2(tensor size).
2. Reduce split_k to NextPowerOf2(k)/block_k.
3. Remove duplicates created by the previous steps.

This CL is not intended to cause any performance regression.
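
The three steps above can be sketched as follows (the `(block_m, split_k)` config tuples and function names are hypothetical simplifications of the real autotuner configs):

```python
def next_power_of_2(n):
    # Smallest power of two >= n, for n >= 1.
    return 1 << (n - 1).bit_length()

def prune_configs(configs, m, k, block_k):
    # configs: list of (block_m, split_k) tuples; a hypothetical,
    # simplified stand-in for the real autotuner config struct.
    clamped = []
    for block_m, split_k in configs:
        # Step 1: reduce block sizes bigger than NextPowerOf2(tensor size).
        block_m = min(block_m, next_power_of_2(m))
        # Step 2: reduce split_k to NextPowerOf2(k) / block_k.
        split_k = min(split_k, next_power_of_2(k) // block_k)
        clamped.append((block_m, split_k))
    # Step 3: remove duplicates created by the previous steps, keeping order.
    return list(dict.fromkeys(clamped))

result = prune_configs([(128, 16), (64, 16), (64, 4)], m=64, k=256, block_k=32)
assert result == [(64, 8), (64, 4)]
```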

PiperOrigin-RevId: 577941843
@rahulbatra85 rahulbatra85 force-pushed the develop-upstream-sync-231030 branch from 669ff45 to 34f29b8 on November 2, 2023 18:03
```
@@ -496,7 +497,7 @@ def f(x):
}
}
"""

'''
```

@i-chaochen i-chaochen Nov 3, 2023


Could you use test.is_built_with_rocm() instead of just commenting the code out? Also, in your tracking ticket https://github.com/ROCmSoftwarePlatform/frameworks-internal/issues/6264, could you add the newly added upstream commit for tracking? Thanks
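
The suggested guard could look like the sketch below, with a stand-in for TensorFlow's `test.is_built_with_rocm()` (the test case itself is hypothetical):

```python
import unittest

def is_built_with_rocm():
    # Stand-in for TensorFlow's test.is_built_with_rocm(); in a real TF
    # build this reflects whether the binary was compiled for ROCm.
    return False

class ExampleTest(unittest.TestCase):
    def test_gpu_feature(self):
        # Skip on ROCm builds instead of commenting the test body out, so
        # the test keeps running on other platforms.
        if is_built_with_rocm():
            self.skipTest("Feature not yet supported on ROCm builds.")
        self.assertEqual(2 + 2, 4)

result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest))
assert result.wasSuccessful()
```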

Rahul Batra added 3 commits November 3, 2023 16:03
	//tensorflow/compiler/tests:xla_call_module_no_platform_check_test_gpu
	//tensorflow/compiler/tests:xla_call_module_no_shape_assertions_check_test_gpu
	//tensorflow/compiler/tests:xla_call_module_test_gpu
	//tensorflow/python/eager:context_test_gpu
	//tensorflow/python/ops/memory_tests:custom_gradient_memory_test_gpu
@rahulbatra85 rahulbatra85 force-pushed the develop-upstream-sync-231030 branch from 34f29b8 to 0c62bc3 on November 3, 2023 16:06
@rahulbatra85
Author

retest Ubuntu-CPU please!

1 similar comment
@rahulbatra85
Author

rahulbatra85 commented Nov 3, 2023

retest Ubuntu-CPU please!

@i-chaochen

retest Ubuntu-CPU please

@rahulbatra85 rahulbatra85 merged commit 5351bd5 into develop-upstream Nov 7, 2023
2 checks passed