Develop upstream sync 240123 #2375

Merged
merged 1,177 commits into develop-upstream on Jan 25, 2024

Conversation

draganmladjenovic

No description provided.

ezhulenev and others added 30 commits January 16, 2024 19:02
…port.cc to tensorflow/compiler/mlir/lite/utils/const_tensor_utils.h for shared usage

PiperOrigin-RevId: 599023637
…) the number of iterations. For such loops, use that value instead of our static estimate, for more accurate cost modeling.

PiperOrigin-RevId: 599040661
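A minimal sketch of the idea, under the assumption (not stated in the commit) that the cost model receives an optional exactly-known trip count; `EstimatedTripCount` and `kDefaultTripCount` are hypothetical names:

```
#include <cstdint>
#include <optional>

// Hypothetical static fallback used when the trip count is unknown.
constexpr int64_t kDefaultTripCount = 100;

// Prefer an exactly known iteration count over the static estimate.
int64_t EstimatedTripCount(std::optional<int64_t> known_trip_count) {
  return known_trip_count.value_or(kDefaultTripCount);
}
```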
I had to duplicate error handling macros in a few files; this is a temporary hack and will be removed together with the direct uses of the NCCL and CUDA APIs.

PiperOrigin-RevId: 599057949
… is enabled.

Currently the proto is dumped every time the dumping is enabled for the module.

PiperOrigin-RevId: 599058121
PiperOrigin-RevId: 599086860
…r tests

Imported from GitHub PR openxla/xla#8033

Merging this change closes tensorflow#8033

PiperOrigin-RevId: 599110626
We had several callers of FindNonTrivialHero that called it with a fusion
instruction. This was usually an indicator that the call was not actually needed
at all and the instruction itself should be used as the hero.
Also adjust ChooseFusionKind to also check whether the producer is a kInput
fusion.

PiperOrigin-RevId: 599125417
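A hedged sketch of the resulting rule; `Instr`, `FindHeroImpl`, and `HeroFor` are illustrative stand-ins, not XLA's real `HloInstruction` API:

```
// Illustrative stand-in for an HLO instruction.
struct Instr {
  bool is_fusion = false;
};

// Stand-in for the non-trivial hero search on a non-fusion instruction.
const Instr& FindHeroImpl(const Instr& instr) { return instr; }

// The rule from the commit message: a fusion instruction is its own hero,
// so callers never pass a fusion into the hero search.
const Instr& HeroFor(const Instr& instr) {
  return instr.is_fusion ? instr : FindHeroImpl(instr);
}
```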
… Type

Imported from GitHub PR openxla/xla#8402

This PR adds BF16 support to the oneDNN MatMul op by allowing the Dot op to keep the BF16 type until it is handled by the OneDnnMatMulRewriter pass.
Copybara import of the project:

--
4f7ddbcd5ecf7a4b3cfd140abd9a73d193e9ca39 by Mahmoud Abuzaina <mahmoud.abuzaina@intel.com>:

Enable MatMul op in BF16

Merging this change closes tensorflow#8402

PiperOrigin-RevId: 599132673
With LLVM 17+, LLD fails by default on undefined symbols in the linker version script.

This breaks XLA builds with errors like this:

```
ld.lld: error: version script assignment of 'global' to symbol 'initxla_extension' failed: symbol not defined
ld.lld: error: version script assignment of 'global' to symbol 'init_xla_extension' failed: symbol not defined
```

The problem is that the linker version script lists symbols that only existed in Python-2 builds. Since Python-2
is no longer supported by pybind11, we can simply remove those entries from the linker script.

(This is the change which removed Python-2 support: pybind/pybind11@6493f49)
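A hypothetical sketch of the kind of version-script entries involved (the actual script and symbol list may differ; `PyInit_xla_extension` is an assumption based on the Python-3 module init naming convention):

```
{
  global:
    PyInit_xla_extension;    /* Python-3 init symbol, still defined */
    /* initxla_extension;       Python-2 era entry, removed */
    /* init_xla_extension;      Python-2 era entry, removed */
  local:
    *;
};
```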

PiperOrigin-RevId: 599133946
…CompileToHsaco

Imported from GitHub PR openxla/xla#8506

This fixes sporadic crashes in multithreaded_compilation_test_gpu.
Copybara import of the project:

--
a6fc9ada24d551f35e5f01bafb2cadbcf848f41b by Dragan Mladjenovic <Dragan.Mladjenovic@amd.com>:

[ROCm] Move device libs path initialization into CompileToHsaco

This fixes sporadic crashes in multithreaded_compilation_test_gpu.

Merging this change closes tensorflow#8506

PiperOrigin-RevId: 599134851
…ure.

Also improve test coverage slightly.

PiperOrigin-RevId: 599144190
PiperOrigin-RevId: 599144440
…leOp`.

After calibration, statistics are attached to the resulting `ModuleOp`. This component takes a `ModuleOp` as both its input and its output. Because the calibration process relies on python-level TF Session APIs (and the TF runtime), the module is internally exported to a SavedModel before calibration and imported back into a `ModuleOp` afterwards.

Tests are omitted because they would require exposing the component to the python layer: the component depends on `PyFunctionLibrary`, which should only be injected from the python layer. Using test doubles (mocks) for `PyFunctionLibrary` was considered but discarded, because doing so would require implementing `SaveExportedModel` properly in C++, which is overkill for simply testing `CalibrationComponent`.

PiperOrigin-RevId: 599146078
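A hedged sketch of the round trip described above; all types and helpers are illustrative stand-ins, not the actual quantization APIs:

```
#include <string>

struct ModuleOp {};                       // stand-in for mlir::ModuleOp
struct SavedModel { std::string path; };  // stand-in for an exported model

SavedModel ExportToSavedModel(const ModuleOp&) { return {"/tmp/calibration"}; }
void RunCalibration(const SavedModel&) {}  // python-level TF Session in reality
ModuleOp ImportFromSavedModel(const SavedModel&) { return {}; }

// ModuleOp in, ModuleOp out; the SavedModel round trip stays internal.
ModuleOp RunCalibrationComponent(const ModuleOp& module) {
  SavedModel exported = ExportToSavedModel(module);
  RunCalibration(exported);                // attaches calibration statistics
  return ImportFromSavedModel(exported);
}
```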
This is in preparation for adding support for libnvptxcompiler.

PiperOrigin-RevId: 599151463
Updates LLVM usage to match
[f3d534c4251b](llvm/llvm-project@f3d534c4251b)

PiperOrigin-RevId: 599164620
Added generic caution note
The `no_cuda_asan` tag is not considered in our current build config,
but `noasan` is.

PiperOrigin-RevId: 599169543
An upstream Triton issue triggers UBSAN on those tests. Let's disable sanitizers
on them until that's fixed.

PiperOrigin-RevId: 599170685
tensorflower-gardener and others added 24 commits January 22, 2024 15:24
PiperOrigin-RevId: 600581010
…d move to Thunk

CollectiveExecuteParams is a companion of Thunk::ExecuteParams and has to be defined close to it.

Also convert class to struct for consistency with ExecuteParams.

PiperOrigin-RevId: 600581907
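A rough sketch of the layout this describes; the fields are placeholders, not the real Thunk API:

```
#include <cstdint>

struct Thunk {
  struct ExecuteParams {
    int device_ordinal = 0;                // placeholder field
  };
  // Companion parameters defined next to ExecuteParams, as a plain struct.
  struct CollectiveExecuteParams {
    int64_t run_id = 0;                    // placeholder field
  };
};
```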
PiperOrigin-RevId: 600584468
…efore initialization and execution

PiperOrigin-RevId: 600586067
PiperOrigin-RevId: 600592917
Use `PrefetchedSplitProvider` to prefetch the splits and write them to
temporary files in parallel. When the dispatcher receives
GetSnapshotSplit requests, it simply moves the temporary files to the
split directories. This reduces the time spent holding the lock and
speeds up GetSnapshotSplit requests.

PiperOrigin-RevId: 600602766
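An illustrative sketch of the prefetch-then-rename pattern (not the tf.data service implementation; names are hypothetical):

```
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical background step: compute a split and stage it in a temp file.
void PrefetchSplit(const fs::path& tmp_dir, int64_t index,
                   const std::string& split_bytes) {
  std::ofstream(tmp_dir / ("split_" + std::to_string(index))) << split_bytes;
}

// Hypothetical request path: under the dispatcher lock, only a rename runs,
// which is much cheaper than computing and writing the split there.
void ServeGetSnapshotSplit(const fs::path& tmp_dir, const fs::path& split_dir,
                           int64_t index) {
  const std::string name = "split_" + std::to_string(index);
  fs::rename(tmp_dir / name, split_dir / name);  // a move, not a copy
}
```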
The change was made public in the TensorFlow 2.14 and 2.15 release notes: https://github.com/tensorflow/tensorflow/releases

PiperOrigin-RevId: 600623585
This is one of many CLs to transition to the new PJRT ID APIs.

- Add device lookup APIs with strongly typed IDs, and delegate the old ones to the new ones (sketched below).
- Delegate local_hardware_id() to the new typed-ID variant.

PiperOrigin-RevId: 600636575
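A hedged sketch of the delegation pattern in the bullets above; the types and signatures are illustrative, not PJRT's real declarations:

```
#include <cstdint>

// Illustrative strongly typed ID wrapper.
struct PjRtGlobalDeviceId {
  explicit PjRtGlobalDeviceId(int32_t v) : value(v) {}
  int32_t value;
};

struct Device {};

class Client {
 public:
  // New lookup keyed by the typed ID.
  Device* LookupDevice(PjRtGlobalDeviceId id) { return &devices_[id.value]; }
  // Old int-based lookup now just delegates to the typed variant.
  Device* LookupDevice(int32_t id) {
    return LookupDevice(PjRtGlobalDeviceId(id));
  }

 private:
  Device devices_[4];  // placeholder storage for the sketch
};
```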
PiperOrigin-RevId: 600668807
…IndexingMapSimplifier.

PiperOrigin-RevId: 600687133
PiperOrigin-RevId: 600703605
Imported from GitHub PR openxla/xla#8696

This PR fixes a couple of minor issues needed to build XLA against cuDNN v9.

cc @reedwm
Copybara import of the project:

--
fb0ae743eafea727423dd02736214fc6f31364ee by Kaixi Hou <kaixih@nvidia.com>:

Fix support to cudnn v9

Merging this change closes tensorflow#8696

PiperOrigin-RevId: 600734575
@jayfurmanek

retest cpu-pycpp please

@draganmladjenovic (Author)

Retest Ubuntu-GPU-single please.


@i-chaochen left a comment


Thanks for this fix! Could you upstream this as well?

@draganmladjenovic merged commit 6dec314 into develop-upstream on Jan 25, 2024
9 checks passed