Commit af41b88

Merge branch 'main' into fix-batch-size-optimization-tutorial

2 parents 9f6ba07 + 63f987d

File tree

10 files changed, +139 -397 lines changed

.jenkins/validate_tutorials_built.py

Lines changed: 0 additions & 1 deletion
@@ -37,7 +37,6 @@
     "prototype_source/nestedtensor",
     "recipes_source/recipes/saving_and_loading_models_for_inference",
     "recipes_source/recipes/saving_multiple_models_in_one_file",
-    "recipes_source/recipes/loading_data_recipe",
     "recipes_source/recipes/tensorboard_with_pytorch",
     "recipes_source/recipes/what_is_state_dict",
     "recipes_source/recipes/profiler_recipe",

beginner_source/dist_overview.rst

Lines changed: 59 additions & 186 deletions
Large diffs are not rendered by default.

en-wordlist.txt

Lines changed: 6 additions & 0 deletions
@@ -335,6 +335,7 @@ dataset’s
 deallocation
 decompositions
 decorrelated
+devicemesh
 deserialize
 deserialized
 desynchronization
@@ -346,6 +347,7 @@ distractor
 downsample
 downsamples
 dropdown
+dtensor
 duration
 elementwise
 embeddings
@@ -482,6 +484,7 @@ prespecified
 pretrained
 prewritten
 primals
+processgroup
 profiler
 profilers
 protobuf
@@ -503,6 +506,7 @@ relu
 reproducibility
 rescale
 rescaling
+reshard
 resnet
 restride
 rewinded
@@ -515,6 +519,8 @@ runtime
 runtime
 runtimes
 scalable
+sharded
+Sharding
 softmax
 sparsified
 sparsifier

intermediate_source/torch_compile_tutorial.py

Lines changed: 1 addition & 1 deletion
@@ -135,7 +135,7 @@ def init_model():
 ######################################################################
 # First, let's compare inference.
 #
-# Note that in the call to ``torch.compile``, we have have the additional
+# Note that in the call to ``torch.compile``, we have the additional
 # ``mode`` argument, which we will discuss below.

 model = init_model()
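The ``mode`` argument that the corrected comment refers to can be sketched as follows. This is a minimal illustration assuming PyTorch 2.x; the function ``fn`` is a stand-in for the tutorial's model, not code from this commit:

```python
import torch

def fn(x):
    return torch.sin(x) + torch.cos(x)

# Default mode balances compile time and runtime performance.
compiled_default = torch.compile(fn)

# "reduce-overhead" targets cases where kernel-launch overhead dominates;
# "max-autotune" searches longer for faster kernels at higher compile cost.
compiled_fast = torch.compile(fn, mode="reduce-overhead")

# Compilation is lazy: it only happens on the first call with real inputs,
# so constructing the compiled callables is cheap.
assert callable(compiled_default) and callable(compiled_fast)
```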
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+Loading data in PyTorch
+=======================
+
+The content is deprecated. See `Datasets & DataLoaders <https://pytorch.org/tutorials/beginner/basics/data_tutorial.html>`__ instead.
+
+.. raw:: html
+
+   <meta http-equiv="Refresh" content="1; url='https://pytorch.org/tutorials/beginner/basics/data_tutorial.html'" />

recipes_source/recipes/README.txt

Lines changed: 14 additions & 18 deletions
@@ -1,62 +1,58 @@
 PyTorch Recipes
 ---------------------------------------------
-1. loading_data_recipe.py
-   Loading Data in PyTorch
-   https://pytorch.org/tutorials/recipes/recipes/loading_data_recipe.html
-
-2. defining_a_neural_network.py
+1. defining_a_neural_network.py
    Defining a Neural Network in PyTorch
    https://pytorch.org/tutorials/recipes/recipes/defining_a_neural_network.html

-3. what_is_state_dict.py
+2. what_is_state_dict.py
    What is a state_dict in PyTorch
    https://pytorch.org/tutorials/recipes/recipes/what_is_state_dict.html

-4. saving_and_loading_models_for_inference.py
+3. saving_and_loading_models_for_inference.py
    Saving and loading models for inference in PyTorch
    https://pytorch.org/tutorials/recipes/recipes/saving_and_loading_models_for_inference.html

-5. custom_dataset_transforms_loader.py
+4. custom_dataset_transforms_loader.py
    Developing Custom PyTorch Dataloaders
    https://pytorch.org/tutorials/recipes/recipes/custom_dataset_transforms_loader.html


-6. Captum_Recipe.py
+5. Captum_Recipe.py
    Model Interpretability using Captum
    https://pytorch.org/tutorials/recipes/recipes/Captum_Recipe.html

-7. dynamic_quantization.py
+6. dynamic_quantization.py
    Dynamic Quantization
    https://pytorch.org/tutorials/recipes/recipes/dynamic_quantization.html

-8. save_load_across_devices.py
+7. save_load_across_devices.py
    Saving and loading models across devices in PyTorch
    https://pytorch.org/tutorials/recipes/recipes/save_load_across_devices.html

-9. saving_and_loading_a_general_checkpoint.py
+8. saving_and_loading_a_general_checkpoint.py
    Saving and loading a general checkpoint in PyTorch
    https://pytorch.org/tutorials/recipes/recipes/saving_and_loading_a_general_checkpoint.html

-10. saving_and_loading_models_for_inference.py
+9. saving_and_loading_models_for_inference.py
    Saving and loading models for inference in PyTorch
    https://pytorch.org/tutorials/recipes/recipes/saving_and_loading_models_for_inference.html

-11. saving_multiple_models_in_one_file.py
+10. saving_multiple_models_in_one_file.py
    Saving and loading multiple models in one file using PyTorch
    https://pytorch.org/tutorials/recipes/recipes/saving_multiple_models_in_one_file.html

-12. warmstarting_model_using_parameters_from_a_different_model.py
+11. warmstarting_model_using_parameters_from_a_different_model.py
    Warmstarting models using parameters from different model
    https://pytorch.org/tutorials/recipes/recipes/warmstarting_model_using_parameters_from_a_different_model.html

-13. zeroing_out_gradients.py
+12. zeroing_out_gradients.py
    Zeroing out gradients
    https://pytorch.org/tutorials/recipes/recipes/zeroing_out_gradients.html

-14. mobile_perf.py
+13. mobile_perf.py
    PyTorch Mobile Performance Recipes
    https://pytorch.org/tutorials/recipes/mobile_perf.html

-15. amp_recipe.py
+14. amp_recipe.py
    Automatic Mixed Precision
    https://pytorch.org/tutorials/recipes/amp_recipe.html

recipes_source/recipes/loading_data_recipe.py

Lines changed: 0 additions & 163 deletions
This file was deleted.

recipes_source/recipes/tuning_guide.py

Lines changed: 32 additions & 0 deletions
@@ -213,6 +213,7 @@ def gelu(x):

 ###############################################################################
 # Typically, the following environment variables are used to set for CPU affinity with GNU OpenMP implementation. ``OMP_PROC_BIND`` specifies whether threads may be moved between processors. Setting it to CLOSE keeps OpenMP threads close to the primary thread in contiguous place partitions. ``OMP_SCHEDULE`` determines how OpenMP threads are scheduled. ``GOMP_CPU_AFFINITY`` binds threads to specific CPUs.
+# An important tuning parameter is core pinning, which prevents threads from migrating between CPUs, improving data locality and minimizing inter-core communication.
 #
 # .. code-block:: sh
 #
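The affinity variables described in the hunk above can be combined in a launch sketch like the following. The thread count and core range are illustrative assumptions, not values from the commit:

```shell
# Pin OpenMP threads for a 4-core run; adjust to your CPU topology.
export OMP_NUM_THREADS=4        # number of OpenMP threads (assumed value)
export OMP_PROC_BIND=CLOSE      # keep threads near the primary thread
export OMP_SCHEDULE=STATIC      # static loop scheduling
export GOMP_CPU_AFFINITY="0-3"  # bind threads to cores 0-3 (assumed range)
echo "pinned: $GOMP_CPU_AFFINITY"
```

With GNU OpenMP, ``GOMP_CPU_AFFINITY`` takes precedence for the actual core binding, while ``OMP_PROC_BIND=CLOSE`` keeps threads in place partitions near the primary thread.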
@@ -318,6 +319,37 @@ def gelu(x):
 # GPU specific optimizations
 # --------------------------

+###############################################################################
+# Enable Tensor cores
+# ~~~~~~~~~~~~~~~~~~~~~~~
+# Tensor cores are specialized hardware designed to compute matrix-matrix multiplication
+# operations, primarily utilized in deep learning and AI workloads. Tensor cores have
+# specific precision requirements which can be adjusted manually or via the Automatic
+# Mixed Precision API.
+#
+# In particular, tensor operations take advantage of lower-precision workloads,
+# which can be controlled via ``torch.set_float32_matmul_precision``.
+# The default setting is 'highest', which performs float32 matrix multiplications
+# at full precision. PyTorch also offers the alternative settings 'high' and
+# 'medium', which prioritize computational speed over numerical precision.
+
+###############################################################################
+# Use CUDA Graphs
+# ~~~~~~~~~~~~~~~~~~~~~~~
+# When using a GPU, work first must be launched from the CPU, and in some
+# cases the context switch between CPU and GPU can lead to poor resource
+# utilization. CUDA graphs are a way to keep computation within the GPU without
+# paying the extra cost of kernel launches and host synchronization.
+
+# They can be enabled using
+torch.compile(m, mode="reduce-overhead")
+# or
+torch.compile(m, mode="max-autotune")
+
+###############################################################################
+# Support for CUDA graphs is in development; using them can increase
+# device memory consumption, and some models might not compile.
+
 ###############################################################################
 # Enable cuDNN auto-tuner
 # ~~~~~~~~~~~~~~~~~~~~~~~
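The precision knob introduced in the "Enable Tensor cores" section above can be exercised with a short sketch. This assumes PyTorch 1.12 or later, where ``torch.set_float32_matmul_precision`` is available; the matrix sizes are illustrative:

```python
import torch

# "highest" (the default) keeps full float32 precision for matmuls;
# "high" and "medium" allow lower-precision formats (e.g. TF32) on
# Tensor-core GPUs, trading accuracy for speed.
torch.set_float32_matmul_precision("high")
print(torch.get_float32_matmul_precision())  # "high"

a = torch.randn(256, 256)
b = torch.randn(256, 256)
# On CUDA devices with Tensor cores this matmul may now use TF32;
# on CPU the setting is recorded but this matmul runs in plain float32.
c = a @ b
print(c.shape)  # torch.Size([256, 256])
```

The setting is process-global, so it is typically placed once near the start of a training script rather than toggled per operation.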
