Merged
34 commits
aa789c7
update TF 2.2 smdebug features
mchoi8739 Aug 10, 2020
df74588
add details
mchoi8739 Aug 10, 2020
2fa0fdb
Update code samples/notes for new pySDK and smdebug/add and fix links
mchoi8739 Aug 10, 2020
6857d6c
add 'New features' note
mchoi8739 Aug 10, 2020
8be632a
minor fix
mchoi8739 Aug 10, 2020
d787f4b
minor fix
mchoi8739 Aug 10, 2020
6c00d2a
fix formatting
mchoi8739 Aug 10, 2020
4b6e0de
minor fix
mchoi8739 Aug 10, 2020
54c12ce
lint
mchoi8739 Aug 10, 2020
9e079dd
lint
mchoi8739 Aug 13, 2020
4afb5fc
minor structure change
mchoi8739 Aug 13, 2020
9c20ef2
minor fix
mchoi8739 Aug 13, 2020
293f770
minor fix
mchoi8739 Aug 13, 2020
4996feb
incorporate comments
mchoi8739 Aug 13, 2020
782e8c6
incorporate comments / lift limitation note
mchoi8739 Aug 13, 2020
aa7fcc5
incorporate comments
mchoi8739 Aug 13, 2020
83ad970
include pypi links
mchoi8739 Aug 13, 2020
3f2beff
minor fix
mchoi8739 Aug 13, 2020
fd1b1c2
incorporate comments
mchoi8739 Aug 13, 2020
463f0b4
incorporate comments
mchoi8739 Aug 13, 2020
72e48df
incorporate comments
mchoi8739 Aug 13, 2020
557eae1
version addition
mchoi8739 Aug 13, 2020
1eee9c6
version addition
mchoi8739 Aug 13, 2020
70a594b
Merge branch 'master' of https://github.com/awslabs/sagemaker-debugger
mchoi8739 Aug 31, 2020
fd62feb
add footnote about limitation
mchoi8739 Aug 31, 2020
19754a1
add details
mchoi8739 Aug 31, 2020
dd13c6c
add footnote
mchoi8739 Aug 31, 2020
a337b0e
sync
mchoi8739 Sep 1, 2020
4d86970
retrigger CI
mchoi8739 Sep 1, 2020
0decbe9
fix version numbers
mchoi8739 Sep 1, 2020
51cd8de
Merge branch 'master' of https://github.com/awslabs/sagemaker-debugge…
mchoi8739 Sep 21, 2020
6ca4bf8
version updates
mchoi8739 Sep 21, 2020
da631db
drop the sagemaker PyPI link
mchoi8739 Sep 22, 2020
ac429de
drop the unnecessary link
mchoi8739 Sep 22, 2020
6 changes: 3 additions & 3 deletions README.md
@@ -64,11 +64,11 @@ The following frameworks are available AWS Deep Learning Containers with the dee
| Framework | Version |
| --- | --- |
| [TensorFlow](docs/tensorflow.md) | 1.15, 2.1.0, 2.2.0, 2.3.0 |
-| [MXNet](docs/mxnet.md) | 1.6 |
+| [MXNet](docs/mxnet.md) | 1.6, 1.7 |
| [PyTorch](docs/pytorch.md) | 1.4, 1.5, 1.6 |
| [XGBoost](docs/xgboost.md) | 0.90-2, 1.0-1 ([As a built-in algorithm](docs/xgboost.md#use-xgboost-as-a-built-in-algorithm))|

-**Note**: Debugger with zero script change is partially available for TensorFlow v2.1.0 and v2.3.0. The `inputs`, `outputs`, `gradients`, and `layers` built-in collections are currently not available for these TensorFlow versions.
+**Note**: Debugger with zero script change is partially available for TensorFlow v2.1.0. The `inputs`, `outputs`, `gradients`, and `layers` built-in collections are currently not available for these TensorFlow versions.

### AWS training containers with script mode

@@ -78,7 +78,7 @@ The `smdebug` library supports frameworks other than the ones listed above while
| --- | --- |
| [TensorFlow](docs/tensorflow.md) | 1.13, 1.14, 1.15, 2.1.0, 2.2.0, 2.3.0 |
| Keras (with TensorFlow backend) | 2.3 |
-| [MXNet](docs/mxnet.md) | 1.4, 1.5, 1.6 |
+| [MXNet](docs/mxnet.md) | 1.4, 1.5, 1.6, 1.7 |
| [PyTorch](docs/pytorch.md) | 1.2, 1.3, 1.4, 1.5, 1.6 |
| [XGBoost](docs/xgboost.md) | 0.90-2, 1.0-1 (As a framework)|
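
For reviewers who want to sanity-check the two version tables after this change, the following is a minimal sketch that transcribes the post-change rows into plain dictionaries and reports which mode a given framework version falls under. The names `ZERO_SCRIPT_CHANGE`, `SCRIPT_MODE`, and `support_mode` are hypothetical and not part of `smdebug`; the rule that a listed `1.6` also covers `1.6.x` is an assumption made for illustration, not something the README states.

```python
# Post-change rows of the two README tables, transcribed by hand.
# These dictionaries are illustrative only; smdebug does not expose them.
ZERO_SCRIPT_CHANGE = {
    "tensorflow": ["1.15", "2.1.0", "2.2.0", "2.3.0"],
    "mxnet": ["1.6", "1.7"],
    "pytorch": ["1.4", "1.5", "1.6"],
    "xgboost": ["0.90-2", "1.0-1"],
}

SCRIPT_MODE = {
    "tensorflow": ["1.13", "1.14", "1.15", "2.1.0", "2.2.0", "2.3.0"],
    "keras": ["2.3"],
    "mxnet": ["1.4", "1.5", "1.6", "1.7"],
    "pytorch": ["1.2", "1.3", "1.4", "1.5", "1.6"],
    "xgboost": ["0.90-2", "1.0-1"],
}


def _matches(installed: str, listed: str) -> bool:
    # Assumption: a listed "1.6" covers "1.6" and any "1.6.x" patch release.
    return installed == listed or installed.startswith(listed + ".")


def support_mode(framework: str, version: str) -> str:
    """Return 'zero-script-change', 'script-mode', or 'unsupported'."""
    name = framework.lower()
    if any(_matches(version, v) for v in ZERO_SCRIPT_CHANGE.get(name, [])):
        return "zero-script-change"
    if any(_matches(version, v) for v in SCRIPT_MODE.get(name, [])):
        return "script-mode"
    return "unsupported"


if __name__ == "__main__":
    for fw, ver in [("mxnet", "1.7"), ("mxnet", "1.5.1"), ("pytorch", "1.7")]:
        print(fw, ver, support_mode(fw, ver))
```

The zero-script-change table is checked first because every version it lists also appears in the script-mode table.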

4 changes: 2 additions & 2 deletions docs/mxnet.md
@@ -10,8 +10,8 @@

## Support

-- Zero Script Change experience where you need no modifications to your training script is supported in the official [SageMaker Framework Container for MXNet 1.6](https://docs.aws.amazon.com/sagemaker/latest/dg/pre-built-containers-frameworks-deep-learning.html), or the [AWS Deep Learning Container for MXNet 1.6](https://aws.amazon.com/machine-learning/containers/).
-- This library itself supports the following versions when you use our API which requires a few minimal changes to your training script: MXNet 1.4, 1.5, 1.6.
+- Zero Script Change experience where you need no modifications to your training script is supported in the official [AWS Deep Learning Container for MXNet](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#general-framework-containers).
+- This library itself supports the following versions when you use our API which requires a few minimal changes to your training script: MXNet 1.4, 1.5, 1.6, and 1.7.
- Only Gluon models are supported
- When the Gluon model is hybridized, inputs and outputs of intermediate layers can not be saved
- Parameter server based distributed training is not yet supported
4 changes: 2 additions & 2 deletions docs/pytorch.md
@@ -9,9 +9,9 @@

## Support
### Versions
-- Zero Script Change experience where you need no modifications to your training script is supported in the official [SageMaker Framework Container for PyTorch 1.3](https://docs.aws.amazon.com/sagemaker/latest/dg/pre-built-containers-frameworks-deep-learning.html), or the [AWS Deep Learning Container for PyTorch 1.3](https://aws.amazon.com/machine-learning/containers/).
+- Zero Script Change experience where you need no modifications to your training script is supported in the official [AWS Deep Learning Container for PyTorch](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#general-framework-containers).

-- The library itself supports the following versions when using changes to the training script: PyTorch 1.2, 1.3.
+- The library itself supports the following versions when using changes to the training script: PyTorch 1.2, 1.3, 1.4, 1.5, and 1.6.

---

9 changes: 3 additions & 6 deletions docs/tensorflow.md
@@ -12,17 +12,14 @@

## Amazon SageMaker Debugger Support for TensorFlow<a name="support"></a>

-Amazon SageMaker Debugger python SDK and its client library `smdebug` now fully support TensorFlow 2.2 with the latest version release.
+Amazon SageMaker Debugger python SDK and its client library `smdebug` now fully support TensorFlow 2.3 with the latest version release.

-- [Amazon SageMaker Python SDK PyPI](https://pypi.org/project/sagemaker/)
-- [The latest smdebug PyPI release](https://pypi.org/project/smdebug/)

-Using Debugger, you can access tensors of any kind for TensorFlow models, from the Keras model zoo to your own custom model, and save them using Debugger built-in or custom tensor collections. You can run your training script on [the official AWS Deep Learning Containers](https://docs.aws.amazon.com/sagemaker/latest/dg/debugger-container.html) where Debugger can automatically capture tensors from your training job. It doesn't matter whether your TensorFlow models use Keras API or pure TensorFlow API (in eager mode or non-eager mode), you can directly run them on the AWS Deep Learning Containers.
+Using Debugger, you can access tensors of any kind for TensorFlow models, from the Keras model zoo to your own custom model, and save them using Debugger built-in or custom tensor collections. You can run your training script on [the official AWS Deep Learning Containers](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#general-framework-containers) where Debugger can automatically capture tensors from your training job. It doesn't matter whether your TensorFlow models use Keras API or pure TensorFlow API (in eager mode or non-eager mode), you can directly run them on the AWS Deep Learning Containers.

Debugger and its client library `smdebug` support debugging your training job on other AWS training containers and custom containers. In this case, a hook registration process is required to manually add the hook features to your training script. For a full list of AWS TensorFlow containers to use Debugger, see [SageMaker containers to use Debugger with script mode](https://docs.aws.amazon.com/sagemaker/latest/dg/train-debugger.html#debugger-supported-aws-containers). For a complete guide for using custom containers, see [Use Debugger in Custom Training Containers](https://docs.aws.amazon.com/sagemaker/latest/dg/debugger-bring-your-own-container.html).
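
As a rough illustration of the hook registration step mentioned above, the sketch below builds a Keras hook with `smdebug.tensorflow.KerasHook` and `SaveConfig`, which are the entry points described in the `smdebug` TensorFlow docs. The helper name `make_keras_hook`, the `COLLECTIONS` tuple, the output path, and the save interval are all hypothetical choices for this example, and it assumes `smdebug` and TensorFlow are installed in the training container.

```python
# Built-in collections named in this doc; the tuple itself is illustrative.
COLLECTIONS = ("inputs", "outputs", "layers", "gradients")


def make_keras_hook(out_dir, save_interval=100):
    """Create an smdebug Keras hook that saves COLLECTIONS every `save_interval` steps.

    Hypothetical helper: smdebug is imported lazily so this module can be
    imported on machines without smdebug or TensorFlow installed.
    """
    import smdebug.tensorflow as smd

    return smd.KerasHook(
        out_dir=out_dir,  # where tensors are written; path is up to the caller
        save_config=smd.SaveConfig(save_interval=save_interval),
        include_collections=list(COLLECTIONS),
    )


# Typical use inside a Keras training script (not executed here):
#   hook = make_keras_hook("/tmp/smdebug_run")
#   model.fit(x_train, y_train, epochs=1, callbacks=[hook])
```

Passing the hook through `callbacks` is the only change to the training loop in this sketch; on the AWS Deep Learning Containers with zero script change, this step is not needed.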

### New Features supported by Debugger
-- The latest TensorFlow version fully covered by Debugger is 2.2.0
+- The latest TensorFlow version fully covered by Debugger is 2.3.0
- Debug training jobs with the TensorFlow framework or Keras TensorFlow
- Debug training jobs with the TensorFlow eager or non-eager mode
- New built-in tensor collections: `inputs`, `outputs`, `layers`, `gradients`