Commit

Merge branch 'master' of github.com:pytorch/hub into set_trust_repo

NicolasHug committed Jan 13, 2023
2 parents 717726b + ed58d27 commit bbadbd4
Showing 16 changed files with 128 additions and 91 deletions.
7 changes: 2 additions & 5 deletions .circleci/config.yml
@@ -8,16 +8,13 @@ update_submodule: &update_submodule
jobs:
  run_torchhub:
    machine:
      image: ubuntu-1604:201903-01
      image: ubuntu-2004-cuda-11.4:202110-01
    resource_class: gpu.nvidia.small
    steps:
      - checkout
      - run:
          name: Setup CI environment
          command: ./scripts/setup_ci.sh
      - run:
          name: Install Deps
          command: ./scripts/install_basics.sh; ./scripts/install.sh
          command: ./scripts/install_conda.sh; ./scripts/install_deps.sh
      - run:
          name: Sanity Check
          command: . ~/miniconda3/etc/profile.d/conda.sh; conda activate base; python scripts/sanity_check.py
2 changes: 1 addition & 1 deletion facebookresearch_pytorch-gan-zoo_pgan.md
@@ -67,4 +67,4 @@ Progressive Growing of GANs is a method developed by Karras et. al. [1] in 2017

### References

- [Progressive Growing of GANs for Improved Quality, Stability, and Variation](https://arxiv.org/abs/1710.10196)
[1] Tero Karras et al, "Progressive Growing of GANs for Improved Quality, Stability, and Variation" https://arxiv.org/abs/1710.10196
5 changes: 3 additions & 2 deletions facebookresearch_pytorchvideo_resnet.md
@@ -160,11 +160,12 @@ print("Top 5 predicted labels: %s" % ", ".join(pred_class_names))
### Model Description
The model architecture is based on [1] with pretrained weights using the 8x8 setting
on the Kinetics dataset.

| arch | depth | frame length x sample rate | top 1 | top 5 | Flops (G) | Params (M) |
| --------------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- |
| --------------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- |
| Slow | R50 | 8x8 | 74.58 | 91.63 | 54.52 | 32.45 |


### References
[1] Christoph Feichtenhofer et al, "SlowFast Networks for Video Recognition"
https://arxiv.org/pdf/1812.03982.pdf
https://arxiv.org/pdf/1812.03982.pdf
@@ -20,7 +20,7 @@ demo-model-link: https://huggingface.co/spaces/pytorch/semi-supervised-ImageNet1
```python
import torch

# === SEMI-WEAKLY SUPERVISED MODELSP RETRAINED WITH 940 HASHTAGGED PUBLIC CONTENT ===
# === SEMI-WEAKLY SUPERVISED MODELS PRETRAINED WITH 940 HASHTAGGED PUBLIC CONTENT ===
model = torch.hub.load('facebookresearch/semi-supervised-ImageNet1K-models', 'resnet18_swsl')
# model = torch.hub.load('facebookresearch/semi-supervised-ImageNet1K-models', 'resnet50_swsl')
# model = torch.hub.load('facebookresearch/semi-supervised-ImageNet1K-models', 'resnext50_32x4d_swsl')
Binary file added images/snnmlp.png
8 changes: 4 additions & 4 deletions nvidia_deeplearningexamples_resnet50.md
@@ -25,13 +25,13 @@ The ***ResNet50 v1.5*** model is a modified version of the [original ResNet50 v1
The difference between v1 and v1.5 is that, in the bottleneck blocks which require
downsampling, v1 has stride = 2 in the first 1x1 convolution, whereas v1.5 has stride = 2 in the 3x3 convolution.

This difference makes ResNet50 v1.5 slightly more accurate (\~0.5% top1) than v1, but comes with a smallperformance drawback (\~5% imgs/sec).
This difference makes ResNet50 v1.5 slightly more accurate (\~0.5% top1) than v1, but comes with a small performance drawback (\~5% imgs/sec).
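For intuition, here is a minimal sketch of where that stride lands in the two variants (a simplified bottleneck stack without the shortcut path, not the NVIDIA implementation):

```python
import torch.nn as nn

def bottleneck(in_ch, mid_ch, out_ch, stride=2, v1_5=True):
    # v1 puts the stride in the first 1x1 convolution; v1.5 moves it to the 3x3.
    s1, s3 = (1, stride) if v1_5 else (stride, 1)
    return nn.Sequential(
        nn.Conv2d(in_ch, mid_ch, 1, stride=s1, bias=False), nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
        nn.Conv2d(mid_ch, mid_ch, 3, stride=s3, padding=1, bias=False), nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
        nn.Conv2d(mid_ch, out_ch, 1, bias=False), nn.BatchNorm2d(out_ch),
    )
```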

The model is initialized as described in [Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification](https://arxiv.org/pdf/1502.01852.pdf)
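In PyTorch that initialization is commonly expressed along these lines (a sketch, not the exact recipe used in the repository):

```python
import torch.nn as nn

def init_weights(m):
    # Kaiming (He) initialization for convolutions, constant init for batch norm
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.constant_(m.weight, 1.0)
        nn.init.constant_(m.bias, 0.0)

# model.apply(init_weights)  # applied recursively to every submodule
```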

This model is trained with mixed precision using Tensor Cores on Volta, Turing, and the NVIDIA Ampere GPU architectures. Therefore, researchers can get results over 2x faster than training without Tensor Cores, while experiencing the benefits of mixed precision training. This model is tested against each NGC monthly container release to ensure consistent accuracy and performance over time.
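For context, a toy sketch of the mixed-precision training pattern with PyTorch's native AMP; illustrative only (the dummy model and data below are placeholders, and the real recipe lives in the Deep Learning Examples repository):

```python
import torch
import torch.nn as nn

# requires a CUDA device; Tensor Cores are what deliver the speedup
model = nn.Linear(128, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()

for _ in range(3):                                 # toy loop over random data
    x = torch.randn(32, 128, device='cuda')
    y = torch.randint(0, 10, (32,), device='cuda')
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():                # forward pass runs in mixed precision
        loss = criterion(model(x), y)
    scaler.scale(loss).backward()                  # scale the loss to avoid fp16 gradient underflow
    scaler.step(optimizer)
    scaler.update()
```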

Note that the ResNet50 v1.5 model can be deployed for inference on the [NVIDIA Triton Inference Server](https://github.com/NVIDIA/trtis-inference-server) using TorchScript, ONNX Runtime or TensorRT as an execution backend. For details check [NGC](https://ngc.nvidia.com/catalog/resources/nvidia:resnet_for_triton_from_pytorch)
Note that the ResNet50 v1.5 model can be deployed for inference on the [NVIDIA Triton Inference Server](https://github.com/triton-inference-server/server) using TorchScript, ONNX Runtime or TensorRT as an execution backend. For details check [NGC](https://ngc.nvidia.com/catalog/resources/nvidia:resnet_for_triton_from_pytorch)
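A minimal sketch of producing a TorchScript artifact that such a backend could serve, assuming the `nvidia_resnet50` hub entrypoint used in the example below (the actual Triton packaging is documented in the NGC resource linked above):

```python
import torch

resnet50 = torch.hub.load('NVIDIA/DeepLearningExamples:torchhub', 'nvidia_resnet50', pretrained=True)
resnet50.eval()

example = torch.randn(1, 3, 224, 224)
traced = torch.jit.trace(resnet50, example)   # trace the model to TorchScript
traced.save("model.pt")                       # e.g. placed under <model_repository>/resnet50/1/ for Triton
```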


### Example
@@ -81,7 +81,7 @@ batch = torch.cat(
).to(device)
```

Run inference. Use `pick_n_best(predictions=output, n=topN)` helepr function to pick N most probably hypothesis according to the model.
Run inference. Use the `pick_n_best(predictions=output, n=topN)` helper function to pick the N most probable hypotheses according to the model.
```python
with torch.no_grad():
    output = torch.nn.functional.softmax(resnet50(batch), dim=1)
@@ -112,4 +112,4 @@ and/or [NGC](https://ngc.nvidia.com/catalog/resources/nvidia:resnet_50_v1_5_for_
- [model on github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/resnet50v1.5)
- [model on NGC](https://ngc.nvidia.com/catalog/resources/nvidia:resnet_50_v1_5_for_pytorch)
- [pretrained model on NGC](https://ngc.nvidia.com/catalog/models/nvidia:resnet50_pyt_amp)


4 changes: 2 additions & 2 deletions nvidia_deeplearningexamples_resnext.md
@@ -28,7 +28,7 @@ This model is trained with mixed precision using Tensor Cores on Volta, Turing,

We use [NHWC data layout](https://pytorch.org/tutorials/intermediate/memory_format_tutorial.html) when training using Mixed Precision.
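Illustratively, opting into the channels-last (NHWC) memory format looks like this; a minimal sketch, not the repository's training script:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 64, kernel_size=3, padding=1).cuda()
conv = conv.to(memory_format=torch.channels_last)             # weights laid out as NHWC

x = torch.randn(8, 3, 224, 224, device='cuda').to(memory_format=torch.channels_last)
with torch.cuda.amp.autocast():                                # NHWC pays off mainly with Tensor Core kernels
    y = conv(x)
print(y.is_contiguous(memory_format=torch.channels_last))      # True
```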

Note that the ResNeXt101-32x4d model can be deployed for inference on the [NVIDIA Triton Inference Server](https://github.com/NVIDIA/trtis-inference-server) using TorchScript, ONNX Runtime or TensorRT as an execution backend. For details check [NGC](https://ngc.nvidia.com/catalog/resources/nvidia:resnext_for_triton_from_pytorch)
Note that the ResNeXt101-32x4d model can be deployed for inference on the [NVIDIA Triton Inference Server](https://github.com/triton-inference-server/server) using TorchScript, ONNX Runtime or TensorRT as an execution backend. For details check [NGC](https://ngc.nvidia.com/catalog/resources/nvidia:resnext_for_triton_from_pytorch)
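As a sketch only, an ONNX export for such a backend might look like the following; it assumes the `nvidia_resneXt` hub entrypoint used in the example below, and the supported flow is described in the linked NGC resource:

```python
import torch

resneXt = torch.hub.load('NVIDIA/DeepLearningExamples:torchhub', 'nvidia_resneXt', pretrained=True)
resneXt.eval()

dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    resneXt, dummy, "resnext101-32x4d.onnx",
    input_names=["input"], output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}, "logits": {0: "batch"}},  # allow a variable batch size
)
```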

#### Model architecture

@@ -87,7 +87,7 @@ batch = torch.cat(
).to(device)
```

Run inference. Use `pick_n_best(predictions=output, n=topN)` helepr function to pick N most probably hypothesis according to the model.
Run inference. Use the `pick_n_best(predictions=output, n=topN)` helper function to pick the N most probable hypotheses according to the model.
```python
with torch.no_grad():
    output = torch.nn.functional.softmax(resneXt(batch), dim=1)
2 changes: 1 addition & 1 deletion nvidia_deeplearningexamples_se-resnext.md
@@ -37,7 +37,7 @@ _Image source: [Squeeze-and-Excitation Networks](https://arxiv.org/pdf/1709.0150
The image shows the architecture of the SE block and where it is placed in the ResNet bottleneck block.
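As a rough illustration of that block, here is a generic squeeze-and-excitation module (a sketch, not the exact implementation used in this repository):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: pool the feature map globally, then rescale its channels."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc1 = nn.Linear(channels, channels // reduction)
        self.fc2 = nn.Linear(channels // reduction, channels)

    def forward(self, x):                      # x: (B, C, H, W), e.g. a bottleneck output
        s = x.mean(dim=(2, 3))                 # squeeze: global average pooling
        s = torch.relu(self.fc1(s))
        s = torch.sigmoid(self.fc2(s))         # per-channel gates in [0, 1]
        return x * s[:, :, None, None]         # excitation: rescale the input channels

se = SEBlock(256)
print(se(torch.randn(2, 256, 14, 14)).shape)   # torch.Size([2, 256, 14, 14])
```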


Note that the SE-ResNeXt101-32x4d model can be deployed for inference on the [NVIDIA Triton Inference Server](https://github.com/NVIDIA/trtis-inference-server) using TorchScript, ONNX Runtime or TensorRT as an execution backend. For details check [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/resources/se_resnext_for_triton_from_pytorch).
Note that the SE-ResNeXt101-32x4d model can be deployed for inference on the [NVIDIA Triton Inference Server](https://github.com/triton-inference-server/server) using TorchScript, ONNX Runtime or TensorRT as an execution backend. For details check [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/resources/se_resnext_for_triton_from_pytorch).

### Example

2 changes: 1 addition & 1 deletion nvidia_deeplearningexamples_tacotron2.md
@@ -28,7 +28,7 @@ This implementation of Tacotron 2 model differs from the model described in the

In the example below:
- pretrained Tacotron2 and Waveglow models are loaded from torch.hub
- Tacotron2 generates mel spectrogram given tensor represantation of an input text ("Hello world, I missed you so much")
- Given a tensor representation of the input text ("Hello world, I missed you so much"), Tacotron2 generates a Mel spectrogram as shown in the illustration
- Waveglow generates sound given the mel spectrogram
- the output sound is saved in an 'audio.wav' file

4 changes: 2 additions & 2 deletions nvidia_deeplearningexamples_waveglow.md
@@ -26,7 +26,7 @@ The Tacotron 2 and WaveGlow model form a text-to-speech system that enables user

In the example below:
- pretrained Tacotron2 and Waveglow models are loaded from torch.hub
- Tacotron2 generates mel spectrogram given tensor represantation of an input text ("Hello world, I missed you so much")
- Given a tensor representation of the input text ("Hello world, I missed you so much"), Tacotron2 generates a Mel spectrogram as shown in the illustration
- Waveglow generates sound given the mel spectrogram
- the output sound is saved in an 'audio.wav' file

@@ -98,4 +98,4 @@ For detailed information on model input and output, training recipies, inference
- [Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions](https://arxiv.org/abs/1712.05884)
- [WaveGlow: A Flow-based Generative Network for Speech Synthesis](https://arxiv.org/abs/1811.00002)
- [Tacotron2 and WaveGlow on NGC](https://ngc.nvidia.com/catalog/resources/nvidia:tacotron_2_and_waveglow_for_pytorch)
- [Tacotron2 and Waveglow on github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2)
- [Tacotron2 and Waveglow on github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2)
25 changes: 9 additions & 16 deletions pytorch_vision_once_for_all.md
@@ -23,11 +23,11 @@ You can quickly load a supernet as following

```python
import torch
super_net_name = "ofa_supernet_resnet50"
super_net_name = "ofa_supernet_mbv3_w10"
# other options:
# ofa_mbv3_d234_e346_k357_w1.0 /
# ofa_mbv3_d234_e346_k357_w1.2 /
# ofa_proxyless_d234_e346_k357_w1.3
# ofa_supernet_resnet50 /
# ofa_supernet_mbv3_w12 /
# ofa_supernet_proxyless

super_net = torch.hub.load('mit-han-lab/once-for-all', super_net_name, pretrained=True).eval()
```
@@ -60,25 +60,18 @@ import torch

# or load an architecture specialized for a certain platform
net_config = "resnet50D_MAC_4_1B"
# other options
# resnet50D_MAC@4.1B_top1@79.8
# resnet50D_MAC@3.7B_top1@79.7
# resnet50D_MAC@3.0B_top1@79.3
# resnet50D_MAC@2.4B_top1@79.0
# resnet50D_MAC@1.8B_top1@78.3
# resnet50D_MAC@1.2B_top1@77.1_finetune@25
# resnet50D_MAC@0.9B_top1@76.3_finetune@25
# resnet50D_MAC@0.6B_top1@75.0_finetune@25

specialized_net = torch.hub.load('mit-han-lab/once-for-all', net_config, pretrained=True).eval()

specialized_net, image_size = torch.hub.load('mit-han-lab/once-for-all', net_config, pretrained=True)
specialized_net.eval()
```

More models and configurations can be found in [once-for-all/model-zoo](https://github.com/mit-han-lab/once-for-all#evaluate-1)
and obtained through the following scripts

```python
ofa_specialized_get = torch.hub.load('mit-han-lab/once-for-all', "ofa_specialized_get")
model = ofa_specialized_get("flops@595M_top1@80.0_finetune@75", pretrained=True)
model, image_size = ofa_specialized_get("flops@595M_top1@80.0_finetune@75", pretrained=True)
model.eval()
```

The model's prediction can be evaluated by
96 changes: 96 additions & 0 deletions pytorch_vision_snnmlp.md
@@ -0,0 +1,96 @@
---
layout: hub_detail
background-class: hub-background
body-class: hub
title: SNNMLP
summary: Brain-inspired Multilayer Perceptron with Spiking Neurons
category: researchers
image: snnmlp.png
author: Huawei Noah's Ark Lab
tags: [vision, scriptable]
github-link: https://github.com/huawei-noah/Efficient-AI-Backbones
github-id: huawei-noah/Efficient-AI-Backbones
featured_image_1: snnmlp.png
featured_image_2: no-image
accelerator: cuda-optional
order: 10
---

```python
import torch
model = torch.hub.load('huawei-noah/Efficient-AI-Backbones', 'snnmlp_t', pretrained=True)
# or
# model = torch.hub.load('huawei-noah/Efficient-AI-Backbones', 'snnmlp_s', pretrained=True)
# or
# model = torch.hub.load('huawei-noah/Efficient-AI-Backbones', 'snnmlp_b', pretrained=True)
model.eval()
```

All pre-trained models expect input images normalized in the same way,
i.e. mini-batches of 3-channel RGB images of shape `(3 x H x W)`, where `H` and `W` are expected to be at least `224`.
The images have to be loaded into a range of `[0, 1]` and then normalized using `mean = [0.485, 0.456, 0.406]`
and `std = [0.229, 0.224, 0.225]`.

Here's a sample execution.

```python
# Download an example image from the pytorch website
import urllib
url, filename = ("https://github.com/pytorch/hub/raw/master/dog.jpg", "dog.jpg")
try: urllib.URLopener().retrieve(url, filename)
except: urllib.request.urlretrieve(url, filename)
```

```python
# sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model

# move the input and model to GPU for speed if available
if torch.cuda.is_available():
    input_batch = input_batch.to('cuda')
    model.to('cuda')

with torch.no_grad():
    output = model(input_batch)
# Tensor of shape 1000, with confidence scores over Imagenet's 1000 classes
print(output[0])
# The output has unnormalized scores. To get probabilities, you can run a softmax on it.
print(torch.nn.functional.softmax(output[0], dim=0))

```

### Model Description

SNNMLP incorporates the mechanism of LIF neurons into the MLP models, to achieve better accuracy without extra FLOPs. We propose a full-precision LIF operation to communicate between patches, including horizontal LIF and vertical LIF in different directions. We also propose to use group LIF to extract better local features. With LIF modules, our SNNMLP model achieves 81.9%, 83.3% and 83.6% top-1 accuracy on ImageNet dataset with only 4.4G, 8.5G and 15.2G FLOPs, respectively.
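As a loose illustration only (this is not the authors' code, and the function name and update rule here are simplified assumptions), a horizontal leaky integrate-and-fire pass over a feature map could be sketched as:

```python
import torch

def horizontal_lif(x, tau=0.25, v_th=1.0):
    # Sweep along the width axis, leak-and-integrate the activations, and emit
    # the full-precision surplus above the threshold instead of a binary spike.
    B, C, H, W = x.shape
    v = torch.zeros(B, C, H, device=x.device)   # membrane potential per spatial row
    out = []
    for w in range(W):
        v = (1 - tau) * v + x[..., w]           # leaky integration of the incoming column
        fire = torch.relu(v - v_th)             # "fire" whatever exceeds the threshold
        v = v - fire                            # soft reset by the emitted amount
        out.append(fire)
    return torch.stack(out, dim=-1)

print(horizontal_lif(torch.randn(1, 8, 7, 7)).shape)   # torch.Size([1, 8, 7, 7])
```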

The corresponding accuracy on the ImageNet dataset with the pretrained models is listed below.

| Model structure | #Parameters | FLOPs | Top-1 acc |
| --------------- | ----------- | ----------- | ----------- |
| SNNMLP Tiny | 28M | 4.4G | 81.88 |
| SNNMLP Small | 50M | 8.5G | 83.30 |
| SNNMLP Base | 88M | 15.2G | 83.59 |


### References

You can read the full paper [here](https://arxiv.org/abs/2203.14679).
```
@inproceedings{li2022brain,
  title={Brain-inspired multilayer perceptron with spiking neurons},
  author={Li, Wenshuo and Chen, Hanting and Guo, Jianyuan and Zhang, Ziyang and Wang, Yunhe},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={783--793},
  year={2022}
}
```
12 changes: 2 additions & 10 deletions scripts/install_basics.sh → scripts/install_conda.sh
@@ -1,20 +1,12 @@
#!/bin/bash
set -e
set -x
set -ex

# Install basics
sudo apt-get install vim

# Install miniconda
CONDA=https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
filename=$(basename "$CONDA")
wget "$CONDA"
chmod +x "$filename"
./"$filename" -b -u
bash "$filename" -b -u

# Force to use python3.8
. ~/miniconda3/etc/profile.d/conda.sh
conda activate base
conda install -y python=3.8


11 changes: 4 additions & 7 deletions scripts/install.sh → scripts/install_deps.sh
@@ -1,17 +1,14 @@
#!/bin/bash
set -e
set -x
set -ex

. ~/miniconda3/etc/profile.d/conda.sh
conda activate base

conda install -y pytorch torchvision torchaudio -c pytorch-nightly

conda install -y pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch-nightly -c nvidia
conda install -y pytest

# Dependencies required to load models
conda install -y regex pillow tqdm boto3 requests numpy\
h5py scipy matplotlib unidecode ipython pyyaml
conda install -y regex pillow tqdm boto3 requests numpy h5py scipy matplotlib unidecode ipython pyyaml
conda install -y -c conda-forge librosa inflect

pip install -q fastBPE sacremoses sentencepiece subword_nmt editdistance
@@ -21,4 +18,4 @@ pip install -q hydra-core opencv-python fvcore
pip install -q --upgrade google-api-python-client
pip install pytorchvideo
pip install -q prefetch_generator # yolop
pip install -q pretrainedmodels efficientnet_pytorch # hybridnets
pip install -q pretrainedmodels efficientnet_pytorch webcolors # hybridnets
7 changes: 0 additions & 7 deletions scripts/install_nightlies.sh

This file was deleted.

32 changes: 0 additions & 32 deletions scripts/setup_ci.sh

This file was deleted.
