karras-ve docs (open-mmlab#401)

* karras-ve docs for issue open-mmlab#293 * make style
aravind-h-v · Sep 7, 2022 · 65ed5d2 · 65ed5d2
1 parent 44091d8
commit 65ed5d2
Show file tree

Hide file tree

Showing 5 changed files with 48 additions and 4 deletions.
diff --git a/docs/source/api/pipelines/stochastic_karras_ve.mdx b/docs/source/api/pipelines/stochastic_karras_ve.mdx
@@ -1 +1,24 @@
-# GLIDE MODEL
+# Stochastic Karras VE
+
+## Overview
+
+[Elucidating the Design Space of Diffusion-Based Generative Models](https://arxiv.org/abs/2206.00364) by Tero Karras, Miika Aittala, Timo Aila and Samuli Laine.
+
+The abstract of the paper is the following:
+
+We argue that the theory and practice of diffusion-based generative models are currently unnecessarily convoluted and seek to remedy the situation by presenting a design space that clearly separates the concrete design choices. This lets us identify several changes to both the sampling and training processes, as well as preconditioning of the score networks. Together, our improvements yield new state-of-the-art FID of 1.79 for CIFAR-10 in a class-conditional setting and 1.97 in an unconditional setting, with much faster sampling (35 network evaluations per image) than prior designs. To further demonstrate their modular nature, we show that our design changes dramatically improve both the efficiency and quality obtainable with pre-trained score networks from previous work, including improving the FID of an existing ImageNet-64 model from 2.07 to near-SOTA 1.55.
+
+This pipeline implements the Stochastic sampling tailored to the Variance-Expanding (VE) models.
+
+
+## Available Pipelines:
+
+| Pipeline | Tasks | Colab
+|---|---|:---:|
+| [pipeline_stochastic_karras_ve.py](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stochastic_karras_ve/pipeline_stochastic_karras_ve.py) | *Unconditional Image Generation* | - |
+
+
+## API
+
+[[autodoc]] pipelines.stochastic_karras_ve.pipeline_stochastic_karras_ve.KarrasVePipeline
+    - __call__
diff --git a/src/diffusers/pipelines/README.md b/src/diffusers/pipelines/README.md
@@ -42,7 +42,7 @@ available a colab notebook to directly try them out.
 | [stable_diffusion](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | *Text-to-Image Generation* | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb)
 | [stable_diffusion](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | *Image-to-Image Text-Guided Generation* | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/Notebooks/blob/master/image_2_image_using_diffusers.ipynb)
 | [stable_diffusion](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | *Text-Guided Image Inpainting* | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/Notebooks/blob/master/in_painting_with_stable_diffusion_using_diffusers.ipynb)
-| [stochatic_karras_ve](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stochatic_karras_ve) | [**Elucidating the Design Space of Diffusion-Based Generative Models**](https://arxiv.org/abs/2206.00364) | *Unconditional Image Generation* | 
+| [stochastic_karras_ve](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stochastic_karras_ve) | [**Elucidating the Design Space of Diffusion-Based Generative Models**](https://arxiv.org/abs/2206.00364) | *Unconditional Image Generation* | 
 
 **Note**: Pipelines are simple examples of how to play around with the diffusion systems as described in the corresponding papers. 
 However, most of them can be adapted to use different scheduler components or even different model components. Some pipeline examples are shown in the [Examples](#examples) below.

diff --git a/src/diffusers/pipelines/__init__.py b/src/diffusers/pipelines/__init__.py
@@ -4,7 +4,7 @@
 from .latent_diffusion_uncond import LDMPipeline
 from .pndm import PNDMPipeline
 from .score_sde_ve import ScoreSdeVePipeline
-from .stochatic_karras_ve import KarrasVePipeline
+from .stochastic_karras_ve import KarrasVePipeline
 
 
 if is_transformers_available():

diff --git a/...pipelines/stochatic_karras_ve/__init__.py → ...ipelines/stochastic_karras_ve/__init__.py b/...pipelines/stochatic_karras_ve/__init__.py → ...ipelines/stochastic_karras_ve/__init__.py
diff --git a/...arras_ve/pipeline_stochastic_karras_ve.py → ...arras_ve/pipeline_stochastic_karras_ve.py b/...arras_ve/pipeline_stochastic_karras_ve.py → ...arras_ve/pipeline_stochastic_karras_ve.py
@@ -10,13 +10,18 @@
 
 
 class KarrasVePipeline(DiffusionPipeline):
-    """
+    r"""
     Stochastic sampling from Karras et al. [1] tailored to the Variance-Expanding (VE) models [2]. Use Algorithm 2 and
     the VE column of Table 1 from [1] for reference.
 
     [1] Karras, Tero, et al. "Elucidating the Design Space of Diffusion-Based Generative Models."
     https://arxiv.org/abs/2206.00364 [2] Song, Yang, et al. "Score-based generative modeling through stochastic
     differential equations." https://arxiv.org/abs/2011.13456
+
+    Parameters:
+        unet ([`UNet2DModel`]): U-Net architecture to denoise the encoded image.
+        scheduler ([`KarrasVeScheduler`]):
+            Scheduler for the diffusion process to be used in combination with `unet` to denoise the encoded image.
     """
 
     # add type hints for linting
@@ -38,6 +43,22 @@ def __call__(
         return_dict: bool = True,
         **kwargs,
     ) -> Union[Tuple, ImagePipelineOutput]:
+        r"""
+        Args:
+            batch_size (:obj:`int`, *optional*, defaults to 1):
+                The number of images to generate.
+            generator (:obj:`torch.Generator`, *optional*):
+                A [torch generator](https://pytorch.org/docs/stable/generated/torch.Generator.html) to make generation
+                deterministic.
+            num_inference_steps (:obj:`int`, *optional*, defaults to 50):
+                The number of denoising steps. More denoising steps usually lead to a higher quality image at the
+                expense of slower inference.
+            output_type (:obj:`str`, *optional*, defaults to :obj:`"pil"`):
+                The output format of the generate image. Choose between
+                [PIL](https://pillow.readthedocs.io/en/stable/): `PIL.Image.Image` or `nd.array`.
+            return_dict (:obj:`bool`, *optional*, defaults to :obj:`True`):
+                Whether or not to return a [`~pipeline_utils.ImagePipelineOutput`] instead of a plain tuple.
+        """
         if "torch_device" in kwargs:
             device = kwargs.pop("torch_device")
             warnings.warn(