Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add backend intro and how-to diagram #9175

Merged
merged 20 commits into from
Jul 22, 2024
Merged

Conversation

JessicaS11
Copy link
Contributor

@JessicaS11 JessicaS11 commented Jun 25, 2024

@JessicaS11 JessicaS11 marked this pull request as ready for review June 25, 2024 20:37
@keewis
Copy link
Collaborator

keewis commented Jun 25, 2024

be aware that currently this also fails on main – a dependency stopped pinning to numpy<2.0. If you don't want to spend your time fixing the numpy 2 issues, feel free to pin numpy<2.0 in the docs environment (but otherwise I'd appreciate the help!)

@max-sixty
Copy link
Collaborator

Are we OK to merge this? The content looks good! The errors are unrelated.

@JessicaS11 does the mermaid diagram render OK? Let's merge away if so...

@JessicaS11
Copy link
Contributor Author

be aware that currently this also fails on main – a dependency stopped pinning to numpy<2.0. If you don't want to spend your time fixing the numpy 2 issues, feel free to pin numpy<2.0 in the docs environment (but otherwise I'd appreciate the help!)

I was hoping this would allow me to see how it rendered (it's got some funny spacing in the mermaid.live version, screenshot below) with an added bonus of addressing an easy numpy 2.0 failure, but alas... happy to pin the environment if that's the current plan for handling numpy>=2.0.

Are we OK to merge this? The content looks good! The errors are unrelated.
As noted above, the spacing is a bit funny, so I'm going to try and make it more explicit so it will hopefully render more cleanly.

image

@keewis
Copy link
Collaborator

keewis commented Jun 26, 2024

if that's the current plan for handling numpy>=2.0.

after writing this I've opened #9177 to deal with most of these issues. I didn't notice that RTD is still failing, though.

@max-sixty
Copy link
Collaborator

OK!

it's got some funny spacing in the mermaid.live version, screenshot below

Possibly not a blocker to merge, but... is this only in the preview rather than the version that would deploy? I do think it's somewhat difficult to read — e.g. empty lines at the top of each block, line-breaks carrying through from the arbitrary line-breaks in the .rst file, the code isn't monospaced, lists are centered.

Do we need the warranty on the list not being inclusive ("exhaustive"?) — the "No" condition seems to cover that it's not exhaustive by suggesting to ask around?

I really liked the diagram in the previous PR, I thought it was a great use of a diagram. Maaaaaybe for a much simpler decision tree of if / else: if / else text is sufficient... (still keen to merge though)

@JessicaS11
Copy link
Contributor Author

JessicaS11 commented Jun 27, 2024

Possibly not a blocker to merge, but... is this only in the preview rather than the version that would deploy? I do think it's somewhat difficult to read — e.g. empty lines at the top of each block, line-breaks carrying through from the arbitrary line-breaks in the .rst file, the code isn't monospaced, lists are centered.

I did some work to try and improve the rendering (there's no preview for how it would deploy because RTD build is failing, so all I can go on is how it renders in the live tool, which is where the screen shot is from). The empty lines at the top appear when you add the `" notation to make the nodes render as markdown, which is required for the italics and bold. I cannot find any record or setting that makes the extra space at the top go away.

Do we need the warranty on the list not being inclusive ("exhaustive"?) — the "No" condition seems to cover that it's not exhaustive by suggesting to ask around?

Good point - removed to streamline.

I really liked the diagram in the previous PR, I thought it was a great use of a diagram. Maaaaaybe for a much simpler decision tree of if / else: if / else text is sufficient... (still keen to merge though)

This came out of some conversation with @scottyhq @TomNicholas @negin513 @betolink during planning for the upcoming SciPy tutorial. We also considered putting it into the tutorial book and decided it might be a good "intro" to this section of the docs. Given how text-rich much of the Xarray docs are, I personally am a fan of anything that conveys info in a more visually interesting way (I like the idea of some type of if/else callout that finds a happy medium between more complex graphics and wall of text).

Updated screenshot:
image

doc/user-guide/io.rst Outdated Show resolved Hide resolved
doc/user-guide/io.rst Outdated Show resolved Hide resolved
@max-sixty
Copy link
Collaborator

max-sixty commented Jun 27, 2024

Nice, that does look better, thanks

How about this as a few small changes? Feel free to take anything (but no obligation), and then let's merge

flowchart LR
    built-in-eng["""Is your data stored in one of these formats?
        - netCDF4 (<code>netcdf4</code>)
        - netCDF3 (<code>scipy</code>)
        - Zarr (<code>zarr</code>)
        - DODS/OPeNDAP (<code>pydap</code>)
        - HDF5 (<code>h5netcdf</code>)
        """]

    built-in["""You're in luck! Xarray bundles a backend for this format.
        Open data using <code>xr.open_dataset()</code>. We recommend
        always setting the engine you want to use."""]

    installed-eng["""One of these formats?
        - GRIB (<code>cfgrib</code>)
        - TileDB (<code>tiledb</code>)
        - GeoTIFF, JPEG-2000, ESRI-hdf (<code>rioxarray</code>, via GDAL)
        - Sentinel-1 SAFE (<code>xarray-sentinel</code>)
        """]

    installed["""Install the package indicated in parentheses
        to your Python environment. Restart the kernel
        and use <code>xr.open_dataset(files, engine='rioxarray')</code>"""]

    other["""Ask around to see if someone in your data community
        has created an Xarray backend for your data type.
        If not, you may need to create your own or consider
        exporting your data to a more common format."""]

    built-in-eng -->|Yes| built-in
    built-in-eng -->|No| installed-eng

    installed-eng -->|Yes| installed
    installed-eng -->|No| other

    click built-in-eng "https://docs.xarray.dev/en/stable/getting-started-guide/faq.html#how-do-i-open-format-x-file-as-an-xarray-dataset"
    click installed-eng "https://corteva.github.io/rioxarray/stable/getting_started/getting_started.html#rioxarray"
    click other "https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html"

    classDef quesNodefmt fill:#9DEEF4,stroke:#206C89,text-align:left
    class built-in-eng,installed-eng quesNodefmt

    classDef ansNodefmt fill:#FFAA05,stroke:#E37F17,text-align:left,white-space:nowrap
    class built-in,installed,other ansNodefmt

    linkStyle default font-size:20pt,color:#206C89
Loading

(edit: I notice GH renders this if we supply mermaid as a code tag! So also pasting as plain text, and a screenshot, since GH doesn't render the monospace)

flowchart LR
    built-in-eng["""Is your data stored in one of these formats?
        - netCDF4 (<code>netcdf4</code>)
        - netCDF3 (<code>scipy</code>)
        - Zarr (<code>zarr</code>)
        - DODS/OPeNDAP (<code>pydap</code>)
        - HDF5 (<code>h5netcdf</code>)
        """]

    built-in["""You're in luck! Xarray bundles a backend for this format.
        Open data using <code>xr.open_dataset()</code>. We recommend
        always setting the engine you want to use."""]

    installed-eng["""One of these formats?
        - GRIB (<code>cfgrib</code>)
        - TileDB (<code>tiledb</code>)
        - GeoTIFF, JPEG-2000, ESRI-hdf (<code>rioxarray</code>, via GDAL)
        - Sentinel-1 SAFE (<code>xarray-sentinel</code>)
        """]

    installed["""Install the package indicated in parentheses
        to your Python environment. Restart the kernel
        and use <code>xr.open_dataset(files, engine='rioxarray')</code>"""]

    other["""Ask around to see if someone in your data community
        has created an Xarray backend for your data type.
        If not, you may need to create your own or consider
        exporting your data to a more common format."""]

    built-in-eng -->|Yes| built-in
    built-in-eng -->|No| installed-eng

    installed-eng -->|Yes| installed
    installed-eng -->|No| other

    click built-in-eng "https://docs.xarray.dev/en/stable/getting-started-guide/faq.html#how-do-i-open-format-x-file-as-an-xarray-dataset"
    click installed-eng "https://corteva.github.io/rioxarray/stable/getting_started/getting_started.html#rioxarray"
    click other "https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html"

    classDef quesNodefmt fill:#9DEEF4,stroke:#206C89,text-align:left
    class built-in-eng,installed-eng quesNodefmt

    classDef ansNodefmt fill:#FFAA05,stroke:#E37F17,text-align:left,white-space:nowrap
    class built-in,installed,other ansNodefmt

    linkStyle default font-size:20pt,color:#206C89
image

@JessicaS11
Copy link
Contributor Author

How about this as a few small changes? Feel free to take anything (but no obligation), and then let's merge

Thanks! Using monospace was a great idea.

It's exciting to know that GitHub [sort-of] renders mermaid!

@max-sixty max-sixty added the plan to merge Final call for comments label Jul 12, 2024
@max-sixty
Copy link
Collaborator

Would be really good to merge this!

I'm a bit nervous about merging before we've seen that readthedocs builds it correctly. Is anyone familiar with how close we are to having the docs build correctly on main, so the build can also check this code?

doc/user-guide/io.rst Outdated Show resolved Hide resolved
doc/user-guide/io.rst Outdated Show resolved Hide resolved
installed-eng -->|No| other

click built-in-eng "https://docs.xarray.dev/en/stable/getting-started-guide/faq.html#how-do-i-open-format-x-file-as-an-xarray-dataset"
click installed-eng "https://corteva.github.io/rioxarray/stable/getting_started/getting_started.html#rioxarray"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm looking at https://xray--9175.org.readthedocs.build/en/9175/user-guide/io.html — I think this means that clicking anywhere in One of these formats? links to the RioXarray docs...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - unfortunately you can only add one link per node. My [hesitant] intent was to give users a clickable example, but if you think it's clearer to not have links embedded in the diagram (and either add them as a list after or not at all) I wouldn't argue.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think that's a bit confusing... Any chance it takes markdown links?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It didn't on my first attempt, but it takes html ones!

@max-sixty
Copy link
Collaborator

Awesome, thank you @JessicaS11 !!

@max-sixty max-sixty merged commit 95e67b6 into pydata:main Jul 22, 2024
19 checks passed
dcherian added a commit that referenced this pull request Jul 22, 2024
* main:
  add backend intro and how-to diagram (#9175)
  Fix copybutton for multi line examples in double digit ipython cells (#9264)
  Update signature for _arrayfunction.__array__ (#9237)
  Add encode_cf_datetime benchmark (#9262)
  groupby, resample: Deprecate some positional args (#9236)
  Delete ``base`` and ``loffset`` parameters to resample (#9233)
  Update dropna docstring (#9257)
  Grouper, Resampler as public api (#8840)
  Fix mypy on main (#9252)
  fix fallback isdtype method (#9250)
  Enable pandas type checking (#9213)
  Per-variable specification of boolean parameters in open_dataset (#9218)
  test push
  Added a space to the documentation (#9247)
  Fix typing for test_plot.py (#9234)
dcherian added a commit that referenced this pull request Jul 24, 2024
* main: (54 commits)
  Adding `open_datatree` backend-specific keyword arguments (#9199)
  [pre-commit.ci] pre-commit autoupdate (#9202)
  Restore ability to specify _FillValue as Python native integers (#9258)
  add backend intro and how-to diagram (#9175)
  Fix copybutton for multi line examples in double digit ipython cells (#9264)
  Update signature for _arrayfunction.__array__ (#9237)
  Add encode_cf_datetime benchmark (#9262)
  groupby, resample: Deprecate some positional args (#9236)
  Delete ``base`` and ``loffset`` parameters to resample (#9233)
  Update dropna docstring (#9257)
  Grouper, Resampler as public api (#8840)
  Fix mypy on main (#9252)
  fix fallback isdtype method (#9250)
  Enable pandas type checking (#9213)
  Per-variable specification of boolean parameters in open_dataset (#9218)
  test push
  Added a space to the documentation (#9247)
  Fix typing for test_plot.py (#9234)
  Allow mypy to run in vscode (#9239)
  Revert "Test main push"
  ...
dcherian added a commit to JoelJaeschke/xarray that referenced this pull request Jul 25, 2024
…monotonic-variable

* main: (995 commits)
  Adding `open_datatree` backend-specific keyword arguments (pydata#9199)
  [pre-commit.ci] pre-commit autoupdate (pydata#9202)
  Restore ability to specify _FillValue as Python native integers (pydata#9258)
  add backend intro and how-to diagram (pydata#9175)
  Fix copybutton for multi line examples in double digit ipython cells (pydata#9264)
  Update signature for _arrayfunction.__array__ (pydata#9237)
  Add encode_cf_datetime benchmark (pydata#9262)
  groupby, resample: Deprecate some positional args (pydata#9236)
  Delete ``base`` and ``loffset`` parameters to resample (pydata#9233)
  Update dropna docstring (pydata#9257)
  Grouper, Resampler as public api (pydata#8840)
  Fix mypy on main (pydata#9252)
  fix fallback isdtype method (pydata#9250)
  Enable pandas type checking (pydata#9213)
  Per-variable specification of boolean parameters in open_dataset (pydata#9218)
  test push
  Added a space to the documentation (pydata#9247)
  Fix typing for test_plot.py (pydata#9234)
  Allow mypy to run in vscode (pydata#9239)
  Revert "Test main push"
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants