NASA SMAP SSS recipe #31

Open
wants to merge 22 commits into
base: master

cisaacstern
Member

Draft PR which will close #30 when complete.

@rabernat, @jbusecke, and @hscannell: submitting this (very) rough first pass as a point for conversation around some structural questions (and a suggestion) that I've encountered so far. Interested in feedback regarding any of the below.

  1. The word "pipeline" appears in a lot of places, including the README for the staged-recipes repo, and in the title of this issue (Example pipeline for SMAP Seasurface Salinity #30).

    • Question: Do I understand correctly that this language is out-of-date, as contributors will no longer be engaging with the Prefect layer, but rather contributing recipe.pys and meta.yamls only? If so, should we open an issue to re-write the README and associated docs?
  2. Why do we have so many things labeled example in issues? What's the difference between an example and simply a recipe staged by a maintainer?

    • Related Question: Is my current directory structure correct? I have opted to make a new directory under recipes/ rather than within recipes/examples/.
  3. Suggestion: I've opted to pip install jupytext (https://jupytext.readthedocs.io/en/latest/index.html) into my staged-recipe development environment, so that I can execute my recipe.py text file line-by-line in Jupyter during development. (Without this dependency, in order to debug the recipe in Jupyter, I would've had to create a separate recipe-dev.ipynb file for development, and then copy-and-paste the relevant bits into a .py file for the PR.) What do we think about incorporating this dependency as part of the recommended contribution/development workflow?
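For context on item 3: jupytext can open a plain .py file as a notebook when the file uses the "percent" format, where `# %%` comments mark cell boundaries. The sketch below shows roughly what a recipe.py under development might look like in that format. The variable names are illustrative, and the URL pieces are borrowed from the SMAP source file discussed later in this thread.

```python
# %% [markdown]
# # NASA SMAP SSS recipe (development scratchpad)

# %%
# Cell 1: build one source URL. The date pieces are hard-coded for illustration.
base = "https://podaac-opendap.jpl.nasa.gov/opendap/allData/"
path = "smap/L3/JPL/V5.0/8day_running/2015/120/"
fname = base + path + "SMAP_L3_SSS_20150504_8DAYS_V5.0.nc"

# %%
# Cell 2: inspect the result interactively before wiring it into a recipe.
print(fname)
```

Because the cell markers are ordinary comments, the same file still runs unchanged as a normal Python script.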

Contributor

@jbusecke jbusecke left a comment


This looks cool (I am not really able to comment on the pangeo-forge specific stuff though)! Had some small nits for the description. Very excited for this data!

@@ -0,0 +1,21 @@
title: "NASA SMAP Sea Surface Salinity (SSS)"
description: "Analysis-ready Zarr datasets derived from NASA SMAP Sea Surface Salinity (SSS) NetCDF"
Contributor


I think it might be good to add that these are L3 (Level 3 gridded) data (there are also more complicated swath products). I don't think we need the NetCDF in the description?

@rabernat
Contributor

  • Do I understand correctly that this language is out-of-date, as contributors will no longer be engaging with the Prefect layer, but rather contributing recipe.pys and meta.yamls only? If so, should we open an issue to re-write the README and associated docs?

Yes to all of the above.

2. Why do we have so many things labeled example in issues? What's the difference between an example and just, a recipe staged by a maintainer?

We are gradually moving from collecting hypothetical use cases to actual recipes. I would update this label to be "proposed recipe".

  • Is my current directory structure correct?

Yes, it's fine. The current CI workflow (#28) will search for meta.yaml anywhere in the PR.
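To illustrate why the directory choice does not matter: a recursive search finds meta.yaml wherever it sits. The sketch below is not the actual CI code from #28. `find_meta_yamls` is a hypothetical helper, and the directory tree it searches is a temporary stand-in.

```python
import tempfile
from pathlib import Path
from typing import List


def find_meta_yamls(root: Path) -> List[Path]:
    """Return every meta.yaml under `root`, at any depth, in sorted order."""
    return sorted(root.rglob("meta.yaml"))


# Build a throwaway tree mimicking two possible recipe locations.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for sub in ("recipes/nasa-smap-sss", "recipes/examples/other"):
        (root / sub).mkdir(parents=True)
        (root / sub / "meta.yaml").write_text("title: placeholder\n")

    found = [p.relative_to(root).as_posix() for p in find_meta_yamls(root)]

print(found)
```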

Going forward, I think we want to make the repo as simple, bare-bones, and self-explanatory as possible. Feel free to propose changes in this direction.

3. What do we think about incorporating this dependency as part of the recommended contribution/development workflow?

👍

@rabernat
Contributor

Why don't we open a new issue to track the improvements needed to the contributor workflow?

@cisaacstern
Member Author

The recipe-dict in nasa-smap-sss/recipe.py now appears to contain valid recipes for all four datasets (JPL and RSS, each at both timescales).

As I move now into the (manual, notebook-based) execution phase, I will echo that the feature(s) discussed in pangeo-forge/pangeo-forge-recipes#97 and pangeo-forge/pangeo-forge-recipes#136 would presumably be useful even in manual execution settings.

My workaround was to estimate the source sizes as follows:

import numpy as np
import xarray as xr

for store in list(urls): # `urls` is a dictionary mapping of 'store_name' : source_url
    ds = xr.open_dataset(urls[store][10]) # an arbitrary source file from each dataset
    gbs = ds.nbytes/1e9
    total_gbs = len(urls[store]) * gbs
    print(f"{store} contains approx. {np.trunc(total_gbs)} GBs.")

which returns:

NASA-SMAP-SSS/JPL/8day contains approx. 81.0 GBs.
NASA-SMAP-SSS/JPL/monthly contains approx. 2.0 GBs.
NASA-SMAP-SSS/RSS/8day contains approx. 110.0 GBs.
NASA-SMAP-SSS/RSS/monthly contains approx. 3.0 GBs.

Based on this information, I decided to start by trying to execute the (considerably smaller) monthly recipes only, using as reference the notebook Ryan has used to manually execute an eNATL60 recipe (see #24 (comment)). The notebook is not currently linkable in full as it contains secrets.

On the execution cell

for recipe_key, r in recipes.items():
    if 'monthly' in recipe_key:
        try:
            # If the target store opens, treat the recipe as already built
            r.open_target()
            print(f"found {recipe_key}")
        except Exception:
            # Opening the target failed, so build and execute the recipe
            print(f"RUNNING {recipe_key}")
            pl = r.to_pipelines()
            plan = executor.pipelines_to_plan(pl)
            executor.execute_plan(plan)

I encountered the following errors:

I do not expect these issues will be diagnosable without the full notebook context, but I'm logging this in outline form here as a touchpoint nonetheless. Ryan and I will be discussing synchronously on Monday, after which I will follow up on this thread with any generalizable takeaways.

@rabernat
Contributor

Charles, yesterday we boiled this error down to a specific issue with fsspec. Would you mind sharing that code snippet here?

@cisaacstern
Member Author

Yes, the error was being thrown by line 40 in storage.py here. The minimal example below recreates the error using fsspec.open() alone. (Traceback is included immediately below the example.)

As suggested in fsspec/filesystem_spec#160 (comment), I was able to resolve this error by setting fsspec_open_kwargs = {'block_size': 0} when instantiating the recipe here. (In the minimal example, uncommenting open_kwargs achieves the same end.)

@martindurant, my lingering questions are:

  1. Is setting fsspec_open_kwargs = {'block_size': 0} indeed your recommended solution to this problem? Or have I overlooked some disadvantage of this solution?
  2. You note in the above-linked comment that this error arises when "fsspec would like to be able to random access the file by issuing Range requests, but the server doesn't respect this". Does the Traceback below stem from the same circumstance?
  3. If so, is there any way to anticipate which source file servers will struggle in this way?
  4. Should I link this report to any ongoing fsspec Issues?

from contextlib import contextmanager
from typing import Any, Iterator

import fsspec

# fsspec doesn't provide type hints, so I'm not sure what the right type is for open files
OpenFileType = Any

@contextmanager
def _fsspec_safe_open(fname: str, **kwargs) -> Iterator[OpenFileType]:
    # workaround for inconsistent behavior of fsspec.open
    # https://github.com/intake/filesystem_spec/issues/579
    with fsspec.open(fname, **kwargs) as fp:
        with fp as fp2:
            yield fp2

base = 'https://podaac-opendap.jpl.nasa.gov/opendap/allData/'
fname = base + 'smap/L3/JPL/V5.0/8day_running/2015/120/SMAP_L3_SSS_20150504_8DAYS_V5.0.nc'

# open_kwargs = {'block_size': 0}

input_opener = _fsspec_safe_open(fname, mode="rb") #, **open_kwargs)

BLOCK_SIZE=10_000_000

with input_opener as source:
    data = source.read(BLOCK_SIZE)

Traceback:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-efd8fb2f2463> in <module>
     25 
     26 with input_opener as source:
---> 27     data = source.read(BLOCK_SIZE)

~/.pyenv/versions/anaconda3-2019.10/envs/pangeo-forge3.8/lib/python3.8/site-packages/fsspec/implementations/http.py in read(self, length)
    482         else:
    483             length = min(self.size - self.loc, length)
--> 484         return super().read(length)
    485 
    486     async def async_fetch_all(self):

~/.pyenv/versions/anaconda3-2019.10/envs/pangeo-forge3.8/lib/python3.8/site-packages/fsspec/spec.py in read(self, length)
   1447             # don't even bother calling fetch
   1448             return b""
-> 1449         out = self.cache._fetch(self.loc, self.loc + length)
   1450         self.loc += len(out)
   1451         return out

~/.pyenv/versions/anaconda3-2019.10/envs/pangeo-forge3.8/lib/python3.8/site-packages/fsspec/caching.py in _fetch(self, start, end)
    374         ):
    375             # First read, or extending both before and after
--> 376             self.cache = self.fetcher(start, bend)
    377             self.start = start
    378         elif start < self.start:

~/.pyenv/versions/anaconda3-2019.10/envs/pangeo-forge3.8/lib/python3.8/site-packages/fsspec/asyn.py in wrapper(*args, **kwargs)
     70     def wrapper(*args, **kwargs):
     71         self = obj or args[0]
---> 72         return sync(self.loop, func, *args, **kwargs)
     73 
     74     return wrapper

~/.pyenv/versions/anaconda3-2019.10/envs/pangeo-forge3.8/lib/python3.8/site-packages/fsspec/asyn.py in sync(loop, func, timeout, *args, **kwargs)
     51     event.wait(timeout)
     52     if isinstance(result[0], BaseException):
---> 53         raise result[0]
     54     return result[0]
     55 

~/.pyenv/versions/anaconda3-2019.10/envs/pangeo-forge3.8/lib/python3.8/site-packages/fsspec/asyn.py in _runner(event, coro, result, timeout)
     18         coro = asyncio.wait_for(coro, timeout=timeout)
     19     try:
---> 20         result[0] = await coro
     21     except Exception as ex:
     22         result[0] = ex

~/.pyenv/versions/anaconda3-2019.10/envs/pangeo-forge3.8/lib/python3.8/site-packages/fsspec/implementations/http.py in async_fetch_range(self, start, end)
    544                         cl += len(chunk)
    545                         if cl > end - start:
--> 546                             raise ValueError(
    547                                 "Got more bytes so far (>%i) than requested (%i)"
    548                                 % (cl, end - start)

ValueError: Got more bytes so far (>15252381) than requested (15242880)
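
The numbers in the final ValueError are consistent with a server that ignores the Range header. The requested span of 15242880 bytes equals the 10_000_000 bytes passed to read() plus 5 * 2**20 bytes, which is assumed here to be fsspec's default read-ahead block size. The server then sent more than that. The sketch below mimics the check offline with a fake in-memory server. `fake_server_ignoring_range` and `fetch_range` are stand-ins rather than fsspec functions, and FILE_SIZE is a made-up value.

```python
BLOCK_SIZE = 10_000_000   # bytes requested by source.read() in the example
READ_AHEAD = 5 * 2**20    # assumed default cache block size (5 MiB)
FILE_SIZE = 20_000_000    # hypothetical size of the remote file


def fake_server_ignoring_range(start, end, chunk=1_000_000):
    """Yield the whole file in chunks, ignoring the requested byte range."""
    sent = 0
    while sent < FILE_SIZE:
        n = min(chunk, FILE_SIZE - sent)
        yield b"\0" * n
        sent += n


def fetch_range(start, end):
    """Collect a response body, failing if it exceeds the requested range."""
    received = 0
    parts = []
    for part in fake_server_ignoring_range(start, end):
        parts.append(part)
        received += len(part)
        if received > end - start:
            raise ValueError(
                "Got more bytes so far (>%i) than requested (%i)"
                % (received, end - start)
            )
    return b"".join(parts)


requested = BLOCK_SIZE + READ_AHEAD
print(requested)

message = None
try:
    fetch_range(0, requested)
except ValueError as err:
    message = str(err)
    print(message)
```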

@cisaacstern
Member Author

Noting that the PR referenced in the last commit is actually pangeo-forge/roadmap#22, not the one linked in the commit message.

@cisaacstern cisaacstern marked this pull request as ready for review May 19, 2021 23:44
@cisaacstern
Member Author

@sharkinsspatial, this is ready to be test-run through the bakery.

I've already manually executed the copy_pruned() versions of all the recipes contained in this PR's dict_object to Pangeo's OSN bucket. The plot below was created with this code block (credentials omitted, of course) at the bottom of the notebook.

Will there soon be a slash command that allows us to do a "test-bake" on the pruned subsets? (Apologies if the timeline on this was obvious from our other threads, still wrapping my head around all the layers here.)

cc @jbusecke, getting close!

image

@martindurant

Is setting fsspec_open_kwargs = {'block_size': 0} indeed your recommended solution to this problem?

This is saying "I want to view the whole file as a block" and will work fine. Really, the code is doing fs.get (not open/read), which should always do the right thing and also allow concurrent fetches.

You note in the above-linked comment that this error arises when "fsspec would like to be able to random access the file by issuing Range requests, but the server doesn't respect this". Does the Traceback below stem from the same circumstance?

Yes, probably. It is marginally possible (but not likely) that the server is not respecting the content encoding. The response header would have more information.

If so, is there any way to anticipate which source file servers will struggle in this way?

I'm afraid not. The HTTP response to HEAD or GET (before starting to download) might have useful markers, but this already depends on the server being well-behaved. Essentially, none of the header info keys are strictly required.

Should I link this report to any ongoing fsspec Issues?

There have certainly been ongoing conversations around this kind of thing, and the range of circumstances that fsspec can handle has steadily grown.
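
As a rough illustration of the header markers mentioned above, a pre-flight check could look at `Accept-Ranges` and `Content-Length` in a HEAD response. This is only a heuristic sketch: `guess_range_support` is a hypothetical helper, the header dicts are made up, and a misbehaving server can advertise anything.

```python
def guess_range_support(headers):
    """Guess from response headers whether byte-range reads are likely to work.

    Returns "likely", "unlikely", or "unknown". Header names are matched
    case-insensitively, since HTTP header names are not case-sensitive.
    """
    lowered = {k.lower(): v.strip().lower() for k, v in headers.items()}
    accept = lowered.get("accept-ranges")
    if accept == "bytes" and "content-length" in lowered:
        return "likely"
    if accept == "none":
        return "unlikely"
    return "unknown"


well_behaved = guess_range_support(
    {"Accept-Ranges": "bytes", "Content-Length": "15252381"}
)
refuses = guess_range_support({"Accept-Ranges": "none"})
silent = guess_range_support({"Content-Type": "application/x-netcdf"})
print(well_behaved, refuses, silent)
```

Even a "likely" answer is only a hint, which is why the block_size workaround may still be needed.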

@rabernat
Contributor

The plot below was created with this code block (credentials omitted, of course)

The OSN bucket is public for read-only access. You can access it over the s3 protocol with anon=True (see my OSN guide) or even over HTTP via `https://ncsa.osn.xsede.org/Pangeo/...`

@rabernat
Contributor

I'm afraid not. The HTTP response to HEAD or GET (before starting to download) might have useful markers, but this already depends on the server being well-behaved. Essentially, none of the header info keys are strictly required.

Then let's try to explicitly catch this error in Pangeo forge and raise a detailed error message with the suggested workaround.
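
One possible shape for this, sketched with a stand-in reader rather than the real pangeo-forge-recipes code: catch the ValueError, match on the message text seen in the traceback above, and re-raise with the workaround appended. `read_with_hint` and `misbehaving_read` are hypothetical names.

```python
HINT = (
    "The server returned more bytes than the requested range. It may not "
    "support HTTP Range requests. Try fsspec_open_kwargs={'block_size': 0}."
)


def read_with_hint(read, length):
    """Call `read(length)`; add a workaround hint to the known range error."""
    try:
        return read(length)
    except ValueError as err:
        if "Got more bytes so far" in str(err):
            raise ValueError(f"{err}\n{HINT}") from err
        raise


def misbehaving_read(length):
    """Stand-in for a file read against a server that ignores Range."""
    raise ValueError("Got more bytes so far (>15252381) than requested (15242880)")


detailed = None
try:
    read_with_hint(misbehaving_read, 10_000_000)
except ValueError as err:
    detailed = str(err)
    print(detailed)
```

Matching on message text is brittle, so this would need revisiting if fsspec changes the wording.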

@sharkinsspatial
Contributor

/run-recipe-test

@sharkinsspatial
Contributor

@cisaacstern Can you include a pangeo_notebook_version at the root of your meta.yaml? You can use this as an example: https://github.com/pangeo-forge/staged-recipes/pull/36/files#diff-743ac37f3dbeb14ebdd6b873ade997238195d5652d365a37c52358662b001c6dR4. We use this to pin the image used by our bakery workers.

@sharkinsspatial
Contributor

@cisaacstern As a note: in the short interim while we wait for a release of pangeo-forge-recipes that includes copy_pruned, I'll register these recipes with the CI workflow and attempt to run one of the smaller monthly recipes for validation.

@sharkinsspatial
Contributor

/run-recipe-test

@sharkinsspatial
Contributor

/run-recipe-test

@sharkinsspatial
Contributor

/run-recipe-test

@sharkinsspatial
Contributor

/run-recipe-test

@sharkinsspatial
Contributor

@cisaacstern https://github.com/sharkinsspatial/zarr_examples/blob/main/nasa-smap-sss-jpl-monthly.ipynb 🎊

@cisaacstern
Member Author

@jbusecke, the first two timesteps of each of the four datasets (two time intervals for each of two algorithms) are available on OSN as follows:

import s3fs
endpoint_url = 'https://ncsa.osn.xsede.org'
fs_osn = s3fs.S3FileSystem(anon=True, client_kwargs={'endpoint_url': endpoint_url},)

fs_osn.ls("Pangeo/pangeo-forge/NASA-SMAP-SSS/JPL")
['Pangeo/pangeo-forge/NASA-SMAP-SSS/JPL/8day_pruned.zarr',
 'Pangeo/pangeo-forge/NASA-SMAP-SSS/JPL/monthly_pruned.zarr']
fs_osn.ls("Pangeo/pangeo-forge/NASA-SMAP-SSS/RSS")
['Pangeo/pangeo-forge/NASA-SMAP-SSS/RSS/8day_pruned.zarr',
 'Pangeo/pangeo-forge/NASA-SMAP-SSS/RSS/monthly_pruned.zarr']
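
Since the bucket is public, the same stores should also be reachable over plain HTTP by prefixing a listed path with the endpoint URL. A small sketch follows. `osn_http_url` is a hypothetical helper and not part of s3fs.

```python
endpoint_url = "https://ncsa.osn.xsede.org"


def osn_http_url(bucket_path):
    """Turn a bucket path as returned by fs_osn.ls() into an HTTP URL."""
    return f"{endpoint_url}/{bucket_path.lstrip('/')}"


url = osn_http_url("Pangeo/pangeo-forge/NASA-SMAP-SSS/JPL/monthly_pruned.zarr")
print(url)
```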

@sharkinsspatial, were the complete time series ever built by the bakery, and if so are they publicly accessible somewhere?

@rabernat
Contributor

rabernat commented Mar 3, 2022

Could we try re-running this recipe in our latest infrastructure?

@cisaacstern
Member Author

Yes, I'll change the bakery in meta.yaml; once committed, that will signal the bot to create a new recipe run for this.

@pangeo-forge-bot

It looks like your meta.yaml does not conform to the specification.

4 validation errors for MetaYaml
recipes -> 0 -> id
  field required (type=value_error.missing)
recipes -> 0 -> object
  field required (type=value_error.missing)
recipes
  value is not a valid dict (type=type_error.dict)
maintainers -> 0 -> orcid
  field required (type=value_error.missing)

Please correct your meta.yaml and commit the corrections to this PR.

@cisaacstern
Member Author

The bot doesn't understand dict_objects yet... I'm going to see if I can quickly fix that...

@cisaacstern
Member Author

A-ha. So it remains true that the bot does not understand dict_objects, but the validation error we see in #31 (comment) is actually because our meta.yaml currently gives

recipes:
   - dict_object: "recipe:recipes"

when it should be a simple mapping (rather than a list), i.e.

recipes:
   dict_object: "recipe:recipes"

I've noted this issue in pangeo-forge/roadmap#49
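
To make the difference concrete, here is how the two YAML snippets look once parsed, with a toy check that mirrors the bot's complaint. `check_recipes_section` is a hypothetical function and not the bot's validator. The accepted shapes are inferred from the validation errors above.

```python
# Parsed form of the list version currently in meta.yaml.
as_list = [{"dict_object": "recipe:recipes"}]
# Parsed form of the mapping version.
as_mapping = {"dict_object": "recipe:recipes"}


def check_recipes_section(recipes):
    """Return a list of problems found in a parsed `recipes` section."""
    if isinstance(recipes, dict):
        return [] if "dict_object" in recipes else ["mapping lacks dict_object"]
    problems = []
    for i, item in enumerate(recipes):
        for field in ("id", "object"):
            if field not in item:
                problems.append(f"recipes -> {i} -> {field}: field required")
    return problems


list_problems = check_recipes_section(as_list)
mapping_problems = check_recipes_section(as_mapping)
print(list_problems)
print(mapping_problems)
```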

@andersy005
Member

pre-commit.ci autofix

@jbusecke
Contributor

Thanks for working on this @andersy005. Let me know if you need any input from my side!

Comment on lines +179 to +187
recipes = {
    list(patterns)[i]: (
        XarrayZarrRecipe(
            patterns[list(patterns)[i]],
        )
    )
    for i in range(4)
}
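
For what it's worth, the comprehension above is equivalent to iterating over patterns.items() directly, which avoids the hard-coded range(4). The sketch below uses stand-ins: `FakeRecipe` replaces XarrayZarrRecipe and the pattern values are placeholder strings, so it runs without pangeo-forge-recipes installed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeRecipe:
    """Stand-in for XarrayZarrRecipe; it only stores the pattern it is given."""

    pattern: str


# Placeholder patterns keyed by the four dataset names used in this PR.
patterns = {
    "NASA-SMAP-SSS/JPL/8day": "jpl-8day-pattern",
    "NASA-SMAP-SSS/JPL/monthly": "jpl-monthly-pattern",
    "NASA-SMAP-SSS/RSS/8day": "rss-8day-pattern",
    "NASA-SMAP-SSS/RSS/monthly": "rss-monthly-pattern",
}

# Index-based form, as in the diff.
by_index = {
    list(patterns)[i]: FakeRecipe(patterns[list(patterns)[i]]) for i in range(4)
}

# Equivalent form that adapts to however many patterns there are.
by_items = {key: FakeRecipe(pattern) for key, pattern in patterns.items()}

print(by_index == by_items)
```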

Member


@jbusecke, i could use your help here. i'm not very familiar with recipes that define multiple recipes within a dict, and for reasons i don't understand yet, the backend seems to think this recipe contains errors that result in an infinite loop when synchronizing this recipe with the database. unfortunately, the error message returned is opaque.

can you take a look at this, and let me know if anything stands out? is this dict well defined? thank you

Member


i'm planning to post the error message here later today

Member


@jbusecke, here's the traceback, which seems to hint at a missing key in the recipes dict. is the multi-recipe approach within the same feedstock documented somewhere? I couldn't find anything in the documentation.

Oct 27 15:07:57 pangeo-forge-api-prod app/web.1
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/uvicorn/protocols/http/h11_impl.py", line 373, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/usr/local/lib/python3.9/dist-packages/uvicorn/middleware/proxy_headers.py", line 75, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.9/dist-packages/fastapi/applications.py", line 208, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.9/dist-packages/starlette/applications.py", line 112, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.9/dist-packages/starlette/middleware/errors.py", line 181, in __call__
    raise exc
  File "/usr/local/lib/python3.9/dist-packages/starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.9/dist-packages/starlette/middleware/cors.py", line 84, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.9/dist-packages/starlette/exceptions.py", line 82, in __call__
    raise exc
  File "/usr/local/lib/python3.9/dist-packages/starlette/exceptions.py", line 71, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.9/dist-packages/starlette/routing.py", line 656, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.9/dist-packages/starlette/routing.py", line 259, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.9/dist-packages/starlette/routing.py", line 64, in app
    await response(scope, receive, send)
  File "/usr/local/lib/python3.9/dist-packages/starlette/responses.py", line 159, in __call__
    await self.background()
  File "/usr/local/lib/python3.9/dist-packages/starlette/background.py", line 35, in __call__
    await task()
  File "/usr/local/lib/python3.9/dist-packages/starlette/background.py", line 18, in __call__
    await self.func(*self.args, **self.kwargs)
  File "/opt/app/pangeo_forge_orchestrator/routers/github_app.py", line 783, in synchronize
    new_models = [
  File "/opt/app/pangeo_forge_orchestrator/routers/github_app.py", line 785, in <listcomp>
    recipe_id=recipe["id"],
KeyError: 'id'

Co-authored-by: Julius Busecke <julius@ldeo.columbia.edu>
@jbusecke
Contributor

jbusecke commented Nov 2, 2022

/run NASA-SMAP-SSS/RSS/monthly

@pangeo-forge
Contributor

pangeo-forge bot commented Nov 2, 2022

🎉 The test run of NASA-SMAP-SSS/RSS/monthly at e10df6b succeeded!

import xarray as xr

store = "https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/test/pangeo-forge/staged-recipes/recipe-run-1366/NASA-SMAP-SSS/RSS/monthly.zarr"
ds = xr.open_dataset(store, engine='zarr', chunks={})
ds

@jbusecke
Contributor

jbusecke commented Nov 2, 2022

Seems like the data is not properly concatenated in time. There is a time dimension, but the data variables themselves have no time dimension?

image
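
A quick way to confirm this without reading the repr is to list the data variables whose dimensions lack time. The sketch works on a plain name-to-dims mapping so it runs anywhere. With the dataset opened above, one could build that mapping as {name: var.dims for name, var in ds.data_vars.items()}. The variable names and dims below are made up, and `vars_missing_dim` is a hypothetical helper.

```python
def vars_missing_dim(dims_by_var, dim="time"):
    """Return the names of variables whose dimensions do not include `dim`."""
    return sorted(name for name, dims in dims_by_var.items() if dim not in dims)


# Hypothetical layout resembling the symptom: variables are only (lat, lon).
dims_by_var = {
    "sss": ("lat", "lon"),
    "sss_uncertainty": ("lat", "lon"),
}

missing = vars_missing_dim(dims_by_var)
print(missing)
```

An empty list would indicate that every variable was concatenated along time as intended.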

@jbusecke
Contributor

jbusecke commented Nov 2, 2022

/run NASA-SMAP-SSS/JPL/8day

Successfully merging this pull request may close these issues.

Example pipeline for SMAP Seasurface Salinity
7 participants