OGC Disaster Pilot 2022 Sprint Meta Issue #1318

Closed
11 of 14 tasks
iameskild opened this issue Jun 8, 2022 · 7 comments
Assignees
Labels
area: documentation 📖 Improvements or additions to documentation area: user experience 👩🏻‍💻 impact: high 🟥 This issue affects most of the nebari users or is a critical issue

Comments

@iameskild
Member

iameskild commented Jun 8, 2022

Context

Quansight is conducting a sprint as part of the 2022 OGC Disaster Pilot. The aim of the sprint is to demonstrate how QHub/Nebari could be used to quickly spin up a data science platform on the cloud of choice, giving practitioners the computational tools they need to respond to a disaster. The sprint is scheduled for July.

Audience & Structure

The event will be focused on scientists and engineers in the geospatial-ocean-met space. There will be two tracks/tutorials.

  1. Tutorial demonstrating the use of the [Pangeo stack](https://pangeo.io/) on QHub/Nebari
  2. Tutorial walking through the installation of QHub/Nebari

It is expected that some participants will not want to install QHub themselves and will only be interested in learning about the Pangeo stack. The plan is to give them accounts on a hosted QHub/Nebari, most probably via ESIP or the OGC.

Currently, the plan is for a two-day sprint event. The first day starts with the two tutorials, which should take no more than the morning. After that, there will be an async mechanism for folks to ask questions as they either try out the Pangeo stack on the hosted QHub/Nebari or try installing QHub/Nebari themselves.

High Priority Issues

Installation

  • Clear working instructions on how to install QHub/Nebari on the clouds we decide to support
  • Make sure we have sane instance sizes for the clouds we use in the demo
    • Small/Large/High Mem/Cheap GPU etc
    • Is our default conda store pod size reasonable
    • Are our Dask workers configured correctly for our instance sizes
    • We may be able to use the ESIP deployment as an example.
  • Clear documentation of how to use Keycloak
    • How do groups and roles work? Is there a difference?
    • Explain special groups, i.e. currently Admin/Developer/Analyst
    • How do I add/remove people to/from QHub? Ideally this should be doable by anyone in the admin group, not just with the root password.
    • How are groups and shared folders connected?
  • conda-store
    • Fix for user-created conda environments not showing up for CDS Dashboards
    • Fix namespace clashes between filesystem conda-envs and user-created environments
    • Documentation around how to create/delete environments and namespaces, and conda-store configuration
    • Rename the default and filesystem namespaces, and explain the difference between the envs that come from git and user-created environments
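
As a rough way to sanity-check the instance-size and Dask-worker questions above, the sketch below estimates how many workers of a given size fit on a single node. Everything in it is hypothetical: the `Size` and `workers_per_node` names, the example node and worker sizes, and the overhead reserved for system pods are illustrative assumptions, not QHub/Nebari defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Size:
    cpu: float         # vCPUs
    memory_gib: float  # memory in GiB


def workers_per_node(node: Size, worker: Size,
                     reserved: Size = Size(0.5, 1.0)) -> int:
    """Return how many workers fit on one node after reserving overhead.

    `reserved` is an assumed allowance for system pods, not a measured value.
    """
    if worker.cpu <= 0 or worker.memory_gib <= 0:
        raise ValueError("worker size must be positive")
    free_cpu = node.cpu - reserved.cpu
    free_mem = node.memory_gib - reserved.memory_gib
    if free_cpu <= 0 or free_mem <= 0:
        return 0
    # The tighter of the two resources limits the worker count.
    return int(min(free_cpu // worker.cpu, free_mem // worker.memory_gib))


# Hypothetical example: a 4 vCPU / 16 GiB node and a 1 vCPU / 4 GiB worker.
node = Size(cpu=4, memory_gib=16)
worker = Size(cpu=1, memory_gib=4)
print(workers_per_node(node, worker))
```

A result that leaves most of a node idle (for example, a single worker on a large instance) suggests that the worker profile or the instance type should be adjusted.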

Demo List

@dharhas will split this out into a new issue, but at a high level we will be demonstrating:

  • stretch goals

Open Questions

  • There is a requirement to show installation capability on multiple clouds. Do we support all 4, or say that for the sprint we will only support 2?
  • Which cloud install will we use in the demo? Probably AWS, since many @rsignell-usgs demos use AWS-hosted datasets.
  • Are we comfortable with our GPU support? Preference is to
  • Do we want to refer to everything as Nebari? This is my preference.
  • Some of the datasets, like 'sentinel-1', require AWS credentials. How do we handle this in the sprint?

Out of Scope

  • Renaming of code to use Nebari is not required for this. We can explain we are in the middle of a rebranding.

References

2021 Disaster Pilot

Original Issue description from @iameskild below.

Clear deployment instructions

For the items listed below, most of these docs need to be validated / improved.

I think that although not perfect, much of the team's effort in the past few months has been to stabilize the deployment process and I think with a few improvements and updates to the docs, we're in good shape for the demo. Here are a few items that could be addressed as a stretch goal:

Demos and clear documentation for the core services

These docs can be the demos or instructions walking users through some of the core features and services.

Given that we have some "bare-bones" examples of much of the above, it might be worthwhile developing a few more complex example notebooks. These could fall into the "tutorial" section of the diataxis documentation framework.

🔴 - deemed highest priority by eskild

@iameskild iameskild added area: documentation 📖 Improvements or additions to documentation area: user experience 👩🏻‍💻 impact: high 🟥 This issue affects most of the nebari users or is a critical issue labels Jun 8, 2022
@iameskild iameskild self-assigned this Jun 8, 2022
@iameskild
Member Author

@trallard @dharhas @dhavide I have made an attempt at gathering issues that might need to be addressed as well as references to the docs that should be validated / cleaned up prior to the OGC sprint. As we have about a month remaining, not all of this can be addressed but it gives us a good place from which to start prioritizing.

@dharhas dharhas changed the title Prepare for OGC sprint OGC Disaster Pilot 2022 Sprint Meta Issue Jun 9, 2022
@dharhas
Member

dharhas commented Jun 9, 2022

@rsignell-usgs feedback welcome. I'm on vacation the next 2 weeks but will try and respond.

@dharhas
Member

dharhas commented Jun 9, 2022

Actually the infracost work #1315 needs to be part of this sprint as well.

@kcpevey
Contributor

kcpevey commented Jun 22, 2022

Updates to the above list made by @kcpevey and @iameskild; see current state below:

Clear deployment instructions

Installation

Infracost

Demos and clear documentation for the core services

Dask

Conda-Store

CDS Dashboards

KBatch

Argo (stretch-goal)

🔴 - deemed highest priority by eskild

@iameskild iameskild modified the milestone: Release v0.4.3 Jun 23, 2022
@rsignell-usgs
Contributor

I'm looking forward to understanding how to use Argo Workflows!

I think the sprint should kick off with something like the CONUS404 Demo

One thing I've found awkward is explaining to users all the things they might want to do when starting up a gateway cluster, and we have an (ugly) custom routine to do this:

import os
import sys
sys.path.append(os.path.join(os.environ['HOME'],'shared','users','lib'))
import ebdpy as ebd

profile = 'nhgf-s3'
region = 'us-west-2'
endpoint = f's3.{region}.amazonaws.com'
ebd.set_credentials(profile=profile, region=region, endpoint=endpoint)
worker_max = 1
client, cluster = ebd.start_dask_cluster(
    profile=profile, worker_max=worker_max,
    region=region, use_existing_cluster=False,
    adaptive_scaling=False, wait_for_cluster=False,
    worker_profile='Medium Worker',
    propagate_env=True)

Is there a better/simpler way?
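
One possible simplification, offered as a minimal sketch rather than a recommendation: wrap the `dask_gateway` client calls in one small helper. `Gateway.cluster_options()`, `Gateway.new_cluster()`, `cluster.scale()`, `cluster.adapt()` and `cluster.get_client()` are part of dask-gateway. The helper name `start_cluster` and the option field names `profile` and `environment_vars` are assumptions about how a given QHub/Nebari deployment exposes its cluster options, so inspect `gateway.cluster_options()` on your own deployment first. The sketch does not cover the credential handling done by `ebd.set_credentials`.

```python
import os


def start_cluster(gateway, worker_profile=None, worker_max=1,
                  adaptive=False, propagate_env=()):
    """Start a Dask Gateway cluster and return (client, cluster).

    `gateway` is expected to behave like `dask_gateway.Gateway`.
    """
    options = gateway.cluster_options()

    # Assumption: the deployment exposes a "profile" option for worker size.
    if worker_profile is not None:
        options["profile"] = worker_profile

    # Assumption: the deployment exposes an "environment_vars" mapping.
    # Only variables that are actually set locally are forwarded.
    env = {name: os.environ[name]
           for name in propagate_env if name in os.environ}
    if env:
        options["environment_vars"] = env

    cluster = gateway.new_cluster(cluster_options=options)
    if adaptive:
        cluster.adapt(minimum=0, maximum=worker_max)
    else:
        cluster.scale(worker_max)
    return cluster.get_client(), cluster


# Usage on a deployment with dask-gateway installed (not run here):
#   from dask_gateway import Gateway
#   client, cluster = start_cluster(Gateway(), worker_profile="Medium Worker")
```

Passing the gateway in as an argument keeps the helper independent of how the deployment configures its gateway address and authentication.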

@rsignell-usgs
Contributor

rsignell-usgs commented Jun 23, 2022

Regarding notebooks to use for the Sprint, do any of these look interesting?
https://gallery.pangeo.io/index.html

With slight modification (creating the Dask cluster), they should all work on QHub/Nebari.

I also tested cloning the Element 84 geo-notebooks repo and this Planetary Computer remote sensing notebook might be nice:
https://jupyter.qhub.esipfed.org/hub/user-redirect/lab/tree/shared/users/rsignell/repos/geo-notebooks/notebooks/odc-planetary-computer.ipynb

@trallard
Member

Closing as we are tracking the outstanding items elsewhere.
