[EVENT][Nov 25 – 29] OceanHackWeek (Spanish) #5052

Open
4 of 10 tasks
jnywong opened this issue Nov 11, 2024 · 9 comments
@jnywong
Member

jnywong commented Nov 11, 2024

The link to the Freshdesk ticket where this event was reported

https://2i2c.freshdesk.com/a/tickets/2426

The GitHub handle or name of the community representative

@emiliom

The date when the event will start

Monday, Nov 25

The date when the event will end

Friday, Nov 29

What hours of the day will participants be active? (e.g., 5am - 5pm US/Pacific)

6 - 12 US/Pacific, but individual use will likely continue after hours

Are we three weeks before the start date of the event?

  • Yes
  • No

Number of attendees

50

Make sure to add the event into the calendar

  • Done

Does the hub already exist?

  • Yes
  • No

The URL of the hub that will be used for the event

https://oceanhackweek.2i2c.cloud/

Will this hub be decommissioned after the event is over?

  • Yes
  • No

Task list

  • Was all the info filled in above?
  • Quotas from the cloud provider are high-enough to handle expected usage?

Definition of Done

  • All the tasks have been completed
@emiliom
Contributor

emiliom commented Nov 15, 2024

A correction:

What hours of the day will participants be active? (e.g., 5am - 5pm US/Pacific)

6 - 12 US/Pacific, but individual use will likely continue after hours

@emiliom
Contributor

emiliom commented Nov 15, 2024

@jnywong I've sent two replies to your Nov. 11 email from support@2i2c.freshdesk.com but I haven't heard back. I still don't know if email is the right avenue, or if we should be having those exchanges on a ticket system or here. So, I'll paste the exchanges here, with followups.

First I'm pasting what I submitted in my original ticket:


The second OceanHackWeek (OHW)-in-Spanish event is coming up, Nov 25-29. It follows the previous event from October (see #4883). We have some issues that have come up now or in 2023 that I'd like to address:

Environment management for R image.

  • The setup we've been using in OHW is complicated, relying on conda when an R package is available on conda-forge and Docker for packages from CRAN and GitHub. For the last two, we can't specify that dependencies be installed or upgraded automatically, so it takes a lot of extra legwork. Can you recommend improvements?
  • We're using pretty old versions of RStudio and (in the conda env for R) python and pangeo-notebook, due to previous issues. Can you provide guidance for recent versions that are known to work?
  • We ran into a problem where some packages that were already installed and used successfully in 2023 were not recognized (could not be loaded) in RStudio. We need help tracking this down.

Specifics to use with Latin American audience

  • In the Feb. 2023 event we ran into a limitation where participants from Cuba could not log in to the hub. This must be related to the US embargo, but it's not obvious that US policy dictates such a limitation. Do we have options now to ensure this doesn't happen? See [EVENT] OceanHackWeek February 27 - March 3 #2108 (comment)
  • I don't expect to need to put much more effort into Spanish localization. But, with your recent experience with hubs for Spanish speaking audiences, do you have some recommendations or best practices?

@emiliom
Contributor

emiliom commented Nov 15, 2024

Regarding the embargo and localization:

With the embargo, I do not believe there has been any progress since the previous event.

Too bad. I don't expect that we'll find a solution in the next two weeks, but it'd be really helpful if we could make some progress. Please see the brief exchange I pointed to in my request, from Feb. 2023 (starts here), especially the last comment from yuvipanda:

@emiliom indeed, we can't tell either about what falls under and what does not. Clearly GitHub is accessible, and they don't make any mention of needing OFAC permission to allow it to be accessible in Cuba. I also can't seem to find any mention in Google Cloud of network restrictions being in place. colliand has gratefully offered to chat with CS&S (our fiscal sponsors) to try figure out how to get clarity around this.

Can you comment on at least avenues for exploration of the issue with blocked access in Cuba? Has this issue not come up in 2i2c's collaboration with Metadocencia (the Catalyst project)?

For localisation, I can recommend a CI/CD workflow using GitHub and Crowdin.

Can you point me to an example of a 2i2c hub that's using this workflow, where I can see the nuts and bolts of what's being localized and how?

We can let the questions about improvements to localization sit. What we've already done covers most of what we'd want, so it's a low priority. But given 2i2c's collaboration with Metadocencia on Catalyst hubs in Latin America, I hoped that there'd be best practices already in place that we could adopt or at least examine. BTW, I was pleased to discover the materials in Spanish already created by that project for "Hub Champion training", https://catalystproject.cloud/hub-champion-training/es/. That's already helpful. Through that site I also found Catalyst hubs in Latin America that had a lot of boilerplate text on the login page already translated to Spanish.

@emiliom
Contributor

emiliom commented Nov 15, 2024

Environment management for R image

Regarding your questions about environment management, are you using repo2docker? If so, then let me know the GitHub repo and I can take a quick look at your configuration.

We're not using repo2docker. But we do have GH actions that create a R docker image (and a separate Python image) when we update the environment. https://github.com/oceanhackweek/jupyter-image/
The configurations for the R environment are in the "r" directory, specifically in r/environment.yml and r/Dockerfile; r/conda-linux-64.lock is generated from r/environment.yml

Would you still be able to help or point us in the right direction? For example, can you point us to the repo of another 2i2c hub that has a full-fledged R environment, including RStudio?

I looked around 2i2c documentation vis a vis repo2docker, and landed here:
https://docs.2i2c.org/admin/howto/environment/
I'm struck that much of what's in the 2i2c base-image repo (https://github.com/2i2c-org/2i2c-hubs-image) is quite old. For example, in the Python requirements file, pandas is pinned to 1.3.5 -- ancient. The repo itself hasn't been updated in 18 months.

Our image building workflow was first created in 2021, when we first adopted a 2i2c hub. It's evolved over time, sometimes incrementally and sometimes in bigger steps. We use the GitHub container repository rather than quay.io (quay.io is recommended in the 2i2c docs). We don't use repo2docker, due to problems encountered early on.

The OceanHackWeek technical team is discussing some of the challenges with the R environment / image, including the background that led to it, at oceanhackweek/jupyter-image#90. Feel free to chime in! In PR oceanhackweek/jupyter-image#97 we've updated Python, pangeo-notebook, the miniconda3 image and the rstudio-server package. We ran into a bunch of issues with conda and libmamba (discussed there), but I think we've resolved them. The image builds w/o errors. However, RStudio is not launching when clicking on the RStudio launcher in Jupyter Lab, in a local test. After a wait of a few seconds, we get this screen:

Image

I'll be happy if we can just get this updated image to work. I've already found a way to make it easier to manage R package dependencies.

@jnywong
Member Author

jnywong commented Nov 15, 2024

Hi @emiliom ! I have been on annual leave and on a training course for the last 2 days – continuing on the support desk is fine, but I can continue the exchange here.


Can you comment on at least avenues for exploration of the issue with blocked access in Cuba? Has this issue not come up in 2i2c's collaboration with Metadocencia (the Catalyst project)?

I'm afraid this is beyond our control, but I can ask my colleagues about other avenues. Is a VPN out of the question?

Can you point me to an example of a 2i2c hub that's using this workflow, where I can see the nuts and bolts of what's being localized and how?

Yes! You are indeed right about our collaboration with MD on The Catalyst Project and a CI/CD workflow is set up in the following repository: https://github.com/czi-catalystproject/hub-champion-training. There is documentation for the Crowdin GH action you can follow https://github.com/crowdin/github-action.

Environment management for R image

I am going to take a 30-min timebox to investigate here and will follow up in another comment. Note that 2i2c is limited in providing bespoke support for image customisation, but we are working behind the scenes to try and improve this experience.

@jnywong
Member Author

jnywong commented Nov 15, 2024

I was able to pull your most recent published R image into a 2i2c hub and did not come across a 500 error.

Image

You can see from the screenshot that the RStudio launcher is available and opens RStudio fine, and that $JUPYTER_IMAGE=ghcr.io/oceanhackweek/r:41445c1.
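
For anyone who wants to repeat this check from a notebook or terminal on the hub, here is a minimal sketch. It assumes the hub exposes the image reference in the `JUPYTER_IMAGE` environment variable, as in the screenshot. The helper name `parse_image_ref` and the fallback value are illustrative only, and the parsing is a simple heuristic that does not handle digests.

```python
import os


def parse_image_ref(ref: str) -> dict:
    """Split an image reference like 'registry/org/name:tag' into parts.

    Hypothetical helper for illustration; it does not handle '@sha256:' digests.
    """
    name, sep, tag = ref.rpartition(":")
    if not sep or "/" in tag:
        # No tag present, or the last colon belonged to a registry port.
        name, tag = ref, "latest"
    registry, _, repository = name.partition("/")
    return {"registry": registry, "repository": repository, "tag": tag}


# Assumption: JUPYTER_IMAGE is set in the user session. The fallback is the
# image reference quoted in this thread, used only so the snippet runs anywhere.
image = os.environ.get("JUPYTER_IMAGE", "ghcr.io/oceanhackweek/r:41445c1")
print(parse_image_ref(image))
```

Comparing the `tag` value against the commit you expect is a quick way to confirm which build of the image a session is actually running.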

I believe this version is missing a few upgrades that you are trying to add in oceanhackweek/jupyter-image#97 but I am unfortunately unable to follow your custom image-building setup.

Would you still be able to help or point us in the right direction? For example, can you point us to the repo of another 2i2c hub that has a full-fledged R environment, including RStudio?

I can point you in the direction of our community CryoCloud's RStudio configuration that follows our recommended repo2docker action: https://github.com/CryoInTheCloud/hub-Rstudio-image.

You can also try asking others on the Jupyter discourse and you are welcome to ask our other community members on our 2i2c Slack workspace.

@emiliom
Contributor

emiliom commented Nov 15, 2024

Thanks @jnywong ! The Catalyst hub-training repo and the CryoCloud RStudio config repo look like great resources.

Regarding the embargo issue, VPN is an option for some, but I don't think it's available to everyone. Last year I spoke with someone in Brazil who used to work with Google Cloud (I think) and also with MetaDocencia. She wondered if hosting the hub in a region outside the US (e.g., Brazil) would make a difference. This year we'll potentially have one participant from Cuba. Anyway, it's definitely not something anyone can "resolve" in the next couple of weeks, but it'll still be helpful to start scoping out options and getting more clarity.

About the R images: sorry, the published R image does work, as you saw. That's what we used in the October event. Our attempts to upgrade it are the ones that are still not working out. Thanks for the pointer to a 2i2c Slack workspace! That could be really helpful. I didn't know about it, and I can't find a link to it at either https://2i2c.org or https://docs.2i2c.org; could you send me an invitation or point me to where I can join, offline?

@jnywong
Member Author

jnywong commented Nov 18, 2024

You're welcome!

I will try my best to see how we can progress on the embargo issue, as we are aware that this affects a few of our users.

I have sent you an invitation link back on the email thread that we have on the support desk 👍

@emiliom
Contributor

emiliom commented Nov 20, 2024

@jnywong I've submitted PR #5167 with updates to the R and Python images. That includes significant upgrades to core components in the R image. Thanks to you and Angus for the help!
