Grant FTP access to Pangeo hubs #744

Closed
sgibson91 wants to merge 1 commit from the pangeo-ftp-access branch

@sgibson91 sgibson91 commented Oct 7, 2021

In #740 we were debugging FTP access for the Pangeo hub required for Ryan's class. This produced #741 for discussion around long-term fixes for how we handle network policies.

This PR represents part of the work we did to debug the FTP issue. It brings the singleuser network policies from the daskhub values file into the pangeo-hubs config file and adds the FTP port (21, TCP protocol).

Note: We have to copy the whole network policy block because Helm does not append to array items when merging values; it only overwrites them. So we must redefine everything we want to keep. This will be easier to maintain if/when we move to a permissive network policy by default with an opt-in model for restrictions on a per-hub basis, as discussed here.
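To illustrate the overwrite behaviour described in the note above, here is a minimal Python sketch that mimics how Helm combines values files: mappings are merged key by key, while lists are replaced wholesale. The `merge_values` helper and the simplified egress rules are hypothetical and only for illustration; the real daskhub defaults are different and longer, and Helm's handling of `null` values is left out.

```python
from copy import deepcopy


def merge_values(base, override):
    """Approximate Helm's values merge: dicts merge recursively,
    everything else (including lists) is replaced by the override."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical, simplified defaults (not the actual daskhub values).
base = {
    "singleuser": {
        "networkPolicy": {
            "enabled": True,
            "egress": [
                {"ports": [{"port": 80, "protocol": "TCP"}]},
                {"ports": [{"port": 443, "protocol": "TCP"}]},
            ],
        }
    }
}

ftp_rule = {"ports": [{"port": 21, "protocol": "TCP"}]}

# Overriding with only the FTP rule drops the existing rules.
only_ftp = merge_values(
    base, {"singleuser": {"networkPolicy": {"egress": [ftp_rule]}}}
)

# Redefining the whole list keeps the old rules and adds the new one.
full_list = base["singleuser"]["networkPolicy"]["egress"] + [ftp_rule]
redefined = merge_values(
    base, {"singleuser": {"networkPolicy": {"egress": full_list}}}
)

print(len(only_ftp["singleuser"]["networkPolicy"]["egress"]))
print(len(redefined["singleuser"]["networkPolicy"]["egress"]))
```

The sibling key `enabled` survives both merges because mappings are merged; only the `egress` list is replaced as a unit, which is why the whole block has to be copied.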

Note: This PR currently does not fix the FTP issue but has made some progress towards a solution.

Error message without this PR:

$ wget ftp://ftp.ncdc.noaa.gov/pub/data/uscrn/products/daily01/HEADERS.txt
--2021-10-06 14:16:23--  ftp://ftp.ncdc.noaa.gov/pub/data/uscrn/products/daily01/HEADERS.txt
           => 'HEADERS.txt'
Resolving ftp.ncdc.noaa.gov (ftp.ncdc.noaa.gov)... 205.167.25.137, 2610:20:8040:2::137
Connecting to ftp.ncdc.noaa.gov (ftp.ncdc.noaa.gov)|205.167.25.137|:21... 

Error message with this PR:

--2021-10-06 15:44:57--  ftp://ftp.ncdc.noaa.gov/pub/data/uscrn/products/daily01/HEADERS.txt
  (try: 9) => 'HEADERS.txt'
Connecting to ftp.ncdc.noaa.gov (ftp.ncdc.noaa.gov)|205.167.25.137|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /pub/data/uscrn/products/daily01 ... done.
==> SIZE HEADERS.txt ... 730
==> PASV ... couldn't connect to 205.167.25.137 port 64983: Connection timed out
Retrying.

This is because the FTP request is using passive mode, where a second connection on a random port is opened for the data transfer. berkeley-dsep-infra/datahub#2825 contains some info on this issue for me to follow up on.
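As a rough sketch of why the data port is unpredictable: in passive mode (RFC 959) the server answers `PASV` with a `227` reply containing six numbers, the last two of which encode the data port as `p1 * 256 + p2`. The reply string below is reconstructed for illustration from the host and port in the log above (it is not copied from the actual session), and `parse_pasv_reply` is a hypothetical helper.

```python
import re


def parse_pasv_reply(reply):
    """Extract (host, port) from an FTP 227 'Entering Passive Mode' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError(f"not a PASV reply: {reply!r}")
    h1, h2, h3, h4, p1, p2 = (int(group) for group in match.groups())
    # The data port is sent as two bytes: high byte first, then low byte.
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2


# Reconstructed example reply; 253 * 256 + 215 == 64983, the port wget tried.
reply = "227 Entering Passive Mode (205,167,25,137,253,215)"
host, port = parse_pasv_reply(reply)
print(host, port)
```

Because the server picks this port per transfer, a network policy that allows only port 21 lets the control connection through but blocks the data connection.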

Update: Reading this thread, it seems the only viable option is to open all outbound ports, because FTP allocates the data port at random.

@sgibson91

Superseded by #774

@sgibson91 sgibson91 closed this Oct 26, 2021
@sgibson91 sgibson91 deleted the pangeo-ftp-access branch March 3, 2022 07:15