No outbound FTP access from datahub #1789

Closed
cboettig opened this issue Sep 4, 2020 · 7 comments · Fixed by #1790

@cboettig

cboettig commented Sep 4, 2020

Scripts that require access to FTP resources, e.g. the famous Mauna Loa CO2 dataset (https://climate.nasa.gov/vital-signs/carbon-dioxide/, ftp://aftp.cmdl.noaa.gov/products/trends/co2/co2_mm_mlo.txt), time out when we attempt to download them on r.datahub.berkeley.edu. This and many other NOAA and NASA datasets still rely on FTP. Is it really blocked? Is this unavoidable, or can it be fixed?

Thanks!
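
One quick way to tell a blocked port from a slow server is to try opening a plain TCP connection to the FTP control port with a short timeout. The snippet below is only a minimal sketch: the helper name `can_connect` is hypothetical, port 21 is the standard FTP control port, and a successful connection says nothing about whether the separate data connection will also get through.

```python
import socket


def can_connect(host: str, port: int = 21, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Timeouts, refused connections, and DNS failures are all OSError subclasses.
        return False
```

For example, `can_connect("aftp.cmdl.noaa.gov")` run from the hub and from a laptop should give different answers if outbound port 21 is filtered on the hub.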

@yuvipanda
Contributor

We had disabled it mostly because nobody had asked for it, and I was trying to make sure we only have the outbound ports open that we really need.

I'll open it up. Ideally I'd like to allow only readonly anonymous FTP access, but alas I don't wanna maintain an FTP proxy...

yuvipanda referenced this issue in yuvipanda/datahub-old-fork Sep 4, 2020
@cboettig
Author

cboettig commented Sep 4, 2020

cool, totally makes sense. thanks for opening this up.

(ironically we used to use Travis-CI for our CI testing, which also blocks ftp. We switched to GH-Actions this year and I thought 'yay, we can use ftp again!')

@yuvipanda
Contributor

:D I think it won't be a bad idea for them to move off FTP, but I bet they know that already...

yuvipanda referenced this issue in yuvipanda/datahub-old-fork Sep 29, 2020
@cboettig
Author

cboettig commented Sep 8, 2021

@yuvipanda Seems we still cannot access ftp resources from data hub?

Ironically/thankfully that famous CO2 data is now on https, but loads of stuff is still on FTP, e.g. I cannot seem to access this classic sea ice data, ftp://sidads.colorado.edu/DATASETS/NOAA/G02135/north/daily/data/N_seaice_extent_daily_v3.0.csv, on r.datahub.berkeley.edu, though it works fine from my machine.

@yuvipanda
Contributor

@cboettig yeah, I think it's difficult to allow in a setup like ours because of the way FTP requires arbitrary ports to be open. Most protocols (HTTPS, SFTP, etc.) need only one well-known outbound port to be open. FTP, however, negotiates the port for its data connection only when a transfer starts, so we don't know which port needs to be open until the ftp command is run, and it's hard for us to allow it explicitly in config here. So it'll be difficult to enable FTP without opening a lot of ports.

https://www.techrepublic.com/article/how-ftp-port-requests-challenge-firewall-security/ has a little bit more information.
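
To make the port problem concrete, here is a rough sketch of how a client works out the data port in passive mode. The `227` reply format (six comma-separated numbers, with the port encoded as `p1 * 256 + p2`) comes from the FTP specification; the helper name `parse_pasv_reply` and the sample reply in the usage note are made up for illustration, and nothing in this thread says which mode the hub's clients actually use.

```python
import re
from typing import Tuple

# A passive-mode reply carries (h1,h2,h3,h4,p1,p2) somewhere in its text.
_PASV_RE = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def parse_pasv_reply(reply: str) -> Tuple[str, int]:
    """Extract the (host, port) the server wants the client to connect to."""
    match = _PASV_RE.search(reply)
    if match is None:
        raise ValueError(f"not a passive-mode reply: {reply!r}")
    h1, h2, h3, h4, p1, p2 = (int(group) for group in match.groups())
    # The data port is sent as two bytes: high byte first, then low byte.
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2
```

With a hypothetical reply such as `227 Entering Passive Mode (192,0,2,10,195,80)`, this gives host `192.0.2.10` and port `50000`. The server picks that port per transfer, which is why a firewall that only allows a fixed list of outbound ports cannot know it in advance.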

So I guess the short version is that outbound FTP is difficult in our setup, and it is currently not enabled. I'm not sure how to enable it without opening up all outbound ports. I'd love to find an alternate solution that doesn't involve opening up all outbound ports, and right now I've not had a lot of time to research them :(

An outbound FTP proxy is perhaps the solution, but looking around I didn't find any modern FTP proxies we can use.

yuvipanda reopened this Sep 9, 2021
yuvipanda changed the title from "No ftp access from datahub?" to "No outbound FTP access from datahub" Sep 9, 2021
@cboettig
Author

cboettig commented Sep 9, 2021

Thanks @yuvipanda. At least it's good to confirm I'm not crazy here. I'd just mirror copies of these data, but despite the ancient technology involved, these are mostly streaming datasets, so we need to keep hitting the FTP server to get the latest stuff. We might just throw a GitHub Action into the student repos that grabs the data... (at least unlike Travis, GitHub seems to allow FTP).

@yuvipanda
Contributor

@cboettig FTP is now available! Thank you for your patience as we worked through this :)
