JSON from private host addresses #4193

yzorg · 2019-09-27T20:11:08Z

Issue Summary

New JSON data source does not allow use with private host addresses. I get the error: "Can't query private addresses."

If databases can be hosted on internal DNS names why couldn't JSON data sources? Could this be made configurable?

Steps to Reproduce

1. Install Redash via Docker, which is now recommended.
2. Use JSON Data Source (new in v8 beta 2)
3. Point to JSON file hosted on local computer.

On Docker the localhost URL will look like url: http://host.docker.internal:5001/mydata/my_data.json

Expected: As a developer I can view local JSON for testing new data sources or new application URLs.
Actual: Error on the query screen, "Can't query private addresses."

Technical details:

Redash Version: 8.0 beta 2
Browser/OS: Chrome/Windows
How did you install Redash: Docker, updated docker-compose.yaml to redash/redash:8.0.0-beta.2.b29352

The error seems to come from:

redash/redash/query_runner/json_ds.py

Line 180 in 4c56900

if is_private_address(query['url']):

Comments

I've tested a local copy of the container with this raise commented out, and everything works fine. I understand in PaaS or when Redash is externally visible, this is necessary to protect internal data sources. But I'm evaluating Redash to run inside my production cluster only accessible to internal users. A core use case is to surface data internal to the cluster (PostgreSql, MongoDB, JSON, and CSV) and control it via dashboard groups and permissions. If databases can be hosted on internal DNS names why couldn't JSON data sources? Could this be made configurable?
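For context, a check of this kind can be sketched with the standard library alone. This is a minimal illustration and not Redash's actual implementation: the function name `looks_private` is hypothetical, and the sketch assumes the hostname resolves to a single IPv4 address.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def looks_private(url):
    """Return True if the URL's host resolves to a non-public address.

    Hypothetical helper for illustration; not Redash's is_private_address.
    """
    hostname = urlparse(url).hostname
    if hostname is None:
        raise ValueError("URL has no hostname: %r" % url)
    # gethostbyname returns IPv4 literals unchanged and resolves DNS names.
    resolved = socket.gethostbyname(hostname)
    address = ipaddress.ip_address(resolved)
    return address.is_private or address.is_loopback or address.is_link_local


if __name__ == "__main__":
    print(looks_private("http://10.0.0.5:5001/mydata/my_data.json"))
    print(looks_private("http://8.8.8.8/data.json"))
```

A name such as host.docker.internal would typically resolve to an address in one of the private ranges, which is why a check like this rejects it.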


arikfr · 2019-09-27T20:19:46Z

We recently had this discussion already: https://discuss.redash.io/t/error-running-query-cant-query-private-addresses/4568/.

Copying my reply over here for simplicity:

> If databases can be hosted on internal DNS names why couldn't JSON data sources?

This is to avoid people using the JSON data source to access information they are not supposed to, like AWS metadata API.

> Could this be made configurable?

Happy to accept a PR that makes this behavior configurable with an environment variable. Just note that if you disable this check, you need to trust whoever you allow running queries in your system.
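A rough sketch of what an environment-variable switch could look like follows. The variable name `EXAMPLE_ENFORCE_PRIVATE_ADDRESS_BLOCK` and the helper names are hypothetical, chosen for illustration only. The sketch assumes the check should stay enabled unless it is explicitly turned off.

```python
import os


def parse_boolean(value):
    """Interpret common truthy strings; anything else is False."""
    return value.strip().lower() in ("1", "true", "yes", "on")


def private_block_enabled(environ=os.environ):
    # EXAMPLE_ENFORCE_PRIVATE_ADDRESS_BLOCK is a hypothetical variable name.
    # The check stays on unless it is explicitly disabled.
    raw = environ.get("EXAMPLE_ENFORCE_PRIVATE_ADDRESS_BLOCK", "true")
    return parse_boolean(raw)


def check_url(url_is_private, environ=os.environ):
    """Raise if the URL is private and the block is enforced."""
    if private_block_enabled(environ) and url_is_private:
        raise ValueError("Can't query private addresses.")
```

Defaulting to "true" keeps the safe behavior for anyone who never sets the variable, so disabling the check is always a deliberate choice.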

yzorg · 2019-09-30T15:01:39Z

~~The env var change would be very simple, but I also worry about turning off security features with obscure settings.~~

I think it might be a lot clearer to have a new data source, "Unsafe Internal JSON Data Source" and the env var would enable that data source. I've only dabbled in Python, so I'm not sure how yet, but I would hope 90% or more of the implementation can be shared between the two, only disabling the private host check in the 2nd data source.

Update 2019-11-21: environment variable is simpler and easier to understand

kneufeld · 2019-11-11T17:39:05Z

This makes zero sense to me. A product designed to monitor private infrastructure can't monitor private infrastructure? I don't think it's up to redash to arbitrarily decide what's allowed to be monitored or not.

To address the security concerns:

- don't let users create queries unless they're logged in
- add an "allow internal queries" flag to the json datasource
- remote api should have authentication

If a user can access and create queries on redash then surely they can also just make random curl requests to whatever it is that you're worried about.

Please rethink this, "security" should not trump usability and as OP said, if you can query postgres et al then why not json as well?

arikfr · 2019-11-13T10:52:51Z

@yzorg

> but I also worry about turning off security features with obscure settings.

We can have proper documentation around it.

But the other option you suggested is fine as well. No need for env var, just have it in a separate file and we won't enable it by default. The implementation can definitely be shared between the two -- just add the needed configuration in the JSON one, and subclass it for the second.
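The subclassing idea could be sketched roughly as below. The class names and the `block_private_addresses` attribute are hypothetical stand-ins, not Redash's real query runner API. To keep the sketch runnable offline, `run` takes an already-resolved IP literal instead of performing an HTTP request.

```python
import ipaddress


class JSONSource:
    """Stand-in for a JSON query runner (hypothetical, not Redash's class)."""

    # Class-level switch that subclasses can override.
    block_private_addresses = True

    def run(self, host_ip):
        # host_ip is an already-resolved IP literal, to keep the sketch offline.
        address = ipaddress.ip_address(host_ip)
        if self.block_private_addresses and address.is_private:
            return None, "Can't query private addresses."
        return {"fetched_from": host_ip}, None


class UnsafeInternalJSONSource(JSONSource):
    """Same behaviour, minus the private-address check."""

    block_private_addresses = False
```

With this shape the second data source differs only in one attribute, so nearly all of the implementation is shared.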

arikfr · 2019-11-13T10:52:55Z

@kneufeld, I'm not sure we see the definition of Redash the same way. Also the solutions you suggested don't address the security issue I mentioned. Let me elaborate:

1. Your Redash instance runs on AWS infrastructure.
2. Every EC2 instance has access to AWS' metadata API. This API is not in your control nor authenticated. This API provides access to various pieces of information, including access keys to APIs you allowed this EC2 instance to have access to.
3. Your Redash users might not have the same level of access as the EC2 instance. For example, you might be using AWS SSM to get the Redash configuration, which includes the COOKIE SECRET value (the one used to encrypt cookies).

If the JSON data source didn't prevent access to internal APIs, any Redash user (with access to the JSON data source) could query the metadata API, get the instance's API keys, and access the COOKIE SECRET. Using these, they could impersonate other users in your system.
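To make the metadata case concrete: the EC2 metadata endpoint lives at 169.254.169.254, which falls in the link-local range 169.254.0.0/16. The snippet below is only an illustration of how Python's standard `ipaddress` module classifies that address. It says nothing about how Redash itself performs the check.

```python
import ipaddress

# 169.254.169.254 is the well-known EC2 instance metadata address.
metadata = ipaddress.ip_address("169.254.169.254")
link_local_range = ipaddress.ip_network("169.254.0.0/16")

in_link_local_range = metadata in link_local_range
flagged_link_local = metadata.is_link_local

print(in_link_local_range)  # True
print(flagged_link_local)   # True
```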

While you might think that:

> add an "allow internal queries" flag to the json datasource

will solve this case, it's only applicable when you can trust the admins (who can edit this configuration). It's not always the case.

> if you can query postgres et al then why not json as well?

Because with Postgres you're given explicit access to explicit resources (defined by the user role in postgres). With JSON you're given an open cheque.

kneufeld · 2019-11-13T17:21:22Z

I get your concerns but that's a big hammer for those of us who don't run in AWS and trust our coworkers.

arikfr · 2019-11-13T20:30:25Z

Maybe, but it's a really easy fix/change. A PR addressing this (in the way outlined above) is welcome.

jrm · 2020-06-19T10:46:05Z

A change to the get_response method in the BaseHTTPQueryRunner class is also needed as it is also doing the "is_private_address" check during the actual HTTP request. Is there any value in doing the same test twice?

@loganprice - has this already been spotted?

weekly-digest bot mentioned this issue Sep 30, 2019

Weekly Digest (23 September, 2019 - 30 September, 2019) #4196

Closed

loganprice mentioned this issue Apr 9, 2020

feature: add ability to make the restriction of api calls to private addresses optional #4790

Merged

6 tasks

arikfr closed this as completed Jul 1, 2020

mjmikulski mentioned this issue Nov 5, 2024

CSV data source does not work with address in the same network #7216

Closed
