Skip to content
This repository has been archived by the owner on Nov 30, 2022. It is now read-only.

Move dependencies declared in Dockerfile to more "proper" location #744

Closed
adamsachs opened this issue Jun 28, 2022 · 6 comments · Fixed by #807
Closed

Move dependencies declared in Dockerfile to more "proper" location #744

adamsachs opened this issue Jun 28, 2022 · 6 comments · Fixed by #807
Labels
enhancement New feature or request

Comments

@adamsachs
Copy link
Contributor

adamsachs commented Jun 28, 2022

Is your feature request related to a specific problem?

Per this discussion on a fidesops-plus PR, we should not maintain dependencies and their versions within our Dockerfile (see Dockerfile#L48-L49).

Describe the solution you'd like

Add these dependencies to requirements.txt so that, as @PSalant726 says, it "will both be more idiomatic, and allow us to better take advantage of image layer caching."

Describe alternatives you've considered, if any

Perhaps there is a reason that these dependencies weren't added to requirements.txt to begin with? If so, we should look to resolve, because declaring dependencies within the Dockerfile seems likely to cause problems in the long run, specifically when thinking of building extensions to the core app, e.g. fidesops-plus.

Additional context

This came up when i copied these lines into a fidesops-plus Dockerfile, and @PSalant726 flagged that as a problematic form of dependency management.

@adamsachs adamsachs added the enhancement New feature or request label Jun 28, 2022
@sanders41
Copy link
Contributor

sanders41 commented Jun 28, 2022

You are referring to this?

RUN pip install -U pip  \
    && pip install 'cryptography~=3.4.8' \
    && pip install snowflake-connector-python --no-use-pep517  \
    && pip install -r requirements.txt -r dev-requirements.txt

RUN if [ "$SKIP_MSSQL_INSTALLATION" != "true" ] ; then pip install -U pip -r mssql-requirements.txt ; fi

For cryptography I think we can just remove that line. I am pretty sure it gets installed by python-jose anyways, and 3.4.8 is a super old version. The current version is 37.0.2.

For snowflake and mssql we could make them extras in setup.py. However in the Dockerfile I am thinking they my still want to be installed the way they currently are, or a modified version of this, to take advantage of caching. Making them extras and installing them at the pip install -e . step means the dependencies would be reinstalled every time rather than using cache.

@adamsachs
Copy link
Contributor Author

adamsachs commented Jun 28, 2022

those are the lines i was referring to.

just to clarify, the main problem i see is maintainability and extensibility - when we build a -plus image that extends the core fidesops app, there's no guarantee we'll pick up these dependencies unless we manually ensure we're keeping our -plus Dockerfiles in line with what's here, which seems less than ideal. but perhaps that's a tradeoff we're willing to make.

i'll let @PSalant726 comment further, though, because he brought up the caching concern, and seemed to think that adding to requirements.txt rather than maintaining it in the Dockerfile would improve caching.

@sanders41
Copy link
Contributor

What I think we should do is remove the cryptography line, and make snowflake and mssql extras (even if this ends up meaning a requirements file just for snowflake). Then the fidesops Dockerfile can still install them early for caching, but with no hard coded versions, and fidesops-plus can do what it needs with them.

@PSalant726 does this accomplish what you are thinking?

@pattisdr
Copy link
Contributor

pattisdr commented Jun 28, 2022

Just wanted to link to why a few of these lines were in the Dockerfile to begin with: #56 (See Notes section), although it's definitely possible there are other ways to do this.

@sanders41
Copy link
Contributor

Thanks for the context @pattisdr, I have a couple ideas for this but not sure if they will work. One more question for you on this, I think this means snowflake-connector-python is always needed and would go in requirements.txt rather than as an extra (assuming it works) correct?

@pattisdr
Copy link
Contributor

Well it was needed for snowflake-sqlachemy which we use to connect and run queries against snowflake databases, so I think it would be always needed. It might be worth revisiting if we can install snowflake-sqlalchemy without this now though, this was seven months ago.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants