Feature Request/Idea: provide a Dataverse container image #8934

Closed
poikilotherm opened this issue Aug 24, 2022 · 2 comments
Component: Containers Anything related to cloudy Dataverse, shipped in containers.


@poikilotherm
Contributor

poikilotherm commented Aug 24, 2022

Overview of the Feature Request

Building on #8932, this issue is about creating an upstream-provided (but not supported) container image that is usable out of the box.
Because IQSS does not use containers itself, development, maintenance and support are a community-based effort. (Which basically means you might end up on your own one day.)

This is intended to be a very thin layer, merely a type of packaging and shipping the application, without any additional business logic added. (It either needs to be done by the application or by other components/sidecars/jobs/humans/...)

In scope for this image:

  • Provide the actual Dataverse deployment
  • Provide JHove configuration files

Out of scope:

  • Fancy scripts to modify stuff. Build your own by inheriting or sidecars.
  • Counter processor: never add a Python installation in a Java container. Do sidecars, small sizes and distinct tasks.
  • Managing deployments of API based configuration (e.g. authentication providers). That's the job of the application or an operator

What kind of user is the feature intended for?

Sysadmin, Developers

What inspired the request?

Having the need of using containers for:

  • Quicker setups of development environments
  • Doing integration testing with @testcontainers in the future
  • Running an instance in production/staging/demo

What existing behavior do you want changed?

Currently there is no "ready to use" container that serves multiple use cases from the same single source while following container good practices.

Any brand new behavior do you want to add to Dataverse?

Provide a usable container out of the box.

Any related open or closed issues to this feature request?

#5292 #4665 #8709 #8250

This will also rely on a set of other features getting included:

#7000 (PRs in place but needs more coverage) #7424 (more here) and of course #8932

Related folks to ping
@carlsonp @4tikhonov @pameyer @pdurbin @beepsoft @Kris-LIBIS

poikilotherm added the "Component: Containers" label on Aug 24, 2022
poikilotherm self-assigned this on Aug 24, 2022
@carlsonp
Contributor

This is great. Here are some notes from our collective meeting on 8/31/22.

  • Desire to use multi-stage builds for the base Dataverse container to keep resulting image size down
  • Separation of services is important, use docker-compose
  • Long-term, swap the Ubuntu base image used for Dataverse for a Java-based one, see "Feature Request/Idea: create a base container image providing a Dataverse-tuned Payara application server" (#8932) and the in-progress docs. That work only "prepares" the base; Dataverse would need to be added later with init scripts and such
  • Pull requests for the basic infrastructure already exist, hope to have those merged in a few months
  • Use environment variables for configuration; keep logic and customization injection out of the container setup. See the Airflow configuration and the environment file example. Look into Traefik rules and TOML configuration
  • Keep business-logic out of the containers whenever possible
  • Helm chart would get us most of the way, operators would be "nice to have" later
  • Try to move community containerization efforts back into core Dataverse git repo
  • Not just for production usage, helpful for developers as well on "tricky" platforms such as Windows. WSL2 is great, recommend that in documentation.
  • Need backwards compatibility with more "traditional" installations that don't use Docker/Containers
  • Multi-architecture support is a requirement (AMD64, ARM64, etc.)
  • Make sure upgrades and version changes work well and don't break stuff in containers and persisted data. Changes to the database structure are handled by logic inside the .war files
  • Slava provided Helm chart examples from Google Cloud, not publicly available right now
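
To illustrate the "environment variables for configuration" note above, here is a minimal sketch of how a dotted setting name could be resolved from the container environment. It assumes a MicroProfile Config-style name mapping (try the exact name, then the name with non-alphanumeric characters replaced by `_`, then that form in upper case). The helper names `env_candidates` and `get_setting` and the property names `dataverse.db.host` and `dataverse.db.port` are made up for this example and are not taken from the Dataverse code base.

```python
import os
import re
from typing import List, Mapping, Optional


def env_candidates(prop: str) -> List[str]:
    """List the environment variable names to try for a dotted property name.

    Assumed order: exact name, sanitized name, upper-cased sanitized name.
    """
    sanitized = re.sub(r"[^A-Za-z0-9_]", "_", prop)
    return [prop, sanitized, sanitized.upper()]


def get_setting(
    prop: str,
    default: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the first matching environment value, or the default if none is set."""
    for name in env_candidates(prop):
        if name in env:
            return env[name]
    return default


if __name__ == "__main__":
    # Hypothetical environment, as it might come from an env file or docker-compose.
    fake_env = {"DATAVERSE_DB_HOST": "postgres"}
    print(get_setting("dataverse.db.host", env=fake_env))
    print(get_setting("dataverse.db.port", default="5432", env=fake_env))
```

With lookups like this, the same image can be pointed at a different database or hostname purely through its environment, without any setup scripts baked into the container.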

Thank you so much all!

@poikilotherm
Contributor Author

Closing in favor of #9434
