Wage war on "latest" tag? #326

Open
matte21 opened this issue Mar 15, 2024 · 1 comment

matte21 commented Mar 15, 2024

DSB's primary purpose is research. In research, reproducibility of results is important. Thus, DSB should make it easy to run reproducible experiments.

Yet the latest tag is widely used in the repo's references to container images. For example, it's used both as the tag for base images in Dockerfiles and in the YAML manifests for K8s (I linked only two occurrences here, but there are many more).

This makes it hard to write deterministic experiments. Consider the following scenario:

Time 0: A researcher runs an experiment with DSB on a brand new K8s cluster. The experiment uses container C with the latest tag (for example, there are currently init containers that use alpine:latest). The latest tag for C corresponds to v1, so the experiment runs with C at v1. The researcher submits a paper with the experiment's results.
Time 1: The developers of C release v2, so the latest tag is moved to point to the container image corresponding to v2.
Time 2: A reviewer tries to evaluate the artifacts. They create a new K8s cluster and try to re-run the same experiment. The experiment should run with C at v1, but it runs with C at v2 instead.

This is just one scenario where latest is harmful, but there are more (not reported here for brevity).

I think it'd be best if for a given git tag of this repo, all container image tags were explicit versions rather than latest.
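As a rough illustration of how such a policy could be checked, here is a minimal Python sketch that flags image references with no explicit version. It is not part of the repo: the function names and regexes are hypothetical, and the parsing is deliberately simplistic (it only looks at YAML `image:` lines and Dockerfile `FROM` lines, and would also flag things like `FROM scratch` or references to earlier build stages). It treats a missing tag the same as `latest`, since container runtimes default to `latest` when no tag is given. The `redis:7.2` tag in the sample is only an illustrative value.

```python
import re

# Hypothetical helpers, not part of DeathStarBench.
IMAGE_LINE = re.compile(r"^\s*(?:-\s*)?image:\s*['\"]?([^\s'\"#]+)")
FROM_LINE = re.compile(r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)", re.IGNORECASE)


def is_pinned(ref: str) -> bool:
    """Return True if ref has a digest or an explicit tag other than 'latest'."""
    if "@" in ref:
        # Digest reference such as name@sha256:...
        return True
    # Only look after the last '/', so a registry host:port is not
    # mistaken for a tag.
    last = ref.rsplit("/", 1)[-1]
    if ":" not in last:
        # No tag at all: this resolves to 'latest' implicitly.
        return False
    tag = last.rsplit(":", 1)[1]
    return tag != "" and tag != "latest"


def unpinned_images(text: str) -> list[tuple[int, str]]:
    """Return (line number, image reference) pairs that are not pinned."""
    found = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = IMAGE_LINE.match(line) or FROM_LINE.match(line)
        if match and not is_pinned(match.group(1)):
            found.append((lineno, match.group(1)))
    return found


sample = "image: alpine:latest\nimage: memcached\nimage: redis:7.2\n"
print(unpinned_images(sample))
# prints [(1, 'alpine:latest'), (2, 'memcached')]
```

A check like this could run in CI on release branches so that a tagged release cannot ship with an implicit or explicit `latest`.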

Note: this non-reproducibility issue goes beyond container tags. For example, see how this init container does a git clone:

args: ["-c", "git clone https://github.com/delimitrou/DeathStarBench.git /DeathStarBench &&

The clone command doesn't specify a tag or commit, so what gets cloned is the head of the "master" branch, which isn't stable.
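For comparison, a pinned clone could look like the command built by the sketch below. This is only an assumption about one possible fix: the helper name is hypothetical, and `v0.4.1` is a placeholder tag name rather than a confirmed tag in the repo. It relies on the fact that `git clone --branch` accepts a tag name as well as a branch name, in which case the clone ends up on that tag's commit in detached-HEAD state.

```python
import shlex


def pinned_clone_command(url: str, ref: str, dest: str) -> str:
    """Build a shell-safe 'git clone' command that checks out a fixed ref.

    Hypothetical helper: 'ref' is meant to be a release tag, not a branch,
    because a branch head moves over time.
    """
    return shlex.join(["git", "clone", "--branch", ref, url, dest])


print(
    pinned_clone_command(
        "https://github.com/delimitrou/DeathStarBench.git",
        "v0.4.1",  # placeholder tag name, assumed for illustration
        "/DeathStarBench",
    )
)
# prints git clone --branch v0.4.1 https://github.com/delimitrou/DeathStarBench.git /DeathStarBench
```

Note that a tag can still be moved by whoever controls the repository, so this is more stable than a branch head but not as strict as checking out a specific commit hash after cloning.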

What do you think? Is there agreement that we should do this?

dev-lew commented Oct 20, 2024

I think you bring up an important point, and I agree that all versioned releases should feature pinned image versions. I had thought that using the release 0.4.1 sources would be enough, but I hadn't noticed that almost all references to Docker images in the OpenShift manifests use latest (implicitly or explicitly).

To fix this, I would use the 0.4.1 image for the benchmark and pin memcached, redis, and rabbitmq to their latest major version releases.
