
Update for automated multi-tagging and reducing the number of stack files #214

Merged: 17 commits, Aug 11, 2021

Conversation

eitsupi
Member

@eitsupi eitsupi commented Aug 10, 2021

Currently, the stack files (and the files generated from them) do not contain the so-called "shared tags" such as rocker/r-ver:latest.
This makes it difficult to track the tags of currently published images.
So this PR updates make-stacks.R so that all shared tags are automatically included in the stack files.
This makes it easy to build tasks such as downloading all tags from DockerHub and recording the contents of the images in reports.
Using jq (pre-installed on GitHub Actions runners), we can easily extract values from a JSON file and loop through them with foreach in Makefiles.
For example:

$ jq '.stack[].tags[]' -r stacks/4.1.1.json
docker.io/rocker/r-ver:4.1.1
docker.io/rocker/r-ver:4.1
docker.io/rocker/r-ver:4
docker.io/rocker/r-ver:latest
docker.io/rocker/rstudio:4.1.1
docker.io/rocker/rstudio:4.1
docker.io/rocker/rstudio:4
docker.io/rocker/rstudio:latest
docker.io/rocker/tidyverse:4.1.1
docker.io/rocker/tidyverse:4.1
docker.io/rocker/tidyverse:4
docker.io/rocker/tidyverse:latest
docker.io/rocker/verse:4.1.1
docker.io/rocker/verse:4.1
docker.io/rocker/verse:4
docker.io/rocker/verse:latest
docker.io/rocker/geospatial:4.1.1
docker.io/rocker/geospatial:4.1
docker.io/rocker/geospatial:4
docker.io/rocker/geospatial:latest
docker.io/rocker/shiny:4.1.1
docker.io/rocker/shiny:4.1
docker.io/rocker/shiny:4
docker.io/rocker/shiny:latest
docker.io/rocker/shiny-verse:4.1.1
docker.io/rocker/shiny-verse:4.1
docker.io/rocker/shiny-verse:4
docker.io/rocker/shiny-verse:latest
docker.io/rocker/binder:4.1.1
docker.io/rocker/binder:4.1
docker.io/rocker/binder:4
docker.io/rocker/binder:latest
docker.io/rocker/cuda:4.1.1-cuda10.1
docker.io/rocker/cuda:4.1-cuda10.1
docker.io/rocker/cuda:4-cuda10.1
docker.io/rocker/cuda:cuda10.1
docker.io/rocker/cuda:4.1.1
docker.io/rocker/cuda:4.1
docker.io/rocker/cuda:4
docker.io/rocker/cuda:latest
docker.io/rocker/r-ver:4.1.1-cuda10.1
docker.io/rocker/ml:4.1.1-cuda10.1
docker.io/rocker/ml:4.1-cuda10.1
docker.io/rocker/ml:4-cuda10.1
docker.io/rocker/ml:cuda10.1
docker.io/rocker/ml:4.1.1
docker.io/rocker/ml:4.1
docker.io/rocker/ml:4
docker.io/rocker/ml:latest
docker.io/rocker/ml-verse:4.1.1-cuda10.1
docker.io/rocker/ml-verse:4.1-cuda10.1
docker.io/rocker/ml-verse:4-cuda10.1
docker.io/rocker/ml-verse:cuda10.1
docker.io/rocker/ml-verse:4.1.1
docker.io/rocker/ml-verse:4.1
docker.io/rocker/ml-verse:4
docker.io/rocker/ml-verse:latest

$ jq '.stack[].tags[]' -r stacks/4.1.0.json
docker.io/rocker/r-ver:4.1.0
docker.io/rocker/rstudio:4.1.0
docker.io/rocker/tidyverse:4.1.0
docker.io/rocker/verse:4.1.0
docker.io/rocker/geospatial:4.1.0
docker.io/rocker/shiny:4.1.0
docker.io/rocker/shiny-verse:4.1.0
docker.io/rocker/binder:4.1.0
docker.io/rocker/cuda:4.1.0-cuda10.1
docker.io/rocker/cuda:4.1.0
docker.io/rocker/r-ver:4.1.0-cuda10.1
docker.io/rocker/ml:4.1.0-cuda10.1
docker.io/rocker/ml:4.1.0
docker.io/rocker/ml-verse:4.1.0-cuda10.1
docker.io/rocker/ml-verse:4.1.0
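
The extraction above can also be sketched in Python, which may help when jq is not available. The stack-file layout below is a minimal, hypothetical stand-in for the real schema (only the `stack` array and per-entry `tags` field queried by `jq '.stack[].tags[]'` are assumed):

```python
import json

# Hypothetical, minimal stack file mirroring only the structure queried
# by `jq '.stack[].tags[]'` above -- not the full rocker schema.
stack_file = json.dumps({
    "stack": [
        {"IMAGE": "r-ver",
         "tags": ["docker.io/rocker/r-ver:4.1.1",
                  "docker.io/rocker/r-ver:4.1",
                  "docker.io/rocker/r-ver:4",
                  "docker.io/rocker/r-ver:latest"]},
        {"IMAGE": "rstudio",
         "tags": ["docker.io/rocker/rstudio:4.1.1",
                  "docker.io/rocker/rstudio:latest"]},
    ]
})

# Equivalent of: jq '.stack[].tags[]' -r <file>
tags = [tag
        for entry in json.loads(stack_file)["stack"]
        for tag in entry["tags"]]
for tag in tags:
    print(tag)
```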

This PR adds shared tags such as rocker/r-ver:4.0 that are not currently used, but these tags will also be published on DockerHub once we switch to building with the docker buildx bake command.

Since it is desirable to have a single stack file when extracting tags as shown above, I will also merge the stack files currently built with GitHub Actions into a single file per R version.

ToDo

  • Merge with Automatic update of container definition files #213 and reflect the release of R 4.1.1.
  • Integration of stack files.
    • Check the effect on the generated files.
    • Fix workflow for building devel images. Generate core-devel.yml from devel.json.
    • Update the build workflow, core.yml.
    • Update sync-template-vars.R. (Includes build core latest daily images every day #194 unfinished feature, sync with core-latest-daily.json)
    • Update documentation.
    • Update the Makefile
  • Test GitHub Actions on my fork

@eitsupi eitsupi mentioned this pull request Aug 10, 2021
@eitsupi
Member Author

eitsupi commented Aug 10, 2021

With this improvement, we can give the tag rocker/ml:cuda10.1 to the latest version of rocker/ml:X.Y.Z-cuda10.1.
Does it look good?

@cboettig
Member

Yeah, this is definitely an issue. Long term I think this will be better when we can consolidate the stack files rather than duplicate them for each tag, yes?

With this improvement, we can give the tag rocker/ml:cuda10.1 to the latest version of rocker/ml:X.Y.Z-cuda10.1.

🎉 🎉

@eitsupi
Member Author

eitsupi commented Aug 10, 2021

Yeah, this is definitely an issue. Long term I think this will be better when we can consolidate the stack files rather than duplicate them for each tag, yes?

Yes, I think we can make a single stack file for each R version like X.Y.Z.json in the near future.

One of the problems preventing this at the moment is that IMAGE is set to the same value, r-ver, in both ml-cuda10.1 and core.
We use this IMAGE as part of the tag during docker-compose build, so I think we can do the work of eliminating the duplicate r-ver and unifying the stack files after we change the build method to docker buildx bake.

@cboettig
Member

One of the problems preventing this at the moment is that IMAGE is set to the same value, r-ver, in both ml-cuda10.1 and core.

You mean this, right?

Yeah, we should probably change the image name there anyway; saying this is just r-ver with an additional tag added is probably not the best way to describe that image -- rocker/cuda might be a more accurate name. Generally we use tags only to specify versions, not to add content. That build adds CUDA libs, so that's probably a better naming convention. I doubt too many people use rocker/r-ver:cuda10.1 directly, so hopefully the change would not have much impact downstream.

@eitsupi
Member Author

eitsupi commented Aug 10, 2021

Sorry, my explanation was not clear.

First of all, docker-bake.json has the following structure, and no target with the same name can exist.

{
    "target": {
        "r-ver": {
        },
        "rstudio": {
        }
    }
}

And make-bakejson.R transcribes the IMAGE value of each stack-file entry into a target name in docker-bake.json.

name = purrr::map_chr(value, "IMAGE", .default = NA),

So there should not be two r-ver entries in one stack file. (In practice it does work, because the R process suffixes the duplicate name and treats it as an alias.)
This affects only the target name; the name given to the built image is determined by make-stacks.R, as included in this PR.
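
That de-duplication can be sketched roughly in Python (the suffixing rule here is purely illustrative; the actual behavior lives in make-bakejson.R and may differ):

```python
def build_targets(stack_entries):
    """Map stack entries to bake-style targets, suffixing duplicate
    IMAGE names so that target keys stay unique (illustrative only)."""
    targets = {}
    for entry in stack_entries:
        name = entry["IMAGE"]
        suffix = 1
        while name in targets:
            suffix += 1
            name = f'{entry["IMAGE"]}{suffix}'  # e.g. "r-ver" -> "r-ver2"
        targets[name] = {"tags": entry.get("tags", [])}
    return targets

# Two stacks both declaring IMAGE "r-ver", as with ml-cuda10.1 and core:
merged = build_targets([
    {"IMAGE": "r-ver", "tags": ["docker.io/rocker/r-ver:4.1.1"]},
    {"IMAGE": "r-ver", "tags": ["docker.io/rocker/r-ver:4.1.1-cuda10.1"]},
])
print(sorted(merged))  # target keys stay unique
```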

When merging them into one stack file, we can name one of them something other than r-ver (e.g. r-ver-cuda), but this will affect the image name defined in the compose files generated by write-compose.R.

Therefore, I think it is better to integrate stack files after docker buildx bake is available (at least for images that can be built with GitHub Actions) and docker-compose build is no longer necessary.
With this PR we should be able to set the required tags in docker-bake.json, so docker buildx bake will probably be ready to use in a few days.

we should probably change the image name there anyway; saying this is just r-ver with an additional tag added is probably not the best way to describe that image -- rocker/cuda might be a more accurate name. Generally we use tags only to specify versions, not to add content. That build adds CUDA libs, so that's probably a better naming convention. I doubt too many people use rocker/r-ver:cuda10.1 directly, so hopefully the change would not have much impact downstream.

Given the comparison with rocker/r-ver:X.Y.Z-cuda11.1, which is built from a different image, I think the current name is fine.
It certainly seems odd that it is based on rocker/r-ver:X.Y.Z, but, for example, buster-slim is buster with things removed, and I think it is common for images with different content to be represented by different tags.

@cboettig
Member

Right, in other docker projects it's common to use tags for content as well as versions, and many major projects have far more tag permutations than we do. I was just comparing to usage in other places within rocker.

I do think things would be cleaner if rocker/r-ver:X.Y.Z-cuda10.1 was rocker/cuda:X.Y.Z-cuda10.1 and likewise for rocker/cuda:X.Y.Z-cuda11.1. That would also simplify the issue for buildx too, right?

@eitsupi
Member Author

eitsupi commented Aug 10, 2021

I do think things would be cleaner if rocker/r-ver:X.Y.Z-cuda10.1 was rocker/cuda:X.Y.Z-cuda10.1 and likewise for rocker/cuda:X.Y.Z-cuda11.1. That would also simplify the issue for buildx too, right?

You're right.
It is also possible to set both rocker/r-ver:X.Y.Z-cuda10.1 and rocker/cuda:X.Y.Z-cuda10.1 in tags, so it might be a good idea to have them coexist (for a while).
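
One way to sketch that coexistence in Python (the helper below is hypothetical, not the actual make-stacks.R code; repository names and tag format follow the examples in this thread):

```python
def cuda_tags(version, cuda_version, repos=("rocker/r-ver", "rocker/cuda")):
    """Emit the same version/cuda tag under both the old and the new
    repository name, so the two can coexist for a while (illustrative)."""
    return [f"docker.io/{repo}:{version}-cuda{cuda_version}" for repo in repos]

print(cuda_tags("4.1.1", "10.1"))
```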

@cboettig
Member

nice, sounds like a plan

@eitsupi
Member Author

eitsupi commented Aug 10, 2021

If you can create a rocker/cuda repository on DockerHub soon, I think I can merge the stack files into one in this PR. What do you think?

To be honest, it's hard to target multiple files when extracting the tag list using jq as described above, so having a single stack file makes it easier to report on published images and build images using buildx.

@cboettig
Member

Yeah, I think that's the way to go. Simplifying the build chain whenever possible definitely helps maintenance, even with automation.

Ok, I've pushed a fresh local build of what we called rocker/r-ver:4.1.0-cuda10.1; see https://hub.docker.com/repository/docker/rocker/cuda. (Note that rocker/cuda actually already existed from several years ago as an experimental, un-maintained cuda9.0 / debian:stretch version on R 3.6.x; this push updated the image with the appropriate tags.)

@eitsupi eitsupi changed the title Update for automated multi-tagging Update for automated multi-tagging and reducing the number of stack files Aug 11, 2021
@eitsupi
Member Author

eitsupi commented Aug 11, 2021

The workflow seems to work well on my fork. (No push; it only builds images and publishes reports.)
[screenshot: successful workflow runs]

@eitsupi eitsupi marked this pull request as ready for review August 11, 2021 10:11
@cboettig
Member

Oh, very nice! Great work.

So, thoughts on the images that are not part of stacks/X.Y.Z.json, e.g. cuda-11.1, geospatial-dev-osgeo, geospatial-ubuntugis? For the moment these appear to be outside the workflow. (Perhaps this indicates a deeper challenge with these images: while they mostly provide just different versions of the same thing, doing so requires altered build logic/install scripts, not just different env vars as we use elsewhere...)

I think I understand why you're organizing stack files by version; it makes sense from a workflow perspective to say "build everything in 4.1.1.json" rather than say "build the 4.1.1 version of core.json, geospatial.json, ...." But from a conceptual perspective grouping by version still feels awkward. I think my ideal setup would have a set of version-agnostic 'stack' files, core.json, geospatial.json, ml.json, with the version matrix either included in those or maybe declared as some versions.json file? Or is that crazy?

@eitsupi
Member Author

eitsupi commented Aug 11, 2021

So, thoughts on the images that are not part of stacks/X.Y.Z.json, e.g. cuda-11.1, geospatial-dev-osgeo, geospatial-ubuntugis? For the moment these appear to be outside the workflow. (Perhaps this indicates a deeper challenge with these images: while they mostly provide just different versions of the same thing, doing so requires altered build logic/install scripts, not just different env vars as we use elsewhere...)

The reason I didn't include them in X.Y.Z.json this time is simply that we don't build them in the current workflow, core.yml.
Also, they share few common layers and the images are large, so I couldn't be sure we could build them in a single workflow.
How about considering them for integration in the future?

I think I understand why you're organizing stack files by version; it makes sense from a workflow perspective to say "build everything in 4.1.1.json" rather than say "build the 4.1.1 version of core.json, geospatial.json, ...." But from a conceptual perspective grouping by version still feels awkward. I think my ideal setup would have a set of version-agnostic 'stack' files, core.json, geospatial.json, ml.json, with the version matrix either included in those or maybe declared as some versions.json file? Or is that crazy?

I used to think about that approach too. (https://github.com/eitsupi/r-ver/blob/cb6dde632287e04e8d36c82a6e153038b468d147/buildargs/versions.json)
However, if major changes are made, such as replacing the scripts used (#192 (comment)), versions.json will eventually grow larger and larger, and become closer and closer to the current X.Y.Z.json.
So now I'm thinking that the current format of a stack file per version, which leaves room for manual configuration, is fine.

@cboettig
Member

@eitsupi that makes sense, thanks for explaining. Re the images outside core.yml: I agree those images are probably too large and slow-building (the geospatial images build packages from source since they are not using the pre-packaged Ubuntu libs in RSPM), and will probably always need to be in separate GH Actions files.

Right, isolating versions in separate files does have the advantage of making it easier to modify the builds going forward without altering frozen images, which will no doubt become only more important as time passes. I'm on board :shipit:
