Store results in subfolders by date #282

Open
gadenbuie opened this issue Jan 28, 2025 · 2 comments

@gadenbuie
Member

gadenbuie commented Jan 28, 2025

Currently, we store all of the test result .json files in a single flat folder, __test_results, on the _test_results branch:

files <- gh::gh(
  "GET /repos/:owner/:repo/contents/__test_results",
  owner = "rstudio",
  repo = "shinycoreci",
  ref = "_test_results",
  .limit = Inf
)

length(files)
#> [1] 1000
files |> purrr::map_chr("name") |> sort() |> head()
#> [1] "gha-01830f4-2023_06_08_05_04-3.6-Linux.json"
#> [2] "gha-01830f4-2023_06_08_05_04-4.0-Linux.json"
#> [3] "gha-01830f4-2023_06_08_05_04-4.1-Linux.json"
#> [4] "gha-01830f4-2023_06_08_05_04-4.2-Linux.json"
#> [5] "gha-01830f4-2023_06_08_05_04-4.3-Linux.json"
#> [6] "gha-01830f4-2023_06_08_05_37-3.6-macOS.json"

As the reprex above shows, the GitHub REST API's contents endpoint returns at most 1,000 files for a directory, so we can't programmatically list the results or fetch a subset of them.

git sparse-checkout runs into a similar problem, because in cone mode it limits checked-out files by directory:

# start in a temp location
git clone --no-checkout --filter=blob:none https://github.com/rstudio/shinycoreci -b _test_results
cd shinycoreci
git sparse-checkout init --cone
git sparse-checkout set __test_results
git checkout

The above doesn't help because we still need to check out all of the test result files. If they were in subfolders, both of the approaches above would work:

gh::gh(
  "GET /repos/:owner/:repo/contents/__test_results/2025/01",
  owner = "rstudio",
  repo = "shinycoreci",
  ref = "_test_results",
  .limit = Inf
)

# start in a temp location
git clone --no-checkout --filter=blob:none https://github.com/rstudio/shinycoreci -b _test_results
cd shinycoreci
git sparse-checkout init --cone
git sparse-checkout set __test_results/2025/01
git checkout

Also, even a shallow clone (--depth 1) is slow and pulls a huge amount of data:

tmpdir <- tempfile("shinycoreci-test-results-")

processx::run(
  "git",
  args = c(
    "clone",
    "--depth", 
    "1",
    "https://github.com/rstudio/shinycoreci.git",
    "-b", 
    "_test_results",
    tmpdir
  ),
  echo = TRUE,
  echo_cmd = TRUE
)
test_results <- fs::path(tmpdir, "__test_results")
test_results |> fs::dir_ls() |> length()
#> [1] 8873
test_results |> fs::dir_ls() |> fs::file_info() |> getElement("size") |> sum()
#> 4.5G

@schloerke
Contributor

> If they were in subfolders, both of the above would work

Happy to add subfolders by "year/month". So the first example file above would go into the 2023/06 sub-subfolder.
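
To make the year/month mapping concrete, here is a minimal sketch of how the subfolder could be derived from a result file name. The function name result_subdir is hypothetical, and the pattern assumes every file follows the gha-<sha>-<YYYY>_<MM>_<DD>_<HH>_<mm>-<R version>-<OS>.json shape seen in the listing above; names that do not match are rejected rather than guessed at.

```shell
# Hypothetical helper (not part of shinycoreci): print the "year/month"
# subfolder for a test result file name.
# Assumes names look like gha-<sha>-<YYYY>_<MM>_<DD>_<HH>_<mm>-<R version>-<OS>.json
result_subdir() {
  name=$(basename "$1")
  # Capture the year and month from the timestamp field of the name.
  subdir=$(printf '%s\n' "$name" |
    sed -n -E 's#^gha-[^-]+-([0-9]{4})_([0-9]{2})_[0-9]{2}_[0-9]{2}_[0-9]{2}-.*\.json$#\1/\2#p')
  if [ -z "$subdir" ]; then
    echo "unrecognized result file name: $name" >&2
    return 1
  fi
  printf '%s\n' "$subdir"
}

result_subdir "gha-01830f4-2023_06_08_05_04-3.6-Linux.json"
# prints: 2023/06
```

Keying on the timestamp embedded in the file name, rather than on commit or file modification dates, would keep the layout reproducible from the names alone.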

@gadenbuie
Member Author

Yeah, I think year/month would be the right amount of organization.
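
For the files already sitting in the flat folder, a dry run along these lines could preview where each one would land before anything is moved. This is only a sketch: plan_result_moves is a hypothetical name, it only prints the planned moves, and it assumes the same gha-<sha>-<YYYY>_<MM>_... naming as the listing above.

```shell
# Hypothetical dry run (not part of shinycoreci): for each flat result file,
# print "<current path> -> <year/month path>". Nothing is moved.
plan_result_moves() {
  results_dir=$1
  for path in "$results_dir"/gha-*.json; do
    [ -e "$path" ] || continue    # the glob matched nothing
    name=$(basename "$path")
    # Pull "YYYY/MM" out of the timestamp field of the file name.
    subdir=$(printf '%s\n' "$name" |
      sed -n -E 's#^gha-[^-]+-([0-9]{4})_([0-9]{2})_.*\.json$#\1/\2#p')
    [ -n "$subdir" ] || continue  # skip names that do not match the pattern
    printf '%s -> %s/%s/%s\n' "$path" "$results_dir" "$subdir" "$name"
  done
}

# Usage, from the root of a checkout of the results branch:
#   plan_result_moves __test_results
```

Each printed pair could then be turned into a mkdir -p plus git mv once the layout looks right.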
