Data to purge from the repository for rights #100
Removed (but not purged) |
Upon review, I don't think we need to remove the census data. It is available open-access through the UK Data Service… I believe we are able to re-share it (I wouldn’t have added it to the repo otherwise), and upon revisiting CC BY 4.0, it states that we “are free to . . . copy and redistribute the material in any medium or format” (see here). Looping in @claireaustin01 might be good regarding this bit, however. |
Great thanks @kallewesterling. Perhaps the safest option would be to automatically download that link in a local deploy? Arguably that's applicable to many of these. |
A potential structure for managing the workflow, where
|
Sounds like a good idea to me. As far as I can see, it would apply to the two publicly available datasets that are used here (if we're sticking with keeping census data in there for now):
The scary thing about downloading files is obviously that the links depend on the services that provide them, long term etc. etc... You know all this, of course! :) |
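One way the download-on-deploy idea could look is sketched below: fetch a dataset only when it is missing locally, and compare it against a recorded SHA-256 so that a moved or altered upstream file is noticed rather than silently loaded. The function names, the destination path and the idea of pinning a checksum are assumptions for illustration, not something already in lwmdb.

```python
import hashlib
import shutil
import urllib.request
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large CSV dumps are not read at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_if_missing(url: str, dest: Path, sha256: Optional[str] = None) -> Path:
    """Download `url` to `dest` unless it already exists, then optionally verify it.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        # Stream the response straight to disk.
        with urllib.request.urlopen(url) as response, dest.open("wb") as out:
            shutil.copyfileobj(response, out)
    if sha256 is not None:
        actual = sha256_of(dest)
        if actual != sha256:
            raise ValueError(f"checksum mismatch for {dest}: got {actual}")
    return dest
```

A deploy script could call `fetch_if_missing` once per public dataset, with the URL and expected checksum kept in configuration; a checksum failure would then flag that the upstream link has changed.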
Well done, I was having a quick peek at those links and annoyed to figure out the
Yeah it's hard to maintain. I guess I'm thinking: maybe that addresses that concern for now, and we can return to the issue of having a final version of these included in the repository when we've had enough time to decide what's ok. Any thoughts on this all much appreciated @claireaustin01 |
I agree with that @griff-rees ! |
|
Hi @griff-rees, @kallewesterling, @claireaustin01, The following files in this folder contain data from Wikidata and Geonames:
Wikidata: according to https://dumps.wikimedia.org/legal.html:
Geonames: according to http://download.geonames.org/export/dump/:
So, as far as I can see, it should be fine. |
Have backed up all the fixture files. First attempt to purge via https://rtyley.github.io/bfg-repo-cleaner/ raised the following errors:
$ git push
Enumerating objects: 43, done.
Counting objects: 100% (40/40), done.
Delta compression using up to 4 threads
Compressing objects: 100% (15/15), done.
Writing objects: 100% (24/24), 16.97 KiB | 8.48 MiB/s, done.
Total 24 (delta 18), reused 15 (delta 9), pack-reused 0
remote: Resolving deltas: 100% (18/18), completed with 9 local objects.
To github.com:Living-with-machines/lwmdb
! [remote rejected] refs/pull/101/head -> refs/pull/101/head (deny updating a hidden ref)
! [remote rejected] refs/pull/102/head -> refs/pull/102/head (deny updating a hidden ref)
! [remote rejected] refs/pull/107/head -> refs/pull/107/head (deny updating a hidden ref)
! [remote rejected] refs/pull/107/merge -> refs/pull/107/merge (deny updating a hidden ref)
! [remote rejected] refs/pull/11/head -> refs/pull/11/head (deny updating a hidden ref)
! [remote rejected] refs/pull/12/head -> refs/pull/12/head (deny updating a hidden ref)
! [remote rejected] refs/pull/13/head -> refs/pull/13/head (deny updating a hidden ref)
! [remote rejected] refs/pull/15/head -> refs/pull/15/head (deny updating a hidden ref)
! [remote rejected] refs/pull/18/head -> refs/pull/18/head (deny updating a hidden ref)
! [remote rejected] refs/pull/19/head -> refs/pull/19/head (deny updating a hidden ref)
! [remote rejected] refs/pull/2/head -> refs/pull/2/head (deny updating a hidden ref)
! [remote rejected] refs/pull/20/head -> refs/pull/20/head (deny updating a hidden ref)
! [remote rejected] refs/pull/27/head -> refs/pull/27/head (deny updating a hidden ref)
! [remote rejected] refs/pull/28/head -> refs/pull/28/head (deny updating a hidden ref)
! [remote rejected] refs/pull/30/head -> refs/pull/30/head (deny updating a hidden ref)
! [remote rejected] refs/pull/33/head -> refs/pull/33/head (deny updating a hidden ref)
! [remote rejected] refs/pull/38/head -> refs/pull/38/head (deny updating a hidden ref)
! [remote rejected] refs/pull/39/head -> refs/pull/39/head (deny updating a hidden ref)
! [remote rejected] refs/pull/40/head -> refs/pull/40/head (deny updating a hidden ref)
! [remote rejected] refs/pull/41/head -> refs/pull/41/head (deny updating a hidden ref)
! [remote rejected] refs/pull/42/head -> refs/pull/42/head (deny updating a hidden ref)
! [remote rejected] refs/pull/43/head -> refs/pull/43/head (deny updating a hidden ref)
! [remote rejected] refs/pull/44/head -> refs/pull/44/head (deny updating a hidden ref)
! [remote rejected] refs/pull/46/head -> refs/pull/46/head (deny updating a hidden ref)
! [remote rejected] refs/pull/5/head -> refs/pull/5/head (deny updating a hidden ref)
! [remote rejected] refs/pull/57/head -> refs/pull/57/head (deny updating a hidden ref)
! [remote rejected] refs/pull/58/head -> refs/pull/58/head (deny updating a hidden ref)
! [remote rejected] refs/pull/59/head -> refs/pull/59/head (deny updating a hidden ref)
! [remote rejected] refs/pull/62/head -> refs/pull/62/head (deny updating a hidden ref)
! [remote rejected] refs/pull/63/head -> refs/pull/63/head (deny updating a hidden ref)
! [remote rejected] refs/pull/67/head -> refs/pull/67/head (deny updating a hidden ref)
! [remote rejected] refs/pull/68/head -> refs/pull/68/head (deny updating a hidden ref)
! [remote rejected] refs/pull/69/head -> refs/pull/69/head (deny updating a hidden ref)
! [remote rejected] refs/pull/7/head -> refs/pull/7/head (deny updating a hidden ref)
! [remote rejected] refs/pull/72/head -> refs/pull/72/head (deny updating a hidden ref)
! [remote rejected] refs/pull/73/head -> refs/pull/73/head (deny updating a hidden ref)
! [remote rejected] refs/pull/74/head -> refs/pull/74/head (deny updating a hidden ref)
! [remote rejected] refs/pull/77/head -> refs/pull/77/head (deny updating a hidden ref)
! [remote rejected] refs/pull/78/head -> refs/pull/78/head (deny updating a hidden ref)
! [remote rejected] refs/pull/8/head -> refs/pull/8/head (deny updating a hidden ref)
! [remote rejected] refs/pull/85/head -> refs/pull/85/head (deny updating a hidden ref)
error: failed to push some refs to 'github.com:Living-with-machines/lwmdb' |
This looks like a good place to start troubleshooting... It looks like it might be an issue with dropping files in a repo with open pull requests :/ |
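The rejected refs above are all under `refs/pull/`, which GitHub maintains itself and does not let clients overwrite; a push from a mirror-style clone tries to update them along with everything else. A minimal sketch of one workaround, assuming the ref names come from something like `git for-each-ref --format='%(refname)'`, is to build explicit refspecs for branches and tags only. The helper name is hypothetical, and this does not by itself remove the old commits that the pull-request refs still point to on GitHub's side.

```python
from typing import Iterable, List

# Only branches and tags can be rewritten by a client push.
PUSHABLE_PREFIXES = ("refs/heads/", "refs/tags/")


def pushable_refspecs(refs: Iterable[str]) -> List[str]:
    """Return forced refspecs for branches and tags, skipping refs/pull/* and others."""
    return [f"+{ref}:{ref}" for ref in refs if ref.startswith(PUSHABLE_PREFIXES)]


refs = [
    "refs/heads/main",
    "refs/heads/mkdocs",
    "refs/pull/101/head",
    "refs/pull/107/merge",
]
print(pushable_refspecs(refs))
# ['+refs/heads/main:refs/heads/main', '+refs/heads/mkdocs:refs/heads/mkdocs']
```

The resulting list could be appended to a `git push origin` command; in a clone made with `--mirror`, the remote's mirror setting may need to be turned off first.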
@griff-rees do you have the commands you tried with bfg, just so I don't redo exactly what you tried? |
Thanks @AoifeHughes pretty sure this is what I found best: $ bfg --delete-files fixture-files lwmdb.git |
For reference: I installed $ sudo snap install bfg-repo-cleaner --beta on an |
Just tried it with a slightly different command:
(playground) ➜ erase git clone git@github.com:Living-with-machines/lwmdb.git
Cloning into 'lwmdb'...
remote: Enumerating objects: 2319, done.
remote: Counting objects: 100% (351/351), done.
remote: Compressing objects: 100% (263/263), done.
remote: Total 2319 (delta 135), reused 167 (delta 82), pack-reused 1968
Receiving objects: 100% (2319/2319), 29.95 MiB | 4.80 MiB/s, done.
Resolving deltas: 100% (1358/1358), done.
(playground) ➜ erase cd lwmdb
(playground) ➜ lwmdb git:(main) java -jar ~/Downloads/bfg-1.14.0.jar --delete-folders fixture-files --delete-files fixture-files --private
Using repo : /Users/ahughes/erase/lwmdb/.git
Found 134 objects to protect
Found 17 commit-pointing refs : HEAD, refs/heads/main, refs/remotes/origin/HEAD, ...
Protected commits
-----------------
These are your protected commits, and so their contents will NOT be altered:
* commit 63f18ff4 (protected by 'HEAD') - contains 17 dirty files :
- fixture-files/JISC papers.csv (14.2 KB)
- fixture-files/UKDA-8613-csv/1851_rsd_data.csv (1.4 MB)
- ...
WARNING: The dirty content above may be removed from other commits, but as
the *protected* commits still use it, it will STILL exist in your repository.
Details of protected dirty content have been recorded here :
/Users/ahughes/erase/lwmdb.bfg-report/2023-06-30/11-23-15/protected-dirt/
If you *really* want this content gone, make a manual commit that removes it,
and then run the BFG on a fresh copy of your repo.
Cleaning
--------
Found 370 commits
Cleaning commits: 100% (370/370)
Cleaning commits completed in 163 ms.
Updating 13 Refs
----------------
Ref Before After
--------------------------------------------------------------------
refs/heads/main | 63f18ff4 | a1649c52
refs/remotes/origin/asmith-review-docs | e8196742 | d8a0bed9
refs/remotes/origin/fix-mitchells-import | c9032006 | 9dc8c58b
refs/remotes/origin/geocensus | dd31fd0f | 5bf21c44
refs/remotes/origin/improve-load-json-fixtures | 513738d3 | 56e47072
refs/remotes/origin/item-max-title-field | 6339b3e3 | b9e2e8c9
refs/remotes/origin/jupyterhub | 9e716305 | 6d7cd451
refs/remotes/origin/kallewesterling/issue35 | c8429d77 | aec87a1c
refs/remotes/origin/kallewesterling/issue56 | ebf57d41 | 6e04d95a
refs/remotes/origin/main | 63f18ff4 | a1649c52
refs/remotes/origin/mkdocs | 29b13aec | f8d69bfb
refs/remotes/origin/production-deploy | 738bfbab | dc84a5de
refs/remotes/origin/thobson/issue47 | 0fed749d | 31999d4d
Updating references: 100% (13/13)
...Ref update completed in 30 ms.
Commit Tree-Dirt History
------------------------
Earliest Latest
| |
......................DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
D = dirty commits (file tree fixed)
m = modified commits (commit message or parents changed)
. = clean commits (no changes to file tree)
Before After
-------------------------------------------
First modified commit | ce708d9f | e16706f4
Last dirty commit | c9032006 | 9dc8c58b
In total, 489 object ids were changed. Full details are logged here:
/Users/ahughes/erase/lwmdb.bfg-report/2023-06-30/11-23-15
BFG run is complete! When ready, run: git reflog expire --expire=now --all && git gc --prune=now --aggressive
(playground) ➜ lwmdb git:(main) git reflog expire --expire=now --all && git gc --prune=now --aggressive
Enumerating objects: 2299, done.
Counting objects: 100% (2299/2299), done.
Delta compression using up to 10 threads
Compressing objects: 100% (2184/2184), done.
Writing objects: 100% (2299/2299), done.
Total 2299 (delta 1400), reused 589 (delta 0), pack-reused 0 |
I don't have permissions to write, but does this look like what you had, @griff-rees? I used the jar file directly from the linked site. |
Cool! I think I got that far, it was the push to |
I need to sort your permission. And I'm going to make another merge to |
rtyley/bfg-repo-cleaner#36 (comment) - see this comment |
Yeah I saw that when I hit this before. Had other urgent stuff so left it |
@AoifeHughes you've got |
Okay, just for reference I got the same errors as @griff-rees, I tried removing branch protections and also |
Thanks so much @AoifeHughes: it really helps to reproduce that (and know I didn't miss something obvious!). There are other routes that don't use |
Another option: https://github.com/newren/git-filter-repo |
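For comparison, here is a rough sketch of what the `git-filter-repo` route could look like when driven from Python. `--path` and `--invert-paths` are the tool's options for dropping a path from every commit; the wrapper functions and the choice to run it via `subprocess` are assumptions, and it would need `git-filter-repo` installed and a fresh clone to work on.

```python
import subprocess
from typing import List, Sequence


def filter_repo_argv(paths: Sequence[str]) -> List[str]:
    """Build a git-filter-repo command that removes `paths` from all history."""
    argv = ["git", "filter-repo", "--invert-paths"]
    for path in paths:
        argv += ["--path", path]
    return argv


def purge_paths(repo_dir: str, paths: Sequence[str]) -> None:
    """Run the rewrite inside `repo_dir` (hypothetical wrapper; rewrites history)."""
    subprocess.run(filter_repo_argv(paths), cwd=repo_dir, check=True)


print(filter_repo_argv(["fixture-files"]))
# ['git', 'filter-repo', '--invert-paths', '--path', 'fixture-files']
```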
@griff-rees can you check if this has been done, I think I got it working? |
Ah lovely! I think we need to check the history to be sure. Probably need to add to |
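A possible way to check the history, assuming `git rev-list --objects --all` is run in a fresh clone: every reachable tree and blob is listed with its path, so any path still under `fixture-files/` means the purge is incomplete. The helper names are hypothetical, and the check only covers refs present in that clone, so commits kept alive on GitHub by pull-request refs would not show up.

```python
import subprocess
from typing import List


def paths_in_history(rev_list_output: str, folder: str = "fixture-files") -> List[str]:
    """Return the sorted, de-duplicated paths at or under `folder` in the listing."""
    root = folder.rstrip("/")
    found = set()
    for line in rev_list_output.splitlines():
        # Lines are "<object id>" for commits and "<object id> <path>" otherwise.
        _, _, path = line.partition(" ")
        if path == root or path.startswith(root + "/"):
            found.add(path)
    return sorted(found)


def check_clone(repo_dir: str) -> List[str]:
    """Run git in `repo_dir` and report any surviving fixture paths."""
    result = subprocess.run(
        ["git", "rev-list", "--objects", "--all"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return paths_in_history(result.stdout)
```

An empty list from `check_clone` would be consistent with the data being gone from every branch and tag in the clone.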
closing as data is gone 😄 |
This may require purging the git history, and is worth checking with @claireaustin01.