JSON library as a git submodule #2088

Closed
ujos opened this issue May 5, 2020 · 13 comments
Labels
kind: enhancement/improvement state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated

Comments

@ujos

ujos commented May 5, 2020

To use nlohmann/json as a submodule of another project, it would be great to have a small subset of the JSON project without unit tests, documentation, and benchmarks.

The size of v3.7.3 is 252 MB, while the include folder is only 800 KB. Downloading the library on CI takes time if you build from scratch.

As a workaround, one can download the zip file and keep only the include folder.

@nlohmann
Owner

nlohmann commented May 6, 2020

This is odd, because we recently removed the test data from the develop branch in #2081. A shallow checkout is now just 6 MB:

❯ git clone --depth 1 https://github.com/nlohmann/json.git
Cloning into 'json'...
remote: Enumerating objects: 855, done.
remote: Counting objects: 100% (855/855), done.
remote: Compressing objects: 100% (733/733), done.
remote: Total 855 (delta 121), reused 588 (delta 62), pack-reused 0
Receiving objects: 100% (855/855), 6.45 MiB | 3.93 MiB/s, done.
Resolving deltas: 100% (121/121), done.

@ujos
Author

ujos commented May 6, 2020

I'm not sure if I can configure the depth of the clone on CI... A full clone is still massive.

$ git clone https://github.com/nlohmann/json.git
Cloning into 'json'...
remote: Enumerating objects: 124, done.
remote: Counting objects: 100% (124/124), done.
remote: Compressing objects: 100% (81/81), done.
Receiving objects:  11% (5825/50464), 64.34 MiB | 240.00 KiB/s
...

@ujos
Author

ujos commented May 6, 2020

Looks like the repo is huge because of history. The repo size is 11MB while the size of the .git folder is 180MB.

So --depth is the only workaround.
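
To make the size difference between a full and a shallow clone concrete, here is a minimal local sketch. It does not touch the real nlohmann/json repository: the `upstream` repository, the `testdata.bin` file, and the five-commit history are all made up for illustration, standing in for a project whose history is much heavier than its current tree.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway identity so the demo commits work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical stand-in for the upstream project: five commits, each replacing
# a 1 MiB blob of random data, so the history outweighs the working tree.
git init -q "$work/upstream"
for i in 1 2 3 4 5; do
    head -c 1048576 /dev/urandom > "$work/upstream/testdata.bin"
    git -C "$work/upstream" add testdata.bin
    git -C "$work/upstream" commit -q -m "test data revision $i"
done

# git ignores --depth for plain local paths, hence the file:// URL.
git clone -q "file://$work/upstream" "$work/full"
git clone -q --depth 1 "file://$work/upstream" "$work/shallow"

full_commits=$(git -C "$work/full" rev-list --count HEAD)
shallow_commits=$(git -C "$work/shallow" rev-list --count HEAD)
full_kib=$(du -sk "$work/full/.git" | cut -f1)
shallow_kib=$(du -sk "$work/shallow/.git" | cut -f1)

echo "full clone:    $full_commits commits, .git is $full_kib KiB"
echo "shallow clone: $shallow_commits commits, .git is $shallow_kib KiB"
```

The shallow clone carries only the newest commit, so its `.git` folder holds one copy of the blob instead of five. The exact sizes depend on the git version and platform.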

@ujos
Author

ujos commented May 6, 2020

I believe a new git tag is now needed to replace 3.7.3.

@nlohmann
Owner

nlohmann commented May 6, 2020

There will be a new tag once version 3.8.0 is released which should happen this month. I will also investigate how to remove the large files from the git history.

@nickaein
Contributor

nickaein commented May 6, 2020

Have you tried using shallow submodules?
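
For readers unfamiliar with the feature, the following is a rough local sketch of a shallow submodule. It assumes the simplest case, where the commit recorded in the parent project is the tip of the submodule's default branch. The repositories `lib` and `app`, the submodule name `json`, and the path `third_party/json` are hypothetical. The `protocol.file.allow` override is only there because the demo uses `file://` URLs, which recent git versions restrict for submodules.

```shell
#!/usr/bin/env bash
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Wrapper that permits file:// submodule URLs; not needed for https remotes.
g() { git -c protocol.file.allow=always "$@"; }

# Hypothetical library with three commits of history.
git init -q "$work/lib"
for i in 1 2 3; do
    echo "revision $i" > "$work/lib/json.hpp"
    git -C "$work/lib" add json.hpp
    git -C "$work/lib" commit -q -m "library revision $i"
done
lib_commits=$(git -C "$work/lib" rev-list --count HEAD)

# Hypothetical parent project that consumes the library as a submodule.
git init -q "$work/app"
echo "demo app" > "$work/app/README"
git -C "$work/app" add README
git -C "$work/app" commit -q -m "initial commit"
g -C "$work/app" submodule --quiet add --name json "file://$work/lib" third_party/json

# Record in .gitmodules that clones of this submodule should be shallow.
git -C "$work/app" config -f .gitmodules submodule.json.shallow true
git -C "$work/app" add .gitmodules third_party/json
git -C "$work/app" commit -q -m "add json as a shallow submodule"

# A fresh recursive clone should now fetch the submodule with depth 1.
g clone -q --recurse-submodules "file://$work/app" "$work/checkout"
sub_commits=$(git -C "$work/checkout/third_party/json" rev-list --count HEAD)

echo "library history:    $lib_commits commits"
echo "submodule checkout: $sub_commits commits"
```

`git clone` also accepts `--shallow-submodules` to request the same behaviour without editing `.gitmodules`. Neither variant is guaranteed to help when the parent pins a commit that is not a branch tip.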

@ujos
Author

ujos commented May 6, 2020

No, I haven't. Thanks for pointing that out.

It looks like this feature is broken in git 2.14.1+. See the answer titled "Summary of buggy / unexpected / annoying behaviour as of Git 2.14.1".

@nickaein
Contributor

nickaein commented May 6, 2020

That answer seems to discuss two specific cases related to checking out a custom branch/commit of the submodule. Given the length of the discussions (especially the first answer), I understand that shallow submodules are probably brittle and might fail in some cases, but they might solve the problem in your use case.

@ujos
Author

ujos commented May 6, 2020

I tried applying the git config ... shallow true command to another submodule in my repo. Then I ran git clone --recurse-submodules again, but nothing changed. Git still downloads the entire history.

I guess it is because I pin submodules to tags (== hash), so git has to download the entire history to find that hash. Or I am using it incorrectly.
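
One option that sidesteps the submodule machinery altogether is to shallow-clone a tag directly, since `git clone --branch` also accepts tag names. This is not what was tried above, and it is only a sketch: the `upstream` repository and the tag `v1.0.0` are invented, and whether a CI system lets you replace its submodule checkout with such a command is an open assumption.

```shell
#!/usr/bin/env bash
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical upstream: four commits, with a tag on the second one, so the
# pinned release is NOT the tip of the branch.
git init -q "$work/upstream"
for i in 1 2 3 4; do
    echo "revision $i" > "$work/upstream/json.hpp"
    git -C "$work/upstream" add json.hpp
    git -C "$work/upstream" commit -q -m "revision $i"
    if [ "$i" -eq 2 ]; then
        git -C "$work/upstream" tag v1.0.0
    fi
done

# --branch accepts a tag and leaves HEAD detached at it; --depth 1 keeps
# only that single commit. The file:// URL is needed for --depth to apply.
git -c advice.detachedHead=false clone -q --depth 1 --branch v1.0.0 \
    "file://$work/upstream" "$work/pinned"

tag_rev=$(git -C "$work/upstream" rev-parse 'v1.0.0^{commit}')
pinned_rev=$(git -C "$work/pinned" rev-parse HEAD)
pinned_commits=$(git -C "$work/pinned" rev-list --count HEAD)
pinned_content=$(cat "$work/pinned/json.hpp")

echo "tag commit:     $tag_rev"
echo "checked out:    $pinned_rev ($pinned_commits commit in history)"
echo "file contents:  $pinned_content"
```

The clone ends up on the tagged commit with a history of one commit, even though the tag sits behind the branch tip.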

@stale

stale bot commented Jun 5, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated label Jun 5, 2020
@stale stale bot closed this as completed Jun 13, 2020
@eli-schwartz
Contributor

> There will be a new tag once version 3.8.0 is released which should happen this month. I will also investigate how to remove the large files from the git history.

There's exactly one way to remove large files from history. Well, two ways if you count "do a shallow clone".

You will need to use the git-filter-branch(1) command (see the EXAMPLES section of the manpage) to rewrite history, and then force push to overwrite all commits. This will break previous git tags, commit sha1 references, etc., which is obviously not an ideal situation. Some people would argue that for a one-time thing, if the reward is great enough (like removing lots of very large files from the history), it makes sense to do this. Other people would argue history should never be removed for any reason whatsoever. Ultimately, this is a personal choice.
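
As a rough illustration of what such a rewrite involves, the sketch below runs on a throwaway local repository rather than on nlohmann/json. The file name `testdata.bin` and the two-commit history are invented. The cleanup steps after the rewrite follow the usual pattern of dropping the backup refs and reflogs before garbage collection. Recent versions of the git-filter-branch manpage themselves recommend the separate git-filter-repo tool, which is why the warning is squelched here.

```shell
#!/usr/bin/env bash
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Newer git versions print a warning and pause before running filter-branch.
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
repo="$work/repo"

# Hypothetical history: a large file is added, then removed at the tip,
# but it still lives on in the first commit.
git init -q "$repo"
echo "// header" > "$repo/json.hpp"
head -c 1048576 /dev/urandom > "$repo/testdata.bin"
git -C "$repo" add json.hpp testdata.bin
git -C "$repo" commit -q -m "add header and test data"
git -C "$repo" rm -q testdata.bin
git -C "$repo" commit -q -m "remove test data from the tip"

old_head=$(git -C "$repo" rev-parse HEAD)
before_kib=$(du -sk "$repo/.git" | cut -f1)

# Rewrite every commit on every ref so that testdata.bin never existed.
git -C "$repo" filter-branch -f --index-filter \
    'git rm --cached --ignore-unmatch -q testdata.bin' \
    --prune-empty -- --all > /dev/null

# Drop the backup refs and reflogs that still point at the old objects,
# then garbage-collect so the large blob is actually deleted.
git -C "$repo" for-each-ref --format='delete %(refname)' refs/original \
    | git -C "$repo" update-ref --stdin
git -C "$repo" reflog expire --expire=now --all
git -C "$repo" gc -q --prune=now

new_head=$(git -C "$repo" rev-parse HEAD)
after_kib=$(du -sk "$repo/.git" | cut -f1)
history_hits=$(git -C "$repo" log --all --oneline -- testdata.bin | wc -l | tr -d ' ')

echo "HEAD before: $old_head"
echo "HEAD after:  $new_head"
echo ".git size:   $before_kib KiB -> $after_kib KiB"
echo "commits still touching testdata.bin: $history_hits"
```

The changed HEAD hash is the downside described above: every rewritten commit gets a new ID, so existing tags and references to old hashes no longer match.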

If you're interested in doing this, don't hesitate to ping me with questions.

@nlohmann
Owner

Thanks for letting me know. I did some brief research and also found the mentioned downsides of removing large files from the history. Right now, I am more than hesitant to break existing tags and hashes, and hope that submodules get nicer support for shallow clones.

@nickaein
Contributor

I believe the least intrusive way would be to start a new repository on GitHub (e.g. nlohmann/modern-json) and leave this one as is (possibly marking it as read-only/archived/deprecated). This provides a clean history for users who want to adopt the latest version, while keeping the whole history intact for whoever depends on it. I guess it might even be possible to negotiate with GitHub to transfer watches/stars to the new repository.

Even such a change, which would not be disruptive for users, needs careful consideration and might only be carried out on a major release (e.g. 4.0.0), if ever.

bbannier added a commit to zeek/spicy that referenced this issue Jun 1, 2022
The library nlohmann/json has extremely fat Git history which makes
e.g., recursive clones very slow, see
nlohmann/json#2088.

In this patch we replace the Git submodule with vendored sources from
nlohmann/json-3.10.5.
bbannier added a commit to zeek/spicy that referenced this issue Jun 2, 2022
The library nlohmann/json has extremely fat Git history which makes
e.g., recursive clones very slow, see
nlohmann/json#2088.

In this patch we replace the Git submodule with vendored sources from
nlohmann/json-3.10.5.
4 participants