Implement a text-based lockfile format #11863

Closed
Jarred-Sumner opened this issue Jun 14, 2024 · 17 comments · Fixed by #15705
Labels
bun install Something that relates to the npm-compatible client enhancement New feature or request

Comments

@Jarred-Sumner
Collaborator

Jarred-Sumner commented Jun 14, 2024

This is a tracking issue for implementing a text-based lockfile format for bun install and making it the default format going forward. There will be a smooth migration path from bun.lockb to bun.lock.

Why?

When first working on the package manager, the flamegraph showed that parsing JSON took the largest share of time. So instead of a JSON lockfile, we designed two efficient binary formats: one for the registry manifest cache and one for the lockfile.

The binary lockfile format has served us well, but it isn't worth the cost in developer experience, particularly for larger teams working together. We suggest workarounds today for various things, but none of them are great.

  • Merge conflicts are hard. Do you pick bun.lockb.1 or bun.lockb.2? What about both?
  • How do you inspect the lockfile from a PR on GitHub/GitLab/etc?
  • How do you diff the lockfile? You can print it as a yarn.lock and configure git to diff that way, but that's a whole lot more complicated than it not being a problem in the first place.
  • When there are lockfile changes, it's not clear enough why. Sometimes, you run bun install and the lockfile has changes due to hashes changing, package.json scripts changing, or other reasons and this is hard to understand right now because it's hard to read in a text editor.
  • People too frequently add bun.lockb to the .gitignore file

Will bun install still be fast?

Yes.

Will it make bun install slower?

Based on what we've seen, about 1-20 milliseconds.

What will the new format be?

Probably JSON with Trailing Commas, like tsconfig.json.

Bun already supports this schema format for the runtime (and package.json). TOML is another option and it's what we use for bunfig.toml, though I kind of think that was a mistake and it should've been JSON with Trailing Commas. TOML's editor tooling support is not as mature as JSON with Trailing Commas.

Why JSON with Trailing Commas instead of JSON?

Too many collective human lifetimes have been spent fixing merge conflicts from diffs caused by adding or removing trailing commas at the end of lists in JSON.

-    }
+    },
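
To make the idea concrete, here is a minimal sketch of how text with trailing commas could be fed to a plain JSON parser. The helper name `stripTrailingCommas` and the sample input are hypothetical and are not Bun's implementation or the eventual bun.lock layout; the sketch also assumes the input has no comments.

```typescript
// Hypothetical helper (not Bun's parser): drops trailing commas so that
// JSON.parse accepts the text. Assumes the input contains no comments.
function stripTrailingCommas(text: string): string {
  let out = "";
  let inString = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (inString) {
      out += ch;
      if (ch === "\\") {
        // Copy the escaped character verbatim so an escaped quote
        // does not end the string.
        i++;
        if (i < text.length) out += text.charAt(i);
      } else if (ch === '"') {
        inString = false;
      }
      continue;
    }
    if (ch === '"') {
      inString = true;
      out += ch;
      continue;
    }
    if (ch === ",") {
      // Look ahead past whitespace: a comma directly followed by } or ]
      // is a trailing comma and is skipped.
      let j = i + 1;
      while (j < text.length && /\s/.test(text.charAt(j))) j++;
      const next = text.charAt(j);
      if (next === "}" || next === "]") continue;
    }
    out += ch;
  }
  return out;
}

// Made-up sample input; the key names are placeholders only.
const sample = `{
  "packages": {
    "example-pkg": ["example-pkg@1.0.0"],
  },
}`;
const parsed = JSON.parse(stripTrailingCommas(sample));
console.log(Object.keys(parsed.packages));
```

Because every entry ends with a comma, appending a new entry adds one line to the diff instead of also modifying the line before it.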

Why JSON with Trailing Commas instead of YAML?

YAML is fine. We don't have a YAML parser in Bun yet, and the indentation gets really confusing sometimes. Also, YAML parsers tend to be slower than JSON parsers (and YAML parsers are also JSON parsers). My favorite YAML fact is that the two-letter country code for Norway (NO) is parsed as false in YAML, though this isn't relevant to a lockfile.

Why JSON with Trailing Commas instead of JSON5?

We don't have a JSON5 parser in Bun, and would like to avoid formats that are slower to parse than JSON.

When will bun.lock be released?

Q3.

What will the migration plan be?

bun install will support both bun.lockb and bun.lock for a while, but once released, new features will only be supported via bun.lock.

Do people really edit lockfiles manually?

Yes, for small tweaks it can be important

Why not fix merge conflicts via bun install instead?

We will support that as well. But, it shouldn't be impossible to do manually.

@Jarred-Sumner Jarred-Sumner added tracking An umbrella issue for tracking big features bun install Something that relates to the npm-compatible client labels Jun 14, 2024
@whyman

whyman commented Jun 14, 2024

Feels like JSON5 could also be a good choice; it has a real spec at least. I realise there is no parser currently, but surely it's better than having JSON files that are not valid with most parsers.

@styfle
Contributor

styfle commented Jun 14, 2024

Regarding json5 vs jsonc, there is a related issue with some discussion about tsconfig.json from a few years ago:

yndajas added a commit to dxw/js-cop-games that referenced this issue Jun 14, 2024
This is following the Bun [guide][1]/[docs][2] for enabling git diffs
for `bun.lockb`

This won't work on GitHub, and this file isn't currently checked into
git. It is supported by Renovate, so we could check it in, but there are
various issues with this including [increased merge conflicts with
dependency management][3] and [potential security threats][4]. This
issue should be resolved and this change revertible when [Bun moves to a
by-default text-based lockfile][5] and Renovate adds support

[1]: https://bun.sh/guides/install/git-diff-bun-lockfile
[2]: https://bun.sh/docs/install/lockfile#how-do-i-git-diff-bun-s-lockfile
[3]: oven-sh/bun#5486
[4]: oven-sh/bun#5486 (comment)
[5]: oven-sh/bun#11863
@anru

anru commented Jun 17, 2024

Good design goals from yarn:

  1. Must be easy to read by a human
  2. Must be easy to diff
  3. Must be fast to parse

To achieve 3 it is useful to have a dedicated format and parser.
Common formats, formats that support many options and functionality require additional logic, branches in the code, etc. which negatively affects the parsing speed.

The diffability of yarn.lock is simply top notch. Yes, trailing commas in JSON improve the situation from catastrophic to acceptable, but yarn.lock files are always more compact and win on readability over JSON files.

The downside of a custom format is that if it is not compatible or is not a subset of some other, more common format, it will be harder to work with it at the user or infrastructure code level.

In any case, this undertaking is welcome news.

@rhuanbarreto

This would really unblock Bun for use with Storybook, since parseable outputs are important for it to adopt Bun as a supported package manager. storybookjs/storybook#28164
JSON with trailing commas is good. But it's also important to have a --json parameter on the package manager commands so scripts can JSON.parse their output.

@GermanJablo

I'm not as familiar with the technical implications, but as a user I have to say that YAML makes a lot of sense in a long, auto-generated file like the lock file.

@nurulhudaapon

nurulhudaapon commented Jul 15, 2024

Wondering why just re-using package-lock.json is not an option?

@rhuanbarreto

I think it's a good idea to keep a way to support a backwards-compatible format (package-lock.json) in addition to the current (binary) one. Then people could choose in bunfig.toml whether they are willing to pay the performance penalty in order to be more compatible with the Node ecosystem.

@LitoMore

Two bun.lockb files with the same lockfile content, generated by different versions of Bun, may have different MD5 hashes.

This is super annoying with Git.

@evelant

evelant commented Aug 19, 2024

Re-using package-lock.json or yarn.lock will cause tooling problems. Lots of things rely on those to detect the package manager (for better or for worse). You'd end up with scripts installing things with yarn or npm if bun were to use the same name and/or format.
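
As a rough illustration of the detection logic described above, here is a minimal sketch. The function name `detectPackageManager` and the priority order are assumptions made for illustration; real tools each apply their own rules.

```typescript
// Illustrative mapping from lockfile name to package manager.
// The order of entries is an arbitrary assumption, not any tool's actual priority.
const LOCKFILE_TO_MANAGER: ReadonlyArray<readonly [string, string]> = [
  ["bun.lockb", "bun"],
  ["bun.lock", "bun"],
  ["pnpm-lock.yaml", "pnpm"],
  ["yarn.lock", "yarn"],
  ["package-lock.json", "npm"],
];

// Hypothetical helper: returns the manager for the first known lockfile
// found in the list of file names, or undefined if none is present.
function detectPackageManager(
  filesInRepoRoot: readonly string[],
): string | undefined {
  const present = new Set(filesInRepoRoot);
  for (const [lockfile, manager] of LOCKFILE_TO_MANAGER) {
    if (present.has(lockfile)) return manager;
  }
  return undefined;
}

// A Bun project that wrote its lockfile as package-lock.json would be
// reported as npm by a tool using this kind of lookup.
console.log(detectPackageManager(["package.json", "package-lock.json"]));
```

A distinct file name such as bun.lock keeps this kind of lookup unambiguous.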

@notramo

notramo commented Aug 26, 2024

If anyone thinks YAML is a good idea, read this: https://noyaml.com/
Then read all articles the above page links to.

Sure, the basic concept of syntax is great, but there are too many pitfalls. That's why there are some simpler configuration languages with similar syntax which retain the same level of readability and diffability, while being simpler and way less error-prone.
https://gura.netlify.app/
https://nestedtext.org/en/latest/
I would bet these are also faster to parse than YAML.

@evelant

evelant commented Aug 26, 2024

Maybe not as relevant for a lockfile, but there is some great innovation in better configuration files: https://nickel-lang.org/

@Barzi-Ahmed

> If anyone thinks YAML is a good idea, read this: https://noyaml.com/ Then read all articles the above page links to.
>
> Sure, the basic concept of syntax is great, but there are too many pitfalls. That's why there are some simpler configuration languages with similar syntax which retain the same level of readability and diffability, while being simpler and way less error-prone. https://gura.netlify.app/ https://nestedtext.org/en/latest/ I would bet these are also faster to parse than YAML.

I totally agree with you. YAML and its counterparts like TOML are totally harmful and bad. Either go JSON or .js if you want full power.

@Jordan-Hall

If you decide to move to jsonc, does this mean the current binary lock file is going to change from yarn lock to the new jsonc? Also, is Bun going to provide a parser for the new lock file like yarn does, or at least the types like pnpm?

@hadarziv-army

Is there any progress on the matter?

It could be a game changer💪🏻

@nektro nektro added enhancement New feature or request and removed tracking An umbrella issue for tracking big features labels Oct 16, 2024
@nektro nektro mentioned this issue Oct 16, 2024
52 tasks
@jakehamilton

One additional benefit of a text-based format is the ability for a tool like Nix to parse it and pull dependencies for reproducible builds. Bun's largest pain point for me is the lack of a way to easily provide package lock information to Nix so that packages can be fetched for use in the sandbox. An awkward conversion of bun.lockb > yarn.lock > package-lock.json is possible, but there are some significant issues in the translation which result in some packages missing hash information.

@Barzi-Ahmed

Barzi-Ahmed commented Oct 18, 2024

@Jarred-Sumner, text-based format is crucial for security, as dependency scanners cannot scan or read binary files. Please prioritize this feature.

@jase88

jase88 commented Nov 12, 2024

Another advantage of a text-based approach would be fewer merge conflicts.

Tools like Renovate can update Bun's binary lock files, but merge conflicts arise immediately. Effectively, only one update can be merged.
