Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ability to format bytes as metric or IEC; affects [bundlejs bundlephobia CratesSize DockerSize GithubRepoSize GithubCodeSize GithubSize NpmUnpackedSize SpigetDownloadSize steam VisualStudioAppCenterReleasesSize whatpulse] #10547

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

chris48s
Copy link
Member

Following on from the conversation in #10437

Closes #10437
Closes #9703
Refs #8845
Refs #10421

While this PR does introduce some variance, I have also taken the opportunity to add a renderSizeBadge() helper function which will help us achieve some more horizontal consistency around the things where we do want to be opinionated.

- switch from pretty-bytes to byte-size
- add renderSizeBadge() helper function
- match upstream conventions for metric/IEC units
- add new test helpers and use them in service tests
@chris48s chris48s added the core Server, BaseService, GitHub auth, Shared helpers label Sep 21, 2024
Copy link

socket-security bot commented Sep 21, 2024

New and removed dependencies detected. Learn more about Socket for GitHub ↗︎

Package New capabilities Transitives Size Publisher
npm/byte-size@9.0.0 None 0 32.3 kB 75lb

🚮 Removed packages: npm/@actions/core@1.10.1, npm/@actions/github@6.0.0

View full report↗︎

Copy link
Contributor

github-actions bot commented Sep 21, 2024

Messages
📖 ✨ Thanks for your contribution to Shields, @chris48s!

Generated by 🚫 dangerJS against 4513955

* @param {string} [label='size'] - Custom label
* @returns {object} A badge object that has three properties: label, message, and color
*/
function renderSizeBadge(bytes, units, label = 'size') {
Copy link
Member Author

@chris48s chris48s Sep 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper gives us

  • a standard colour
  • a default label text (although we can override it) and
  • always passes the number of bytes though byteSize() with the same settings

byte-size takes a slightly different approach to rounding than pretty-bytes does.
In pretty-bytes, rounding level is expressed as significant figures. The default in pretty-bytes was 3 significant figures.
In byte-size, rounding level is expressed as decimal places. The default in byte-size is 1 decimal place.
For some examples, this will give us a slightly different level of rounding
My instinct is to roll with the the default (1 decimal place), but 2 decimal places would also be a reasonable choice here. Either way, lets make a decision now, set it here in the renderSizeBadge() helper, and use that everywhere.

Comment on lines 113 to 109
const size = json.size.compressedSize
return this.constructor.render({ size })
const size = json.size.rawCompressedSize
return renderSizeBadge(size, 'metric', 'minified size (gzip)')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we were presenting compressedSize as reported by the upstream
Switching this to use rawCompressedSize allows us to apply our own rounding convention while matching the upstream units

Comment on lines 54 to 48
return this.constructor.render({ size })
// note the GH API returns size in KiB
// so we multiply by 1024 to get a size in bytes and then format that in IEC bytes
return renderSizeBadge(size * 1024, 'iec', 'repo size')
Copy link
Member Author

@chris48s chris48s Sep 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GitHub API exposes a number of Kibibytes (although the API docs say "kilobytes") rather than a number of bytes.

There is one place in the GitHub UI where this number is exposed/formatted and that is on https://github.com/settings/repositories

To work an example, if I call https://api.github.com/repos/chris48s/geometry-to-spatialite I get size=7159. That's presented in the frontend as
Screenshot at 2024-09-21 18-54-20
7159/1024 = 6.99

async handle({ user, repo }) {
const data = await this.fetch({ user, repo })
return this.constructor.render({ size: this.getTotalSize(data) })
return renderSizeBadge(this.getTotalSize(data), 'iec', 'code size')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This number is never presented anywhere in the GitHub UI. Since the other 2 GitHub badges use IEC bytes, lets go with that here too.

async handle({ owner, app, token }) {
const { size } = await this.fetch({ owner, app, token, schema })
return this.constructor.render({ size })
return renderSizeBadge(size, 'metric')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea how this site presents sizes. Lets stick with metric for the moment on the basis we have been doing that for some time and nobody has complained yet. We can change it if we need to.

Comment on lines -16 to +24
.expectBadge({ label: 'unpacked size', message: '147 kB' })
.expectBadge({ label: 'unpacked size', message: '147.2 kB' })

t.create('Unpacked size for scoped package')
.get('/@testing-library/react.json')
.expectBadge({ label: 'unpacked size', message: isFileSize })
.expectBadge({ label: 'unpacked size', message: isMetricFileSize })

t.create('Unpacked size for scoped package with version')
.get('/@testing-library/react/14.2.1.json')
.expectBadge({ label: 'unpacked size', message: '5.41 MB' })
.expectBadge({ label: 'unpacked size', message: '5.4 MB' })
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is a good example of the difference between rounding to significant figures vs rounding to decimal places. Switching to 1 decimal place sometimes gives us more precision and sometimes less precision than 3 significant figures.

@chris48s
Copy link
Member Author

This PR is still draft as I have not yet added the ability to switch between units via a URL param. I will do that next before marking as ready for review.

@chris48s chris48s changed the title WIP Add ability to format bytes as metric or IEC; affects [bundlejs bundlephobia CratesSize DockerSize GithubRepoSize GithubCodeSize GithubSize NpmUnpackedSize SpigetDownloadSize steam VisualStudioAppCenterReleasesSize whatpulse] Add ability to format bytes as metric or IEC; affects [bundlejs bundlephobia CratesSize DockerSize GithubRepoSize GithubCodeSize GithubSize NpmUnpackedSize SpigetDownloadSize steam VisualStudioAppCenterReleasesSize whatpulse] Sep 30, 2024
Comment on lines +18 to +20
const queryParamSchema = baseQueryParamSchema.keys({
units: unitsQueryParam.default(defaultUnits),
})
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a little Joi TIL

import Joi from 'joi'

const schema1 = Joi.object({
  one: Joi.string(),
}).required()

console.log(schema1.describe())

const schema2 = schema1.keys({
  two: Joi.string(),
})

console.log(schema2.describe())

@chris48s chris48s marked this pull request as ready for review September 30, 2024 13:05
@chris48s
Copy link
Member Author

OK. I've marked this PR ready for review. In c6028f1 I've added the ability to switch between IEC and metric units. The code isn't bad but it does add a bunch of complexity. Now that I've written it and we have a diff to discuss, I think its reasonable to not only ask "is this good code?" but also to revisit "is this a good idea?"

Even though I've spent a bunch of time on c6028f1 I'd also be quite happy to roll it back and just merge 3395601 and cbac43b if we don't think it is worth it.

@PyvesB
Copy link
Member

PyvesB commented Oct 6, 2024

OK. I've marked this PR ready for review. In c6028f1 I've added the ability to switch between IEC and metric units. The code isn't bad but it does add a bunch of complexity. Now that I've written it and we have a diff to discuss, I think its reasonable to not only ask "is this good code?" but also to revisit "is this a good idea?"

I'm a little short on time to do a full review, but the diff with the parameter switch feels reasonable in terms of added code complexity, i.e. at a glance I'm able to follow what's going on.

Regarding whether it's a good idea, honestly, I don't know:

  • in my mind, I've always associated kilobytes to 1024 bytes. If I see kB or kiB, they're the same thing for me, I'll admit that I never encountered the word "kibibyte" before Add ability to format bytes as metric (e.g: kilobyte) or IEC (e.g: kibibyte) format #10437, or at least it didn't attract my attention to the point that I'd remember. Possible consequence of having a formal education where we often reasoned using the binary system/powers of two, and having never worked on a professional problem when this subtlety mattered enough.
  • consequently, I'll echo Add ability to format bytes as metric (e.g: kilobyte) or IEC (e.g: kibibyte) format #10437 (comment). Whilst I agree with the general conclusion of vertical consistency, I fear that allowing the same badge (or at least that look the same, assuming I'm not the only idiot to not be able to tell at a glance the difference between kB and kiB) have different meanings depending on a metric parameter, may do more harm than good. I kind of feel like not having the parameter brings the best of both worlds, vertical consistency (same as the number indicated by e.g. upstream bundlephobia service) and horizontal consistency within the service (all e.g. bundlephobia badges show numbers representing the same thing).

Those are my two cents, but ultimately, I don't feel that strongly.

Copy link
Contributor

🚀 Updated review app: https://pr-10547-badges-shields.fly.dev

@chris48s
Copy link
Member Author

chris48s commented Oct 21, 2024

TODO: need to update this after #10613 was merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Server, BaseService, GitHub auth, Shared helpers
Projects
None yet
2 participants