Publish crawled data with fixes to NPM? #63

foolip · 2020-09-23T10:27:30Z

I just did an experiment in to package up webref/ed/idl/*.idl with fixes from a branch and publish it as https://www.npmjs.com/package/webidl-dfns.

IDL with fixes applied is needed for mdn-bcd-collector, and I wanted to have it as a regular dependency rather than pulling it from a branch that I keep changing.

However, having done that experiment, I wonder if there would be interest in publishing webref directly to NPM? In addition to include ed/ and tr/ as-is, we would also provide fixed versions of ed/idl/* which we guarantee will parse with a given version of webidl2.js.

One could go further if one wanted to with global consistency guarantees such as:

No duplicate definitions
No unknown types used
No circular inheritance

mdn-bcd-collector already does some of this, but if it could be centralized, I think WPT would also be better off making use of released data with some guarantees.

tidoust · 2020-09-29T12:44:11Z

Totally fine publishing webref to NPM! We hadn't done it until now simply because no one had asked ;)
Happy to work on it, time permitting as usual. Feel free to beat me to it!

Also OK to include additional views with IDL guarantees in that package. I suppose that requires maintaining:

IDL conversion code (e.g. to convert void to undefined to account for changes in the WebIDL spec that may take time to propagate across specs). Reffy actually does a tiny bit of conversion when it cannot parse the WebIDL, but it does not publish the converted IDL yet.
individual IDL fixes per spec, essentially those listed in https://github.com/foolip/webref/commits/mdn-bcd-updater-fixes, as long as underlying PRs to fix the specs haven't been merged.

Code could either go in Reffy, in this repo directly (for instance as an action that runs any time a push is made), or perhaps in a separate "webgap" repo whose final goal is to disappear...

@foolip, would you like to propose some code to create the fixed version?

foolip · 2020-09-29T16:23:34Z

We have most of the ingredients needed for this, the question is mainly how to maintain the fixes and what should be included in the NPM package. While spec updates will often break things, it would be great if significant manual work is only needed each time broken IDL is added or changed. So the ingredients of a solution might be:

Detailed consistency tests, which if they pass we will trust that we can make a release.
A release branch with fixes, where we'll merge in updates from the main branch.
A weekly action that tries to merge in the main branch, run tests and cut a release. If that fails, human intervention will be needed.

I'm not sure what should be included in the package, but would suggest starting small, so that we don't include crawl.json, or study.json in particular.

@tidoust how does that sound? Does the w3c already have a NPM account, or whose secret would we add to the repo?

tidoust · 2020-09-30T12:03:41Z

That seems good.

If we release one package with both the raw IDL and the fixed IDL, we could have something like the following:

|_ ed
|  |_ index.json
|  |_ css
|  |_ dfns
|  |_ headings
|  |_ idl
|  |_ idlfixed
|  |_ idlparsed
|  |_ links
|  |_ refs
|_ tr
   |_ index.json
   |_ css
   |_ dfns
   |_ headings
   |_ idl
   |_ idlparsed
   |_ links
   |_ refs

We probably don't want to provide fixed IDL extracts for specs under tr/, unless someone specifically asks for them.

The idlparsed folder contains the result of parsing the raw IDL in idl with webidl2.js, along with additional info about globals, functions, etc. It may be useful to create an idlfixedparsed folder as well with the results of parsing the fixed IDL in idlfixed. Or perhaps we simply shouldn't include these folders, so that the final package only contains data directly extracted from specifications, and not data derived from it.

Alternatively, we could release two packages:

one that contains raw IDL extracts, that gets automatically generated whenever a new crawl gets made, and that does not make any guarantee whatsoever.
one that contains fixed IDL extracts (and not the raw ones), that gets prepared as you suggest in a separate branch, automatically when possible, manually when not.

W3C already has an NPM account, although I still need to figure out who actually runs it. We may start with my account. I don't think NPM has any real notion of ownership for packages, merely of collaborators entitled to publish the package, so we can easily change accounts later on.

Folder names are open for comments as usual :)

foolip · 2020-10-06T11:32:24Z

@tidoust do you think we strictly speaking need, idlparsed, or could we have webidl2.js as a peerDependency, and provide a parse() method somewhere on the objects inside ed.idl and tr.idl? It would only be for the fixed IDL that we guarantee that parsing will work at all.

tidoust · 2020-10-08T09:10:03Z

No, idlparsed is not needed.

Discussing this again with @dontcallmedom, we rather propose to publish a minimal set of dedicated packages, that only change when relevant info changes, instead of a single package that contains everything and that is basically going to be updated each time a spec gets a minor update, making it harder to track relevant updates.

In other words, we're more considering something like:

A package that contains (fixed versions, when needed, of) ed/css. Such a package would be useful for WPT.
A package that contains ed/dfns and possibly ed/headings. Such a package could be useful for the MDN browser compat data project.
A package that contains (fixed versions, when needed, of) ed/idl, and that could have webidl2.js as a peer dependency. Such a package would be useful for WPT and other projects that need to access Web IDL definitions.
Additional packages when needed, meaning when projects ask for them.

Packages would likely integrate an index file, generated from index.json.

A project that needs to mix info would import multiple packages. We could also create a global package, we're just unaware of projects that need to have all the info at once for now (or that would a package with tr data).

foolip · 2020-10-08T14:41:20Z

That sounds great actually! Do you think we should have the code for building and releasing these packages here in webref, or as separate repos? I'd like to work on the ed/idl bit first, is there a name you have in mind? Would a @webref org on NPM make sense, so that we can be sure we can get any future package names we want? @webref/idl would then be the thing I work on.

foolip · 2020-10-21T09:30:33Z

I'd still like to work on this. Can I just go ahead with the assumption that the package name will be @webref/idl and that it will be produced from this very repo, as opposed to some other repo which more closely resembles what will be in the published package?

saschanaz · 2020-10-25T16:36:33Z

I'm also interested as I intend to replace https://github.com/microsoft/TypeScript-DOM-lib-generator/tree/master/inputfiles/idl with webref, once microsoft/TypeScript-DOM-lib-generator#915 merges.

Lerna is a tool for managing monorepos, and while @webref/idl is the only package here now, the packages/* structure makes sense. Fixes #63.

foolip · 2020-10-26T18:39:57Z

I've suggested a setup for this in #68. It's still missing a mechanism to actually maintain the fixes, and the tests to ensure consistency of all of the IDL when interpreted together.

Lerna is a tool for managing monorepos, and while @webref/idl is the only package here now, the packages/* structure makes sense. Fixes #63.

Package gets prepared and published from the `packages/*` subfolder with Lerna. Further packages are expected to be created. Note this introduces a couple of line ending fixes for scripts to run on Windows: - The .gitattributes file tells Git not to convert LF line endings of the bash script. The script cannot run on typical Git configs on Windows otherwise. - The `--binary` option tells `patch` to disable any CRLF to LF conversion when applying the patches. That is also needed for patches to run smoothly on Windows. Fixes #63.

foolip mentioned this issue Oct 6, 2020

Add spec_url values for all features for which we have the data mdn/browser-compat-data#6765

Closed

foolip added a commit that referenced this issue Oct 26, 2020

Set up publishing of @webref/idl with Lerna

8a39b0c

Lerna is a tool for managing monorepos, and while @webref/idl is the only package here now, the packages/* structure makes sense. Fixes #63.

foolip added a commit that referenced this issue Oct 26, 2020

Set up publishing of @webref/idl with Lerna

dc87a1a

Lerna is a tool for managing monorepos, and while @webref/idl is the only package here now, the packages/* structure makes sense. Fixes #63.

foolip mentioned this issue Oct 26, 2020

Set up publishing of @webref/idl with Lerna #68

Merged

foolip mentioned this issue Feb 4, 2021

Extract parsed CSS property value definitions w3c/reffy#494

Closed

foolip added a commit that referenced this issue Feb 10, 2021

Set up publishing of @webref/idl with Lerna

2d7f9eb

Lerna is a tool for managing monorepos, and while @webref/idl is the only package here now, the packages/* structure makes sense. Fixes #63.

foolip added a commit that referenced this issue Feb 10, 2021

Set up publishing of @webref/idl with Lerna

3e049ca

Lerna is a tool for managing monorepos, and while @webref/idl is the only package here now, the packages/* structure makes sense. Fixes #63.

foolip added a commit that referenced this issue Feb 10, 2021

Set up publishing of @webref/idl with Lerna

f106801

Lerna is a tool for managing monorepos, and while @webref/idl is the only package here now, the packages/* structure makes sense. Fixes #63.

foolip added a commit that referenced this issue Feb 10, 2021

Set up publishing of @webref/idl with Lerna

4f399f1

Lerna is a tool for managing monorepos, and while @webref/idl is the only package here now, the packages/* structure makes sense. Fixes #63.

foolip added a commit that referenced this issue Feb 10, 2021

Set up publishing of @webref/idl with Lerna

28ad533

Lerna is a tool for managing monorepos, and while @webref/idl is the only package here now, the packages/* structure makes sense. Fixes #63.

tidoust closed this as completed in #68 Feb 12, 2021

foolip mentioned this issue Feb 25, 2021

Publish CSS bits as @webref/css #111

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Publish crawled data with fixes to NPM? #63

Publish crawled data with fixes to NPM? #63

foolip commented Sep 23, 2020

tidoust commented Sep 29, 2020

foolip commented Sep 29, 2020

tidoust commented Sep 30, 2020 •

edited

Loading

foolip commented Oct 6, 2020

tidoust commented Oct 8, 2020

foolip commented Oct 8, 2020

foolip commented Oct 21, 2020

saschanaz commented Oct 25, 2020 •

edited

Loading

foolip commented Oct 26, 2020

Publish crawled data with fixes to NPM? #63

Publish crawled data with fixes to NPM? #63

Comments

foolip commented Sep 23, 2020

tidoust commented Sep 29, 2020

foolip commented Sep 29, 2020

tidoust commented Sep 30, 2020 • edited Loading

foolip commented Oct 6, 2020

tidoust commented Oct 8, 2020

foolip commented Oct 8, 2020

foolip commented Oct 21, 2020

saschanaz commented Oct 25, 2020 • edited Loading

foolip commented Oct 26, 2020

tidoust commented Sep 30, 2020 •

edited

Loading

saschanaz commented Oct 25, 2020 •

edited

Loading