fix #59, add information crawler #211

Closed
wants to merge 6 commits into from
78 changes: 61 additions & 17 deletions _data/stage3.yml
Original file line number Diff line number Diff line change
@@ -6,16 +6,17 @@
- Claude Pache
description: >-
This is a specification draft for the legacy (deprecated) RegExp features in
JavaScript, i.e., static properties of the constructor like <code>RegExp.$1</code> as
well as the <code>RegExp.prototype.compile</code> method.
JavaScript, i.e., static properties of the constructor like
<code>RegExp.$1</code> as well as the <code>RegExp.prototype.compile</code>
method.
has_specification: false
presented:
- date: "May\_2017"
url: >-
https://github.com/tc39/notes/blob/master/meetings/2017-05/may-25.md#15ia-regexp-legacy-features-for-stage-3
title: Legacy RegExp features in JavaScript
tests:
- 'https://github.com/tc39/test262/pull/2650'
- https://github.com/tc39/test262/pull/2650
- id: proposal-private-methods
authors:
- Daniel Ehrenberg
@@ -88,7 +89,7 @@
https://github.com/tc39/notes/blob/master/meetings/2020-09/sept-23.md#status-update-for-class-fields-private-methods-static-class-features
title: Class Public Instance Fields &amp; Private Instance Fields
tests:
- 'https://github.com/tc39/test262/pulls?q=is%3Apr+is%3Aclosed+private+fields'
- https://github.com/tc39/test262/pulls?q=is%3Apr+is%3Aclosed+private+fields
- id: proposal-static-class-features
authors:
- Daniel Ehrenberg
@@ -125,7 +126,7 @@
https://github.com/tc39/notes/blob/master/meetings/2020-09/sept-23.md#status-update-for-class-fields-private-methods-static-class-features
title: Static class fields and private static methods
tests:
- 'https://github.com/tc39/test262/pulls?q=is%3Apr+is%3Aclosed+static+fields'
- https://github.com/tc39/test262/pulls?q=is%3Apr+is%3Aclosed+static+fields
- id: proposal-hashbang
authors:
- Bradley Farias
@@ -148,7 +149,7 @@
https://github.com/tc39/notes/blob/master/meetings/2018-11/nov-28.md#hash-bang-grammar
title: Hashbang Grammar
tests:
- 'https://github.com/tc39/test262/pull/2065'
- https://github.com/tc39/test262/pull/2065
- id: proposal-top-level-await
authors:
- Myles Borins
@@ -175,7 +176,7 @@
https://github.com/tc39/notes/blob/master/meetings/2019-06/june-6.md#top-level-await-for-stage-3
title: Top-level <code>await</code>
tests:
- 'https://github.com/tc39/test262/pull/2274'
- https://github.com/tc39/test262/pull/2274
- id: proposal-regexp-match-indices
authors:
- Ron Buckton
@@ -209,29 +210,29 @@
m2.indices.groups[&#x22;Z&#x22;] === undefined;
has_specification: false
presented:
- date: "December\_2019"
- date: "November\_2020"
url: >-
https://github.com/tc39/notes/blob/master/meetings/2019-12/december-3.md#regexp-match-indices-performance-feedback
https://github.com/tc39/notes/blob/master/meetings/2020-11/nov-16.md#regexp-matches-indices-jsc-implementation-feedback
title: RegExp Match Indices
tests:
- 'https://github.com/tc39/test262/pull/2309'
- https://github.com/tc39/test262/pull/2309
- id: proposal-atomics-wait-async
authors:
- Lars Hansen
champions:
- Shu-yu Guo
- Lars Hansen
description: >-
A proposal for an &ldquo;asynchronous atomic wait&rdquo; for ECMAScript, primarily for
use in agents that are not allowed to block.
A proposal for an &ldquo;asynchronous atomic wait&rdquo; for ECMAScript,
primarily for use in agents that are not allowed to block.
has_specification: true
presented:
- date: "December\_2019"
url: >-
https://github.com/tc39/notes/blob/master/meetings/2019-12/december-4.md#atomicswaitasync-for-stage-3
title: <code>Atomics.waitAsync</code>
tests:
- 'https://github.com/tc39/test262/issues/2511'
- https://github.com/tc39/test262/issues/2511
- id: proposal-relative-indexing-method
authors:
- Shu-yu Guo
@@ -251,10 +252,11 @@
has_specification: true
presented:
- date: "November\_2020"
url: https://github.com/tc39/notes/blob/master/meetings/2020-11/nov-17.md#item-rename--revisit-inclusion-on-string
url: >-
https://github.com/tc39/notes/blob/master/meetings/2020-11/nov-17.md#item-rename--revisit-inclusion-on-string
title: <code>.at()</code>
tests:
- 'https://github.com/tc39/test262/pull/2812'
- https://github.com/tc39/test262/pull/2812
- id: proposal-import-assertions
authors:
- Myles Borins
@@ -272,7 +274,49 @@
&#x22;webassembly&#x22; } });
has_specification: true
presented:
- date: "September\_2020"
- date: "November\_2020"
url: >-
https://github.com/tc39/notes/blob/master/meetings/2020-09/sept-22.md#import-assertions-for-stage-3
https://github.com/tc39/notes/blob/master/meetings/2020-11/nov-17.md#import-assertions-status-update
title: Import Assertions
- id: proposal-json-modules
authors:
- Myles Borins
- Sven Sauleau
- Dan Clark
- Daniel Ehrenberg
champions:
- Myles Borins
- Sven Sauleau
- Dan Clark
- Daniel Ehrenberg
description: A proposal to import JSON files as modules.
example: |-
import json from &#x22;./foo.json&#x22; assert { type: &#x22;json&#x22; };
import(&#x22;foo.json&#x22;, { assert: { type: &#x22;json&#x22; } });
has_specification: true
presented:
- date: "January\_2021"
title: JSON Modules
- id: proposal-private-fields-in-in
authors:
- Jordan Harband
champions:
- Jordan Harband
description: A proposal to provide brand checks without exceptions.
example: |
class C {
#brand;

static isC(obj) {
try {
obj.#brand;
return true;
} catch {
return false;
}
}
}
has_specification: false
presented:
- date: "January\_2021"
title: Ergonomic brand checks for Private Fields
1 change: 1 addition & 0 deletions _tasks/.eslintignore
@@ -0,0 +1 @@
sync-proposal-data.js
3 changes: 3 additions & 0 deletions _tasks/package.json
@@ -0,0 +1,3 @@
{
"type": "module"
}
205 changes: 205 additions & 0 deletions _tasks/sync-proposal-data.js
@@ -0,0 +1,205 @@
/**
* @file Synchronizes local proposal data with upstream GitHub proposal data.
* @author Derek Lewis <DerekNonGeneric@inf.is>
*/
// -----------------------------------------------------------------------------
// Requirements
// -----------------------------------------------------------------------------
import { curlyQuote, mdCodeSpans2html } from '@openinf/util-text';
import { GhFileImporter } from '@openinf/gh-file-importer';
import { mdTbl2json } from '@openinf/util-md-table';
import { hasOwn } from '@openinf/util-object';
import { promises as fsp, readFileSync } from 'fs';
import { resolve as pathResolve } from 'path';
import { toArray } from '@openinf/util-types';
import consoleLogLevel from 'console-log-level';
import sanitizeHtml from 'sanitize-html';
import yaml from 'js-yaml';
const DIR_DATA = '_data';
const FILE_PROPOSAL_DATA = pathResolve(DIR_DATA, 'stage3.yml');
const log = consoleLogLevel({ level: 'info' });
const ghFileImporter = new GhFileImporter({ destDir: 'tmp', log: log, });
const proposalsReadmeMd = await ghFileImporter.fetchFileContents('tc39', 'proposals', 'README.md');
// -----------------------------------------------------------------------------
// Events
// -----------------------------------------------------------------------------
process.on('uncaughtException', (err) => {
log.error(err);
});
process.on('unhandledRejection', (wrn) => {
log.warn(wrn);
});
// -----------------------------------------------------------------------------
// Helpers
// -----------------------------------------------------------------------------
// > A 'full reference link' consists of a 'link text' immediately followed by a
// > 'link label' that matches a link reference definition elsewhere in the
// > document.
// @see https://github.github.com/gfm/#full-reference-link
//
// [foo][bar]
// ^ ^----------- link label
// |---------------- link text
//
// @example ```markdown
// [foo][bar]
//
//
// [bar]: http://foo.bar
// ```
function isFullRefLink(cellContents) {
return cellContents.startsWith('['); // Oversimplified, but good enough.
}
/**
* @param {string} fullRefLink
* @param {number} index The 'full reference link' partition index (0 for the
* 'link text', 1 for the 'link label').
* @returns {string} A full reference link partition's contents.
*/
function getFullRefLinkContent(fullRefLink, index) {
const array = fullRefLink.split('][');
return index === 0 ? array[index].slice(1) : array[index].slice(0, -1);
}
/**
* Gets the URL from the partial 'link text' of a 'full reference link' in the
* proposal repo's README file.
*/
function getUrlFromDoc(linkTextLike) {
const linkTextLikeRegex = new RegExp(`\\[${linkTextLike}.*\\]: (.*)`, 'g');
return linkTextLikeRegex.exec(String(proposalsReadmeMd))[1];
}
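A note on the three link helpers above: the same idea can be sketched in a self-contained form that splits a full reference link and resolves its label against the link reference definitions. The sample README text and the helper names `escapeRegExp`, `parseFullRefLink`, and `lookupRefUrl` are hypothetical and not part of this PR. Unlike `getUrlFromDoc`, which matches a label by prefix, this sketch matches the label exactly and escapes it first, so treat it as an alternative to consider rather than a drop-in replacement.

```javascript
// Hypothetical README excerpt; the row contents are made up for illustration.
const sampleReadme = [
  '| [`.at()`][at] | Shu-yu Guo | <sub>[November 2020][at-notes]</sub> |',
  '',
  '[at]: https://github.com/tc39/proposal-relative-indexing-method',
  '[at-notes]: https://github.com/tc39/notes/blob/master/meetings/2020-11/nov-17.md',
].join('\n');

// Escapes regex metacharacters so a label is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Splits '[text][label]' into its two parts; returns null for anything else.
function parseFullRefLink(cell) {
  const match = /^\[(.*)\]\[([^\]]*)\]$/.exec(cell.trim());
  return match ? { text: match[1], label: match[2] } : null;
}

// Finds the URL of the link reference definition '[label]: url', or
// returns undefined when the document has no such definition.
function lookupRefUrl(doc, label) {
  const definition = new RegExp(`^\\[${escapeRegExp(label)}\\]:\\s*(\\S+)`, 'm');
  const match = definition.exec(doc);
  return match ? match[1] : undefined;
}
```

Returning `null` or `undefined` for a missing link or definition lets the caller decide how to handle the gap, whereas indexing `exec(...)[1]` directly throws a `TypeError` when nothing matches.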
/**
* First checks `stage3.yml` for pre-existing code sample. If present,
* uses that. If missing, remains missing. In the case of new proposals,
* the following occurs.
*
* Downloads the proposal's README file using the GitHub API, parses the
* Markdown file, and returns the contents of the JavaScript code block.
*
* If there are multiple matching JavaScript code blocks, the one that comes
* last in the document is used. If there are no matching JavaScript code
* blocks, returns `undefined`.
*/
async function getCodeSample(prpslId) {
const prpslData = yaml.load(readFileSync(FILE_PROPOSAL_DATA, 'utf8'));
let codeSample = null;
prpslData.forEach(value => {
if (hasOwn(value, 'id') && value.id === prpslId)
codeSample = hasOwn(value, 'example') ? value.example : undefined;
});
// The example was either already filled out manually or simply did not
// have one (which is fine), so return our discovery early and avoid parsing
// remote proposal repo READMEs.
if (codeSample !== null)
return codeSample;
// Fetch the file contents and parse out the code samples.
const contents = await ghFileImporter.fetchFileContents('tc39', prpslId, 'README.md');
const codeBlockRegex = /```[a-z]*\n[\s\S]*?\n```\n/g;
// `match` with a global regex returns every code block; `exec` returns only the first.
const codeBlocks = String(contents).match(codeBlockRegex) ?? [];
return codeBlocks.length > 0 ?
codeBlocks.pop()?.replace(/```(.?)*?\n+/gm, '') : undefined;
}
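On the "last code block wins" rule described in the doc comment for `getCodeSample`: a minimal sketch of that extraction, assuming the README is available as a plain string. The helper name `lastCodeBlock` and the sample document are hypothetical. The fence marker is built with `repeat` only to keep literal backtick runs out of the example.

```javascript
// Three backticks, built programmatically to keep this example fence-safe.
const TICKS = '`'.repeat(3);

// Returns the body of the last fenced code block in a Markdown string,
// or undefined when the document contains none.
function lastCodeBlock(markdown) {
  const fence = new RegExp(`${TICKS}[a-z]*\\n([\\s\\S]*?)\\n${TICKS}`, 'g');
  // matchAll yields every match of a global regex, with capture groups.
  const blocks = [...markdown.matchAll(fence)];
  return blocks.length > 0 ? blocks[blocks.length - 1][1] : undefined;
}

// Hypothetical README with two code blocks.
const sample = [
  '# Demo',
  `${TICKS}js`,
  'first();',
  TICKS,
  'Some text.',
  `${TICKS}js`,
  'second();',
  TICKS,
  '',
].join('\n');
```

Capturing the block body in a group avoids a second pass to strip the fence lines. A single `exec` call on the same pattern would stop at the block containing `first();`.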
function getPrpslId(linkTextLike) {
let prpslUrl = getUrlFromDoc(linkTextLike);
if (prpslUrl.endsWith('/'))
prpslUrl = prpslUrl.slice(0, -1); // Drop the trailing slash so the last path segment is the repo name.
const prpslId = String(prpslUrl
?.split('/')
?.pop()
?.toLowerCase());
return prpslId;
}
/** Checks if the proposal has a spec using the files listed by the GitHub API. */
async function hasSpec(prpslId) {
let isFound = false;
const result = await ghFileImporter.fetchMetadata('tc39', prpslId);
result.data.forEach((value) => {
if (value.path === 'spec.html')
isFound = true;
});
return isFound;
}
function json2yaml(buffer, options) {
const src = JSON.parse(buffer.toString());
const yamlDocument = yaml.dump(src, options);
return Buffer.from(yamlDocument);
}
/**
* Gets the proposal short description.
*
* First checks `stage3.yml` for pre-existing short description. If present,
* uses that. If missing, fetches it from the GitHub repo.
*/
async function getShortDescription(prpslId) {
let description;
const prpslData = yaml.load(readFileSync(FILE_PROPOSAL_DATA, 'utf8'));
prpslData.forEach(value => {
if (hasOwn(value, 'id') && value.id === prpslId) {
// Proposal description has already been filled out, so use that.
description = hasOwn(value, 'description') ? value.description : undefined;
}
});
if (!description) {
// No local description was found, so use the description of the GitHub repo.
const repoMetadata = await ghFileImporter.fetchMetadata('tc39', prpslId);
description = repoMetadata.data.description;
}
log.info(`Description for ${curlyQuote(prpslId)}: ${description}`);
return description;
}
/** Populates a PresenceObj from a ProposalStageTableRecord object. */
function presenceObjFrom(valObj) {
let lastPresentedVal = valObj.last_presented;
// lastPresentedVal can either be:
// - a full reference link: <sub>[December&#xA0;2019][nonblocking-notes]</sub>
// - just a date: <sub>September&#xA0;2020</sub>
lastPresentedVal = sanitizeHtml(lastPresentedVal, { allowedTags: [] });
const presenceObj = { date: '', url: undefined };
if (isFullRefLink(lastPresentedVal)) {
presenceObj.date = getFullRefLinkContent(lastPresentedVal, 0);
presenceObj.url = getUrlFromDoc(getFullRefLinkContent(lastPresentedVal, 1));
}
else {
presenceObj.date = lastPresentedVal;
presenceObj.url = undefined;
}
return presenceObj;
}
// -----------------------------------------------------------------------------
// Main
// -----------------------------------------------------------------------------
const tblRegex = /(### Stage 3\n\n)([\s\S]+)(\n\n### Stage 2)/;
const markdownTbl = tblRegex.exec(String(proposalsReadmeMd))[2];
const jsonTbl = mdTbl2json(markdownTbl, (val) => sanitizeHtml(String(val), { allowedTags: ['code', 'br'] }), (val) => sanitizeHtml(String(val).toLowerCase().replace(' ', '_'), { allowedTags: [] }));
// Now, with our stage 3 table in JSON form, we must take what we need from each
// row and use the cell contents to construct our ProposalRecord data structure
// prior to making the JSON -> YAML conversion.
// -----------------------------------------------------------------------------
const prpslRcrdPromiseArr = jsonTbl.map(async (value) => {
const prpslRcrdId = getPrpslId(getFullRefLinkContent(value.proposal, 1));
const prpslRcrd = {
id: prpslRcrdId,
authors: value.author.split('<br />'),
champions: value.champion.split('<br />'),
description: await getShortDescription(prpslRcrdId),
example: await getCodeSample(prpslRcrdId),
has_specification: await hasSpec(prpslRcrdId),
presented: [presenceObjFrom(value)],
title: await mdCodeSpans2html(getFullRefLinkContent(value.proposal, 0)),
tests: hasOwn(value, 'tests') && isFullRefLink(value.tests) ?
[getUrlFromDoc(getFullRefLinkContent(value.tests, 1))] : undefined,
};
return prpslRcrd;
});
Promise.allSettled(prpslRcrdPromiseArr).then(async (results) => {
const data = [];
results.forEach(result => {
if (result.status === 'fulfilled')
data.push(result.value);
else
throw new Error(result.reason);
});
const dataBuffer = Buffer.from(JSON.stringify(data));
const resultBuffer = json2yaml(dataBuffer, {});
await fsp.writeFile(FILE_PROPOSAL_DATA, resultBuffer);
});
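On the final `Promise.allSettled` step: throwing inside the `.then` callback produces a rejection that only the `unhandledRejection` handler sees, where it is logged as a warning. A minimal sketch of one alternative is to collect every rejection reason first and fail once with all of them. The helper names `partitionSettled` and `collectRecords` are hypothetical and not part of this PR.

```javascript
// Splits Promise.allSettled results into fulfilled values and rejection reasons.
function partitionSettled(results) {
  const values = [];
  const reasons = [];
  for (const result of results) {
    if (result.status === 'fulfilled') values.push(result.value);
    else reasons.push(result.reason);
  }
  return { values, reasons };
}

// Awaits every record promise; rejects once, listing all failures,
// or resolves with the fulfilled values in their original order.
async function collectRecords(promises) {
  const settled = await Promise.allSettled(promises);
  const { values, reasons } = partitionSettled(settled);
  if (reasons.length > 0) {
    const details = reasons.map(String).join('; ');
    throw new Error(`${reasons.length} record(s) failed: ${details}`);
  }
  return values;
}
```

A caller could then `await collectRecords(prpslRcrdPromiseArr)` inside a `try`/`catch` and set a non-zero exit code on failure, so a partial fetch never overwrites `stage3.yml`.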