Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parser: parse Unicode identifiers #218

Closed
wants to merge 6 commits into from
Closed

Conversation

belochub
Copy link
Member

@belochub belochub added this to the 1.0.0 milestone Jun 12, 2017
@belochub
Copy link
Member Author

@aqrln, may you help me with creation of pre-build configuration script to make it possible to choose which of the two implementations to use (check out 8942748#diff-9689e209b75f216133f2b377028d89f6R9)?

@belochub belochub changed the title parser: parse unicode identifiers parser: parse Unicode identifiers Jun 12, 2017
Copy link
Member

@nechaido nechaido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@lundibundi lundibundi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

const getOutputPath = filename => path.join(__dirname, '../src', filename);

const getFileHeader = filename =>
`// Copyright (c) 2017 JSTP project authors. Use of this source code is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it'll be better to put license in a separate file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion it is good to have it here, because it will probably help to avoid any legal problems, given the fact that this file is also a piece of software which works with Unicode data files. Consider this as a precaution, I don't know for sure but I also don't want to find out about it in a bad way.

It helps to understand the resulting format of the output files as well, so that you don't need to look anywhere else or open this auto-generated files (which are pretty big and can slow down some of the editors) to understand their contents.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine with me =), reasonable enough.

@aqrln
Copy link
Member

aqrln commented Jun 15, 2017

@belochub can you please rebase to resolve conflicts?


http.get(UCD_LINK, (res) => {
const linereader = byline.createStream(res);
linereader.on('data', (line) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, wouldn't the core readline module be sufficient to accomplish this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it probably is sufficient, I missed that.

Add tool for getting and parsing needed categories from
Unicode Character Database and generating C++ header file with
code points arrays.
Also add UTF-8 decoding function.
Add two possible options to use when checking whether the code
point is an identifier.
Copy link
Member

@aqrln aqrln left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rubber-stamp-ish LGTM with one comment (it can be fixed on landing though).

@@ -4637,8 +4637,7 @@
"wordwrap": {
"version": "0.0.3",
"resolved": "https://registry.npmjs.org/wordwrap/-/wordwrap-0.0.3.tgz",
"integrity": "sha1-o9XabNXAvAAI03I0u68b7WMFkQc=",
"dev": true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh? Looks like npm correctly handles the case when a package becomes used in dependencies and not only in devDependencies, but does not handle the case when the package is not required for dependencies anymore. Can you please revert this change manually?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, wait, I think you only modified devDependencies. Now I really wonder what happened here.

@belochub
Copy link
Member Author

@aqrln, we can't land it yet, you still haven't addressed #218 (comment).

@belochub
Copy link
Member Author

@aqrln, ping.

Choose either full or short Unicode tables at build time: if
$JSTP_USE_SHORT_UNICODE_TABLES is defined and set to a non-falsy value,
then the native addon will be compiled with short tables, otherwise full
tables will be used.
@aqrln
Copy link
Member

aqrln commented Jun 20, 2017

@belochub I've pushed a commit to your branch.
@nechaido @lundibundi PTAL

@aqrln aqrln removed the blocked label Jun 20, 2017
aqrln pushed a commit that referenced this pull request Jun 21, 2017
* Add tool for getting and parsing needed categories from Unicode
  Character Database and generating C++ header file with code points
  arrays.

* Add UTF-8 decoding function.

* Implement Unicode keys parsing and add two possible options to use
  when checking whether the code point is an identifier.

Refs: https://github.com/metarhia/jstp/issues/152
PR-URL: #218
Reviewed-By: Dmytro Nechai <nechaido@gmail.com>
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
Reviewed-By: Alexey Orlenko <eaglexrlnk@gmail.com>
aqrln added a commit that referenced this pull request Jun 21, 2017
Choose either full or short Unicode tables at build time: if
$JSTP_USE_SHORT_UNICODE_TABLES is defined and set to a non-falsy value,
then the native addon will be compiled with short tables, otherwise full
tables will be used.

PR-URL: #218
Reviewed-By: Mykola Bilochub <nbelochub@gmail.com>
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
aqrln added a commit that referenced this pull request Jun 21, 2017
Choose either full or short Unicode tables at build time: if
$JSTP_USE_SHORT_UNICODE_TABLES is defined and set to a non-falsy value,
then the native addon will be compiled with short tables, otherwise full
tables will be used.

PR-URL: #218
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
@aqrln
Copy link
Member

aqrln commented Jun 21, 2017

Landed in 4e942ae and 4b67cbd.

@aqrln aqrln closed this Jun 21, 2017
@aqrln aqrln deleted the parser-unicode-identifiers branch June 21, 2017 07:10
belochub added a commit that referenced this pull request Jan 22, 2018
* Add tool for getting and parsing needed categories from Unicode
  Character Database and generating C++ header file with code points
  arrays.

* Add UTF-8 decoding function.

* Implement Unicode keys parsing and add two possible options to use
  when checking whether the code point is an identifier.

Refs: https://github.com/metarhia/jstp/issues/152
PR-URL: #218
Reviewed-By: Dmytro Nechai <nechaido@gmail.com>
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
Reviewed-By: Alexey Orlenko <eaglexrlnk@gmail.com>
belochub pushed a commit that referenced this pull request Jan 22, 2018
Choose either full or short Unicode tables at build time: if
$JSTP_USE_SHORT_UNICODE_TABLES is defined and set to a non-falsy value,
then the native addon will be compiled with short tables, otherwise full
tables will be used.

PR-URL: #218
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
belochub added a commit that referenced this pull request Jan 22, 2018
* Add tool for getting and parsing needed categories from Unicode
  Character Database and generating C++ header file with code points
  arrays.

* Add UTF-8 decoding function.

* Implement Unicode keys parsing and add two possible options to use
  when checking whether the code point is an identifier.

Refs: https://github.com/metarhia/jstp/issues/152
PR-URL: #218
Reviewed-By: Dmytro Nechai <nechaido@gmail.com>
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
Reviewed-By: Alexey Orlenko <eaglexrlnk@gmail.com>
belochub pushed a commit that referenced this pull request Jan 22, 2018
Choose either full or short Unicode tables at build time: if
$JSTP_USE_SHORT_UNICODE_TABLES is defined and set to a non-falsy value,
then the native addon will be compiled with short tables, otherwise full
tables will be used.

PR-URL: #218
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
@belochub belochub mentioned this pull request Jan 22, 2018
belochub added a commit to metarhia/mdsf that referenced this pull request Jul 19, 2018
* Add tool for getting and parsing needed categories from Unicode
  Character Database and generating C++ header file with code points
  arrays.

* Add UTF-8 decoding function.

* Implement Unicode keys parsing and add two possible options to use
  when checking whether the code point is an identifier.

Refs: https://github.com/metarhia/jstp/issues/152
PR-URL: metarhia/jstp#218
Reviewed-By: Dmytro Nechai <nechaido@gmail.com>
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
Reviewed-By: Alexey Orlenko <eaglexrlnk@gmail.com>
belochub pushed a commit to metarhia/mdsf that referenced this pull request Jul 19, 2018
Choose either full or short Unicode tables at build time: if
$JSTP_USE_SHORT_UNICODE_TABLES is defined and set to a non-falsy value,
then the native addon will be compiled with short tables, otherwise full
tables will be used.

PR-URL: metarhia/jstp#218
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
belochub added a commit to metarhia/mdsf that referenced this pull request Jul 19, 2018
* Add tool for getting and parsing needed categories from Unicode
  Character Database and generating C++ header file with code points
  arrays.

* Add UTF-8 decoding function.

* Implement Unicode keys parsing and add two possible options to use
  when checking whether the code point is an identifier.

Refs: https://github.com/metarhia/jstp/issues/152
PR-URL: metarhia/jstp#218
Reviewed-By: Dmytro Nechai <nechaido@gmail.com>
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
Reviewed-By: Alexey Orlenko <eaglexrlnk@gmail.com>
belochub pushed a commit to metarhia/mdsf that referenced this pull request Jul 19, 2018
Choose either full or short Unicode tables at build time: if
$JSTP_USE_SHORT_UNICODE_TABLES is defined and set to a non-falsy value,
then the native addon will be compiled with short tables, otherwise full
tables will be used.

PR-URL: metarhia/jstp#218
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
belochub added a commit to metarhia/mdsf that referenced this pull request Jul 21, 2018
* Add tool for getting and parsing needed categories from Unicode
  Character Database and generating C++ header file with code points
  arrays.

* Add UTF-8 decoding function.

* Implement Unicode keys parsing and add two possible options to use
  when checking whether the code point is an identifier.

Refs: https://github.com/metarhia/jstp/issues/152
PR-URL: metarhia/jstp#218
Reviewed-By: Dmytro Nechai <nechaido@gmail.com>
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
Reviewed-By: Alexey Orlenko <eaglexrlnk@gmail.com>
belochub pushed a commit to metarhia/mdsf that referenced this pull request Jul 21, 2018
Choose either full or short Unicode tables at build time: if
$JSTP_USE_SHORT_UNICODE_TABLES is defined and set to a non-falsy value,
then the native addon will be compiled with short tables, otherwise full
tables will be used.

PR-URL: metarhia/jstp#218
Reviewed-By: Denys Otrishko <shishugi@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants