Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StringStream & more stdlib modules support for JS/NimScript #14095

Merged
merged 2 commits into from
Apr 28, 2020
Merged

StringStream & more stdlib modules support for JS/NimScript #14095

merged 2 commits into from
Apr 28, 2020

Conversation

metagn
Copy link
Collaborator

@metagn metagn commented Apr 23, 2020

Closes #12417, closes #11066, closes #4247

streams.StringStream now compiles for JavaScript, the implementation was already there it was only enabled for compile time for some reason (though it only implemented readDataStr and not read/peek/writeData). StringStreams in JS have the limitation that any buffer pointers used are cast to ptr string for the lack of a better option (except maybe a conditional var openarray[char] parameter but that would be very ugly), any pointer incompatible with ptr string will not work. JS now compiles parsecsv, parsecfg, parsesql, xmlparser, htmlparser, ropes and there is new support for cstrutils.

typeinfo was broken on JS because of commit ca4b971#diff-f3396a3284b9ca3e52aff3aeef171a09, jssys only imported system/indexerrors whereas it was included in system on C/C++. It is included now in jssys.

lexbase and streams now compile on NimScript, they would have worked before but they just didn't compile correctly. NimScript now compiles: parsecsv, parsecfg, parsesql, xmlparser, htmlparser, ropes, json, parsejson, strtabs

unidecode has never not worked in NimScript, both in compiling and the module being functional, but it was specifically mentioned in the NimScript docs that it didn't, so I added it to the changelog.

marshal now mentions JS and NimScript instead of just the new runtime when it's imported on the wrong backend. #11066 mentions marshal in JS, but should be closed since it's specifically talking about newStringStream. There is another issue, #5667, about the pointer-to-int cast in marshal in JS, but this is irrelevant, marshal imports typeinfo which doesn't compile on JS.

@metagn metagn changed the title Treat JS the same as compile time VM in streams Partial support for StringStream in JS Apr 23, 2020
@metagn metagn changed the title Partial support for StringStream in JS Support for StringStream in JS Apr 24, 2020
@metagn metagn marked this pull request as ready for review April 24, 2020 05:04
lib/pure/streams.nim Outdated Show resolved Hide resolved
@timotheecour
Copy link
Member

timotheecour commented Apr 24, 2020

  • NimPatch is underused. We should start using it, eg: I'd replace when (NimMajor, NimMinor) >= (1, 3) by when (NimMajor, NimMinor, NimPatch) >= (1, 3, 2) (and increment system.NimPatch to 2)

reason: it allows writing portable code in between releases even if switching branches/versions/commits.
So that I can have in my user config.nims (or just user code) conditionals like when (NimMajor, NimMinor, NimPatch) >= (1, 3, 2): dosmthg_with_js_ StringStream else: workaround and it'll work regardless which commit I'm on

tests/js/tstreams.nim Outdated Show resolved Hide resolved
@nim-lang nim-lang deleted a comment from SolitudeSF Apr 24, 2020
lib/pure/streams.nim Outdated Show resolved Hide resolved
changelog.md Outdated Show resolved Hide resolved
@timotheecour
Copy link
Member

  • maybe out of scope for this PR but could newFileStream be supported for -d:js -d:nodejs ?

tests/js/tstreams.nim Outdated Show resolved Hide resolved
@metagn metagn changed the title Support for StringStream in JS streams/StringStream & more stdlib modules support in JS Apr 25, 2020
@metagn
Copy link
Collaborator Author

metagn commented Apr 25, 2020

Some notes:

It's possible to only support readDataStr for JS just like the VM before the PR. This will make the PR smaller but I don't think it's worth it. The standard library is slightly affected by this change: unittest uses writeLine(Stream, string) for "JUnitOutputFormatter" and ropes uses write(Stream, string) only in its own proc write(Stream, Rope) which could just be undeclared for JS. Nothing is using peek.

I can either implement write overloads for integers and bools just like read/peek or revert the overloads for read/peek and undeclare them, telling people to make an ArrayBuffer library.

Streams would have benefited from a kind of type StreamOp = enum read, write, peek then you only define 1 closure in the stream field that takes it as an argument. This way we could add peekDataStr/writeDataStr all to 1 proc too.

Supporting FileStream for nodejs would be hard, you'd have to make the programmer call require("fs") themselves, and I don't think the standard library has ever used nodejs features locked behind require.

@timotheecour
Copy link
Member

timotheecour commented Apr 25, 2020

IMO we should pick whatever solution will maximize the amount of code that works on all backends c,vm,js (with same semantics) and code reuse (as few backend-specific tweaks as possible).

  • vm backend can almost always reuse code already written for c backend thanks to vmops; the wrapping logic is getting simpler (see compiler/vmconv.nim) and can be 100% automated via a macro if needed
  • ideally, the branching (eg when nimvm: vmcode else: rtcode) should happen preferably at the lower levels (eg copyMem), to maximize code reuse and avoid rewriting lots of high-level logic for different backends

Supporting FileStream for nodejs would be hard, you'd have to make the programmer call require("fs") themselves,

seems like a totally reasonable requirement, we shouldn't restrict to the minimal common features amongst -d:nimbrowser and -d:nodejs. we can simply add a runtime check if needed, eg doAssert(check_fs_available, "need to require('fs')") or (in future work) automatically insert select require(X) statements in jsgen'd code upon 1st use of a symbol that requires it, for -d:nodejs code).

and I don't think the standard library has ever used nodejs features locked behind require.

in #13451 I've added nodejs support for getCurrentDir, via {.emit: "`ret` = process.cwd();".}, so that import os;echo getCurrentDir() works with nim js -r -d:nodejs main; it's tested in CI, see tests/js/tos.nim

CT failure vs RT failure

This is a subtle point: failing at runtime (eg missing requirements) instead of disable code at CT will simplify a lot of code (less backend-specific branching, more availability across backends), because the runtime failure happens in only a few key places, whereas disabling code at CT is "viral" and affects all the callers, regardless whether runtime code will actually use that feature, eg:

I want this code to work with nim js -r main.
when someRuntimeCondition() will be true (at runtime), it'll raise an assert failure unless you're in nodejs and require('fs') made fs available.
when someRuntimeCondition() will be false (at runtime), it'll just work.

var cond = someRuntimeCondition()
if cond: 
  var s = newFileStream("foo.txt")
  fun(s)
else:
  var s = newStringStream("bar")
  fun(s)

This in turns make code that calls this work without being disabled at compile time just because some not-taken-branch would fail.

I can either implement write overloads for integers and bools

i don't understand; var s2 = newStringStream("abc"); s2.write (12, true) already works (as of your PR)

Streams would have benefited from a kind of type StreamOp = enum read, write, peek then you only define 1 closure in the stream field that takes it as an argument. This way we could add peekDataStr/writeDataStr all to 1 proc too.

that makes sense to me (as long as it doesn't break code?)

@metagn
Copy link
Collaborator Author

metagn commented Apr 25, 2020

i don't understand; var s2 = newStringStream("abc"); s2.write (12, true) already works (as of your PR)

I'm talking about the generic write[T] which I disabled. This uses unsafeAddr and uses the bit representation of the value, with this PR it will instead use the write(varargs[$, string]) proc.

we shouldn't restrict to the minimal common features amongst -d:nimbrowser and -d:nodejs

That makes sense. You were right about it being out of scope though I'm not going to implement it in this PR, I don't have many good ideas on it.

that makes sense to me (as long as it doesn't break code?)

It would, I'm just complaining that it doesn't exist already.

I've rolled back the integer/bool overloads, a library that performs string -> ArrayBuffer -> integer would be much more elegant.

@metagn metagn changed the title streams/StringStream & more stdlib modules support in JS StringStream & more stdlib modules support for JS/NimScript Apr 25, 2020
@timotheecour
Copy link
Member

timotheecour commented Apr 25, 2020

  • some more info: npm i builtin-modules; node -e "console.log(require('builtin-modules'))" shows fs (+ many other modules), so it's a builtin, and we can just assume it's there.
  • not going to implement it in this PR

oh ya, I'm not suggesting implementing it in this PR, I'm just saying to simply remove when not defined(js): right above type FileStream* = ref FileStreamObj, so that nim js -r main works below:

import streams
var cond = someRuntimeCondition()
if cond: echo newFileStream("foo.txt").readAll # fails at RT if goes here, and in future will work for -d:nodejs
else: echo newStringStream("bar").readAll # works if goes here

I tried that modification from your PR and it did work; when changing runtime condition to false, it gives an easy-to-understand errmsg:

ReferenceError: fopen is not defined
    at open_13745829 (/Users/timothee/git_clone/nim/timn/tests/nim/all/t10618.js:636:22)
    at new_file_stream_14050087 (/Users/timothee/git_clone/nim/timn/tests/nim/all/t10618.js:1089:9)
    at Object.<anonymous> (/Users/timothee/git_clone/nim/timn/tests/nim/all/t10618.js:1554:19)

benefit: no "viral" propagation of CT disabled code, so that more code works so long it's not attempting to open a newFileStream, which is unlikely to occur by accident; less backend-specific branching via when defined js: foo else: bar

with this PR it will instead use the write(varargs[$, string])

I see; yes, write(varargs[$, string]) is the correct approach for js indeed

@metagn
Copy link
Collaborator Author

metagn commented Apr 25, 2020

Oh, that's good about fs.

simply remove when not defined(js): right above type FileStream* = ref FileStreamObj

If I understand correctly then I should reenable write/read/peek, do I throw a ValueError before they do anything? Because runtime errors in this case are not going to be informative. Maybe catch any errors the procs throw in ssWhateverData then rethrow them with an extra message? That would also be bad, JS could just never error on a wrong input.

Also, do I change the version checks or can that be done by someone else after the merge?

@timotheecour
Copy link
Member

timotheecour commented Apr 25, 2020

If I understand correctly then I should reenable write/read/peek, do I throw a ValueError before they do anything?

sorry which one for example? like proc fsPeekData(s: Stream, buffer: pointer, bufLen: int): int = ?
I don't think that's an issue, because calling newFileStream will fail right here on open (see my stacktrace above), so it won't ever call fsPeekData IIUC.
=> single point of failure.
and you can even provide a good errmsg like this:

# in newFileStream:
when defined(js):
  # TODO for future work: support nodejs via `fs` builtin module
  doAssert false, "js not yet supported; context: " & filename
else:
  var f: File
  if open(f, filename, mode, bufSize): result = newFileStream(f)

Also, do I change the version checks or can that be done by someone else after the merge?

sorry, I need more context to understand (3am here..)

@metagn
Copy link
Collaborator Author

metagn commented Apr 25, 2020

Sorry for keeping you up. I meant StringStream procs, if they get a pointer that isn't ptr string they will fail terribly. Specifically write uses unsafeAddr which is bad on JS (#14100). I'm saying there's 2 options: if write or read is called for Stream on JS, it throws a ValueError; or when it dereferences it inside the StringStream procs like ssReadData, then any error will be caught and rethrown like

try:
  cast[ptr string](buffer)[][0..<result] = s.data[s.pos..<s.pos+result]
except:
  raise newException(ValueError, "reading to buffer pointer failed, did you use a non-string pointer?", getCurrentException())

For context about the versioning:

NimPatch is underused. We should start using it, eg: I'd replace when (NimMajor, NimMinor) >= (1, 3) by when (NimMajor, NimMinor, NimPatch) >= (1, 3, 2) (and increment system.NimPatch to 2)

Should I do any one of these things (replace with >= (1, 3, 2) or increment version) or can they be done after the PR?

@timotheecour
Copy link
Member

Should I do any one of these things (replace with >= (1, 3, 2) or increment version) or can they be done after the PR?

leave it like you did for now, we can always change later if ppl agree

@timotheecour
Copy link
Member

timotheecour commented Apr 25, 2020

I meant StringStream procs, if they get a pointer that isn't ptr string they will fail terribly.

sorry can you give me a simple complete example I can run on top of your PR to see what you mean exactly? (ideally one where calling stream apis would end up with a pointer that is a ptr string, and one where it won't)

because right now this works in your pr:

when true:
  import streams
  var s2 = newStringStream("abc")
  doAssert s2.readAll == "abc"
  s2.write("def")

@metagn
Copy link
Collaborator Author

metagn commented Apr 25, 2020

import streams

var s = newStringStream("abcd")
var i: int
discard s.readData(addr i, sizeof i) # this is what read[T](Stream, var T) does
echo i

You get Error: unhandled exception: index 0 not in 0 .. NaN [IndexError], meaning Nim tried to subscript a null/undefined value.

@Araq
Copy link
Member

Araq commented Apr 25, 2020

I cannot merge this, you used [ci skip] in the very last commit.

@timotheecour
Copy link
Member

sadly, [ci skip] is broken on azure, see microsoft/azure-pipelines-agent#2944

@metagn
Copy link
Collaborator Author

metagn commented Apr 26, 2020

It's ok, #14099 had to get merged first anyway

@timotheecour
Copy link
Member

timotheecour commented Apr 26, 2020

@hlaaftana i have some more feedback, please hold on before any merge :)
also, i may have a solution for #14095 (comment) (will update here soon today)

also, can you please rebase + force push instead of merge? otherwise really hard to follow your commits ; that's what i do in all my PR's ; any conflicts are almost always easy to fix and when they're not i squash what needs to be squashed before rebase

(assuming you already know all this stuff but just in case:
even if you've already merged, it's easy to go back and do a rebase:
use git reflog, find the hash right before the merge, and git pull --rebase origin devel from there, the git push --force on your branch)

@timotheecour
Copy link
Member

timotheecour commented Apr 26, 2020

  • ok see [WIP] new module jsarrays: ArrayBuffer, DataView + typed js API's #14128, it shows that we can support generic read[T] for streams with nim js (eg, for integer types). Right now it builds on top of your PR just for demoing it works, but I'll reduce it to just the new module jsarrays, which you'll then be able to import in your PR so that streams works without restriction in nim js.

  • the other feedback I had was related to cstrutils.nim; right now you're re-imlementing a lot of things in the js backend, but I think we can avoid that and maximize code reuse between c and js with just a few primitives that would differ between backends; will provide more details tmrw

@metagn
Copy link
Collaborator Author

metagn commented Apr 26, 2020

The new code in cstrutils is just copy pasted from strutils (can be reused right now, just cmpIgnoreStyle($a, $b).cstring) which I understand is problematic and didn't pay much attention to at first. Trying my best to adhere to your feedback and not make too many commits: cmpIgnoreStyle and cmpIgnoreCase now say "for JS use strutils instead", maybe strutils can have cstring | string arguments for JS later down the line, or maybe when Enumerable[char]/Iterable[char] is implemented.

I understand that streams is a mess when it comes to overspecialization but it's what had to be done to work around pointer limitations in JS. The main purpose of this PR anyway was to make the rest of the stdlib compatible with JS, I wasn't aiming to create a real streams library. I like the idea of jsarrays, but its current use in streams to work around pointer is a bit complicated and I think it'd be easier for you to work if this PR was merged first.

@metagn
Copy link
Collaborator Author

metagn commented Apr 27, 2020

Ormin caused the test failures on azure but builds.sr.ht complains about the pegs test, most likely an unrelated CI issue. Should be good regardless

@Araq Araq closed this Apr 27, 2020
@Araq Araq reopened this Apr 27, 2020
@timotheecour
Copy link
Member

timotheecour commented Apr 28, 2020

@hlaaftana

Ormin caused the test failures on azure but builds.sr.ht complains about the pegs test, most likely an unrelated CI issue. Should be good regardless

the failure is because you haven't rebased against devel so my fix #14134 wasn't picked up; if you git pull --rebase origin devel (then push --force) it works, eg rebuild nim the test passes locally:
(eg XDG_CONFIG_HOME= nim c --lib:lib -r testament/testament --nim:bin/nim r tests/stdlib/tpegs.nim)

@metagn
Copy link
Collaborator Author

metagn commented Apr 28, 2020

Could've sworn I rebased it before. Fine now.

@metagn
Copy link
Collaborator Author

metagn commented Apr 28, 2020

Merge?

@Araq Araq merged commit cd9af6b into nim-lang:devel Apr 28, 2020
@metagn metagn deleted the patch-2 branch April 29, 2020 07:29
@timotheecour
Copy link
Member

timotheecour commented Apr 29, 2020

@hlaaftana amazing job! this is 1 big step closer to the goal of making all of stdlib available for all targets

EchoPouet pushed a commit to EchoPouet/Nim that referenced this pull request Jun 13, 2020
…#14095)

* StringStream & more stdlib modules support for JS/NimScript

* change back pegs test in line with nim-lang#14134
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants