Stack overflow in js_of_ocaml #26

Open
aantron opened this issue Jul 6, 2017 · 9 comments
Comments

@aantron
Owner

aantron commented Jul 6, 2017

Reported by @Armael in #22 (comment):

Hi. I'm not sure what the state of this discussion is, but I can report on my recent experience of using markup with js_of_ocaml (wrt bucklescript, I would very much like to use jsoo and not BS, and do not really care for the size of the produced js file).

Essentially it is very easy to run into a stack overflow, when using markup compiled to js. I suspect this is due to the CPS used in the implementation, which I assume is hard for jsoo to optimize into tail-recursion.

One solution I see would be to manually trampoline in markup, which is a somewhat invasive change...

Repro case: https://paste.isomorphis.me/O8j

and confirmed in a subsequent comment.

@Armael

Armael commented Jul 6, 2017

As far as I know, CPS is not optimized into tailcalls by js_of_ocaml.
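
To make the failure mode concrete, here is a minimal sketch of the kind of continuation-passing code under discussion. It is not taken from Markup.ml; `next` and `count_cps` are hypothetical names used only for illustration. Every call is a tail call, so native OCaml runs it in constant stack space. The js_of_ocaml documentation says that general tail calls are not optimized (only self-recursive and mutually recursive functions are), so each call to a continuation like `k` would presumably add a JavaScript stack frame.

```ocaml
(* Hypothetical CPS-style reader, for illustration only; this is not
   Markup.ml's actual code. *)

(* Read the character at index [i] of [s] and pass it to the
   continuation [k]; [None] signals end of input. *)
let next (s : string) (i : int) (k : char option -> 'a) : 'a =
  if i < String.length s then k (Some s.[i]) else k None

(* Count the characters of [s] in continuation-passing style.
   [loop], [next] and the anonymous continuation call each other in
   tail position, so native OCaml needs no extra stack. A backend that
   compiles [k ...] to an ordinary JavaScript call would instead grow
   the stack by a few frames per character. *)
let count_cps (s : string) (k : int -> 'a) : 'a =
  let rec loop i acc =
    next s i (function
      | Some _ -> loop (i + 1) (acc + 1)
      | None -> k acc)
  in
  loop 0 0

let () =
  let input = String.make 1_000_000 'x' in
  count_cps input (fun n -> Printf.printf "%d\n" n)
```

The stack depth in such a scheme grows with the number of parsing steps rather than with the nesting of the document, which would be consistent with very small inputs overflowing later in this thread.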

@Armael

Armael commented Jul 6, 2017

@hhugo Is there something that could be done on jsoo's side? Or is the only solution to implement trampolining in markup itself?

@Jo-Blade

Jo-Blade commented Jul 25, 2024

Hello, is there anything new about this problem today? Do you know any alternative for XML or HTTP parsing that would work with jsoo? I can confirm the issue still persists with markup 1.0.3 and js_of_ocaml 5.8.1.

In my case, the stack overflow occurs even with an HTML string as simple as:

let example_html = {|
<!DOCTYPE html>
<html lang="en">
  <body>
    <script type="text/javascript" src="example.js"></script>
  </body>
</html>
|}

let _ = Soup.parse example_html

@aantron
Owner Author

aantron commented Jul 25, 2024

Hi, no progress as of now. Could you give your complete workflow, if it's not too much work to simplify, so I can readily observe the problem? Some sample program, its Dune files, commands you are running? Is this in Node or a browser?

For parsing XML you could try Xmlm. For HTML there is a very non-standards-compliant HTML parser in ocamlnet, but I doubt ocamlnet itself is compatible with js_of_ocaml, though I recall previously extracting the parser from that library. You might also be able to get by by treating well-enough-formed HTML as if it were XML and using an XML parser.

But otherwise, it might be best to adapt Markup.ml to be useful in js_of_ocaml.

@hhugo

hhugo commented Jul 25, 2024

@aantron, just in case it's useful, here is how I adapted jsonm for jsoo : dbuenzli/jsonm#20

@hhugo

hhugo commented Jul 25, 2024

dune

(executable
  (name main)
  (modes js)
  (libraries lambdasoup)
  (js_of_ocaml (flags --pretty))
)

main.ml

let example_html =
  {|
<!DOCTYPE html>
<html lang="en">
  <body>
    <script type="text/javascript" src="example.js"></script>
  </body>
</html>
|}

let _  = Soup.parse example_html

commands:

  • dune build main.bc.js
  • node _build/default/main.bc.js
     throw err;
     ^

RangeError: Maximum call stack size exceeded
    at caml_call1 (/home/hugo/js_of_ocaml/_build/default/mark/main.bc.js:39429:15)
    at iterate (/home/hugo/js_of_ocaml/_build/default/mark/main.bc.js:41333:26)
    at caml_call1 (/home/hugo/js_of_ocaml/_build/default/mark/main.bc.js:39429:15)
    at /home/hugo/js_of_ocaml/_build/default/mark/main.bc.js:40406:26
    at caml_call3 (/home/hugo/js_of_ocaml/_build/default/mark/main.bc.js:26155:15)
    at f (/home/hugo/js_of_ocaml/_build/default/mark/main.bc.js:26184:13)
    at caml_call3 (/home/hugo/js_of_ocaml/_build/default/mark/main.bc.js:26155:15)
    at f (/home/hugo/js_of_ocaml/_build/default/mark/main.bc.js:26184:13)
    at caml_call3 (/home/hugo/js_of_ocaml/_build/default/mark/main.bc.js:26155:15)
    at next (/home/hugo/js_of_ocaml/_build/default/mark/main.bc.js:26194:12

Adding Error.stackTraceLimit = Infinity; at the top of the js file can give you the whole trace.

Note that building with the release profile makes that specific example work, most probably because jsoo is able to inline more (e.g. cross-module inlining).

  • dune build --profile release main.bc.js
  • node _build/default/main.bc.js

@Jo-Blade

> Hi, no progress as of now. Could you give your complete workflow, if it's not too much work to simplify, so I can readily observe the problem? Some sample program, its Dune files, commands you are running? Is this in Node or a browser?
>
> For parsing XML you could try Xmlm. For HTML there is a very non-standards-compliant HTML parser in ocamlnet, but I doubt ocamlnet itself is compatible with js_of_ocaml, though I recall previously extracting the parser from that library. You might also be able to get by by treating well-enough-formed HTML as if it were XML and using an XML parser.
>
> But otherwise, it might be best to adapt Markup.ml to be useful in js_of_ocaml.

In fact, the code is an HTTP parser that fetches the source of a website and extracts all links to display them in a simple CLI. So basically I use lambdasoup to look for all "<a>" tags and extract their "href" attribute and the rendered text (because a link can contain a lot of useless divs and I'm only interested in the rendered, visible text). The failing snippet follows:

(* s is the string response from cohttp-lwt-jsoo *)
fun s ->
  Soup.parse s $? "div[role=main]" |>> fun div ->
  (* Get all links of the current div *)
  div $$ "a" |> Soup.to_list |> fun l ->
  `Ok
    (* Convert all links and keep only non-"None" results *)
    (List.fold_left
       (fun acc x ->
         match
           do_conversion (Soup.attribute "href" x) (Soup.trimmed_texts x)
         with
         | None -> acc
         | Some y -> y :: acc)
       [] l))

As I'm fetching website content, the target platform is Node.js (because in a web browser it would trigger the CORS policy).

To be honest, it's not a big deal if I cannot use js_of_ocaml, as it was not even originally planned for this project. It's just a little disappointing, because I was expecting js_of_ocaml to be an easy way to distribute my program to Windows users (for example) without extra work. I've already done some tests, so it's really this snippet that causes the whole program to fail: if I replace it with a fake returned value, the program compiles and works.

@v-gb

v-gb commented Dec 18, 2024

Not sure if there's interest for upstreaming this functionality, but I made markup work in javascript in this branch: master...v-gb:markup.ml:js. It uses trampolines, which is a fairly lightweight change.

I had tried compiling with the jsoo flag that enables the CPS transformation, but (from memory) it made the code much slower to compile, much slower to run, much bigger, and it didn't execute correctly when enabling optimizations. So I much prefer the trampoline version.

It's not very fast, but I think that's due to markup's one-character-at-a-time approach, rather than the trampoline overhead.
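
For readers unfamiliar with the technique, the following is a minimal sketch of trampolining, assuming a simple character-counting task. It is not the code from the branch mentioned above; `step`, `run`, `next` and `count_trampolined` are hypothetical names. Instead of calling its continuation directly, each step returns a thunk, and a driver loop invokes the thunks one at a time so that the stack unwinds between steps.

```ocaml
(* Hypothetical trampoline, for illustration only; this is not the code
   from the markup.ml branch discussed in this thread. *)

(* A computation either finishes with a value or returns a thunk that
   performs the next step. *)
type 'a step =
  | Done of 'a
  | More of (unit -> 'a step)

(* Driver loop: run one step at a time. [run] is self tail-recursive,
   which is the case js_of_ocaml documents as being compiled to a loop,
   so the stack stays flat however many steps there are. *)
let rec run (s : 'a step) : 'a =
  match s with
  | Done v -> v
  | More thunk -> run (thunk ())

(* Like a CPS reader, but the call to the continuation is deferred by
   wrapping it in [More], so control returns to [run] first. *)
let next (s : string) (i : int) (k : char option -> 'a step) : 'a step =
  if i < String.length s then More (fun () -> k (Some s.[i])) else k None

(* Count the characters of [s]; each character costs one bounce. *)
let count_trampolined (s : string) : int step =
  let rec loop i acc =
    next s i (function
      | Some _ -> loop (i + 1) (acc + 1)
      | None -> Done acc)
  in
  loop 0 0

let () =
  let input = String.make 1_000_000 'x' in
  Printf.printf "%d\n" (run (count_trampolined input))
```

The cost is one closure allocation and one extra indirection per step, which fits the remark above that the trampoline overhead is probably not the dominant cost.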
