preview taking a lot of time #331

Closed
jegork opened this issue Nov 24, 2022 · 3 comments · Fixed by #373

Comments

@jegork

jegork commented Nov 24, 2022

Hello!

I have trouble running the preview command.
First, I run the build command:

doc-builder build transformers docs/source/en/ --build_dir ~/tmp/test-build

Afterwards, I run:

doc-builder preview transformers ~/tmp/test-build

Which leads me to the following prompt:

Building the MDX files:  31%|█████████████████████████████████████████████▌                                                                                                       | 79/258 [01:53<19:06,  6.40s/it]

This step never seems to finish: it was still running after 8 hours. I am not sure whether this is expected or a bug.

NB: it gets stuck at around 77/258; before that, everything is super fast.

I am running macOS on an M1 Mac.

Thanks!

@mariosasko
Contributor

Hi! What do you get when you interrupt the process (CTRL + C) while it's hanging?
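
If interrupting by hand is awkward, Python's standard `faulthandler` module can dump the stack automatically once a timeout expires. The following is a minimal sketch, not part of doc-builder: `busy_wait` and `capture_stack_of_slow_call` are hypothetical names, and `busy_wait` only stands in for the step that hangs.

```python
import faulthandler
import tempfile
import time


def busy_wait(seconds):
    """Stand-in for a hanging step: spin until `seconds` have elapsed."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        pass


def capture_stack_of_slow_call(func, *args, timeout=0.2):
    """Run func(*args) and return whatever faulthandler dumped.

    If the call is still running after `timeout` seconds, faulthandler
    writes the stack of every thread to a temporary file. The text is
    returned once the call has finished (empty if it beat the timeout).
    """
    with tempfile.TemporaryFile(mode="w+") as log:
        faulthandler.dump_traceback_later(timeout, file=log)
        try:
            func(*args)
        finally:
            # Disarm the watchdog so it cannot fire after we return.
            faulthandler.cancel_dump_traceback_later()
        log.seek(0)
        return log.read()


# The call outlives the timeout, so the report names the function that was running.
report = capture_stack_of_slow_call(busy_wait, 0.6)
print(report)
```

For a real run, calling `faulthandler.dump_traceback_later(60)` early in the process would write the stacks to stderr after 60 seconds, without waiting for the slow call to return.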

@jegork
Author

jegork commented Dec 15, 2022

Hey @mariosasko

I tried stopping it at 74/269 and at 81/269, and got the following traceback in both cases.

Traceback (most recent call last):
  File "/Users/jegorkitskerkin/opt/miniconda3/envs/transformers/bin/doc-builder", line 8, in <module>
    sys.exit(main())
  File "/Users/jegorkitskerkin/opt/miniconda3/envs/transformers/lib/python3.9/site-packages/doc_builder/commands/doc_builder_cli.py", line 47, in main
    args.func(args)
  File "/Users/jegorkitskerkin/opt/miniconda3/envs/transformers/lib/python3.9/site-packages/doc_builder/commands/preview.py", line 171, in preview_command
    source_files_mapping = build_doc(
  File "/Users/jegorkitskerkin/opt/miniconda3/envs/transformers/lib/python3.9/site-packages/doc_builder/build_doc.py", line 403, in build_doc
    anchors_mapping, source_files_mapping = build_mdx_files(package, doc_folder, output_dir, page_info)
  File "/Users/jegorkitskerkin/opt/miniconda3/envs/transformers/lib/python3.9/site-packages/doc_builder/build_doc.py", line 181, in build_mdx_files
    content = convert_md_to_mdx(content, page_info)
  File "/Users/jegorkitskerkin/opt/miniconda3/envs/transformers/lib/python3.9/site-packages/doc_builder/convert_md_to_mdx.py", line 59, in convert_md_to_mdx
    """ + process_md(
  File "/Users/jegorkitskerkin/opt/miniconda3/envs/transformers/lib/python3.9/site-packages/doc_builder/convert_md_to_mdx.py", line 170, in process_md
    text = convert_special_chars(text)
  File "/Users/jegorkitskerkin/opt/miniconda3/envs/transformers/lib/python3.9/site-packages/doc_builder/convert_md_to_mdx.py", line 81, in convert_special_chars
    text = _re_lt_html.sub(r"LTHTML\1\2\3LTHTML\5", text)
KeyboardInterrupt

@xenova
Contributor

xenova commented May 20, 2023

I have the same issue with Transformers.js: building/previewing takes a long time, and it is caused by this line:

    text = _re_lt_html.sub(r"LTHTML\1\2\3LTHTML\5", text)

I'll see if I can optimise it.
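
A regex substitution that hangs on some inputs while staying fast on others is the usual signature of catastrophic backtracking. The sketch below illustrates the general effect only. The actual `_re_lt_html` pattern is not shown in this thread, so `NESTED` and `FLAT` are hypothetical patterns chosen for the demonstration, and `time_search` and `compare` are illustrative helpers.

```python
import re
import time

# Hypothetical patterns, NOT the real _re_lt_html from doc-builder.
NESTED = re.compile(r"(a+)+$")  # nested quantifiers: many ways to split a run of "a"
FLAT = re.compile(r"a+$")       # finds a match in the same strings with one quantifier


def time_search(pattern, text):
    """Return (match_found, seconds) for a single pattern.search call."""
    start = time.perf_counter()
    found = pattern.search(text) is not None
    return found, time.perf_counter() - start


def compare(n):
    """Search n 'a' characters followed by '!', which neither pattern matches.

    The failed match forces the engine to try every way of splitting the
    run of 'a' characters between the inner and outer quantifier.
    """
    text = "a" * n + "!"
    nested_found, nested_secs = time_search(NESTED, text)
    flat_found, flat_secs = time_search(FLAT, text)
    return nested_found, flat_found, nested_secs, flat_secs


for n in (12, 14, 16, 18):
    _, _, nested_secs, flat_secs = compare(n)
    print(f"n={n:2d}  nested={nested_secs:.4f}s  flat={flat_secs:.6f}s")
```

On a backtracking engine such as Python's `re`, the time for the nested pattern tends to grow roughly exponentially with `n`, while the flat pattern stays fast. Removing the nested quantifier is one common remedy; whether it applies to doc-builder's pattern depends on what that pattern actually contains.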

sgugger pushed a commit that referenced this issue Aug 22, 2023
* Optimize html tag replace regex

* Simplify replacement regex

* Add additional unit tests

Mostly edge-cases

Note: HTML is quite forgiving and will still parse/render tags even when some tags are mismatched or missing.

The previous regex would barf on this.

* Fix quality checks

* Improve special character replacement regex

* Add more edge-case unit tests