forked from opendatalab/MinerU
fix: resolve inaccuracy of drawing layout box caused by paragraphs combination #384 #1
Merged
…e at the end. (#523) * fix replace \u0002, \u0003 in common text * fix(para): When an English line ends with a hyphen, do not add a space at the end.
* feat<table model>: add tablemaster with paddleocr to detect and recognize table (#493) * Update cla.yml * Update bug_report.yml * Update README_zh-CN.md (#404) correct FAQ url * Update README_zh-CN.md (#404) (#409) (#410) correct FAQ url Co-authored-by: sfk <18810651050@163.com> * Update FAQ_zh_cn.md add new issue * Update FAQ_en_us.md * Update README_Windows_CUDA_Acceleration_zh_CN.md * Update README_zh-CN.md * @Thepathakarpit has signed the CLA in #418 * Update cla.yml * feat: add tablemaster_paddle (#463) * Update README_zh-CN.md (#404) (#409) correct FAQ url Co-authored-by: sfk <18810651050@163.com> * add dockerfile (#189) Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> * Update cla.yml * Update cla.yml --------- Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> Co-authored-by: sfk <18810651050@163.com> Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn> Co-authored-by: Xiaomeng Zhao <moe@myhloli.com> * <fix>(para_split_v2): index out of range issue of span_text first char (#396) Co-authored-by: liukaiwen <liukaiwen@pjlab.org.cn> * @Matthijz98 has signed the CLA in #467 * Create download_models.py * Create requirements-docker.txt * feat<table model>: add tablemaster with paddleocr to detect and recognize table * @strongerfly has signed the CLA in #487 * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table --------- Co-authored-by: Xiaomeng Zhao <moe@myhloli.com> Co-authored-by: sfk <18810651050@163.com> Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn> Co-authored-by: liukaiwen 
<liukaiwen@pjlab.org.cn> * feat<table model>: add tablemaster with paddleocr to detect and recognize table (#508) * Update cla.yml * Update bug_report.yml * Update README_zh-CN.md (#404) correct FAQ url * Update README_zh-CN.md (#404) (#409) (#410) correct FAQ url Co-authored-by: sfk <18810651050@163.com> * Update FAQ_zh_cn.md add new issue * Update FAQ_en_us.md * Update README_Windows_CUDA_Acceleration_zh_CN.md * Update README_zh-CN.md * @Thepathakarpit has signed the CLA in #418 * Update cla.yml * feat: add tablemaster_paddle (#463) * Update README_zh-CN.md (#404) (#409) correct FAQ url Co-authored-by: sfk <18810651050@163.com> * add dockerfile (#189) Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> * Update cla.yml * Update cla.yml --------- Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> Co-authored-by: sfk <18810651050@163.com> Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn> Co-authored-by: Xiaomeng Zhao <moe@myhloli.com> * <fix>(para_split_v2): index out of range issue of span_text first char (#396) Co-authored-by: liukaiwen <liukaiwen@pjlab.org.cn> * @Matthijz98 has signed the CLA in #467 * Create download_models.py * Create requirements-docker.txt * feat<table model>: add tablemaster with paddleocr to detect and recognize table * @strongerfly has signed the CLA in #487 * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * Update cla.yml * Delete .github/workflows/gpu-ci.yml * Update Huggingface and ModelScope links to organization account * feat<table model>: add tablemaster with paddleocr to detect and recognize table --------- Co-authored-by: Xiaomeng Zhao <moe@myhloli.com> Co-authored-by: sfk <18810651050@163.com> Co-authored-by: 
drunkpig <60862764+drunkpig@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn> Co-authored-by: liukaiwen <liukaiwen@pjlab.org.cn> Co-authored-by: yyy <102640628+dt-yy@users.noreply.github.com> Co-authored-by: wangbinDL <wangbin_research@163.com> * feat<table model>: add tablemaster with paddleocr to detect and recognize table (#511) * Update cla.yml * Update bug_report.yml * Update README_zh-CN.md (#404) correct FAQ url * Update README_zh-CN.md (#404) (#409) (#410) correct FAQ url Co-authored-by: sfk <18810651050@163.com> * Update FAQ_zh_cn.md add new issue * Update FAQ_en_us.md * Update README_Windows_CUDA_Acceleration_zh_CN.md * Update README_zh-CN.md * @Thepathakarpit has signed the CLA in #418 * Update cla.yml * feat: add tablemaster_paddle (#463) * Update README_zh-CN.md (#404) (#409) correct FAQ url Co-authored-by: sfk <18810651050@163.com> * add dockerfile (#189) Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> * Update cla.yml * Update cla.yml --------- Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> Co-authored-by: sfk <18810651050@163.com> Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn> Co-authored-by: Xiaomeng Zhao <moe@myhloli.com> * <fix>(para_split_v2): index out of range issue of span_text first char (#396) Co-authored-by: liukaiwen <liukaiwen@pjlab.org.cn> * @Matthijz98 has signed the CLA in #467 * Create download_models.py * Create requirements-docker.txt * feat<table model>: add tablemaster with paddleocr to detect and recognize table * @strongerfly has signed the CLA in #487 * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to 
detect and recognize table * Update cla.yml * Delete .github/workflows/gpu-ci.yml * Update Huggingface and ModelScope links to organization account * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table --------- Co-authored-by: Xiaomeng Zhao <moe@myhloli.com> Co-authored-by: sfk <18810651050@163.com> Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn> Co-authored-by: liukaiwen <liukaiwen@pjlab.org.cn> Co-authored-by: yyy <102640628+dt-yy@users.noreply.github.com> Co-authored-by: wangbinDL <wangbin_research@163.com> --------- Co-authored-by: Kaiwen Liu <lkw_buaa@163.com> Co-authored-by: Xiaomeng Zhao <moe@myhloli.com> Co-authored-by: sfk <18810651050@163.com> Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn> Co-authored-by: liukaiwen <liukaiwen@pjlab.org.cn> Co-authored-by: wangbinDL <wangbin_research@163.com>
* release: release 0.7.1 version (#526) * Update README_zh-CN.md (#404) (#409) correct FAQ url Co-authored-by: sfk <18810651050@163.com> * add dockerfile (#189) Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> * Update cla.yml * Update cla.yml * feat<table model>: add tablemaster with paddleocr to detect and recognize table (#493) * Update cla.yml * Update bug_report.yml * Update README_zh-CN.md (#404) correct FAQ url * Update README_zh-CN.md (#404) (#409) (#410) correct FAQ url Co-authored-by: sfk <18810651050@163.com> * Update FAQ_zh_cn.md add new issue * Update FAQ_en_us.md * Update README_Windows_CUDA_Acceleration_zh_CN.md * Update README_zh-CN.md * @Thepathakarpit has signed the CLA in #418 * Update cla.yml * feat: add tablemaster_paddle (#463) * Update README_zh-CN.md (#404) (#409) correct FAQ url Co-authored-by: sfk <18810651050@163.com> * add dockerfile (#189) Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> * Update cla.yml * Update cla.yml --------- Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> Co-authored-by: sfk <18810651050@163.com> Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn> Co-authored-by: Xiaomeng Zhao <moe@myhloli.com> * <fix>(para_split_v2): index out of range issue of span_text first char (#396) Co-authored-by: liukaiwen <liukaiwen@pjlab.org.cn> * @Matthijz98 has signed the CLA in #467 * Create download_models.py * Create requirements-docker.txt * feat<table model>: add tablemaster with paddleocr to detect and recognize table * @strongerfly has signed the CLA in #487 * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table --------- Co-authored-by: Xiaomeng Zhao <moe@myhloli.com> Co-authored-by: sfk 
<18810651050@163.com> Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn> Co-authored-by: liukaiwen <liukaiwen@pjlab.org.cn> * feat<table model>: add tablemaster with paddleocr to detect and recognize table (#508) * Update cla.yml * Update bug_report.yml * Update README_zh-CN.md (#404) correct FAQ url * Update README_zh-CN.md (#404) (#409) (#410) correct FAQ url Co-authored-by: sfk <18810651050@163.com> * Update FAQ_zh_cn.md add new issue * Update FAQ_en_us.md * Update README_Windows_CUDA_Acceleration_zh_CN.md * Update README_zh-CN.md * @Thepathakarpit has signed the CLA in #418 * Update cla.yml * feat: add tablemaster_paddle (#463) * Update README_zh-CN.md (#404) (#409) correct FAQ url Co-authored-by: sfk <18810651050@163.com> * add dockerfile (#189) Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> * Update cla.yml * Update cla.yml --------- Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> Co-authored-by: sfk <18810651050@163.com> Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn> Co-authored-by: Xiaomeng Zhao <moe@myhloli.com> * <fix>(para_split_v2): index out of range issue of span_text first char (#396) Co-authored-by: liukaiwen <liukaiwen@pjlab.org.cn> * @Matthijz98 has signed the CLA in #467 * Create download_models.py * Create requirements-docker.txt * feat<table model>: add tablemaster with paddleocr to detect and recognize table * @strongerfly has signed the CLA in #487 * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * Update cla.yml * Delete .github/workflows/gpu-ci.yml 
* Update Huggingface and ModelScope links to organization account * feat<table model>: add tablemaster with paddleocr to detect and recognize table --------- Co-authored-by: Xiaomeng Zhao <moe@myhloli.com> Co-authored-by: sfk <18810651050@163.com> Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn> Co-authored-by: liukaiwen <liukaiwen@pjlab.org.cn> Co-authored-by: yyy <102640628+dt-yy@users.noreply.github.com> Co-authored-by: wangbinDL <wangbin_research@163.com> * feat<table model>: add tablemaster with paddleocr to detect and recognize table (#511) * Update cla.yml * Update bug_report.yml * Update README_zh-CN.md (#404) correct FAQ url * Update README_zh-CN.md (#404) (#409) (#410) correct FAQ url Co-authored-by: sfk <18810651050@163.com> * Update FAQ_zh_cn.md add new issue * Update FAQ_en_us.md * Update README_Windows_CUDA_Acceleration_zh_CN.md * Update README_zh-CN.md * @Thepathakarpit has signed the CLA in #418 * Update cla.yml * feat: add tablemaster_paddle (#463) * Update README_zh-CN.md (#404) (#409) correct FAQ url Co-authored-by: sfk <18810651050@163.com> * add dockerfile (#189) Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> * Update cla.yml * Update cla.yml --------- Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> Co-authored-by: sfk <18810651050@163.com> Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn> Co-authored-by: Xiaomeng Zhao <moe@myhloli.com> * <fix>(para_split_v2): index out of range issue of span_text first char (#396) Co-authored-by: liukaiwen <liukaiwen@pjlab.org.cn> * @Matthijz98 has signed the CLA in #467 * Create download_models.py * Create requirements-docker.txt * feat<table model>: add tablemaster with paddleocr to detect and recognize table * @strongerfly has signed the CLA in #487 * feat<table model>: add tablemaster 
with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * Update cla.yml * Delete .github/workflows/gpu-ci.yml * Update Huggingface and ModelScope links to organization account * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table * feat<table model>: add tablemaster with paddleocr to detect and recognize table --------- Co-authored-by: Xiaomeng Zhao <moe@myhloli.com> Co-authored-by: sfk <18810651050@163.com> Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn> Co-authored-by: liukaiwen <liukaiwen@pjlab.org.cn> Co-authored-by: yyy <102640628+dt-yy@users.noreply.github.com> Co-authored-by: wangbinDL <wangbin_research@163.com> --------- Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> Co-authored-by: sfk <18810651050@163.com> Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn> Co-authored-by: Xiaomeng Zhao <moe@myhloli.com> Co-authored-by: Kaiwen Liu <lkw_buaa@163.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: liukaiwen <liukaiwen@pjlab.org.cn> Co-authored-by: wangbinDL <wangbin_research@163.com> * Update README.md * Update README_zh-CN.md * Update README_zh-CN.md --------- Co-authored-by: yyy <102640628+dt-yy@users.noreply.github.com> Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com> Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn> Co-authored-by: Xiaomeng Zhao <moe@myhloli.com> Co-authored-by: Kaiwen Liu <lkw_buaa@163.com> Co-authored-by: 
github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: liukaiwen <liukaiwen@pjlab.org.cn> Co-authored-by: wangbinDL <wangbin_research@163.com>
delete Known issue about table recognition
No description provided.