Fix some LaTeX OCR issues in PaddleX #13646

Merged: 4 commits, Aug 14, 2024

Conversation

liuhongen1234567
Contributor

@liuhongen1234567 liuhongen1234567 commented Aug 13, 2024

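  1. Modified parts of the LaTeX OCR backbone and head code so that inference runs correctly with config.enable_new_ir(True) enabled.
  2. Modified export_model.py so that the LaTeX OCR vocabulary JSON file is also written into the yml file.
  3. Moved the special parameter settings for evaluation, inference, and export into the py files, so that users no longer have to set unnecessary parameters by hand.
  4. Modified /workspace/code/paddle_ocr/github_pr/2024_8_13/latexocr_paddle/ppocr/utils/formula_utils/math_txt2pkl.py to tolerate the arbitrary image sizes produced when users crop images themselves. Without this, a randomly cropped dataset has such irregular image sizes that each size group contains too few images, which lengthens training time.
  5. Moved the packages specific to LaTeX OCR into a separate requirement file, to stay compatible with the additional packages that other formula models will need later.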
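
To illustrate the idea behind item 4, here is a minimal sketch of size-tolerant grouping. It is not the code in math_txt2pkl.py: the function names `snap_size` and `group_by_size`, the 16-pixel step, and the maximum dimensions are all assumptions chosen for illustration. The point is only that rounding arbitrary crop sizes to a coarse grid lets nearby sizes share a group, so each group holds more images.

```python
import math
from collections import defaultdict


def snap_size(width, height, step=16, max_w=672, max_h=192):
    """Round a size up to the nearest multiple of `step`, clamped to the maximums.

    `step`, `max_w`, and `max_h` are hypothetical values, not taken from the PR.
    """
    w = min(max_w, max(step, math.ceil(width / step) * step))
    h = min(max_h, max(step, math.ceil(height / step) * step))
    return w, h


def group_by_size(sizes, step=16):
    """Map each snapped (width, height) to the indices of the images that fall into it."""
    groups = defaultdict(list)
    for idx, (w, h) in enumerate(sizes):
        groups[snap_size(w, h, step)].append(idx)
    return dict(groups)


# Four crops with four distinct raw sizes collapse into two groups.
sizes = [(100, 30), (101, 31), (110, 32), (200, 50)]
print(group_by_size(sizes))
```

Grouping by exact size would put each of the four sample images in its own group; with snapping, the first three share one group.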

@liuhongen1234567
Contributor Author

@GreatV Could you please review the code?

Collaborator

@GreatV GreatV left a comment

LGTM

@GreatV GreatV merged commit 5f0b90a into PaddlePaddle:main Aug 14, 2024
3 of 4 checks passed
2 participants