Error: ValueError: Unable to avoid copy while creating an array as requested. #414
Comments
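The project is not compatible with numpy 2.x; you need to install a 1.x version. A normal installation of the project resolves dependency versions automatically, so please follow the README.
I did install it by following the README steps. After fixing a problem with the fairscale module, I ran into a missing fvcore.transforms module, and after installing fvcore through conda, the ValueError above appeared. Is there a way to fix this?
I just checked: the numpy version in the MinerU environment is 1.26.4, not 2.x, and the error above still occurs.
A normal installation should not be missing this many dependencies. We also strongly advise against installing any dependency with conda; all of the project's dependencies should be installed with pip.
The cause of the error above is clearly numpy 2.x; 1.26.4 does not trigger it.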
(base) PS C:\Users\dengg> conda activate MinerU
packages in environment at D:\Anaconda3\envs\MinerU:
Name Version Build Channel
numpy 1.26.4 pypi_0 pypi
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\libs\language.py", line 20, in detect_lang
File "D:\Anaconda3\envs\MinerU\lib\site-packages\fast_langdetect\ft_detect\__init__.py", line 23, in detect_language
File "D:\Anaconda3\envs\MinerU\lib\site-packages\fast_langdetect\ft_detect\infer.py", line 81, in detect
File "D:\Anaconda3\envs\MinerU\lib\site-packages\fasttext\FastText.py", line 221, in predict
File "D:\Anaconda3\envs\MinerU\lib\site-packages\fasttext\FastText.py", line 208, in check
ValueError: predict processes one line at a time (remove '\n')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Anaconda3\envs\MinerU\lib\runpy.py", line 196, in _run_module_as_main
File "D:\Anaconda3\envs\MinerU\lib\runpy.py", line 86, in _run_code
File "D:\Anaconda3\envs\MinerU\Scripts\magic-pdf.exe\__main__.py", line 7, in <module>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\click\core.py", line 1157, in __call__
File "D:\Anaconda3\envs\MinerU\lib\site-packages\click\core.py", line 1078, in main
File "D:\Anaconda3\envs\MinerU\lib\site-packages\click\core.py", line 1434, in invoke
File "D:\Anaconda3\envs\MinerU\lib\site-packages\click\core.py", line 783, in invoke
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\tools\cli.py", line 73, in cli
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\tools\common.py", line 61, in do_parse
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\pipe\UNIPipe.py", line 25, in pipe_classify
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\pipe\AbsPipe.py", line 63, in classify
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\filter\pdf_meta_scan.py", line 337, in pdf_meta_scan
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\filter\pdf_meta_scan.py", line 289, in get_language
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\libs\language.py", line 23, in detect_lang
File "D:\Anaconda3\envs\MinerU\lib\site-packages\fast_langdetect\ft_detect\__init__.py", line 23, in detect_language
File "D:\Anaconda3\envs\MinerU\lib\site-packages\fast_langdetect\ft_detect\infer.py", line 81, in detect
File "D:\Anaconda3\envs\MinerU\lib\site-packages\fasttext\FastText.py", line 228, in predict
ValueError: Unable to avoid copy while creating an array as requested.
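Above is the result of the test I just ran, for you to take a look at. The numpy version really is 1.26.4.
The error message makes it quite clear: numpy 1.x cannot produce this message.
But the numpy version shown in my environment is 1.26.4. What could be the cause?
To check the numpy version, shouldn't you use pip list rather than conda list?
pip list also shows numpy 1.26.4 here.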
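One way to narrow down a mismatch like this is to ask the interpreter itself which numpy it would import, since `pip list` and `conda list` can report on a different environment than the one that actually runs the code. The following is a minimal sketch using only the standard library; `describe_package` is a hypothetical helper name, and it assumes the distribution name passed to `importlib.metadata.version` matches the import name (true for numpy, not guaranteed for every package).

```python
import importlib.metadata
import importlib.util
import sys


def describe_package(name: str) -> dict:
    """Report the installed version and import location of a package."""
    try:
        # Version recorded in the package metadata visible to THIS interpreter.
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        version = None
    # Where `import name` would load the module from, without importing it.
    spec = importlib.util.find_spec(name)
    location = spec.origin if spec is not None else None
    return {"name": name, "version": version, "location": location}


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print(describe_package("numpy"))
```

Running this with the same interpreter that launches the tool (for example via `python script.py` inside the activated environment) shows whether the reported version and location agree with what `pip list` printed.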
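How about creating a fresh conda environment and going through the steps from scratch?
OK, I'll try again and come back if I have more questions.
Find the implementation of the predict method in FastText.py and locate this code: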
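The code in question is presumably the line the traceback reports at FastText.py line 228, `return labels, np.array(probs, copy=False)`. As the NumPy error message itself suggests, replacing `np.array(obj, copy=False)` with `np.asarray(obj)` allows a copy when one is needed. Below is a minimal sketch of the difference; `probs_to_array` is a hypothetical helper name used for illustration and is not part of fasttext.

```python
import numpy as np


def probs_to_array(probs):
    """Convert fastText probabilities to an array, copying only if needed."""
    # np.asarray behaves the same way on NumPy 1.x and 2.x.
    return np.asarray(probs)


probs = (0.9080705046653748,)  # value taken from the traceback

try:
    # On NumPy 2.x this raises ValueError because a tuple cannot become
    # an array without a copy; on NumPy 1.x it silently makes the copy.
    legacy = np.array(probs, copy=False)
    print("np.array(copy=False) succeeded on NumPy", np.__version__)
except ValueError:
    print("np.array(copy=False) failed on NumPy", np.__version__)

scores = probs_to_array(probs)
print(scores)
```

Editing a file inside site-packages is a local workaround that is lost on reinstall, so pinning numpy to a 1.x release as the maintainers recommend remains the supported route.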
Description of the bug
I am using version 0.7.0b1. Running the test produces the error ValueError: Unable to avoid copy while creating an array as requested. The full output is as follows:
2024-08-13 16:23:06.329 | ERROR | magic_pdf.tools.cli:parse_doc:69 - Unable to avoid copy while creating an array as requested.
If using `np.array(obj, copy=False)` replace it with `np.asarray(obj)` to allow a copy when needed (no behavior change in NumPy 1.x).
For more details, see https://numpy.org/devdocs/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword.
Traceback (most recent call last):
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\libs\language.py", line 20, in detect_lang
lang_upper = detect_language(text)
│ └ 'Journal of Luminescence 270 (2024) 120542\nAvailable online 8 March 2024\n0022-2313/© 2024 Elsevier B.V. All rights reserved...
└ <function detect_language at 0x0000029B9965FEB0>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\fast_langdetect\ft_detect\__init__.py", line 23, in detect_language
lang_code = detect(sentence, low_memory=low_memory).get("lang").upper()
│ │ └ True
│ └ 'Journal of Luminescence 270 (2024) 120542\nAvailable online 8 March 2024\n0022-2313/© 2024 Elsevier B.V. All rights reserved...
└ <function detect at 0x0000029B99974B80>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\fast_langdetect\ft_detect\infer.py", line 81, in detect
labels, scores = model.predict(text)
│ │ └ 'Journal of Luminescence 270 (2024) 120542\nAvailable online 8 March 2024\n0022-2313/© 2024 Elsevier B.V. All rights reserved...
│ └ <function _FastText.predict at 0x0000029B9967CEE0>
└ <fasttext.FastText._FastText object at 0x0000029BBE2B9D50>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\fasttext\FastText.py", line 221, in predict
text = check(text)
│ └ 'Journal of Luminescence 270 (2024) 120542\nAvailable online 8 March 2024\n0022-2313/© 2024 Elsevier B.V. All rights reserved...
└ <function _FastText.predict.<locals>.check at 0x0000029BBE2CFC70>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\fasttext\FastText.py", line 208, in check
raise ValueError(
ValueError: predict processes one line at a time (remove '\n')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Anaconda3\envs\MinerU\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
│ │ └ {'__name__': '__main__', '__doc__': None, '__package__': '', '__loader__': <zipimporter object "D:\Anaconda3\envs\MinerU\Scri...
│ └ <code object <module> at 0x0000029B954C79F0, file "D:\Anaconda3\envs\MinerU\Scripts\magic-pdf.exe\__main__.py", line 1>
└ <function _run_code at 0x0000029B954B0CA0>
File "D:\Anaconda3\envs\MinerU\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
│ └ {'__name__': '__main__', '__doc__': None, '__package__': '', '__loader__': <zipimporter object "D:\Anaconda3\envs\MinerU\Scri...
└ <code object <module> at 0x0000029B954C79F0, file "D:\Anaconda3\envs\MinerU\Scripts\magic-pdf.exe\__main__.py", line 1>
File "D:\Anaconda3\envs\MinerU\Scripts\magic-pdf.exe\__main__.py", line 7, in <module>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\click\core.py", line 1157, in __call__
return self.main(*args, **kwargs)
│ │ │ └ {}
│ │ └ ()
│ └ <function BaseCommand.main at 0x0000029B9708FAC0>
└
File "D:\Anaconda3\envs\MinerU\lib\site-packages\click\core.py", line 1078, in main
rv = self.invoke(ctx)
│ │ └ <click.core.Context object at 0x0000029B955187F0>
│ └ <function Command.invoke at 0x0000029B970A45E0>
└
File "D:\Anaconda3\envs\MinerU\lib\site-packages\click\core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
│ │ │ │ │ └ {'path': 'C:\Users\dengg\Desktop\test', 'output_dir': 'C:\Users\dengg\Desktop\test_out', 'method': 'auto'}
│ │ │ │ └ <click.core.Context object at 0x0000029B955187F0>
│ │ │ └ <function cli at 0x0000029BBE2CF490>
│ │ └
│ └ <function Context.invoke at 0x0000029B9708F2E0>
└ <click.core.Context object at 0x0000029B955187F0>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\click\core.py", line 783, in invoke
return __callback(*args, **kwargs)
│ └ {'path': 'C:\Users\dengg\Desktop\test', 'output_dir': 'C:\Users\dengg\Desktop\test_out', 'method': 'auto'}
└ ()
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\tools\cli.py", line 73, in cli
parse_doc(doc_path)
│ └ WindowsPath('C:/Users/dengg/Desktop/test/Optical characteristics and energy transfer analysis of Dy3+-Pr3+ ions doped in CeF3...
└ <function cli.<locals>.parse_doc at 0x0000029B9550F250>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\tools\common.py", line 61, in do_parse
pipe.pipe_classify()
│ └ <function UNIPipe.pipe_classify at 0x0000029BBE2CE5F0>
└ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x0000029BBE2B9510>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\pipe\UNIPipe.py", line 25, in pipe_classify
self.pdf_type = AbsPipe.classify(self.pdf_bytes)
│ │ │ │ │ └ b'%PDF-1.7\r%\x80\x84\x88\x8c\x90\x94\x98\x9c\xa0\xa4\xa8\xac\xb0\xb4\xb8\xbc\xc0\xc4\xc8\xcc\xd0\xd4\xd8\xdc\xe0\xe4\xe8\xec...
│ │ │ │ └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x0000029BBE2B9510>
│ │ │ └ <staticmethod(<function AbsPipe.classify at 0x0000029B9BD66170>)>
│ │ └ <class 'magic_pdf.pipe.AbsPipe.AbsPipe'>
│ └ ''
└ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x0000029BBE2B9510>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\pipe\AbsPipe.py", line 63, in classify
pdf_meta = pdf_meta_scan(pdf_bytes)
│ └ b'%PDF-1.7\r%\x80\x84\x88\x8c\x90\x94\x98\x9c\xa0\xa4\xa8\xac\xb0\xb4\xb8\xbc\xc0\xc4\xc8\xcc\xd0\xd4\xd8\xdc\xe0\xe4\xe8\xec...
└ <function pdf_meta_scan at 0x0000029B9BD65630>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\filter\pdf_meta_scan.py", line 337, in pdf_meta_scan
text_language = get_language(doc)
│ └ Document('', <memory, doc# 1>)
└ <function get_language at 0x0000029B9BD65510>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\filter\pdf_meta_scan.py", line 289, in get_language
page_language = detect_lang(text_block)
│ └ 'Journal of Luminescence 270 (2024) 120542\nAvailable online 8 March 2024\n0022-2313/© 2024 Elsevier B.V. All rights reserved...
└ <function detect_lang at 0x0000029B9965F910>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\magic_pdf\libs\language.py", line 23, in detect_lang
lang_upper = detect_language(html_no_ctrl_chars)
│ └ 'Journal of Luminescence 270 (2024) 120542Available online 8 March 20240022-2313/© 2024 Elsevier B.V. All rights reserved.Ful...
└ <function detect_language at 0x0000029B9965FEB0>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\fast_langdetect\ft_detect\__init__.py", line 23, in detect_language
lang_code = detect(sentence, low_memory=low_memory).get("lang").upper()
│ │ └ True
│ └ 'Journal of Luminescence 270 (2024) 120542Available online 8 March 20240022-2313/© 2024 Elsevier B.V. All rights reserved.Ful...
└ <function detect at 0x0000029B99974B80>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\fast_langdetect\ft_detect\infer.py", line 81, in detect
labels, scores = model.predict(text)
│ │ └ 'Journal of Luminescence 270 (2024) 120542Available online 8 March 20240022-2313/© 2024 Elsevier B.V. All rights reserved.Ful...
│ └ <function _FastText.predict at 0x0000029B9967CEE0>
└ <fasttext.FastText._FastText object at 0x0000029BBE2B9D50>
File "D:\Anaconda3\envs\MinerU\lib\site-packages\fasttext\FastText.py", line 228, in predict
return labels, np.array(probs, copy=False)
│ │ │ └ (0.9080705046653748,)
│ │ └
│ └ <module 'numpy' from 'D:\Anaconda3\envs\MinerU\lib\site-packages\numpy\__init__.py'>
└ ('__label__en',)
ValueError: Unable to avoid copy while creating an array as requested.
If using `np.array(obj, copy=False)` replace it with `np.asarray(obj)` to allow a copy when needed (no behavior change in NumPy 1.x).
For more details, see https://numpy.org/devdocs/numpy_2_0_migration_guide.html#adapting-to-changes-in-the-copy-keyword.
I would appreciate any help in resolving this.
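A note on reading the traceback: the first ValueError does not appear to be the real failure. fastText's predict rejects input containing '\n', and the traceback shows detect_lang retrying at line 23 with a cleaned string (html_no_ctrl_chars) that no longer contains newlines; only that retry reaches the NumPy error. The snippet below is a minimal sketch of this kind of cleanup, assuming a regex-based approach. `strip_control_chars` is a hypothetical name, not the magic_pdf implementation.

```python
import re


def strip_control_chars(text: str) -> str:
    """Remove ASCII control characters, including newlines and tabs."""
    return re.sub(r"[\x00-\x1f\x7f]", "", text)


# Shortened sample based on the text shown in the traceback.
sample = "Journal of Luminescence 270 (2024) 120542\nAvailable online 8 March 2024"
cleaned = strip_control_chars(sample)
print(cleaned)
```

The cleaned string joins the lines without a separator, which matches the retried input visible in the traceback ("...120542Available online...").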
How to reproduce the bug
As shown in the error description.
Operating system
Windows
Python version
3.10
Software version (magic-pdf --version)
0.6.x
Device mode
cpu