
DataJoint import error due to missing pyarrow (a pandas dependency) #1202

@ttngu207

Contributor

Bug Report

Description

A fresh datajoint installation on Python 3.10 completes successfully.
However, on import (import datajoint as dj), the following error is raised:

Traceback (most recent call last):
  File "C:\Users\thinh\.conda\envs\microns_phase3\lib\site-packages\IPython\core\interactiveshell.py", line 3579, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-0b6eed5a3415>", line 1, in <module>
    import datajoint
  File "C:\Program Files\JetBrains\PyCharm 2023.3.4\plugins\python\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "C:\Users\thinh\.conda\envs\microns_phase3\lib\site-packages\datajoint\__init__.py", line 62, in <module>
    from .schemas import Schema
  File "C:\Program Files\JetBrains\PyCharm 2023.3.4\plugins\python\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "C:\Users\thinh\.conda\envs\microns_phase3\lib\site-packages\datajoint\schemas.py", line 10, in <module>
    from .jobs import JobTable
  File "C:\Program Files\JetBrains\PyCharm 2023.3.4\plugins\python\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "C:\Users\thinh\.conda\envs\microns_phase3\lib\site-packages\datajoint\jobs.py", line 4, in <module>
    from .table import Table
  File "C:\Program Files\JetBrains\PyCharm 2023.3.4\plugins\python\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "C:\Users\thinh\.conda\envs\microns_phase3\lib\site-packages\datajoint\table.py", line 6, in <module>
    import pandas
  File "C:\Program Files\JetBrains\PyCharm 2023.3.4\plugins\python\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "C:\Users\thinh\.conda\envs\microns_phase3\lib\site-packages\pandas\__init__.py", line 39, in <module>
    from pandas.compat import (
  File "C:\Program Files\JetBrains\PyCharm 2023.3.4\plugins\python\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "C:\Users\thinh\.conda\envs\microns_phase3\lib\site-packages\pandas\compat\__init__.py", line 27, in <module>
    from pandas.compat.pyarrow import (
  File "C:\Program Files\JetBrains\PyCharm 2023.3.4\plugins\python\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
    module = self._system_import(name, *args, **kwargs)
  File "C:\Users\thinh\.conda\envs\microns_phase3\lib\site-packages\pandas\compat\pyarrow.py", line 10, in <module>
    _palv = Version(Version(pa.__version__).base_version)
AttributeError: module 'pyarrow' has no attribute '__version__'

Upon further investigation, it looks like pandas>2.2 requires pyarrow as a dependency. However, pyarrow is not explicitly specified as a requirement for pandas (for good reasons, with lots of things to consider; see this discussion), so it is not installed when pandas is installed.
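One detail in the traceback may help narrow this down: the failure is an AttributeError on pa.__version__, so the import of pyarrow itself succeeded and only the version attribute was missing. A minimal diagnostic sketch, using only the standard library, to see what Python actually picks up under that name. The helper name inspect_package is hypothetical and not part of datajoint, pandas, or pyarrow.

```python
import importlib
from importlib import metadata


def inspect_package(name):
    """Report whether `name` imports, where it was loaded from,
    and which version it reports (hypothetical helper)."""
    report = {
        "importable": False,
        "file": None,          # module.__file__, if any
        "version_attr": None,  # module.__version__, if any
        "dist_version": None,  # version recorded in installed metadata
    }
    try:
        module = importlib.import_module(name)
    except ImportError:
        return report
    report["importable"] = True
    report["file"] = getattr(module, "__file__", None)
    report["version_attr"] = getattr(module, "__version__", None)
    try:
        report["dist_version"] = metadata.version(name)
    except metadata.PackageNotFoundError:
        pass  # importable, but no installed distribution of that name
    return report


if __name__ == "__main__":
    print(inspect_package("pyarrow"))
```

If this reports importable as True with both file and version_attr as None, one possible explanation (an assumption, not confirmed in this issue) is a leftover or partially installed pyarrow directory in site-packages being imported as a namespace package.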

For datajoint, we can either

  1. pin pandas<2
  2. install pandas[pyarrow]
  3. set pyarrow as one of the dependencies in pyproject.toml

Reproducibility

Include:

  • OS: Windows
  • Python Version: 3.10
  • DataJoint Version: 0.14.3
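For anyone trying to reproduce this, the same details can be collected programmatically. The sketch below is only an illustration using the standard library; the function name environment_report and the default package list are assumptions made for this example.

```python
import platform
from importlib import metadata


def environment_report(packages=("datajoint", "pandas", "pyarrow")):
    """Collect OS, Python version, and installed package versions
    (hypothetical helper for filling in a bug report)."""
    report = {
        "os": platform.system(),
        "python": platform.python_version(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

This reads installed distribution metadata rather than importing the packages, so it still works when import pandas itself fails.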

Activity

added the bug label on Feb 7, 2025
added and removed the stale label on Mar 14, 2025
@dimitri-yatsenko (Member) commented on Aug 18, 2025

Why do we need pyarrow if pandas does not include it?

Participants: @dimitri-yatsenko, @drewyangdev, @ttngu207
