Pyaerocom tests fail with SystemError: (11, 'Resource temporarily unavailable') in tests #46

Closed
thorbjoernl opened this issue Jul 11, 2024 · 3 comments
@thorbjoernl
Collaborator

Running the full pyaerocom test suite on the PR that implements writing with aerovaldb fails with SystemError: (11, 'Resource temporarily unavailable'). Running only a single test on its own succeeds.

Example output from GitHub Actions: https://github.com/metno/pyaerocom/actions/runs/9888455650/job/27318172756

Excerpt from output:

pyaerocom/aeroval/experiment_processor.py:159: in run
    self.update_interface()
pyaerocom/aeroval/experiment_processor.py:173: in update_interface
    self.exp_output.update_interface()
pyaerocom/aeroval/experiment_output.py:185: in update_interface
    self._add_entry_experiments_json(self.exp_id, exp_data)
pyaerocom/aeroval/experiment_output.py:794: in _add_entry_experiments_json
    current = self.avdb.get_experiments(self.proj_id, default={})
.tox/py/lib/python3.10/site-packages/aerovaldb/utils.py:37: in async_and_sync_wrap
    return asyncio.run(function(*args, **kwargs))
/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/asyncio/runners.py:44: in run
    return loop.run_until_complete(main)
/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/asyncio/base_events.py:649: in run_until_complete
    return future.result()
.tox/py/lib/python3.10/site-packages/aerovaldb/jsondb/jsonfiledb.py:358: in get_experiments
    experiments = await self._get(
.tox/py/lib/python3.10/site-packages/aerovaldb/jsondb/jsonfiledb.py:307: in _get
    data = await self.get_by_uuid(
.tox/py/lib/python3.10/site-packages/aerovaldb/jsondb/jsonfiledb.py:590: in get_by_uuid
    raw = await self._cache.get_json(uuid, no_cache=not cache)
.tox/py/lib/python3.10/site-packages/aerovaldb/jsondb/cache.py:155: in get_json
    return await self._read_json(abspath)
.tox/py/lib/python3.10/site-packages/aerovaldb/jsondb/cache.py:126: in _read_json
    async with aiofile.async_open(abspath, "r") as f:
.tox/py/lib/python3.10/site-packages/aiofile/utils.py:340: in async_open
    afp = AIOFile(str(file_specifier), mode, *args, **kwargs)
.tox/py/lib/python3.10/site-packages/aiofile/aio.py:130: in __init__
    self.__context = context or get_default_context()
.tox/py/lib/python3.10/site-packages/aiofile/aio.py:329: in get_default_context
    return create_context()
.tox/py/lib/python3.10/site-packages/aiofile/aio.py:311: in create_context
    context = caio.AsyncioContext(max_requests, loop=loop)
.tox/py/lib/python3.10/site-packages/caio/asyncio_base.py:22: in __init__
    self.context = self._create_context(max_requests, **kwargs)
.tox/py/lib/python3.10/site-packages/caio/linux_aio_asyncio.py:10: in _create_context
    context = super()._create_context(max_requests)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <caio.linux_aio_asyncio.AsyncioContext object at 0x7f9bec55ec80>
max_requests = 512, kwargs = {}
    def _create_context(self, max_requests, **kwargs):
>       return self.CONTEXT_CLASS(max_requests=max_requests, **kwargs)
E       SystemError: (11, 'Resource temporarily unavailable')
.tox/py/lib/python3.10/site-packages/caio/asyncio_base.py:25: SystemError
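
A possible diagnostic, not something confirmed in this thread: the traceback above goes through caio.linux_aio_asyncio with max_requests = 512, so the limit being hit may be the kernel's system-wide AIO event limit (io_setup(2) documents EAGAIN, errno 11, for a request that exceeds /proc/sys/fs/aio-max-nr). The sketch below reads the two counters on Linux. The helper names read_aio_limits and contexts_left are made up for illustration, and the "contexts left" figure is only a rough estimate, since the kernel may reserve more events than requested.

```python
from pathlib import Path
from typing import Optional, Tuple


def read_aio_limits(proc_dir: str = "/proc/sys/fs") -> Optional[Tuple[int, int]]:
    """Return (aio_nr, aio_max_nr), or None if the counters are unavailable.

    aio-nr is the number of events currently reserved by all AIO contexts
    system-wide; aio-max-nr is the upper limit.
    """
    try:
        aio_nr = int(Path(proc_dir, "aio-nr").read_text().strip())
        aio_max_nr = int(Path(proc_dir, "aio-max-nr").read_text().strip())
    except (OSError, ValueError):
        return None  # not Linux, or the files could not be read/parsed
    return aio_nr, aio_max_nr


def contexts_left(limits: Tuple[int, int], max_requests: int = 512) -> int:
    """Rough estimate of how many more contexts of `max_requests` would fit."""
    aio_nr, aio_max_nr = limits
    return max(0, (aio_max_nr - aio_nr) // max_requests)


if __name__ == "__main__":
    limits = read_aio_limits()
    if limits is None:
        print("AIO counters not available on this system")
    else:
        print(f"aio-nr={limits[0]} aio-max-nr={limits[1]}")
        print(f"room for roughly {contexts_left(limits)} more contexts")
```

If aio-nr is close to aio-max-nr while the test suite runs, that would point to AIO contexts accumulating rather than to a thread limit.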
@thorbjoernl
Collaborator Author

thorbjoernl commented Jul 11, 2024

It looks like aiofile uses a ThreadPoolExecutor internally to read the files and is therefore running into a thread limit. Reads using aiofile are also much slower than synchronous reads, as this quick benchmark shows (10 reads each of files of different sizes):

Testing file test-file-1kb
 Testing synchronous read
 Time elapsed: 0.00s
 Testing asynchronous read
 Time elapsed: 0.03s
Testing file test-file-100kb
 Testing synchronous read
 Time elapsed: 0.00s
 Testing asynchronous read
 Time elapsed: 0.05s
Testing file test-file-10000kb
 Testing synchronous read
 Time elapsed: 1.52s
 Testing asynchronous read
 Time elapsed: 20.76s
Testing file test-file-250000kb
 Testing synchronous read
 Time elapsed: 3.72s
 Testing asynchronous read
 Time elapsed: 49.61s
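
For anyone who wants to reproduce a comparison like the one above, here is a minimal sketch. It is not the script that produced those numbers; the helper names (time_reads, make_test_file) and the file contents are assumptions, and the async part only runs if the third-party aiofile package is installed.

```python
import asyncio
import os
import tempfile
import time

try:
    import aiofile  # third-party; the async timing is skipped if missing
except ImportError:
    aiofile = None


def read_sync(path: str) -> str:
    with open(path, "r") as f:
        return f.read()


async def read_async(path: str) -> str:
    async with aiofile.async_open(path, "r") as f:
        return await f.read()


def time_reads(path: str, repeats: int = 10, include_async: bool = True) -> dict:
    """Time `repeats` sequential reads of `path` with each method."""
    results = {}
    start = time.perf_counter()
    for _ in range(repeats):
        read_sync(path)
    results["sync"] = time.perf_counter() - start

    if include_async and aiofile is not None:
        async def run_all() -> None:
            for _ in range(repeats):
                await read_async(path)

        start = time.perf_counter()
        asyncio.run(run_all())
        results["async"] = time.perf_counter() - start
    return results


def make_test_file(size_kb: int) -> str:
    """Create a temporary text file of about `size_kb` kilobytes."""
    fd, path = tempfile.mkstemp(prefix=f"test-file-{size_kb}kb-")
    with os.fdopen(fd, "w") as f:
        f.write("x" * 1024 * size_kb)
    return path


if __name__ == "__main__":
    for size_kb in (1, 100, 10000):
        path = make_test_file(size_kb)
        try:
            print(f"Testing file test-file-{size_kb}kb")
            for name, elapsed in time_reads(path).items():
                print(f" {name} read, time elapsed: {elapsed:.2f}s")
        finally:
            os.remove(path)
```

Timings will vary with disk, page cache state, and which caio backend aiofile picks, so treat the results as indicative only.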

I think it may be worthwhile to change aerovaldb to do synchronous IO, which should also solve this issue. Since we support caching, I don't think async is critical to the aeroval API anymore. What do you think, @heikoklein?
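
To make the idea concrete, a synchronous read path with a cache could look roughly like the sketch below. This is an assumption for illustration, not aerovaldb's actual cache: the class name SyncJSONCache is hypothetical, and only the method names get_json and _read_json and the no_cache flag are borrowed from the traceback above.

```python
import json
import os
from typing import Any, Dict, Tuple


class SyncJSONCache:
    """Hypothetical synchronous JSON file cache, invalidated by file mtime."""

    def __init__(self) -> None:
        # abspath -> (mtime_ns when loaded, parsed JSON)
        self._entries: Dict[str, Tuple[int, Any]] = {}
        self.hits = 0
        self.misses = 0

    def _read_json(self, abspath: str) -> Any:
        # Plain blocking read: no event loop, no AIO context.
        with open(abspath, "r", encoding="utf-8") as f:
            return json.load(f)

    def get_json(self, path: str, *, no_cache: bool = False) -> Any:
        abspath = os.path.abspath(path)
        mtime = os.stat(abspath).st_mtime_ns
        entry = self._entries.get(abspath)
        if not no_cache and entry is not None and entry[0] == mtime:
            self.hits += 1
            return entry[1]  # note: callers share the cached object
        self.misses += 1
        data = self._read_json(abspath)
        self._entries[abspath] = (mtime, data)
        return data
```

With this shape, repeated requests for an unchanged file cost one stat call, which is the reason async reads may matter less once caching is in place.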

@heikoklein
Member

We could add semaphores to limit resource usage.
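
A minimal sketch of what the semaphore option could look like, assuming reads are issued concurrently inside one event loop. The function names and the max_concurrent default are illustrative, and asyncio.to_thread stands in for whatever async read mechanism is used.

```python
import asyncio
import json
from typing import Any, List


async def read_json_limited(path: str, sem: asyncio.Semaphore) -> Any:
    """Read one JSON file while holding a semaphore slot."""

    def _load() -> Any:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)

    async with sem:
        # asyncio.to_thread (Python 3.9+) runs the blocking read in a worker thread.
        return await asyncio.to_thread(_load)


async def read_many(paths: List[str], max_concurrent: int = 8) -> List[Any]:
    """Read all files, with at most `max_concurrent` reads in flight."""
    # Create the semaphore inside the running loop so it is bound to it.
    sem = asyncio.Semaphore(max_concurrent)
    return await asyncio.gather(*(read_json_limited(p, sem) for p in paths))
```

Whether this would fix the failure above depends on which resource is actually being exhausted; a semaphore bounds concurrent reads but does not by itself bound resources that are created once per event loop.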

We never had a requirement for asyncio in aerovaldb. It was more of a "cool to have" feature, but it makes working with aerovaldb very complicated, and since we see no real performance benefit, but rather a degradation, we can drop it.

Please run a test on aeroval-api with ab or similar before and after dropping asyncio, and inform @AugustinMortier, too.

@thorbjoernl
Collaborator Author

Closed by #47
