Spurious paramiko.ssh_exception.SSHException: Error reading SSH protocol banner #4944

Closed
dev-zero opened this issue May 14, 2021 · 4 comments

Comments

@dev-zero
Contributor

Describe the bug

I occasionally get the following error when running calculations on a fresh installation of AiiDA, despite having run verdi computer test and using a password-less key file:

05/14/2021 09:46:11 AM <24898> aiida.transport.SshTransport: [ERROR] Error connecting to 'eiger.cscs.ch' through SSH: [SshTransport] Error reading SSH protocol banner, connect_args were: {'username': 'timuel', 'port': 22, 'look_for_keys': True, 'key_filename': '/users/tiziano/.ssh/id_ed25519.cscs', 'timeout': 60, 'allow_agent': True, 'proxy_command': 'ssh -W eiger.cscs.ch:22 timuel@ela.cscs.ch', 'compress': True, 'gss_auth': False, 'gss_kex': False, 'gss_deleg_creds': False, 'gss_host': 'eiger.cscs.ch'}
05/14/2021 09:46:11 AM <24898> aiida.engine.transports: [ERROR] exception occurred while trying to open transport:
 Error reading SSH protocol banner
05/14/2021 09:46:11 AM <24898> aiida.engine.transports: [ERROR] Exception whilst using transport:
Traceback (most recent call last):
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/paramiko/transport.py", line 2211, in _check_banner
    buf = self.packetizer.readline(timeout)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/paramiko/packet.py", line 380, in readline
    buf += self._read_timeout(timeout)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/paramiko/packet.py", line 622, in _read_timeout
    raise socket.timeout()
socket.timeout
[...]
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 188, in exponential_backoff_retry
    result = await coro()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/tasks.py", line 190, in do_update
    job_info = await cancellable.with_interrupt(update_request)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 95, in with_interrupt
    result = await next(wait_iter)
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/tasks.py", line 614, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 188, in exponential_backoff_retry
    result = await coro()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/tasks.py", line 190, in do_update
    job_info = await cancellable.with_interrupt(update_request)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 95, in with_interrupt
    result = await next(wait_iter)
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/tasks.py", line 614, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 188, in exponential_backoff_retry
    result = await coro()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/tasks.py", line 190, in do_update
    job_info = await cancellable.with_interrupt(update_request)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 95, in with_interrupt
    result = await next(wait_iter)
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/tasks.py", line 614, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 188, in exponential_backoff_retry
    result = await coro()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/tasks.py", line 190, in do_update
    job_info = await cancellable.with_interrupt(update_request)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 95, in with_interrupt
    result = await next(wait_iter)
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/tasks.py", line 614, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 188, in exponential_backoff_retry
    result = await coro()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/tasks.py", line 190, in do_update
    job_info = await cancellable.with_interrupt(update_request)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 95, in with_interrupt
    result = await next(wait_iter)
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/tasks.py", line 614, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/tasks.py", line 258, in __step
    result = coro.throw(exc)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/manager.py", line 180, in updating
    await self._update_job_info()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/manager.py", line 132, in _update_job_info
    self._jobs_cache = await self._get_jobs_from_scheduler()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/manager.py", line 98, in _get_jobs_from_scheduler
    transport = await request
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/futures.py", line 284, in __await__
    yield self  # This tells Task to wait for completion.
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/tasks.py", line 328, in __wakeup
    future.result()
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/transports.py", line 89, in do_open
    transport.open()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/transports/plugins/ssh.py", line 438, in open
    self._client.connect(self._machine, **connection_arguments)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/paramiko/client.py", line 406, in connect
    t.start_client(timeout=timeout)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/paramiko/transport.py", line 660, in start_client
    raise e
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/paramiko/transport.py", line 2039, in run
    self._check_banner()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/paramiko/transport.py", line 2215, in _check_banner
    raise SSHException(
paramiko.ssh_exception.SSHException: Error reading SSH protocol banner
Task exception was never retrieved
future: <Task finished name='Task-421' coro=<JobsList._ensure_updating.<locals>.updating() done, defined at /scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/manager.py:178> exception=SSHException('Error reading SSH protocol banner')>
Traceback (most recent call last):
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/paramiko/transport.py", line 2211, in _check_banner
    buf = self.packetizer.readline(timeout)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/paramiko/packet.py", line 380, in readline
    buf += self._read_timeout(timeout)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/paramiko/packet.py", line 622, in _read_timeout
    raise socket.timeout()
socket.timeout

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 188, in exponential_backoff_retry
    result = await coro()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/tasks.py", line 190, in do_update
    job_info = await cancellable.with_interrupt(update_request)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 95, in with_interrupt
    result = await next(wait_iter)
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/tasks.py", line 614, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 188, in exponential_backoff_retry
    result = await coro()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/tasks.py", line 190, in do_update
    job_info = await cancellable.with_interrupt(update_request)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 95, in with_interrupt
    result = await next(wait_iter)
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/tasks.py", line 614, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 188, in exponential_backoff_retry
    result = await coro()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/tasks.py", line 190, in do_update
    job_info = await cancellable.with_interrupt(update_request)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 95, in with_interrupt
    result = await next(wait_iter)
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/tasks.py", line 614, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 188, in exponential_backoff_retry
    result = await coro()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/tasks.py", line 190, in do_update
    job_info = await cancellable.with_interrupt(update_request)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 95, in with_interrupt
    result = await next(wait_iter)
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/tasks.py", line 614, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 188, in exponential_backoff_retry
    result = await coro()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/tasks.py", line 190, in do_update
    job_info = await cancellable.with_interrupt(update_request)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/utils.py", line 95, in with_interrupt
    result = await next(wait_iter)
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/tasks.py", line 614, in _wait_for_one
    return f.result()  # May raise f.exception().
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/tasks.py", line 258, in __step
    result = coro.throw(exc)
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/manager.py", line 180, in updating
    await self._update_job_info()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/manager.py", line 132, in _update_job_info
    self._jobs_cache = await self._get_jobs_from_scheduler()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/processes/calcjobs/manager.py", line 98, in _get_jobs_from_scheduler
    transport = await request
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/futures.py", line 284, in __await__
    yield self  # This tells Task to wait for completion.
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/tasks.py", line 328, in __wakeup
    future.result()
  File "/users/tiziano/.pyenv/versions/3.9.5/lib/python3.9/asyncio/futures.py", line 201, in result
    raise self._exception
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/engine/transports.py", line 89, in do_open
    transport.open()
  File "/scratch/tiziano/work/aiida/aiida-core/aiida/transports/plugins/ssh.py", line 438, in open
    self._client.connect(self._machine, **connection_arguments)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/paramiko/client.py", line 406, in connect
    t.start_client(timeout=timeout)
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/paramiko/transport.py", line 660, in start_client
    raise e
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/paramiko/transport.py", line 2039, in run
    self._check_banner()
  File "/scratch/tiziano/virtualenvs/aiida/lib/python3.9/site-packages/paramiko/transport.py", line 2215, in _check_banner
    raise SSHException(
paramiko.ssh_exception.SSHException: Error reading SSH protocol banner

Steps to reproduce

Steps to reproduce the behavior:

Run a workchain from the command line (without going through the daemon) after a successful test of the computer.

Expected behavior

Not getting an error message OR getting more information to be able to debug the issue.
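One way to get more detail while debugging is to raise the log level of the "paramiko" logger, which is the logger name paramiko uses for its transport messages. The following is a minimal sketch using only the standard library; the helper name enable_paramiko_debug_logging is hypothetical, and how this interacts with AiiDA's own logging configuration is an assumption that would need checking.

```python
import logging
import sys


def enable_paramiko_debug_logging(stream=None):
    """Attach a DEBUG-level handler to the "paramiko" logger and return the logger.

    Hypothetical helper, not part of AiiDA or paramiko.
    """
    logger = logging.getLogger("paramiko")
    logger.setLevel(logging.DEBUG)

    # Default to stderr so the messages show up next to the AiiDA log output.
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s [%(levelname)s] %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Calling enable_paramiko_debug_logging() before the workchain is run from the command line should make the banner exchange and key negotiation steps visible, provided nothing else resets the logging configuration afterwards.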

Your environment

  • Operating system [e.g. Linux]: openSUSE LEAP 15.2
  • Python version [e.g. 3.7.1]: 3.9.5
  • aiida-core version [e.g. 1.2.1]: 403f7e7

Additional context

My guess is that the machines at CSCS have different SSH host keys for different hosts, and since they do a round robin for both the jump host and the login node, I occasionally end up on a node which is not yet in known_hosts.

@dev-zero
Contributor Author

Some more updates on this: since the workers do not maintain a single stable connection to the host, this can be caused by SSH trying to get keys from a forwarded SSH agent when the agent does not react (for example when a password/PIN needs to be entered). Unfortunately this always happens, even if a specific key was given and both look_for_keys and allow_agent are set to False.

@dev-zero
Contributor Author

Some more information: the issue seems to be coming from the ProxyCommand, which does not seem to get the memo about not looking for keys or not using the agent. After using ssh -i path/to/my/key in the proxy command, it now seems to work reliably.
So, this might be a case for improved documentation first, and then some better way to set it up...
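As an illustration of the workaround, here is a small sketch that builds a proxy command string with the key passed explicitly via -i. The helper build_proxy_command is hypothetical. The two -o options (IdentitiesOnly=yes and IdentityAgent=none) are standard OpenSSH client options that I am adding as an assumption; they were not part of what I tested above, and IdentityAgent needs a reasonably recent OpenSSH.

```python
import shlex


def build_proxy_command(jump_user, jump_host, target_host, key_path, port=22):
    """Return an ssh ProxyCommand string that uses only the given key file.

    Hypothetical helper for illustration; not part of AiiDA.
    """
    args = [
        "ssh",
        "-i", key_path,                # explicit identity file, as in the workaround
        "-o", "IdentitiesOnly=yes",    # assumption: restrict ssh to the given identity
        "-o", "IdentityAgent=none",    # assumption: do not contact the SSH agent
        "-W", f"{target_host}:{port}", # forward stdin/stdout to the target host
        f"{jump_user}@{jump_host}",
    ]
    # Quote each argument so paths with spaces survive shell-style splitting.
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    print(
        build_proxy_command(
            "timuel", "ela.cscs.ch", "eiger.cscs.ch",
            "/users/tiziano/.ssh/id_ed25519.cscs",
        )
    )
```

The resulting string would go where the proxy_command value shown in the connect_args of the log currently is.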

@dev-zero
Contributor Author

Instead of launching a really separate proxy process, it might be better to support the ProxyJump directive (since that's what's often needed), as illustrated in paramiko/paramiko#1018 (comment).
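A rough sketch of what a ProxyJump-style connection could look like with paramiko alone: connect to the jump host, open a "direct-tcpip" channel to the target, and hand that channel to the second client as its socket. The function connect_via_jump and its client_factory parameter are hypothetical names for illustration. This is not how AiiDA implements it, and host key handling is reduced to loading the system known_hosts.

```python
def connect_via_jump(jump_host, target_host, username, key_filename,
                     port=22, timeout=60, client_factory=None):
    """Connect to target_host through jump_host without a separate ssh process.

    Hypothetical sketch. Returns (jump_client, target_client); close both when done.
    """
    if client_factory is None:
        import paramiko  # imported lazily so the sketch loads without paramiko

        def client_factory():
            client = paramiko.SSHClient()
            client.load_system_host_keys()
            return client

    # Only the explicit key is used: no agent, no key discovery.
    common = dict(port=port, username=username, key_filename=key_filename,
                  timeout=timeout, allow_agent=False, look_for_keys=False)

    jump = client_factory()
    jump.connect(jump_host, **common)

    # Ask the jump host to open a TCP connection to the target on our behalf.
    channel = jump.get_transport().open_channel(
        "direct-tcpip", (target_host, port), ("127.0.0.1", 0)
    )

    # The channel acts as the socket for the second SSH session.
    target = client_factory()
    target.connect(target_host, sock=channel, **common)
    return jump, target
```

Because both hops go through the same Python process, the allow_agent and look_for_keys settings apply to the jump host as well, which is exactly what the external ProxyCommand was missing.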

@sphuber
Contributor

sphuber commented Mar 13, 2022

Should have been fixed by #4951

@sphuber sphuber closed this as completed Mar 13, 2022