Upgrade to iRODS 4.3 #16

Closed
16 of 17 tasks
mikkonie opened this issue Jan 31, 2023 · 18 comments
Labels
breaking (Breaking change, to be implemented and documented with care), feature (New feature or request)

@mikkonie

mikkonie commented Jan 31, 2023

There is a lot of internal demand for this, so I need to look into it.

Spec

  • Look into possible incompatibilities
    • Python 3 required, need to upgrade that
    • Authentication system updated, may require changes for PAM and/or SODAR auth to still work
    • New permission system, not sure if compatible with old one? (SODAR tests should catch it if not)
    • Surprise! Unattended config schema totally changed, need to rewrite template
    • Could upgrade image to Ubuntu 20.04, apparently
  • Update as needed
  • Ensure this works with SODAR
    • All tests running
    • Auth working as expected with PAM and SODAR
  • If upgrading existing server is not possible, add instructions on migrating existing 4.2 servers


@mikkonie mikkonie self-assigned this Jan 31, 2023
@mikkonie mikkonie changed the title from "Upgrade to iRODS 4.2" to "Upgrade to iRODS 4.3" Jan 31, 2023
@mikkonie

First issue when upgrading on an already installed iCAT:

Error encountered running irods_control start:
Traceback (most recent call last):
  File "/var/lib/irods/scripts/irods/json_validation.py", line 60, in validate_dict
    jsonschema.validate(config_dict, schema, resolver=jsonschema.RefResolver(schema_uri, schema))
  File "/usr/lib/python3/dist-packages/jsonschema/validators.py", line 541, in validate
    cls(schema, *args, **kwargs).validate(instance)
  File "/usr/lib/python3/dist-packages/jsonschema/validators.py", line 130, in validate
    raise error
jsonschema.exceptions.ValidationError: {'catalog_schema_version': 1, 'commit_id': '0000000000000000000000000000000000000000', 'configuration_schema_version': 2, 'irods_version': '4.1.0', 'schema_name': 'VERSION', 'schema_version': 'v2'} is valid under each of {'type': 'object', 'properties': {'catalog_schema_version': {'type': 'integer'}, 'commit_id': {'type': 'string', 'pattern': '^[0-9a-f]{40}$'}, 'configuration_schema_version': {'type': 'integer'}, 'installation_time': {'type': 'string', 'format': 'date-time'}, 'irods_version': {'type': 'string'}, 'previous_version': {'$ref': '#/properties/previous_version/oneOf/1'}}, 'required': ['catalog_schema_version', 'commit_id', 'configuration_schema_version', 'irods_version']}, {'$ref': '#'}

Failed validating 'oneOf' in schema['properties']['previous_version']:
    {'oneOf': [{'$ref': '#'},
               {'properties': {'catalog_schema_version': {'type': 'integer'},
                               'commit_id': {'pattern': '^[0-9a-f]{40}$',
                                             'type': 'string'},
                               'configuration_schema_version': {'type': 'integer'},
                               'installation_time': {'format': 'date-time',
                                                     'type': 'string'},
                               'irods_version': {'type': 'string'},
                               'previous_version': {'$ref': '#/properties/previous_version/oneOf/1'}},
                'required': ['catalog_schema_version',
                             'commit_id',
                             'configuration_schema_version',
                             'irods_version'],
                'type': 'object'}]}

On instance['previous_version']:
    {'catalog_schema_version': 1,
     'commit_id': '0000000000000000000000000000000000000000',
     'configuration_schema_version': 2,
     'irods_version': '4.1.0',
     'schema_name': 'VERSION',
     'schema_version': 'v2'}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/irods/scripts/irods_control.py", line 124, in main
    operations_dict[operation]()
  File "/var/lib/irods/scripts/irods_control.py", line 70, in <lambda>
    operations_dict['start'] = lambda: irods_controller.start(write_to_stdout=options.write_to_stdout, test_mode=options.test_mode)
  File "/var/lib/irods/scripts/irods/controller.py", line 94, in start
    self.config.validate_configuration()
  File "/var/lib/irods/scripts/irods/configuration.py", line 286, in validate_configuration
    config_file['path'])
  File "/var/lib/irods/scripts/irods/json_validation.py", line 79, in validate_dict
    sys.exc_info()[2])
  File "/var/lib/irods/scripts/irods/six.py", line 671, in reraise
    raise value.with_traceback(tb)
  File "/var/lib/irods/scripts/irods/json_validation.py", line 60, in validate_dict
    jsonschema.validate(config_dict, schema, resolver=jsonschema.RefResolver(schema_uri, schema))
  File "/usr/lib/python3/dist-packages/jsonschema/validators.py", line 541, in validate
    cls(schema, *args, **kwargs).validate(instance)
  File "/usr/lib/python3/dist-packages/jsonschema/validators.py", line 130, in validate
    raise error
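
The phrase "is valid under each of" in the error means that the `previous_version` value matched both branches of the `oneOf`, while `oneOf` accepts an instance only when exactly one branch matches. A minimal sketch of that rule, using plain Python predicates as simplified stand-ins for the two schema branches (the function names are hypothetical and this does not use the `jsonschema` library):

```python
REQUIRED = (
    "catalog_schema_version",
    "commit_id",
    "configuration_schema_version",
    "irods_version",
)


def matches_root_schema(instance):
    # Simplified stand-in for the {'$ref': '#'} branch.
    return isinstance(instance, dict) and all(k in instance for k in REQUIRED)


def matches_inline_schema(instance):
    # Simplified stand-in for the inline object branch (same required keys).
    return isinstance(instance, dict) and all(k in instance for k in REQUIRED)


def one_of(instance, branches):
    """Return True only if exactly one branch accepts the instance."""
    return sum(1 for branch in branches if branch(instance)) == 1


# The instance reported in the traceback.
previous_version = {
    "catalog_schema_version": 1,
    "commit_id": "0" * 40,
    "configuration_schema_version": 2,
    "irods_version": "4.1.0",
    "schema_name": "VERSION",
    "schema_version": "v2",
}

branches = [matches_root_schema, matches_inline_schema]
print(one_of(previous_version, branches))  # False: both branches match
```

Because both branches require the same keys, any complete version entry satisfies both, and the `oneOf` check rejects it.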

mikkonie added a commit that referenced this issue Jan 31, 2023
@mikkonie

mikkonie commented Jan 31, 2023

The aforementioned crash also breaks the existing server configuration in a way which prevents downgrading. This is very bad.

Even after fixing all issues, we should back up all server configs before attempting this upgrade in production.
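
A minimal sketch of such a backup step, assuming the configuration lives in directories or files that can simply be archived. The function name and the paths in the usage note are illustrative assumptions, not part of this image:

```python
import tarfile
import time
from pathlib import Path


def backup_irods_config(paths, dest_dir):
    """Archive the given config paths into a timestamped .tar.gz file."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / "irods-config-{}.tar.gz".format(stamp)
    with tarfile.open(archive, "w:gz") as tar:
        for path in map(Path, paths):
            if path.exists():  # skip paths that are absent on this server
                tar.add(path, arcname=path.name)
    return archive
```

It could be called as `backup_irods_config(["/etc/irods", "/var/lib/irods/version.json"], "/backup")` before the upgrade; which paths actually need saving depends on the deployment.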

@mikkonie

Clean install also fails. Apparently this will need a lot of work.

irods-test_1  | Traceback (most recent call last):
irods-test_1  |   File "/var/lib/irods/scripts/setup_irods.py", line 58, in <module>
irods-test_1  |     import irods.lib
irods-test_1  |   File "/var/lib/irods/scripts/irods/lib.py", line 15, in <module>
irods-test_1  |     import distro
irods-test_1  | ImportError: No module named distro
irods_1       | Perform iRODS setup
irods_1       | Traceback (most recent call last):
irods_1       |   File "/var/lib/irods/scripts/setup_irods.py", line 58, in <module>
irods_1       |     import irods.lib
irods_1       |   File "/var/lib/irods/scripts/irods/lib.py", line 15, in <module>
irods_1       |     import distro
irods_1       | ImportError: No module named distro
irods-test_1  | Password: 
postgres_1    | 2023-01-31 12:02:36.256 UTC [91] ERROR:  database "ICAT_TEST" already exists
postgres_1    | 2023-01-31 12:02:36.256 UTC [91] STATEMENT:  CREATE DATABASE "ICAT_TEST";
irods-test_1  | createdb: database creation failed: ERROR:  database "ICAT_TEST" already exists
irods_1       | Password: 
postgres_1    | 2023-01-31 12:02:36.267 UTC [92] ERROR:  database "ICAT" already exists
postgres_1    | 2023-01-31 12:02:36.267 UTC [92] STATEMENT:  CREATE DATABASE "ICAT";
irods_1       | createdb: database creation failed: ERROR:  database "ICAT" already exists
sodar-docker-compose-dev_irods-test_1 exited with code 1
sodar-docker-compose-dev_irods_1 exited with code 1

mikkonie added a commit that referenced this issue Jan 31, 2023
@mikkonie

mikkonie commented Jan 31, 2023

Got past the prior crash, here are some new ones.

Edit: The 1st one was fixed.

irods_1       | rsyslogd: imklog: cannot open kernel log (/proc/kmsg): Operation not permitted.
irods_1       | rsyslogd: activation of module imklog failed [v8.32.0 try http://www.rsyslog.com/e/2145 ]
irods_1       |    ...done.

This one persists at the time of writing:

irods_1       | Traceback (most recent call last):
irods_1       |   File "/var/lib/irods/scripts/setup_irods.py", line 529, in <module>
irods_1       |     sys.exit(main())
irods_1       |   File "/var/lib/irods/scripts/setup_irods.py", line 517, in main
irods_1       |     test_mode=options.test_mode)
irods_1       |   File "/var/lib/irods/scripts/setup_irods.py", line 110, in setup_server
irods_1       |     default_resource_name = json_configuration_dict['default_resource_name']
irods_1       | KeyError: 'default_resource_name'

Looks like the unattended config file template needs to be updated. Will be looking into the original.
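
A pre-flight check along these lines could surface missing keys before `setup_irods.py` runs. This is only a sketch: `default_resource_name` is the one key taken from the traceback, and the helper names are hypothetical. Extend the tuple with whatever else the setup script reads.

```python
import json

# Only 'default_resource_name' comes from the KeyError above; any further
# entries would have to be taken from the iRODS version being installed.
EXPECTED_KEYS = ("default_resource_name",)


def missing_keys(config, expected=EXPECTED_KEYS):
    """Return the expected top-level keys that are absent from the config."""
    return [key for key in expected if key not in config]


def check_unattended_config(path):
    """Load a JSON config file and raise if expected keys are missing."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    missing = missing_keys(config)
    if missing:
        raise ValueError(
            "Missing keys in {}: {}".format(path, ", ".join(missing))
        )
    return config
```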

mikkonie added a commit that referenced this issue Jan 31, 2023
mikkonie added a commit that referenced this issue Jan 31, 2023
@mikkonie

Unattended configuration file updated to match the current schema. This leads to the following error:

irods_1       | Error encountered running setup_irods:
irods_1       | Traceback (most recent call last):
irods_1       |   File "/var/lib/irods/scripts/setup_irods.py", line 517, in main
irods_1       |     test_mode=options.test_mode)
irods_1       |   File "/var/lib/irods/scripts/setup_irods.py", line 150, in setup_server
irods_1       |     test_put(irods_config)
irods_1       |   File "/var/lib/irods/scripts/setup_irods.py", line 180, in test_put
irods_1       |     raise IrodsError('Post-install test failed. Please check your configuration.')
irods_1       | irods.exceptions.IrodsError: Post-install test failed. Please check your configuration.

mikkonie added a commit that referenced this issue Jan 31, 2023
@mikkonie

Additional info in setup_log.txt about the aforementioned crash. Looks like a PAM plugin issue. Oh great, I'm sure this will not be a pain to fix.

+---------------------------+
| Running Post-Install Test |
+---------------------------+

2023-01-31T15:36:43.765Z -   DEBUG -                     execute.py:  52 - Calling ['/usr/sbin/irodsTestPutGet'] with options:
{'shell': False, 'stderr': -1, 'stdout': -1}
2023-01-31T15:36:44.046Z -   DEBUG -                     execute.py:  37 - Command /usr/sbin/irodsTestPutGet returned with code -6.
stderr:
  Error occurred while authenticating user [rods] [PLUGIN_ERROR_MISSING_SHARED_OBJECT: [-]	/irods_source/lib/core/include/irods/irods_load_plugin.hpp:157:irods::error irods::load_plugin(PluginType *&, const std::string &, const std::string &, const std::string &, const Ts &...) [PluginType = irods::experimental::auth::authentication_base, Ts = <char [14]>] :  status [PLUGIN_ERROR_MISSING_SHARED_OBJECT]  errno [] -- message [shared library does not exist [/usr/lib/irods/plugins/auth/libirods_auth_plugin-pam_client.so]]
  
  
  
  ] [ec=-1827000] failed with error -1827000 PLUGIN_ERROR_MISSING_SHARED_OBJECT 
  libc++abi: terminating with uncaught exception of type std::runtime_error: client login error
2023-01-31T15:36:44.046Z -   ERROR -                 setup_irods.py: 519 - Error encountered running setup_irods:
Traceback (most recent call last):
  File "/var/lib/irods/scripts/setup_irods.py", line 517, in main
    test_mode=options.test_mode)
  File "/var/lib/irods/scripts/setup_irods.py", line 150, in setup_server
    test_put(irods_config)
  File "/var/lib/irods/scripts/setup_irods.py", line 180, in test_put
    raise IrodsError('Post-install test failed. Please check your configuration.')
irods.exceptions.IrodsError: Post-install test failed. Please check your configuration.
2023-01-31T15:36:44.047Z -    INFO -                 setup_irods.py: 520 - Exiting...

@mikkonie

Just a note: the previous PAM error was fixed with the help of iRODS support. The syntax for PAM auth in configurations has changed: instead of PAM, it now expects pam_password.
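
As an illustration of that rename, here is a small sketch that rewrites the scheme in a client environment dictionary. It assumes the value is stored under the `irods_authentication_scheme` key, as in an `irods_environment.json` file; the function name is hypothetical, and other configuration files may hold the scheme elsewhere.

```python
import json


def migrate_auth_scheme(environment):
    """Return a copy of the environment with 'PAM' replaced by 'pam_password'."""
    updated = dict(environment)
    scheme = updated.get("irods_authentication_scheme")
    if isinstance(scheme, str) and scheme.upper() == "PAM":
        updated["irods_authentication_scheme"] = "pam_password"
    return updated


old_env = {"irods_user_name": "rods", "irods_authentication_scheme": "PAM"}
print(json.dumps(migrate_auth_scheme(old_env), indent=2))
```

Other schemes are left untouched, and the input dictionary is not modified.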

The current blocker is that the 4.3 API, or the Python client used by SODAR, does not work correctly with the iRODS server. Will look into that when I have time. May also consider waiting for 4.3.1 to come out.

@mikkonie

Server currently works with a clean install. SODAR auth via the custom PAM module is no longer working. I need to look into what has changed in the iRODS auth and attempt to update my custom module accordingly.

@mikkonie

mikkonie commented Oct 16, 2023

Currently the containers can be left unusable by a problem with version.json, which is apparently written by setup and is not included in the volumes. Only rebuilding the entire image fixes this. I'm trying to figure out what causes this.

This happens both in iRODS start and setup, so clearing the volumes and re-initializing everything will not help.

Error encountered running irods_control start:
Traceback (most recent call last):
  File "/var/lib/irods/scripts/irods/json_validation.py", line 60, in validate_dict
    jsonschema.validate(config_dict, schema, resolver=jsonschema.RefResolver(schema_uri, schema))
  File "/usr/lib/python3/dist-packages/jsonschema/validators.py", line 541, in validate
    cls(schema, *args, **kwargs).validate(instance)
  File "/usr/lib/python3/dist-packages/jsonschema/validators.py", line 130, in validate
    raise error
jsonschema.exceptions.ValidationError: {'catalog_schema_version': 1, 'commit_id': '0000000000000000000000000000000000000000', 'configuration_schema_version': 2, 'irods_version': '4.1.0', 'schema_name': 'VERSION', 'schema_version': 'v2'} is valid under each of {'type': 'object', 'properties': {'catalog_schema_version': {'type': 'integer'}, 'commit_id': {'type': 'string', 'pattern': '^[0-9a-f]{40}$'}, 'configuration_schema_version': {'type': 'integer'}, 'installation_time': {'type': 'string', 'format': 'date-time'}, 'irods_version': {'type': 'string'}, 'previous_version': {'$ref': '#/properties/previous_version/oneOf/1'}}, 'required': ['catalog_schema_version', 'commit_id', 'configuration_schema_version', 'irods_version']}, {'$ref': '#'}

Failed validating 'oneOf' in schema['properties']['previous_version']:
    {'oneOf': [{'$ref': '#'},
               {'properties': {'catalog_schema_version': {'type': 'integer'},
                               'commit_id': {'pattern': '^[0-9a-f]{40}$',
                                             'type': 'string'},
                               'configuration_schema_version': {'type': 'integer'},
                               'installation_time': {'format': 'date-time',
                                                     'type': 'string'},
                               'irods_version': {'type': 'string'},
                               'previous_version': {'$ref': '#/properties/previous_version/oneOf/1'}},
                'required': ['catalog_schema_version',
                             'commit_id',
                             'configuration_schema_version',
                             'irods_version'],
                'type': 'object'}]}

On instance['previous_version']:
    {'catalog_schema_version': 1,
     'commit_id': '0000000000000000000000000000000000000000',
     'configuration_schema_version': 2,
     'irods_version': '4.1.0',
     'schema_name': 'VERSION',
     'schema_version': 'v2'}

Update: This error occurs (at least) when we recreate the image on an already provisioned environment. It seems we need to add some more directories to persistent storage via config/volumes. It's possible this same problem also exists in the 4.2 branch, but in any case we should be able to handle an image update on a provisioned server.

@mikkonie

mikkonie commented Jan 24, 2024

Fixed the problem with version.json: we just have to copy it to /etc/irods after provisioning and copy it back to /var/lib/irods if running on a provisioned server.
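
The copy logic can be sketched roughly as follows. This is not the image's actual startup script: the function name is hypothetical, and treating "a saved copy exists" as the sign of a provisioned server is an assumption.

```python
import shutil
from pathlib import Path


def sync_version_file(lib_dir, etc_dir, name="version.json"):
    """Keep version.json in step between the image and a persistent volume.

    Returns "restored", "saved" or "skipped" depending on what was done.
    """
    lib_file = Path(lib_dir) / name  # written by iRODS setup inside the image
    etc_file = Path(etc_dir) / name  # assumed to live on a persistent volume
    if etc_file.exists():
        # Already provisioned: put the saved copy back where iRODS reads it.
        shutil.copy2(etc_file, lib_file)
        return "restored"
    if lib_file.exists():
        # Fresh provisioning: save the file that setup has just written.
        shutil.copy2(lib_file, etc_file)
        return "saved"
    return "skipped"
```

With the paths from this comment, the call would be `sync_version_file("/var/lib/irods", "/etc/irods")`, run once after setup and again on every container start.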

@mikkonie

mikkonie commented Jun 6, 2024

Starting to look into this again to hopefully finalize this image soon and work towards getting it deployed with SODAR.

While I was on sick leave, iRODS v4.3.2 was released. First thing is to upgrade to that and see if previously working things are still OK.

@mikkonie

mikkonie commented Aug 7, 2024

As I kind of expected, upgrading the target iRODS version from 4.3.1 to 4.3.2 does not work on the fly. The server stays up for a short while and performs actions successfully, but then it dies. Same thing after restart.

I need to get logging up and try to see what could be causing this. 4.3.1 was working just fine for me locally.

This may have something to do with the python-irodsclient version in use, maybe a bad request breaks the server. But this is simply a hunch. Upgrading to a newer version has its issues as well, see bihealth/sodar-server#1955.

@mikkonie

mikkonie commented Sep 17, 2024

Back at it again. It seems installing iRODS itself has changed at some point.

mikkonie added a commit that referenced this issue Sep 17, 2024
mikkonie added a commit that referenced this issue Sep 17, 2024
@mikkonie

mikkonie commented Sep 17, 2024

After fixing build issues, iRODS startup fails when running the container:

irods-1       | Start iRODS
irods-1       | Test iinit
irods-1       | /irods_login.sh: line 3: iinit: command not found
irods-1       | iinit failed

Problem with the irods-icommands setup, I guess? Again, this didn't happen just a while ago with 4.3.1.

Update: Fixed by explicitly adding irods-icommands in dependencies to be installed.
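
A build-time or startup check like the sketch below could catch a missing package earlier than the login script does. The helper name is hypothetical; `iinit` is the command from the log above.

```python
import shutil


def missing_commands(names):
    """Return the commands from names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]
```

For example, `missing_commands(["iinit"])` returns `["iinit"]` when the icommands are not installed, and an empty list once they are.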

mikkonie added a commit that referenced this issue Sep 17, 2024
mikkonie added a commit that referenced this issue Sep 18, 2024
mikkonie added a commit that referenced this issue Sep 19, 2024
mikkonie added a commit that referenced this issue Sep 19, 2024
@mikkonie mikkonie added feature New feature or request breaking Breaking change, to be implemented and documented with care and removed breaking Breaking change, to be implemented and documented with care labels Sep 19, 2024
@mikkonie

mikkonie commented Oct 1, 2024

Looking into the custom PAM module issue. /var/log/auth.log says the following:

Oct  1 09:25:59 irods /usr/local/lib/pam-sodar/pam_sodar.py[1030]: Traceback (most recent call last):
Oct  1 09:25:59 irods /usr/local/lib/pam-sodar/pam_sodar.py[1030]:   File "/usr/local/lib/pam-sodar/pam_sodar.py", line 8, in <module>
Oct  1 09:25:59 irods /usr/local/lib/pam-sodar/pam_sodar.py[1030]:     import requests
Oct  1 09:25:59 irods /usr/local/lib/pam-sodar/pam_sodar.py[1030]: ImportError: No module named requests
Oct  1 09:25:59 irods irodsPamAuthCheck[1030]: pam_unix(irods:auth): check pass; user unknown
Oct  1 09:25:59 irods irodsPamAuthCheck[1030]: pam_unix(irods:auth): authentication failure; logname= uid=1000 euid=0 tty= ruser= rhost=

Seems simple enough. However, adding pip3 install requests in the Dockerfile does not help. I guess pam_python runs its own (Python 2?) libraries or something? However, this did work in the 4.2 version of this image. Looking into it.
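
One detail supports the Python 2 guess: the unquoted wording `ImportError: No module named requests` is the Python 2 message format, whereas Python 3.6 and later report `ModuleNotFoundError: No module named 'requests'`. A small diagnostic, written to run under either interpreter, can confirm which one is executing the module. The function name is hypothetical; the snippet would be placed temporarily at the top of the PAM script.

```python
import sys


def describe_interpreter(modules=("requests",)):
    """Report the running Python version and which modules fail to import."""
    report = {"python": "%d.%d" % sys.version_info[:2], "missing": []}
    for name in modules:
        try:
            __import__(name)
        except ImportError:  # also covers ModuleNotFoundError on Python 3
            report["missing"].append(name)
    return report


# Written to stderr so it does not interfere with normal output.
sys.stderr.write(repr(describe_interpreter()) + "\n")
```

If the reported version is 2.x, packages installed with `pip3` are not visible to that interpreter.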

mikkonie added a commit that referenced this issue Oct 1, 2024
@mikkonie

mikkonie commented Oct 1, 2024

Custom PAM auth issues fixed, albeit with an ugly hack. I will add a separate issue for making it prettier.

@mikkonie mikkonie changed the title from "Upgrade to iRODS 4.3" to "Upgrade for iRODS 4.3" Nov 14, 2024
@mikkonie mikkonie changed the title from "Upgrade for iRODS 4.3" to "Upgrade to iRODS 4.3" Nov 14, 2024
@mikkonie

All done, closing this issue. Further improvements will go to a possible 4.3.3-2 release or further subsequent releases.
