Optimize `get_classes_by_slot` round 2 #304

sneakers-the-rat · 2024-03-05T04:58:28Z

Waiting for tests to run on the array branch and browsing through issues and PRs, noticed there was one for optimizing get_classes_by_slot #300 and any time there's an induced_slot perf thing i gotta check it out now.

For non-induced slots, #300 is a perf wash (for 100 rounds through all slots in biolink, 6.77 before and 6.81 after), but when i went to profile it with induced slots i was surprised at how long the estimated time for a single round through was supposed to take - roughly 3 hours (~160,000x slower vs non-induced slots).

Using sets is a great idea for removing the step of making something unique at the end. we can take advantage of that further by breaking after the first set addition (since future set additions will do nothing).

the problem is that we are still making basically the most expensive call you can do with schemaview, which is to call induced_slot on every combination of slot and class for a schema. induced_slot is expensive in itself, but it's usually fine to do because you're only calling it in the context of a few classes and the caching can handle how slow it is, but since the call is every slot in the context of every class, no caching can be done.

The fix is relatively simple - since get_classes_by_slot only cares about the existence of the slot on the class, not on its full induced form, there is no need to use induced_slot at all. class_induced_slots iterates over the slots given by class_slots:

linkml-runtime/linkml_runtime/utils/schemaview.py

Line 1368 in 7c311d9

    
           return [self.induced_slot(sn, class_name, imports=imports) for sn in self.class_slots(class_name)]

so any slots that class_induced_slots knows about, class_slots knows about. The implementation in this PR also avoids iterating over the classes twice, although that's also a perf wash because iterating through a list and comparing strings takes basically no time.

So for the simple timing function

from linkml_runtime.utils.schemaview import SchemaView
import cProfile

def get_slot_classes():
    sv = SchemaView('biolink.yaml')
    for i in range(100):
        for slot in sv.all_slots(imports=True).values():
            _ = sv.get_classes_by_slot(slot, include_induced=False)

def get_induced_slot_classes():
    sv = SchemaView('biolink.yaml')
    for i in range(1):
        for i, slot in enumerate(sv.all_slots(imports=True).values()):
            _ = sv.get_classes_by_slot(slot, include_induced=True)
            if i >= 4:
                break

cProfile.run('get_slot_classes()', 'before_stats.prof')
cProfile.run('get_induced_slot_classes()', 'before_induced_stats.prof')

where i break out of getting the induced slots after 5 slots because it takes so long,

you get:

version	regular per-call	induced per-call (s)
before 300	8.234e-05	21.59
after 300	8.092e-05	21.85
this PR	6.672e-05	0.01946

So i'll call the regular difference in the noise and the induced slot call roughly 1100x faster. A full run over biolink takes 12s vs ~3h.

attached profiling results where beforest is before 300, before is 300 and after is this PR.

profiling-get_classes_by_slot.zip

…uce them

codecov · 2024-03-05T05:00:24Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 62.88%. Comparing base (27b9158) to head (ee4e4b2).
Report is 31 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #304   +/-   ##
=======================================
  Coverage   62.88%   62.88%           
=======================================
  Files          62       62           
  Lines        8528     8528           
  Branches     2436     2435    -1     
=======================================
  Hits         5363     5363           
  Misses       2554     2554           
  Partials      611      611

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

cmungall

Thx!

I would like to test this with the main linkml repo before making a new (non-rc) runtime release, we don't have a great SOP for this yet...

sneakers-the-rat · 2024-03-05T19:59:13Z

would that be helpful? would that look like just running the main linkml tests with the PR'd version of linkml-runtime, or did you have something else in mind? for now I can just run them and post results

cmungall · 2024-03-05T20:32:24Z

would that look like just running the main linkml tests with the PR'd version of linkml-runtime

That would work! IMO it's a bit clunky and it's a big ask any time there is a schemaview change but induced slots are central enough it's worth doing here, thx!

sneakers-the-rat · 2024-03-05T23:13:17Z

So yes test failures, but i honestly can't tell if it's from this or not.. i'll investigate more thoroughly later, but would advise not merging yet. Tests also took 2 hours to run and i can't rly fathom why. will also profile that later.

====================================================== short test summary info =======================================================
FAILED tests/test_issues/test_issue_494.py::test_jsonschema_validation - AssertionError: Regex pattern did not match.
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_config_extends_recommended - AssertionError: 1 != 2
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_directory_of_files - AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmp0wrjounp/schema...
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_no_config - AssertionError: 1 != 2
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_no_schema_errors - AssertionError: 1 != 0
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_processes_dot_files_in_directory_when_a_option_provided - AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmpf87woivi/schema...
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_processes_dot_files_in_directory_when_all_option_provided - AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmps6ym618p/schema...
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_validate_schema - AssertionError: 1 != 2
FAILED tests/test_linter/test_rule_canonical_prefixes.py::TestCanonicalPrefixesRule::test_custom_context - AttributeError: 'NoneType' object has no attribute 'append'
FAILED tests/test_linter/test_rule_canonical_prefixes.py::TestCanonicalPrefixesRule::test_default_merged_context - AttributeError: 'NoneType' object has no attribute 'append'
ERROR tests/test_workspaces/test_example_runner.py::test_load_from_dict - AttributeError: 'NoneType' object has no attribute 'append'
ERROR tests/test_workspaces/test_example_runner.py::test_example_runner - AttributeError: 'NoneType' object has no attribute 'append'
ERROR tests/test_workspaces/test_example_runner.py::test_example_runner_non_defaults - AttributeError: 'NoneType' object has no attribute 'append'
========== 10 failed, 6151 passed, 964 skipped, 3 xfailed, 4 warnings, 3 errors, 261 subtests passed in 6882.77s (1:54:42) ===========

expand/collapse pytest output

=============================================================== ERRORS ===============================================================
_______________________________________________ ERROR at setup of test_load_from_dict ________________________________________________

input_path = <function input_path.<locals>.get_path at 0x144044e00>
tmp_path = PosixPath('/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/pytest-of-jonny/pytest-69/test_load_from_dict0')

    @pytest.fixture
    def example_runner(input_path, tmp_path):
        schemaview = SchemaView(input_path("personinfo.yaml"))
>       ctxt = load_multi_context(["obo", "linked_data", "prefixcc"])

tests/test_workspaces/test_example_runner.py:19: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/io/parser.py:46: in load_multi_context
    ctxt.combine(load_context(n, refresh=refresh))
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:161: in combine
    self.add_prefix(pe.prefix, pe.namespace, pe.status)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Context(name='obo+linked_data+prefixcc', description=None, prefix_expansions=[], comments=None, location=None, format=..., merged_from=None, upper=None, lower=None, _prefixes=None, _prefixes_lower=[], _namespaces=None, _namespaces_lower=[])
prefix = 'AAO', namespace = 'http://purl.obolibrary.org/obo/AAO_', status = <StatusType.canonical: 'canonical'>, preferred = False

    def add_prefix(
        self,
        prefix: PREFIX,
        namespace: NAMESPACE,
        status: StatusType = StatusType.canonical,
        preferred: bool = False,
    ):
        """
        Adds a prefix expansion to this context.
    
        The current context stays canonical. Additional prefixes
        added may be classified as non-canonical.
    
        If upper or lower is set for this context, the
        prefix will be auto-case normalized,
        UNLESS preferred=True
    
        :param prefix: prefix to be added
        :param namespace: namespace to be added
        :param status: the status of the prefix being added
        :param preferred:
        :return:
        """
        # TODO: check status
        if not preferred:
            if self.upper:
                prefix = prefix.upper()
                if self.lower:
                    raise ValueError("Cannot set both upper AND lower")
            if self.lower:
                prefix = prefix.lower()
        prefixes = self.prefixes(lower=True)
        namespaces = self.namespaces(lower=True)
        if prefix.lower() in prefixes:
            if namespace.lower() in namespaces:
                return
                # status = StatusType.multi_alias
            else:
                status = StatusType.prefix_alias
                self._namespaces.append(namespace)
                self._namespaces_lower.append(namespace.lower())
        else:
>           self._prefixes.append(prefix)
E           AttributeError: 'NoneType' object has no attribute 'append'

../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:205: AttributeError
_______________________________________________ ERROR at setup of test_example_runner ________________________________________________

input_path = <function input_path.<locals>.get_path at 0x144044a40>
tmp_path = PosixPath('/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/pytest-of-jonny/pytest-69/test_example_runner0')

    @pytest.fixture
    def example_runner(input_path, tmp_path):
        schemaview = SchemaView(input_path("personinfo.yaml"))
>       ctxt = load_multi_context(["obo", "linked_data", "prefixcc"])

tests/test_workspaces/test_example_runner.py:19: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/io/parser.py:46: in load_multi_context
    ctxt.combine(load_context(n, refresh=refresh))
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:161: in combine
    self.add_prefix(pe.prefix, pe.namespace, pe.status)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Context(name='obo+linked_data+prefixcc', description=None, prefix_expansions=[], comments=None, location=None, format=..., merged_from=None, upper=None, lower=None, _prefixes=None, _prefixes_lower=[], _namespaces=None, _namespaces_lower=[])
prefix = 'AAO', namespace = 'http://purl.obolibrary.org/obo/AAO_', status = <StatusType.canonical: 'canonical'>, preferred = False

    def add_prefix(
        self,
        prefix: PREFIX,
        namespace: NAMESPACE,
        status: StatusType = StatusType.canonical,
        preferred: bool = False,
    ):
        """
        Adds a prefix expansion to this context.
    
        The current context stays canonical. Additional prefixes
        added may be classified as non-canonical.
    
        If upper or lower is set for this context, the
        prefix will be auto-case normalized,
        UNLESS preferred=True
    
        :param prefix: prefix to be added
        :param namespace: namespace to be added
        :param status: the status of the prefix being added
        :param preferred:
        :return:
        """
        # TODO: check status
        if not preferred:
            if self.upper:
                prefix = prefix.upper()
                if self.lower:
                    raise ValueError("Cannot set both upper AND lower")
            if self.lower:
                prefix = prefix.lower()
        prefixes = self.prefixes(lower=True)
        namespaces = self.namespaces(lower=True)
        if prefix.lower() in prefixes:
            if namespace.lower() in namespaces:
                return
                # status = StatusType.multi_alias
            else:
                status = StatusType.prefix_alias
                self._namespaces.append(namespace)
                self._namespaces_lower.append(namespace.lower())
        else:
>           self._prefixes.append(prefix)
E           AttributeError: 'NoneType' object has no attribute 'append'

../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:205: AttributeError
_________________________________________ ERROR at setup of test_example_runner_non_defaults _________________________________________

input_path = <function input_path.<locals>.get_path at 0x1440474c0>
tmp_path = PosixPath('/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/pytest-of-jonny/pytest-69/test_example_runner_non_defaul0')

    @pytest.fixture
    def example_runner(input_path, tmp_path):
        schemaview = SchemaView(input_path("personinfo.yaml"))
>       ctxt = load_multi_context(["obo", "linked_data", "prefixcc"])

tests/test_workspaces/test_example_runner.py:19: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/io/parser.py:46: in load_multi_context
    ctxt.combine(load_context(n, refresh=refresh))
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:161: in combine
    self.add_prefix(pe.prefix, pe.namespace, pe.status)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Context(name='obo+linked_data+prefixcc', description=None, prefix_expansions=[], comments=None, location=None, format=..., merged_from=None, upper=None, lower=None, _prefixes=None, _prefixes_lower=[], _namespaces=None, _namespaces_lower=[])
prefix = 'AAO', namespace = 'http://purl.obolibrary.org/obo/AAO_', status = <StatusType.canonical: 'canonical'>, preferred = False

    def add_prefix(
        self,
        prefix: PREFIX,
        namespace: NAMESPACE,
        status: StatusType = StatusType.canonical,
        preferred: bool = False,
    ):
        """
        Adds a prefix expansion to this context.
    
        The current context stays canonical. Additional prefixes
        added may be classified as non-canonical.
    
        If upper or lower is set for this context, the
        prefix will be auto-case normalized,
        UNLESS preferred=True
    
        :param prefix: prefix to be added
        :param namespace: namespace to be added
        :param status: the status of the prefix being added
        :param preferred:
        :return:
        """
        # TODO: check status
        if not preferred:
            if self.upper:
                prefix = prefix.upper()
                if self.lower:
                    raise ValueError("Cannot set both upper AND lower")
            if self.lower:
                prefix = prefix.lower()
        prefixes = self.prefixes(lower=True)
        namespaces = self.namespaces(lower=True)
        if prefix.lower() in prefixes:
            if namespace.lower() in namespaces:
                return
                # status = StatusType.multi_alias
            else:
                status = StatusType.prefix_alias
                self._namespaces.append(namespace)
                self._namespaces_lower.append(namespace.lower())
        else:
>           self._prefixes.append(prefix)
E           AttributeError: 'NoneType' object has no attribute 'append'

../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:205: AttributeError
============================================================== FAILURES ==============================================================
_____________________________________________________ test_jsonschema_validation _____________________________________________________

input_path = <function input_path.<locals>.get_path at 0x13de5f880>

    def test_jsonschema_validation(input_path):
        """Test case that tries to infer the target class for a schema
        that has a local import and doesn't explicitly specify the
        target class"""
        SCHEMA = input_path("issue_494/slot_usage_example.yaml")
        sv = SchemaView(schema=SCHEMA)
    
        with pytest.raises(
            RuntimeError,
            match="Multiple potential target "
            r"classes found: \['Container', 'annotation'\]. "
            "Please specify a target using --target-class "
            "or by adding tree_root: true to the relevant class in the schema",
        ):
            # infer target_class rather than explicit specification
>           infer_root_class(sv)

tests/test_issues/test_issue_494.py:22: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

sv = SchemaView(schema=SchemaDefinition(name='personinfo', id_prefixes=[], id_prefixes_are_closed=None, definition_uri=None...one, slot_names_unique=None, settings={})}, importmap={}, modifications=0, uuid='70a0f3ff-b715-4b8b-83f3-9856da459a2f')

    def infer_root_class(sv: SchemaView) -> Optional[ClassDefinitionName]:
        """
        Infer the class that should be at the root of the object tree
    
        (Note this is distinct from the root of the class hierarchy)
    
        If a class is explicitly designated with tree_root, use this.
        Otherwise use the class that is not referenced as a range in any other class.
        """
        for c in sv.all_classes().values():
            if c.tree_root:
                return c.name
        refs = defaultdict(int)
        for cn in sv.all_classes().keys():
            for sn in sv.class_slots(cn):
                slot = sv.induced_slot(sn, cn)
                r = slot.range
                if r in sv.all_classes():
                    for a in sv.class_ancestors(r):
                        refs[a] += 1
        candidates = [cn for cn in sv.all_classes().keys() if cn not in refs]
    
        # throw Exception if unambiguous root cannot be inferred
        if len(candidates) > 1:
>           raise RuntimeError(
                f"Multiple potential target classes found: {candidates}. "
                "Please specify a target using --target-class or by adding "
                "tree_root: true to the relevant class in the schema."
            )
E           RuntimeError: Multiple potential target classes found: ['annotation', 'Container']. Please specify a target using --target-class or by adding tree_root: true to the relevant class in the schema.

linkml/utils/datautils.py:93: RuntimeError

During handling of the above exception, another exception occurred:

input_path = <function input_path.<locals>.get_path at 0x13de5f880>

    def test_jsonschema_validation(input_path):
        """Test case that tries to infer the target class for a schema
        that has a local import and doesn't explicitly specify the
        target class"""
        SCHEMA = input_path("issue_494/slot_usage_example.yaml")
        sv = SchemaView(schema=SCHEMA)
    
>       with pytest.raises(
            RuntimeError,
            match="Multiple potential target "
            r"classes found: \['Container', 'annotation'\]. "
            "Please specify a target using --target-class "
            "or by adding tree_root: true to the relevant class in the schema",
        ):
E       AssertionError: Regex pattern did not match.
E        Regex: "Multiple potential target classes found: \\['Container', 'annotation'\\]. Please specify a target using --target-class or by adding tree_root: true to the relevant class in the schema"
E        Input: "Multiple potential target classes found: ['annotation', 'Container']. Please specify a target using --target-class or by adding tree_root: true to the relevant class in the schema."

tests/test_issues/test_issue_494.py:14: AssertionError
--------------------------------------------------------- Captured log call ----------------------------------------------------------
INFO     root:schemaview.py:206 Importing testme as testme from source /Users/jonny/git/forks/linkml/tests/test_issues/input/issue_494/slot_usage_example.yaml; base_dir=/Users/jonny/git/forks/linkml/tests/test_issues/input/issue_494
INFO     root:schemaview.py:206 Importing linkml:types as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/types from source /Users/jonny/git/forks/linkml/tests/test_issues/input/issue_494/slot_usage_example.yaml; base_dir=None
___________________________________________ TestLinterCli.test_config_extends_recommended ____________________________________________

self = <tests.test_linter.test_cli.TestLinterCli testMethod=test_config_extends_recommended>

    def test_config_extends_recommended(self):
        config_file = "config.yaml"
        with self.runner.isolated_filesystem():
            write_schema_file()
            write_config_file(config_file, extends_recommended=True)
    
            result = self.runner.invoke(main, ["--config", config_file, SCHEMA_FILE])
>           self.assertEqual(result.exit_code, 2)
E           AssertionError: 1 != 2

tests/test_linter/test_cli.py:110: AssertionError
--------------------------------------------------------- Captured log call ----------------------------------------------------------
INFO     root:schemaview.py:206 Importing linkml:units as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/units from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:mappings as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/mappings from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:types as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/types from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:annotations as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/annotations from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:extensions as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/extensions from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
_______________________________________________ TestLinterCli.test_directory_of_files ________________________________________________

self = <tests.test_linter.test_cli.TestLinterCli testMethod=test_directory_of_files>

        def test_directory_of_files(self):
            schema_dir = Path("schemas")
            schema_a = schema_dir / "schema_a.yaml"
            schema_b = schema_dir / "schema_b.yaml"
            with self.runner.isolated_filesystem():
                schema_dir.mkdir()
                schema_a.write_text(
                    """
    id: http://example.org/test_a
    name: test_a
    
    classes:
      person:
        description: An individual human
    """
                )
                schema_b.write_text(
                    """
    id: http://example.org/test_b
    name: test_b
    
    slots:
      a slot:
        description: A slot to hold thing
    """
                )
    
                result = self.runner.invoke(main, [str(schema_dir)])
                self.assertEqual(result.exit_code, 1)
>               self.assertIn(str(schema_a), result.stdout)
E               AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmp0wrjounp/schemas/schema_b.yaml\n  warning  Slot has name 'a slot'  (standard_naming)\n"

tests/test_linter/test_cli.py:215: AssertionError
____________________________________________________ TestLinterCli.test_no_config ____________________________________________________

self = <tests.test_linter.test_cli.TestLinterCli testMethod=test_no_config>

    def test_no_config(self):
        with self.runner.isolated_filesystem():
            write_schema_file()
    
            result = self.runner.invoke(main, [SCHEMA_FILE])
>           self.assertEqual(result.exit_code, 2)
E           AssertionError: 1 != 2

tests/test_linter/test_cli.py:65: AssertionError
________________________________________________ TestLinterCli.test_no_schema_errors _________________________________________________

self = <tests.test_linter.test_cli.TestLinterCli testMethod=test_no_schema_errors>

        def test_no_schema_errors(self):
            with self.runner.isolated_filesystem():
                with open(SCHEMA_FILE, "w") as f:
                    f.write(
                        """
    id: http://example.org/test
    name: test
    
    classes:
      Person:
        description: An individual human
    """
                    )
    
                result = self.runner.invoke(main, [SCHEMA_FILE])
>               self.assertEqual(result.exit_code, 0)
E               AssertionError: 1 != 0

tests/test_linter/test_cli.py:184: AssertionError
_____________________________ TestLinterCli.test_processes_dot_files_in_directory_when_a_option_provided _____________________________

self = <tests.test_linter.test_cli.TestLinterCli testMethod=test_processes_dot_files_in_directory_when_a_option_provided>

        def test_processes_dot_files_in_directory_when_a_option_provided(self):
            schema_dir = Path("schemas")
            schema_a = schema_dir / "schema_a.yaml"
            schema_b = schema_dir / ".schema_b.yaml"
            with self.runner.isolated_filesystem():
                schema_dir.mkdir()
                schema_a.write_text(
                    """
    id: http://example.org/test_a
    name: test_a
    
    classes:
      person:
        description: An individual human
    """
                )
                schema_b.write_text(
                    """
    id: http://example.org/test_b
    name: test_b
    
    slots:
      a slot:
        description: A slot to hold thing
    """
                )
    
                result = self.runner.invoke(main, ["-a", str(schema_dir)])
                self.assertEqual(result.exit_code, 1)
>               self.assertIn(str(schema_a), result.stdout)
E               AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmpf87woivi/schemas/.schema_b.yaml\n  warning  Slot has name 'a slot'  (standard_naming)\n"

tests/test_linter/test_cli.py:283: AssertionError
____________________________ TestLinterCli.test_processes_dot_files_in_directory_when_all_option_provided ____________________________

self = <tests.test_linter.test_cli.TestLinterCli testMethod=test_processes_dot_files_in_directory_when_all_option_provided>

        def test_processes_dot_files_in_directory_when_all_option_provided(self):
            schema_dir = Path("schemas")
            schema_a = schema_dir / "schema_a.yaml"
            schema_b = schema_dir / ".schema_b.yaml"
            with self.runner.isolated_filesystem():
                schema_dir.mkdir()
                schema_a.write_text(
                    """
    id: http://example.org/test_a
    name: test_a
    
    classes:
      person:
        description: An individual human
    """
                )
                schema_b.write_text(
                    """
    id: http://example.org/test_b
    name: test_b
    
    slots:
      a slot:
        description: A slot to hold thing
    """
                )
    
                result = self.runner.invoke(main, ["--all", str(schema_dir)])
                self.assertEqual(result.exit_code, 1)
>               self.assertIn(str(schema_a), result.stdout)
E               AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmps6ym618p/schemas/.schema_b.yaml\n  warning  Slot has name 'a slot'  (standard_naming)\n"

tests/test_linter/test_cli.py:317: AssertionError
_________________________________________________ TestLinterCli.test_validate_schema _________________________________________________

self = <tests.test_linter.test_cli.TestLinterCli testMethod=test_validate_schema>

        def test_validate_schema(self):
            with self.runner.isolated_filesystem():
                with open(SCHEMA_FILE, "w") as f:
                    f.write(
                        """
    id: http://example.org/test
    classes:
        person:
            description: a person
    """
                    )
    
                result = self.runner.invoke(main, ["--validate", SCHEMA_FILE])
>               self.assertEqual(result.exit_code, 2)
E               AssertionError: 1 != 2

tests/test_linter/test_cli.py:335: AssertionError
--------------------------------------------------------- Captured log call ----------------------------------------------------------
INFO     root:generator.py:199 Using SchemaView with im=None
INFO     root:schemaview.py:206 Importing linkml:units as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/units from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:mappings as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/mappings from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:types as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/types from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:annotations as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/annotations from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
INFO     root:schemaview.py:206 Importing linkml:extensions as /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/extensions from source /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/linkml_runtime/linkml_model/model/schema/meta.yaml; base_dir=None
___________________________________________ TestCanonicalPrefixesRule.test_custom_context ____________________________________________

self = <tests.test_linter.test_rule_canonical_prefixes.TestCanonicalPrefixesRule testMethod=test_custom_context>

    def test_custom_context(self):
        builder = SchemaBuilder()
        builder.add_prefix("CHEBI", "http://purl.obolibrary.org/obo/CHEBI_")
        builder.add_prefix("GEO", "http://identifiers.org/geo/")
        builder.add_prefix("UPHENO", "http://example.com/wrong/upheno_")
        builder.add_prefix("WRONG", "http://identifiers.org/orcid/")
    
        schema_view = SchemaView(builder.schema)
        config = CanonicalPrefixesConfig(
            level=RuleLevel.error,
            prefixmaps_contexts=["bioregistry.upper", "prefixcc", "obo"],
        )
    
        rule = CanonicalPrefixesRule(config)
>       problems = list(rule.check(schema_view))

tests/test_linter/test_rule_canonical_prefixes.py:52: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
linkml/linter/rules.py:256: in check
    context = load_multi_context(self.config.prefixmaps_contexts)
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/io/parser.py:46: in load_multi_context
    ctxt.combine(load_context(n, refresh=refresh))
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:161: in combine
    self.add_prefix(pe.prefix, pe.namespace, pe.status)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Context(name='bioregistry.upper+prefixcc+obo', description=None, prefix_expansions=[], comments=None, location=None, f..., merged_from=None, upper=None, lower=None, _prefixes=None, _prefixes_lower=[], _namespaces=None, _namespaces_lower=[])
prefix = '3DMET', namespace = 'http://identifiers.org/3dmet/', status = <StatusType.canonical: 'canonical'>, preferred = False

    def add_prefix(
        self,
        prefix: PREFIX,
        namespace: NAMESPACE,
        status: StatusType = StatusType.canonical,
        preferred: bool = False,
    ):
        """
        Adds a prefix expansion to this context.
    
        The current context stays canonical. Additional prefixes
        added may be classified as non-canonical.
    
        If upper or lower is set for this context, the
        prefix will be auto-case normalized,
        UNLESS preferred=True
    
        :param prefix: prefix to be added
        :param namespace: namespace to be added
        :param status: the status of the prefix being added
        :param preferred:
        :return:
        """
        # TODO: check status
        if not preferred:
            if self.upper:
                prefix = prefix.upper()
                if self.lower:
                    raise ValueError("Cannot set both upper AND lower")
            if self.lower:
                prefix = prefix.lower()
        prefixes = self.prefixes(lower=True)
        namespaces = self.namespaces(lower=True)
        if prefix.lower() in prefixes:
            if namespace.lower() in namespaces:
                return
                # status = StatusType.multi_alias
            else:
                status = StatusType.prefix_alias
                self._namespaces.append(namespace)
                self._namespaces_lower.append(namespace.lower())
        else:
>           self._prefixes.append(prefix)
E           AttributeError: 'NoneType' object has no attribute 'append'

../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:205: AttributeError
_______________________________________ TestCanonicalPrefixesRule.test_default_merged_context ________________________________________

self = <tests.test_linter.test_rule_canonical_prefixes.TestCanonicalPrefixesRule testMethod=test_default_merged_context>

    def test_default_merged_context(self):
        builder = SchemaBuilder()
        builder.add_prefix("CHEBI", "http://purl.obolibrary.org/obo/CHEBI_")
        builder.add_prefix("GEO", "http://purl.obolibrary.org/obo/GEO_")
        builder.add_prefix("UPHENO", "http://example.com/wrong/upheno_")
        builder.add_prefix("WRONG", "http://identifiers.org/orcid/")
    
        schema_view = SchemaView(builder.schema)
        config = CanonicalPrefixesConfig(level=RuleLevel.error, prefixmaps_contexts=["merged"])
    
        rule = CanonicalPrefixesRule(config)
>       problems = list(rule.check(schema_view))

tests/test_linter/test_rule_canonical_prefixes.py:22: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
linkml/linter/rules.py:256: in check
    context = load_multi_context(self.config.prefixmaps_contexts)
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/io/parser.py:46: in load_multi_context
    ctxt.combine(load_context(n, refresh=refresh))
../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:161: in combine
    self.add_prefix(pe.prefix, pe.namespace, pe.status)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = Context(name='merged', description=None, prefix_expansions=[], comments=None, location=None, format=None, merged_from=None, upper=None, lower=None, _prefixes=None, _prefixes_lower=[], _namespaces=None, _namespaces_lower=[])
prefix = '3DMET', namespace = 'http://identifiers.org/3dmet/', status = <StatusType.canonical: 'canonical'>, preferred = False

    def add_prefix(
        self,
        prefix: PREFIX,
        namespace: NAMESPACE,
        status: StatusType = StatusType.canonical,
        preferred: bool = False,
    ):
        """
        Adds a prefix expansion to this context.
    
        The current context stays canonical. Additional prefixes
        added may be classified as non-canonical.
    
        If upper or lower is set for this context, the
        prefix will be auto-case normalized,
        UNLESS preferred=True
    
        :param prefix: prefix to be added
        :param namespace: namespace to be added
        :param status: the status of the prefix being added
        :param preferred:
        :return:
        """
        # TODO: check status
        if not preferred:
            if self.upper:
                prefix = prefix.upper()
                if self.lower:
                    raise ValueError("Cannot set both upper AND lower")
            if self.lower:
                prefix = prefix.lower()
        prefixes = self.prefixes(lower=True)
        namespaces = self.namespaces(lower=True)
        if prefix.lower() in prefixes:
            if namespace.lower() in namespaces:
                return
                # status = StatusType.multi_alias
            else:
                status = StatusType.prefix_alias
                self._namespaces.append(namespace)
                self._namespaces_lower.append(namespace.lower())
        else:
>           self._prefixes.append(prefix)
E           AttributeError: 'NoneType' object has no attribute 'append'

../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/prefixmaps/datamodel/context.py:205: AttributeError
========================================================== warnings summary ==========================================================
linkml/utils/generator.py:685
  /Users/jonny/git/forks/linkml/linkml/utils/generator.py:685: DeprecationWarning: invalid escape sequence '\_'
    """

../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/pyshexc/parser/ShExDocLexer.py:4
  /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/pyshexc/parser/ShExDocLexer.py:4: DeprecationWarning: typing.io is deprecated, import directly from typing instead. typing.io will be removed in Python 3.12.
    from typing.io import TextIO

linkml/linter/linter.py:38
  /Users/jonny/git/forks/linkml/linkml/linter/linter.py:38: DeprecationWarning: Importing Validator directly from the jsonschema package is deprecated and will become an ImportError. Import it from jsonschema.protocols instead.
    def get_metamodel_validator() -> jsonschema.Validator:

../../../Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/jupyter_client/connect.py:22
  /Users/jonny/Library/Caches/pypoetry/virtualenvs/linkml-G-1YJs6r-py3.11/lib/python3.11/site-packages/jupyter_client/connect.py:22: DeprecationWarning: Jupyter is migrating its paths to use standard platformdirs
  given by the platformdirs library.  To remove this warning and
  see the appropriate new directories, set the environment variable
  `JUPYTER_PLATFORM_DIRS=1` and then run `jupyter --paths`.
  The use of platformdirs will be the default in `jupyter_core` v6
    from jupyter_core.paths import jupyter_data_dir, jupyter_runtime_dir, secure_write

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
====================================================== short test summary info =======================================================
FAILED tests/test_issues/test_issue_494.py::test_jsonschema_validation - AssertionError: Regex pattern did not match.
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_config_extends_recommended - AssertionError: 1 != 2
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_directory_of_files - AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmp0wrjounp/schema...
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_no_config - AssertionError: 1 != 2
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_no_schema_errors - AssertionError: 1 != 0
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_processes_dot_files_in_directory_when_a_option_provided - AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmpf87woivi/schema...
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_processes_dot_files_in_directory_when_all_option_provided - AssertionError: 'schemas/schema_a.yaml' not found in "/private/var/folders/b8/28n_6_zn7flgh40028jxd8j80000gn/T/tmps6ym618p/schema...
FAILED tests/test_linter/test_cli.py::TestLinterCli::test_validate_schema - AssertionError: 1 != 2
FAILED tests/test_linter/test_rule_canonical_prefixes.py::TestCanonicalPrefixesRule::test_custom_context - AttributeError: 'NoneType' object has no attribute 'append'
FAILED tests/test_linter/test_rule_canonical_prefixes.py::TestCanonicalPrefixesRule::test_default_merged_context - AttributeError: 'NoneType' object has no attribute 'append'
ERROR tests/test_workspaces/test_example_runner.py::test_load_from_dict - AttributeError: 'NoneType' object has no attribute 'append'
ERROR tests/test_workspaces/test_example_runner.py::test_example_runner - AttributeError: 'NoneType' object has no attribute 'append'
ERROR tests/test_workspaces/test_example_runner.py::test_example_runner_non_defaults - AttributeError: 'NoneType' object has no attribute 'append'

cmungall

marking as Request changes until we can figure out the downstream issue

sneakers-the-rat · 2024-03-16T04:47:11Z

OK figured out what was wrong - it was just that i had installed my local version of prefixmaps in my linkml environment to test linkml/prefixmaps#69 and hadn't fixed a bug in it.

tests pass now, except for test_issue_494.py which fails because of this, which is pending in another PR: linkml/linkml@f148a18

this is good 2 merge i think, but go ahead and test it urself!

cmungall · 2024-04-02T21:40:38Z

triggering a manual build to test the new workflow in

Add action to test with upstream linkml #316

sneakers-the-rat · 2024-04-02T21:55:30Z

Omg I hope it works, lmk if it doesnt and can debug

sneakers-the-rat · 2024-04-02T22:39:21Z

Ok now that the action is in main i can test it on my fork. Sorry for the runaround with these CI actions, its specifically a weird thing for forks. Ill make it work

sneakers-the-rat added 2 commits March 4, 2024 20:14

optimize get_classes_by_slot to just iterate over slot named, not ind…

cefb3d6

…uce them

break early

ee4e4b2

cmungall approved these changes Mar 5, 2024

View reviewed changes

cmungall requested changes Mar 6, 2024

View reviewed changes

sneakers-the-rat marked this pull request as draft March 16, 2024 04:35

sneakers-the-rat marked this pull request as ready for review March 16, 2024 04:47

sujaypatil96 mentioned this pull request Apr 2, 2024

Account for any_of/exactly_one_of in get_classes_by_slot() and get_slots_by_enum() #318

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize `get_classes_by_slot` round 2 #304

Optimize `get_classes_by_slot` round 2 #304

sneakers-the-rat commented Mar 5, 2024 •

edited

Loading

codecov bot commented Mar 5, 2024 •

edited

Loading

cmungall left a comment

sneakers-the-rat commented Mar 5, 2024

cmungall commented Mar 5, 2024 •

edited

Loading

sneakers-the-rat commented Mar 5, 2024

cmungall left a comment

sneakers-the-rat commented Mar 16, 2024

cmungall commented Apr 2, 2024

sneakers-the-rat commented Apr 2, 2024

sneakers-the-rat commented Apr 2, 2024

Optimize get_classes_by_slot round 2 #304

Are you sure you want to change the base?

Optimize get_classes_by_slot round 2 #304

Conversation

sneakers-the-rat commented Mar 5, 2024 • edited Loading

codecov bot commented Mar 5, 2024 • edited Loading

Codecov Report

cmungall left a comment

Choose a reason for hiding this comment

sneakers-the-rat commented Mar 5, 2024

cmungall commented Mar 5, 2024 • edited Loading

sneakers-the-rat commented Mar 5, 2024

cmungall left a comment

Choose a reason for hiding this comment

sneakers-the-rat commented Mar 16, 2024

cmungall commented Apr 2, 2024

sneakers-the-rat commented Apr 2, 2024

sneakers-the-rat commented Apr 2, 2024

Optimize `get_classes_by_slot` round 2 #304

Optimize `get_classes_by_slot` round 2 #304

sneakers-the-rat commented Mar 5, 2024 •

edited

Loading

codecov bot commented Mar 5, 2024 •

edited

Loading

cmungall commented Mar 5, 2024 •

edited

Loading