Update all documentation files

maximusunc committed Nov 15, 2024
1 parent 1eaa519 commit b508944
Showing 7 changed files with 40 additions and 224 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -1,7 +1,7 @@

# Strider

__A web service and API for Strider, the knowledge-provider querying, answer generating, ranking module of ARAGORN.__
__A web service and API for Strider, the knowledge-provider querying, answer generating module of ARAGORN.__

This service accepts a biomedical question as a [Translator reasoner standard message](https://github.com/NCATSTranslator/ReasonerAPI) and asynchronously generates results in the same format.
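
For a concrete sense of the interface, here is a minimal client sketch. The host/port (reused from the profiler notes below) and the standard TRAPI `/query` path are assumptions about a local deployment, not a documented contract:

```python
# Hypothetical client call against a locally running Strider instance;
# the host/port and endpoint path are assumptions, not guarantees.
import httpx

trapi_query = {
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {"ids": ["CHEBI:6801"]},
                "n1": {"categories": ["biolink:Disease"]},
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:treats"],
                },
            },
        }
    }
}

response = httpx.post("http://localhost:5781/query", json=trapi_query, timeout=60.0)
response.raise_for_status()
print(len(response.json()["message"]["results"]), "results")
```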

@@ -31,11 +31,11 @@ docker-compose -f docker-compose.yml -f docker-compose.dev.yml up --build

This will start the requisite containers as well as the strider container. Changes made locally will update the container while running.

You can also run tests and coverage reports withou the management script. Check the `manage.py` file for instructions on how to do this.
You can also run tests and coverage reports without the management script. Check the `manage.py` file for instructions on how to do this.

### Profiler

The local development environment also includes a built-in profiler for debugging performance issues. To use this, set `PROFILER=true` in a `.env` file in the root of the repository. Once the application is running the profiler will automatically be run on all incoming requests. To view profiles you can visit [localhost:5781/profiles](http://localhost:5781/profiles), which will give you a list of the captured profiles. These captured profiles can be used with the [snakeviz](https://jiffyclub.github.io/snakeviz/) utility to easily diagnose performance issues.
The local development environment also includes a built-in profiler for debugging performance issues. To use this, set `PROFILER=true` in a `.env` file in the root of the repository. Once the application is running, the profiler will automatically run on all incoming requests. We haven't found a great asynchronous Python profiler, but the current "best" one is pyinstrument. When the profiler is enabled, a browser page will open after a query has completed that shows the profile.
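
For reference, the core pyinstrument workflow looks roughly like the sketch below; this is a standalone illustration of the library, not Strider's actual wiring:

```python
# Standalone pyinstrument sketch: profile an async workload and open the
# rendered profile in a browser, mirroring the behavior described above.
import asyncio

from pyinstrument import Profiler

async def handle_query():
    await asyncio.sleep(0.1)  # stand-in for real query processing

profiler = Profiler()
profiler.start()
asyncio.run(handle_query())
profiler.stop()
profiler.open_in_browser()  # opens an interactive HTML view of the profile
```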

## Testing

46 changes: 19 additions & 27 deletions docs/COMPONENTS.md
@@ -11,41 +11,39 @@

## Modules

* `fetcher.py` handles the coordination of one-hop subqueries to answer an arbitrary graph query.
* `constraints.py` handles evaluating and enforcing qnode/qedge constraints.
* `trapi_throttle` handles batching and throttling requests to KPs
* `trapi.py` contains utilities for exploring and manipulating TRAPI messages
* `throttle.py` handles batching and throttling requests to KPs
* `compatibility.py` handles CURIE mapping and the handoff between `fetcher` and `trapi_throttle`
* `caching.py` contains some caching utilities - primarily decorators for applying a cache or a locking cache to an asynchronous function
* `config.py` defines Strider settings using [Pydantic settings management](https://pydantic-docs.helpmanual.io/usage/settings/)
* `constraints.py` handles evaluating and enforcing qnode/qedge constraints.
* `fetcher.py` handles the coordination of one-hop subqueries to answer an arbitrary graph query.
* `graph.py` defines a dict extension with a couple of useful utilities for exploring TRAPI-style graphs
* `kp_registry.py` defines a Python client for the KP registry service: https://github.com/ranking-agent/kp-registry, https://kp-registry.renci.org/docs
* `knowledge_provider.py` is a class wrapper for each KP; it does Biolink conversions and all pre/post-processing, including filtering
* `logger.py` sets up the server logger
* `mcq.py` basic utility functions for MCQ (MultiCurie Query) / Set Input Queries
* `node_sets.py` single function for collapsing node sets
* `normalizer.py` defines a Python client for the node normalizer service: https://github.com/TranslatorSRI/NodeNormalization, https://nodenormalization-sri.renci.org/docs
* `openapi.py` defines the TRAPI subclass of FastAPI to add the common TRAPI elements to the OpenAPI schema
* `profiler.py` handles request profiling
* `query_planner.py` contains tools for planning query graph traversals
* `results.py` **probably obsolete**
* `scoring.py` **probably obsolete**
* `server.py` builds the [FastAPI](https://fastapi.tiangolo.com/) server and endpoints
<!-- * `storage.py` defines interfaces for accessing and manipulating Redis storage -->
* `trapi_openapi.py` defines the TRAPI subclass of FastAPI to add the common TRAPI elements to the OpenAPI schema
* `trapi.py` defines utilities for TRAPI messages, including normalizing and merging
* `throttle_utils.py` contains utilities for exploring and manipulating TRAPI messages
* `throttle.py` handles batching and throttling requests to KPs
* `trapi.py` defines utilities for TRAPI messages, including normalizing, merging, and result filtering
* `traversal.py` contains code for verifying that a query graph can be solved with the KPs available (traversable)
* `util.py` :\ a whole bunch of random stuff, some of it important
* `utils.py` :\ a whole bunch of random stuff, some of it important

## Important functions

* `Binder.lookup(qgraph)` (`fetcher.py`) generates (subkgraph, subresult) pairs
* `Fetcher.lookup(qgraph)` (`fetcher.py`) generates (subkgraph, subresult) pairs
  1. Gets the next qedge to traverse and generates the corresponding one-hop query.
2. Passes it to each KP that can solve (`generate_from_kp()`).

* `Binder.generate_from_kp(qgraph, onehop_qgraph, kp)` (`fetcher.py`) generates (subkgraph, subresult) pairs
* `Fetcher.generate_from_kp(qgraph, onehop_qgraph, kp)` (`fetcher.py`) generates (subkgraph, subresult) pairs
1. Sends one-hop query to KP. Enforces any qnode/qedge constraints afterwards.
2. Constructs new qgraph from original by removing the traversed qedge.
3. Separates results into batches of size at most X (now 1 million).
4. Passes each batch to `generate_from_results()` along with a result map/function that points back to linked subresults.

* `Binder.generate_from_results(qgraph, get_results)` (`fetcher.py`) generates (subkgraph, subresult) pairs
* `Fetcher.generate_from_results(qgraph, get_results)` (`fetcher.py`) generates (subkgraph, subresult) pairs
1. Calls `lookup(qgraph)` and stitches the results with back-linked subresults from `get_results()`.

`lookup()`, `generate_from_kp()`, and `generate_from_results()` form a recursion such that qgraphs can be solved by extracting one-hop sub-queries and joining the results with the solution to the remainder.
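
Conceptually, the recursion looks something like the self-contained toy below; the qgraph shape and helper behavior are hypothetical stand-ins, not the real `fetcher.py` API:

```python
# Toy model of the lookup/generate_from_kp recursion. A "qgraph" here is just
# {"edges": {edge_id: (subject_curie, object_curie)}}, and solve_onehop stands
# in for a real KP call; both are illustrative simplifications.
import asyncio

async def solve_onehop(qedge):
    return [qedge]  # pretend the KP echoes the edge back as its one result

async def lookup(qgraph):
    if not qgraph["edges"]:
        yield []  # base case: nothing left to traverse, one empty solution
        return
    qedge_id = next(iter(qgraph["edges"]))  # 1. pick the next qedge
    onehop = qgraph["edges"][qedge_id]
    async for result in generate_from_kp(qgraph, qedge_id, onehop):
        yield result  # 2. fan results back out to the caller

async def generate_from_kp(qgraph, qedge_id, onehop):
    onehop_results = await solve_onehop(onehop)  # one KP call per sub-edge
    remainder = {"edges": {k: v for k, v in qgraph["edges"].items() if k != qedge_id}}
    async for sub_result in lookup(remainder):   # recurse on what's left
        for res in onehop_results:
            yield [res] + sub_result             # stitch pieces together

async def main():
    qgraph = {"edges": {
        "e01": ("CHEBI:6801", "MONDO:0005148"),
        "e12": ("MONDO:0005148", "HGNC:6284"),
    }}
    async for result in lookup(qgraph):
        print(result)

asyncio.run(main())
```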
@@ -64,7 +62,7 @@ To stitch the sub-results together, we have separated them out, even though ever
x KPs x results
```
* `ThrottledServer.process_batch()` (`trapi_throttle/throttle.py`) iteratively reads from an input request queue and writes to the appropriate request queues
* `ThrottledServer.process_batch()` (`throttle.py`) iteratively reads from an input request queue and writes to the appropriate request queues
1. Receives a number of requests
2. Identifies a subset of the available requests that are merge-able, re-queues the rest
3. Constructs batched request
@@ -74,27 +72,21 @@ To stitch the sub-results together, we have separated them out, even though ever
7. Validates w.r.t. TRAPI and post-processes response (normalizing CURIEs, mostly)
8. Splits (un-batches) response into provided response queues
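
A minimal sketch of the merge-ability check behind steps 2 and 3, assuming two one-hop requests are merge-able exactly when their qgraphs differ only in pinned CURIEs (the shapes and helpers are illustrative, not the actual `throttle.py` code):

```python
# Illustrative merge-ability check: strip the pinned ids, compare shapes, and
# union the id lists if the shapes match.
import copy

def strip_ids(qgraph):
    qg = copy.deepcopy(qgraph)
    for qnode in qg["nodes"].values():
        qnode.pop("ids", None)
    return qg

def try_merge(qg_a, qg_b):
    """Return a batched qgraph, or None if the requests are not merge-able."""
    if strip_ids(qg_a) != strip_ids(qg_b):  # different shapes: re-queue instead
        return None
    merged = copy.deepcopy(qg_a)
    for qnode_id, qnode in merged["nodes"].items():
        ids_a = qnode.get("ids") or []
        ids_b = qg_b["nodes"][qnode_id].get("ids") or []
        if ids_a or ids_b:
            qnode["ids"] = sorted(set(ids_a) | set(ids_b))  # union pinned CURIEs
    return merged
```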

* `ThrottledServer.query(qgraph)` (`trapi_throttle/throttle.py`) returns a TRAPI response
* `ThrottledServer.query(qgraph)` (`throttle.py`) returns a TRAPI response
This provides a synchronous interface to throttling/batching (via `process_batch()`).

A `ThrottledServer` is set up upon query initiation for each KP, and manages throttling and batching for that KP for the query lifetime.

* `Synonymizer.map_curie(curie, prefixes)` returns a list of mapped CURIEs according to the preferred identifier sets
* `Normalizer.map_curie(curie, prefixes)` returns a list of mapped CURIEs according to the preferred identifier sets
1. Gets the preferred prefixes for the node's categories
2. Gets all CURIEs starting with the most-preferred prefix available in the synset
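
A rough sketch of that logic, assuming the synonym set ("synset") and the category-derived prefix priority list are already in hand:

```python
# Illustrative version of the prefix-preference rule described above.
def map_curie(curie, synset, preferred_prefixes):
    """Return all synonyms under the most-preferred prefix available."""
    for prefix in preferred_prefixes:  # 1. prefixes in priority order
        mapped = [c for c in synset if c.split(":", 1)[0] == prefix]
        if mapped:
            return mapped  # 2. every synonym under that prefix
    return [curie]  # no preferred prefix present: keep the original

# e.g. map_curie("DOID:9352", {"DOID:9352", "MONDO:0005148"}, ["MONDO", "DOID"])
# -> ["MONDO:0005148"]
```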

* `KnowledgePortal.map_prefixes(message, prefixes)` returns a TRAPI message with CURIEs mapped to the preferred identifier sets
* `KnowledgeProvider.map_prefixes(message, prefixes)` returns a TRAPI message with CURIEs mapped to the preferred identifier sets
1. Gets all CURIEs from the input message
2. Finds categories and synonyms for CURIEs
3. Gets CURIE map using `Synonymizer.map_curie()`
3. Gets CURIE map using `Normalizer.map_curie()`
4. Applies CURIE map to message
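
A simplified sketch of step 4, rewriting just the knowledge graph portion of a message (the real code also rewrites the query graph and result bindings):

```python
# Illustrative CURIE-map application over plain dicts, not reasoner_pydantic models.
def apply_curie_map(message, curie_map):
    def primary(curie):
        # first mapped CURIE if one exists, otherwise the original
        return curie_map.get(curie, [curie])[0]

    kgraph = message["knowledge_graph"]
    kgraph["nodes"] = {primary(c): node for c, node in kgraph["nodes"].items()}
    for edge in kgraph["edges"].values():
        edge["subject"] = primary(edge["subject"])
        edge["object"] = primary(edge["object"])
    return message
```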

## Libraries

* [bmt-lite](https://github.com/patrickkwang/bmt-lite) - for accessing the biolink model
* [reasoner-pydantic](https://github.com/TranslatorSRI/reasoner-pydantic) - Pydantic models reflecting the TRAPI components

for testing only:
* [ASGIAR](https://github.com/patrickkwang/asgiar) - for mocking http calls
* [kp-registry](https://github.com/ranking-agent/kp-registry) - for mocking the KP registry
* [binder](https://github.com/TranslatorSRI/binder) - for mocking KPs
10 changes: 10 additions & 0 deletions docs/DESIGN_HISTORY.md
@@ -127,3 +127,13 @@ pros:

components:
* Python web server/worker

### November 2024

components:
* Python web server/workers: handles all incoming API requests
* Redis: stores a cache of the kp-registry as well as all one-hop KP requests

external services (outside KPs):
* Node Normalizer
* OTEL/Jaeger: web API tracing for full query profiles
4 changes: 2 additions & 2 deletions docs/EXECUTION_OVERVIEW.md
@@ -15,7 +15,7 @@ Another standardization step is handling prefixes. Multiple identifiers (IDs) ca

## Execution

After planning, query execution is handled by Binder in [fetcher.py](strider/fetcher.py). Binder processes a query graph by recursively breakign it down into subgraphs. More information on this can be found in [docs/COMPONENTS](docs/components.md#important-functions). The motivation for this architecture is to be able to return results before the query has finished execution. In this case you can think of a completed result as a binding of all nodes to IDs:
After planning, query execution is handled by Fetcher in [fetcher.py](strider/fetcher.py). Fetcher processes a query graph by recursively breaking it down into subgraphs. More information on this can be found in [docs/COMPONENTS](docs/components.md#important-functions). The motivation for this architecture is to be able to return results before the query has finished execution. In this case you can think of a completed result as a binding of all nodes to IDs:

#### Example Query Graph:

@@ -54,4 +54,4 @@ When contacting KPs we combine the information in the plan with the current ID t

We also convert the results from the KP to Strider's preferred prefixes. This is not just for the query graph but for the knowledge graph and results list. The utilities that are used to do this can be found in the [trapi.py](strider/trapi.py) file.

After receiving and converting KP results we merge the existing results with new ones. We do our best to combine results that have matching information. The utilities for this are also in the trapi.py file. Knowledge graph nodes are combined based on the ID, and knowledge graph edges are combined if they have the same subject/predicate/object triple. Combining these results perfectly is still an active area of development so the existing implementation can be seen as a sort of heuristic.
After receiving and converting KP results we merge the existing results with new ones. We do our best to combine results that have matching information. The utilities for this are also in the trapi.py file. Knowledge graph nodes are combined based on the ID, and knowledge graph edges are combined if they have the same subject/predicate/object triple. Combining these results perfectly is still an active area of development and is handled by the `reasoner_pydantic` dependency.
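
A toy sketch of that merge heuristic, with plain dicts standing in for the `reasoner_pydantic` models:

```python
# Nodes merge on CURIE; edges merge on the subject/predicate/object triple.
# This is a simplified illustration of the heuristic, not the real merge code.
def merge_kgraphs(existing, incoming):
    for curie, node in incoming["nodes"].items():
        existing["nodes"].setdefault(curie, node)  # same CURIE -> same node
    spo_index = {
        (e["subject"], e["predicate"], e["object"])
        for e in existing["edges"].values()
    }
    for edge_id, edge in incoming["edges"].items():
        spo = (edge["subject"], edge["predicate"], edge["object"])
        if spo not in spo_index:  # unseen triple -> keep as a new edge
            existing["edges"][edge_id] = edge
            spo_index.add(spo)
    return existing
```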
108 changes: 3 additions & 105 deletions docs/TESTING_INFRASTRUCTURE.md
@@ -6,7 +6,6 @@

An ARA is a complex piece of software. One of the most important tools for building complex software is testing. It's less clear *how* to implement effective testing for an ARA. The main challenge is that ARAs make calls to external tools that can behave (or misbehave) in a variety of ways. When Strider receives a query it contacts the following external tools:

- KP Registry to find KPs available to solve particular edges
- Node Normalizer to convert CURIEs between formats
- Individual KPs to solve one-hop steps of a given query

@@ -16,39 +15,9 @@ A good example of this is how Strider handles node normalizer responses. One of

## Architecture Overview

A common testing pattern for large pieces of software is to build integration tests. This would involve running Strider, the KP Registry, and individual KPs in separate processes. There are drawbacks to this approach that make it difficult. One main drawback is that testing requires a networking infrastructure. This means there is additional tooling that must be present to run tests.

Our infrastructure uses a feature of Python's [`httpcore`](https://github.com/encode/httpcore) library to intercept external HTTP calls and route them to internal handlers. All of this takes place within one Python process. This eliminates the need for networking infrastructure and makes the tests less like integration tests and more like unit tests.

## Networking Overlay (ASGIAR)

The code for simulating external services is packaged in the [ASGIAR Repository](https://github.com/patrickkwang/asgiar). This allows overlaying an ASGI application to intercept HTTP requests. [ASGI](https://asgi.readthedocs.io/en/latest/) is the successor to [WSGI](https://www.python.org/dev/peps/pep-3333/) - a standardized Python web server interface. Many frameworks implement ASGI including [FastAPI](https://fastapi.tiangolo.com/), [Starlette](https://www.starlette.io/), and [Django](https://docs.djangoproject.com/en/3.1/howto/deployment/asgi/). This means any application written using any of these frameworks can "plug in" to ASGIAR and handle web requests.

The interface for using ASGIAR uses the standard Python context handler. Running a test with a custom KP is as simple as:

```python
import httpx

from asgiar import ASGIAR
from custom_kp import app as custom_kp_app

async def my_custom_kp_test():
    with ASGIAR(custom_kp_app, host="kp"):
        async with httpx.AsyncClient() as client:
            response = await client.get("http://kp/test")
            assert response.status_code == 200
```

In this test, the call to `client.get` will be handled by the `/test` endpoint of the custom\_kp\_app.

## Custom Services

The second key piece of our infrastructure is mocking out external services, in particular the Node Normalizer, KP Registry, and KPs. These services live in separate repositories and can be installed as Python packages with pip. This allows us to use a `requirements.txt` file for testing to pull in these dependencies. The services can be found at the following repositories:

- KP Registry: [https://github.com/ranking-agent/kp\_registry](https://github.com/ranking-agent/kp_registry)
- Simple KP: [https://github.com/TranslatorSRI/simple-kp](https://github.com/TranslatorSRI/simple-kp)
- Node Normalizer: Stored directly in the Strider repository [https://github.com/ranking-agent/strider/blob/master/tests/helpers/normalizer.py](https://github.com/ranking-agent/strider/blob/master/tests/helpers/normalizer.py)

All of these services are built using FastAPI to maintain compatibility with ASGIAR. Each of them can be initialized with custom data during creation.
A common testing pattern for large pieces of software is to build integration tests. This would involve running Strider, Node Normalizer, and individual KPs in separate processes. There are drawbacks that make this approach difficult: chief among them, it requires networking infrastructure, and therefore additional tooling, just to run the tests.

Our infrastructure instead uses HTTPXMock to mock all external responses, so the tests never call out to actual external services, and we can mock various response types to test how we handle errors.
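
A minimal example of the pattern using the `httpx_mock` fixture from pytest-httpx; the KP URL and response body here are invented for illustration:

```python
# Sketch of mocking a misbehaving KP with pytest-httpx; the URL and payload
# are hypothetical, and pytest-asyncio is assumed for the async test.
import httpx
import pytest

@pytest.mark.asyncio
async def test_kp_error_handling(httpx_mock):
    # Simulate a KP that returns a 500 so the error path can be exercised.
    httpx_mock.add_response(
        url="https://kp.example.org/query",
        status_code=500,
        json={"description": "Internal server error"},
    )
    async with httpx.AsyncClient() as client:
        response = await client.post("https://kp.example.org/query", json={})
    assert response.status_code == 500
```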

## Utilities

@@ -94,80 +63,9 @@ As a bonus, this format is compatible with the [Mermaid](https://mermaid-js.gith

There are similar utilities present in the code for KP data (`kps_from_string`) as well as normalizer data (`normalizer_data_from_string`).

### Decorators

The other utility function helps remove the issues of nested context providers. Running ASGIAR code with multiple hosts requires nested `with` statements and quickly begins to look like [Javascript from 2012](http://callbackhell.com/):

```python
with ASGIAR(custom_kp_1, host="kp1"):
    with ASGIAR(custom_kp_2, host="kp2"):
        with ASGIAR(normalizer, host="normalizer"):
            with ASGIAR(registry, host="registry"):
                async with httpx.AsyncClient() as client:
                    response = await client.get("http://kp/test")
                    assert response.status_code == 200
```

The solution we chose was to encapsulate these contexts within decorators. These decorators can be added to tests to provide functionality. We settled on five decorators to cover most of the functionality we needed:

- `with_kp_overlay`
- `with_registry_overlay`
- `with_norm_overlay`
- `with_response_overlay`
- `with_translator_overlay`

The first three are simple - they wrap the function with a context provider that calls the ASGIAR library with the specified external service. `with_response_overlay` allows specifying a static response. This is useful for testing what happens if a host is offline or returns a 500 error.

`with_translator_overlay` combines the normalizer, registry, and any number of KPs into a single decorator. There are many tests that require all of these services present, so having one utility to encapsulate this is helpful. Putting it all together, here is a full example of one test with our framework:

```python
@pytest.mark.asyncio
@with_translator_overlay(
    settings.kpregistry_url,
    settings.normalizer_url,
    {
        "ctd":
        """
        CHEBI:6801(( category biolink:ChemicalSubstance ))
        CHEBI:6801-- predicate biolink:treats -->MONDO:0005148
        MONDO:0005148(( category biolink:DiseaseOrPhenotypicFeature ))
        MONDO:0005148(( category biolink:Disease ))
        """
    }
)
async def test_duplicate_results():
    """
    Some KPs will advertise multiple operations from the biolink hierarchy.
    Test that we filter out duplicate results if we
    contact the KP multiple times.
    """
    QGRAPH = query_graph_from_string(
        """
        n0(( id CHEBI:6801 ))
        n0(( category biolink:ChemicalSubstance ))
        n1(( category biolink:DiseaseOrPhenotypicFeature ))
        n0-- biolink:treats -->n1
        """
    )

    # Create query
    q = Query(
        message=Message(
            query_graph=QueryGraph.parse_obj(QGRAPH)
        )
    )

    # Run
    output = await sync_query(q)
    assert_no_warnings_trapi(output)

    assert len(output['message']['results']) == 1
```

## Conclusion & Future Improvements

Overall, we are happy with the current version of the testing framework. As of writing, there are 35 tests which provide about 95% coverage of the code currently in use. The tests are easy to maintain and can cover nearly all responses from external services.
Overall, we are happy with the current version of the testing framework. As of writing, there are 81 tests which provide about 72% coverage of the code currently in use. The tests are easy to maintain and can cover nearly all responses from external services.

Code is never fully complete, and this testing framework is no exception. There are a number of improvements that we are looking to work on in the future including:
