feat: Add support for third-party API models #164

marcbal77 · 2025-08-05T16:34:17Z

Description

This PR introduces a minimal framework for integrating third-party API models into biolearn, with Hurdle Bio's Inflammage model as the first implementation.

Key Changes

Add HurdleAPIModel class to model.py for API-based predictions
Integrate HurdleAPIModel into ModelGallery for consistent usage
Implement privacy-first approach with explicit user consent before sending data
Add automatic imputation for missing CpG sites (uses 0.5 for missing values)
Include comprehensive error handling with user-friendly messages
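
The 0.5 fallback for missing CpG sites could look roughly like the following. This is a minimal sketch in plain Python, not the PR's code: the function name impute_missing_sites, the dict-based sample layout, and the CpG IDs are assumptions made for illustration.

```python
from typing import Dict, Iterable


def impute_missing_sites(
    sample: Dict[str, float],
    required_sites: Iterable[str],
    fill_value: float = 0.5,
) -> Dict[str, float]:
    """Return beta values for required_sites, using fill_value for absent sites."""
    return {site: sample.get(site, fill_value) for site in required_sites}


# Placeholder CpG IDs and beta values, for illustration only.
sample = {"cg00000029": 0.81, "cg00000108": 0.12}
required = ["cg00000029", "cg00000108", "cg00000109"]
imputed = impute_missing_sites(sample, required)
```

Sites that the model does not require are dropped from the result, so the payload only carries what the remote model needs.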

Security & Privacy Features

API keys managed via environment variables (HURDLE_API_KEY) or direct input
Explicit consent required before sending data to external servers
Updated .gitignore to exclude credential files
Clear warnings about third-party data sharing
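
One way to read "environment variables or direct input" is sketched below. The helper name resolve_api_key and the rule that a directly passed key wins over the environment are assumptions; only the HURDLE_API_KEY variable name comes from this PR.

```python
import os
from typing import Optional


def resolve_api_key(
    api_key: Optional[str] = None, env_var: str = "HURDLE_API_KEY"
) -> str:
    """Prefer an explicitly passed key, then fall back to the environment."""
    key = api_key or os.environ.get(env_var)
    if not key:
        # Fail early with an actionable message instead of sending an empty key.
        raise ValueError(
            f"No API key found. Pass api_key directly or set {env_var}."
        )
    return key
```

Raising before any request is made keeps a missing key from surfacing later as an opaque authentication error.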

Documentation

User guide in docs/hurdle_api_guide.md with setup instructions
Working example in examples/hurdle_api_example.py
Placeholder file for Hurdle CpG sites list

Testing

Comprehensive test suite in biolearn/test/test_hurdle_model.py
Mock API calls to avoid requiring credentials in tests
Tests cover consent flow, error handling, and API integration
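
Mocking the network call so tests need no credentials can be sketched with unittest.mock. FakeApiModel and its _post method are hypothetical stand-ins, not the PR's HurdleAPIModel; the point is only the patching pattern.

```python
from unittest import mock


class FakeApiModel:
    """Hypothetical stand-in for an API-backed model."""

    def _post(self, payload):
        # In a real model this would perform the HTTP request.
        raise RuntimeError("network call attempted in a unit test")

    def predict(self, samples):
        response = self._post({"samples": samples})
        return response["predictions"]


def run_mocked_prediction():
    model = FakeApiModel()
    # Replace the network method with a mock that returns a canned response.
    with mock.patch.object(
        FakeApiModel, "_post", return_value={"predictions": [42.0]}
    ) as fake_post:
        result = model.predict([{"id": "s1"}])
    # Verify the payload that would have been sent to the server.
    fake_post.assert_called_once_with({"samples": [{"id": "s1"}]})
    return result
```

Because the unpatched _post raises, any test that forgets to mock it fails loudly instead of silently hitting the network.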

Usage Example

# Set API key
export HURDLE_API_KEY="your_key"

# Use the model
from biolearn.model_gallery import ModelGallery

gallery = ModelGallery()
model = gallery.get("HurdleInflammage")
predictions = model.predict(methylation_data)

This commit introduces a minimal framework for integrating third-party API models into biolearn, with Hurdle Bio's Inflammage model as the first implementation.

Key changes:
- Add HurdleAPIModel class to model.py for API-based predictions
- Integrate HurdleAPIModel into ModelGallery for consistent usage
- Implement privacy-first approach with explicit user consent
- Add automatic imputation for missing CpG sites
- Include comprehensive error handling with user-friendly messages

Security & Privacy:
- API keys managed via environment variables or direct input
- Explicit consent required before sending data to external servers
- Updated .gitignore to exclude credential files

Documentation:
- Add user guide in docs/hurdle_api_guide.md
- Include working example in examples/hurdle_api_example.py
- Placeholder for Hurdle CpG sites list

Testing:
- Add test suite for HurdleAPIModel functionality
- Mock API calls to avoid requiring credentials in tests

This implementation minimizes changes to the existing codebase while providing a robust foundation for future API model integrations.

- Add actual CpG sites to Hurdle_CpGs.csv.example file (first 100 sites)
- Fix test_consent_denied decorator issue (missing mock_input parameter)
- Skip HurdleAPIModel in main test_model.py suite (requires API credentials)
- Update test data to include proper CpG site indices for Hurdle tests
- Fix file path to use Hurdle_CpGs.csv.example instead of .csv

- Format biolearn/test/test_hurdle_model.py
- Format biolearn/test/test_model.py

- Add real CpG sites to example file, improve error handling and validation
- Simplify consent mechanism, reduce test redundancy, add type hints
- Move documentation to doc/ folder, clean up examples

- Update HurdleAPIModel implementation to match latest codebase
- Fix test imports and references after rebase
- Maintain compatibility with existing model architecture

- Add self._consent_given flag to track consent state
- Only ask for consent if not already given
- Remove duplicate base_url initialization
- Fixes failing test test_consent_only_asked_once
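
The ask-once behaviour described in this commit can be sketched as follows. Only the _consent_given flag name comes from the commit message; the ConsentGate class, the prompt text, and the injectable prompt_fn are assumptions for illustration.

```python
class ConsentGate:
    """Remembers a granted consent so the user is prompted only once."""

    def __init__(self, prompt_fn=input):
        # prompt_fn is injectable so tests can avoid blocking on stdin.
        self._prompt_fn = prompt_fn
        self._consent_given = False

    def ensure_consent(self) -> bool:
        if self._consent_given:
            return True  # Already granted: do not prompt again.
        answer = self._prompt_fn(
            "Send methylation data to an external server? [y/N] "
        )
        self._consent_given = answer.strip().lower() in {"y", "yes"}
        return self._consent_given
```

In this sketch a refusal is not cached, so a user who declines is asked again on the next call; whether the PR behaves the same way is not stated here.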

- Format base_url assignment as multi-line for consistency
- Fixes CI formatting check failure

…ntation
- Rename HurdleInflammage to HurdleInflammAge
- Switch to production API (use_production=True)
- Update registration URL to https://dashboard.hurdle.bio/register
- Add non-commercial use disclaimer to documentation and docstring
- Rename Hurdle_CpGs.csv.example to Hurdle_CpGs.csv
- Document 0.5 imputation for missing CpG sites
- Update example script with better error handling
- Update all references in tests and documentation

- Model now errors if any required CpG sites are missing
- Provides informative error with count and examples of missing sites
- Directs users to use ModelGallery imputation methods
- Updated documentation to explain missing data handling
- Added test for missing CpG error
- All tests passing (143 passed, 4 skipped)
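
A check that fails on missing CpG sites and reports a count plus a few examples might look like this. The function name check_required_sites, the max_examples parameter, and the exact message wording are assumptions, not the PR's implementation.

```python
from typing import Iterable, Sequence


def check_required_sites(
    available: Iterable[str], required: Sequence[str], max_examples: int = 5
) -> None:
    """Raise ValueError if any required CpG site is absent from the input."""
    available_set = set(available)
    missing = [site for site in required if site not in available_set]
    if missing:
        examples = ", ".join(missing[:max_examples])
        raise ValueError(
            f"{len(missing)} of {len(required)} required CpG sites are missing "
            f"(e.g. {examples}). Impute them before calling predict, for "
            "example with the ModelGallery imputation methods."
        )
```

Capping the examples keeps the message readable when thousands of sites are absent, while the count still conveys the scale of the problem.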

marcbal77 · 2026-01-13T00:11:00Z

This PR is ready for review (and to be merged)

I rebased on latest master and all CI is passing. Tested with a production API key: the smoke test was successful (5 samples returned valid InflammAge predictions). There ended up being a few fixes: corrected the sex mapping (0=female, per the biolearn standard we previously merged, which I hadn't updated to), added methylation_sites() for imputation support, and removed an empty row in the CpG file that I had somehow missed. @sarudak

sarudak

Looks good. Just a few minor concerns

sarudak · 2026-01-20T16:40:28Z

.gitignore

+# API credentials and sensitive files
+*_api_key*
+*_credentials*
+HurdleTesting.ipynb


Nit: This probably doesn't need to be here. We already have an excluded folder for notebooks

sarudak · 2026-01-20T16:42:09Z

biolearn/model.py

+    def __init__(
+        self,
+        api_key: Optional[str] = None,
+        use_production: bool = False,


Shouldn't production be the default for users of the library?

sarudak · 2026-01-20T16:45:01Z

biolearn/model.py

+                        # Biolearn standard: 0=female, 1=male (metadata_standard.rst)
+                        sample_meta["sex"] = (
+                            "f"
+                            if sex_value in [0, "f", "F", "female"]


I feel a bit uncomfortable with this kind of logic. Perhaps we should only accept biolearn standard and if for some reason their data has values other than 0 and 1 we throw an exception. We could add a validate_sex function to the geodata. That way it won't proceed silently on an incorrect assumption about what the sex data is.
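
A strict version of what this comment suggests could be sketched as below. The standalone validate_sex signature and the "m" code for male are assumptions (the diff only shows "f"), and where such a function would live in the geodata code is left open.

```python
from typing import Iterable, List


def validate_sex(values: Iterable[object]) -> List[str]:
    """Map biolearn-standard sex codes (0=female, 1=male) to 'f'/'m'.

    Raises ValueError on anything else instead of guessing.
    """
    values = list(values)
    # Booleans compare equal to 0/1 in Python, so reject them explicitly.
    invalid = [v for v in values if isinstance(v, bool) or v not in (0, 1)]
    if invalid:
        raise ValueError(
            f"Expected sex encoded as 0 (female) or 1 (male); "
            f"found {len(invalid)} invalid value(s), e.g. {invalid[:3]!r}"
        )
    return ["f" if v == 0 else "m" for v in values]
```

With this shape, values such as "F" or "female" stop the run with an explicit error rather than being silently coerced.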

sarudak · 2026-01-20T16:48:27Z

biolearn/test/test_hurdle_model.py

+from biolearn.model_gallery import ModelGallery
+
+
+class TestHurdleAPIModel:


Nit: Would be nice to have at least one integration test that can be run if you have the key

I guess the example kinda counts as an integration test. Required some effort to actually run though
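
An opt-in integration test that only runs when a key is available could be sketched with the standard library's unittest.skipUnless (pytest's skipif marker would serve the same purpose). The test class name and the load_smoke_test_data helper are hypothetical; only ModelGallery, the HurdleInflammAge model name, and HURDLE_API_KEY come from this PR, and the consent prompt would still need to be answered or patched for an unattended run.

```python
import os
import unittest

# Evaluated at import time: the live test is skipped unless a key is exported.
HAS_KEY = bool(os.environ.get("HURDLE_API_KEY"))


def load_smoke_test_data():
    """Hypothetical loader; replace the body with code that returns real data."""
    raise unittest.SkipTest("no smoke-test methylation data configured")


@unittest.skipUnless(HAS_KEY, "HURDLE_API_KEY not set; skipping live API test")
class TestHurdleIntegration(unittest.TestCase):
    def test_live_prediction(self):
        data = load_smoke_test_data()
        # Imported lazily so this module loads even without biolearn installed.
        from biolearn.model_gallery import ModelGallery

        model = ModelGallery().get("HurdleInflammAge")
        predictions = model.predict(data)
        self.assertIsNotNone(predictions)
```

Without the key the test is reported as skipped rather than failed, so it can sit in the regular suite without breaking CI.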

marcbal77 changed the title ~~feat: Add support for third-party API models with Hurdle implementation~~ feat: Add support for third-party API models Aug 5, 2025

marcbal77 force-pushed the feature/third-party-api-models branch from 2a46602 to 4be828e Compare August 16, 2025 02:33

marcbal77 force-pushed the feature/third-party-api-models branch from 0c2c5ef to ad8b326 Compare October 14, 2025 07:19

marcbal77 self-assigned this Oct 17, 2025

marcbal77 added the enhancement New feature or request label Oct 17, 2025

marcbal77 force-pushed the feature/third-party-api-models branch from 947bf6d to c3d75fe Compare January 12, 2026 23:48

marcbal77 added 13 commits January 12, 2026 15:55

style: Apply code formatting from make format

94b53b3

- Format biolearn/test/test_hurdle_model.py
- Format biolearn/test/test_model.py

refactor: Improve Hurdle API model implementation

5f99f23

- Add real CpG sites to example file, improve error handling and validation
- Simplify consent mechanism, reduce test redundancy, add type hints
- Move documentation to doc/ folder, clean up examples

fix: Resolve merge conflicts after rebasing onto latest master

42fec94

- Update HurdleAPIModel implementation to match latest codebase
- Fix test imports and references after rebase
- Maintain compatibility with existing model architecture

fix: Ensure consent is only asked once in HurdleAPIModel

5321daa

- Add self._consent_given flag to track consent state
- Only ask for consent if not already given
- Remove duplicate base_url initialization
- Fixes failing test test_consent_only_asked_once

style: Apply black formatting to model.py

71d087e

- Format base_url assignment as multi-line for consistency
- Fixes CI formatting check failure

Fix syntax errors and apply code formatting

4c222aa

Trigger CI to test Hurdle integration

746da8b

fix(hurdle): remove empty row from Hurdle_CpGs.csv

7cddb02

fix(hurdle): correct sex mapping, add methylation_sites method

418a91b

marcbal77 force-pushed the feature/third-party-api-models branch from c3d75fe to 418a91b Compare January 12, 2026 23:58

marcbal77 requested a review from sarudak January 13, 2026 00:13

sarudak approved these changes Jan 20, 2026
