Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: load GH Release Assets for schema version in memory #72

Merged
merged 10 commits into from
Feb 28, 2024

Conversation

nayib-jose-gloria
Copy link
Collaborator

@nayib-jose-gloria nayib-jose-gloria commented Feb 26, 2024

Reason for Change

Changes

  • schema_version is required arg for Ontology Mapper, download relevant GH asset via network call on init (can cache)
  • map schema version to ontology release
    • note: this approach requires users to update their API version to pick up new data releases. This may be a requirement no matter what approach we go for, unless we name data releases after cellxgene schema version releases so we can parse the mapping from the GH API
  • load file contents in-memory rather than save to disk before parsing.
  • refactor ontology_info.yaml to ontology_info.json to remove needless PyYaml dependency
  • use "multiton" / factory method pattern to cache instances of OntologyParser created for a given schema_version, only init new instance if an OntologyParser has not already been created for that schema version

Testing steps

  • unit tests

Notes for Reviewer

Copy link
Collaborator

@Bento007 Bento007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with the general approach you're going for. Have you tried using urllib to download the file first. It'll save us from needing a new requirement.

CURRENT_SCHEMA_VERSION = "5.0.0"
ONTOLOGY_ASSET_RELEASE_URL = "https://api.github.com/repos/chanzuckerberg/cellxgene-ontology-guide/releases"
SCHEMA_VERSION_TO_ONTOLOGY_ASSET_TAG = {
"5.0.0": "v0.0.1"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"5.0.0": "v0.0.1"
"5.0.0": "ontology-assets-v0.0.1"

if not download_url:
raise ValueError(f"Could not find {filename} for schema version {schema_version} in GitHub Release Assets.")

return requests.get(download_url).content
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will not return a local file path. You need to download the file to some location

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will return it in memory, which is the intention rather than saving to disk. We'll then handle the content as a byte stream rather than working with a local filepath

@nayib-jose-gloria nayib-jose-gloria marked this pull request as ready for review February 27, 2024 21:11
Copy link
Collaborator

@Bento007 Bento007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I really like how you implemented the multiton

@nayib-jose-gloria nayib-jose-gloria merged commit 58bad0a into main Feb 28, 2024
5 checks passed
@nayib-jose-gloria nayib-jose-gloria deleted the nayib/download-data-release branch February 28, 2024 13:47
@github-actions github-actions bot mentioned this pull request Mar 6, 2024
Bento007 pushed a commit that referenced this pull request Mar 15, 2024
🤖 I have created a release *beep* *boop*
---


##
[0.1.0](python-api-v0.0.1...python-api-v0.1.0)
(2024-03-15)


### Features

* add data to the python package
([#87](#87))
([0eb6831](0eb6831))
* add is_valid_term_id method to OntologyParser
([#115](#115))
([72c2073](72c2073))
* include license file with python package
([#85](#85))
([2be3d81](2be3d81))
* load GH Release Assets for schema version in memory
([#72](#72))
([58bad0a](58bad0a))
* refactor ancestry mapping to include distance from descendant node +
implement functions to support curated list term mapping
([#96](#96))
([7fc3562](7fc3562))
* refer to ontology source filenames in ontology_info and return that in
get_ontology_download_url
([#106](#106))
([ff9d826](ff9d826))
* split all_ontology into individual files.
([#93](#93))
([ead59e5](ead59e5))
* Support getting download link for ontology from source repo
([#86](#86))
([fd55b76](fd55b76))


### Misc

* clean-up ontology_parser single fetch and bulk fetch methods + account
for acceptable non-ontology terms
([#112](#112))
([2ef7435](2ef7435))
* **deps-dev:** bump semantic-version from 2.8.5 to 2.10.0 in
/api/python
([#98](#98))
([dfe0b39](dfe0b39))


### BugFixes

* imports for api
([4cd3386](4cd3386))
* lint errors
([f5e4583](f5e4583))
* python api releases
([bf0477e](bf0477e))
* update requirements
([#114](#114))
([9888f3d](9888f3d))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Bento007 pushed a commit that referenced this pull request Mar 15, 2024
🤖 I have created a release *beep* *boop*
---


##
[0.1.0](python-api-v0.0.1...python-api-v0.1.0)
(2024-03-15)


### Features

* add data to the python package
([#87](#87))
([0eb6831](0eb6831))
* add is_valid_term_id method to OntologyParser
([#115](#115))
([72c2073](72c2073))
* include license file with python package
([#85](#85))
([2be3d81](2be3d81))
* load GH Release Assets for schema version in memory
([#72](#72))
([58bad0a](58bad0a))
* refactor ancestry mapping to include distance from descendant node +
implement functions to support curated list term mapping
([#96](#96))
([7fc3562](7fc3562))
* refer to ontology source filenames in ontology_info and return that in
get_ontology_download_url
([#106](#106))
([ff9d826](ff9d826))
* split all_ontology into individual files.
([#93](#93))
([ead59e5](ead59e5))
* Support getting download link for ontology from source repo
([#86](#86))
([fd55b76](fd55b76))


### Misc

* clean-up ontology_parser single fetch and bulk fetch methods + account
for acceptable non-ontology terms
([#112](#112))
([2ef7435](2ef7435))
* **deps-dev:** bump semantic-version from 2.8.5 to 2.10.0 in
/api/python
([#98](#98))
([dfe0b39](dfe0b39))


### BugFixes

* imports for api
([4cd3386](4cd3386))
* lint errors
([f5e4583](f5e4583))
* python api releases
([bf0477e](bf0477e))
* update requirements
([#114](#114))
([9888f3d](9888f3d))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Bento007 pushed a commit that referenced this pull request Mar 15, 2024
🤖 I have created a release *beep* *boop*
---


<details><summary>python-api: 0.1.0</summary>

##
[0.1.0](python-api-v0.0.2...python-api-v0.1.0)
(2024-03-15)


### Features

* add data to the python package
([#87](#87))
([0eb6831](0eb6831))
* add is_valid_term_id method to OntologyParser
([#115](#115))
([72c2073](72c2073))
* include license file with python package
([#85](#85))
([2be3d81](2be3d81))
* refactor ancestry mapping to include distance from descendant node +
implement functions to support curated list term mapping
([#96](#96))
([7fc3562](7fc3562))
* refer to ontology source filenames in ontology_info and return that in
get_ontology_download_url
([#106](#106))
([ff9d826](ff9d826))
* split all_ontology into individual files.
([#93](#93))
([ead59e5](ead59e5))
* Support getting download link for ontology from source repo
([#86](#86))
([fd55b76](fd55b76))


### Misc

* automate testpypi releases
([#118](#118))
([b5a1a66](b5a1a66))
* clean-up ontology_parser single fetch and bulk fetch methods + account
for acceptable non-ontology terms
([#112](#112))
([2ef7435](2ef7435))
* **deps-dev:** bump semantic-version from 2.8.5 to 2.10.0 in
/api/python
([#98](#98))
([dfe0b39](dfe0b39))


### BugFixes

* imports for api
([4cd3386](4cd3386))
* update requirements
([#114](#114))
([9888f3d](9888f3d))
</details>

<details><summary>ontology-assets: 0.1.0</summary>

##
[0.1.0](ontology-assets-v0.0.1...ontology-assets-v0.1.0)
(2024-03-15)


### Features

* load GH Release Assets for schema version in memory
([#72](#72))
([58bad0a](58bad0a))
* refactor ancestry mapping to include distance from descendant node +
implement functions to support curated list term mapping
([#96](#96))
([7fc3562](7fc3562))
* refer to ontology source filenames in ontology_info and return that in
get_ontology_download_url
([#106](#106))
([ff9d826](ff9d826))
* **release:** generate descendant mapping for tissues and cells
([#100](#100))
([841fddf](841fddf))
* remove all-ontology.json.gz
([83fefd6](83fefd6))
* split all_ontology into individual files.
([#93](#93))
([ead59e5](ead59e5))


### Misc

* update ontology decendant mappings
([#117](#117))
([48451af](48451af))


### BugFixes

* lint errors
([f5e4583](f5e4583))
* Schema format and validation fixes.
([#113](#113))
([0465ee7](0465ee7))
</details>

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Bento007 pushed a commit that referenced this pull request Apr 9, 2024
🤖 I have created a release *beep* *boop*
---


<details><summary>assets: 0.4.0</summary>

##
[0.4.0](assets-v0.3.0...assets-v0.4.0)
(2024-04-09)


### Features

* add function to fetch curated ontology term lists
([#141](#141))
([5c7db62](5c7db62))
* fetch ontology term descriptions, if available
([#181](#181))
([0120377](0120377))
* load GH Release Assets for schema version in memory
([#72](#72))
([58bad0a](58bad0a))
* refactor ancestry mapping to include distance from descendant node +
implement functions to support curated list term mapping
([#96](#96))
([7fc3562](7fc3562))
* refer to ontology source filenames in ontology_info and return that in
get_ontology_download_url
([#106](#106))
([ff9d826](ff9d826))
* **release:** generate descendant mapping for tissues and cells
([#100](#100))
([841fddf](841fddf))
* remove all-ontology.json.gz
([83fefd6](83fefd6))
* split all_ontology into individual files.
([#93](#93))
([ead59e5](ead59e5))
* upload assets on release
([#56](#56))
([84a1c5d](84a1c5d))


### Misc

* deprecate older version of cellxgene schema
([#172](#172))
([186e762](186e762))
* move curated lists to ontology-assets
([#48](#48))
([77916df](77916df))
* moving the generated artifacts
([c03c8e3](c03c8e3))
* release main
([#130](#130))
([0b37dc8](0b37dc8))
* release main
([#146](#146))
([4ca76f0](4ca76f0))
* release main
([#185](#185))
([9b2fe53](9b2fe53))
* release main
([#74](#74))
([e748fe9](e748fe9))
* release tsmith/release-assets
([63b782d](63b782d))
* release tsmith/release-assets
([#57](#57))
([6a6b02a](6a6b02a))
* update ontology decendant mappings
([#117](#117))
([48451af](48451af))
* update ontology decendant mappings
([#142](#142))
([fb23618](fb23618))
* update ontology decendant mappings
([#162](#162))
([12def74](12def74))
* update ontology descendant mappings
([#167](#167))
([5d3d097](5d3d097))
* update ontology descendant mappings
([#180](#180))
([65ca10f](65ca10f))


### BugFixes

* lint errors
([f5e4583](f5e4583))
* Schema format and validation fixes.
([#113](#113))
([0465ee7](0465ee7))
</details>

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants