Python package for conversion of Google Sheet to LinkML for CCDH

cancerDHC/sheet2linkml

sheet2linkml

A Python package for converting the CRDC-H data model, which is currently stored in a Google Sheet, to LinkML. The command line utility built into the package generates a LinkML representation of the CRDC-H data model.

Installation Requirements and Pre-requisites

  • Python 3.7 or higher
  • pyenv
    • If you do not have Python 3.9 or later installed, pyenv is recommended for installing and switching between multiple Python versions.
    • If you’re experiencing issues with pyenv on macOS, you can consider using miniconda.
  • poetry

If you are using a Windows machine, typical bash programs will not work in cmd the same way they work in Linux/macOS terminals. To work around this, it is recommended that you use a Bash on Windows strategy so you can easily execute the command line utilities that are described later in these docs.

Installing

Create and activate a Python 3.9+ virtual environment within which you can install the package:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install sheet2linkml

Authorization

sheet2linkml uses the pygsheets library in order to access sheets in Google Drive. To authorize it to access your Google Sheets, you will need to create and download Google Drive client credentials. First, enable the Google Drive API. After the API is enabled, create and download the client credentials from the Google API Console. Save the file as google_api_credentials.json in the root directory of this project. Detailed instructions and screenshots are also available from the pygsheets documentation.
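
As a rough illustration of what this authorization step enables, the sketch below uses the pygsheets `authorize` and `open_by_key` calls to open a sheet with the downloaded client credentials. The helper name `open_sheet` is hypothetical and is not part of sheet2linkml; how sheet2linkml calls pygsheets internally is not shown here and may differ.

```python
from pathlib import Path


def open_sheet(sheet_id, credentials_path="google_api_credentials.json"):
    """Open a Google Sheet by ID using OAuth client credentials.

    Hypothetical helper for illustration; not part of sheet2linkml.
    """
    credentials = Path(credentials_path)
    if not credentials.is_file():
        # Fail early with a clear message instead of a pygsheets error.
        raise FileNotFoundError(
            f"Client credentials not found at {credentials}; "
            "see the Authorization section."
        )

    # Imported lazily so the check above works without pygsheets installed.
    import pygsheets

    # The first call prompts for OAuth consent; pygsheets stores the
    # resulting token so that later runs can reuse it.
    client = pygsheets.authorize(client_secret=str(credentials))
    return client.open_by_key(sheet_id)
```

Calling `open_sheet(...)` with a sheet ID you have access to is a quick way to confirm that the credentials file is in place before running the full conversion.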

Command Line Client Usage

Identify the Google Sheet that you want to convert to LinkML. Note that sheet2linkml is not currently a general-purpose Google Sheet to LinkML converter. It will only work with Google Sheets that have been written in a particular, currently undefined format.

Contact your CCDH colleagues to obtain the correct sheet ID, then set it either in a .env file or in the shell, like this:

export CDM_GOOGLE_SHEET_ID=1oWS7cao-fgz2MKWtyr8h2dEL9unX__0bJrWKv6mQmM4
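
The two options above (shell variable or .env file) can be sketched as a small lookup. This is a minimal standard-library illustration, not sheet2linkml's own loading code: the function names are hypothetical, and the precedence shown (shell value first, .env as fallback) is an assumption.

```python
import os
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            # Tolerate lines copied straight from a shell session.
            line = line[len("export "):]
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_sheet_id(environ=None, env_file=".env"):
    """Return CDM_GOOGLE_SHEET_ID from the shell, falling back to .env.

    Assumption: a value exported in the shell wins over the .env file.
    """
    environ = os.environ if environ is None else environ
    sheet_id = environ.get("CDM_GOOGLE_SHEET_ID")
    if not sheet_id:
        sheet_id = read_env_file(env_file).get("CDM_GOOGLE_SHEET_ID")
    if not sheet_id:
        raise RuntimeError("CDM_GOOGLE_SHEET_ID is not set in the shell or in .env")
    return sheet_id
```

In a .env file the same setting is written as a plain `CDM_GOOGLE_SHEET_ID=...` line, without the `export` keyword.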

A google_api_credentials.json file is also required in the root of this repo as detailed in the Authorization section above.

You must also supply paths for:

  • ~/path/to/crdch_model.yaml
  • ~/path/to/logging.ini
    • ./logging.ini may be adequate for many users

Then perform the conversion:

sheet2linkml --output ~/path/to/crdch_model.yaml --logging-config ~/path/to/logging.ini
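
If you prefer to drive the conversion from a script, the command above can be wrapped with `subprocess`. This is a sketch under stated assumptions: `build_command` and `run_conversion` are hypothetical helper names, and only the `--output` and `--logging-config` flags shown above are used.

```python
import os
import subprocess


def build_command(output, logging_config):
    """Build the sheet2linkml argument list with `~` expanded.

    subprocess does not go through a shell here, so `~` must be
    expanded explicitly.
    """
    return [
        "sheet2linkml",
        "--output", os.path.expanduser(output),
        "--logging-config", os.path.expanduser(logging_config),
    ]


def run_conversion(output, logging_config):
    """Run the conversion; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_command(output, logging_config), check=True)
```

For example, `run_conversion("~/path/to/crdch_model.yaml", "~/path/to/logging.ini")` mirrors the shell invocation, provided the sheet ID and credentials file are already set up as described earlier.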