OS-Climate - Establish minimum versions of tools / packages for Dev Cluster #234
Comments
Remove highlander. Categorize and group packages, and note the installed version of each. Create a standard config for notebooks (default for users).
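One possible way to capture the "categorize and group, note installed version" step is a small inventory script run inside the notebook image. This is only a sketch: the function name `installed_versions` and any group or package names passed to it are hypothetical, not something agreed in this thread. It relies only on the standard-library `importlib.metadata` module.

```python
from importlib import metadata


def installed_versions(groups):
    """Map each group name to {package name: installed version, or None if absent}.

    `groups` is a dict such as {"ingestion": ["pandas", "sqlalchemy"]}; the
    grouping itself is an assumption and should mirror whatever categories
    the team settles on.
    """
    inventory = {}
    for group, packages in groups.items():
        inventory[group] = {}
        for name in packages:
            try:
                inventory[group][name] = metadata.version(name)
            except metadata.PackageNotFoundError:
                # Record missing packages explicitly instead of failing.
                inventory[group][name] = None
    return inventory
```

Calling `installed_versions({"notebook": ["notebook", "jupyterlab"]})` in a given image would return the versions found there, with `None` for anything not installed.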
Main reason for separation: capacity concerns and limited functionality; how ODH treats and updates subcomponents needed updating. Need to understand where ODH is going; treat Trino and Superset separately, as this is also being done within the ODH community. @redmikhail to consolidate approvers: create a separate team to perform PR merges and regain control.
@redmikhail to update list this week. Meeting held with Marcel. Operate First is shrinking; consider use of a managed service from Red Hat / AWS (OpenShift specifically, with SREs). A stable cluster is a better candidate for that, but not all services are covered under a managed service, e.g. GPU usage split. In January meeting,
Complexity of dealing with a platform (OS-Climate) on top of a platform (ODH) on top of a platform (Operate First)
@HeatherAck to schedule meeting week of 9-Jan to align on pros/cons and discuss path forward.
As an open source software project, OS-Climate provides the raw materials for users to contribute to and/or fork project elements as they see fit. If users have their own ideas about what it means to run the Data Commons within their own local environment, it should be those users doing the legwork of what that actually means, and committing the resources necessary to push patches they want to see into the upstream source code (which OS-Climate should review and potentially accept). But I don't think the OS-Climate project should try too hard to imagine and prototype those use cases itself. Rather it should help guide users to do that work for themselves.
Need to determine pace, sizing, and priorities for each element on 10-Jan, along with a consistent developer process/implementation.
I've been updating version numbers, but I want to call attention to the Open Metadata 0.13.1.3 release of Jan 9th, which provides important fixes vs. 0.13.1.
@redmikhail @ryanaslett @MightyNerdEric @erikerlandson to focus on upgrading system-level software (see ODH sub-packages above, e.g. Trino, Jupyter, Python).
@caldeirav - do we need to use Elyra pipeline features?
The OpenMetadata team has been busily walking their versions forward; 0.13.2.1 was just released (see https://github.com/open-metadata/OpenMetadata/releases for info about 0.13.2).
@ryanaslett to start investigating the latest ODH version compared to the installed one. Prep work: manifest storage, look at the Operate First manifest (week of 6-Feb).
@ryanaslett - trying to define which ODH contributions OS-C is going to make, but still want to move forward with the core component upgrade. There is no easy way to upgrade: need to install from scratch and migrate functionality (e.g., notebooks) and access/authentication. May require different packages, such as Superset. Start with CL1: eliminate the old ODH and install the new one. Get feedback on the new version and verify it functions as expected. Open question: @caldeirav, will ODH be a stable version that we will use long term? Please confirm that the Red Hat team will contribute to ODH going forward.
Note: fork it to OS-Climate (not Operate First). Keep as a separate repo. See if ODH supports SQLAlchemy 2.0.
@ryanaslett reviewed Superset: no dependencies, only APIs. Will review the ODH components and figure out the core offering as part of ODH. Recommended next steps: (1) install new core ODH on CL1 with tier 0 components (MUST HAVE); (2) bring over JupyterHub images (verify that authentication works); (3) bring over other components after review (Trino, OpenMetadata).
xref #98
High-level question: if we use conda as our base installation system, users can install shell-level packages such as ghostscript without needing to beg for special install help. With pip/pipenv, we are entirely limited to Python. What, really, is the best choice here?
python packages for data pipeline ingestion
rpm/yum packages
[dev-packages]
ODH sub-packages
slots
Finalize list of required software libraries, packages for dev cluster (these are best guesses as of 2024-04-24)
Baseline a default Jupyter Notebook (update required libraries, remove unnecessary config info, etc.; version 7 expected in July 2023, see 7.0 Release Plan jupyter/notebook#6307)
Ensure documentation is accurate for data ingestion pipeline processes
Update the OS-Climate Data Commons Developer Guide
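Once a list of minimum versions is finalized, it could be checked automatically in each notebook image. The following is a minimal sketch under stated assumptions: the helper names (`parse_version`, `at_least`, `check_minimums`) are hypothetical, and the comparison only understands purely numeric dotted versions such as `0.13.1.3`. For pre-release or other non-numeric version strings, the third-party `packaging` library would be the more robust choice.

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.13.2.1' into (0, 13, 2, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def at_least(installed, minimum):
    """True if `installed` is the same as or newer than `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.0' and '1.0.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_minimums(minimums, installed_lookup=metadata.version):
    """Return {package: (installed version or None, minimum, meets_minimum)}.

    `minimums` maps package names to minimum version strings; the contents
    are whatever the finalized list specifies (none are assumed here).
    """
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = installed_lookup(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, minimum, False)
            continue
        report[name] = (installed, minimum, at_least(installed, minimum))
    return report
```

A report entry whose last element is `False` marks a package that is either missing or older than the agreed minimum.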